
Example:

array1 = ["budget2017.doc", "accounting2017.doc", "mydogisdumb.doc"]
array2 = ["budget.doc", "accounting.doc", "imstupid.doc"] 

I would like to compare the two arrays for similarity and return the associated element from array1.

array1.select { |x| x.include?(array2) }

I need the result to be a new array with ["budget2017.doc", "accounting2017.doc"]

But obviously the above won't work: include? expects a string rather than an array, and in any case "budget.doc" is not a substring of "budget2017.doc". I could accomplish what I need if I could just match the first few characters of each element and return the associated element from array1.
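To make the failure concrete, here is a small sketch using the arrays from the question. The variable names `error` and `substring_matches` are introduced only for illustration.

```ruby
array1 = ["budget2017.doc", "accounting2017.doc", "mydogisdumb.doc"]
array2 = ["budget.doc", "accounting.doc", "imstupid.doc"]

# String#include? needs a String argument; passing an Array raises TypeError.
error =
  begin
    array1.select { |x| x.include?(array2) }
    nil
  rescue TypeError => e
    e
  end

# Checking each name from array2 individually does run, but nothing matches,
# because "budget.doc" is not a substring of "budget2017.doc".
substring_matches = array1.select { |x| array2.any? { |y| x.include?(y) } }
```

So a plain substring test is not enough here; the comparison has to look at the beginning of each name rather than the whole string.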

  • The logic you want to run is, for each element X1 in array1, iterate over all of the elements in array2, and for each element X2 in array2, if X1 matches X2, add it to the result. You can use a regex search with the =~ operator (x1 =~ x2 iirc). Commented Apr 4, 2017 at 13:59
  • Is there an exact pattern here? If it's the first two, you can just remove digits. But with the latter one, you have a bit of a mix. In the generic case, you can use a string similarity algorithm like Levenshtein or Jaro–Winkler distance. Could you clarify what is needed to consider two strings matching abstractly? Commented Apr 4, 2017 at 14:08
  • ndn - there will be hundreds of filenames in each array, the filenames in array1 will have the same beginning as the filenames in array2 but will contain some extra characters before the extension. I am only interested in a match if the first few characters of each filename are the same. Commented Apr 4, 2017 at 14:24
  • So mydogisdumb.doc and imstupid.doc shouldn't match then? Also the result that you gave us shows two elements from array1, not one from array1 and one from array2, what is up with that? Also can you define "a few"? Is two enough? Is one enough? Commented Apr 4, 2017 at 14:28
  • that's right mydogisdumb and imstupid should not match. i want to compare every element in array1 with every element in array2 and if there are any matches - say the first 7 characters of an element from array1 matches the first 7 characters of an element from array2 - i want the result to be the associated element from array1. Commented Apr 4, 2017 at 14:38

3 Answers

1
array1 = %w[budget2017.doc accounting2017.doc mydogisdumb.doc]
array2 = %w[budget.doc accounting.doc imstupid.doc] 

array1.select do |elem|
  array2.any? do |ee|
    s, e = ee.split('.')
    elem.start_with?(s) && elem.end_with?(e)
  end
end
#⇒ ["budget2017.doc", "accounting2017.doc"] 

Or, a bit more efficiently, splitting each element of array2 only once:

selectors = array2.map { |e| e.split('.') }
array1.select do |elem|
  selectors.any? do |(s, e)|
    elem.start_with?(s) && elem.end_with?(e)
  end
end
#⇒ ["budget2017.doc", "accounting2017.doc"] 
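
One caveat: split('.') assumes each name contains exactly one dot, so a name like "report.v2.doc" would yield "report" and "v2". A possible variant, offered only as a sketch, uses File.basename and File.extname from Ruby's core library to separate the stem from the extension. The names `stem`, `ext`, and `result` are illustrative.

```ruby
array1 = %w[budget2017.doc accounting2017.doc mydogisdumb.doc]
array2 = %w[budget.doc accounting.doc imstupid.doc]

# Pair each name in array2 with its extension, e.g. ["budget", ".doc"].
selectors = array2.map { |name| [File.basename(name, ".*"), File.extname(name)] }

# Keep the names from array1 that begin with a stem and share its extension.
result = array1.select do |elem|
  selectors.any? { |stem, ext| elem.start_with?(stem) && File.extname(elem) == ext }
end
```

With the sample data this selects the same two elements as the versions above.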

0

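As per the comments, this finds all elements of array1 that share their first 7 characters with an element of array2. Note that with the sample data only "accounting2017.doc" qualifies, because "budget" is just 6 characters long: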

array1.select do |element|
  array2.any? { |match_candidate| match_candidate.start_with? element[0...7] }
end
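
If the number of characters to compare needs to vary, the same idea can be wrapped in a method. This is only a sketch: `prefix_matches` and its parameters are hypothetical names, and it compares the first `length` characters of both names directly, which can differ slightly from the snippet above for names shorter than `length`. Collecting the prefixes in a Set avoids rescanning the second array for every element, which may help with hundreds of filenames.

```ruby
require "set"

# Returns the names in `candidates` whose first `length` characters equal
# the first `length` characters of some name in `references`.
def prefix_matches(candidates, references, length)
  prefixes = references.map { |name| name[0, length] }.to_set
  candidates.select { |name| prefixes.include?(name[0, length]) }
end

array1 = ["budget2017.doc", "accounting2017.doc", "mydogisdumb.doc"]
array2 = ["budget.doc", "accounting.doc", "imstupid.doc"]

seven = prefix_matches(array1, array2, 7) # "budget2" vs "budget." do not match
six   = prefix_matches(array1, array2, 6) # "budget" matches "budget"
```

With a length of 6 both "budget2017.doc" and "accounting2017.doc" are selected; with 7 only "accounting2017.doc" is.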

1 Comment

ndn - I will try that tonight. Thank you!
0
arr1 = ["budget2017.doc", "acc2017.doc", "acc.doc", "budget2016.doc", "foo.doc"] 
arr2 = ["budget.doc", "acc.doc", "foo.docx,", "goo.doc"] 

a2 = arr2.map { |s| s.split('.') }
  #=> [["budget", "doc"], ["acc", "doc"], ["foo", "docx,"], ["goo", "doc"]] 
arr1.select { |s1| a2.any? { |pfx, sfx| s1 =~ /\A#{Regexp.escape(pfx)}.*\.#{Regexp.escape(sfx)}\z/ } }
  #=> ["budget2017.doc", "acc2017.doc", "acc.doc", "budget2016.doc"]

1 Comment

start_with? and end_with? are proven to be way faster than regexp match.
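
One way to check this on your own data is Ruby's standard Benchmark module. The sketch below makes no claim about the outcome, which depends on the Ruby version and the inputs. The sample names, the `pairs` list, and the iteration count are all assumptions chosen for illustration, and the regular expressions are compiled once up front so that only matching is timed.

```ruby
require "benchmark"

names = %w[budget2017.doc acc2017.doc acc.doc budget2016.doc foo.doc]
pairs = [%w[budget doc], %w[acc doc], %w[goo doc]]

# Variant 1: plain string methods.
by_methods = lambda do
  names.select { |n| pairs.any? { |pfx, sfx| n.start_with?(pfx) && n.end_with?(".#{sfx}") } }
end

# Variant 2: precompiled regular expressions with escaped parts.
patterns = pairs.map { |pfx, sfx| /\A#{Regexp.escape(pfx)}.*\.#{Regexp.escape(sfx)}\z/ }
by_regexp = lambda do
  names.select { |n| patterns.any? { |re| re.match?(n) } }
end

iterations = 10_000
t_methods = Benchmark.realtime { iterations.times { by_methods.call } }
t_regexp  = Benchmark.realtime { iterations.times { by_regexp.call } }
puts format("start_with?/end_with?: %.4fs, regexp: %.4fs", t_methods, t_regexp)
```

Both variants select the same elements for this input, so the timings compare like with like.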
