3

Given an array of arrays [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]

What is the simplest way to merge the array items that contain members that are shared by any two or more arrays items. For example the above should be [["A", "B", "C", "D","E", "F"], ["G"]] since "B" and "C" are shared by the first and second array items.

Here are some more test cases.

[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
=> [["A", "B", "C", "D", "E", "F", "G"]]

[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
=> [["A", "B", "C", "D", "E", "F"], ["G", "H,"]]
6
  • Shouldn't the merged result be [["B", "C"], ["E", "F", "A", "D", "G"]]?? Commented Nov 4, 2009 at 14:11
  • No! the 'merged' result should be [["A", "B", "C", "D","E", "F"], ["G"]] Commented Nov 4, 2009 at 14:18
  • What would your desired output be for [["B", "C", "E", "F"], ["A","B","C","D"], ["F", "G"]]? Commented Nov 4, 2009 at 14:50
  • 1
    @George: in your example given in the comment, why is G not included? Commented Nov 4, 2009 at 16:25
  • 1
    Oops sorry it should have been [["A","B","C","D","E","F","G"]]. Thanks for the correction. The idea is that i want only sets which do not share members. If a set shares a member with another one, then 'merge' the sets. thus for example [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]] becomes [["A", "B", "C", "D","E", "F"], ["G"]] . Please note that "G" was not shared as a member by any other set. Thank you for the comments and discussion. Commented Nov 4, 2009 at 17:14

8 Answers 8

2

Here is my quick version which can be optimized I am sure :)

# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]

array.collect! do |e|
  t = e
  e.each do |f|
    array.each do |a|
      if a.index(f)
        t = t | a
      end
    end
  end
  e = t.sort
end

p array.uniq
Sign up to request clarification or add additional context in comments.

Comments

1

Edit: Martin DeMello code was fixed.

When running Martin DeMello code (the accepted answer) I get:

[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]] =>
[["B", "C", "E", "F", "A", "D", "G"], ["A", "B", "C", "D"], ["F", "G"]]
and
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]] =>
[["B", "C", "E", "F", "A", "D"], ["A", "B", "C", "D"], ["G", "H"], ["G", "H"]]

which does not seem to meet your spec.

Here is my approach using a few of his ideas:

a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
b = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]

def reduce(array)
  h = Hash.new {|h,k| h[k] = []}
  array.each_with_index do |x, i| 
    x.each do |j|
      h[j] << i
      if h[j].size > 1
        # merge the two sub arrays
        array[h[j][0]].replace((array[h[j][0]] | array[h[j][1]]).sort)
        array.delete_at(h[j][1])
        return reduce(array)
        # recurse until nothing needs to be merged
      end
    end
  end
  array
end

puts reduce(a).to_s #[["A", "B", "C", "D", "E", "F", "G"]]
puts reduce(b).to_s #[["A", "B", "C", "D", "E", "F"], ["G", "H"]]

Comments

1

Different algorithm, with a merge-as-you-go approach rather than taking two passes over the array (vaguely influenced by the union-find algorithm). Thanks for a fun problem :)

A = [["A", "G"],["B", "C", "E", "F"], ["A", "B", "C", "D"], ["B"], ["H", "I"]]
H = {}
B = (0...(A.length)).to_a

def merge(i,j)
  A[j].each do |e|
    if H[e] and H[e] != j
      merge(i, H[e])
    else
      H[e] = i
    end
  end

  A[i] |= A[j]
  B[j] = i
end

A.each_with_index do |x, i| 
  min = A.length
  x.each do |j| 
    if H[j]
      merge(H[j], i)
    else
      H[j] = i
    end
  end
end

out = B.sort.uniq.map {|i| A[i]}
p out

Comments

0

Not the simplest ,may be the longest :)

l = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
puts l.flatten.inject([[],[]])  {|r,e| if l.inject(0) {|c,a| if a.include?(e) then c+1 else c end} >= 2 then r[0] << e ; r[0].uniq! else r[1] << e end ; r}.inspect
#[["B", "C"], ["E", "F", "A", "D", "G"]]

1 Comment

Doesn't work for test case [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
0
 l = [["B", "C", "E", "F"], ["A", "B","C", "D"], ["G"]] 
 p l.inject([]){|r,e|
     r.select{|i|i&e!=[]}==[]&&(r+=[e])||(r=r.map{|i|(i&e)!=nil&&(i|e).sort||i})
 }

im not sure about your cond.

1 Comment

Doesn't work for test case [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
0

The simplest way to do it would be to take the powerset of an array (a set containing every possible combination of elements of the array), throw out any of the resulting sets if they don't have a common element, flatten the remaining sets and discard subsets and duplicates.

Or at least it would be if Ruby had proper Set support. Actually doing this in Ruby is horribly inefficient and an awful kludge:

power_set = array.inject([[]]){|c,y|r=[];c.each{|i|r<<i;r<<i+[y]};r}.reject{|x| x.empty?}
collected_powerset = power_set.collect{|subset| subset.flatten.uniq.sort unless
  subset.inject(subset.last){|acc,a| acc & a}.empty?}.uniq.compact

collected_powerset.reject{|x| collected_powerset.any?{|c| (c & x) == x && x.length < c.length}}

Power set operation comes from here.

2 Comments

that's pretty ugly, though - if you have m sets with a total of n elements, powerset is O(2^m), and then seeing if a set of sets has a common element is O(n), flattening is O(n), discarding subsets is O(m.n^2) at least, and discarding duplicates O(mn)
I never said it was efficient. I just thought it was the simplest algorithm to explain
0

Straightforward rather than clever. It's destructive of the original array. The basic idea is:

  • go down the list of arrays, noting which array each element appears in
  • for every entry in this index list that shows an element in more than one array, merge all those arrays into the lowest-indexed array
  • when merging two arrays, replace the lower-indexed array with the merged result, and the higher-indexed array with a pointer to the lower-indexed array.

It's "algorithmically cheaper" than intersecting every pair of arrays, though the actual running speed will depend on what ruby hands over to the C layer.

a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]

h = Hash.new {|h,k| h[k] = []}
a.each_with_index {|x, i| x.each {|j| h[j] << i}}
b = (0...(a.length)).to_a
h.each_value do |x| 
  x = x.sort_by {|i| b[i]}
  if x.length > 1
    x[1..-1].each do |i| 
      b[i] = [b[i], b[x[0]]].min
      a[b[i]] |= a[i]
    end 
  end
end

a = b.sort.uniq.map {|i| a[i]}

4 Comments

This throws an undefined method error: "undefined method `|' for "G":String"
EmFi: ah - i see, it was a scoping issue that was fixed in 1.9 (the |a,i| in each_with_index splatted the original a in 1.8). fixing the code.
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]] => [["B", "C", "E", "F", "A", "D", "G"], ["A", "B", "C", "D"], ["F", "G"]] which does not meet is spec outlined above. This should be [["A", "B", "C", "D", "E", "F", "G"]].
oops, missed a line while pasting in. fixed now
0
def merge_intersecting(input, result=[])
  head = input.first
  tail = input[1..-1]
  return result if tail.empty?
  intersection = tail.select { |arr| !(head & arr).empty? }
  unless intersection.empty?
    merged = head | intersection.flatten
    result << merged.sort
  end
  merge_intersecting(tail, result)
end


require 'minitest/spec'
require 'minitest/autorun'

describe "" do
  it "merges input array" do
    input  = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
    output = [["A", "B", "C", "D", "E", "F", "G"]]
    merge_intersecting(input).must_equal output
  end
  it "merges input array" do
    input  = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
    output = [["A", "B", "C", "D", "E", "F"], ["G", "H"]]
    merge_intersecting(input).must_equal output
  end
end

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.