1

I would like to (quickly) determine if one array contains all the elements of another array, taking into account that the arrays may have repeated elements.

Thus, I tried something like this:

alice = %w(a a a b)
bob = %w(a a b c d e)
alice & bob => ["a", "b"]
alice - bob => []

But what I would like is an operator that will let me determine that bob does not include all the elements of alice, because bob doesn't have enough "a" characters.

4
  • 2
    What have you tried? Commented Jun 15, 2012 at 16:18
  • All the core methods don't account for duplicates, so you'll probably have to write your own. Commented Jun 15, 2012 at 16:24
  • I'm new to Ruby and the core/std library is pretty extensive, so I wanted to see if someone had a simple answer before I go writing anything unnecessary. Commented Jun 15, 2012 at 17:17
  • BTW, I tried stuff like alice.each { |x| bob.delete(x) { return false } } but the first .delete(x) removes all the occurrences of x. This led to looping comparisons which look really un-ruby-like, so I figured there was probably a simple operator already written. Apparently not. Commented Jun 15, 2012 at 17:26

4 Answers 4

3
alice.select{|x| alice.count(x) > bob.count(x)}

Update Setting up a benchmark:

require 'benchmark'

def subset_multivalue?(a, b)
  bb = b.clone
  a.each do |e|
    i = bb.index(e)
    if i
      bb.delete_at(i)
    else
      return false
    end
  end
  return true
end

def subset_multivalue2?(a, b)
  a.find{|x| a.count(x) > b.count(x)}
end

def subset_multivalue3?(alice, bob)
  alice_counts = alice.each_with_object(Hash.new(0)) { |o, h| h[o] += 1 }
  bob_counts = bob.each_with_object(Hash.new(0)) { |o, h| h[o] += 1 }

  alice_counts.all? do |k, v|
    bob_counts.has_key?(k) && bob_counts[k] >= v
  end
end

alice = %w(a a a b)
bob = %w(a a b c d e)

Benchmark.bm do |x|
  x.report("dave:") do
    1000000.times do
      subset_multivalue?(alice, bob)
    end
  end

  x.report("me:") do
    1000000.times do
      subset_multivalue2?(alice,bob)
    end
  end

  x.report("andrew:") do
    1000000.times do
      subset_multivalue3?(alice,bob)
    end
  end
end

Results:

       user     system      total        real
dave: 15.054000   0.000000  15.054000 ( 15.108864)
me: 11.529000   0.031000  11.560000 ( 11.689669)
andrew: 65.036000   0.047000  65.083000 ( 67.463859)
Sign up to request clarification or add additional context in comments.

6 Comments

Wow, so simple. But it was more than 2x slower than my version; I tried a couple variations, it looks like count(Array) is just godawful slow.
Well, yours stops after the first occurrence. If that's what you want change my 'select' to 'find'.
That's the first variation I tried. It helps, but count(Array) is still the culprit.
This, unfortunately, has a rather poor O(n^2) time complexity.
@Dave, if the speed that much of an issue then maybe ruby is not the best language for your project. I updated to show a benchmark, mine actually does ok.
|
3

Probably easiest to count how often each element occurs so we don't have to worry too much about keeping track of duplicates we have/haven't counted:

alice_counts = alice.each_with_object(Hash.new(0)) { |o, h| h[o] += 1 }
#=> {"a"=>3, "b"=>1}

bob_counts = bob.each_with_object(Hash.new(0)) { |o, h| h[o] += 1 }
#=> {"a"=>2, "b"=>1, "c"=>1, "d"=>1, "e"=>1}

Then check that each key in alice_counts has a value equal or greater in bob_counts:

alice_counts.all? { |k, v| bob_counts[k] >= v }
#=> false

6 Comments

As someone who is learning Ruby, and who has limited experience with functional languages, I am continuously amazed at the power of the language in such a succinct amount of code. (+1)
This solution will check for exact correspondence; sort of the same as the == operator. I extended the array class with this solution (called it aaa). alice.aaa bob #=> false as expected and bob.aaa bob #=> true as expected. However, bob.dup.send(:+, [bob.last]).aaa bob #=> false when the new array still has all elements of bob. I duplicated bob to emphasize they are different objects. May be that I read the question wrong and this is the functionality required but I wanted to point this out.
Shockingly, if you reverse the objects in comparison, the test passes. So bob.aaa bob.dup.send :+, [bob.last] #=> true. Interesting
@Yasky a = bob + [bob.last]; a.aaa(bob) should be false, since a has more than bob, thus bob cannot contain all of a; similarly, alice has more than bob so alice.aaa(bob) is false. The way you did it the method name should be something like a.subset_of?(b).
Isn't bob_counts.has_key?(k) check superfluous? bob_counts[k] where k is not a key will be 0 anyway, as you set it as a default value of bob_counts, and the only way 0 >= v could be true is if alice_counts[k] == 0, which would also have to be a default, and so alice_count.all? would not yield it.
|
0

I think I'm settling on this:

def subset_multivalue?(a, b)
    bb = b.clone
    a.each do |e|
            i = bb.index(e)
            if i
                    bb.delete_at(i)
            else
                    return false
            end
    end
    return true
end

I realize it's not very rubyish, but it seems to do the job.

Comments

0

Enumerable#group_by() will be a choice.

alice = %w(a a a b)
bob = %w(a a b c d e)
alice_group = alice.group_by{|a| a}.map{|k,v| [k ,v.length]}
#=> [["a", 3], ["b", 1]]
bob_group = bob.group_by{|a| a}.map{|k,v| [k ,v.length]}
#=> [["a", 2], ["b", 1], ["c", 1], ["d", 1], ["e", 1]]
alice_group-bob_group
#=> [["a", 3]]
bob_group-alice_group
#=>[["a", 2], ["c", 1], ["d", 1], ["e", 1]]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.