67

[2, 6, 13, 99, 27].include?(2) works well for checking if the array includes one value. But what if I want to check if an array includes any one from a list of multiple values? Is there a shorter way than doing Array.include?(a) or Array.include?(b) or Array.include?(c) ...?

1
  • Did you know that index is faster than include?? Commented Sep 15, 2014 at 3:29

11 Answers

124

You could take the intersection of two arrays, and see if it's not empty:

([2, 6, 13, 99, 27] & [2, 6]).any?

5 Comments

At first, I thought this answer is neater. But later, I came to think CodeGnome's answer might be more efficient.
@sawa I would expect Array & Array to be faster for the kind of small cases we're talking about here (avoiding .include?(a) || .include?(b)), as (I assume) it is implemented in C, where CodeGnome's answer is pure Ruby. Of course if the arrays are large, & is going to be much slower to produce the complete intersection just to test it with a simple any?
@meagar Even on the smaller scale of the OP's question, array intersection takes around 30% longer on my CPU than any? with a code block. Benchmarked at 100_000 iterations on an Intel i5; your mileage may vary. However, I like this answer for its brevity.
@CodeGnome Technically the question did ask for "quicker", but I think they mean "quicker" as in "easier to write". Since we're talking about arrays small enough to feasibly hardcode .include?(a) || .include?(b) || ... I don't think performance really matters :p
@meagar I don't think the performance of a single iteration matters at all, but in the interests of science the MRI benchmark is available as a gist.
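
For anyone who wants to repeat the small-array comparison discussed in these comments, a minimal sketch using the standard Benchmark module might look like the following. The iteration count follows the comment above; the constant names are illustrative only, and timings depend on the machine and Ruby version, so none are shown here.

```ruby
require 'benchmark'

# Illustrative constants (not from the original posts)
HAYSTACK   = [2, 6, 13, 99, 27].freeze
NEEDLES    = [2, 6].freeze
ITERATIONS = 100_000

# Time array intersection followed by any?
intersection_time = Benchmark.realtime do
  ITERATIONS.times { (HAYSTACK & NEEDLES).any? }
end

# Time any? with a block that calls include?
block_time = Benchmark.realtime do
  ITERATIONS.times { HAYSTACK.any? { |i| NEEDLES.include?(i) } }
end

puts format('intersection: %.4fs', intersection_time)
puts format('any? + block: %.4fs', block_time)
```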
26

You can use the Enumerable#any? method with a code block to test for inclusion of multiple values. For example, to check for either 6 or 13:

[2, 6, 13, 99, 27].any? { |i| [6, 13].include? i }

6 Comments

If performance were important, I would prefer this over array intersection, but why not go the extra mile and replace [6, 13] with a set containing those values? include? would still apply.
@CarySwoveland The OP is looking for any of the numbers to match, not all of them. Array#any? is correct if you're looking to iterate an OR condition.
@CarySwoveland Go ahead and benchmark it. I can't see how require 'set'; [2, 6, 13, 99, 27].to_set.disjoint? [6, 13].to_set can possibly be faster. In fact, it seems to be about 11x slower. If you can tune it to be faster than Enumerable#any?, I'd be very interested.
The [6,13] array is created 5 times this way. 5.times{p [6,13].object_id}
@steenslag Absolutely. All versions (except a single pass) could get a speed boost by moving the literal out of the block if they're looping, but we're benchmarking iterations of what we do in a single pass to make the cost of that code more visible. Within a normal loop, though, you're absolutely right: a single assignment outside the loop would be cheaper.
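
The point about moving the literal out of the block can be sketched as follows. The helper name and the `wanted` variable are hypothetical; the idea is simply that the list of candidate values is built once by the caller rather than written as a literal inside the block.

```ruby
# Hypothetical helper: the candidate list is passed in, so it is not
# written as a fresh literal inside the block on every iteration.
def any_match?(values, wanted)
  values.any? { |i| wanted.include?(i) }
end

wanted = [6, 13].freeze   # built a single time, outside the block
puts any_match?([2, 6, 13, 99, 27], wanted)   # true
puts any_match?([1, 3, 5], wanted)            # false
```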
22

I was interested in seeing how these various approaches compared in performance, not so much for the problem at hand, but more for general comparisons of array vs set intersection, array vs set include? and include? vs index for arrays. I will edit to add other methods that are suggested, and let me know if you'd like to see different benchmark parameters.

I for one would like to see more benchmarking of SO answers done. It's not difficult or time-consuming, and it can provide useful insights. I find most of the time is preparing the test cases. Notice I've put the methods to be tested in a module, so if another method is to be benchmarked, I need only add that method to the module.

Methods compared

module Methods
  require 'set'
  def august(a,b)    (a&b).any? end
  def gnome_inc(a,b) a.any? { |i| b.include? i } end
  def gnome_ndx(a,b) a.any? { |i| b.index i } end
  def gnome_set(a,b) bs=b.to_set; a.any? { |i| bs.include? i } end
  def vii_stud(a,b)  as, bs = Set.new(a), Set.new(b); as.intersect?(bs) end
end

include Methods
@methods = Methods.instance_methods(false)
  #=> [:august, :gnome_inc, :gnome_ndx, :gnome_set, :vii_stud]

Test data

def test_data(n,m,c,r)
  # n: nbr of elements in a
  # m: nbr of elements in b
  # c: nbr of elements common to a & b
  # r: repetitions
  r.times.each_with_object([]) { |_,a|
    a << [n.times.to_a.shuffle, [*(n-c..n-c-1+m)].shuffle] }
end

d = test_data(10,4,2,2)
  #=> [[[7, 8, 0, 3, 2, 9, 1, 6, 5, 4], [11, 10,  9, 8]], 
  #    [[2, 6, 3, 4, 7, 8, 0, 9, 1, 5], [ 9, 11, 10, 8]]]
# Before `shuffle`, each of the two elements is:
  #=> [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [8, 9, 10, 11]] 

def compute(d, m)
  d.each_with_object([]) { |(a,b),arr| arr << send(m, a, b) }
end  

compute(d, :august)
 #=> [true, true]

Confirm methods return the same values

d = test_data(1000,100,10,3)
r0 = compute(d, @methods.first) 
puts @methods[1..-1].all? { |m| r0 == compute(d, m) }
  #=> true

Benchmark code

require 'benchmark'

@indent = @methods.map { |m| m.to_s.size }.max

def test(n, m, c, r, msg)
  puts "\n#{msg}"
  puts "n = #{n}, m = #{m}, overlap = #{c}, reps = #{r}"
  d = test_data(n, m, c, r)
  Benchmark.bm(@indent) do |bm|
    @methods.each do |m|
      bm.report m.to_s do
        compute(d, m)
      end
    end
  end
end

Tests

n = 100_000
m = 1000
test(n, m,    0,  1, "Zero overlap")
test(n, m, 1000,  1, "Complete overlap")
test(n, m,    1, 20, "Overlap of 1")
test(n, m,    5, 20, "Overlap of 5")
test(n, m,   10, 20, "Overlap of 10")
test(n, m,   20, 20, "Overlap of 20")
test(n, m,   50, 20, "Overlap of 50")
test(n, m,  100, 20, "Overlap of 100")

Zero overlap
n = 100000, m = 1000, overlap = 0, reps = 1
                                 user     system      total        real
august                       0.010000   0.000000   0.010000 (  0.005491)
gnome_inc                    4.480000   0.010000   4.490000 (  4.500531)
gnome_ndx                    0.810000   0.000000   0.810000 (  0.822412)
gnome_set                    0.030000   0.000000   0.030000 (  0.031668)
vii_stud                     0.080000   0.010000   0.090000 (  0.084283)

Complete overlap
n = 100000, m = 1000, overlap = 1000, reps = 1
                                 user     system      total        real
august                       0.000000   0.000000   0.000000 (  0.005841)
gnome_inc                    0.010000   0.000000   0.010000 (  0.002521)
gnome_ndx                    0.000000   0.000000   0.000000 (  0.000350)
gnome_set                    0.000000   0.000000   0.000000 (  0.000655)
vii_stud                     0.090000   0.000000   0.090000 (  0.097850)

Overlap of 1
n = 100000, m = 1000, overlap = 1, reps = 20
                                 user     system      total        real
august                       0.110000   0.000000   0.110000 (  0.116276)
gnome_inc                   61.790000   0.100000  61.890000 ( 62.058320)
gnome_ndx                   10.100000   0.020000  10.120000 ( 10.144649)
gnome_set                    0.360000   0.000000   0.360000 (  0.357878)
vii_stud                     1.450000   0.050000   1.500000 (  1.501705)

Overlap of 5
n = 100000, m = 1000, overlap = 5, reps = 20
                                 user     system      total        real
august                       0.110000   0.000000   0.110000 (  0.113747)
gnome_inc                   16.550000   0.050000  16.600000 ( 16.728505)
gnome_ndx                    2.470000   0.000000   2.470000 (  2.475111)
gnome_set                    0.100000   0.000000   0.100000 (  0.099874)
vii_stud                     1.630000   0.060000   1.690000 (  1.703650)

Overlap of 10
n = 100000, m = 1000, overlap = 10, reps = 20
                                 user     system      total        real
august                       0.110000   0.000000   0.110000 (  0.112674)
gnome_inc                   10.090000   0.020000  10.110000 ( 10.131339)
gnome_ndx                    1.470000   0.000000   1.470000 (  1.478400)
gnome_set                    0.060000   0.000000   0.060000 (  0.062762)
vii_stud                     1.430000   0.050000   1.480000 (  1.476961)

Overlap of 20
n = 100000, m = 1000, overlap = 20, reps = 20
                                 user     system      total        real
august                       0.100000   0.000000   0.100000 (  0.108350)
gnome_inc                    4.020000   0.000000   4.020000 (  4.026290)
gnome_ndx                    0.660000   0.010000   0.670000 (  0.663001)
gnome_set                    0.030000   0.000000   0.030000 (  0.024606)
vii_stud                     1.380000   0.050000   1.430000 (  1.437340)

Overlap of 50
n = 100000, m = 1000, overlap = 50, reps = 20
                                 user     system      total        real
august                       0.120000   0.000000   0.120000 (  0.121278)
gnome_inc                    2.170000   0.000000   2.170000 (  2.236737)
gnome_ndx                    0.310000   0.000000   0.310000 (  0.308336)
gnome_set                    0.020000   0.000000   0.020000 (  0.015326)
vii_stud                     1.220000   0.040000   1.260000 (  1.259828)

Overlap of 100
n = 100000, m = 1000, overlap = 100, reps = 20
                                 user     system      total        real
august                       0.110000   0.000000   0.110000 (  0.112739)
gnome_inc                    0.720000   0.000000   0.720000 (  0.712265)
gnome_ndx                    0.100000   0.000000   0.100000 (  0.105420)
gnome_set                    0.010000   0.000000   0.010000 (  0.009398)
vii_stud                     1.400000   0.050000   1.450000 (  1.447110)

16

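Simple way (note that this checks whether all of the listed values are present, not whether any one of them is):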

([2, 6] - [2, 6, 13, 99, 27]).empty?

4

I extend Array with these:

class Array

  def include_exactly?(values)
    self.include_all?(values) && (self.length == values.length)
  end
  def include_any?(values)
    values.any? {|value| self.include?(value)}
  end
  def include_all?(values)
    values.all? {|value| self.include?(value)}
  end
  # Note: exclude? is provided by ActiveSupport, not core Ruby.
  def exclude_all?(values)
    values.all? {|value| self.exclude?(value)}
  end

end

1 Comment

You could also write: values.all? { self.include?(it) }
2
require 'set'

master = Set.new [2, 6, 13, 99, 27]
data = Set.new [27, -3, -4]
#puts data.subset?(master) ? 'yes' : 'no'  #per @meager comment
puts data.intersect?(master) ? 'yes' : 'no'

--output:--
yes

5 Comments

Your inputs don't match the question. He wants to know if any items from one set are contained in another, not if all items are contained.
@meager Well, it just so happens sets have an intersect method, too.
When I saw require 'set' I thought you were going to do something different, namely, convert the smaller of the two arrays to a set, then step through the larger array looking for an element in the set (and quitting if/when one were found). Wouldn't that tend to be faster than taking the intersection of two sets (which I would expect is effectively how the intersection of two arrays is implemented)?
@CarySwoveland intersect? will almost certainly do the same thing. It doesn't have to produce the entire intersection, it can return true as soon as any intersection is found. intersect would have to produce the entire intersection, but intersect? just returns boolean true/false.
@meagar, yes, but to use intersect? there is the overhead of first converting both arrays to sets, whereas I'm suggesting that only the smaller of the two be converted. All of this is moot, of course, if the arrays are not huge.
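
The approach suggested in these comments (convert only the smaller array to a set, then scan the larger one and stop at the first hit) could be sketched roughly as below. The method name is hypothetical and this is not taken from any of the answers.

```ruby
require 'set'

# Hypothetical helper: convert only the smaller array to a Set, then scan
# the larger one, stopping at the first element found in the set.
def any_common_element?(a, b)
  small, large = a.size <= b.size ? [a, b] : [b, a]
  lookup = small.to_set
  large.any? { |element| lookup.include?(element) }
end

puts any_common_element?([2, 6, 13, 99, 27], [27, -3, -4])  # true
puts any_common_element?([2, 6, 13, 99, 27], [3, 4])        # false
```

Set lookups compare with hash and eql?, so values such as 1 and 1.0 are treated as different, unlike Array#include?, which compares with ==.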
2

One of my favourite ways of doing that in specs is to convert both arrays to a Set and check them with the #superset? and #subset? methods.

For example:

[1, 2, 3, 4, 5].to_set.superset?([1, 2, 3].to_set) # true
[1, 2, 3].to_set.subset?([1, 2, 3, 4, 5].to_set)   # true
[1, 2].to_set.subset?([1, 2].to_set)               # true
[1, 2].to_set.superset?([1, 2].to_set)             # true

However, being a set means that all values in a collection are unique, so it may not always be appropriate:

[1, 1, 1, 1, 1].to_set.subset? [1, 2].to_set       # true
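
If duplicates matter, one possible alternative is to compare element counts instead of converting to sets. The sketch below assumes Ruby 2.7 or later for Enumerable#tally, and the method name is hypothetical.

```ruby
# Hypothetical helper: true when every element of `sub` occurs in `sup`
# at least as many times as it occurs in `sub`.
def multiset_subset?(sub, sup)
  available = sup.tally
  sub.tally.all? { |value, count| available.fetch(value, 0) >= count }
end

puts multiset_subset?([1, 1, 1, 1, 1], [1, 2])  # false
puts multiset_subset?([1, 2], [1, 1, 2])        # true
```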

To avoid calling .to_set every time I usually define a matcher for that:

it 'returns array of "shown" proposals' do
  expect(body_parsed.first.keys).to be_subset_of(hidden_prop_attrs)
end

In my humble opinion, being a superset or a subset is just more readable than doing:

([1, 2, 3] & [1, 2]).any?

However, converting an array to a set may be less performant. Tradeoffs ¯\_(ツ)_/¯

2

If you want to check that two elements are present in the array (many? is provided by ActiveSupport, not core Ruby):

2.4.1 :221 >   ([2, 6, 13, 99, 27] & [2, 6]).many?
 => true

1 Comment

Agreed, but this only works as an answer when checking for exactly 2 elements. If someone needs to check more than 2 elements, it fails: ([2, 6, 13, 99, 27] & [6, 2, 3]).many? => true
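
To handle more than two values, as the comment points out, one option is to compare the size of the intersection with the number of matches required. The method name is hypothetical, and the sketch uses only core Ruby (no ActiveSupport).

```ruby
# Hypothetical helper: true when at least `minimum` distinct values from
# `wanted` appear in `values`. Array#& returns unique elements only.
def include_at_least?(values, wanted, minimum)
  (values & wanted).size >= minimum
end

arr = [2, 6, 13, 99, 27]
puts include_at_least?(arr, [2, 6], 2)     # true
puts include_at_least?(arr, [6, 2, 3], 3)  # false, 3 is missing
puts include_at_least?(arr, [6, 2, 3], 2)  # true
```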
1

Ruby 3.1 introduced Array#intersect?:

[2, 6, 13, 99, 27].intersect?([2, 3])
=> true

[2, 6, 13, 99, 27].intersect?([3, 4])
=> false

If you want the overlapping elements, use Array#intersection:

[2, 6, 13, 99, 27].intersection([2, 3])
=> [2]

Array#intersection can even be used with multiple arrays (whereas Array#intersect? cannot):

a = [1, 2, 3, 5, 8, 13]
b = [1, 3, 6, 9]
c = [1, 3, 9, 27]

a.intersection(b, c)
=> [1, 3]
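
For code that also has to run on Rubies older than 3.1, a small fallback along these lines is one option. The helper name is hypothetical; it uses Array#intersect? where available and otherwise checks for a non-empty intersection.

```ruby
# Hypothetical helper: use Array#intersect? where it exists (Ruby 3.1+),
# otherwise fall back to checking for a non-empty intersection.
def arrays_intersect?(a, b)
  if a.respond_to?(:intersect?)
    a.intersect?(b)
  else
    !(a & b).empty?
  end
end

puts arrays_intersect?([2, 6, 13, 99, 27], [2, 3])  # true
puts arrays_intersect?([2, 6, 13, 99, 27], [3, 4])  # false
```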

0

This works if any of the values match:

arr = [2, 6, 13, 99, 27]
if (arr - [2, 6]).size < arr.size
 puts 'element match found'
else
 puts 'element not found'
end

0

I extend Array with these:

class Array

  def include_any?(arr)
    (self & Array(arr)).any?
  end

end
