22

I've been thinking about a following problem - there are two arrays, and I need to find elements not common for them both, for example:

a = [1,2,3,4]
b = [1,2,4]

And the expected answer is [3].

So far I've been doing it like this:

a.select { |elem| !b.include?(elem) }

But it gives me O(N ** 2) time complexity. I'm sure it can be done faster ;)

Also, I've been thinking about getting it somehow like this (using some method opposite to & which gives common elements of 2 arrays):

a !& b  #=> doesn't work of course

Another way might be to add two arrays and find the unique element with some method similar to uniq, so that:

[1,1,2,2,3,4,4].some_method #=> would return 3
7
  • 3
    (a-b) | (b-a) # => [3] See ruby-doc.org/core-1.9.3/Array.html#method-i-2D and note that it's not commutative, i.e. in general a-b != b-a Commented Nov 25, 2013 at 22:55
  • 1
    That should be: (a-b) | (b-a) Commented Nov 25, 2013 at 22:56
  • @ShawnBalestracci Right you are. I had even written it correctly in my test console, but rewrote it wrong. Commented Nov 25, 2013 at 22:57
  • @iamnotmaynard: isn't that O(N ** 2) as well? Commented Nov 25, 2013 at 22:59
  • 1
    Or sort the arrays (which should be O(n log n)), then iterate through the two, adding elements which are in only one array to a result array (O(n)). Commented Nov 25, 2013 at 23:10

4 Answers 4

32

The simplest (in terms of using only the arrays already in place and stock array methods, anyway) solution is the union of the differences:

a = [1,2,3,4]
b = [1,2,4]
(a-b) | (b-a)
=> [3]

This may or may not be better than O(n**2). There are other options which are likely to give better peformance (see other answers/comments).

Edit: Here's a quick-ish implementation of the sort-and-iterate approach (this assumes no array has repeated elements; otherwise it will need to be modified depending on what behavior is wanted in that case). If anyone can come up with a shorter way to do it, I'd be interested. The limiting factor is the sort used. I assume Ruby uses some sort of Quicksort, so complexity averages O(n log n) with possible worst-case of O(n**2); if the arrays are already sorted, then of course the two calls to sort can be removed and it will run in O(n).

def diff a, b
  a = a.sort
  b = b.sort
  result = []
  bi = 0
  ai = 0
  while (ai < a.size && bi < b.size)
    if a[ai] == b[bi]
      ai += 1
      bi += 1
    elsif a[ai]<b[bi]
      result << a[ai]
      ai += 1
    else
      result << b[bi]
      bi += 1
    end
  end
  result += a[ai, a.size-ai] if ai<a.size
  result += b[bi, b.size-bi] if bi<b.size
  result
end
Sign up to request clarification or add additional context in comments.

1 Comment

Union of the difference! Thanks for that! Using underscore: result = _.union(_.difference(a, b), _.difference(b, a));
26

As @iamnotmaynard noted in the comments, this is traditionally a set operation (called the symmetric difference). Ruby's Set class includes this operation, so the most idiomatic way to express it would be with a Set:

Set.new(a) ^ b

That should give O(n) performance (since a set membership test is constant-time).

Comments

11
a = [1, 2, 3]
b = [2, 3, 4]
a + b - (a & b)
# => [1, 4]

2 Comments

Which one is preferrable? This or the accepted solution?
this solution is better from garbage collection point of view (you can see less value for total_allocated_object with disabled GC).
1

The solution for Array divergences is like:

a = [1, 2, 3]
b = [2, 3, 4]
(a - b) | (b - a)
# => [1, 4]

You can also read my blog post about Array coherences

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.