14

Let's say I have the following array and I would like to get rid of contiguous duplicates:

arr = [1,1,1,4,4,4,3,3,3,3,5,5,5,1,1,1]

I would like to get the following:

=> [1,4,3,5,1]

It would be great if there's something simpler and more efficient than my solutions (or variants thereof):

(arr + [nil]).each_cons(2).collect { |i| i[0] != i[1] ? i[0] : nil }.compact

or

(arr + [nil]).each_cons(2).each_with_object([]) { 
   |i, memo| memo << i[0] unless i[0] == i[1] 
 }

EDIT: It looks like @ArupRakshit's solution below is very simple. I'm still looking for better efficiency than my solution.

EDIT:

I'll be benchmarking responses as they come:

require 'fruity'
arr = 10000.times.collect { [rand(5)] * (rand(4) + 2) }.flatten

compare do
  abdo { (arr + [nil]).each_cons(2).collect { 
    |i| i[0] != i[1] ? i[0] : nil }.compact 
  }
  abdo2 { 
          (arr + [nil]).each_cons(2).each_with_object([]) { 
           |i, memo| memo << i[0] unless i[0] == i[1] 
          }
  }
  arup { arr.chunk(&:to_i).map(&:first) }
  arupv2 { arr.join.squeeze.chars.map(&:to_i) }
  agis {
    i = 1
    a = [arr.first]

    while i < arr.size
      a << arr[i] if arr[i] != arr[i-1]
      i += 1
     end
    a
  }
  arupv3 { arr.each_with_object([]) { |el, a| a << el if a.last != el } }
end

Benchmark results:

agis is faster than arupv3 by 39.99999999999999% ± 10.0%
arupv3 is faster than abdo2 by 1.9x ± 0.1
abdo2 is faster than abdo by 10.000000000000009% ± 10.0%
abdo is faster than arup by 30.000000000000004% ± 10.0%
arup is faster than arupv2 by 30.000000000000004% ± 10.0%

If we use:

arr = 10000.times.collect { rand(4) + 1 } # less likelihood of repetition

We get:

agis is faster than arupv3 by 19.999999999999996% ± 10.0%
arupv3 is faster than abdo2 by 1.9x ± 0.1
abdo2 is similar to abdo
abdo is faster than arupv2 by 2.1x ± 0.1
arupv2 is similar to arup
20
  • 1
    "I'll be benchmarking responses as they come"... followed by acceptance of the first answer, ensuring responses won't come... Commented Feb 12, 2014 at 14:32
  • 1
    I'm sure people (especially Rubyists) don't stop posting after an answer has been accepted. Commented Feb 12, 2014 at 14:37
  • 1
    @MarkThomas I totally understand your point but at the same time, over the past few days, I've seen a bunch of guys (such as: aruprakshit, careswoveland, steenslag, matt) answering questions from a while ago and improving them just for because they enjoy to! Commented Feb 12, 2014 at 14:52
  • 1
    Why would you prefer a slightly faster solution over a clearer one? Is this code a demonstrated bottleneck in your application? Commented Feb 12, 2014 at 15:57
  • 2
    @WayneConrad I have accepted the clearer solution below as you can see =) Check my comments on Agis' response :-) Commented Feb 12, 2014 at 15:58

2 Answers 2

14

Do as below using Enumerable#chunk :

arr = [1,1,1,4,4,4,3,3,3,3,5,5,5,1,1,1]
arr.chunk { |e| e }.map(&:first)
# => [1, 4, 3, 5, 1]
# if you have only **Fixnum**, something magic
arr.chunk(&:to_i).map(&:first)
# => [1, 4, 3, 5, 1]

UPDATE

as per @abdo's comment, here is another choice :

arr.join.squeeze.chars.map(&:to_i)
# => [1, 4, 3, 5, 1]

another choice

arr.each_with_object([]) { |el, a| a << el if a.last != el }
Sign up to request clarification or add additional context in comments.

6 Comments

It looks like mine are faster :s .. I'll post benchmarks above
@Abdo Done.. Don't remove( from the original comment box) the linked comment of your in my answer. :)
@Abdo check again! :)
The join.squeeze method will only work for single digit numbers. The first solution is very nice: it is succinct, easy to understand, works with any comparable data type, and is almost certainly fast enough.
@WayneConrad I like the different (educational) approach taken by join.squeeze in case we're using characters vs numbers and am glad aruprakshit provided alternatives :-)
|
4

The less elegant yet most efficient solution:

require 'benchmark'

arr = [1,1,1,4,4,4,3,3,3,3,5,5,5,1,1,1]

GC.disable
Benchmark.bm do |x|
  x.report do
    1_000_000.times do
      i = 1
      a = [arr.first]

      while i < arr.size
        a << arr[i] if arr[i] != arr[i-1]
        i += 1
      end
    end
  end
end
#      user     system      total        real
# 1.890000   0.010000   1.900000 (  1.901702)

GC.enable; GC.start; GC.disable

Benchmark.bm do |x|
  x.report do
    1_000_000.times do
      (arr + [nil]).each_cons(2).collect { |i| i[0] != i[1] ? i[0] : nil }.compact
    end
  end
end
#      user     system      total        real
# 6.050000   0.680000   6.730000 (  6.738690)

3 Comments

Yup, this is at least 2x faster than my fastest solution. (My benchmarks are different because I'm using a larger array). Your solution, however, is buggy: can you try: agis([0,1,4,4,4,3,3,3,3,5,5,5,1]) => returning [0, 0, 1, 4, 3, 5, 1] expected: [0, 1, 4, 3, 5, 1] . Let me know when fixed so I can post alongside my benchmarks =)
I doubt, how many Ruby coder would use, such a loop for this kind of solution.. :)
@Agis, you get a +1. ArupRakshit 's last solution is very close in efficiency to your solution. His solution, however, is much easier on the eyes. My question required BOTH simplicity and efficiency.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.