arr = [1,2,1,3,5,2,4]
How can I count the array by group value with sorting? I need the following output:
x[1] = 2
x[2] = 2
x[3] = 1
x[4] = 1
x[5] = 1
x = arr.inject(Hash.new(0)) { |h, e| h[e] += 1 ; h }
inject "injects" an accumulator into an Enumerable, which in our case is a Hash with a default value of 0. On every iteration, we add one to the value stored under the key of the current element (e). The trailing h makes the block return the accumulator, which inject passes on to the next iteration. ruby-doc.org/core/classes/Enumerable.html#M001494
each_with_object over inject when building hashes versus arithmetic. See @sawa's answer below.
There is a short version which is in ruby 2.7 => Enumerable#tally.
[1,2,1,3,5,2,4].tally #=> { 1=>2, 2=>2, 3=>1, 5=>1, 4=>1 }
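The question also asks for sorted output, and tally returns keys in first-seen order rather than sorted order. A minimal sketch, assuming Ruby 2.7+ and that "sorted" means ascending by key:

```ruby
arr = [1, 2, 1, 3, 5, 2, 4]

# tally counts occurrences; keys appear in first-seen order.
counts = arr.tally

# Hash#sort yields [key, count] pairs ordered by key;
# to_h rebuilds a hash that keeps that order.
x = counts.sort.to_h

x.each { |value, count| puts "x[#{value}] = #{count}" }
```

This prints the five lines from the question, x[1] = 2 through x[5] = 1.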
tally doesn't accept a block in 2.7: docs.ruby-lang.org/en/2.7.0/Enumerable.html#method-i-tally
Only available under ruby 1.9
Basically the same as Michael's answer, but a slightly shorter way:
x = arr.each_with_object(Hash.new(0)) {|e, h| h[e] += 1}
In similar situations:
When the starting element is a mutable object such as Array, Hash, or String, you can use each_with_object, as in the case above.
When the starting element is an immutable object such as Numeric, you have to use inject as below.
sum = (1..10).inject(0) {|sum, n| sum + n} # => 55
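To illustrate the mutable/immutable distinction, here is a small sketch (the variable names broken, total and list are illustrative only). each_with_object always returns the object it was given, so reassigning the block parameter never reaches the caller:

```ruby
# each_with_object returns the object passed in; acc += n only rebinds
# the block-local variable, so the original 0 comes back unchanged.
broken = (1..10).each_with_object(0) { |n, acc| acc += n }

# inject uses the block's return value as the next accumulator.
total = (1..10).inject(0) { |acc, n| acc + n }

# A mutable accumulator works because it is modified in place.
list = (1..3).each_with_object([]) { |n, acc| acc << n * 2 }
```

Here broken is 0, total is 55, and list is [2, 4, 6].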
each_with_object has been added to avoid h[e] += 1 ; h
arr.group_by(&:itself).transform_values(&:size)
#=> {1=>2, 2=>2, 3=>1, 5=>1, 4=>1}
Yet another - similar to others - approach:
result=Hash[arr.group_by{|x|x}.map{|k,v| [k,v.size]}]
result[1]=2 ....
Whenever you find someone asserting that something is the fastest for this type of primitive routine, I always find it interesting to confirm that, because without confirmation most of us are really just guessing. So I took all of the methods here and benchmarked them.
I took an array of 120 links I extracted from a web page that I needed to group by count and implemented all of these using a seconds = Benchmark.realtime do loop and got all the times.
Assume links is the name of the array I need to count:
#0.00077
seconds = Benchmark.realtime do
counted_links = {}
links.each { |e| counted_links[e] = links.count(e) if counted_links[e].nil?}
end
seconds
#0.000232
seconds = Benchmark.realtime do
counted_links = {}
links.sort.group_by {|x|x}.each{|x,y| counted_links[x] = y.size}
end
#0.00076
seconds = Benchmark.realtime do
Hash[links.uniq.map{ |i| [i, links.count(i)] }]
end
#0.000107
seconds = Benchmark.realtime do
links.inject(Hash.new(0)) {|h, v| h[v] += 1; h}
end
#0.000109
seconds = Benchmark.realtime do
links.each_with_object(Hash.new(0)) {|e, h| h[e] += 1}
end
#0.000143
seconds = Benchmark.realtime do
links.inject(Hash.new(0)) { |h, e| h[e] += 1 ; h }
end
And then a little bit of ruby to figure out the answer:
times = [0.00077, 0.000232, 0.00076, 0.000107, 0.000109, 0.000143].min
==> 0.000107
So the actual fastest method, ymmv of course, is:
links.inject(Hash.new(0)) {|h, v| h[v] += 1; h}
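A single Benchmark.realtime call over 120 elements is dominated by noise (the two inject runs above are the same code and differ by about 30%). A rough sketch of repeating each approach many times, assuming made-up sample data in place of the real links array and an arbitrary repetition count:

```ruby
require 'benchmark'

# Hypothetical sample data standing in for the 120 extracted links.
links = Array.new(120) { |i| "http://example.com/page#{i % 40}" }

# Candidate implementations, each returning a value => count hash.
candidates = {
  'uniq + count'     => -> { links.uniq.map { |i| [i, links.count(i)] }.to_h },
  'group_by'         => -> { links.group_by { |x| x }.map { |k, v| [k, v.size] }.to_h },
  'inject'           => -> { links.inject(Hash.new(0)) { |h, v| h[v] += 1; h } },
  'each_with_object' => -> { links.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 } }
}

repetitions = 1_000 # arbitrary; raise it for steadier numbers

# Total wall-clock seconds for `repetitions` runs of each candidate.
timings = candidates.transform_values do |code|
  Benchmark.realtime { repetitions.times { code.call } }
end

# Print fastest first.
timings.sort_by { |_, seconds| seconds }.each do |label, seconds|
  puts format('%-18s %.6f s', label, seconds)
end
```

The numbers depend on the machine, the Ruby version, and the data, so no expected output is shown.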
#tally is the fastest option now, unless you need to count based on some derived value, in which case the each_with_object option is faster than map plus tally for large arrays.
x = Hash[arr.uniq.map{ |i| [i, arr.count(i)] }]
Latest Ruby has to_h method:
x = arr.uniq.map{ |i| [i, arr.count(i)] }.to_h
count method on the array. Maybe using built in methods has their advantage. :)
count, but thought it wouldn't scale well with array length, so replaced it by my current answer. Can you run your benchmark with a somewhat bigger array and compare again?
O(n^2) it is faster in benchmarks with small arrays, but it will be incredibly slow with big arrays. My mistake was that I was testing the present array in a million-cycle bench, so it was 20% faster.
I am sure there are better ways,
>> arr.sort.group_by {|x|x}.each{|x,y| print "#{x} #{y.size}\n"}
1 2
2 2
3 1
4 1
5 1
assign x and y values to a hash as needed.
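A minimal sketch of that assignment, using x as the target hash to match the question. It relies on Ruby hashes (1.9+) iterating in insertion order, so sorting first means the keys come out ascending:

```ruby
arr = [1, 2, 1, 3, 5, 2, 4]

x = {}
# Sorting first means keys are inserted in ascending order.
arr.sort.group_by { |v| v }.each { |value, group| x[value] = group.size }

x.each { |value, count| puts "x[#{value}] = #{count}" }
```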
sort before group_by. arr.group_by {...} will do the same thing
[2,1].group_by {|x|x} #=> {2=>[2], 1=>[1]} kurumi, what ways are better?
group_by returns a Hash which has no order. The order in which entries in a Hash are iterated is unpredictable.
group_by preserves order, at least in MRI 1.9+. AFAIK, it is not documented, but should be, as it's part of the spec.
Hash. So it would NOT have anything to do with [2,1].group_by {|x|x} #=> {2=>[2], 1=>[1]}
Just for the record, I recently read about Object#tap here. My solution would be:
Hash.new(0).tap{|h| arr.each{|i| h[i] += 1}}
The #tap method passes the caller to the block and then returns it. This is pretty handy when you have to incrementally build an array/hash.
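To make the "returns it" part concrete, here is a small sketch in which the block deliberately ends with nil. tap ignores the block's return value and hands back the receiver, which the block has filled in place:

```ruby
arr = [1, 2, 1, 3, 5, 2, 4]

# The block's own return value (nil here) is discarded; tap returns
# the receiver, i.e. the hash the block has been mutating.
counts = Hash.new(0).tap do |h|
  arr.each { |i| h[i] += 1 }
  nil
end
```

counts ends up as the populated hash rather than nil, which is why no trailing h is needed as it is with inject.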