arr = [
["John Doe", "12/31/2015", "1504"],
["Jane Doe", "12/31/2015", "0904"],
["John Doe", "04/08/2015", "1300"],
["Jimmy Dean", "01/01/2014", "0406"],
["John Doe", "04/08/2015", "1402"],
["Jane Doe", "12/31/2015", "0908"],
["Jane Doe", "12/31/2015", "1045"]
]
arr.each_with_object({}) { |(name,date,val),h|
h.update(name => { date: date, val: [val.to_i] }) { |_,h1,h2|
{ date: h1[:date], val: h1[:val] + h2[:val] } } }.
map { |name, h| [name, h[:date], *h[:val].minmax.map { |n| "%04d" % n }] }
#=> [["John Doe", "12/31/2015", "1300", "1504"],
# ["Jane Doe", "12/31/2015", "0904", "1045"],
# ["Jimmy Dean", "01/01/2014", "0406", "0406"]]
I will explain how this works and will also try to describe the thinking process that led to this answer. I realize that you are new to Ruby, so it may not all make sense the first time through, or even the fourth time through.
We need to do some aggregation, or grouping, of the elements (arrays) of arr; namely, we want to group elements by name, the first element of each element (array) of arr. When you want to aggregate, think "hash", with a single key being the (unique) object by which the aggregation is done, here the name. There are two ways of doing that: build a hash from scratch (starting with an empty hash, {}) or use a method that returns a suitable hash. One such method that is applicable here is Enumerable#group_by¹,²:
arr.group_by { |a| a.first }
#=> {"John Doe" =>[["John Doe", "12/31/2015", "1504"],
# ["John Doe", "04/08/2015", "1300"],
# ["John Doe", "04/08/2015", "1402"]],
# "Jane Doe" =>[["Jane Doe", "12/31/2015", "0904"],
# ["Jane Doe", "12/31/2015", "0908"],
# ["Jane Doe", "12/31/2015", "1045"]],
# "Jimmy Dean"=>[["Jimmy Dean", "01/01/2014", "0406"]]}
I could have used group_by³, but chose the first route, building the hash from scratch. Let's start with:
h = {}
To build the hash h we can use the method Hash#update (aka merge!). For example, if h = { :a=>1 }, then
h.update({ :b=>2 }) #=> { :a=>1, :b=>2 }
Ruby allows us to write this without the braces:
h.update(:b=>2) #=> { :a=>1, :b=>2 }
and to use a short form when the keys are symbols:
h.update(b: 2) #=> { a: 1, b: 2 }
so I'll do that from here on. We also have:
{ a: 1 }.update(a: 2) #=> { a: 2 }
What we want is something like:
{ a: [1] }.update(a: [2]) #=> { a: [1,2] }
We can obtain that using the form of update (see the doc) that employs a block to determine the values of keys that are present in both hashes being merged:
arr.each { |a|
h.update(a[0]=>{ date: a[1], val: [a[2].to_i] }) { |k,h1,h2|
{ date: h1[:date], val: h1[:val] + h2[:val] } } }
Before examining this more closely, let's break the block variable a into its three elements, name, date and val. We have:
arr.each { |name,date,val|
h.update(name=>{ date: date, val: [val.to_i] }) { |k,h1,h2|
{ date: h1[:date], val: h1[:val] + h2[:val] } } }
each returns its receiver, arr, not the updated value of h, which is:
h #=> {"John Doe" =>{:date=>"12/31/2015", :val=>[1504, 1300, 1402]},
# "Jane Doe" =>{:date=>"12/31/2015", :val=>[904, 908, 1045]},
# "Jimmy Dean"=>{:date=>"01/01/2014", :val=>[406]}}
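To see the block form of update in isolation, here is a minimal sketch built on the toy hashes used earlier ({ a: [1] } and { a: [2] }). The variable names old, incoming and merged, and the extra key :b, are mine, added only for illustration.

```ruby
# Toy hashes: :a is present in both, :b only in the hash being merged in.
old      = { a: [1] }
incoming = { a: [2], b: [3] }

# The block is called only for keys found in both hashes (:a here).
# It receives the key, the "old" value and the "new" value, and its
# return value becomes the value stored for that key.
merged = old.update(incoming) { |_key, old_val, new_val| old_val + new_val }

merged             #=> { a: [1, 2], b: [3] }
merged.equal?(old) #=> true (update changes its receiver and returns it)
```

The key :b is simply copied over, because there is no conflict to resolve for it.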
We can step through this calculation as follows:
enum = arr.each
#=> #<Enumerator: [["John Doe", "12/31/2015", "1504"],
# ["Jane Doe", "12/31/2015", "0904"],
# ["John Doe", "04/08/2015", "1300"],
# ["Jimmy Dean", "01/01/2014", "0406"],
# ["John Doe", "04/08/2015", "1402"],
# ["Jane Doe", "12/31/2015", "0908"],
# ["Jane Doe", "12/31/2015", "1045"]]:each>
The first value of the enumerator enum (["John Doe", "12/31/2015", "1504"]) is passed to the block and the block variables are assigned, using parallel assignment (or multiple assignment). We can simulate that using Enumerator#next:
name, date, val = enum.next
#=> ["John Doe", "12/31/2015", "1504"]
name
#=> "John Doe"
date
#=> "12/31/2015"
val
#=> "1504"
and the block calculation is performed:
h.update(name=>{ date: date, val: [val.to_i] })
#=> {}.update("John Doe"=>{ :date=>"12/31/2015", :val=>[1504] })
#=> {"John Doe"=>{:date=>"12/31/2015", :val=>[1504]}}
The return value is the updated value of h.
Since we are merging { "John Doe"=>{ :date=>"12/31/2015", :val=>[1504] } } into {}, the two hashes have no shared keys. Therefore, the block for determining values (which I've not included above) is not used.
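If you would like to confirm when the block is and is not called, the following sketch counts the calls. The names demo and calls are hypothetical, introduced only for this check; the data are the first and third elements of arr.

```ruby
calls = 0   # how many times the conflict-resolving block has run
demo  = {}

# First merge: "John Doe" is not yet a key of demo, so the block is skipped.
demo.update("John Doe" => { date: "12/31/2015", val: [1504] }) do |_, h1, h2|
  calls += 1
  { date: h1[:date], val: h1[:val] + h2[:val] }
end
calls_after_first = calls   #=> 0

# Second merge with the same key: the block now runs exactly once.
demo.update("John Doe" => { date: "04/08/2015", val: [1300] }) do |_, h1, h2|
  calls += 1
  { date: h1[:date], val: h1[:val] + h2[:val] }
end
calls                       #=> 1
demo["John Doe"][:val]      #=> [1504, 1300]
```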
Now the second element of enum (["Jane Doe", "12/31/2015", "0904"]) is passed to the block and the block calculation is performed:
name, date, val = enum.next
#=> ["Jane Doe", "12/31/2015", "0904"]
name
#=> "Jane Doe"
date
#=> "12/31/2015"
val
#=> "0904"
h.update(name=>{ date: date, val: [val.to_i] })
#=> {"John Doe"=>{:date=>"12/31/2015", :val=>[1504]}}.
# update("Jane Doe"=>{ :date=>"12/31/2015", :val=>[904] })
#=> {"John Doe"=>{:date=>"12/31/2015", :val=>[1504]},
# "Jane Doe"=>{:date=>"12/31/2015", :val=>[904]}}
Again, the block for determining values is not used because the two hashes ({"John Doe"=>{:date=>"12/31/2015", :val=>[1504]}} and { "Jane Doe"=>{ :date=>"12/31/2015", :val=>[904] } }) have no common keys.
The third value is passed to the block:
name, date, val = enum.next
#=> ["John Doe", "04/08/2015", "1300"]
h.update(name=>{ date: date, val: [val.to_i] }) { |k,h1,h2|
{ date: h1[:date], val: h1[:val] + h2[:val] } }
#=> h.update("John Doe"=>{ date: "04/08/2015", val: [1300] }) { |k,h1,h2|
{ date: h1[:date], val: h1[:val] + h2[:val] } }
#=> {"John Doe"=>{:date=>"12/31/2015", :val=>[1504, 1300]},
# "Jane Doe"=>{:date=>"12/31/2015", :val=>[904]}}
This time both hashes being merged have the key "John Doe", so the block is used to determine the value of "John Doe". We have⁴:
k #=> "John Doe"
h1 #=> { date: "12/31/2015", val: [1504] } # "old" value
h2 #=> { date: "04/08/2015", val: [1300] } # "new" value
{ date: h1[:date], val: h1[:val] + h2[:val] }
#=> { date: "12/31/2015", val: [1504] + [1300] }
#=> { date: "12/31/2015", val: [1504, 1300] }
The calculations are similar for the remaining elements of enum. As shown above, the result is the hash:
h #=> {"John Doe" =>{:date=>"12/31/2015", :val=>[1504, 1300, 1402]},
# "Jane Doe" =>{:date=>"12/31/2015", :val=>[904, 908, 1045]},
# "Jimmy Dean"=>{:date=>"01/01/2014", :val=>[406]}}
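Rather than calling enum.next by hand for every element, the whole walk-through can be replayed with Kernel#loop, which ends quietly when Enumerator#next raises StopIteration. This is only a sketch of the stepping process described above, not a replacement for the each version.

```ruby
arr = [
  ["John Doe", "12/31/2015", "1504"],
  ["Jane Doe", "12/31/2015", "0904"],
  ["John Doe", "04/08/2015", "1300"],
  ["Jimmy Dean", "01/01/2014", "0406"],
  ["John Doe", "04/08/2015", "1402"],
  ["Jane Doe", "12/31/2015", "0908"],
  ["Jane Doe", "12/31/2015", "1045"]
]

h    = {}
enum = arr.each

# loop rescues the StopIteration raised by enum.next after the last element.
loop do
  name, date, val = enum.next
  h.update(name => { date: date, val: [val.to_i] }) do |_, h1, h2|
    { date: h1[:date], val: h1[:val] + h2[:val] }
  end
end

h["John Doe"]   #=> { date: "12/31/2015", val: [1504, 1300, 1402] }
```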
It remains to convert the hash to the desired array. This is actually the easy part. It involves the calculation of the minimum and maximum values of each key :val in the inner hash, and altering the format. If integer values were desired for the min and max⁵, we could do this:
h.map { |k,v| [k, v[:date], v[:val].minmax] }
#=> [["John Doe", "12/31/2015", [1300, 1504]],
# ["Jane Doe", "12/31/2015", [904, 1045]],
# ["Jimmy Dean", "01/01/2014", [406, 406]]]
Since four-character strings (with leading zeroes) are desired for the min and max values, another step is required:
h.map { |k,v| [k, v[:date], *v[:val].minmax.map { |n| "%04d" % n }] }
As this final step is not central to the question, I will omit the explanation of the conversion.
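For readers who do want to see the conversion on its own, here is a brief sketch. The variable names are mine. "%04d" % n is shorthand for Kernel#format with the same format string, and the splat (*) spreads the two formatted strings into the enclosing array instead of nesting them.

```ruby
padded    = "%04d" % 406           #=> "0406" (zero-padded to width 4)
same      = format("%04d", 406)    #=> "0406" (equivalent spelling)
unchanged = "%04d" % 1504          #=> "1504" (already four digits)

pair = [904, 1045].map { |n| "%04d" % n }    #=> ["0904", "1045"]

nested = ["Jane Doe", "12/31/2015", pair]
#=> ["Jane Doe", "12/31/2015", ["0904", "1045"]]
flat   = ["Jane Doe", "12/31/2015", *pair]
#=> ["Jane Doe", "12/31/2015", "0904", "1045"]
```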
Lastly:
- rather than defining h = {} first, I used the method Enumerable#each_with_object, the "object" being a hash represented by the block variable h, which is the value returned by the method. The initial value of the hash is given by the argument {}.
- since the block variable k in the block for determining the values of keys that are in both hashes being merged is not used in the block calculation, I've changed it to the local variable _, which is customary.
- I chained the construction of the hash to its mapping to the desired array.
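The difference between each and each_with_object is easiest to see on a small, made-up example. The array words and the counting task below are assumptions for illustration only and are not part of the question.

```ruby
words = %w[apple avocado banana]

# With each, the hash must be created beforehand, and each returns
# its receiver (words), not the hash.
counts = {}
ret = words.each { |w| counts.update(w[0] => 1) { |_, old, new| old + new } }
ret.equal?(words)   #=> true
counts              #=> {"a"=>2, "b"=>1}

# With each_with_object, the hash starts as the argument {}, is passed to
# the block as its second variable, and is the method's return value.
counts2 = words.each_with_object({}) do |w, h|
  h.update(w[0] => 1) { |_, old, new| old + new }
end
counts2             #=> {"a"=>2, "b"=>1}
```

Because each_with_object returns the hash, a call to map can be chained directly onto it, which is what the code at the top of this answer does.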
¹ When, as here, the receiver arr is an array, you'll want to look for methods you might use in the class Array or in the module Enumerable. Enumerable is included ("mixed in") by several classes, Array being one. Similarly, if the receiver were a hash, you'd look in the class Hash and in Enumerable.
² One day, long ago, a very wise man from the land of the Rising Sun noticed that many methods he used for arrays were very similar to those he used for hashes, ranges and other collections. He saw that he could write them so that the only difference was how the method each was implemented, so he put them all in a module he called "可算の" ("Enumerable"), and then in all the classes for different types of collections (Array, Hash, Range, Set, etc.) he added include Enumerable and a method each. After doing this, he thought, "生活は快適です" ("life is good").
³ Once you understand the approach I have taken, see if you can answer the question using group_by.
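⁴ The problem would be simpler if the value of :date were the same for all elements of enum having the same name, since either h1[:date] or h2[:date] could then be used. In the sample data "John Doe" appears with two different dates, so the choice matters: h1[:date] keeps the date from the first element encountered for each name, which is what the result shown at the top contains.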
⁵ Computing the minimum and maximum values of an array is a fairly common task, so you should expect Ruby to provide a method to do that. Peruse the docs for Array for such a method. Nothing there, so try Enumerable. Bingo: Enumerable#minmax.
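The idea in the second footnote, that a class only needs to define each and include Enumerable to gain the rest, can be sketched with a made-up class. Countdown is hypothetical and exists only for this illustration.

```ruby
# A hypothetical collection that counts down from a number to 1.
class Countdown
  include Enumerable   # supplies map, minmax, group_by, etc.

  def initialize(from)
    @from = from
  end

  # The only method Enumerable requires: yield each element in turn.
  def each
    return to_enum(:each) unless block_given?
    @from.downto(1) { |n| yield n }
  end
end

c = Countdown.new(4)
c.to_a                 #=> [4, 3, 2, 1]
c.minmax               #=> [1, 4]
c.map { |n| n * 10 }   #=> [40, 30, 20, 10]
c.group_by(&:even?)    #=> {true=>[4, 2], false=>[3, 1]}
```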
Enumerable#group_by is probably the first step you want, but without more information it's pretty hard to say.
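As a rough sketch of where group_by could lead, and assuming the desired result is one row per name holding the date of the first row seen for that name plus the smallest and largest values as four-character strings, something like the following might do. The name result and the choice of which date to keep are my assumptions.

```ruby
arr = [
  ["John Doe", "12/31/2015", "1504"],
  ["Jane Doe", "12/31/2015", "0904"],
  ["John Doe", "04/08/2015", "1300"],
  ["Jimmy Dean", "01/01/2014", "0406"],
  ["John Doe", "04/08/2015", "1402"],
  ["Jane Doe", "12/31/2015", "0908"],
  ["Jane Doe", "12/31/2015", "1045"]
]

result = arr.group_by(&:first).map do |name, rows|
  date   = rows.first[1]                        # assumption: keep the first date seen
  lo, hi = rows.map { |r| r[2].to_i }.minmax    # numeric min and max for this name
  [name, date, "%04d" % lo, "%04d" % hi]        # restore the leading zeroes
end

result
#=> [["John Doe", "12/31/2015", "1300", "1504"],
#    ["Jane Doe", "12/31/2015", "0904", "1045"],
#    ["Jimmy Dean", "01/01/2014", "0406", "0406"]]
```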