
I have an array of hashes that looks like:

[
  {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]

I'd like to combine the hashes that share the same id value, keeping the id and name and summing the net_worth and vehicles values.

So the final array would look like:

[
  {"id"=>1, "name"=>"Batman", "net_worth"=>200, "vehicles"=>4},
  {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]
  • Welcome to Stack Overflow. We expect you to show us code you've written toward solving the question. We'll gladly help you correct the code. Without the code it looks like you're asking us to write it for you, which isn't what Stack Overflow is for. Commented Mar 9, 2015 at 16:17
  • How do you decide which name to preserve? And, what is your question? Commented Mar 9, 2015 at 16:19
  • I agree. It's polite to show the code you've written to prove you aren't trying to get out of actually doing your homework or something of that nature. Commented Mar 9, 2015 at 16:28
  • In future, when you give an example (which is generally advisable), make sure the objects are correct and give them names (as in both answers to date), so that readers giving answers can just cut-and-paste for testing and can refer to your variable names in their answers. Another advantage is that by running your code, you can correct mistakes before posting (e.g. = that should be =>). I suggest you edit to correct that, as many SO members may see your question in future. Commented Mar 9, 2015 at 17:07
  • Edited to reflect correct syntax. I had tried using the merge and inject methods before and was receiving errors with the results. Should I include what I've tried in future posts? Commented Mar 9, 2015 at 17:50

2 Answers

Here is a solution to your problem. Group the rows by id and name, then sum the other values within each group and build the result:

rows = [
    {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
    {"id"=>1, "name"=>"Batman", "net_worth"=>100, "vehicles"=>2},
    {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
    {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]

groups = rows.group_by {|row| [row['id'], row['name']] }

result = groups.map do |key, values|
  id, name = *key

  total_net_worth = values.reduce(0) {|sum, value| sum + value['net_worth'] }
  total_vehicles = values.reduce(0) {|sum, value| sum + value['vehicles'] }

  { "id" => id, "name" => name, "net_worth" => total_net_worth, "vehicles" => total_vehicles }
end

p result

1 Comment

Okay, this is way cool. Can you read what I think is happening, just to make sure I understand the concept? Your solution works flawlessly, but I'd love to make sure I know what's happening. So the group_by method looks through each hash and creates a temp hash with a key based on the current value of id and name. If it comes across a hash that has an identical id and name, it appends that hash to the temp one created. Then it sets the temp hash to groups. Then you just iterate over the values assigned to each key, add together the rows that need summing, and return the result as a new hash.
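
To make the intermediate step in that description concrete, here is a minimal sketch of what group_by produces for a shortened copy of the sample data. It returns a single Hash whose keys are the [id, name] pairs and whose values are arrays of the original row hashes; the rows are collected into those arrays rather than merged at this stage. The three-row input is an illustrative subset, not part of the answer above.

```ruby
rows = [
  {"id"=>1, "name"=>"Batman",   "net_worth"=>100, "vehicles"=>2},
  {"id"=>1, "name"=>"Batman",   "net_worth"=>100, "vehicles"=>2},
  {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2}
]

# group_by builds a Hash: block result => array of elements that produced it.
groups = rows.group_by { |row| [row['id'], row['name']] }

groups.keys                    #=> [[1, "Batman"], [2, "Superman"]]
groups[[1, "Batman"]].size     #=> 2 (both Batman rows, still unmerged)
groups[[2, "Superman"]].size   #=> 1
```

The summing only happens afterwards, in the map step, where each array of rows is reduced to totals.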

Here are two ways of doing it that work with any number of key-value pairs, and do not depend on the names of keys (other than "id" and "name", of course, which are part of the specification).

Using update

This is a way that uses the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes:

arr = [
  {"id"=>1, "name"=>"Batman",      "net_worth"=>100, "vehicles"=>2}, 
  {"id"=>1, "name"=>"Batman",      "net_worth"=>100, "vehicles"=>2}, 
  {"id"=>2, "name"=>"Superman",    "net_worth"=>100, "vehicles"=>2},
  {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100, "vehicles"=>2}
]   

arr.each_with_object({}) { |g,h|
  h.update(g["id"]=>g.dup) { |_,oh,nh|
    oh.update(nh) { |k,ov,nv|
      (['id','name'].include?(k)) ? ov : ov+nv } } }.values
  #=> [{"id"=>1, "name"=>"Batman", "net_worth"=>200, "vehicles"=>4}, 
  #    {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  #    {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100,"vehicles"=>2}]   
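
As a smaller illustration of the block form on its own, the sketch below merges two hand-made hashes with the non-mutating Hash#merge. The block is called only for keys present in both hashes; keys found in just one of them are copied as they are. The names old_hash and new_hash and their values are made up for this example.

```ruby
old_hash = {"id"=>1, "net_worth"=>100}
new_hash = {"id"=>1, "net_worth"=>50, "vehicles"=>2}

# The block receives the shared key, the receiver's value, and the argument's value.
merged = old_hash.merge(new_hash) do |key, old_value, new_value|
  key == "id" ? old_value : old_value + new_value
end

merged    #=> {"id"=>1, "net_worth"=>150, "vehicles"=>2}
old_hash  #=> {"id"=>1, "net_worth"=>100} (merge, unlike update, leaves the receiver alone)
```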

Using group_by

This could also be done by using Enumerable#group_by, as @maxd has done, but the following is a more compact and general implementation:

arr.map(&:dup).
    group_by { |row| row['id'] }.
    map { |_, rows|
      rows.reduce { |h, g|
        (g.keys - ['id','name']).each { |k| h[k] += g[k] }; h } }

  #=> [{"id"=>1, "name"=>"Batman", "net_worth"=>200, "vehicles"=>4}, 
  #    {"id"=>2, "name"=>"Superman", "net_worth"=>100, "vehicles"=>2},
  #    {"id"=>3, "name"=>"Wonderwoman", "net_worth"=>100,"vehicles"=>2}]   

arr.map(&:dup) avoids mutating the hashes in arr. I used reduce without an argument so that the first hash of each group serves as the accumulator, which avoids the need to copy the key-value pairs having keys "id" and "name".
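
The following sketch shows why the copy matters, using two-key hashes invented for the example. When reduce is called without an initial value, the first element itself becomes the accumulator, so += on it changes the original hash unless the elements were duplicated first.

```ruby
rows = [
  {"id"=>1, "net_worth"=>100},
  {"id"=>1, "net_worth"=>100}
]

# No initial value: h starts out as rows.first, so it is modified in place.
rows.reduce { |h, g| h["net_worth"] += g["net_worth"]; h }
mutated_first = rows.first["net_worth"]  #=> 200 (the input was changed)

fresh = [
  {"id"=>1, "net_worth"=>100},
  {"id"=>1, "net_worth"=>100}
]

# Duplicating each hash first means the accumulator is a copy.
summed = fresh.map(&:dup).reduce { |h, g| h["net_worth"] += g["net_worth"]; h }
summed                    #=> {"id"=>1, "net_worth"=>200}
fresh.first["net_worth"]  #=> 100 (the input is untouched)
```

Note that dup makes a shallow copy, which is enough here because the summed values are integers.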
