1

I am using Ruby 2.6 in my application.

I want to remove the duplicate elements in an array of hashes. Here is my input:

array_of_hashes = [
{"Date"=> "2019-05-6", "ID" => 100, "Rate" => 10, "Count" => 1},
{"Date"=> "2019-05-6", "ID" => 100, "Rate" => nil, "Count" => 0},
{"Date"=> "2019-05-6", "ID" => 101, "Rate" => 25, "Count" => 3},
{"Date"=> "2019-05-6", "ID" => 102, "Rate" => nil, "Count" => 0},
{"Date"=> "2019-05-6", "ID" => 102, "Rate" => 35, "Count" => 0},
{"Date"=> "2019-05-6", "ID" => 103, "Rate" => 20, "Count" => 6}
]

I am creating key-value pairs from the hashes, as my application requires.

result = array_of_hashes.map { |row| [[row['ID'], row['Date']], row] }.to_h

If there are two records with the same "ID" and "Date" values, I want to keep the row where "Rate" is not nil. The order of the input records may vary. Here are my actual and expected results.

Actual Result:

 {[100, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>100, "Rate"=>nil, "Count"=>0},
 [101, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>101, "Rate"=>25, "Count"=>3},
 [102, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>102, "Rate"=>35, "Count"=>0},
 [103, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>103, "Rate"=>20, "Count"=>6}}

Expected result:

 {[100, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>100, "Rate"=>10, "Count"=>1}, 
 [101, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>101, "Rate"=>25, "Count"=>3},
 [102, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>102, "Rate"=>35, "Count"=>0},
 [103, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>103, "Rate"=>20, "Count"=>6}}

How can I get the above expected result?

1
  • 1. Can the "expected result" contain a value (hash) for which Rate = nil? 2. Can array_of_hashes contain two elements having the same values for "ID" and "Date" and neither has a nil value for "Rate"? If "yes", which should be selected? Commented May 8, 2019 at 16:09

3 Answers

3

Here is another group by option

array_of_hashes.group_by {|h| h.values_at("ID","Date")}.transform_values do |v|   
  v.find {|r| r["Rate"]}
end

#=> {[100, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>100, "Rate"=>10, "Count"=>1}, 
#    [101, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>101, "Rate"=>25, "Count"=>3}, 
#    [102, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>102, "Rate"=>35, "Count"=>0}, 
#    [103, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>103, "Rate"=>20, "Count"=>6}}

Group by "ID" and "Date", then transform each Hash value to the first Hash in the group where "Rate" is not nil.

If multiple values are acceptable then find_all or select could be substituted for find.

If you want the original structure (an array of hashes) maintained, just append .values to the end.
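As a small sketch of the two variations just mentioned: the sample data is a shortened copy of the question's input, and the names grouped, first_match and all_matches are only for illustration.

```ruby
array_of_hashes = [
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>10,  "Count"=>1},
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>nil, "Count"=>0},
  {"Date"=>"2019-05-6", "ID"=>101, "Rate"=>25,  "Count"=>3}
]

grouped = array_of_hashes.group_by { |h| h.values_at("ID", "Date") }

# Keep the first row with a non-nil "Rate" per group, then drop the keys
# to get back a flat array of hashes.
first_match = grouped.transform_values { |v| v.find { |r| r["Rate"] } }.values

# Keep every row with a non-nil "Rate" per group (values are arrays here).
all_matches = grouped.transform_values { |v| v.select { |r| r["Rate"] } }
```

Here first_match is an array holding the rows for IDs 100 and 101, while all_matches is still keyed by [ID, Date] and maps each key to an array of rows.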


4 Comments

...or !r["Rate"].nil? to read better (?) and not worry about the value of "Rate" being false (however unlikely that may be).
@CarySwoveland Really you think that reads better? I would prefer v.lazy.reject {|r| r["Rate"].nil? }.first over that and to the same effect as find first non nil rate hash returned without regard for other hashes in the group.
"Reads better" because when I see {|r| r["Rate"]} the question, "what about false?" immediately comes to mind and requires processing. My "(?)" reflects the need for !. What I'd really like is {|r| r["Rate"].non_nil? }.
@CarySwoveland you could go with something super ugly like r unless r['Rate'].nil?
2

We can construct the desired hash by making a single pass through array_of_hashes.

array_of_hashes.each_with_object({}) do |g,h|
  k = [g['ID'], g['Date']]
  h.update(k=>g) unless h.key?(k) && h[k]['Rate'] != nil
end
  #=> {[100, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>100, "Rate"=>10, "Count"=>1},
  #    [101, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>101, "Rate"=>25, "Count"=>3},
  #    [102, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>102, "Rate"=>35, "Count"=>0},
  #    [103, "2019-05-6"]=>{"Date"=>"2019-05-6", "ID"=>103, "Rate"=>20, "Count"=>6}}

This assumes that if two elements of array_of_hashes match on the values of 'ID' and 'Date', and neither has a value of nil for 'Rate', the first of the two hashes is retained. If the later of the two should be retained instead, change the second line of the block to:

h.update(k=>g) unless h.key?(k) && g['Rate'].nil?
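To make the difference between the two conditions concrete, here is a minimal sketch. The rows array is made-up test data (a third row with a second non-nil 'Rate' is added, which the question's input does not contain), and keep_first and keep_last are illustrative names.

```ruby
rows = [
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>10,  "Count"=>1},
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>nil, "Count"=>0},
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>15,  "Count"=>2}  # hypothetical extra row
]

# Original condition: once a row with a non-nil 'Rate' is stored, keep it.
keep_first = rows.each_with_object({}) do |g, h|
  k = [g['ID'], g['Date']]
  h.update(k => g) unless h.key?(k) && h[k]['Rate'] != nil
end

# Alternative condition: overwrite with any later row whose 'Rate' is not nil.
keep_last = rows.each_with_object({}) do |g, h|
  k = [g['ID'], g['Date']]
  h.update(k => g) unless h.key?(k) && g['Rate'].nil?
end

keep_first[[100, "2019-05-6"]]["Rate"]  #=> 10
keep_last[[100, "2019-05-6"]]["Rate"]   #=> 15
```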

2 Comments

For the first solution you could go with if h.dig(k,'Rate').nil? instead. dig returns nil as soon as a key is missing, so the result would be the same.
@engineersmnky, I've not seen that before. Clever!
1

Use group_by and filter nil rates from the values.

array_of_hashes
  .group_by { |h| [h["ID"], h["Date"]] }
  .map { |key, values| [key, values.reject { |row| row["Rate"].nil? }.last] }
  .to_h
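One edge case worth checking, sketched below with made-up data: if every row in a group has a nil "Rate", reject leaves an empty array and .last returns nil, so that key maps to nil. The ID 104 row is hypothetical and does not appear in the question. Whether such keys should be kept or dropped (for example with Hash#compact) is an assumption about the desired behavior.

```ruby
rows = [
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>10,  "Count"=>1},
  {"Date"=>"2019-05-6", "ID"=>100, "Rate"=>nil, "Count"=>0},
  {"Date"=>"2019-05-6", "ID"=>104, "Rate"=>nil, "Count"=>0}  # hypothetical: only a nil-rate row
]

result = rows
  .group_by { |h| [h["ID"], h["Date"]] }
  .map { |key, values| [key, values.reject { |row| row["Rate"].nil? }.last] }
  .to_h

result[[104, "2019-05-6"]]  #=> nil
result.compact.keys          #=> [[100, "2019-05-6"]]
```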

4 Comments

You don't want the rows where rate is nil I assume?
Yes. I want the rows where rate is not nil, and the input records arrive in a different order every time.
Why is .last needed here?
It is needed if we want the result you specified, where only one row is returned per key. If you want potentially several rows per ID-Date tuple, you can simply drop the .last.
