6

Starting with the following array (of hashes):

[
  {:name=>"site a", :url=>"http://example.org/site/1/"}, 
  {:name=>"site b", :url=>"http://example.org/site/2/"}, 
  {:name=>"site c", :url=>"http://example.org/site/3/"}, 
  {:name=>"site d", :url=>"http://example.org/site/1/"}, 
  {:name=>"site e", :url=>"http://example.org/site/2/"}, 
  {:name=>"site f", :url=>"http://example.org/site/6/"},
  {:name=>"site g", :url=>"http://example.org/site/1/"}
]

How can I add an index of the duplicate urls like so:

[
  {:name=>"site a", :url=>"http://example.org/site/1/", :index => 1}, 
  {:name=>"site b", :url=>"http://example.org/site/2/", :index => 1}, 
  {:name=>"site c", :url=>"http://example.org/site/3/", :index => 1}, 
  {:name=>"site d", :url=>"http://example.org/site/1/", :index => 2}, 
  {:name=>"site e", :url=>"http://example.org/site/2/", :index => 2}, 
  {:name=>"site f", :url=>"http://example.org/site/6/", :index => 1},
  {:name=>"site g", :url=>"http://example.org/site/1/", :index => 3}
]

3 Answers

5

I would use a hash to keep track of the indices. Scanning the previous entries again and again seems inefficient.

counts = Hash.new(0)
array.each { | hash | 
  hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1
}

Or, a bit cleaner:

array.each_with_object(Hash.new(0)) { | hash, counts | 
  hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1
}
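
For anyone who wants to try the idea in isolation, here is a minimal self-contained sketch of the same counting approach. The variable name sites and the shortened sample data are only for illustration; the URLs come from the question.

```ruby
# Shortened sample data based on the question (hypothetical variable name).
sites = [
  { :name => "site a", :url => "http://example.org/site/1/" },
  { :name => "site b", :url => "http://example.org/site/2/" },
  { :name => "site d", :url => "http://example.org/site/1/" },
  { :name => "site g", :url => "http://example.org/site/1/" }
]

# Hash.new(0) returns 0 for any URL that has not been counted yet.
counts = Hash.new(0)

sites.each do |site|
  counts[site[:url]] += 1           # bump the running count for this URL
  site[:index] = counts[site[:url]] # store it on the hash (mutates in place)
end

indices = sites.map { |site| site[:index] }
```

Each element costs one hash lookup, so the whole pass is linear in the size of the array, and the original hashes are modified in place.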

4 Comments

I think it is inefficient too, but it depends on the array size.
You can use each_with_object here, like @mu: array.each_with_object(Hash.new(0)) { |hash, counts| hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1 }
Both @fl00r's answer and this one worked great; I just found this one easier to grok.
@fl00r: Thanks, updated. This is because I'm using 1.8.6 here, which does not have each_with_object.
3
array = [
  {:name=>"site a", :url=>"http://example.org/site/1/"}, 
  {:name=>"site b", :url=>"http://example.org/site/2/"}, 
  {:name=>"site c", :url=>"http://example.org/site/3/"}, 
  {:name=>"site d", :url=>"http://example.org/site/1/"}, 
  {:name=>"site e", :url=>"http://example.org/site/2/"}, 
  {:name=>"site f", :url=>"http://example.org/site/6/"},
  {:name=>"site g", :url=>"http://example.org/site/1/"}
]

array.inject([]) { |ar, it| 
    count_so_far = ar.count{|i| i[:url] == it[:url]}
    it[:index] = count_so_far+1
    ar << it
}
#=>
[
  {:name=>"site a", :url=>"http://example.org/site/1/", :index=>1}, 
  {:name=>"site b", :url=>"http://example.org/site/2/", :index=>1}, 
  {:name=>"site c", :url=>"http://example.org/site/3/", :index=>1}, 
  {:name=>"site d", :url=>"http://example.org/site/1/", :index=>2}, 
  {:name=>"site e", :url=>"http://example.org/site/2/", :index=>2}, 
  {:name=>"site f", :url=>"http://example.org/site/6/", :index=>1}, 
  {:name=>"site g", :url=>"http://example.org/site/1/", :index=>3}
]

6 Comments

Excellent, now I'm just trying to get my head around how that works... :)
I reformatted the inject call to hopefully make it clearer. inject loops over the receiving array, and in every call to the inject block, ar will contain the URLs (and their running counts) it's seen "so far" - because they're being added at the end of the block. So at the start, you count how many of the "current" URL you've seen so far, and add that. It's a little clunky to explain because it's really a recursive operation in disguise. (With thanks to @fl00r for graciously letting me try and make his code comprehensible.)
I would replace count_so_far = ar.count{|i| i[:url] == it[:url]} with count_so_far = ar[ar.rindex{|i| i[:url] == it[:url]}][:index]. If you have a lot of elements, it will perform better.
@Sii Why? At site g, rindex will return 3, as I would expect it to.
@Serabe: Actually, no, you're right, I missed that you're retrieving the previous running count. It just took me a while to handle the case where rindex returns nil.
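
A rough sketch of the rindex idea from these comments, with the nil case handled explicitly, might look like the following. The names sites and seen are made up for the example, and this is only one way to deal with a URL that has not appeared before.

```ruby
# Shortened sample data based on the question (hypothetical variable names).
sites = [
  { :name => "site a", :url => "http://example.org/site/1/" },
  { :name => "site b", :url => "http://example.org/site/2/" },
  { :name => "site d", :url => "http://example.org/site/1/" },
  { :name => "site g", :url => "http://example.org/site/1/" }
]

seen = []
sites.each do |site|
  # Position of the most recent earlier entry with the same URL, or nil.
  last = seen.rindex { |s| s[:url] == site[:url] }

  # First occurrence gets 1; otherwise continue from the previous running count.
  site[:index] = last.nil? ? 1 : seen[last][:index] + 1
  seen << site
end

indices = seen.map { |s| s[:index] }
```

rindex stops at the nearest match instead of counting every earlier entry, but it still has to walk the whole list when a URL is new, so the worst case remains a scan per element.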
0

If I wanted it to be efficient, I'd write:

items_with_index = items.inject([[], {}]) do |(output, counts), h|
  new_count = (counts[h[:url]] || 0) + 1
  [output << h.merge(:index => new_count), counts.update(h[:url] => new_count)]
end[0]
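
Because this version uses merge, it builds new hashes and leaves the input untouched. A shorter sketch with the same non-mutating behaviour, assuming a plain map with an external counter is acceptable (the sample items below are shortened from the question):

```ruby
# Shortened sample data based on the question.
items = [
  { :name => "site a", :url => "http://example.org/site/1/" },
  { :name => "site b", :url => "http://example.org/site/2/" },
  { :name => "site d", :url => "http://example.org/site/1/" }
]

counts = Hash.new(0) # running count per URL, defaulting to 0

items_with_index = items.map do |h|
  counts[h[:url]] += 1
  h.merge(:index => counts[h[:url]]) # merge returns a copy; h is not changed
end

new_indices = items_with_index.map { |h| h[:index] }
originals_untouched = items.none? { |h| h.key?(:index) }
```

Whether to mutate or copy is a design choice: copying is safer if other code still holds references to the original hashes, at the cost of allocating one new hash per element.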
