I have an array of arrays:

x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]

My sorting array is

order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
 "delivered", "canceled", "failed", "refunded", "refund_failed"
]

I need to order x based on the value of the first element in each subarray. The required sorted array is:

[
  ["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23],
  ["canceled", 51], ["refunded", 1]
]

Using sort_by doesn't produce the required order; it returns the array unchanged.

result = x.sort_by {|u| order_array.index(u)}
# => [
#      ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
#      ["delivered", 23], ["scheduled", 1], ["canceled", 51]
# ]
2 Comments

What about ["pending", 1] – should it be removed because "pending" is not an element of order_array? Commented Mar 13, 2019 at 11:58
RE order_array[2]: chant, " 'I' before 'E' except after 'C' or when sounding like 'A' in 'neighbor' or 'weigh' ". (Exceptions exist.) Commented Mar 13, 2019 at 19:20

6 Answers

5
h = x.to_h
# => {"ready"=>5,
# "shipped"=>1,
# "pending"=>1,
# "refunded"=>1,
# "delivered"=>23,
# "scheduled"=>1,
# "canceled"=>51}

order_array.map{|key| [key, h[key]] if h.key?(key)}.compact
# => [["ready", 5],
# ["shipped", 1],
# ["scheduled", 1],
# ["delivered", 23],
# ["canceled", 51],
# ["refunded", 1]]

or

h = x.to_h{|k, v| [k, [k, v]]}
#=> {"ready"=>["ready", 5],
# "shipped"=>["shipped", 1],
# "pending"=>["pending", 1],
# "refunded"=>["refunded", 1],
# "delivered"=>["delivered", 23],
# "scheduled"=>["scheduled", 1],
# "canceled"=>["canceled", 51]}

order_array.map{|k| h[k]}.compact
#=> [["ready", 5],
# ["shipped", 1],
# ["scheduled", 1],
# ["delivered", 23],
# ["canceled", 51],
# ["refunded", 1]]

or

h = x.to_h{|k, v| [k, [k, v]]}
#=> {"ready"=>["ready", 5],
# "shipped"=>["shipped", 1],
# "pending"=>["pending", 1],
# "refunded"=>["refunded", 1],
# "delivered"=>["delivered", 23],
# "scheduled"=>["scheduled", 1],
# "canceled"=>["canceled", 51]}

h.values_at(*order_array).compact
#=> [["ready", 5],
# ["shipped", 1],
# ["scheduled", 1],
# ["delivered", 23],
# ["canceled", 51],
# ["refunded", 1]]
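The block form of to_h used above needs Ruby 2.6 or later. As a rough sketch for older versions, the same lookup table could be built with each_with_object instead; the variable names lookup and sorted are only illustrative and are not part of the answer above.

```ruby
x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Build the key => [key, value] lookup without the block form of to_h.
# (`lookup` is a hypothetical name chosen for this sketch.)
lookup = x.each_with_object({}) { |pair, h| h[pair[0]] = pair }

# values_at returns nil for keys that are missing from the hash;
# compact then drops those nils.
sorted = lookup.values_at(*order_array).compact
```

This assumes, as the comments below confirm for the OP's data, that the first elements of x are unique; with duplicates, later pairs would overwrite earlier ones in the hash.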

5 Comments

Thanks, this works, but @SRack's solution is much simpler.
@Stefan Right. I implicitly assumed so.
There are no duplicates. The second solution is also perfect!
Nice, I didn't know that 2.6 added a block variant for to_h. That voids my above comment.
@Stefan It was according to my request.
4

assoc seems helpful: "Searches through an array whose elements are also arrays comparing obj with the first element of each contained array using obj.==."

order_array.map{|e| x.assoc(e) }.compact
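To illustrate what assoc returns and why the compact is needed, here is a small sketch on a shortened version of x (the shortened data and the names found and missing are just for this example):

```ruby
x = [["ready", 5], ["shipped", 1], ["pending", 1]]

# assoc returns the first sub-array whose first element == the argument.
found = x.assoc("shipped")

# When nothing matches, assoc returns nil, which is why the one-liner
# above ends with compact.
missing = x.assoc("in_progress")
```

Note that assoc scans x from the start on every call, so this is a linear search per element of order_array.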

2 Comments

Never seen nor heard of assoc before - +1 for bringing it into my life :) Great answer.
You know that hash is faster :) stackoverflow.com/a/5552062/5239030
4

You're almost there with this: index isn't working as you're comparing the full array, rather than the first element of it. This will work:

result = x.sort_by { |u| order_array.index(u[0]) || 100 }
#=> [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1], ["pending", 1]]

Please note, the 100 is there to default to the back of the sort if the value isn't found in order_array.
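As a sketch of the difference, and of one way to avoid the magic number: using order_array.size as the fallback is my own substitution for the 100 above, chosen because it is always larger than any real index.

```ruby
x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Looking up the whole pair finds nothing: ["ready", 5] is not in order_array.
whole_pair = order_array.index(x.first)

# Looking up only the first element of the pair finds its position.
first_only = order_array.index(x.first[0])

# Assumption: fall back to order_array.size instead of a hard-coded 100,
# so unknown names always sort after every known one.
fallback = order_array.size
result = x.sort_by { |name, _count| order_array.index(name) || fallback }
```

Since sort_by is not guaranteed to be stable, several unknown names sharing the fallback key could come out in any relative order.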


Edit

This was initially accepted despite including ["pending", 1], which suggested it fit the requirements; however, here's a solution that avoids the unwanted entry and also handles duplicates should the need arise.

order_array.each_with_object([]) { |ordered_by, array| array.push(*x.select { |item| item[0] == ordered_by }) }
#=> [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1]]

Or, very fast though still allowing for duplicate values under each ordered item:

hash = x.each_with_object(Hash.new { |h,k| h[k] = [] }) { |item, h| h[item[0]] << item[1] }
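# fetch bypasses the hash's default block, so keys missing from x yield no pairs
order_array.flat_map { |key| hash.fetch(key, []).map { |value| [key, value] } }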
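For comparison, a similar duplicate-tolerant approach can be sketched with group_by. This is an alternative I'm adding for illustration, not the code that was benchmarked, and the sample data below (with a repeated "ready") is made up:

```ruby
# Hypothetical data with a duplicate "ready" entry.
x = [["ready", 5], ["shipped", 1], ["pending", 1], ["ready", 8], ["canceled", 51]]
order_array = ["ready", "shipped", "canceled", "refunded"]

# Group the pairs by their first element; each value is an array of pairs
# in the order they appear in x.
groups = x.group_by(&:first)

# Walk order_array and emit every pair for each key; fetch with a default
# of [] skips keys that never occur in x, and flat_map joins the groups.
sorted = order_array.flat_map { |key| groups.fetch(key, []) }
```

Pairs whose first element is not in order_array ("pending" here) are dropped, matching the required output in the question.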

Benchmark

Here's a benchmark for this scenario with a larger dataset: https://repl.it/repls/SentimentalAdequateClick. Looks like Sawa's methods lead the way, though my last effort works handily should there be duplicate values in future. Also, my second effort sucks (which surprised me a little) :)

10 Comments

That does not give what the OP wanted.
Can see it's including pending, which I hadn't noticed was missing above @sawa. Your approach makes more sense if this is essential. Cheers for the comment.
What is the output that you really wanted?
I think this fits with the OP's question? The output is added to my answer: it matches the required output, albeit with ["pending", 1] bumped to the back of the array.
As you have noted, ["pending", 1] is added at the end, which the OP did not want (according to the question).
2

I'd suggest filtering first, since some values (for example "pending") are not elements of order_array:

x.keep_if { |e| order_array.include? e[0] }.sort_by { |e| order_array.index(e[0]) }

#=> [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1]]
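One thing to be aware of: keep_if modifies x in place, so the filtered-out entries are gone from x afterwards. If x should stay intact, a non-mutating variant could look like this (a sketch on shortened, made-up data; select is swapped in for keep_if):

```ruby
x = [["ready", 5], ["pending", 1], ["canceled", 51], ["shipped", 1]]
order_array = ["ready", "shipped", "canceled"]

# select returns a new array and leaves x untouched,
# whereas keep_if would remove ["pending", 1] from x itself.
sorted = x.select { |e| order_array.include?(e[0]) }
          .sort_by { |e| order_array.index(e[0]) }
```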


I benchmarked the answers so far, running each 500 times:

#        user       system     total       real
# sawa   0.006698   0.000132   0.006830 (  0.006996) # on the first method
# ray    0.005543   0.000123   0.005666 (  0.005770)
# igian  0.001923   0.000003   0.001926 (  0.001927)
# srack  0.005270   0.000168   0.005438 (  0.005540) # on the last method
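The harness itself isn't shown above, so the following is only a sketch of how such a comparison might be set up with Ruby's standard Benchmark module. The labels and the choice of candidates are my assumptions, and it makes no claim to reproduce the numbers in the table:

```ruby
require "benchmark"

x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Hypothetical labels; each lambda wraps one approach from this thread.
candidates = {
  "hash"  => -> { h = x.to_h; order_array.map { |k| [k, h[k]] if h.key?(k) }.compact },
  "assoc" => -> { order_array.map { |e| x.assoc(e) }.compact },
  "index" => -> { x.select { |e| order_array.include?(e[0]) }.sort_by { |e| order_array.index(e[0]) } }
}

n = 500
# Benchmark.bm prints user/system/total/real columns for each report.
Benchmark.bm(6) do |bm|
  candidates.each do |label, code|
    bm.report(label) { n.times { code.call } }
  end
end
```

With an input this small the timings are dominated by noise and call overhead, so a larger dataset gives a more meaningful comparison.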


Just for fun I tried to find a faster method for Ruby 2.5:

xx = x.to_h # plain to_h (no block), so it also works before Ruby 2.6
order_array.each.with_object([]) { |k, res| res << [k, xx[k]] if xx.has_key? k }

4 Comments

This was my thought as an edit for mine when I realised my mistake, though it's more complex than @sawa's answer, so I didn't think it worthwhile. Still pretty readable though.
I found it loops twice (increasing complexity), and index is also quite slow for a large array.
@ray, the benchmark actually surprised me. Maybe you can double-check? I assumed the array would be small in the OP's case.
@iGian credit where it's due - this does outperform both Ray's and my second answer, even as it scales. I've benchmarked with a larger dataset (here) and it's basically possible to conclude Sawa is king :)
1

You can try the code below to get the required output:

order_array.map { |p| x.detect { |y| y[0] == p } }.compact
# => [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1]]

0

I've assumed:

  • the first element of each element of x is not necessarily unique;
  • all elements of x whose first element is the same and whose first element is a member of order_array appear consecutively in the returned (sorted) array in the order in which those elements appear in x;
  • any elements of x whose first element is not a member of order_array appear in the returned (sorted) array after all elements whose first element is in order_array, and all such elements appear in the returned array (at the end) in the order in which they occur in x; and
  • efficiency is paramount.

x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1], ["originated", 3],
  ["delivered", 23], ["scheduled", 1], ["ready", 8], ["canceled", 51]
]

order_array = [
  "ready", "in_progress", "received", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

order_pos = order_array.each_with_object({}) { |word,h| h[word] = [] }
  #=> {"ready"=>[], "in_progress"=>[], "received"=>[], "shipped"=>[],
  #    "scheduled"=>[], "pick_up"=>[], "delivered"=>[], "canceled"=>[],
  #    "failed"=>[], "refunded"=>[], "refund_failed"=>[]} 
back = x.each_with_index.with_object([]) { |((word,v),i),back|
  order_pos.key?(word) ? (order_pos[word] << i) : back << [word,v] }
  #=> [["pending", 1], ["originated", 3]] 
order_pos.flat_map { |word,offsets| offsets.map { |i| x[i] } }.concat(back)
  #=> [["ready", 5], ["ready", 8], ["shipped", 1], ["scheduled", 1],
  #    ["delivered", 23], ["canceled", 51], ["refunded", 1], ["pending", 1],
  #    ["originated", 3]] 

Note:

order_pos
  #=> {"ready"=>[0, 7], "in_progress"=>[], "received"=>[], "shipped"=>[1],
  #    "scheduled"=>[6], "pick_up"=>[], "delivered"=>[5], "canceled"=>[8],
  #    "failed"=>[], "refunded"=>[3], "refund_failed"=>[]} 

It is necessary to initialise order_pos up front so that its keys are in the same order as order_array. This is an example of the worth of a controversial change made in Ruby 1.9, which guaranteed that hash keys remain in key-insertion order.
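A small sketch of why the pre-seeding matters, using shortened, made-up data (the names lazy and seeded are only for this illustration): if the keys were created on demand while scanning x, the hash would follow the order of x rather than that of order_array.

```ruby
order_array = ["ready", "shipped", "canceled"]
x = [["canceled", 51], ["ready", 5], ["shipped", 1]]

# Keys created lazily, in the order in which they are met in x.
lazy = Hash.new { |h, k| h[k] = [] }
x.each_with_index { |(word, _), i| lazy[word] << i }
lazy_keys = lazy.keys      # follows x, not order_array

# Keys pre-seeded from order_array, as in the answer above.
seeded = order_array.each_with_object({}) { |word, h| h[word] = [] }
x.each_with_index { |(word, _), i| seeded[word] << i }
seeded_keys = seeded.keys  # follows order_array
```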
