Sort an array of arrays based on the order in another array

Question

I have an array of arrays:

x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]

My sorting array is

order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

I need to order x based on the value of the first element in each subarray. The required sorted array is:

[
  ["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23],
  ["canceled", 51], ["refunded", 1]
]

Using sort_by doesn't produce the required order; it returns the same array.

result = x.sort_by {|u| order_array.index(u)}
# => [
#      ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
#      ["delivered", 23], ["scheduled", 1], ["canceled", 51]
# ]

What about ["pending", 1] – should it be removed because "pending" is not an element of order_array? — Stefan
– Stefan, Commented Mar 13, 2019 at 11:58
RE order_array[2]: chant, " 'I' before 'E' except after 'C' or when sounding like 'A' in 'neighbor' or 'weigh' ". (Exceptions exist.) — Cary Swoveland
– Cary Swoveland, Commented Mar 13, 2019 at 19:20

sawa · Accepted Answer · 2019-03-13 12:20:47Z

5

h = x.to_h
# => {"ready"=>5,
# "shipped"=>1,
# "pending"=>1,
# "refunded"=>1,
# "delivered"=>23,
# "scheduled"=>1,
# "canceled"=>51}

order_array.map{|key| [key, h[key]] if h.key?(key)}.compact
# => [["ready", 5],
# ["shipped", 1],
# ["scheduled", 1],
# ["delivered", 23],
# ["canceled", 51],
# ["refunded", 1]]

or

h = x.to_h{|k, v| [k, [k, v]]}
#=> {"ready"=>["ready", 5],
# "shipped"=>["shipped", 1],
# "pending"=>["pending", 1],
# "refunded"=>["refunded", 1],
# "delivered"=>["delivered", 23],
# "scheduled"=>["scheduled", 1],
# "canceled"=>["canceled", 51]}

order_array.map{|k| h[k]}.compact
#=> [["ready", 5],
# ["shipped", 1],
# ["scheduled", 1],
# ["delivered", 23],
# ["canceled", 51],
# ["refunded", 1]]

or

h = x.to_h{|k, v| [k, [k, v]]}
#=> {"ready"=>["ready", 5],
# "shipped"=>["shipped", 1],
# "pending"=>["pending", 1],
# "refunded"=>["refunded", 1],
# "delivered"=>["delivered", 23],
# "scheduled"=>["scheduled", 1],
# "canceled"=>["canceled", 51]}

h.values_at(*order_array).compact
#=> [["ready", 5],
# ["shipped", 1],
# ["scheduled", 1],
# ["delivered", 23],
# ["canceled", 51],
# ["refunded", 1]]
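The block form of to_h used above needs Ruby 2.6 or later (as noted in the comments below). A minimal sketch of the same lookup for older versions, assuming the first element of each pair is unique; the names pair and acc are only illustrative:

```ruby
x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Build {"ready" => ["ready", 5], ...} without the block form of to_h.
h = x.each_with_object({}) { |pair, acc| acc[pair.first] = pair }

# values_at returns nil for keys that are missing from h; compact drops them.
result = h.values_at(*order_array).compact
```

Because the hash has no default value, keys such as "in_progress" simply come back as nil and are removed by compact, which is also why "pending" never appears in the result.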

edited Mar 13, 2019 at 12:20

answered Mar 13, 2019 at 11:37

sawa



5 Comments

Selim Alawwa Over a year ago

Thanks, this works, but @SRack's solution is much simpler.

sawa Over a year ago

@Stefan Right. I implicitly assumed so.

Selim Alawwa Over a year ago

There are no duplicates. The second solution is also perfect!

Stefan Over a year ago

Nice, I didn't know that 2.6 added a block variant for to_h. That voids my above comment.

sawa Over a year ago

@Stefan It was according to my request.

steenslag · Accepted Answer · 2019-03-13 15:40:06Z

4

assoc seems helpful: "Searches through an array whose elements are also arrays comparing obj with the first element of each contained array using obj.==."

order_array.map{|e| x.assoc(e) }.compact
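A short sketch of how assoc behaves, which is why the compact call is needed. The sample data here (including the duplicate "ready" entry) is made up for illustration:

```ruby
# Hypothetical data with a repeated first element.
x = [["ready", 5], ["shipped", 1], ["ready", 8]]

# assoc returns the first sub-array whose first element == the argument.
first_match = x.assoc("ready")

# assoc returns nil when no sub-array matches.
missing = x.assoc("pending")
```

Since assoc stops at the first match, a later ["ready", 8] would be dropped by this approach, and each call scans x from the start.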

answered Mar 13, 2019 at 15:40

steenslag


2 Comments

SRack Over a year ago

Never seen nor heard of assoc before - +1 for bringing it into my life :) Great answer.

iGian Over a year ago

You know that hash is faster :) stackoverflow.com/a/5552062/5239030

SRack · Accepted Answer · 2019-03-13 18:04:36Z

4

You're almost there with this: index isn't working as you're comparing the full array, rather than the first element of it. This will work:

result = x.sort_by { |u| order_array.index(u[0]) || 100 }
#=> [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1], ["pending", 1]]

Please note, the 100 is there to default to the back of the sort if the value isn't found in order_array.
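To make the difference concrete, here is a small sketch of both lookups. Using order_array.size as the fallback rank instead of the literal 100 is my own assumption, not part of the answer above; it keeps unknown keys last however long order_array grows.

```ruby
x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Looking up the whole pair finds nothing, so every sort key would be nil.
whole_pair_rank = order_array.index(["ready", 5])

# Looking up only the first element gives a usable position.
first_elem_rank = order_array.index("shipped")

# Assumption: fall back to order_array.size so unknown keys sort last.
fallback = order_array.size
result = x.sort_by { |name, _count| order_array.index(name) || fallback }
```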

Edit

This answer was initially accepted even though its output includes ["pending", 1], which suggested it fit the requirements. However, here's a solution that avoids the unwanted entry and also handles duplicates should the need arise.

order_array.each_with_object([]) { |ordered_by, array| array.push(*x.select { |item| item[0] == ordered_by }) }
#=> [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1]]

Or, very fast though still allowing for duplicate values under each ordered item:

hash = x.each_with_object(Hash.new { |h,k| h[k] = [] }) { |item, h| h[item[0]] << item[1] }
order_array.flat_map { |key| [key, hash[key]] }
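Note that the last snippet returns a flat list that alternates each key with an array of its values (with an empty array for keys absent from x), rather than [key, value] pairs. The sketch below shows that shape on a small made-up input and, as an assumption about what may be wanted, a group_by variant that returns pairs while still allowing duplicates:

```ruby
# Hypothetical, shortened data with a duplicate "ready" entry.
x = [["ready", 5], ["shipped", 1], ["pending", 1], ["ready", 8]]
order_array = ["ready", "in_progress", "shipped"]

# The snippet as written: keys alternate with arrays of their values.
hash = x.each_with_object(Hash.new { |h, k| h[k] = [] }) { |item, h| h[item[0]] << item[1] }
flat = order_array.flat_map { |key| [key, hash[key]] }

# Variant: group the original pairs by first element, then emit them in order.
# fetch with a default avoids nil for keys that never occur in x.
grouped = x.group_by(&:first)
pairs = order_array.flat_map { |key| grouped.fetch(key, []) }
```

Here flat is ["ready", [5, 8], "in_progress", [], "shipped", [1]], while pairs is [["ready", 5], ["ready", 8], ["shipped", 1]].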

Benchmark

Here's a benchmark for this scenario with a larger dataset: https://repl.it/repls/SentimentalAdequateClick. Looks like Sawa's methods lead the way, though my last effort works handily should there be duplicate values in future. Also, my second effort sucks (which surprised me a little) :)

edited Mar 13, 2019 at 18:04

answered Mar 13, 2019 at 11:37

SRack


10 Comments

sawa Over a year ago

That does not give what the OP wanted.

SRack Over a year ago

Can see it's including pending, which I hadn't noticed was missing above @sawa. Your approach makes more sense if this is essential. Cheers for the comment.

sawa Over a year ago

What is the output that you really wanted?

SRack Over a year ago

I think this fits with the OP's question? The output is added to my answer: it matches the required output, albeit with ["pending", 1] bumped to the back of the array.

sawa Over a year ago

As you have noted, ["pending", 1] is added at the end, which the OP has not wanted (according to the question).


iGian · Accepted Answer · 2019-03-13 19:36:16Z

2

I'd suggest

x.keep_if { |e| order_array.include? e[0] }.sort_by { |e| order_array.index(e[0]) }

The keep_if step is needed because some values, for example "pending", are not elements of order_array.

#=> [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1]]
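One thing to be aware of is that keep_if modifies x in place. A sketch of a non-mutating variant follows; the precomputed rank hash is my own addition (an assumption about what helps on larger inputs), replacing the repeated include? and index scans:

```ruby
x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Map each status to its position once: {"ready" => 0, "in_progress" => 1, ...}
rank = order_array.each_with_index.to_h

# select returns a new array, so x itself is left untouched.
result = x.select { |name, _count| rank.key?(name) }
          .sort_by { |name, _count| rank[name] }
```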

I benchmarked the answers so far, running each 500 times:

#        user       system     total       real
# sawa   0.006698   0.000132   0.006830 (  0.006996) # on the first method
# ray    0.005543   0.000123   0.005666 (  0.005770)
# igian  0.001923   0.000003   0.001926 (  0.001927)
# srack  0.005270   0.000168   0.005438 (  0.005540) # on the last method
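The script behind those figures isn't shown, so the following is only a sketch of how such a comparison could be set up with Ruby's standard Benchmark module. The labels and the choice of three candidates are mine; select is used instead of keep_if so that x is not shrunk between runs, which would skew later measurements.

```ruby
require "benchmark"

x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1],
  ["delivered", 23], ["scheduled", 1], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "recieved", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Hypothetical labels mapped to the approaches being compared.
candidates = {
  "hash lookup" => lambda {
    h = x.to_h
    order_array.map { |k| [k, h[k]] if h.key?(k) }.compact
  },
  "detect" => lambda {
    order_array.map { |k| x.detect { |pair| pair[0] == k } }.compact
  },
  "select + sort_by" => lambda {
    x.select { |e| order_array.include?(e[0]) }
     .sort_by { |e| order_array.index(e[0]) }
  }
}

# Keep the last return value of each candidate to check they agree.
results = {}
Benchmark.bm(18) do |bm|
  candidates.each do |label, code|
    bm.report(label) { 500.times { results[label] = code.call } }
  end
end
```

With an input this small the timings are dominated by noise, so it is worth repeating the run with a much larger x before drawing conclusions.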

Just for fun I tried to find a faster method for Ruby 2.5:

xx = x.to_h # plain to_h (no block), so it works before Ruby 2.6
order_array.each.with_object([]) { |k, res| res << [k, xx[k]] if xx.has_key? k }

edited Mar 13, 2019 at 19:36

answered Mar 13, 2019 at 11:55

iGian


4 Comments

SRack Over a year ago

This was my thought as an edit for mine when I realised my mistake, though it's more complex than @sawa's answer, so I didn't think it worthwhile. Still pretty readable though.

ray Over a year ago

I found that it loops twice (increasing complexity), and index is also quite slow for a large array.

iGian Over a year ago

@ray, actually the benchmark surprised me. Maybe you can double check? I assumed the array would be small in the OP's case.

SRack Over a year ago

@iGian credit where it's due - this does outperform both Ray's and my second answer, even as it scales. I've benchmarked with a larger dataset (here) and it's basically possible to conclude Sawa is king :)

ray · Accepted Answer · 2019-03-13 12:23:42Z

1

You can try the code below to get the output efficiently:

order_array.map { |p| x.detect { |y| y[0] == p } }.compact
# => [["ready", 5], ["shipped", 1], ["scheduled", 1], ["delivered", 23], ["canceled", 51], ["refunded", 1]]

answered Mar 13, 2019 at 12:23

ray



Cary Swoveland · Accepted Answer · 2019-03-13 22:11:59Z

I've assumed:

the first element of each element of x is not necessarily unique;
all elements of x whose first element is the same and whose first element is a member of order_array appear consecutively in the returned (sorted) array in the order in which those elements appear in x;
any elements of x whose first element is not a member of order_array appear in the returned (sorted) array after all elements whose first element is in order_array, and all such elements appear in the returned array (at the end) in the order in which they occur in x; and
efficiency is paramount.

x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1], ["originated", 3],
  ["delivered", 23], ["scheduled", 1], ["ready", 8], ["canceled", 51]
]

order_array = [
  "ready", "in_progress", "received", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

order_pos = order_array.each_with_object({}) { |word,h| h[word] = [] }
  #=> {"ready"=>[], "in_progress"=>[], "received"=>[], "shipped"=>[],
  #    "scheduled"=>[], "pick_up"=>[], "delivered"=>[], "canceled"=>[],
  #    "failed"=>[], "refunded"=>[], "refund_failed"=>[]} 
back = x.each_with_index.with_object([]) { |((word,v),i),back|
  order_pos.key?(word) ? (order_pos[word] << i) : back << [word,v] }
  #=> [["pending", 1], ["originated", 3]] 
order_pos.flat_map { |word,offsets| offsets.map { |i| x[i] } }.concat(back)
  #=> [["ready", 5], ["ready", 8], ["shipped", 1], ["scheduled", 1],
  #    ["delivered", 23], ["canceled", 51], ["refunded", 1], ["pending", 1],
  #    ["originated", 3]]

Note:

order_pos
  #=> {"ready"=>[0, 7], "in_progress"=>[], "received"=>[], "shipped"=>[1],
  #    "scheduled"=>[6], "pick_up"=>[], "delivered"=>[5], "canceled"=>[8],
  #    "failed"=>[], "refunded"=>[3], "refund_failed"=>[]}

It is necessary to initialise order_pos so that its keys are in the same order as order_array. This is an example of the worth of a controversial change made in Ruby 1.9, which guaranteed that hashes preserve key-insertion order.
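For comparison, here is a sketch that yields the same ordering under the same assumptions but with a different technique: sort_by with a composite [rank, original index] key rather than the bucket approach above. It does not depend on hash ordering, at the cost of an O(n log n) sort. The names rank and last are illustrative.

```ruby
x = [
  ["ready", 5], ["shipped", 1], ["pending", 1], ["refunded", 1], ["originated", 3],
  ["delivered", 23], ["scheduled", 1], ["ready", 8], ["canceled", 51]
]
order_array = [
  "ready", "in_progress", "received", "shipped", "scheduled", "pick_up",
  "delivered", "canceled", "failed", "refunded", "refund_failed"
]

# Position of each known word; unknown words share the rank `last`.
rank = order_array.each_with_index.to_h
last = order_array.size

# The original index breaks ties, so equal-ranked elements keep their order in x.
result = x.each_with_index
          .sort_by { |(word, _v), i| [rank.fetch(word, last), i] }
          .map(&:first)
```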
