Ruby: Sorting an array of strings, in alphabetical order, that includes some arrays of strings

Question

Say I have:

a = ["apple", "pear", ["grapes", "berries"], "peach"]

and I sort it with:

a.sort_by do |f|
  f.class == Array ? f.to_s : f
end

I get:

[["grapes", "berries"], "apple", "peach", "pear"]

whereas I actually want the items in alphabetical order, with array items sorted on their first element:

["apple", ["grapes", "berries"], "peach", "pear"]

or, preferably, I want:

["apple", "grapes, berries", "peach", "pear"]

If the example isn't clear enough, I'm looking to sort the items in alphabetical order.

Any suggestions on how to get there?

I've tried a few things so far but can't seem to get there. Thanks.

What sorting logic do you want? Please explain it.
– Arup Rakshit, Commented Feb 21, 2014 at 23:32
I'd like to sort in alphabetical order, with the first item of a string array being used in comparison to the other strings.
– steve_gallagher, Commented Feb 21, 2014 at 23:49
Why "grapes, berries" and not "berries, grapes"? Descending order in the inner array?
– Abdo, Commented Feb 22, 2014 at 0:02
You should edit "I'd like to sort in alphabetical order, with the first item of a string array being used in comparison to the other strings" into your question. One has to read the comments to understand your desired sorting scheme.
– roippi, Commented Feb 22, 2014 at 0:06

Rafa Paez · Accepted Answer · 2014-02-21 23:35:41Z

3

I think this is what you want:

a.sort_by { |f| f.class == Array ? f.first : f }

answered Feb 21, 2014 at 23:35

Rafa Paez

3 Comments

Abdo Over a year ago

add .flatten and you'd get him his preferred output. Regardless, +1 :-)

Rafa Paez Over a year ago

Thanks! Yes, you are right given the edited question. Still, if the OP just wants to keep this kind of order, I would not flatten the array, because it could be confusing and it is also costly.

Cary Swoveland Over a year ago

@Abdo, preferred format has "grapes, berries", not "grapes", "berries", which flatten gives.
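
The difference discussed in these comments can be checked directly. The following is only a small sketch; the variable names sorted, flattened, and joined are chosen here for illustration and do not come from the answer.

```ruby
a = ["apple", "pear", ["grapes", "berries"], "peach"]

# Sort on the first element of any nested array, as in the answer above.
sorted = a.sort_by { |f| f.is_a?(Array) ? f.first : f }

# flatten splits the pair into two separate elements...
flattened = sorted.flatten
# ...whereas joining each nested array keeps the pair together as one string.
joined = sorted.map { |f| f.is_a?(Array) ? f.join(", ") : f }

p flattened # => ["apple", "grapes", "berries", "peach", "pear"]
p joined    # => ["apple", "grapes, berries", "peach", "pear"]
```

Joining after the sort keeps the ordering based on the first element only; the separator is then purely cosmetic.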

Arup Rakshit · Accepted Answer · 2014-02-22 01:35:38Z

3

I would do

a = ["apple", "pear", ["grapes", "berries"], "peach"]
a.map { |e| Array(e).join(", ") }.sort
# => ["apple", "grapes, berries", "peach", "pear"]

edited Feb 22, 2014 at 1:35

answered Feb 22, 2014 at 0:18

Arup Rakshit

1 Comment

Cary Swoveland Over a year ago

I like the first one, Arup. You could have [e] instead of Array(e) (though I don't have a preference). join(', ') gives more pleasing spacing. Did you notice that your second solution gives results in a slightly incorrect format? If so, join the party, as at least two others, including me, made the same mistake.

Cary Swoveland · Accepted Answer · 2014-02-22 18:46:23Z

Enumerable#sort_by is clearly the right method, but here's a reminder of how Array#sort would be used here:

  a.sort do |s1, s2|
    t1 = s1.is_a?(Array) ? s1.first : s1
    t2 = s2.is_a?(Array) ? s2.first : s2
    t1 <=> t2
  end.map { |e| e.is_a?(Array) ? e.join(', ') : e }
    #=> ["apple", "grapes, berries", "peach", "pear"]

@theTinMan pointed out that sort is quite a bit slower than sort_by here, and gave a reference that explains why. I've been meaning to see how the Benchmark module is used, so took the opportunity to compare the two methods for the problem at hand. I used @Rafa's solution for sort_by and mine for sort.

For testing, I constructed an array of 100 random samples (each with 10,000 random elements to be sorted) in advance, so the benchmarks would not include the time needed to construct the samples (which was not insignificant). 8,000 of the 10,000 elements were random strings of 8 lowercase letters. The other 2,000 elements were 2-tuples of the form [str1, str2], where str1 and str2 were each random strings of 8 lowercase letters. I benchmarked with other parameters, but the bottom-line results did not vary significantly.

require 'benchmark'

# n: total number of items to sort
# m: number of two-tuples [str1, str2] among n items to sort
# n-m: number of strings among n items to sort
# k: length of each string in samples
# s: number of sorts to perform when benchmarking

def make_samples(n, m, k, s)
  s.times.with_object([]) { |_, a| a << test_array(n,m,k) }
end

def test_array(n,m,k)
  a = ('a'..'z').to_a 
  r = []
  (n-m).times { r << a.sample(k).join }
  m.times { r << [a.sample(k).join, a.sample(k).join] }
  r.shuffle!
end

# Here's what the samples look like:    
make_samples(6,2,4,4)
  #=> [["bloj", "izlh", "tebz", ["lfzx", "rxko"], ["ljnv", "tpze"], "ryel"],
  #    ["jyoh", "ixmt", "opnv", "qdtk", ["jsve", "itjw"], ["pnog", "fkdr"]],
  #    ["sxme", ["emqo", "cawq"], "kbsl", "xgwk", "kanj", ["cylb", "kgpx"]],
  #    [["rdah", "ohgq"], "bnup", ["ytlr", "czmo"], "yxqa", "yrmh", "mzin"]]

n = 10000 # total number of items to sort
m = 2000  # number of two-tuples [str1, str2] (n-m strings)
k = 8     # length of each string
s = 100   # number of sorts to perform

samples = make_samples(n,m,k,s)

Benchmark.bm('sort_by'.size) do |bm|
  bm.report 'sort_by' do
    samples.each do |s|
      s.sort_by { |f| f.class == Array ? f.first : f }
    end
  end

  bm.report 'sort' do
    samples.each do |s| 
      s.sort do |s1,s2| 
        t1 = (s1.is_a? Array) ? s1.first : s1
        t2 = (s2.is_a? Array) ? s2.first : s2
        t1 <=> t2
      end
    end
  end
end

              user     system      total        real
sort_by   1.360000   0.000000   1.360000 (  1.364781)
sort      4.050000   0.010000   4.060000 (  4.057673)

Though it was never in doubt, @theTinMan was right! I did a few other runs with different parameters, but sort_by consistently thumped sort by similar performance ratios.

Note the "system" time is zero for sort_by. In other runs it was sometimes zero for sort. The values were always zero or 0.010000, leading me to wonder what's going on there. (I ran these on a Mac.)

For readers unfamiliar with Benchmark, Benchmark.bm takes an argument giving the width of the label column, which determines how far the header row (user system...) is indented. bm.report takes a row label as an argument.
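
As a stripped-down illustration of those two calls, here is a minimal sketch. The data array and the label width of 7 are assumptions made for this example and are unrelated to the benchmark above.

```ruby
require 'benchmark'

# Illustrative data only: 1,000 random integers.
data = Array.new(1_000) { rand(1_000_000) }

# 7 is the label width, i.e. the number of characters reserved
# for the row labels ('sort_by' is 7 characters long).
results = Benchmark.bm(7) do |bm|
  bm.report('sort')    { data.sort }
  bm.report('sort_by') { data.sort_by { |x| x } }
end

# Benchmark.bm returns one Benchmark::Tms object per report,
# so the timings can also be inspected programmatically.
results.each { |tms| puts tms.real }
```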

Thanks, @Jessie. I tacked on flatten to put the results in the 'preferred' format, but that was based on a misreading of the question. Later I twigged (see my comment on Matt's answer), but forgot to fix my answer. It should be OK now.

While sort can be used, it's going to be much slower than sort_by by its nature. For more information, read about the "Schwartzian transform".
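
To make the idea concrete, here is a hand-written sketch of the decorate, sort, undecorate pattern that the Schwartzian transform refers to. It is an illustration of the technique, not Ruby's actual implementation of sort_by; the lambda sort_key and the intermediate variable names are invented for this example.

```ruby
a = ["apple", "pear", ["grapes", "berries"], "peach"]

# Hypothetical key function: the string each element is ordered by.
sort_key = ->(e) { e.is_a?(Array) ? e.first : e }

# Decorate: pair each element with its key, computed once per element.
decorated = a.map { |e| [sort_key.call(e), e] }
# Sort: compare the precomputed keys only.
sorted_pairs = decorated.sort { |x, y| x.first <=> y.first }
# Undecorate: drop the keys and keep the original elements.
manual = sorted_pairs.map(&:last)

p manual # => ["apple", ["grapes", "berries"], "peach", "pear"]
p manual == a.sort_by { |e| sort_key.call(e) } # => true
```

With a plain sort block, the key is recomputed for both operands on every comparison, which is where the extra cost in the benchmark above comes from.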

onionjake · Accepted Answer · 2014-02-21 23:35:30Z

1

You are really close. Just switch .to_s to .first.

irb(main):005:0> b = ["grapes", "berries"]
=> ["grapes", "berries"]
irb(main):006:0> b.to_s
=> "[\"grapes\", \"berries\"]"
irb(main):007:0> b.first
=> "grapes"

Here is one that works:

a.sort_by do |f|
  f.class == Array ? f.first : f
end

Yields:

["apple", ["grapes", "berries"], "peach", "pear"]

answered Feb 21, 2014 at 23:35

onionjake

Comments

Matt · Accepted Answer · 2014-02-22 00:38:49Z

1

a.map { |b| b.is_a?(Array) ? b.join(', ') : b }.sort

# => ["apple", "grapes, berries", "peach", "pear"]

edited Feb 22, 2014 at 0:38

answered Feb 21, 2014 at 23:40

Matt

1 Comment

Cary Swoveland Over a year ago

Nice one, Matt. If it were "grape, berry", and "grapefruit" were also in the list, the comma after "grape" would ensure that it is ordered properly ("grape, berry" before "grapefruit").
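
A quick way to see the effect of the separator is to compare three sort keys on a small list. This is only a sketch: the list items and the variable names are made up for this example, with "vine" chosen so that the difference shows up.

```ruby
items = ["grapefruit", ["grape", "vine"], "apple"]

# Key is the first element only.
by_first   = items.sort_by { |e| Array(e).first }
# Key is the joined string with ", " as separator: "," sorts before any letter.
with_comma = items.sort_by { |e| Array(e).join(", ") }
# Key is the joined string with no separator: "grapevine" vs "grapefruit".
no_sep     = items.sort_by { |e| Array(e).join }

p by_first   # => ["apple", ["grape", "vine"], "grapefruit"]
p with_comma # => ["apple", ["grape", "vine"], "grapefruit"]
p no_sep     # => ["apple", "grapefruit", ["grape", "vine"]]
```

In this example the comma-separated key agrees with sorting on the first element, while joining without a separator does not. Strings containing characters that sort before "," (a space, for instance) could still order differently.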

user513951 · Accepted Answer · 2014-02-22 00:55:08Z

1

Replace to_s with join.

a.sort_by do |f|
  f.class == Array ? f.join : f
end

# => ["apple", ["grapes", "berries"], "peach", "pear"]

Or more concisely:

a.sort_by {|x| [*x].join }

# => ["apple", ["grapes", "berries"], "peach", "pear"]

The problem with to_s is that it converts your Array to a string that starts with "[":

"[\"grapes\", \"berries\"]"

which comes alphabetically before the rest of your strings.

join actually creates the string that you had expected to sort by:

"grapesberries"

which is alphabetized correctly, according to your logic.
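
The two keys can be compared against a plain string to confirm this. A minimal sketch, where the variable b is just an illustrative name:

```ruby
b = ["grapes", "berries"]

# to_s yields an inspect-style string whose first character is "[".
p b.to_s.start_with?("[") # => true
# "[" (byte 91) sorts before any lowercase letter ("a" is byte 97).
p b.to_s < "apple"        # => true
# join yields "grapesberries", which compares like an ordinary word.
p b.join                  # => "grapesberries"
p b.join < "apple"        # => false
```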

If you don't want the arrays to remain arrays, then it's a slightly different operation, but you will still use join.

a.map {|x| [*x].join(", ") }.sort

# => ["apple", "grapes, berries", "peach", "pear"]

edited Feb 22, 2014 at 0:55

answered Feb 21, 2014 at 23:56

user513951

Comments

Todd A. Jacobs · Accepted Answer · 2014-02-22 00:03:25Z

0

Sort a Flattened Array

If you just want all elements of your nested array flattened and then sorted in alphabetical order, all you need to do is flatten and sort. For example:

["apple", "pear", ["grapes", "berries"], "peach"].flatten.sort
#=> ["apple", "berries", "grapes", "peach", "pear"]

answered Feb 22, 2014 at 0:03

Todd A. Jacobs

2 Comments

roippi Over a year ago

This is not what the OP wants.

steve_gallagher Over a year ago

I know it seems reasonable to want this implementation, but this is for a table row's category type, which can have multiple types in some instances. So I have to keep the multiples together. Thanks.
