1

I have a column of type VARCHAR(15). I want to sort it as if it were numbers, but if it has a letter, it is last in alphabetical order.

I have the data:

1
12
2
21
ABC13
ABC23

I tried:

NULLIF(regexp_replace(column, '\D', '', 'g'), '')::int

I expected:

1
2
12
21
ABC13
ABC23

But, actually resulted:

1
2
12
ABC13
21
ABC23
2
  • You're looking for a "natural sort". There's several answers on this site, see if one is acceptable. Commented Jan 15, 2023 at 0:27
  • One of decisions is fill some zerros before your number like lpad(nullif(regexp_replace(column, '\D', '', 'g'), ''), 10, '0')::int Commented Jan 15, 2023 at 6:47

2 Answers 2

1

The answer by Albina was helpful (just remember to use ~ not LIKE) but I wanted all values that were integers-as-strings sorted numerically at the start of the result set, and all non-integer values sorted lexicographically at the end of the result set.

You can do that by left padding:

order by
    (CASE WHEN maybe_int_string ~ '^[0-9]+$' -- match int-as-string only
     THEN LPAD(maybe_int_string, 5, '0')     -- order int-as-string numerically at the start
     ELSE LPAD(maybe_int_string, 5, '9')     -- sort non-integers, grouped at the end
     END)`

In this way, 3 becomes 00003, while 12 becomes 00012, so those sort as you'd expect (not 12 then 3), and bob becomes 99bob and gets stuck at the end, where it deserves.

Sign up to request clarification or add additional context in comments.

Comments

0

1. If you need to sort only strings with digits and "ignore" alphanumeric strings sorting:

SELECT * 
FROM string_data
ORDER BY CASE WHEN string_value LIKE '^[0-9]+$'
         THEN CAST(string_value as integer)
         ELSE NULLIF(regexp_replace(string_value, '\D', '9', 'g'), '')::int
         END;
  • string_value LIKE '^[0-9]+$' checks that given string consists of only digits
  • regexp_replace(string_value, '\D', '9', 'g') replaces all the letters (\D) in given string with a digit 9 to arrange string with letters and digits in the end of a result tuple.
  • as all the branches of a CASE should return the same datatype, all the letters are suggested to be replaced by 9. Therefore there is no natural sort by letters if a string contains any letters.

And the result looks like:

 2
 12
 21
 abc13
 ABC13
 abc23
 ABC23
 abc32

2. If you need to sort numeric values first, and then sort alphanumeric strings:

This query manipulates with column order in this way:

SELECT * 
FROM string_data 
ORDER BY COALESCE(SUBSTRING(string_value FROM '^(\d+)$')::INTEGER, 99999999999),
         SUBSTRING(string_value FROM '[a-zA-z_-]+'),
         COALESCE(SUBSTRING(string_value FROM '(\d+)$')::INTEGER, 0),
         string_value;
  • COALESCE(SUBSTRING(string_value FROM '^(\d+)$')::INTEGER, 99999999999) firstly, we retrieve strings consisting of only digits and put them in the first place in a result tuple because these strings have temporary value 99999999999 due to COALESCE. The regex ^(\d+)$ allows to do that.
  • SUBSTRING(string_value FROM '[a-zA-z_-]+') secondly, we retrieve strings which start with letters.
  • COALESCE(SUBSTRING(string_value FROM '(\d+)$')::INTEGER, 0) then, we retrieve alphanumeric strings which end with a number such as ABC123 and put them after all the sorted numbers.

So the result looks like:

 2
 12
 21
 1a
 1ab
 1ab
 abc13
 abc23
 abc32
 ABC13
 ABC23
 3f
 2r

The second approach is described here.

Here is the sql fiddle.

1 Comment

The use of a regex needs ~ not LIKE, making it: string_value ~ '^[0-9]+$'

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.