16

I have a column name that represents a person's name in the following format:

firstname [middlename] lastname [, Sr.|Jr.]

For example:

John Smith
John J. Smith
John J. Smith, Sr.

How can I order items by lastname?

In an ideal world, you would store the name parts in separate fields. Commented Jan 24, 2012 at 15:12

3 Answers

16

A correct and faster version could look like this:

SELECT *
FROM   tbl
ORDER  BY substring(name, '([^[:space:]]+)(?:,|$)')

Or:

ORDER  BY substring(name, E'([^\\s]+)(?:,|$)')

Or even:

ORDER  BY substring(name, E'([^\\s]+)(,|$)')

Explanation

[^[:space:]]+ .. first (and longest) string consisting of one or more non-whitespace characters.
(,|$) .. terminated by a comma or the end of the string.

The last two examples use escape-string syntax and the class-shorthand \s instead of the long form [[:space:]] (which loses the outer level of brackets when inside a character class).

We don't actually have to use non-capturing parentheses (?:) after the part we want to extract, because (quoting the manual):

.. if the pattern contains any parentheses, the portion of the text that matched the first parenthesized subexpression (the one whose left parenthesis comes first) is returned.
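As a quick check of that rule, the sketch below runs the capturing and non-capturing variants side by side on two of the sample names from the question. The column aliases are only illustrative. Both columns should come back as Smith for each row, since only the first parenthesized subexpression is returned.

```sql
-- The trailing group may be capturing or non-capturing:
-- substring() returns the first parenthesized subexpression either way.
SELECT name
     , substring(name, '([^[:space:]]+)(?:,|$)') AS non_capturing
     , substring(name, '([^[:space:]]+)(,|$)')   AS capturing
FROM  (VALUES
   ('John J. Smith')
 , ('John J. Smith, Sr.')
) x(name);
```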

Test

SELECT substring(name, '([^[:space:]]+)(?:,|$)')
FROM  (VALUES 
  ('John Smith')
 ,('John J. Smith')
 ,('John J. Smith, Sr.')
 ,('foo bar Smith, Jr.')
) x(name)

2
SELECT *
FROM t
ORDER BY substring(name, E'^.*\\s([^\\s]+)(?=,|$)') ASC

While this should provide the sorting you are looking for, it would be a lot cheaper to store the name in multiple columns and index them based on which parts of the name you need to sort by.
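A minimal sketch of that multi-column layout might look like the following. The table name, column names and index name are hypothetical and not taken from the question; adapt them to the real schema.

```sql
-- Hypothetical schema: all names here are illustrative only.
CREATE TABLE person (
   person_id   serial PRIMARY KEY
 , first_name  text NOT NULL
 , middle_name text
 , last_name   text NOT NULL
 , suffix      text              -- e.g. 'Sr.', 'Jr.', or NULL
);

-- Plain b-tree index that matches the sort order used below.
CREATE INDEX person_last_name_idx ON person (last_name, first_name);

SELECT *
FROM   person
ORDER  BY last_name, first_name;
```

With the parts stored separately, sorting no longer needs a regular expression at query time.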

2 Comments

Thank you, this will probably solve the problem as far as raw SQL is concerned.
The regular expression is actually incorrect. Try: SELECT substring('John J. Smith, Sr.', E'^.*\\s([^\\s]+)(?=,|$)'). I posted a version that works.
2

You should use a functional index for this purpose: http://www.postgresql.org/docs/7.3/static/indexes-functional.html

In your case, something like:

CREATE INDEX test1_lastname_col1_idx ON test1 (split_part(col1, ' ', 3));
SELECT * FROM test1 ORDER BY split_part(col1, ' ', 3);

3 Comments

This does not work, because lastname isn't always the third element.
You can change the string expression to suit your needs; it's not hard to do using regexps or this manual: dev.mysql.com/doc/refman/5.0/en/string-functions.html. The main point of my post was to draw attention to the CREATE INDEX statement, because the SELECT will take an extremely long time without an index.
Oh, my mistake, that URL leads to the MySQL manual. Here is the right one: postgresql.org/docs/9.1/static/functions-string.html
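Combining the two ideas from this thread, one possible sketch is an expression index built on the regular expression from the first answer rather than on split_part. The index name is made up for illustration, and tbl and name follow the first answer's example. Whether the planner actually uses the index depends on the query and the data, so treat this as a starting point rather than a guaranteed speedup.

```sql
-- Expression index on the extracted last name (index name is hypothetical).
-- The extra pair of parentheses marks the element as an expression.
CREATE INDEX tbl_lastname_idx
ON     tbl ((substring(name, '([^[:space:]]+)(?:,|$)')));

-- The ORDER BY expression must match the indexed expression
-- for the index to be considered.
SELECT *
FROM   tbl
ORDER  BY substring(name, '([^[:space:]]+)(?:,|$)');
```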
