2

I have a django project where I am sorting a queryset by this CharField:

name = models.CharField(max_length=50, unique=True)

On my the default django db.sqlite3 server, it sorts as I want (according to ASCII value) where objects with names beginning with a tilde ~ character appear at the bottom of the list:

item
jitem
litem
~kitem

but on my production postgresql server, they appear like this:

item
jitem
~kitem
litem

and it appears that the tilde is ignored when sorting.

How can I fix this?

4
  • @Ivan That's setting it globally for all columns and operations though. It's also documentation that considers only MySQL. Commented Sep 20, 2015 at 22:40
  • @CraigRinger Wow, my bad. Commented Sep 20, 2015 at 22:44
  • @Ivan Not at all, it's still useful. It just doesn't solve this particular issue. Commented Sep 20, 2015 at 23:24
  • @43Tesseracts What's your database locale? Show output of \l+ thedatabasename in psql, or select * from pg_database where datname = 'mydbname'. Commented Sep 21, 2015 at 0:07

1 Answer 1

3

This is normal behaviour. Your default collation is one that ignores punctuation and other special characters when sorting.

If you want to do a simple sort where every character is considered based on its ordinal value in the current text encoding, use COLLATE "C". This may be applied at the database, column, index or query level.

You can't change the collation of an existing database database-wide, but you can change columns and indexes.

If you want to change it for the whole database you'll need to pg_dump the database, drop the database, and re-create the database with COLLATE "C" as one of the CREATE DATABASE options. I do not recommend doing this, though, since it leads to what users will consider to be wrong sort order for real natural language text.

Instead, change it on a per-column basis, e.g.

ALTER TABLE collate_demo ALTER COLUMN test_col TYPE text COLLATE "C";

(Keep the same type as before, just change the collation)


Note that you can also specify the collation you want for individual queries, though Django might not let you do that. E.g.

SELECT * FROM collate_demo ORDER BY test_col COLLATE "C";

or

SELECT * FROM collate_demo WHERE test_col > 'A' COLLATE "C" ORDER BY test_col COLLATE "C";

The COLLATE clause applies only to the particular comparison operator or ORDER BY it's specified with. It isn't a modifier for the whole query.

Note that even if two strings are equal according to national language collation rules they'll never be reported as equal by PostgreSQL. It sorts by utf-8 representation byte-order if the natural language collation says they sort the same but they're not byte-wise identical.


There's a Django ticket open for COLLATE support, which has been closed as almost a duplicate of icontains can be case-sensitive on MySQL. It seems to have languished indefinitely. I've added a note strongly encouraging implementation of COLLATE as the right way to solve the problem.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.