@@ -664,13 +664,6 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
664664 </listitem>
665665 </varlistentry>
666666
667- <varlistentry>
668- <term><literal>de-u-co-phonebk-x-icu</literal></term>
669- <listitem>
670- <para>German collation, phone book variant</para>
671- </listitem>
672- </varlistentry>
673-
674667 <varlistentry>
675668 <term><literal>de-AT-x-icu</literal></term>
676669 <listitem>
@@ -683,13 +676,6 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
683676 </listitem>
684677 </varlistentry>
685678
686- <varlistentry>
687- <term><literal>de-AT-u-co-phonebk-x-icu</literal></term>
688- <listitem>
689- <para>German collation for Austria, phone book variant</para>
690- </listitem>
691- </varlistentry>
692-
693679 <varlistentry>
694680 <term><literal>und-x-icu</literal> (for <quote>undefined</quote>)</term>
695681 <listitem>
@@ -709,6 +695,90 @@ SELECT a COLLATE "C" < b COLLATE "POSIX" FROM test1;
709695 will draw an error along the lines of <quote>collation "de-x-icu" for
710696 encoding "WIN874" does not exist</>.
711697 </para>
698+
699+ <para>
700+ ICU allows collations to be customized beyond the basic language+country
701+ set that is preloaded by <command>initdb</command>. Users are encouraged
702+ to define their own collation objects that make use of these facilities to
703+ suit the sorting behavior to their requirements. Here are some examples:
704+
705+ <variablelist>
706+ <varlistentry>
707+ <term><literal>CREATE COLLATION "de-u-co-phonebk-x-icu" (provider = icu, locale = 'de-u-co-phonebk')</literal></term>
708+ <listitem>
709+ <para>German collation with phone book collation type</para>
710+ </listitem>
711+ </varlistentry>
712+
713+ <varlistentry>
714+ <term><literal>CREATE COLLATION "und-u-co-emoji-x-icu" (provider = icu, locale = 'und-u-co-emoji')</literal></term>
715+ <listitem>
716+ <para>
717+ Root collation with Emoji collation type, per Unicode Technical Standard #51
718+ </para>
719+ </listitem>
720+ </varlistentry>
721+
722+ <varlistentry>
723+ <term><literal>CREATE COLLATION digitslast (provider = icu, locale = 'en-u-kr-latn-digit')</literal></term>
724+ <listitem>
725+ <para>
726+ Sort digits after Latin letters. (The default is digits before letters.)
727+ </para>
728+ </listitem>
729+ </varlistentry>
730+
731+ <varlistentry>
732+ <term><literal>CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper')</literal></term>
733+ <listitem>
734+ <para>
735+ Sort upper-case letters before lower-case letters. (The default is
736+ lower-case letters first.)
737+ </para>
738+ </listitem>
739+ </varlistentry>
740+
741+ <varlistentry>
742+ <term><literal>CREATE COLLATION special (provider = icu, locale = 'en-u-kf-upper-kr-latn-digit')</literal></term>
743+ <listitem>
744+ <para>
745+ Combines the two preceding options: upper-case letters first and digits after Latin letters.
746+ </para>
747+ </listitem>
748+ </varlistentry>
749+
750+ <varlistentry>
751+ <term><literal>CREATE COLLATION numeric (provider = icu, locale = 'en-u-kn-true')</literal></term>
752+ <listitem>
753+ <para>
754+ Numeric ordering: sequences of digits are sorted by their numeric value,
755+ for example, <literal>A-21</literal> &lt; <literal>A-123</literal>
756+ (also known as natural sort).
757+ </para>
758+ </listitem>
759+ </varlistentry>
760+ </variablelist>
761+
762+ See <ulink url="http://unicode.org/reports/tr35/tr35-collation.html">Unicode
763+ Technical Standard #35</ulink>
764+ and <ulink url="https://tools.ietf.org/html/bcp47">BCP 47</ulink> for
765+ details. The list of possible collation types (<literal>co</literal>
766+ subtag) can be found in
767+ the <ulink url="http://www.unicode.org/repos/cldr/trunk/common/bcp47/collation.xml">CLDR
768+ repository</ulink>.
769+ The <ulink url="https://ssl.icu-project.org/icu-bin/locexp">ICU Locale
770+ Explorer</ulink> can be used to check the details of a particular locale
771+ definition.
772+ </para>
773+
774+ <para>
775+ Note that while this system allows creating collations that <quote>ignore
776+ case</quote> or <quote>ignore accents</quote> or similar (using
777+ the <literal>ks</literal> key), PostgreSQL does not at the moment allow
778+ such collations to act in a truly case- or accent-insensitive manner. Any
779+ strings that compare equal according to the collation but are not
780+ byte-wise equal will be sorted according to their byte values.
781+ </para>
712782 </sect4>
713783 </sect3>
714784
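
The numeric-ordering entry in the patch (locale 'en-u-kn-true') can be approximated outside the database to see what it changes. This is a minimal Python sketch, not ICU and not part of this commit; `natural_key` is a hypothetical helper that handles only ASCII digit runs, and real ICU collation differs in many other respects.

```python
import re

def natural_key(s):
    """Split s into alternating text and digit runs; digit runs become ints.

    re.split with a capturing group always yields text, digits, text, ...
    so strings and ints never meet at the same list position.
    """
    parts = re.split(r"(\d+)", s)
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

codes = ["A-123", "A-21", "A-3"]

# Plain code point comparison looks at '1' < '2' < '3' character by character.
plain = sorted(codes)

# The natural key compares 3 < 21 < 123 as numbers.
natural = sorted(codes, key=natural_key)
```

With the plain sort, A-123 comes first because the comparison stops at the first differing character. With the key above, A-21 sorts before A-123, which matches the example given in the patch text.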
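
The last paragraph added by the patch says that strings which compare equal under the collation but are not byte-wise equal are still ordered by their byte values. The sketch below illustrates that two-step comparison in plain Python. It is an assumption-laden model, not PostgreSQL or ICU code: `str.casefold()` stands in for a case-ignoring collation (such as one defined with the `ks` key), and `collation_cmp` is a hypothetical name.

```python
from functools import cmp_to_key

def collation_cmp(a, b):
    """Compare with a case-ignoring rule first, then break ties by bytes."""
    ka, kb = a.casefold(), b.casefold()  # stand-in for the collation's comparison
    if ka != kb:
        return -1 if ka < kb else 1
    # The collation considers the strings equal, so fall back to the
    # UTF-8 byte values; the result is 0 only for identical strings.
    ba, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ba > bb) - (ba < bb)

words = ["apple", "banana", "Apple", "APPLE"]
ordered = sorted(words, key=cmp_to_key(collation_cmp))
```

Under this model "apple" and "Apple" never compare as equal, so an equality test or a unique constraint would still treat them as distinct values, which is the limitation the paragraph describes.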