|
113 | 113 | <entry><type>real</type></entry> |
114 | 114 | <entry> |
115 | 115 | Same as <function>word_similarity(text, text)</function>, but forces |
116 | | - extent boundaries to match word boundaries. |
| 116 | + extent boundaries to match word boundaries. Since we don't have |
| 117 | + cross-word trigrams, this function actually returns the greatest similarity |
| 118 | + between the first string and any continuous extent of words of the second |
| 119 | + string. |
117 | 120 | </entry> |
118 | 121 | </row> |
119 | 122 | <row> |
|
164 | 167 | This function returns a value that can be approximately understood as the |
165 | 168 | greatest similarity between the first string and any substring of the second |
166 | 169 | string. However, this function does not add padding to the boundaries of |
167 | | - the extent. Thus, a whole word match gets a higher score than a match with |
168 | | - a part of the word. |
| 170 | + the extent. Thus, the number of additional characters present in the |
| 171 | + second string is not considered, except for the mismatched word boundary. |
169 | 172 | </para> |
170 | 173 |
|
171 | 174 | <para> |
172 | 175 | At the same time, <function>strict_word_similarity(text, text)</function> |
173 | | - has to select an extent that matches word boundaries. In the example above, |
| 176 | + selects an extent of words in the second string. In the example above, |
174 | 177 | <function>strict_word_similarity(text, text)</function> would select the |
175 | | - extent <literal>{"  w"," wo","wor","ord","rds","ds "}</literal>, which |
176 | | - corresponds to the whole word <literal>'words'</literal>. |
| 178 | + extent of a single word <literal>'words'</literal>, whose set of trigrams is |
| 179 | + <literal>{"  w"," wo","wor","ord","rds","ds "}</literal>. |
177 | 180 |
|
178 | 181 | <programlisting> |
179 | 182 | # SELECT strict_word_similarity('word', 'two words'), similarity('word', 'words'); |
|
186 | 189 |
|
187 | 190 | <para> |
188 | 191 | Thus, the <function>strict_word_similarity(text, text)</function> function |
189 | | - is useful for finding similar subsets of whole words, while |
| 192 | + is useful for finding the similarity to whole words, while |
190 | 193 | <function>word_similarity(text, text)</function> is more suitable for |
191 | | - searching similar parts of words. |
| 194 | + finding the similarity for parts of words. |
192 | 195 | </para> |
193 | 196 |
|
194 | 197 | <table id="pgtrgm-op-table"> |
|
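The reworded description of <function>strict_word_similarity(text, text)</function> can be checked with a small model. The following is a minimal Python sketch, not the pg_trgm implementation. It assumes that words are runs of ASCII letters and digits, that each word is padded with two leading spaces and one trailing space before trigrams are taken, and that similarity is the number of shared trigrams divided by the number of distinct trigrams in both sets. The helper names `trigrams`, `similarity`, and `strict_word_similarity_sketch` are hypothetical and exist only for this illustration.

```python
import re

WORD_RE = re.compile(r"[a-z0-9]+")  # assumption: ASCII alphanumeric words only


def trigrams(text):
    """Return the set of trigrams of text, padding each word separately."""
    grams = set()
    for word in WORD_RE.findall(text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def strict_word_similarity_sketch(a, b):
    """Greatest similarity between a and any continuous extent of words of b."""
    words = WORD_RE.findall(b.lower())
    best = 0.0
    for start in range(len(words)):
        for end in range(start + 1, len(words) + 1):
            extent = " ".join(words[start:end])
            best = max(best, similarity(a, extent))
    return best


if __name__ == "__main__":
    # Both values are 4/7: the best extent of 'two words' is the word 'words'.
    print(f"{strict_word_similarity_sketch('word', 'two words'):.6f}")  # 0.571429
    print(f"{similarity('word', 'words'):.6f}")  # 0.571429
```

Under these assumptions the sketch reproduces the point of the example in the diff: the extent chosen in `'two words'` is the single word `'words'`, so the strict score equals `similarity('word', 'words')`. The brute-force loop over all word extents is chosen for readability only.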