Commit 976a1a4
Improve to_date/to_number/to_timestamp behavior with multibyte characters.

The documentation says that these functions skip one input character
per literal (non-pattern) format character. Actually, though, they
skipped one input *byte* per literal *byte*, which could be hugely
confusing if either data or format contained multibyte characters.
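
To make the mismatch concrete, here is a minimal sketch contrasting the two skipping rules. It assumes UTF-8 input; utf8_mblen() is a hypothetical stand-in for pg_mblen(), and skip_literal_bytes() and skip_literal_chars() are illustrative names, not functions from formatting.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8 character
 * starting at s.  Never reports a length that runs past the terminating NUL.
 */
static int
utf8_mblen(const char *s)
{
    unsigned char c = (unsigned char) s[0];
    int         expected;
    int         len;

    if (c < 0x80)
        expected = 1;
    else if ((c & 0xE0) == 0xC0)
        expected = 2;
    else if ((c & 0xF0) == 0xE0)
        expected = 3;
    else if ((c & 0xF8) == 0xF0)
        expected = 4;
    else
        expected = 1;           /* invalid lead byte: treat as one byte */

    for (len = 1; len < expected && s[len] != '\0'; len++)
        ;
    return len;
}

/* Byte-wise rule: skip one input byte per literal format byte. */
static const char *
skip_literal_bytes(const char *input, const char *literal)
{
    assert(input != NULL && literal != NULL);
    for (; *literal && *input; literal++)
        input++;
    return input;
}

/* Documented rule: skip one input character per literal format character. */
static const char *
skip_literal_chars(const char *input, const char *literal)
{
    assert(input != NULL && literal != NULL);
    while (*literal && *input)
    {
        literal += utf8_mblen(literal);
        input += utf8_mblen(input);
    }
    return input;
}
```

With a two-byte literal in the format and ASCII data, the byte-wise rule swallows one data character too many; with an ASCII literal and a two-byte data character, it stops in the middle of that character.
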
To fix, adjust the FormatNode representation and parse_format() so
that multibyte format characters are stored as one FormatNode not
several, and adjust the data-skipping bits to advance by pg_mblen()
not necessarily one byte. There's no user-visible behavior change
on the to_char() side, although the internal representation changes.
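
The shape of the fix can be sketched as follows, under stated assumptions: LiteralNode, MAX_CHAR_LEN, parse_literals() and skip_for_nodes() are hypothetical names for illustration only, the input is assumed to be UTF-8, and utf8_mblen() stands in for pg_mblen(). The real FormatNode and parse_format() carry more state than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHAR_LEN 4          /* longest UTF-8 sequence; sketch assumption */

/* Hypothetical node: holds one whole literal character, NUL-terminated. */
typedef struct LiteralNode
{
    char        character[MAX_CHAR_LEN + 1];
} LiteralNode;

/* Hypothetical stand-in for pg_mblen(); never runs past the NUL. */
static int
utf8_mblen(const char *s)
{
    unsigned char c = (unsigned char) s[0];
    int         expected;
    int         len;

    if (c < 0x80)
        expected = 1;
    else if ((c & 0xE0) == 0xC0)
        expected = 2;
    else if ((c & 0xF0) == 0xE0)
        expected = 3;
    else if ((c & 0xF8) == 0xF0)
        expected = 4;
    else
        expected = 1;           /* invalid lead byte: treat as one byte */

    for (len = 1; len < expected && s[len] != '\0'; len++)
        ;
    return len;
}

/* Store each literal character as one node; returns the node count. */
static size_t
parse_literals(const char *fmt, LiteralNode *nodes, size_t max_nodes)
{
    size_t      n = 0;

    assert(fmt != NULL && nodes != NULL);
    while (*fmt && n < max_nodes)
    {
        int         len = utf8_mblen(fmt);

        memcpy(nodes[n].character, fmt, (size_t) len);
        nodes[n].character[len] = '\0';
        fmt += len;
        n++;
    }
    return n;
}

/* Consume one input character (not one byte) per literal node. */
static const char *
skip_for_nodes(const char *input, size_t n_nodes)
{
    size_t      i;

    assert(input != NULL);
    for (i = 0; i < n_nodes && *input; i++)
        input += utf8_mblen(input);
    return input;
}
```

Because a multibyte literal now occupies exactly one node, the number of nodes equals the number of characters to skip, regardless of how many bytes either side uses.
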
Commit e87d496 had already fixed most places where we skip characters
on the basis of non-literal format patterns to advance by characters
not bytes, but this gets one more place, the SKIP_THth macro. I think
everything in formatting.c gets that right now.
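
The macro's definition is not shown here, so the following is only a sketch of the idea: skipping a fixed number of characters (two, assuming an ordinal suffix such as "th") by character length rather than by byte count. skip_chars() and utf8_mblen() are hypothetical helpers, and UTF-8 input is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for pg_mblen(); never runs past the NUL. */
static int
utf8_mblen(const char *s)
{
    unsigned char c = (unsigned char) s[0];
    int         expected;
    int         len;

    if (c < 0x80)
        expected = 1;
    else if ((c & 0xE0) == 0xC0)
        expected = 2;
    else if ((c & 0xF0) == 0xE0)
        expected = 3;
    else if ((c & 0xF8) == 0xF0)
        expected = 4;
    else
        expected = 1;           /* invalid lead byte: treat as one byte */

    for (len = 1; len < expected && s[len] != '\0'; len++)
        ;
    return len;
}

/*
 * Advance ptr by up to nchars whole characters, stopping early at the
 * end of the string so the pointer never moves past the terminator.
 */
static const char *
skip_chars(const char *ptr, int nchars)
{
    assert(ptr != NULL);
    while (nchars-- > 0 && *ptr)
        ptr += utf8_mblen(ptr);
    return ptr;
}
```

Calling skip_chars(ptr, 2) steps over "th" in ASCII data and equally over two multibyte characters, where a fixed two-byte advance would stop partway through the suffix.
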
It'd be nice to have some regression test cases covering this behavior;
but of course there's no way to do so in an encoding-agnostic way, and
many of the interesting aspects would also require unportable locale
selections. So I've not bothered here.

Discussion: https://postgr.es/m/28186.1510957703@sss.pgh.pa.us

1 parent 63ca863 commit 976a1a4
1 file changed, +41 -27 lines changed