| author | Marc Mutz <marc.mutz@qt.io> | 2024-10-01 12:01:24 +0200 |
|---|---|---|
| committer | Marc Mutz <marc.mutz@qt.io> | 2024-10-08 10:26:37 +0200 |
| commit | 62108a08c12abfc1421c283cf34e75ffeded2c12 | |
| tree | 1451bedf6148583a73851775559919c7c58a6889 /src/corelib/kernel/qmetaobjectbuilder.cpp | |
| parent | c095f7fbf820ac944c5d3096f48dd18752a218b3 | |
QStringConverter/ICU: optimize NUL-termination of codec name
ICU unfortunately requires converter names to be passed as
NUL-terminated C strings. This means that the names that come in via
QAnyStringView have to be encoding-converted (assuming US-ASCII,
i.e. Latin-1), and NUL-terminated.
The old code used the convenient toString().toLatin1() methods for
this. This, however, transforms L1 and U8 inputs twice: first to
UTF-16, then to L1. It also always allocates memory.
To fix, first change the temporary string container to std::string
(which has an SSO buffer into which most common charset names will
fit, avoiding memory allocation) and then skip the conversion to
UTF-16, going directly from the source encoding to L1, treating UTF-8
as L1 (because US-ASCII is a common subset of both).
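A minimal sketch of the idea in plain standard C++, without Qt, follows.
The helper names fromLatin1OrUtf8() and fromUtf16() are hypothetical and
are not the functions used in the actual patch; std::string_view and
std::u16string_view stand in for the L1/U8 and U16 cases of
QAnyStringView. Whether a given name fits the SSO buffer depends on the
standard library implementation.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: Latin-1 or UTF-8 input. Charset names are expected
// to be US-ASCII, which is a subset of both encodings, so the bytes are
// copied unchanged. std::string guarantees that c_str() is NUL-terminated.
std::string fromLatin1OrUtf8(std::string_view name)
{
    return std::string(name);
}

// Hypothetical helper: UTF-16 input. Each code unit is narrowed to Latin-1;
// anything above U+00FF cannot be represented and becomes '?'.
std::string fromUtf16(std::u16string_view name)
{
    std::string result;
    result.reserve(name.size());
    for (char16_t c : name) {
        if (c <= 0xFF)
            result.push_back(static_cast<char>(static_cast<unsigned char>(c)));
        else
            result.push_back('?');
    }
    return result;
}
```

Short names such as "UTF-8" would typically stay inside the SSO buffer,
so no heap allocation is needed, and the result's c_str() can be handed
to a C API that expects a NUL-terminated string.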
Unfortunately, our L1-to-U16 converter doesn't allow selecting a
replacement character other than '?' for out-of-range input
characters, but valid charset names should not contain question marks,
so here's hoping that ICU doesn't strip them willy-nilly, causing
false-positive matches. The old code had the same problem.
Amends f6c11ac4f20a16d0b2113014e2dac63b95d946ae.
Pick-to: 6.8
Fixes: QTBUG-126109
Change-Id: If1dd494cf4ee8e2d304a0648c22dc8806718f104
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
