@bwoebi bwoebi commented Nov 19, 2025

This adds str_extend(), which pre-allocates a large string buffer for use with subsequent concatenations. To benefit from it, users must take care to keep the string's refcount at 1.
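
To illustrate why the refcount matters, here is a minimal sketch in C of a toy refcounted buffer. It is not the engine's zend_string and not code from this PR: `toy_str`, `toy_new`, `toy_release` and `toy_append_byte` are hypothetical names made up for the example. The assumption it models is that an append can only reuse spare capacity when the buffer has a single owner; a shared buffer has to be separated into a fresh copy first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: NOT the Zend engine's zend_string. */
typedef struct {
    unsigned refcount; /* number of holders of this buffer */
    size_t len;        /* bytes in use */
    size_t cap;        /* bytes allocated for data */
    char *data;
} toy_str;

/* Create a buffer holding `s` with at least `cap` bytes reserved. */
static toy_str *toy_new(const char *s, size_t cap) {
    size_t len = strlen(s);
    toy_str *t = malloc(sizeof *t);
    if (!t) return NULL;
    if (cap < len) cap = len;
    t->data = malloc(cap ? cap : 1);
    if (!t->data) { free(t); return NULL; }
    memcpy(t->data, s, len);
    t->refcount = 1;
    t->len = len;
    t->cap = cap;
    return t;
}

/* Drop one holder; free the buffer when the last holder is gone. */
static void toy_release(toy_str *t) {
    assert(t->refcount > 0);
    if (--t->refcount == 0) { free(t->data); free(t); }
}

/* Append one byte to *slot. Returns 1 if the append happened in place,
 * 0 if a fresh buffer had to be allocated, -1 on allocation failure. */
static int toy_append_byte(toy_str **slot, char c) {
    toy_str *t = *slot;
    if (t->refcount == 1 && t->len < t->cap) {
        t->data[t->len++] = c; /* sole owner with spare room: no allocation */
        return 1;
    }
    /* Shared or full: build a separate, exactly sized copy. In this toy
     * model the reserved spare capacity is not carried over. */
    toy_str *n = malloc(sizeof *n);
    if (!n) return -1;
    n->data = malloc(t->len + 1);
    if (!n->data) { free(n); return -1; }
    memcpy(n->data, t->data, t->len);
    n->data[t->len] = c;
    n->refcount = 1;
    n->len = n->cap = t->len + 1;
    toy_release(t);
    *slot = n;
    return 0;
}
```

In this model, a second holder (for example, assigning the string to another variable) turns the next append into an allocation plus a copy, and the reserved capacity is lost for the new copy.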

Note that calling str_extend("", $size) (i.e. extending an empty string) is generally pointless: concatenation, for example, special-cases empty strings and simply uses the other operand. (This is also why there is no str_alloc($size): its result would be pointless and thrown away during the concat op.)

Even without str_extend(), the general case of appending a single byte in a loop gets about 8% faster, because zend_string_extend() now uses perealloc3. In particular, zend_string_extend() will mostly hit the fast path of zend_mm_realloc_heap for huge allocations.

When using str_extend(), appending a single byte in a loop is 33% faster than the old baseline.

The tested loop is:

$str = str_extend("a", 1 << 26);
for ($i = 0; $i < 1 << 25; ++$i) {
        $str .= "a";
}

Specifically, the hyperfine results (x.php is the test script above; y.php is the same script, but initializing $str with "a" directly instead of calling str_extend()):

# hyperfine '/root/php-src-X/baseline-php -dmemory_limit=1G y.php'
Benchmark 1: /root/php-src-X/baseline-php -dmemory_limit=1G y.php
  Time (mean ± σ):     495.3 ms ±  10.2 ms    [User: 348.1 ms, System: 137.8 ms]
  Range (min … max):   478.8 ms … 510.5 ms    10 runs

# hyperfine '/root/php-src-X/sapi/cli/php -dmemory_limit=1G y.php'
Benchmark 1: /root/php-src-X/sapi/cli/php -dmemory_limit=1G y.php
  Time (mean ± σ):     456.2 ms ±   8.4 ms    [User: 298.1 ms, System: 152.5 ms]
  Range (min … max):   443.4 ms … 468.5 ms    10 runs

# hyperfine '/root/php-src-X/sapi/cli/php -dmemory_limit=1G x.php'
Benchmark 1: /root/php-src-X/sapi/cli/php -dmemory_limit=1G x.php
  Time (mean ± σ):     325.7 ms ±   7.5 ms    [User: 288.7 ms, System: 29.7 ms]
  Range (min … max):   312.3 ms … 339.1 ms    10 runs
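
For reference, the percentages quoted above can be re-derived from the hyperfine means. The helper names below (`time_reduction_pct`, `speedup_factor`) are made up for this sketch; the inputs are the mean wall times from the three runs above.

```c
#include <assert.h>

/* Percentage reduction in mean wall time going from before_ms to after_ms. */
static double time_reduction_pct(double before_ms, double after_ms) {
    assert(before_ms > 0.0);
    return (before_ms - after_ms) / before_ms * 100.0;
}

/* How many times faster the "after" run is, as a plain ratio. */
static double speedup_factor(double before_ms, double after_ms) {
    assert(after_ms > 0.0);
    return before_ms / after_ms;
}

/* Means taken from the hyperfine output:
 *   baseline-php, y.php : 495.3 ms
 *   patched php,  y.php : 456.2 ms
 *   patched php,  x.php : 325.7 ms
 *
 * time_reduction_pct(495.3, 456.2) is about 7.9
 * time_reduction_pct(495.3, 325.7) is about 34.2
 * speedup_factor(495.3, 325.7)     is about 1.52
 */
```

These are close to the roughly 8% and 33% figures quoted in the description. Note also that most of the difference between the last two runs is in system time (152.5 ms versus 29.7 ms), while user time barely changes (298.1 ms versus 288.7 ms).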

I haven't measured improvements or overhead outside of this synthetic case (I'm not quite sure what public code to test against), though I did observe some improvements in string processing code.

Ideally, this feature may eventually enable JIT optimizations in which looped string appends are done cheaply by checking a bare capacity counter against the value initially passed to str_extend(), avoiding repeated string extends.
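
A rough sketch of what such a capacity-counter loop could look like, written as plain C rather than JIT output. Nothing here is from the PR or the engine: `cap_buf`, `cap_buf_init`, `cap_buf_grow` and `append_n_bytes` are hypothetical names, and the doubling growth policy is just an assumption for the example. The point is only that the hot loop touches a local counter and leaves it for the slow path when the reserved room runs out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type for illustration only. */
typedef struct {
    char *data;
    size_t len;               /* bytes in use */
    size_t cap;               /* bytes allocated */
    unsigned long slow_calls; /* how often the slow (realloc) path ran */
} cap_buf;

/* Start from `seed` with at least `reserve` bytes allocated up front. */
static int cap_buf_init(cap_buf *b, const char *seed, size_t reserve) {
    size_t len = strlen(seed);
    size_t cap = reserve > len ? reserve : len;
    b->data = malloc(cap ? cap : 1);
    if (!b->data) return -1;
    memcpy(b->data, seed, len);
    b->len = len;
    b->cap = cap;
    b->slow_calls = 0;
    return 0;
}

/* Slow path: grow the allocation (doubling is an assumption here). */
static int cap_buf_grow(cap_buf *b) {
    size_t ncap = b->cap ? b->cap * 2 : 16;
    char *nd = realloc(b->data, ncap);
    if (!nd) return -1;
    b->data = nd;
    b->cap = ncap;
    b->slow_calls++;
    return 0;
}

/* Append byte `c` n times. The loop body only decrements a local counter
 * and writes through a local pointer while room remains. */
static int append_n_bytes(cap_buf *b, char c, size_t n) {
    assert(b->len <= b->cap);
    size_t room = b->cap - b->len; /* the "bare capacity counter" */
    char *p = b->data + b->len;
    for (size_t i = 0; i < n; i++) {
        if (room == 0) {
            b->len = (size_t)(p - b->data); /* sync before the slow path */
            if (cap_buf_grow(b) != 0) return -1;
            p = b->data + b->len;
            room = b->cap - b->len;
        }
        *p++ = c;
        room--;
    }
    b->len = (size_t)(p - b->data);
    return 0;
}
```

With enough room reserved up front, as in the test script's 1 << 26 reservation for 1 << 25 appends, the slow path in this sketch never runs.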

bwoebi commented Nov 19, 2025

I suppose that test is quite pathological under asan :-D
