array_merge() inside looping optimization

  115583
July 25, 2021 21:34 david.proweb@gmail.com (David Rodrigues)
Hi!

Calling array_merge() inside a loop can greatly hurt performance. In the
tests I did, merging inside the loop costs 250% more than merging once
after it:

- Inside: for(...) $x = array_merge($x, $y)
- After: for(...) $x[] = $y; array_merge(... $x)

Even using array_push() doesn't seem like a good alternative as it costs
140%.
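
For reference, the three variants compared above can be sketched roughly as
follows. This is a minimal illustration, not the benchmark from the links
below: the `$chunks` input and all variable names are made up, and using
argument unpacking with array_push() is an assumption about how that variant
was written.

```php
<?php
// Illustrative input (assumption): a list of small arrays to flatten.
$chunks = [[1, 2], [3], [4, 5, 6]];

// Variant 1: array_merge() inside the loop.
// Each call returns a new array that replaces $inside.
$inside = [];
foreach ($chunks as $chunk) {
    $inside = array_merge($inside, $chunk);
}

// Variant 2: collect the pieces first, then merge once after the loop.
$collected = [];
foreach ($chunks as $chunk) {
    $collected[] = $chunk;
}
$after = array_merge(...$collected);

// Variant 3: array_push() with argument unpacking inside the loop.
// array_push() takes its first argument by reference and appends to it.
$pushed = [];
foreach ($chunks as $chunk) {
    array_push($pushed, ...$chunk);
}
```

All three variants produce the same flat list for this input; only the amount
of work done per iteration differs.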

https://pastebin.com/s7f4Ttm3
https://3v4l.org/Vqt7o (+83% for array_push(), +760% for array_merge()
inside)

As far as I know, array_merge() returns a copy of the array rather than
updating the input array. That makes sense, but in some fairly common
situations it might be possible to optimize the case where the destination
variable is also one of the arguments.

$arr = array_merge($arr, [ 1, 2, 3 ]); // or
$arr = array_merge([ 1, 2, 3 ], $arr);

This could reduce processing cost and memory consumption, as it would no
longer be necessary to copy the array to generate another one.

Anyway, I don't know if this is actually possible, or if the cost-benefit
would be better than what there is today. But I believe it is a path to be
thought of.


Best regards,
David Rodrigues
  115585
July 26, 2021 13:41 pollita@php.net (Sara Golemon)
On Sun, Jul 25, 2021 at 4:35 PM David Rodrigues <david.proweb@gmail.com>
wrote:
> Anyway, I don't know if this is actually possible, or if the cost-benefit
> would be better than what there is today. But I believe it is a path to be
> thought of.

Sadly, no. The function doesn't know where the variable is going to be
stored to, so it can't know if the one reference it's receiving is about to
be overwritten. Technically, the information is in the op_array, but
exposing that to an implementation handler is... dangerous and sets a bad
precedent.

A much more direct and maintainable solution would be to introduce a new
function that intentionally modifies by reference. This doesn't give us a
free fix for existing code, but it does provide a path for new code to be
more efficient.

So then you have to ask, should we kitchen sink something in for this?
Maybe, the perf argument is compelling. I'd also argue that this request
reinforces calls for type methods. $arr->push($otherArr); would be a
stronger candidate, for example. In fact, I think the DS extension might
actually have what you're looking for already.

TL;DR - Maybe look at https://www.php.net/manual/en/ds-map.putall.php ?

-Sara
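
To make the idea of a function that "intentionally modifies by reference"
concrete, here is a minimal userland sketch. The name `array_merge_into()`
is hypothetical and does not exist in PHP; it only illustrates the shape such
a function could take. Unlike array_merge(), this sketch does not renumber
integer keys already present in the destination.

```php
<?php
// Hypothetical helper (not part of PHP): merges $sources into $dest in place.
function array_merge_into(array &$dest, array ...$sources): void
{
    foreach ($sources as $source) {
        foreach ($source as $key => $value) {
            if (is_int($key)) {
                // Integer keys are appended, like array_merge() does.
                $dest[] = $value;
            } else {
                // String keys overwrite existing entries, like array_merge().
                $dest[$key] = $value;
            }
        }
    }
}

// Usage: the destination is updated directly, no reassignment needed.
$dest = [1, 2];
array_merge_into($dest, [3], ['k' => 'v']);
```

Whether a native version of this would be faster in practice is not
something this sketch demonstrates; it only shows the calling convention.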
  115587
July 26, 2021 14:28 claude.pache@gmail.com (Claude Pache)
> On 26 Jul 2021, at 15:41, Sara Golemon <pollita@php.net> wrote:
>
> A much more direct and maintainable solution would be to introduce a new
> function that intentionally modifies by reference.

That function already exists, it is `array_push()`:

`$x = array_merge($x, $y);` → `array_push($x, ...$y)`

-Claude
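
A small sketch of that substitution, assuming both arrays are plain lists
(sequential integer keys). The sample values are made up. For arrays with
string keys, or with non-sequential integer keys in `$x`, the two forms are
not guaranteed to give the same result, so this should be read as covering
the list case only.

```php
<?php
// Illustrative list-like arrays (assumption: no string keys involved).
$x = ['a', 'b'];
$y = ['c', 'd'];

// array_merge() returns a new array and leaves $x untouched.
$merged = array_merge($x, $y);

// array_push() takes $x by reference and appends the unpacked values of $y.
array_push($x, ...$y);

// For list-like inputs, $x now holds the same elements as $merged.
```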