[RFC] Make sorting stable

  110143
May 12, 2020 14:18 nikita.ppv@gmail.com (Nikita Popov)
Hi internals,

This was previously discussed in https://externals.io/message/108841, but I
figured it would make sense to create an RFC for this change:
https://wiki.php.net/rfc/stable_sorting

As before, the implementation approach is to stick with the existing qsort
and use a fallback comparison criterion, which is cheap to implement
internally. The obvious alternative is to use a stable base sort like
Timsort. I gave this a quick try in https://github.com/php/php-src/pull/5559,
and it does better in some cases (already sorted data) and worse in others
(random data). I don't plan to pursue this direction personally. (It may
also be interesting to use pdqsort as the unstable base, which should make
the "already sorted" case faster.)

Regards,
Nikita
  110144
May 12, 2020 14:28 nicolas.grekas+php@gmail.com (Nicolas Grekas)
Hi Nikita,

Thanks for another nice RFC.

This was previously discussed in https://externals.io/message/108841, but I
> figured it would make sense to create an RFC for this change: > https://wiki.php.net/rfc/stable_sorting > > As before, the implementation approach is to stick with the existing qsort > and use a fallback comparison criterion, which is cheap to implement > internally. The obvious alternative is to use a stable base sort like > Timsort. I gave this a quick try in > https://github.com/php/php-src/pull/5559, > and it does better in some cases (already sorted data) and worse in others > (random data). I don't plan to pursue this direction personally. (It may > also be interesting to use pdqsort as the unstable base, which should make > the "already sorted" case faster.) >
IIRC, in PHP 5 sorting was stable. Then PHP 7 made it unstable. That was something to account for when migrating - BC break inside. Do you know why did this happen at the time? On my side, I would very much prefer a new sorting flag, to be 200% sure of the behavior: sort($array, SORT_STABLE) That would allow ppl that don't care about stability to let the engine go with the fastest algorithm. WDYT? Nicolas
  110145
May 12, 2020 14:39 nikita.ppv@gmail.com (Nikita Popov)
On Tue, May 12, 2020 at 4:29 PM Nicolas Grekas grekas+php@gmail.com>
wrote:

> Hi Nikita, > > Thanks for another nice RFC. > > This was previously discussed in https://externals.io/message/108841, but >> I >> figured it would make sense to create an RFC for this change: >> https://wiki.php.net/rfc/stable_sorting >> >> As before, the implementation approach is to stick with the existing qsort >> and use a fallback comparison criterion, which is cheap to implement >> internally. The obvious alternative is to use a stable base sort like >> Timsort. I gave this a quick try in >> https://github.com/php/php-src/pull/5559, >> and it does better in some cases (already sorted data) and worse in others >> (random data). I don't plan to pursue this direction personally. (It may >> also be interesting to use pdqsort as the unstable base, which should make >> the "already sorted" case faster.) >> > > IIRC, in PHP 5 sorting was stable. Then PHP 7 made it unstable. That was > something to account for when migrating - BC break inside. >
In PHP 5 sorting was unstable. In PHP 7 sorting became stable for arrays with 16 elements or less. In PHP 8 sorting would become stable for *all* arrays.
> On my side, I would very much prefer a new sorting flag, to be 200% sure > of the behavior: > sort($array, SORT_STABLE) > > That would allow ppl that don't care about stability to let the engine go > with the fastest algorithm. >
I'd prefer to avoid an extra option. PHP is a high-level language, and the user should not have to concern themselves with sort stability considerations. I'd go for a flag if stable sorting had a huge perf impact. But as it doesn't, I don't see the point in an option. Regards, Nikita
  110146
May 12, 2020 14:52 sebastian@php.net (Sebastian Bergmann)
Am 12.05.2020 um 16:39 schrieb Nikita Popov:
> I'd prefer to avoid an extra option. PHP is a high-level language, and the > user should not have to concern themselves with sort stability > considerations.
+1
  110299
May 29, 2020 11:05 nikita.ppv@gmail.com (Nikita Popov)
On Tue, May 12, 2020 at 4:18 PM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > This was previously discussed in https://externals.io/message/108841, but > I figured it would make sense to create an RFC for this change: > https://wiki.php.net/rfc/stable_sorting > > As before, the implementation approach is to stick with the existing qsort > and use a fallback comparison criterion, which is cheap to implement > internally. The obvious alternative is to use a stable base sort like > Timsort. I gave this a quick try in > https://github.com/php/php-src/pull/5559, and it does better in some > cases (already sorted data) and worse in others (random data). I don't plan > to pursue this direction personally. (It may also be interesting to use > pdqsort as the unstable base, which should make the "already sorted" case > faster.) >
If nothing new comes up, I plan to open voting on this RFC soon. Nikita