[RFC] clamp

  115076
June 23, 2021 18:42 hallbergkim@gmail.com (Kim Hallberg)
Hello internals,

The RFC for the clamp function is now open and under discussion, you now have 2 weeks 
to discuss, suggest improvements and open issues before voting is considered.

Any and all feedback is welcomed.

The RFC is available for viewing here: https://wiki.php.net/rfc/clamp <https://wiki.php.net/rfc/clamp> 
The implementation is available in a PR here: https://github.com/php/php-src/pull/7191 <https://github.com/php/php-src/pull/7191>

Thank you,
Kim Hallberg
  115085
June 23, 2021 23:50 tysonandre775@hotmail.com (tyson andre)
Hi Kim Hallberg,

> The RFC for the clamp function is now open and under discussion, you now have 2 weeks > to discuss, suggest improvements and open issues before voting is considered. > > Any and all feedback is welcomed. > > The RFC is available for viewing here: https://wiki.php.net/rfc/clamp > The implementation is available in a PR here: https://github.com/php/php-src/pull/7191
https://wiki.php.net/rfc/howto mentions:
> Listen to the feedback, and try to answer/resolve all questions. > **Update your RFC to document all the issues and discussions. > Cover both the positive and negative arguments.** Put the RFC URL into all your replies.
So I think the major objections are that: 1. This is easy to implement in userland and there's negligible performance benefit (in code that is a performance sensitive loop, it may still be worse compared to `$num < $min ? $min : ($num > $max : $max : $num)` when validity of types and min/max are known, especially with the JIT) 2. People not being sure if they'd ever use it personally, especially with the ability to customize parameter order and permitted argument types in userland 3. It's inconsistent with min() and max(), which support any comparable type such as GMP(https://www.php.net/manual/en/class.gmp) (arbitrary precision numbers), DateTime, etc., and that may lead to surprises. Although PHP's comparison operator and https://www.php.net/manual/en/function.min.php have many, many inconsistencies, already (Though special casing GMP is probably a bad idea due to it being optional and having an incompatible license with core for packagers) 4. I'm not sure what this is meant to do with the float NAN (Not A Number) from the RFC description, but that's solvable ``` php > var_dump(min(gmp_init('123'), gmp_init('456'))); object(GMP)#1 (1) { ["num"]=> string(3) "123" } php > var_dump(max(new DateTime('@0'), new DateTime())); object(DateTime)#2 (3) { ["date"]=> string(26) "2021-06-23 23:44:47.302531" ["timezone_type"]=> int(3) ["timezone"]=> string(3) "UTC" } php > echo json_encode(max([0,2],[0,1])); [0,2] ``` The RFC should probably link to this RFC announcement thread https://externals.io/message/115076 as well. Thanks, Tyson
  115282
July 3, 2021 14:03 hallbergkim@gmail.com (Kim Hallberg)
> On 24 Jun 2021, at 1:50 AM, tyson andre <tysonandre775@hotmail.com> wrote: > > So I think the major objections are that: > > 1. This is easy to implement in userland and there's negligible performance benefit > (in code that is a performance sensitive loop, it may still be worse compared to `$num < $min ? $min : ($num > $max : $max : $num)` when validity of types and min/max are known, especially with the JIT) > 2. People not being sure if they'd ever use it personally, especially with the ability to customize parameter order and permitted argument types in userland > 3. It's inconsistent with min() and max(), which support any comparable type such as GMP(https://www.php.net/manual/en/class.gmp) > (arbitrary precision numbers), DateTime, etc., and that may lead to surprises. > Although PHP's comparison operator and https://www.php.net/manual/en/function.min.php have many, many inconsistencies, already > > (Though special casing GMP is probably a bad idea due to it being optional and having an incompatible license with core for packagers) > 4. I'm not sure what this is meant to do with the float NAN (Not A Number) from the RFC description, but that's solvable
> On 24 Jun 2021, at 2:15 AM, tyson andre <tysonandre775@hotmail.com> wrote: > > I'd strongly prefer an actual benchmark for context and accuracy - is it actually faster or slower than the most efficient userland implementation and by how much? > E.g. in an optimized NTS build with `CFLAGS=-O2`, opcache enabled(zend_extension=opcache, opcache.enable=1,opcache.enable_cli=1), > and no debug configure flags, how many calls per second can be made on variable values of $num for both? > (I'd assume over twice as fast as calling both min/max from another function, but possibly slower than efficient_clamp, but haven't run this) > > For userland implementations that did use min/max, they probably weren't performance sensitive for the application. > > ```php > function efficient_clamp(int|float $num, int|float $min, int|float $max): int|float { > return $num < $min ? $min : ($num > $max ? $max : $num); > } > ```
Will combine both emails into one here. Will tackle your first point since that has to with performance as your second email also does. Haven’t tested it with opcache enabled so cannot comment on the performance, but for my general testing the internal version is more than 50% faster than the min/max version. Slower than the “efficient” version in some cases and faster in others. That is to be expected though since the efficient version doesn’t check for a valid range, if a user mixes up the min and max values the efficient clamp function doesn’t work as intended and will always return either the min/max version and never the clamped value. For your second point, that could be said for most internal functions, `str_contains`, `str_starts_with` and `str_ends_with` could all three be implemented in userland with customises parameter orders, as can most other internals function. Frankly I don’t see this as much of an objection or issue as a phrase used to gate keep what can become a new standardised internal function. Thirdly, yes I will agree it’s inconsistent with min/max functions. That is mainly - as you stated, min/max takes mixed arguments and works with all comparable types. Though it technically isn’t inconsistent with most userland implementations as they mostly accept numerical values such as ints or floats into their clamp function, meaning only ints and floats are used anyway. Though I could see a version of clamping being useful in the `DatePeriod` class to check if a `DateTime` is in range of a certain date range, but that’s OT for this RFC so. As for your last and forth point, I’ll be updating the RFC with NaN behaviour, sticking with what Nikita suggested to throw when NaN is added to min or max, and return NaN is it’s added to value. Thanks, Kim.
  115088
June 24, 2021 00:15 tysonandre775@hotmail.com (tyson andre)
Hello Kim Hallberg,

> The RFC for the clamp function is now open and under discussion, you now have 2 weeks > to discuss, suggest improvements and open issues before voting is considered.
From https://wiki.php.net/rfc/clamp -
> Current userland implementations are handled in several ways, some of which use min and max to check the bound, > which is costly and slow when called often. > Because userland implementations are for the most part not cost-effective when called multiple times, > a language implementation is desired.
I'd strongly prefer an actual benchmark for context and accuracy - is it actually faster or slower than the most efficient userland implementation and by how much? E.g. in an optimized NTS build with `CFLAGS=-O2`, opcache enabled(zend_extension=opcache, opcache.enable=1,opcache.enable_cli=1), and no debug configure flags, how many calls per second can be made on variable values of $num for both? (I'd assume over twice as fast as calling both min/max from another function, but possibly slower than efficient_clamp, but haven't run this) For userland implementations that did use min/max, they probably weren't performance sensitive for the application. ```php function efficient_clamp(int|float $num, int|float $min, int|float $max): int|float { return $num < $min ? $min : ($num > $max ? $max : $num); } ``` Thanks, Tyson
  115103
June 24, 2021 09:44 michal.brzuchalski@gmail.com (=?UTF-8?Q?Micha=C5=82_Marcin_Brzuchalski?=)
czw., 24 cze 2021 o 02:15 tyson andre <tysonandre775@hotmail.com>
napisał(a):

> Hello Kim Hallberg, > > > The RFC for the clamp function is now open and under discussion, you now > have 2 weeks > > to discuss, suggest improvements and open issues before voting is > considered. > > > From https://wiki.php.net/rfc/clamp - > > > Current userland implementations are handled in several ways, some of > which use min and max to check the bound, > > which is costly and slow when called often. > > Because userland implementations are for the most part not > cost-effective when called multiple times, > > a language implementation is desired. > > I'd strongly prefer an actual benchmark for context and accuracy - is it > actually faster or slower than the most efficient userland implementation > and by how much? > E.g. in an optimized NTS build with `CFLAGS=-O2`, opcache > enabled(zend_extension=opcache, opcache.enable=1,opcache.enable_cli=1), > and no debug configure flags, how many calls per second can be made on > variable values of $num for both? > (I'd assume over twice as fast as calling both min/max from another > function, but possibly slower than efficient_clamp, but haven't run this) > > For userland implementations that did use min/max, they probably weren't > performance sensitive for the application. > > ```php > function efficient_clamp(int|float $num, int|float $min, int|float $max): > int|float { > return $num < $min ? $min : ($num > $max ? $max : $num); > } > ``` >
Do we really need it? IMO the performance of this is not gonna change anything much. We're talking about nanoseconds here possibly. I'm looking forward to Hello World RFC - that's also often implemented in userland. ```php function hello(?string $who = 'World', bool $return = false, bool $newline = true): ?string { $str = "Hello {$who}!" . ($newline ? PHP_EOL : ''); if ($return) return $str; echo $str; } ``` I get an impression that we constantly add things into standard library which are from a language perspective irrelevant and that all could be developed in userland with no harm. Cheers, -- Michał Marcin Brzuchalski
  115114
June 24, 2021 11:59 pierre-php@processus.org (Pierre)
Le 24/06/2021 à 11:44, Michał Marcin Brzuchalski a écrit :
> I get an impression that we constantly add things into standard library > which are from a language perspective irrelevant > and that all could be developed in userland with no harm. > > Cheers, > -- > Michał Marcin Brzuchalski
Hello, I respectfully do not agree with this last statement. A simple method such as this one entirely belong to the standard library. I wouldn't want to end-up with a NPM-like ecosystem where I'd have to choose between clamp-php, php-clamp, php-bound-int, int-php-bound, etc... The more the standard library will be complete, the less projects will need dependencies, which is better for your projects long term maintenance. You often end-up in dependency resolution hell because library A use library B, which use foo/clamp which is not compatible with another C library you use, or because it doesn't support PHP core version X or Y. A simple function such as clamp in standard library makes much sense, anything like it must go into standard library, this way the userland ecosystem can focus on domain related stuff, or more complex topics, without worrying too much about this kind of detail. Having a great and complete standard library lowers community fragmentation, and that's for the best ! Moreover, it also ensures that all developers speak the same language regarding primitives provided by the standard library, and makes communication more efficient within community. Regards, -- Pierre
  115122
June 24, 2021 16:08 mike@newclarity.net (Mike Schinkel)
> On Jun 24, 2021, at 7:59 AM, Pierre <pierre-php@processus.org> wrote: > > Le 24/06/2021 à 11:44, Michał Marcin Brzuchalski a écrit : >> I get an impression that we constantly add things into standard library >> which are from a language perspective irrelevant >> and that all could be developed in userland with no harm. >> >> Cheers, >> -- >> Michał Marcin Brzuchalski > > Hello, > > I respectfully do not agree with this last statement. A simple method such as this one entirely belong to the standard library. I wouldn't want to end-up with a NPM-like ecosystem where I'd have to choose between clamp-php, php-clamp, php-bound-int, int-php-bound, etc...
> > The more the standard library will be complete, the less projects will need dependencies, which is better for your projects long term maintenance. You often end-up in dependency resolution hell because library A use library B, which use foo/clamp which is not compatible with another C library you use, or because it doesn't support PHP core version X or Y. > > A simple function such as clamp in standard library makes much sense, anything like it must go into standard library, this way the userland ecosystem can focus on domain related stuff, or more complex topics, without worrying too much about this kind of detail. Having a great and complete standard library lowers community fragmentation, and that's for the best ! > > Moreover, it also ensures that all developers speak the same language regarding primitives provided by the standard library, and makes communication more efficient within community.
This. All of this. -Mike
  115167
June 28, 2021 09:16 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Jun 23, 2021 at 8:43 PM Kim Hallberg <hallbergkim@gmail.com> wrote:

> Hello internals, > > The RFC for the clamp function is now open and under discussion, you now > have 2 weeks > to discuss, suggest improvements and open issues before voting is > considered. > > Any and all feedback is welcomed. > > The RFC is available for viewing here: https://wiki.php.net/rfc/clamp < > https://wiki.php.net/rfc/clamp> > The implementation is available in a PR here: > https://github.com/php/php-src/pull/7191 < > https://github.com/php/php-src/pull/7191> > > Thank you, > Kim Hallberg >
The RFC is missing a precise description of how this function works with NaN and negative zero. The expected behavior is that if min or max are NaN, an exception is thrown, if num if NaN then NaN is returned, and the behavior wrt negative zero is unspecified. This makes for an IEEE754-2008 style clamping operation. Making negative zero smaller than positive zero would be an IEEE754-2019 style clamp. I'm not really convinced that this is a worthwhile addition, but also not particularly opposed to it. Regards, Nikita
  115283
July 3, 2021 14:04 hallbergkim@gmail.com (Kim Hallberg)
> On 28 Jun 2021, at 11:16 AM, Nikita Popov ppv@gmail.com> wrote: > > The RFC is missing a precise description of how this function works with NaN and negative zero. The expected behavior is that if min or max are NaN, an exception is thrown, if num if NaN then NaN is returned, and the behavior wrt negative zero is unspecified. This makes for an IEEE754-2008 style clamping operation. Making negative zero smaller than positive zero would be an IEEE754-2019 style clamp. > > I'm not really convinced that this is a worthwhile addition, but also not particularly opposed to it. > > Regards, > Nikita
Will update the RFC with a more precise description. Didn’t know there existed IEEE standards for clamping operations. Adding exceptions for NaN values into min/max is something I agree would be expected behaviour since a non number in the range don’t really make sense. As for the negative zero behaviour though, I’m not sure how to address that since frankly it’s out of my league, and when would a range of -0, 0 make sense to have? Cannot see a value being placed between that expect for 0, so a `clamp(0, -0, 0)` would return 0 then? Thanks, Kim.
  115366
July 8, 2021 10:47 hallbergkim@gmail.com (Kim Hallberg)
> On 23 Jun 2021, at 8:42 PM, Kim Hallberg <hallbergkim@gmail.com> wrote: > > Hello internals, > > The RFC for the clamp function is now open and under discussion, you now have 2 weeks > to discuss, suggest improvements and open issues before voting is considered. > > Any and all feedback is welcomed. > > The RFC is available for viewing here: https://wiki.php.net/rfc/clamp <https://wiki.php.net/rfc/clamp> > The implementation is available in a PR here: https://github.com/php/php-src/pull/7191 <https://github.com/php/php-src/pull/7191> > > Thank you, > Kim Hallberg >
Since the voting window has been missed, the proposed version will be updated to PHP 8.2 instead. Thanks, Kim.
  115367
July 8, 2021 10:56 hallbergkim@gmail.com (Kim Hallberg)
> On 23 Jun 2021, at 8:42 PM, Kim Hallberg <hallbergkim@gmail.com> wrote: > > Hello internals, > > The RFC for the clamp function is now open and under discussion, you now have 2 weeks > to discuss, suggest improvements and open issues before voting is considered. > > Any and all feedback is welcomed. > > The RFC is available for viewing here: https://wiki.php.net/rfc/clamp > The implementation is available in a PR here: https://github.com/php/php-src/pull/7191 > > Thank you, > Kim Hallberg >
Since the voting window has been missed, the proposed version will be updated to PHP 8.2 instead. Thanks, Kim.