zend_atol() and zend_atoi()

  105646
May 8, 2019 18:58 pollita@php.net (Sara Golemon)
I fell down a WTF hole today that led me to zend_atol().
The end result is the PR which I'd like to present for discussion (I'll add
tests before I push anything, though it might necessitate a vote).
https://github.com/php/php-src/pull/4132

The issue is explained in the commit message, but I'll copy here:

zend_ato[il] don't just do number parsing.
They also check for a 'K', 'M', or 'G' at the end of the string,
and multiply the parsed value out accordingly.

Unfortunately, they ignore any other non-numerics between the
numeric component and the last character in the string.
This means that numbers such as the following are both valid
and non-intuitive in their final output.

   - "123KMG" is interpreted as "123G" -> 132070244352
   - "123G " is interpreted as "123 " -> 123
   - "123GB" is interpreted as "123B" -> 123
   - "123 I like tacos." is also interpreted as "123." -> 123

This diff primarily adds warnings for these cases when the output
would be a potentially unexpected, and unhelpful value.

Additionally, several places in php-src use these functions
despite not actually wanting their KMG behavior such as
session.upload_progress.freq which will happily parse "1 banana"
as a valid value.

For these settings, I've switched them to ZEND_STRTOL which preserves
their existing /intended/ behavior.

   - It won't respect KMG suffixes, but they never really wanted that logic
   anyway.
   - It will ignore non-numeric suffixes so as to not introduce new
   failures.

We should probably reexamine that second bullet point separately.

Lastly, with these changes, zend_atoi() is no longer used in php-src,
but I left it as a valid API for 3rd party extensions.
Note that I did make it a proxy to zend_atol() since deferring the
truncation till the end is effectively the same as truncation during
multiplication, but this avoid code duplication.

I think we should consider removing zend_atoi() entirely (perhaps in 8.0?)
and rename zend_atol() to something reflecting it's KMG specific behavior.

-Sara
  105670
May 11, 2019 01:43 pierre.php@gmail.com (Pierre Joye)
On Thu, May 9, 2019, 1:59 AM Sara Golemon <pollita@php.net> wrote:

good catch, I did not know all these side effects but the kmg :)

   - It won't respect KMG suffixes, but they never really wanted that logic
> anyway.
The only place where I thought it would e used was for handling php.ini and the likes. I suppose even this area does not use it? Best, Pierre