Re: [PHP-DEV] [RFC] Permit trailing whitespace in numeric strings

  104594
March 6, 2019 06:39 markus@fischer.name (Markus Fischer)
Hello,

On 06.03.19 01:16, Andrea Faulds wrote:
> https://wiki.php.net/rfc/trailing_whitespace_numerics
>
> I expect this should be an uncontroversial proposal, but maybe I'm
> jinxing it there. I hope you all like it. :)
>
> Thanks to Nikita for reminding me it existed and thus motivating me to
> pick it up again. Also thanks to Nikita for suggesting a potential
> follow-up RFC, and also making the “Saner string to number comparisons”
> RFC, both providing additional impetus for me to finally clean this up
> and put it to discussion.

Hmm, my first reaction would be the opposite.

For all the changes which happened during the years, PHP got stricter; this would make it more lax.

I know this is totally not the goal of the RFC, but my feeling is if we would change something (ignoring BC break for a sec) that it would be to actually _remove_ support for leading whitespace 🤷‍♀️

- Markus
  104598
March 6, 2019 08:58 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Mar 6, 2019 at 7:41 AM Markus Fischer <markus@fischer.name> wrote:

> Hello,
>
> On 06.03.19 01:16, Andrea Faulds wrote:
> > https://wiki.php.net/rfc/trailing_whitespace_numerics
> >
> > I expect this should be an uncontroversial proposal, but maybe I'm
> > jinxing it there. I hope you all like it. :)
> >
> > Thanks to Nikita for reminding me it existed and thus motivating me to
> > pick it up again. Also thanks to Nikita for suggesting a potential
> > follow-up RFC, and also making the “Saner string to number comparisons”
> > RFC, both providing additional impetus for me to finally clean this up
> > and put it to discussion.
>
> Hmm, my first reaction would be the opposite.
>
> For all the changes which happened during the years, PHP got stricter;
> this would make it more lax.
>
> I know this is totally not the goal of the RFC, but my feeling is if we
> would change something (ignoring BC break for a sec) that it would be to
> actually _remove_ support for leading whitespace 🤷‍♀️
>
> - Markus

I'm always a fan of making things stricter, but think that in this particular case there are some additional considerations we should keep in mind.

1. What is more important to me here than strictness is consistency. Either both " 123" and "123 " are numeric, or neither are. Making "123 " numeric is a change we can easily do, because it makes the numeric string definition more permissive and is thus mostly backwards compatible. Doing the reverse change is certainly not compatible and will be a much harder sell.

2. I believe that a large part of the motivation here is that by making the numeric string definition slightly more lax (in a consistent manner), we can make *other* things more strict, because this essentially eliminates the only "somewhat reasonable" case of trailing characters. The RFC already mentions two of them:

a) We can hard reject "123foo" inputs to "int" arguments (and some other places). Currently this is allowed with a notice. I think if we resolve the trailing whitespace question, then there cannot be any reasonable opposition to this change.

b) My own RFC on number to string comparisons would benefit from this. From initial testing it has surprisingly little impact, but one of the few cases that turned up was this comparison with a string that had trailing whitespace.

Personally I think both of those changes are a lot more valuable than a stricter numeric string definition without leading/trailing whitespace.

Regards,
Nikita
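
To make the consistency point concrete, here is a minimal sketch in plain PHP. It does not call the engine's own numeric-string check (whose behaviour depends on the PHP version); instead it uses a hypothetical helper, `looks_numeric()`, whose regular expression only approximates the numeric string grammar. The helper name, its two flags, and the exact pattern are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not part of PHP: an approximate numeric-string check
// in which leading and trailing whitespace can be allowed independently.
function looks_numeric(string $s, bool $allowLeadingWs, bool $allowTrailingWs): bool
{
    // Space, tab, newline, carriage return, vertical tab, form feed.
    $ws = '[ \t\n\r\x0B\x0C]*';
    // Optional sign, then an integer or decimal, then an optional exponent.
    $number = '[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?';

    $pattern = '/^'
        . ($allowLeadingWs ? $ws : '')
        . $number
        . ($allowTrailingWs ? $ws : '')
        . '$/D'; // D: "$" matches only at the very end of the string

    return preg_match($pattern, $s) === 1;
}

// Leading whitespace only: the asymmetric situation discussed above.
var_dump(looks_numeric(' 123', true, false));  // bool(true)
var_dump(looks_numeric('123 ', true, false));  // bool(false)

// Whitespace allowed on both sides: consistent, and "123foo" is still rejected.
var_dump(looks_numeric('123 ', true, true));   // bool(true)
var_dump(looks_numeric('123foo', true, true)); // bool(false)
```

With both flags on, the only trailing characters that remain are genuine junk such as "foo", which is the case that could then be rejected outright.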
  105065
April 3, 2019 23:15 ajf@ajf.me (Andrea Faulds)
Nikita Popov wrote:
> I'm always a fan of making things stricter, but think that in this
> particular case there are some additional considerations we should keep in
> mind.
>
> 1. What is more important to me here than strictness is consistency. Either
> both " 123" and "123 " are numeric, or neither are. Making "123 "
> numeric is a change we can easily do, because it makes the numeric string
> definition more permissive and is thus mostly backwards compatible. Doing
> the reverse change is certainly not compatible and will be a much harder
> sell.
>
> 2. I believe that a large part of the motivation here is that by making the
> numeric string definition slightly more lax (in a consistent manner), we
> can make *other* things more strict, because this essentially eliminates
> the only "somewhat reasonable" case of trailing characters. The RFC already
> mentions two of them:
>
> a) We can hard reject "123foo" inputs to "int" arguments (and some other
> places). Currently this is allowed with a notice. I think if we resolve the
> trailing whitespace question, then there cannot be any reasonable
> opposition to this change.
> b) My own RFC on number to string comparisons would benefit from this. From
> initial testing it has surprisingly little impact, but one of the few cases
> that turned up was this comparison with a string that had trailing
> whitespace.
>
> Personally I think both of those changes are a lot more valuable than a
> stricter numeric string definition without leading/trailing whitespace.

I'm kinda unsure how to go forward because of these points. I would like to see improved comparisons, and I would like to see the end of the “non-well-formed” numeric string, and I think this whitespace RFC could be helpful to both. But I can't see the future: I don't know whether people will vote for removing leading or permitting trailing whitespace, nor whether that vote will be influenced by, or will itself influence, opinion on the further improvements. ¯\_(ツ)_/¯

I'm torn between:

* Vote on allowing trailing whitespace
* Vote on disallowing leading whitespace
* Vote on which of those two approaches to go for
* Trying to bundle everything together and voting on it as a package.

I'm probably thinking too strategically.

Andrea
  105270
April 14, 2019 13:53 markyr@gmail.com (Mark Randall)
A thought -

In the event that explicit casting specifically does get tightened up, 
what will become the suggested method for making a best-effort 
conversion to an integer?

Personally I'm in favour of explicit casts being a bit more forgiving, 
but in the event they're not, what will replace it as the recommended 
inbuilt method that handles strings with whitespace and trailing junk?

Will there be one?

I'm aware that the most common IDEs suggest using (int) instead of intval.
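
For illustration only, a best-effort conversion of this kind could be written in userland along the following lines. This is a sketch, not an existing or proposed function: `best_effort_int()` is a hypothetical name, and it deliberately handles only plain decimal integers (no decimals or exponents).

```php
<?php
// Hypothetical helper, not part of PHP: extracts the leading integer from a
// string, skipping leading whitespace and ignoring anything after the digits.
// Returns null when the string does not start with an integer at all.
function best_effort_int(string $s): ?int
{
    if (preg_match('/^\s*([+-]?\d+)/', $s, $m) !== 1) {
        return null;
    }
    // The captured group contains only an optional sign and digits.
    return (int) $m[1];
}

var_dump(best_effort_int(' 123 '));     // int(123)
var_dump(best_effort_int('123foo'));    // int(123)
var_dump(best_effort_int('-7 apples')); // int(-7)
var_dump(best_effort_int('foo'));       // NULL
```

Unlike an (int) cast, which turns "foo" into 0, this sketch returns null when there is nothing to convert, so the caller can tell "no number" apart from zero.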

--
Mark Randall
  105070
April 4, 2019 10:49 lester@lsces.co.uk (Lester Caine)
On 06/03/2019 08:58, Nikita Popov wrote:
> 1. What is more important to me here than strictness is consistency. Either
> both " 123" and "123 " are numeric, or neither are. Making "123 "
> numeric is a change we can easily do, because it makes the numeric string
> definition more permissive and is thus mostly backwards compatible. Doing
> the reverse change is certainly not compatible and will be a much harder
> sell.

I think it may be worth pointing out the distinction here, at least with Firebird and, I thought, with other databases. A CHAR field is always returned padded with spaces to its length, while a VARCHAR field is always trimmed to the last non-space character. There are often reasons for using CHAR, although many of those are probably historic, but trimming them without proper consideration is less than ideal. Returning historic material where numeric data is right-justified in a CHAR field is equally valid. Handling these has always required care, and it would be preferable that both returned the same result without having to trim and perhaps creating problems when the padded version needs to be returned ...

--
Lester Caine - G8HFL
-----------------------------
Contact - https://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - https://lsces.co.uk
EnquirySolve - https://enquirysolve.com/
Model Engineers Digital Workshop - https://medw.co.uk
Rainbow Digital Media - https://rainbowdigitalmedia.co.uk
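
As a rough illustration of the CHAR case, the sketch below builds two space-padded values of the kind a CHAR(8) column might return and reads a number from each without altering the padded original. The column width, the sample values, and the helper `char_field_to_int()` are all assumptions made for the example; no database is involved.

```php
<?php
// Hypothetical CHAR(8) values: one left-justified, one right-justified.
$leftJustified  = str_pad('123', 8, ' ', STR_PAD_RIGHT); // "123     "
$rightJustified = str_pad('123', 8, ' ', STR_PAD_LEFT);  // "     123"

// Hypothetical helper: reads an unsigned integer from a space-padded field.
// It works on a trimmed copy, so the caller's padded string stays untouched.
function char_field_to_int(string $field): ?int
{
    $trimmed = trim($field, ' ');
    if (preg_match('/^\d+$/D', $trimmed) !== 1) {
        return null; // empty, or not made up of digits only
    }
    return (int) $trimmed;
}

var_dump(char_field_to_int($leftJustified));  // int(123)
var_dump(char_field_to_int($rightJustified)); // int(123)
var_dump(strlen($leftJustified));             // int(8), padding preserved
```

Both justifications give the same number, and the padded string is still available if it has to be written back unchanged.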