[RFC] Base convert changes

  105740
May 18, 2019 09:21 scott@exussum.co.uk (Scott Dutton)
Hi all

I have made some changes to base_convert which I feel would be more 
consistent with the current PHP (warning when there are errors, and not 
just returning the best value it can)

The RFC has some examples of what will change.

Currently the code works but a lot of tests fail due to extreme range's 
(also mentioned in the RFC)

If this passes I will fix the effected tests and add some more covering 
the new behavior.

https://wiki.php.net/rfc/base_convert_improvements

Let me know your thoughts

Thanks

Scott
  105768
May 21, 2019 05:42 scott@exussum.co.uk (Scott Dutton)
Not sure if I can add and example to the RFC at this point, but I came 
across another example of this recently.

https://gist.github.com/iansltx/4820b02ab276c3306314daaa41573445#file-getlines-php-L9

     // bindec() was doing weird things, hence converting through hex 
first
     // sttrev() to match endian-ness to expectations; zip file values 
are little-endian
     $compressedSize = hexdec(bin2hex(strrev(fread($stream, 4)))); // 
compressed size

bindec was returning 0 without warning, as it was passed binary data 
not a binary string. No warning bin2hex actually does the correct 
translation.

a warning here would have at least indicated there was something wrong 
with the input it was being passed instead of just 0.

Thanks

Scott
  105771
May 21, 2019 08:43 george.banyard@gmail.com ("G. P. B.")
On Tue, 21 May 2019, 06:42 Scott Dutton, <scott@exussum.co.uk> wrote:

> > Not sure if I can add and example to the RFC at this point, but I came > across another example of this recently. > > > https://gist.github.com/iansltx/4820b02ab276c3306314daaa41573445#file-getlines-php-L9 > > // bindec() was doing weird things, hence converting through hex > first > // sttrev() to match endian-ness to expectations; zip file values > are little-endian > $compressedSize = hexdec(bin2hex(strrev(fread($stream, 4)))); // > compressed size > > bindec was returning 0 without warning, as it was passed binary data > not a binary string. No warning bin2hex actually does the correct > translation. > > a warning here would have at least indicated there was something wrong > with the input it was being passed instead of just 0. > > Thanks > > Scott > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php
Until the RFC is in voting you can still make amendments to it. So feel free to add these functions/examples to the RFC. Best regards George Peter Banyard
  105779
May 23, 2019 12:36 nikita.ppv@gmail.com (Nikita Popov)
On Sat, May 18, 2019 at 11:22 AM Scott Dutton <scott@exussum.co.uk> wrote:

> Hi all > > I have made some changes to base_convert which I feel would be more > consistent with the current PHP (warning when there are errors, and not > just returning the best value it can) > > The RFC has some examples of what will change. > > Currently the code works but a lot of tests fail due to extreme range's > (also mentioned in the RFC) > > If this passes I will fix the effected tests and add some more covering > the new behavior. > > https://wiki.php.net/rfc/base_convert_improvements > > Let me know your thoughts > > Thanks > > Scott >
Hi Scott, I definitely agree with the part of the RFC that warns on garbage characters in the string. I'm not so sure about the changes in sign handling. The problem I see is that certain bases (in particular hex) are pretty much never used with an explicit sign, instead they are understood to be in two's complement representation. For example, code like this will currently work: var_dump(dechex(0xffffffff00000000)); // string(16) "ffffffff00000000" While after your change it will result in "-100000000", which for hex numbers is rather unexpected. (This example is somewhat non-great because there is a loss of accuracy for 64bit numbers represented as doubles -- the same examples with 32-bit integers on 32-bit systems would be fully reliable.) Nikita
  105780
May 23, 2019 16:32 derick@php.net (Derick Rethans)
On Thu, 23 May 2019, Nikita Popov wrote:

> On Sat, May 18, 2019 at 11:22 AM Scott Dutton <scott@exussum.co.uk> wrote: > > > I have made some changes to base_convert which I feel would be more > > consistent with the current PHP (warning when there are errors, and > > not just returning the best value it can) > > > > The RFC has some examples of what will change. > > > > Currently the code works but a lot of tests fail due to extreme > > range's (also mentioned in the RFC) > > > > If this passes I will fix the effected tests and add some more > > covering the new behavior. > > > > https://wiki.php.net/rfc/base_convert_improvements > > I definitely agree with the part of the RFC that warns on garbage > characters in the string. I'm not so sure about the changes in sign > handling. The problem I see is that certain bases (in particular hex) are > pretty much never used with an explicit sign, instead they are understood > to be in two's complement representation. > > For example, code like this will currently work: > > var_dump(dechex(0xffffffff00000000)); > // string(16) "ffffffff00000000"
I concur with Nikita here. I don't think you should change the sign handling. I think this is something that actually use in real life projects. cheers, Derick -- https://derickrethans.nl | https://xdebug.org | https://dram.io Like Xdebug? Consider a donation: https://xdebug.org/donate.php, or become my Patron: https://www.patreon.com/derickr twitter: @derickr and @xdebug
  105783
May 23, 2019 20:32 scott@exussum.co.uk (Scott Dutton)
Hi Nikita, Derick

Thanks for your time looking at this. Not sure if this will sway you 
either way, Other programming languages do handle negatives correctly, 
for example (i can add this to the RFC if that helps)

Javascript - https://jsfiddle.net/c16b4usp/
Python - https://repl.it/repls/OliveBlushingModem
Go - https://play.golang.org/p/aLWg15c00Fy

Its interesting that Javascript handles the large value example 
correctly, the other two error. There is likely a way to handle 
negatives in a way which also handles the large numbers though I'm not 
sure what that would be, I am happy to take any suggestions or look 
further into the engine to see how this could be done.

I was planning on two votes, one for the invalid char changes and one 
for the negative changes. Either way the documentation should be changes 
to make it clear that either negative numbers or extremely large values 
will not return the expected value (unless there is a way to avoid both 
of these?)

Happy to make any changes to the implementation as I said in the PR, my 
C is quite rusty

Thanks

Scott


On 23.05.2019 17:32, Derick Rethans wrote:
> On Thu, 23 May 2019, Nikita Popov wrote: > >> On Sat, May 18, 2019 at 11:22 AM Scott Dutton <scott@exussum.co.uk> >> wrote: >> >> > I have made some changes to base_convert which I feel would be >> more >> > consistent with the current PHP (warning when there are errors, >> and >> > not just returning the best value it can) >> > >> > The RFC has some examples of what will change. >> > >> > Currently the code works but a lot of tests fail due to extreme >> > range's (also mentioned in the RFC) >> > >> > If this passes I will fix the effected tests and add some more >> > covering the new behavior. >> > >> > https://wiki.php.net/rfc/base_convert_improvements >> >> I definitely agree with the part of the RFC that warns on garbage >> characters in the string. I'm not so sure about the changes in sign >> handling. The problem I see is that certain bases (in particular >> hex) are >> pretty much never used with an explicit sign, instead they are >> understood >> to be in two's complement representation. >> >> For example, code like this will currently work: >> >> var_dump(dechex(0xffffffff00000000)); >> // string(16) "ffffffff00000000" > > I concur with Nikita here. I don't think you should change the sign > handling. I think this is something that actually use in real life > projects. > > cheers, > Derick > > -- > https://derickrethans.nl | https://xdebug.org | https://dram.io > Like Xdebug? Consider a donation: https://xdebug.org/donate.php, > or become my Patron: https://www.patreon.com/derickr > twitter: @derickr and @xdebug
  105874
June 11, 2019 19:02 scott@exussum.co.uk (Scott Dutton)
Hi all,

I plan to put this to vote early next week with the two options

One vote for the invalid char deprecation for php7.4 and Exception in 
PHP8

One vote for negative numbers PHP8 only - due to BC concerns

Thanks

Scott
  105879
June 12, 2019 14:52 derick@php.net (Derick Rethans)
On Tue, 11 Jun 2019, Scott Dutton wrote:

> Hi all, >=20 > I plan to put this to vote early next week with the two options >=20 > One vote for the invalid char deprecation for php7.4 and Exception in PHP= 8
>=20 > One vote for negative numbers PHP8 only - due to BC concerns
Could you update the relevent section(s) in the RFC? =E2=86=92=20 https://wiki.php.net/rfc/base_convert_improvements#proposed_voting_choices You might be able to remove some other sections that have no content=20 besides the templated content as well. cheers, Derick --=20 https://derickrethans.nl | https://xdebug.org | https://dram.io Like Xdebug? Consider a donation: https://xdebug.org/donate.php, or become my Patron: https://www.patreon.com/derickr twitter: @derickr and @xdebug