Re: [PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

  110753
June 27, 2020 22:53 george.banyard@gmail.com ("G. P. B.")
Hello Andrea, Benjamin, and others,

Better error messages are obviously better than just replacing the name of
the token, however this argument is saying that because this isn't perfect
let's do nothing.

I wasn't aware that Rowan had made good progress on his patch, however, if
this patch needs an RFC, which it might not, there is still no guarantee on
having better error messages for PHP 8.0.

Now, this may just be me, but I find the argument that "::" is present
relatively weak as when I scan a PHP parse error, the first thing I look
for is the token name as there are only 3 parse errors which have the added
information of what the token represents, namely T_SL, T_SR, and
T_PAAMAYIM_NEKUDOTAYIM. T_SL and T_SR also have this issue but I would
expect beginners to not hit this as often as I don't expect them to do
anything with bit shifts, and the only other way that they may encounter it
is by doing a bad git merge.

I find it highly frustrating, and borderline offensive, that we are being
asked to go from a simple, non BC breaking, easy to enact change, to a
semi-major overhaul of how error messages should look like. I personally
have no interest in learning Bison and how to implement that as a separate
improvement to PHP. I'm just glad that Rowan decided to take on this
challenge.

Even if better error messages come about, some people will still need to
deal with the token, such as static analysers, code sniffers, code style
tools, etc. Obviously people working on these tools aren't beginners and
deal with the weird token name just fine, or use the T_DOUBLE_COLON alias
if they can. But why should it still be like this? This change has no BC
break and makes English the consistent language for token names. Moreover,
something being historic doesn't mean it shouldn't be touched.
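
As a minimal sketch of what such tooling sees (not taken from the thread; it assumes the tokenizer extension is loaded, and the snippet and variable names are made up for illustration), both constant names refer to the same token ID, so a tool can already match "::" through the English alias:

```php
<?php
// Hypothetical input: a single static method call.
$source = '<?php Foo::bar();';

// Collect the token IDs that token_get_all() reports for the snippet.
$ids = [];
foreach (token_get_all($source) as $token) {
    // Single-character tokens such as ';' are plain strings, not arrays.
    if (is_array($token)) {
        $ids[] = $token[0];
    }
}

// T_DOUBLE_COLON is defined as an alias of T_PAAMAYIM_NEKUDOTAYIM.
$sameToken = (T_DOUBLE_COLON === T_PAAMAYIM_NEKUDOTAYIM);
$hasDoubleColon = in_array(T_DOUBLE_COLON, $ids, true);

var_dump($sameToken, $hasDoubleColon);
```

The comparison is done on the integer IDs rather than on names, since the numeric values of token constants can differ between PHP versions.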

My perception is that most of the community finds it baffling why anyone
would be against this change.

On Sat, 27 Jun 2020 at 15:57, Andrea Faulds <ajf@ajf.me> wrote:

> As for parser errors, I don't know how easy they would be to improve… is
> it even possible for us to do so without using a hand-written parser
> instead of an auto-generated one? (I have no idea.)

This may be done using Bison 3.6, which was released in May of this year, as
seen in this PR: https://github.com/php/php-src/pull/5416
However, I don't expect us to be able to use this, as Bison 3.6 won't be
present on most distributions for a couple of years.

Best regards

George P. Banyard

  110754
June 27, 2020 23:03 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> Better error messages are obviously better than just replacing the name of
> the token, however this argument is saying that because this isn't perfect
> let's do nothing.

I don't think this is the argument. I think the argument is: rather than
half-fix the problem the wrong way, let's fully fix it the right way.

> I find it highly frustrating, and borderline offensive, that we are being
> asked to go from a simple, non BC breaking, easy to enact change, to a
> semi-major overhaul of how error messages should look like. I personally

I am not sure why this is "offensive" to you. Is there something in
overhauling error messages that goes against your principles or religious
beliefs? Sometimes overhauling is exactly the right way to go, and not
showing internal parser tokens seems to be quite a reasonable idea. If this
idea is somehow "offensive" to you, I feel sorry for you, but it's not a
reason to abandon this idea. It is also quite a common thing - many
proposals in PHP have been rejected because the community thought it's not
the right way to approach things; I myself have had some proposals rejected
because of this. There's no reason to feel offended by that.

> My perception is that most of the community finds it baffling why anyone
> would be against this change.

My perception is that most of the community couldn't care less about how the
tokens are named, and really shouldn't. It's an internal thing and should
stay this way. If they are made to care, that's our fault and we should fix
it.

--
Stas Malyshev
smalyshev@gmail.com

  110759
June 28, 2020 14:01 rowan.collins@gmail.com (Rowan Tommins)
On 27/06/2020 23:53, G. P. B. wrote:

> [...] there are only 3 parse errors which have the added
> information of what the token represents, namely T_SL, T_SR, and
> T_PAAMAYIM_NEKUDOTAYIM.

I'm not sure what you mean by there being only three. Since PHP 5.4, all
parser errors include the content of the token as well as its name, even
when this is basically repeating the same word, such as "unexpected 'echo'
(T_ECHO)". There's even a bug in the implementation meaning that casts
repeat the name three times, e.g. "unexpected '(int)' (int) (T_INT_CAST)".

This is why I was able to get a working prototype for new error messages
fairly quickly, and have had a draft PR open for nearly two weeks, but also
why I've spent a bit of time refining it to make sure edge cases are handled
better.

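
To see the kind of message under discussion, a small sketch (not from the thread; it assumes PHP 7 or later, where a syntax error inside eval() is thrown as a ParseError, and the invalid input is made up) can capture the parser's wording. The exact text depends on the PHP version, so no expected output is shown:

```php
<?php
// Hypothetical invalid input: a statement cannot start with "::".
$code = '::foo;';

$message = null;
try {
    eval($code);
} catch (ParseError $e) {
    // PHP 7 and later throw ParseError for syntax errors inside eval().
    $message = $e->getMessage();
}

echo $message, PHP_EOL;
```

Running this on different PHP versions is an easy way to compare how the unexpected token is described.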
> I find it highly frustrating, and borderline offensive, that we are being
> asked to go from a simple, non BC breaking, easy to enact change, to a
> semi-major overhaul of how error messages should look like.

I understand the general sentiment, that there were a lot of people in the
previous thread saying re-wording was a good idea, and not many offering to
work on it. I also appreciate that I hadn't provided any public updates on
my progress, and what hurdles needed to be overcome. But you did know I was
working on a patch, so the simple solution would have been to ask me before
opening the vote.

Regards,

--
Rowan Tommins (né Collins)
[IMSoP]

  110765
June 29, 2020 07:08 claude.pache@gmail.com (Claude Pache)
> On 28 June 2020 at 00:53, G. P. B. <george.banyard@gmail.com> wrote:
>
> My perception is that most of the community finds it baffling why anyone
> would be against this change.

What baffles me is the amount of discussion around changing the name of ONE
token, whereas it is clear that a bunch of tokens (a few of them explicitly
mentioned during the discussion phase of the RFC) have incomprehensible
names, or worse, misleading names (T_STRING instead of T_IDENTIFIER),
several of them appearing more often in error messages than T_DOUBLE_COLON
(so, to be super-clear, I’m not talking about T_SL).

And at this point I have still restricted my view to the issue of token
names, whereas there is an obvious way to do better at relatively low cost,
namely replacing the token name with a user-friendly description of the
token.

I really appreciate the efforts made to make PHP better; thanks for that.
But I kindly suggest that, each time you are going to propose an
improvement, you first take a step back and consider how your change fits
into its wider context.

-- Claude
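
The T_STRING point can be illustrated with a short sketch (not from the thread; it assumes the tokenizer extension is loaded and PHP 7.1 or later for the short destructuring syntax, and the input snippet is made up). A plain identifier, here a function name, is reported under the name T_STRING:

```php
<?php
// Hypothetical input: a call to a function named "foo".
$tokens = token_get_all('<?php foo();');

// $tokens[0] is the opening tag (T_OPEN_TAG); $tokens[1] is the identifier.
[$id, $text] = $tokens[1];
$name = token_name($id);

echo $text, ' => ', $name, PHP_EOL;
```

Nothing in the snippet is a string literal, which is why the name can mislead someone reading it in an error message.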