[PHP-DEV][RFC][VOTE] Rename T_PAAMAYIM_NEKUDOTAYIM to T_DOUBLE_COLON

  110728
June 25, 2020 14:52 george.banyard@gmail.com ("G. P. B.")
Hello internals,

As the two-week discussion period has elapsed, the vote is now open.

We did acknowledge the suggestion of dropping the token name from the error
message directly, but in our opinion this is an orthogonal change to the
one proposed, and has the risk of not landing in PHP 8.0.

The vote will close on the 9th of July.

https://wiki.php.net/rfc/rename-double-colon-token

Best regards

George P. Banyard
  110738
June 27, 2020 09:56 ajf@ajf.me (Andrea Faulds)
Hi,

I have voted No to this, and I hope I can convince some others to do the 
same.

T_PAAMAYIM_NEKUDOTAYIM is such a famous token that there is probably 
nobody in internals who doesn't know what it means, and for new 
contributors, it is easy to find the definition, and note that it is 
hardly the only token name that they will need to look up. It is also a 
fun nod to the history of PHP, and I think it would be a shame to lose 
that.

I mention the internals usage first and foremost, because it should be 
remembered that token names are merely an implementation detail of the 
PHP interpreter, unless you're using token_get_all (which by the way 
already has the alias T_DOUBLE_COLON). In other words, it's not 
something the vast majority of userland developers should ever encounter 
or have to think about.
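
A minimal sketch of the existing alias, assuming the tokenizer extension is loaded. The snippet being tokenized (Foo::bar()) is made up purely for illustration:

```php
<?php
// Tokenize a small made-up snippet and look for the '::' token.
$tokens = token_get_all('<?php Foo::bar();');

$found = null;
foreach ($tokens as $token) {
    // Named tokens are arrays of [id, text, line]; single characters
    // such as '(' come back as plain strings.
    if (is_array($token) && $token[1] === '::') {
        $found = $token[0];
        break;
    }
}

// Both constant names refer to the same token id.
var_dump($found === T_DOUBLE_COLON);
var_dump(T_DOUBLE_COLON === T_PAAMAYIM_NEKUDOTAYIM);
```

Code that inspects tokens this way can already be written against T_DOUBLE_COLON, whichever name the parser uses internally.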

Of course, if T_PAAMAYIM_NEKUDOTAYIM was never encountered by userland 
developers, this RFC wouldn't exist. The thing is, I don't think 
T_DOUBLE_COLON should be encountered by userland developers either — in 
my view, as an implementation detail, token names shouldn't be part of 
parser error messages at all. If we were to remove token names from the 
parser errors, we would avoid the problem this RFC seeks to solve. For 
most tokens we could simply display the characters they correspond to 
(e.g. "::" for T_PAAMAYIM_NEKUDOTAYIM, which we already do!), and for 
those with variable content (e.g. T_STRING) we could display a 
human-readable description of what is expected (e.g. "an identifier").
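
To make the idea concrete, here is a rough userland sketch of such a mapping. It is only an illustration: describe_token() is a hypothetical helper, not anything in the engine, and the descriptions other than "an identifier" are my own placeholder wording. The real change would have to be made in the parser's error reporting.

```php
<?php
// Hypothetical helper, not part of PHP: returns what an error message
// could show for a token instead of its internal name.
function describe_token(int $id, string $text): string
{
    // Tokens with variable content get a human-readable description.
    // Only the T_STRING wording is from this thread; the rest are
    // placeholders.
    $descriptions = [
        T_STRING   => 'an identifier',
        T_VARIABLE => 'a variable',
        T_LNUMBER  => 'an integer',
    ];

    // Fixed tokens are shown as the characters they correspond to.
    return $descriptions[$id] ?? "'" . $text . "'";
}

echo describe_token(T_DOUBLE_COLON, '::'), "\n";
echo describe_token(T_STRING, 'foo'), "\n";
```

With a table like this, the message never needs to mention a token name at all, so the question of which name to use goes away.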

I think the case for not renaming T_PAAMAYIM_NEKUDOTAYIM, and instead 
improving the error messages, is stronger when you consider that it is not 
the only token with a name that might confuse people outside internals. 
For example, T_STRING is a very common token, but the name is probably 
going to surprise most userland developers who encounter it in an error 
message, because it doesn't mean a literal string. Even for tokens with 
more conventional names, it is unnecessary extra information. I think 
renaming just T_PAAMAYIM_NEKUDOTAYIM is not a full solution to the 
problem this RFC intends to solve.

Apropos of that:

 > We did acknowledge the suggestion of dropping the token name from the
 > error message directly, but in our opinion this is an orthogonal
 > change to the one proposed, and has the risk of not landing in PHP
 > 8.0.

Is PHP 8.0 all-important? If we _don't_ rename the tokens, but simply 
improve the error message, that might be allowable in a patch release 
(e.g. 8.0.1).

(I also don't think we should rush things if we are unsure about them, 
given the consequences that has had in the past.)

Thanks,
Andrea
  110742
June 27, 2020 13:57 ajf@ajf.me (Andrea Faulds)
Hi again,

A further and perhaps more important thought: I think the token names 
are actually the least confusing part of parser errors, even for the 
famous T_PAAMAYIM_NEKUDOTAYIM. Changing it to T_DOUBLE_COLON may not 
help much, because the parser only tells you what the next token it 
expected was, not *why* it expected it, i.e. what is wrong with the 
syntax. A user might think they need to add a :: but it's not their 
actual problem.

For example, if you google for “T_PAAMAYIM_NEKUDOTAYIM” errors, one of 
the classic examples where you got such an error was:

   var_dump(empty(TRUE));

If the error had said T_DOUBLE_COLON it would still be mystifying: why 
did PHP think you needed a :: there? And just adding one won't fix the 
problem! The actual issue was that empty() used to not support arbitrary 
expressions, only variables, and the expected T_PAAMAYIM_NEKUDOTAYIM is 
because the only way TRUE could be part of a variable would be if it was 
a class name (TRUE::$foo). The way to fix it is to replace empty() with 
the ! operator, but you'd have a hard time figuring that out from the 
error. I think this is the real reason T_PAAMAYIM_NEKUDOTAYIM was 
famous: even if you knew it meant double colon, the error message is 
still cryptic.
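
For reference, a minimal sketch of the workaround. For a value that is always defined, empty() and negation give the same answer, so the ! operator can stand in for empty() on an arbitrary expression:

```php
<?php
// empty($x) means "not set, or falsy". A literal is always set, so
// plain negation yields the same result as empty() would.
$result = !TRUE;
var_dump($result); // bool(false)
```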

The good news is T_PAAMAYIM_NEKUDOTAYIM is no longer quite the menace it 
once was. PHP 7's parser and syntax overhaul (thank you Nikita!) fixed 
it in some places:

   $ php -r 'var_dump(isset(TRUE));'
   Fatal error: Cannot use isset() on the result of an expression (you can use "null !== expression" instead)

And other places where you might have once seen T_PAAMAYIM_NEKUDOTAYIM 
now give a different unhelpful parser error, which renaming 
T_PAAMAYIM_NEKUDOTAYIM will not help with:

   $ php -r 'unset(TRUE);'
   Parse error: syntax error, unexpected ')', expecting '[' in Command line code on line 1

So if there was ever a time to rename T_PAAMAYIM_NEKUDOTAYIM, it would 
have been many years ago. There's much less benefit to renaming it now, 
especially given it says “'::' (T_PAAMAYIM_NEKUDOTAYIM)” if you manage 
to get an error containing it, so you don't even need to google it. The 
specific name a token is given is the least of the problems there.

As for parser errors, I don't know how easy they would be to improve… is 
it even possible for us to do so without using a hand-written parser 
instead of an auto-generated one? (I have no idea.)

Regards,
Andrea