[VOTE] declare(function_and_const_lookup='global')

  108306
January 29, 2020 02:22 tysonandre775@hotmail.com (tyson andre)
Hi internals,

I've opened the vote on https://wiki.php.net/rfc/use_global_elements
after weighing the pros and cons of discussed alternative approaches.
Yesterday, I've finished the last set of updates I announced I would do based on RFC feedback.

(that update was to require the statement 'use function ...'/'use const ...' before declaring that function/const
outside the global namespace,
if declare(function_and_const_lookup='global') is opted into)

Voting closes on 2020-02-11.

Thanks,
- Tyson
  108307
January 29, 2020 04:03 theodorejb@outlook.com (Theodore Brown)
On Tue, Jan 28, 2020 at 8:22 PM tyson andre <tysonandre775@hotmail.com> wrote:

> Hi internals, > > I've opened the vote on https://wiki.php.net/rfc/use_global_elements
Thank you for working to address the issue of ambiguous function references.. However, I really don't think this is the right approach. The RFC mentions two problems it hopes to solve: a minor performance decrease, and developers having to deal with ambiguity. However, a third problem that I think is just as important to fix is the lack of function autoloading which makes extensive use of namespaced functions unfeasible. Unfortunately, the new directive proposed by this RFC would make declaring functions in namespaces even more difficult, as it would now be necessary to explicitly `use` each function and constant in the same file it is declared in. The RFC argues that having to "add multiple `use function function_name` and `use const MY_CONST` at the top of the namespace" is prone to merge conflicts and inconvenient to keep up to date. However, the RFC just shifts this problem from global functions to namespaced functions. Changing function/const resolution to always look in the global scope instead of current namespace is backwards from how classes and interfaces are resolved, and seems destined to become another language Sadness. Wouldn't it be more straightforward and intuitive to add a directive like `declare(namespace_lookup=1)` which would resolve functions/consts the same way as classes (without the global fallback)? The RFC argues that writing global functions as `\function_name()` is more verbose. But is one backslash character per global function call any more verbose than having to add a 44-character declare statement at the top of every file? Developers are already used to referencing global classes and functions with backslashes or explicit `use` statements, and in a future where we have function autoloading and utilize more namespaced functions, this approach will be less verbose than having to explicitly `use` every namespaced function in the same file it is declared in. Plus, IDEs and other tools can automatically add the backslashes or `use` statements for global functions. Best regards, Theodore
  108308
January 29, 2020 04:48 larry@garfieldtech.com ("Larry Garfield")
On Tue, Jan 28, 2020, at 10:03 PM, Theodore Brown wrote:

> The RFC mentions two problems it hopes to solve: a minor performance > decrease, and developers having to deal with ambiguity. However, a > third problem that I think is just as important to fix is the lack of > function autoloading which makes extensive use of namespaced functions > unfeasible.
Without weighing in on the RFC itself, I really don't think this statement is true anymore. Back in the day, sure; autoloading for class-like-things only made functions second class citizens. However, Composer use is now near-universal for new code. Composer can load specific files full of functions during autoload just fine, and then the functions are universally available. That would have been a performance hit, but with opcode caching the hit is minimal. With preloading in 7.4, even that is reduced to almost zero. Between those two, having a library that is mostly a pile of namespaced functions that are just always-available is quite reasonable and feasible these days. Whether it's architecturally good or bad is another question, but I don't think the lack of autoloading is a good excuse anymore. --Larry Garfield
  108309
January 29, 2020 06:18 tysonandre775@hotmail.com (tyson andre)
> Thank you for working to address the issue of ambiguous function references. However, I really don't think this is the right approach. > > The RFC mentions two problems it hopes to solve: a minor performance decrease, and developers having to deal with ambiguity. > However, a third problem that I think is just as important to fix is the lack of function autoloading which makes extensive use of namespaced functions unfeasible. > > Unfortunately, the new directive proposed by this RFC would make declaring functions in namespaces even more difficult, as it would now be necessary to explicitly `use` each function and constant in the same file it is declared in.
I'd expect that autoloading of those declared functions would work whether or not the functions were *declared* in a file with or without `declare(function_and_const_lookup='global')`. It'd be the *uses* of the functions that would be problems, if function autoloading was implemented but limited to uses of names that could be resolved without fallbacks. As I see it, - Many forms of function autoloading would be argued against if most PHP code would end up ambiguously referencing functions, because you either sacrifice performance or correctness to deal with the ambiguity. (right now, php permanently caches the resolved ambiguous function the first time it is called) - If unambiguously referencing functions was made less difficult, then there would be more support for implementing function or constant autoloading.
> The RFC argues that having to "add multiple `use function function_name` and `use const MY_CONST` at the top of the namespace" is prone to merge conflicts and inconvenient to keep up to date. However, the RFC just shifts this problem from global functions to namespaced functions. > > Changing function/const resolution to always look in the global scope instead of current namespace is backwards from how classes and interfaces are resolved, and seems destined to become another language Sadness. > Wouldn't it be more straightforward and intuitive to add a directive like `declare(namespace_lookup=1)` which would resolve functions/consts the same way as classes (without the global fallback)?
https://wiki.php.net/rfc/use_global_elements#look_up_elements_in_the_current_namespace_instead_of_the_global_namespace was brought up earlier. Right now, the majority of PHP's functions are in the global namespace. (count(), strlen(), is_string(), etc.) I agree it'd be more straightforward, but I don't have a personal reason to propose or use a change that would make core functionality such as `is_string` harder to use. If php's core functions and constants weren't almost all in the global namespace, then I'd have more reasons to propose `declare(namespace_lookup=1)` instead of this.
> The RFC argues that writing global functions as `\function_name()` is more verbose. > But is one backslash character per global function call any more verbose than having to add a 44-character declare statement at the top of every file? > Developers are already used to referencing global classes and functions with backslashes or explicit `use` statements, > and in a future where we have function autoloading and utilize more namespaced functions, > this approach will be less verbose than having to explicitly `use` every namespaced function in the same file it is declared in. > Plus, IDEs and other tools can automatically add the backslashes or `use` statements for global functions.
To expand on why I'm suggesting this as an alternative to `\function_name()` in https://wiki.php.net/rfc/use_global_elements#introduction, - A developer context switching between code bases that require/prefer `function_name()` and another requiring `\function_name()` would not get used to it. If function_and_const_lookup='global' was added, they would not need to context switch, and would just write `function_name()` in both projects. Having the equivalent of a `.phpeditorconfig` could avoid the need for thinking about it, but getting that supported in all common IDEs would have roadblocks (e.g. hypothetical plugins may require the user must configure the path to php 7.4 for their .phpeditorconfig to work, and there's no such thing of .phpeditorconfig for php syntax to my knowledge) - I'd prefer the syntax for unambiguously using built-in functionality be convenient to use. - New contributors to a project wouldn't expect that builds fail (etc.) because the `is_string` wasn't quoted. Many wouldn't have their IDEs set up to do add the use statements or `\`. - For existing projects that start switching to unambiguous names, `git blame` would be less useful on the line `if (!\is_string()) { ... }` because the most recent commit could be one one adding the `\`. - This provides another option for developers that did want less ambiguity but didn't want to use either `use function function_name;` or `\function_name()` for the reasons mentioned.
  108314
January 29, 2020 10:06 marandall@php.net (Mark Randall)
> On Tue, Jan 28, 2020 at 8:22 PM tyson andre <tysonandre775@hotmail.com> wrote: >> I've opened the vote on https://wiki.php.net/rfc/use_global_elements
I too would like to thank you for the effort put in, but I find myself in agreement with several of the other posters that I'm not sure this is the right way to go about it. I'm not a fan of the disparity between classes and functions that this would introduce if used. Support for namespaced level functions is already quite poor (with lack of autoloading) and rather than try to gain a few bytes here and there, I think the better long-term option would be to simply remove support for them entirely in favour of static methods, and at that point the door is open to make functions and constants be global-only. -- Mark Randall
  108310
January 29, 2020 08:55 bishop@php.net (Bishop Bettini)
On Tue, Jan 28, 2020 at 9:22 PM tyson andre <tysonandre775@hotmail.com>
wrote:

> > I've opened the vote on https://wiki.php.net/rfc/use_global_elements > after weighing the pros and cons of discussed alternative approaches. > Yesterday, I've finished the last set of updates I announced I would do > based on RFC feedback. > > (that update was to require the statement 'use function ...'/'use const > ...' before declaring that function/const > outside the global namespace, > if declare(function_and_const_lookup='global') is opted into) > > Voting closes on 2020-02-11. >
Notwithstanding the excellent work done in the PR, I lean toward a No vote. Suppose I'm presented with this diff (and only this diff): $a = array_diff($x, $y); + if (count($a)) die('Unexpected differences'); Where does count come from? I can't tell from this diff. It could be local to this namespace, or it could be global. In the current implementation, I have to scroll to the top of the file, check for what namespace I am in, and what use are made. In the proposed implementation, I have also to check the top of the file, but now its for "declare(function_and_cost_lookup='global')" and the presence (or not) of "use function count". It's not clear to me this is an improvement over status quo. Indeed, I worry that implementing the proposal may increase overall cognitive burden for PHP developers. Today, any developer can look at any file or snippet in any code base, anywhere in the world, and know that using an unqualified name follows a single, language-defined algorithm. Were this proposal accepted, developers no longer have that guarantee. The code might be using the original algorithm, or it might be using the new algorithm. The only way to know is to check for and remember the state based on the presence of -- and value of -- this new declaration. Resolving ambiguities has a compute cost and a cognitive cost. While the proposal improves the compute cost, I'm not sure the same can be said of the cognitive cost. So I am leaning toward a No vote. What do you think?
  108312
January 29, 2020 09:45 kjarli@gmail.com (Lynn)
On Wed, Jan 29, 2020 at 9:55 AM Bishop Bettini <bishop@php.net> wrote:

> Notwithstanding the excellent work done in the PR, I lean toward a No vote. > > Suppose I'm presented with this diff (and only this diff): > > $a = array_diff($x, $y); > + if (count($a)) die('Unexpected differences'); > > Where does count come from? I can't tell from this diff. It could be local > to this namespace, or it could be global. In the current implementation, I > have to scroll to the top of the file, check for what namespace I am in, > and what use are made. In the proposed implementation, I have also to check > the top of the file, but now its for > "declare(function_and_cost_lookup='global')" and the presence (or not) of > "use function count". > > It's not clear to me this is an improvement over status quo. > > [...] > > What do you think? >
Hi, I think that when we're at a point where developers start using existing core function names for custom functions and they start 'overriding' things, you already have this problem, regardless of this RFC. You already have to check whether or not the function was defined via a `use` statement on top. I don't think your example of the diff would have a difference between `use` and `declare` when it comes to reviewing. I would advice developers to avoid using core function names in namespaces for custom functions, as doing this would open up a whole different can of worms when it comes to developer experience. Regards, Lynn
  108315
January 29, 2020 10:07 claude.pache@gmail.com (Claude Pache)
> Le 29 janv. 2020 à 09:55, Bishop Bettini <bishop@php.net> a écrit : > > Suppose I'm presented with this diff (and only this diff): > > $a = array_diff($x, $y); > + if (count($a)) die('Unexpected differences'); >
Realistically, when I see that diff, I assume that those functions are the well-known global ones. The eventual presence `declare(function_and_const_lookup='global')` directive would make my assumption provable, which is an improvement. In the rare cases where that my assumption is false, an explicit `use function MyNs\{array_diff, count}` is welcome (and is already possible) in order to make my assumption refutable without needing to look for improbable redefinition of those functions in other files; such an explicit declaration would be mandatory in presence `declare(function_and_const_lookup='global')`, which is an improvement. —Claude
  108316
January 29, 2020 10:55 ocramius@gmail.com (Marco Pivetta)
Hey Tyson,

I voted "No" on this one, sorry.

TL;DR: I'd rather have a mechanism to disable global function fallback, not
something that makes un-imported symbols immediately global.

The idea to disable PHP's implicit
"fallback-to-something-that-may-or-may-not-exist" is great, and it could
simplify some of the engine, as well as remove some caching mechanisms
(lookup) in some contexts, but I'm pretty happy with disabling all global
lookup functionality, rather than making them the default.

I don't have a problem with `use function sprintf;` to import things from
PHP, and actually use it to quickly grep whenever I do something extremely
dangerous (like `use function umask;`).
This gives me also a good overview of how much of the global symbols are
still in my system, and to be replaced with something like
`thecodingmachine/safe`.

I also deal with merge conflicts regularly: not a big issue either.

Greets,

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/


On Wed, Jan 29, 2020 at 3:22 AM tyson andre <tysonandre775@hotmail.com>
wrote:

> Hi internals, > > I've opened the vote on https://wiki.php.net/rfc/use_global_elements > after weighing the pros and cons of discussed alternative approaches. > Yesterday, I've finished the last set of updates I announced I would do > based on RFC feedback. > > (that update was to require the statement 'use function ...'/'use const > ...' before declaring that function/const > outside the global namespace, > if declare(function_and_const_lookup='global') is opted into) > > Voting closes on 2020-02-11. > > Thanks, > - Tyson > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php > >
  108322
January 29, 2020 18:42 bishop@php.net (Bishop Bettini)
On Wed, Jan 29, 2020 at 5:56 AM Marco Pivetta <ocramius@gmail.com> wrote:

> I voted "No" on this one, sorry. > > TL;DR: I'd rather have a mechanism to disable global function fallback, not > something that makes un-imported symbols immediately global. > > The idea to disable PHP's implicit > "fallback-to-something-that-may-or-may-not-exist" is great, and it could > simplify some of the engine, as well as remove some caching mechanisms > (lookup) in some contexts, but I'm pretty happy with disabling all global > lookup functionality, rather than making them the default. > > I don't have a problem with `use function sprintf;` to import things from > PHP, and actually use it to quickly grep whenever I do something extremely > dangerous (like `use function umask;`). > This gives me also a good overview of how much of the global symbols are > still in my system, and to be replaced with something like > `thecodingmachine/safe`. >
Agreed. I want to force explicit qualifications, and raise error on unqualified names:
  108323
January 29, 2020 20:12 claude.pache@gmail.com (Claude Pache)
> Le 29 janv. 2020 à 19:42, Bishop Bettini <bishop@php.net> a écrit : > > Cons... have to enumerate everything, potentially lots of work to do that > to update old code to use; edge cases I'm not thinking about.
Not only it is much work (although it could be partially automated), but it does not make much sense. I’ve taken a random file (~650 LoC) in the codebase I’m working on. With `declare(strict_qualify=1)`, I would have to add: use function array_filter , array_flip , array_key_exists , array_keys , array_push , array_unshift , compact , ctype_digit , idate , in_array , is_int , is_string , ksort , ob_get_clean , ob_start , preg_match , sprintf , strlen , strncmp , substr_count , trim , usort; I don’t see a clear advantage in declaring more than 20 global functions in each and every of the hundreds of files of the codebase—where it is evident that they are all global function (no, I don’t have a namespaced version of `is_int()`). Also, I think that there are better ways to discover if my colleague introduced surreptitiously a `system()` call in the codebase. Some time ago, I’ve tried the alternative approach in two files: adding a backslash before each and every global function (carefully avoiding non-functions such as `isset()`, `empty()` and `list()`). Well, it reminded me of the good old days when I used TeX intensively, but objectively, it didn’t make the code more clear or more secure (I’ve not tested its velocity, though). —Claude
  108324
January 29, 2020 21:11 bishop@php.net (Bishop Bettini)
On Wed, Jan 29, 2020 at 3:12 PM Claude Pache pache@gmail.com> wrote:

> > > Le 29 janv. 2020 à 19:42, Bishop Bettini <bishop@php.net> a écrit : > > Cons... have to enumerate everything, potentially lots of work to do that > to update old code to use; edge cases I'm not thinking about. > > > Not only it is much work (although it could be partially automated), but > it does not make much sense. > > I’ve taken a random file (~650 LoC) in the codebase I’m working on. With > `declare(strict_qualify=1)`, I would have to add: > > use function > array_filter > , array_flip > , array_key_exists > , array_keys > , array_push > , array_unshift > , compact > , ctype_digit > , idate > , in_array > , is_int > , is_string > , ksort > , ob_get_clean > , ob_start > , preg_match > , sprintf > , strlen > , strncmp > , substr_count > , trim > , usort; >
A fair critique. Under the "goad use of wildcard import" pro, the quick way: use \{*}; Any unqualified name is locked to the global namespace.
> I don’t see a clear advantage in declaring more than 20 global functions > in each and every of the hundreds of files of the codebase—where it is > evident that they are all global function (no, I don’t have a namespaced > version of `is_int()`). Also, I think that there are better ways to > discover if my colleague introduced surreptitiously a `system()` call in > the codebase. >
Well, first, it's faster as Tyson described in the RFC: those are opcoded and bypass the namespace lookup. Second, and I find this the best reason, that enumeration tells me quite a bit about what that code does. I don't need to see the code to know that it's compute oriented, doesn't touch resources (eg files), and is doing what PHP does best: string and array manipulation. It also shows me that code's doing output buffer manipulation, which seems out of place and suggests that should be refactored out. IOW, it's a really nice abstract / summary of what the code's doing. And that's some very good insight.
> Some time ago, I’ve tried the alternative approach in two files: adding a > backslash before each and every global function (carefully avoiding > non-functions such as `isset()`, `empty()` and `list()`). Well, it reminded > me of the good old days when I used TeX intensively, but objectively, it > didn’t make the code more clear or more secure (I’ve not tested its > velocity, though). >
I agree: qualifying the name at the usage site doesn't add much value, IMO. It's frankly more noise than anything else.
  108330
January 30, 2020 08:10 claude.pache@gmail.com (Claude Pache)
> Le 29 janv. 2020 à 22:11, Bishop Bettini <bishop@php.net> a écrit : > > On Wed, Jan 29, 2020 at 3:12 PM Claude Pache pache@gmail.com <mailto:claude.pache@gmail.com>> wrote: > > >> Le 29 janv. 2020 à 19:42, Bishop Bettini <bishop@php.net <mailto:bishop@php.net>> a écrit : >> >> Cons... have to enumerate everything, potentially lots of work to do that >> to update old code to use; edge cases I'm not thinking about. > > > Not only it is much work (although it could be partially automated), but it does not make much sense. > > I’ve taken a random file (~650 LoC) in the codebase I’m working on. With `declare(strict_qualify=1)`, I would have to add: > > use function > array_filter > , array_flip > , array_key_exists > , array_keys > , array_push > , array_unshift > , compact > , ctype_digit > , idate > , in_array > , is_int > , is_string > , ksort > , ob_get_clean > , ob_start > , preg_match > , sprintf > , strlen > , strncmp > , substr_count > , trim > , usort; > > A fair critique. Under the "goad use of wildcard import" pro, the quick way: > use \{*};
As of today, we already have something like `use function *`. We name it “fallback to the global scope”. The current RFC proposes an alternative approach, with its own advantages and inconveniences.
> > > Second, and I find this the best reason, that enumeration tells me quite a bit about what that code does. I don't need to see the code to know that it's compute oriented, doesn't touch resources (eg files), and is doing what PHP does best: string and array manipulation. It also shows me that code's doing output buffer manipulation, which seems out of place and suggests that should be refactored out. > > IOW, it's a really nice abstract / summary of what the code's doing. And that's some very good insight.
It’s an illusion: the enumeration of used global functions doesn’t say all. For instance, you cannot guess if the code writes on the filesystem by manipulating an instance of `SplFileObject` (or some other custom class), or not. —Claude
  108329
January 30, 2020 03:28 tysonandre775@hotmail.com (tyson andre)
> Agreed. I want to force explicit qualifications, and raise error on unqualified names: > > declare(strict_qualify=1); > > namespace foo; > > use function \{array_diff, var_dump}; > use const \bar\{a, b}; > > $x = array_diff(a, b); > if (count($x)) { // throw an \Error of some kind, as count is unqualified > var_dump($x); > } > > Pros... no bc break; short declare identifier with obvious values; code valid even if declare is removed; no lookup performance penalty; > easy to remember logic; forces enumeration of the "surface area" the code wants to cover > (which in turn helps developers think about how much work the code's doing); has analogue in other similar languages (perl, python, node); > makes explicit what functions are being used and where they originate; might goad an implementation of wildcard use (`use function \{imap_*}`).
If this is allowing prefixing with backslashes, the "surface area" isn't guaranteed to be enumerated in one place. Forbidding backslashes or namespace aliases for functions/constants would have other objections.
> I've not done the work to implement this, though I am happy to if there's interest. > But, it may not be feasible. I'd hope Tyson would kindly say "STFU not possible" if, > from their recent experience, it couldn't be done. Apologies if earlier thread discussion ruled this out categorically. > > But I like that it adds super-strictness for those who want that, while keeping the super-flexibility of what we have now. That seems to be the way we're going with this dual-world of loose- and strict-ness in PHP.
The issue would be that I'd expect it to be voted against for similar reasons to my RFC, but it's easy to implement. `declare(function_and_const_lookup='global')` aimed to make something that was inconvenient to work with with tooling more convenient in some ways with and without tooling. Right now, it's already possible to fail your build if a file uses unqualified names, without code changes. Tools like `phpcs` and `phan --plugin NotFullyQualifiedUsagePlugin` can be run in continuous integration or as a check on file save, and have ways to detect the lack of explicit qualifications. (If a project could set up `php --syntax-check` on all files, they could likely also set `phpcs` checks up). Having a file `strict_qualify=1` throw/exit after a deployment or code release would be undesirable (emitting a compiler warning when compiling would be safer, but make that setting a less useful guarantee of strictness) It's possible to implement from a technical standpoint: the lines that get changed are the same ones you can see in the PR for my RFC. (If you mean throw at compile time, not at runtime)
  108332
January 30, 2020 13:58 guilliam.xavier@gmail.com (Guilliam Xavier)
> declare(strict_qualify=1); > > namespace foo; > > use function \{array_diff, var_dump}; > use const \bar\{a, b};
[Off-topic] All coding standards I've seen forbid writing a leading backslash in "use" and in strings. Indeed, `use \Ns\Klass;` gives the false impression that the leading backslash be required to avoid relative resolution, but `use Ns\Klass;` is already absolute. And for strings, `\Ns\Klass::class` (or `get_class(new \Ns\Klass())`) is equal to `'Ns\Klass'`, not `'\Ns\Klass'`. -- Guilliam Xavier