Re: [PHP-DEV] [VOTE] declare(function_and_const_lookup='global')

This is only part of a thread. view whole thread
  108322
January 29, 2020 18:42 bishop@php.net (Bishop Bettini)
On Wed, Jan 29, 2020 at 5:56 AM Marco Pivetta <ocramius@gmail.com> wrote:

> I voted "No" on this one, sorry. > > TL;DR: I'd rather have a mechanism to disable global function fallback, not > something that makes un-imported symbols immediately global. > > The idea to disable PHP's implicit > "fallback-to-something-that-may-or-may-not-exist" is great, and it could > simplify some of the engine, as well as remove some caching mechanisms > (lookup) in some contexts, but I'm pretty happy with disabling all global > lookup functionality, rather than making them the default. > > I don't have a problem with `use function sprintf;` to import things from > PHP, and actually use it to quickly grep whenever I do something extremely > dangerous (like `use function umask;`). > This gives me also a good overview of how much of the global symbols are > still in my system, and to be replaced with something like > `thecodingmachine/safe`. >
Agreed. I want to force explicit qualifications, and raise error on unqualified names:
  108323
January 29, 2020 20:12 claude.pache@gmail.com (Claude Pache)
> Le 29 janv. 2020 à 19:42, Bishop Bettini <bishop@php.net> a écrit : > > Cons... have to enumerate everything, potentially lots of work to do that > to update old code to use; edge cases I'm not thinking about.
Not only it is much work (although it could be partially automated), but it does not make much sense. I’ve taken a random file (~650 LoC) in the codebase I’m working on. With `declare(strict_qualify=1)`, I would have to add: use function array_filter , array_flip , array_key_exists , array_keys , array_push , array_unshift , compact , ctype_digit , idate , in_array , is_int , is_string , ksort , ob_get_clean , ob_start , preg_match , sprintf , strlen , strncmp , substr_count , trim , usort; I don’t see a clear advantage in declaring more than 20 global functions in each and every of the hundreds of files of the codebase—where it is evident that they are all global function (no, I don’t have a namespaced version of `is_int()`). Also, I think that there are better ways to discover if my colleague introduced surreptitiously a `system()` call in the codebase. Some time ago, I’ve tried the alternative approach in two files: adding a backslash before each and every global function (carefully avoiding non-functions such as `isset()`, `empty()` and `list()`). Well, it reminded me of the good old days when I used TeX intensively, but objectively, it didn’t make the code more clear or more secure (I’ve not tested its velocity, though). —Claude
  108324
January 29, 2020 21:11 bishop@php.net (Bishop Bettini)
On Wed, Jan 29, 2020 at 3:12 PM Claude Pache pache@gmail.com> wrote:

> > > Le 29 janv. 2020 à 19:42, Bishop Bettini <bishop@php.net> a écrit : > > Cons... have to enumerate everything, potentially lots of work to do that > to update old code to use; edge cases I'm not thinking about. > > > Not only it is much work (although it could be partially automated), but > it does not make much sense. > > I’ve taken a random file (~650 LoC) in the codebase I’m working on. With > `declare(strict_qualify=1)`, I would have to add: > > use function > array_filter > , array_flip > , array_key_exists > , array_keys > , array_push > , array_unshift > , compact > , ctype_digit > , idate > , in_array > , is_int > , is_string > , ksort > , ob_get_clean > , ob_start > , preg_match > , sprintf > , strlen > , strncmp > , substr_count > , trim > , usort; >
A fair critique. Under the "goad use of wildcard import" pro, the quick way: use \{*}; Any unqualified name is locked to the global namespace.
> I don’t see a clear advantage in declaring more than 20 global functions > in each and every of the hundreds of files of the codebase—where it is > evident that they are all global function (no, I don’t have a namespaced > version of `is_int()`). Also, I think that there are better ways to > discover if my colleague introduced surreptitiously a `system()` call in > the codebase. >
Well, first, it's faster as Tyson described in the RFC: those are opcoded and bypass the namespace lookup. Second, and I find this the best reason, that enumeration tells me quite a bit about what that code does. I don't need to see the code to know that it's compute oriented, doesn't touch resources (eg files), and is doing what PHP does best: string and array manipulation. It also shows me that code's doing output buffer manipulation, which seems out of place and suggests that should be refactored out. IOW, it's a really nice abstract / summary of what the code's doing. And that's some very good insight.
> Some time ago, I’ve tried the alternative approach in two files: adding a > backslash before each and every global function (carefully avoiding > non-functions such as `isset()`, `empty()` and `list()`). Well, it reminded > me of the good old days when I used TeX intensively, but objectively, it > didn’t make the code more clear or more secure (I’ve not tested its > velocity, though). >
I agree: qualifying the name at the usage site doesn't add much value, IMO. It's frankly more noise than anything else.
  108330
January 30, 2020 08:10 claude.pache@gmail.com (Claude Pache)
> Le 29 janv. 2020 à 22:11, Bishop Bettini <bishop@php.net> a écrit : > > On Wed, Jan 29, 2020 at 3:12 PM Claude Pache pache@gmail.com <mailto:claude.pache@gmail.com>> wrote: > > >> Le 29 janv. 2020 à 19:42, Bishop Bettini <bishop@php.net <mailto:bishop@php.net>> a écrit : >> >> Cons... have to enumerate everything, potentially lots of work to do that >> to update old code to use; edge cases I'm not thinking about. > > > Not only it is much work (although it could be partially automated), but it does not make much sense. > > I’ve taken a random file (~650 LoC) in the codebase I’m working on. With `declare(strict_qualify=1)`, I would have to add: > > use function > array_filter > , array_flip > , array_key_exists > , array_keys > , array_push > , array_unshift > , compact > , ctype_digit > , idate > , in_array > , is_int > , is_string > , ksort > , ob_get_clean > , ob_start > , preg_match > , sprintf > , strlen > , strncmp > , substr_count > , trim > , usort; > > A fair critique. Under the "goad use of wildcard import" pro, the quick way: > use \{*};
As of today, we already have something like `use function *`. We name it “fallback to the global scope”. The current RFC proposes an alternative approach, with its own advantages and inconveniences.
> > > Second, and I find this the best reason, that enumeration tells me quite a bit about what that code does. I don't need to see the code to know that it's compute oriented, doesn't touch resources (eg files), and is doing what PHP does best: string and array manipulation. It also shows me that code's doing output buffer manipulation, which seems out of place and suggests that should be refactored out. > > IOW, it's a really nice abstract / summary of what the code's doing. And that's some very good insight.
It’s an illusion: the enumeration of used global functions doesn’t say all. For instance, you cannot guess if the code writes on the filesystem by manipulating an instance of `SplFileObject` (or some other custom class), or not. —Claude
  108329
January 30, 2020 03:28 tysonandre775@hotmail.com (tyson andre)
> Agreed. I want to force explicit qualifications, and raise error on unqualified names: > > declare(strict_qualify=1); > > namespace foo; > > use function \{array_diff, var_dump}; > use const \bar\{a, b}; > > $x = array_diff(a, b); > if (count($x)) { // throw an \Error of some kind, as count is unqualified > var_dump($x); > } > > Pros... no bc break; short declare identifier with obvious values; code valid even if declare is removed; no lookup performance penalty; > easy to remember logic; forces enumeration of the "surface area" the code wants to cover > (which in turn helps developers think about how much work the code's doing); has analogue in other similar languages (perl, python, node); > makes explicit what functions are being used and where they originate; might goad an implementation of wildcard use (`use function \{imap_*}`).
If this is allowing prefixing with backslashes, the "surface area" isn't guaranteed to be enumerated in one place. Forbidding backslashes or namespace aliases for functions/constants would have other objections.
> I've not done the work to implement this, though I am happy to if there's interest. > But, it may not be feasible. I'd hope Tyson would kindly say "STFU not possible" if, > from their recent experience, it couldn't be done. Apologies if earlier thread discussion ruled this out categorically. > > But I like that it adds super-strictness for those who want that, while keeping the super-flexibility of what we have now. That seems to be the way we're going with this dual-world of loose- and strict-ness in PHP.
The issue would be that I'd expect it to be voted against for similar reasons to my RFC, but it's easy to implement. `declare(function_and_const_lookup='global')` aimed to make something that was inconvenient to work with with tooling more convenient in some ways with and without tooling. Right now, it's already possible to fail your build if a file uses unqualified names, without code changes. Tools like `phpcs` and `phan --plugin NotFullyQualifiedUsagePlugin` can be run in continuous integration or as a check on file save, and have ways to detect the lack of explicit qualifications. (If a project could set up `php --syntax-check` on all files, they could likely also set `phpcs` checks up). Having a file `strict_qualify=1` throw/exit after a deployment or code release would be undesirable (emitting a compiler warning when compiling would be safer, but make that setting a less useful guarantee of strictness) It's possible to implement from a technical standpoint: the lines that get changed are the same ones you can see in the PR for my RFC. (If you mean throw at compile time, not at runtime)
  108332
January 30, 2020 13:58 guilliam.xavier@gmail.com (Guilliam Xavier)
> declare(strict_qualify=1); > > namespace foo; > > use function \{array_diff, var_dump}; > use const \bar\{a, b};
[Off-topic] All coding standards I've seen forbid writing a leading backslash in "use" and in strings. Indeed, `use \Ns\Klass;` gives the false impression that the leading backslash be required to avoid relative resolution, but `use Ns\Klass;` is already absolute. And for strings, `\Ns\Klass::class` (or `get_class(new \Ns\Klass())`) is equal to `'Ns\Klass'`, not `'\Ns\Klass'`. -- Guilliam Xavier