Re: [PHP-DEV] Planning an RFC to allow calls to global functions inconstant expressions

This is only part of a thread. view whole thread
  108409
February 6, 2020 11:35 nikita.ppv@gmail.com (Nikita Popov)
On Sun, Feb 2, 2020 at 12:00 AM tyson andre <tysonandre775@hotmail.com>
wrote:

> > Hi internals, > > I have a working implementation for calling global functions in constant > expressions at https://github.com/php/php-src/pull/5139 > (see PR description for more details, see tests for how edge cases get > resolved) > > If there was interest, and no implementation or process blockers I was > unaware of, I was planning to set up an RFC with votes along these lines: > > 1. (Primary) Yes/no for supporting calls to global functions, with or > without the below limitation (requires 2/3 majority) > 2. (Secondary vote requiring 2/3 majority to allow calls to all global > functions) Support calls to any user-defined or internal global functions, > not just unambiguously resolved calls to the deterministic, side effect > free ones that are guaranteed to be installed in core extensions. > > Alternately, have a first vote on allowing a small subset of global > functions calls, then if that passes, start a second vote on allowing all > global functions. > > For example, allow `\count()`, `\strlen()`, `\array_merge()`, and > `\in_array()`, > but don't allow functions such as > `\strtolower()` (different in Turkish locale), > \sprintf() (Depends on locale and `ini_get('precision')`, but so does > `(string)EXPR`), > `json_encode()` (The `json` extension can be disabled), > or calls which aren't unambiguously resolved with `\` or `use function`. > I'll assemble the list when writing the RFC. > > https://github.com/phan/phan/blob/2.4.8/src/Phan/Plugin/Internal/UseReturnValuePlugin.php#L190 > has a list of candidate functions (non-deterministic ones such as rand() > will be excluded). > > I mentioned this earlier in https://externals.io/message/108238#108238 > The use case I have in mind is different from what was mentioned there, > and this is *just* adding supports for function calls by name, not any > other expressions (variables, properties, methods, etc) > > Motivation for the voting structure: There are many opinions on what a > constant *should* be used for, e.g. one opinion is > https://externals.io/message/108238#107992 > > > > Off the top of my head, I can think of three types of things that you > might call "constants": > > > > 1. A value that will never change, and you just want to give a name to. > For instance, MINUTES_IN_HOUR=60 > > 2. A value set at compile-time, to hard-code a particular condition. For > instance, DEBUG=false > > 3. A value which is set at run-time, but happens not to change during > the course of the application, so should not be over-written. > > > > PHP's constants are designed primarily to be type 1 - they are literal > > values, hard-coded in the source. Unlike C, there is no standard > > pre-processor for PHP, so type 2 is rarer, but it's possible to > > implement by manipulating the source file before deployment, or even > > when running the autoloader. >
Hey Tyson, As you've already realized, the main problem here is the behavior for functions that have side-effects or state. Currently we mostly get away with the illusion that it doesn't matter when exactly constexpr initializers are evaluated. Apart from the error location and (admittedly quite a few) other details, you ostensibly can't distinguish whether the expression is going to evaluated on declaration, on first access or on each access. I'm not a big fan of whitelisting functions, especially as this runs into the name resolution issue (you suggest that essentially the use of fully qualified/imported names will be forced -- unlike everywhere else in PHP), and because I've seen this "mark all the functions const/constexpr" race play out one too many times in other languages, which have a much stronger motivation for it than we do. If we want to expand the allowed operations inside constexpr initializers, I think this needs to start by considering precise evaluation semantics in the places where it matters. Those are primarily initializers for function parameters and non-static properties, because both of these have a reasonable expectation of evaluation at each use. I think the best way to approach those two may well be to relax the constexpr restrictions on them entirely (allowing say "public Foo $prop = new Foo()", something people want anyway) and make sure that things are evaluated on each access (unless they can be pre-evaluated without affecting behavior, of course). For the remaining cases (constant, class constant and static property initializers) we can probably get away with the current semantics, which is evaluation on first access, with a very broad definition of what "access" means. Regards, Nikita
  108427
February 9, 2020 00:55 tysonandre775@hotmail.com (tyson andre)
> As you've already realized, the main problem here is the behavior for > functions that have side-effects or state. Currently we mostly get away > with the illusion that it doesn't matter when exactly constexpr > initializers are evaluated. Apart from the error location and (admittedly > quite a few) other details, you ostensibly can't distinguish whether the > expression is going to evaluated on declaration, on first access or on each > access.
Good point - If PHP ends up supporting an option where **any** global function can get called, It'd make sense to evaluate any expression *with a function call* every time for instance properties and parameter defaults (e.g. `public $id = generate_unique_int_id()`). (the optimizer can later optimize any functions where it won't be noticeable). I think that would require updates to my PR for caching, since immutable returned zvals (including false/true) get cached permanently.
> I'm not a big fan of whitelisting functions, especially as this runs into > the name resolution issue (you suggest that essentially the use of fully > qualified/imported names will be forced -- unlike everywhere else in PHP), > and because I've seen this "mark all the functions const/constexpr" race > play out one too many times in other languages, which have a much stronger > motivation for it than we do.
I'd agree with all of those drawbacks of whitelisting functions. This is still the best compromise I can think of for allowing those functions I've mentioned, if it turns out there are too many objections to this making constants too dynamic.
> If we want to expand the allowed operations inside constexpr initializers, > I think this needs to start by considering precise evaluation semantics in > the places where it matters. Those are primarily initializers for function > parameters and non-static properties, because both of these have a > reasonable expectation of evaluation at each use. I think the best way to > approach those two may well be to relax the constexpr restrictions on them > entirely (allowing say "public Foo $prop = new Foo()", something people > want anyway) and make sure that things are evaluated on each access (unless > they can be pre-evaluated without affecting behavior, of course).
For a more general proposal, the variable scope is a potential problem. For example, would the below example create a new class definition every time f() was called? (Creating a brand new scope with no variables, allowing everything except variables (and relying on zend_forbid_dynamic_call to blacklist get_defined_variables()), or always using the global scope might be ways of handling this.) ``` function f($x) { return fn() => new class() { public $var = $x; // or new Foo($x) }; } $factory1 = f(1); $factory2 = f(2); ``` That reminds me: I forgot about functions such as `compact()`, `extract()` and `get_defined_vars()`. (and `func_num_args()`/`func_get_args()`/`func_get_arg()`). It looks like all of them check `zend_forbid_dynamic_call()` to throw, but I need to add tests and possibly flag the call as dynamic.
> For the remaining cases (constant, class constant and static property > initializers) we can probably get away with the current semantics, which is > evaluation on first access, with a very broad definition of what "access" > means.
Agreed - I plan on keeping the existing semantics for those, but I think the semantics of global(non-class) constants are that they are currently always evaluated immediately, based on the tests I've wrote for this RFC. Thanks, - Tyson