[RFC] Allow calls to global functions in constant expressions

  108430
February 9, 2020 21:02 tysonandre775@hotmail.com (tyson andre)
Hi internals,

https://wiki.php.net/rfc/calls_in_constant_expressions has been updated and moved to
"Under Discussion".

This proposes allowing function calls in constant declarations, static property defaults,
static variables, and parameter defaults.
It includes a secondary voting option to allow any function calls,
and not just a small set of whitelisted core functions.
Other expression types would continue to be forbidden in constant expressions (variables, method calls, etc)

Let me know if you have any comments on the RFC or implementation.

Earlier discussion can be found in https://externals.io/message/108343
("Planning an RFC to allow calls to global functions in constant expressions")

I made some updates to the RFC, implementation and the proposed whitelist of functions.
(in response to https://externals.io/message/108343#108427)

> > As you've already realized, the main problem here is the behavior for > > functions that have side-effects or state. Currently we mostly get away > > with the illusion that it doesn't matter when exactly constexpr > > initializers are evaluated. Apart from the error location and (admittedly > > quite a few) other details, you ostensibly can't distinguish whether the > > expression is going to evaluated on declaration, on first access or on each > > access. > > Good point - If PHP ends up supporting an option where **any** global function can get called, > It'd make sense to evaluate any expression *with a function call* > every time for instance properties and parameter defaults > (e.g. `public $id = generate_unique_int_id()`). > (the optimizer can later optimize any functions where it won't be noticeable). > > I think that would require updates to my PR for caching, > since immutable returned zvals (including false/true) get cached permanently.
I've completed the updates to parameter defaults. Parameter defaults containing calls now don't get cached: They always get evaluated every time the parameter isn't passed in. I also excluded support for calls in instance property defaults from the RFC. I'd want that feature, but the changes to PHP's internal representation of classes expand the scope of this RFC too much for me. - PHP currently evaluates all the property defaults and permanently caches them the first time a property gets accessed or an object gets instantiated. I'd probably have to change the internal class representation as well to fix that, and I don't know how many or what types of bugs/issues that would cause.
> That reminds me: > I forgot about functions such as `compact()`, `extract()` and `get_defined_vars()`. > (and `func_num_args()`/`func_get_args()`/`func_get_arg()`). > It looks like all of them check `zend_forbid_dynamic_call()` to throw, > but I need to add tests and possibly flag the call as dynamic.
Update: The implementation already properly throws an Error. Thanks, - Tyson
  108431
February 9, 2020 21:33 mike@newclarity.net (Mike Schinkel)
> On Feb 9, 2020, at 4:02 PM, tyson andre <tysonandre775@hotmail.com> wrote: > > https://wiki.php.net/rfc/calls_in_constant_expressions has been updated and moved to > "Under Discussion". > > This proposes allowing function calls in constant declarations, static property defaults, > static variables, and parameter defaults. > It includes a secondary voting option to allow any function calls, > and not just a small set of whitelisted core functions. > Other expression types would continue to be forbidden in constant expressions (variables, method calls, etc) > > Let me know if you have any comments on the RFC or implementation.
1. Why again are MyClass::methodName() not considered for the non-whitelist vote? Seems to me a developer would be more inclined to implement the expressions that define the class constant's value in a method of the class than in an external function. 2. Do we really want to add a standard library function 53 characters long? Can we not come up with a more concise name than get_defined_functions_allowed_in_constant_expressions(), like maybe get_const_expr_funcs() or get_const_expressions()? -Mike
  108432
February 10, 2020 01:34 tysonandre775@hotmail.com (tyson andre)
> 1. Why again are MyClass::methodName() not considered for the non-whitelist vote? > > Seems to me a developer would be more inclined to implement the expressions that define the class constant's value in a method of the class than in an external function.
My reasons for doing this were to keep the scope small and easy to implement, and work on an RFC for methods immediately after if the vote for allowing arbitrary functions passed. The other concern was the edge case `static::method_name()` in class constants, but `call_user_func(['static', 'method_name'])` or `array_map('static::method_name', VALS)` causes the same issues. This leads me to a blocker I've discovered with this implementation calling without a whitelist: The way PHP resolves the scope is to look for the closest closure or user-defined function, then to extract the class of that function. (in zend_get_executed_scope). I thought the scope would work because `const X = self::Y` works, but it turns out ZEND_AST_CLASS_CONST is special cased in AST evaluation. My current ideas for fixing this (for class constants and property defaults): (Parameter defaults, static variables, and global constants should already have reasonable scopes) 1. Wrap all of the calls the engine is making in a byte copy of call_user_func, but with the property `zend_class_entry *scope` set, so that zend_get_executed_scope will see that frame's scope and use it instead of the caller's scope. 2. Add a fake stack frame that acts as if an extra closure (bound to the declaring class's scope) was being evaluated. 3. Add a brand new function. If I don't do that, then constant/property evaluation uses the scope of the caller, not the class ``` class MyClass { public static function example() {} const X = call_user_func('self::example'); } // this throws in the current implementation // due to no active class scope, and that's a bug echo MyClass::X; ``` If I (or anyone else) can't figure out how to implement the fix, or the fix has issues, I might remove the secondary vote from the RFC and just vote on the whitelist, which excluded functions accepting callables. If anyone comes up with a working implementation for a fix (including exception handling), I'd like to know.
> 2. Do we really want to add a standard library function 53 characters long? > > Can we not come up with a more concise name than get_defined_functions_allowed_in_constant_expressions(), like maybe get_const_expr_funcs() or get_const_expressions()?
I was planning to change it if I thought of a better name. For get_const_expr_funcs(), I'd think it would be important to be consistent about naming in the standard library. `get_const_expr_functions()` would work, though.
  108433
February 10, 2020 04:10 mike@newclarity.net (Mike Schinkel)
> On Feb 9, 2020, at 8:34 PM, tyson andre <tysonandre775@hotmail.com> wrote: > >> 1. Why again are MyClass::methodName() not considered for the non-whitelist vote? >> >> Seems to me a developer would be more inclined to implement the expressions that define the class constant's value in a method of the class than in an external function. > > My reasons for doing this were to keep the scope small and easy to implement, > and work on an RFC for methods immediately after if the vote for allowing arbitrary functions passed. > > The other concern was the edge case `static::method_name()` in class constants, > but `call_user_func(['static', 'method_name'])` or `array_map('static::method_name', VALS)` > causes the same issues. > > This leads me to a blocker I've discovered with this implementation calling without a whitelist: > The way PHP resolves the scope is to look for the closest closure or user-defined function, > then to extract the class of that function. (in zend_get_executed_scope). > I thought the scope would work because `const X = self::Y` works, > but it turns out ZEND_AST_CLASS_CONST is special cased in AST evaluation. > My current ideas for fixing this (for class constants and property defaults): > (Parameter defaults, static variables, and global constants should already have reasonable scopes) > > 1. Wrap all of the calls the engine is making in a byte copy of call_user_func, but with the property `zend_class_entry *scope` set, so that zend_get_executed_scope will see that frame's scope and use it instead of the caller's scope. > 2. Add a fake stack frame that acts as if an extra closure (bound to the declaring class's scope) was being evaluated. > 3. Add a brand new function. > > If I don't do that, then constant/property evaluation uses the scope of the caller, not the class > > ``` > class MyClass { > public static function example() {} > const X = call_user_func('self::example'); > } > // this throws in the current implementation > // due to no active class scope, and that's a bug > echo MyClass::X; > ``` > > If I (or anyone else) can't figure out how to implement the fix, or the fix has issues, > I might remove the secondary vote from the RFC and just vote on the whitelist, > which excluded functions accepting callables. > If anyone comes up with a working implementation for a fix (including exception handling), > I'd like to know. >
Thanks for the reply.
>> 2. Do we really want to add a standard library function 53 characters long? >> >> Can we not come up with a more concise name than get_defined_functions_allowed_in_constant_expressions(), like maybe get_const_expr_funcs() or get_const_expressions()? > > I was planning to change it if I thought of a better name. > For get_const_expr_funcs(), I'd think it would be important to be consistent about > naming in the standard library. > `get_const_expr_functions()` would work, though.
Oh, okay. But don't you think the 'consistent naming in PHP' ship sailed eons ago?!? (Sorry, I just could not resist! :-) -Mike
  108434
February 10, 2020 08:45 ocramius@gmail.com (Marco Pivetta)
On Sun, Feb 9, 2020 at 10:33 PM Mike Schinkel <mike@newclarity.net> wrote:

> 2. Do we really want to add a standard library function 53 characters > long? > > Can we not come up with a more concise name than > get_defined_functions_allowed_in_constant_expressions(), like maybe > get_const_expr_funcs() or get_const_expressions()? >
For once, we would have a function with a self-describing, clear name: let's not butcher it with abbreviations please :-) CTRL+SPACE is your friend. Marco Pivetta http://twitter.com/Ocramius http://ocramius.github.com/
  108435
February 10, 2020 09:48 mike@newclarity.net (Mike Schinkel)
> > On Feb 10, 2020 at 3:45 AM, mailto:ocramius@gmail.com)> wrote: > > > > > > For once, we would have a function with a self-describing, clear name: > > > >
Seriously? That function name reminds me of COBOL programming.
> > > > CTRL+SPACE is your friend. > > >
CTRL+SPACE does not make long expressions short enough to fit in editor windows and on web pages that do not require left-to-right scrolling, such as on Github. #moderationinallthings -Mike
  108580
February 14, 2020 15:18 tysonandre775@hotmail.com (tyson andre)
Hi internals,

https://wiki.php.net/rfc/calls_in_constant_expressions has hit some roadblocks in the implementation shortly after the last email.
I've been blocked on how to resolve them in the implementation in a way I'm certain would work.
I had assumed the calling scope wouldn't have the below issues when I started working on this RFC, but was mistaken.

 - The helper method the C code uses, `zend_call_function()`, will always act as though `strict_types=0`. I'd consider this a blocker for `declare(strict_types=1); const X = intdiv("123", 2);`. (Internally, zend_call_function adds a fake call frame without a function, similar to what __get, invocations of error handlers, etc do, and that call frame makes the parameter parsing helper macros of the invoked function treat it like strict_types=0)

    It may also be desirable to specify calls in constants always work like strict_types=1, to avoid issues such as `strpos($float, '.')` being locale-dependent)
 - Getting the class scope *of the constant declaration* to work properly with `callable` (e.g. `public $x = array_map('self::helperMethod', VALUES);`).
    This isn't an issue with only a whitelist.
 - Solving the above two issues while continuing to throw for func_get_args(), get_defined_vars(), etc.
   get_defined_vars() would throw for dynamic calls, but the above approaches would act like 

Currently, my best ideas to fix some of the 3 issues were:
 - Create a fake zend_function placeholder in C with a placeholder name and add it to the internal stack of calls (EG(current_execute_data)).
   Currently, though, it's only possible to set strict_types=1 for user-defined functions.
 - Create an actual closure when compiling the code, which would be evaluated the same way as `public $x = (static function() { return array_map('self::helperMethod', VALUES); })()`
    (i.e. a closure without any `use` variables. The )

    This would solve it, but the performance would be unexpectedly worse than expected, and that would interfere with optimizations.
 - Create an alternative to zend_call_function, but that would duplicate code and be error prone.
   I'm not familiar enough with XDebug, NewRelic, uopz, and other replacements of zend_vm_execute to know if those would be able to support that,
   or if there would be even more issues I haven't thought up yet that would prevent .

Second, I realize it doesn't make much sense to restrain the constant expressions for parameter defaults and class constants in the same way.
The RFC was coherent proposing allowing any function call in 5 places, but less coherent proposing a whitelist of functions, even where those restrictions weren't necessary.

- It may be desirable to support many extra expression types in parameter defaults and static property defaults
  (`function myFunc(object $x = new DefaultObject()) {}`)
  but still limit class constants to the proposed whitelist of functions.

Third, I was considering a straw poll for this,
but don't know what's required to set one up.
It also didn't make sense to me without a solution to the roadblocks.

1. Desired expressions in param defaults (No change/Whitelisted functions/Any function/Any method + new X()+others)
2. ... in static property defaults
3. ... in static variable defaults
4. ... in class constant values
5. ... in const X = ...

Thanks,
- Tyson
  108587
February 14, 2020 17:54 tysonandre775@hotmail.com (tyson andre)
> - Solving the above two issues while continuing to throw for func_get_args(), get_defined_vars(), etc. >   get_defined_vars() would throw for dynamic calls
Then again, it's already possible to get function parameters through debug_backtrace(), so implicitly creating a closure and call with 0 bind vars wrapping the function calls might be an acceptable tradeoff. - Possibly leave that closure out for calls known to not be dynamic internal calls - It might make sense to blacklist known function names such as func_get_args(), get_defined_vars(), etc to prevent writing that type of buggy code, e..g. in parameter default expressions Thoughts?