Disable autovivification on false

  114595
May 25, 2021 16:23 tekiela246@gmail.com (Kamil Tekiela)
Hi Internals,

I'd like to start a discussion on the following RFC
https://wiki.php.net/rfc/autovivification_false
Particularly, I am looking for opinions on whether this behaviour should be
left alone, should be disabled on false, or should be disabled on null and
false, and left only for undefined variables.

Autovivification is very useful in PHP, especially with multidimensional
arrays and loops. However, the question is should we allow it on false and
null values going forward.

Kind Regards,
Kamil Tekiela
  114596
May 25, 2021 17:12 cschneid@cschneid.com (Christian Schneider)
Am 25.05.2021 um 18:23 schrieb Kamil Tekiela <tekiela246@gmail.com>:
> I'd like to start a discussion on the following RFC > https://wiki.php.net/rfc/autovivification_false > Particularly, I am looking for opinions on whether this behaviour should be > left alone, should be disabled on false, or should be disabled on null and > false, and left only for undefined variables. > > Autovivification is very useful in PHP, especially with multidimensional > arrays and loops. However, the question is should we allow it on false and > null values going forward.
I feel like a broken record, but I have to say it (again): Please add an E_WARNING step before changing something that used to be silent to an error. For easier migration. Thanks, - Chris
  114597
May 25, 2021 17:13 claude.pache@gmail.com (Claude Pache)
> Le 25 mai 2021 à 18:23, Kamil Tekiela <tekiela246@gmail.com> a écrit : > > Hi Internals, > > I'd like to start a discussion on the following RFC > https://wiki.php.net/rfc/autovivification_false > Particularly, I am looking for opinions on whether this behaviour should be > left alone, should be disabled on false, or should be disabled on null and > false, and left only for undefined variables. > > Autovivification is very useful in PHP, especially with multidimensional > arrays and loops. However, the question is should we allow it on false and > null values going forward. > > Kind Regards, > Kamil Tekiela
Hi, * IMO, null and undefined should be treated the same way, because for many purposes, they are not distinguished (e.g., isset() and ??). In general, I’m setting something to null when I want to avoid an “Undefined symbol” notice, I am sure it is not a bug, and I don’t want to scatter my code with `isset()` or `??` checks. * In any case, the removing of the autovivification behaviour MUST be preceded by a period of deprecation notices. —Claude
  114598
May 25, 2021 17:57 tekiela246@gmail.com (Kamil Tekiela)
>In any case, the removing of the autovivification behaviour MUST be preceded by a period of deprecation notices.
Yeah, obviously. Sorry I didn't make it clear in the RFC. I clarified this now. The goal is to first determine if we should restrict this behaviour or not.
  114599
May 25, 2021 19:47 pollita@php.net (Sara Golemon)
On Tue, May 25, 2021 at 11:23 AM Kamil Tekiela <tekiela246@gmail.com> wrote:

> I'd like to start a discussion on the following RFC > https://wiki.php.net/rfc/autovivification_false > Particularly, I am looking for opinions on whether this behaviour should be > left alone, should be disabled on false, or should be disabled on null and > false, and left only for undefined variables. > > Autovivification is very useful in PHP, especially with multidimensional > arrays and loops. However, the question is should we allow it on false and > null values going forward. > > I agree it's useful, and should 100% not be killed on undef. There is
doubtless a mountain of code in WordPress plugins alone which relies on this, not to mention other frameworks and libraries. As for `false` and `null`, I think it's reasonable to kill these (with a deprecation period) in a major version (e.g. 9.0). It might also be reasonable to only kill this behavior when strict_types is enabled. Essentially it would enforce making the array type accessors strict about operating on arrays. Would still need a deprecation period to go with that, but it would be less-rough on migrations of legacy codebases. -Sara
  114601
May 25, 2021 20:27 dusk@woofle.net (Dusk)
On May 25, 2021, at 09:23, Kamil Tekiela <tekiela246@gmail.com> wrote:
> I'd like to start a discussion on the following RFC > https://wiki.php.net/rfc/autovivification_false > Particularly, I am looking for opinions on whether this behaviour should be > left alone, should be disabled on false, or should be disabled on null and > false, and left only for undefined variables.
It seems to me there are two different sorts of autovivification which can happen. (They may actually be the same thing under the hood, but they feel different to me.) One is autovivification within an array, e.g. $x = ["b" => null, "c" => false]; $x["a"][] = 1; // from unset $x["b"][] = 2; // from null $x["c"][] = 3; // from false The other is autovivification on values outside arrays, e.g. $b = null; $c = false; $d = new stdClass(); $a[] = 1; // from unset $b[] = 2; // from null $c[] = 3; // from false $obj->a[] = 4; // from unset on an object property, same idea (The same behavior occurs in each case if an explicit key is used instead of [].) From my perspective, the latter is much more concerning than the former. Adding dimensions to an existing array feels like less of a type-correctness violation than calling an entire array into existence where a non-array value (or no value at all) existed before.
  114611
May 26, 2021 10:54 guilliam.xavier@gmail.com (Guilliam Xavier)
On Tue, May 25, 2021 at 10:27 PM Dusk <dusk@woofle.net> wrote:

> On May 25, 2021, at 09:23, Kamil Tekiela <tekiela246@gmail.com> wrote: > > I'd like to start a discussion on the following RFC > > https://wiki.php.net/rfc/autovivification_false > > Particularly, I am looking for opinions on whether this behaviour should > be > > left alone, should be disabled on false, or should be disabled on null > and > > false, and left only for undefined variables. > > It seems to me there are two different sorts of autovivification which can > happen. (They may actually be the same thing under the hood, but they feel > different to me.) > > One is autovivification within an array, e.g. > > $x = ["b" => null, "c" => false]; > $x["a"][] = 1; // from unset > $x["b"][] = 2; // from null > $x["c"][] = 3; // from false > > The other is autovivification on values outside arrays, e.g. > > $b = null; $c = false; $d = new stdClass(); > $a[] = 1; // from unset > $b[] = 2; // from null > $c[] = 3; // from false > $obj->a[] = 4; // from unset on an object property, same idea > > (The same behavior occurs in each case if an explicit key is used instead > of [].) > > From my perspective, the latter is much more concerning than the former. > Adding dimensions to an existing array feels like less of a > type-correctness violation than calling an entire array into existence > where a non-array value (or no value at all) existed before. >
I agree to that last point. I think the cases so far where I have relied on autovivification were all like: ``` function f(iterable $xs) { $map = []; // initialization foreach ($xs as $x) { // $map[foo($x)] ??= []; not needed // $map[foo($x)][bar($x)] ??= []; not needed $map[foo($x)][bar($x)][] = qux($x); // autovivification } // Then e.g.: foreach ($map as $foo => $submap) { foreach ($submap as $bar => $quxes) { /* ... */ } } } ``` All other cases I can remember were arguably bugs (missing initialization). That said, deprecating it on false would already be a +1. Regards, -- Guilliam Xavier
  114743
June 5, 2021 19:50 tekiela246@gmail.com (Kamil Tekiela)
Hi Internals,

I have reworked the RFC based on some feedback. The improved RFC now will
hold 2 votes. One vote to decide whether the behaviour should be deprecated
from false, and another from null.

If there are no objections then I would like to start the votes in a couple
of days.

However, I would still like to hear from you whether you
use autovivification from false and/or null in your projects. So far, I was
not able to identify when this would be useful in real-life scenarios.

RFC: https://wiki.php.net/rfc/autovivification_false

Regards,
Kamil
  114744
June 5, 2021 20:35 marandall@php.net (Mark Randall)
On 05/06/2021 20:50, Kamil Tekiela wrote:
> However, I would still like to hear from you whether you > use autovivification from false and/or null in your projects. So far, I was > not able to identify when this would be useful in real-life scenarios.
As per the RFC: "In PHP 8.1, appending to a variable of type false will throw a deprecation error" You might want to clarify this language, throwing an error is understood to have a specific meaning in PHP, and it's not what it's doing here. Mark Randall
  114746
June 5, 2021 22:13 tysonandre775@hotmail.com (tyson andre)
Hi Kamil,

> I have reworked the RFC based on some feedback. The improved RFC now will > hold 2 votes. One vote to decide whether the behaviour should be deprecated > from false, and another from null. > > If there are no objections then I would like to start the votes in a couple > of days. > > However, I would still like to hear from you whether you > use autovivification from false and/or null in your projects. So far, I was > not able to identify when this would be useful in real-life scenarios. > > RFC: https://wiki.php.net/rfc/autovivification_false
Without an implementation it'd be hard to actually tell what the impact would be. There isn't one linked to from the RFC or on github.com/php/php-src. You might have code in your application or external dependencies that relies on that without you remembering or being aware of it for null. **I started working on a prototype independently just to get an idea of how many things would encounter deprecations. See https://github.com/TysonAndre/php-src/pull/17** (I am not one of the RFC's authors. If someone bases their implementation on that prototype PR please keep the authorship of initial commits (e.g. `Co-Authored-By` in git) Also, what is the planned deprecation message, what about documenting all kinds of expressions that can cause autovivication, etc: e.g. `$x = &$falseVar['offset']` My assumption is that false would be reasonably practical to implement (this patch with `== IS_FALSE` instead of `>= IS_NULL`, plus some changes to the optimizer to account for the fact some dimension assignment statements might now have. For IS_NULL, that would probably require more familiarity with php's internals than I have to be certain the implementation is correct and to properly distinguish between undefined and null when automatically creating properties. Deprecation notices would be a lot more common for null than for false for example snippets such as the below, I see dozens of test failures in Zend/tests with that prototype `$this->someArray[$offset] = $event` for the implicitly null `public $someArray;`. https://github.com/vimeo/psalm/blob/105c6f3a1c6521e4077da39f05a94b1ddbd76249/src/Psalm/Internal/PhpVisitor/ReflectorVisitor.php#L399 is an example of that - the property's default is null, not the empty array. - Obviously, that project would fix it quickly (I'm just testing on projects I've already downloaded), but the point is there may be a lot of code like that elsewhere. And code such as https://github.com/nikic/php-ast/blob/v1.0.12/util.php#L25 (for a PECL I use - that would be fixed very quickly for null if this passed, there may be a lot of code like that elsewhere and other projects may not be maintained) ``` static $someVar; // equivalent to static $someVar = null; if ($someVar !== null) { return $someVar; } // ... someVar is still null $someVar[] = '123'; ``` ``` function example() { global $array; var_dump($array); // null, not undefined due to global $array converting undefined to null before creating a reference. $array[] = 1; } ``` ---- Overall, I'm in favor of this for false, but on the fence about null. It seems like a lot of old and possibly unmaintained applications/libraries might be converting null to arrays implicitly in ways such as the above, and there wouldn't be too much of a benefit to users to forcing them to start explicitly converting nulls to arrays before adding fields to arrays or taking references. If a project isn't already providing type information everywhere (and isn't using a static analyzer such as phan/psalm) it may be easy to miss the possibility of a value being null in rare code paths, but deprecating should give enough time to detect and fix typical issues between 8.1 and 9.0, and php 7.4's typed properties and 8.0's union types may make type information much easier to track. I try to avoid autovivication from null in php projects I work on, but I'm not the only contributor to those Regards, Tyson
  114762
June 7, 2021 10:17 nikita.ppv@gmail.com (Nikita Popov)
On Sun, Jun 6, 2021 at 12:14 AM tyson andre <tysonandre775@hotmail.com>
wrote:

> Hi Kamil, > > > I have reworked the RFC based on some feedback. The improved RFC now will > > hold 2 votes. One vote to decide whether the behaviour should be > deprecated > > from false, and another from null. > > > > If there are no objections then I would like to start the votes in a > couple > > of days. > > > > However, I would still like to hear from you whether you > > use autovivification from false and/or null in your projects. So far, I > was > > not able to identify when this would be useful in real-life scenarios. > > > > RFC: https://wiki.php.net/rfc/autovivification_false > > Without an implementation it'd be hard to actually tell what the impact > would be. There isn't one linked to from the RFC or on > github.com/php/php-src. > You might have code in your application or external dependencies that > relies on that without you remembering or being aware of it for null. > > **I started working on a prototype independently just to get an idea of > how many things would encounter deprecations. See > https://github.com/TysonAndre/php-src/pull/17** (I am not one of the > RFC's authors. If someone bases their implementation on that prototype PR > please keep the authorship of initial commits (e.g. `Co-Authored-By` in git) > > Also, what is the planned deprecation message, what about documenting all > kinds of expressions that can cause autovivication, etc: e.g. `$x = > &$falseVar['offset']` > > > My assumption is that false would be reasonably practical to implement > (this patch with `== IS_FALSE` instead of `>= IS_NULL`, plus some changes > to the optimizer to account for the fact some dimension assignment > statements might now have. > For IS_NULL, that would probably require more familiarity with php's > internals than I have to be certain the implementation is correct and to > properly distinguish between undefined and null when automatically creating > properties. >
This is a good point. After a cursory look, it's not obvious to me how we could distinguish these cases. We do need to perform the null initialization here, as we can't just leave behind an undef element in the property table. I think it would be best to limit this proposal to the false case only. A way to put this is that auto-vivification only happens for !isset(), which is precisely undef or null, but not false. (The historical behavior was that it happens for empty(), which has been cut down over time to the point that false is the only other accepted value, at which point this framing doesn't make sense anymore.) Regards, Nikita