[RFC - discussion] __toArray()

  108369
February 4, 2020 13:03 stevenwadejr@gmail.com (Steven Wade)
Hi all,

I’d like to officially open my __toArray() RFC <https://wiki.php.net/rfc/to-array> up to discussion. I’ve delayed changing the status until I had more time to respond to the discussion, but since it’s been brought up again <https://externals.io/message/108351>, I figured now is the best time.

https://wiki.php.net/rfc/to-array

Cheers,

Steven Wade
  108371
February 4, 2020 13:10 ocramius@gmail.com (Marco Pivetta)
Linking (again) previous discussions:
https://externals.io/message/98539#98539

`__toArray` as a magic function call when `(array)` cast happens is a bad
idea: it is a BC break, and it removes one of the very few interactions
(with objects) that didn't cause any side-effects (
https://externals.io/message/98539#98545,
https://externals.io/message/98539#98567)

Greets,

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/


  108374
February 4, 2020 13:33 stevenwadejr@gmail.com (Steven Wade)
> `__toArray` as a magic function call when `(array)` cast happen is a bad idea: it is a BC break,

Adding a new magic method is not a backwards compatibility break. The PHP documentation on magic methods states: "Caution: PHP reserves all function names starting with __ as magical. It is recommended that you do not use function names with __ in PHP unless you want some documented magic functionality." Any user implementing their own methods prefixed with a double underscore is taking the chance that their code could break in the future.

> and it removes one of the very few interactions (with objects) that didn't cause any side-effects (https://externals.io/message/98539#98545, https://externals.io/message/98539#98567)

PHP 7.4 introduced get_mangled_object_vars() as a result of these concerns.
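
For illustration, a minimal sketch of what that function returns today, assuming PHP 7.4 or later. The `Point` class is made up for the example; the point is only that, for a plain object, the function currently yields the same raw property table as the cast, which is why it could serve as the side-effect-free fallback if the cast ever became overridable.

```php
<?php
// Hypothetical class, used only for illustration.
class Point
{
    public $x = 1;
    protected $y = 2;
    private $z = 3;
}

$point = new Point();

// Both calls expose the raw property table, including the mangled keys
// PHP uses internally for protected and private properties.
$viaCast = (array) $point;
$viaFunction = get_mangled_object_vars($point);

var_dump($viaCast === $viaFunction);
var_dump(array_keys($viaFunction));
```

The keys for non-public properties contain NUL bytes, so they are best inspected with var_dump() rather than echoed.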
  108376
February 4, 2020 13:35 kontakt@beberlei.de (Benjamin Eberlei)
On Tue, Feb 4, 2020 at 2:10 PM Marco Pivetta <ocramius@gmail.com> wrote:

> Linking (again) previous discussions:
> https://externals.io/message/98539#98539
>
> `__toArray` as a magic function call when `(array)` cast happen is a bad
> idea: it is a BC break, and it removes one of the very few interactions
> (with objects) that didn't cause any side-effects (
> https://externals.io/message/98539#98545,
> https://externals.io/message/98539#98567)

I think we can't classify it as BC break, because no existing code implements __toArray at the moment, and hence it will not fail when this feature is introduced and code gets upgraded to newer versions.

It's correct that this removes the last simple way to get access to all properties on an object, although we can potentially remedy this with a function (which makes a lot of sense).
  108379
February 4, 2020 13:43 ocramius@gmail.com (Marco Pivetta)
On Tue, Feb 4, 2020, 14:36 Benjamin Eberlei <kontakt@beberlei.de> wrote:

> I think we can't classify it as BC break, because no existing code
> implements __toArray at the moment, and hence it will not fail when this
> feature is introduced and code gets upgraded to newer versions.

It is a BC break because it changes the semantic of `(array) $object`: the operation is no longer stable between two versions of the language.

Code relying on `(array)` behaviour (stable) requires additional reflection wrappers to check if `$object` is **not** implementing `__toArray`, and then the operation can be safely used (or an exception is to be thrown).

You can most certainly make a new operation, such as `(toArray) $object`: the cast operator is new and isn't changing any existing behaviour.
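
A rough sketch of the kind of wrapper described above, under the assumption that the RFC's hook would be an ordinary method named `__toArray`. The helper name `rawPropertyTable` is hypothetical, and `method_exists()` stands in for the reflection check; this is not code from the RFC.

```php
<?php
// Hypothetical guard: only trust the raw cast when the object does not
// define the proposed __toArray() hook; otherwise refuse loudly.
function rawPropertyTable(object $object): array
{
    if (method_exists($object, '__toArray')) {
        throw new LogicException(
            get_class($object) . ' overrides array casting via __toArray()'
        );
    }

    return (array) $object;
}
```

Callers that depend on the stable cast would route every `(array) $object` through such a helper, which is the extra burden the objection is about.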
  108383
February 4, 2020 13:50 norbert@aimeos.com (Aimeos | Norbert Sendetzky)
On 04.02.20 at 14:43, Marco Pivetta wrote:
>> I think we can't classify it as BC break, because no existing code
>> implements __toArray at the moment, and hence it will not fail when this
>> feature is introduced and code gets upgraded to newer versions.
>
> It is a BC break because it changes the semantic of `(array) $object`: the
> operation is no longer stable between two versions of the language.

It wouldn't be a BC breaking change if `(array) $object` works like before when __toArray() isn't implemented by an object. As nobody should have implemented __toArray() because it's a reserved name for magic methods, we should be fine.
  108386
February 4, 2020 14:23 ocramius@gmail.com (Marco Pivetta)
On Tue, Feb 4, 2020, 14:50 Aimeos | Norbert Sendetzky <norbert@aimeos.com>
wrote:

> It wouldn't be a BC breaking change if `(array) $object` works like
> before when __toArray() isn't implemented by an object. As nobody should
> have implemented __toArray() because it's a reserved name for magic
> methods, we should be fine.

The operation in question, when seen by its signature, is:

    (array) :: object FieldTypes -> Map String FieldTypes

The proposed RFC changes this to (pardon the weird union type: my type-fu is not that advanced):

    (array) :: (FieldTypes|IO ToArrayTypes a) => object a -> Map String a

This changes the return type of a very much pure function (even makes it non-pure: fun), and is a very, very, very clear BC break.
  108390
February 4, 2020 15:01 chasepeeler@gmail.com (Chase Peeler)
On Tue, Feb 4, 2020 at 9:23 AM Marco Pivetta <ocramius@gmail.com> wrote:

> The operation in question, when seen by its signature, is:
>
> (array) :: object FieldTypes -> Map String FieldTypes
>
> The proposed RFC changes this to (pardon the weird union type: my type-fu
> is not that advanced):
>
> (array) :: (FieldTypes|IO ToArrayTypes a) => object a -> Map String a
>
> This changes the return type of a very much pure function (even makes it
> non-pure: fun), and is a very, very, very clear BC break.

I think we all know that I'm very big on avoiding BC breaks. I personally don't see this as a BC break, though. At least not one with PHP. Right now, behavior when casting things to an array is like so:

    (array)$scalar ==> [$scalar]
    (array)$array ==> $array
    (array)$object ==> [$prop1=>$val1, $prop2=>$val2, ...]

So, assuming that right now I have the code:

    $x = new SomeThirdPartyArrayLikeObject();
    //stuff
    $y = (array)$x; //$y ==> [$prop1=>$val1, $prop2 => $val2, ...]

In a future version, in order to make that library more array-like, the following is added:

    public function __toArray(){
        return [$this];
    }

Based on that, I'd argue that the BC break is with the library, not PHP. If the __toArray function is not implemented on that class, then nothing changes. The only way you'd get BC breaks with PHP itself is if core (and, arguably extension) classes started behaving differently when cast to an array.

I'm personally in favor of anything that is going to allow us to create array-like objects that can be treated like arrays. I personally hate having to write:

    if(is_object($var)){
        $x = [$var];
    } else {
        $x = (array)$var;
    }

Now, the other question is whether we do it with a magic method, like __toArray() or an interface. I personally like magic methods, but, in the end I'm ambivalent on that.

-- 
Chase Peeler
chasepeeler@gmail.com
  108384
February 4, 2020 14:13 kontakt@beberlei.de (Benjamin Eberlei)
On Tue, Feb 4, 2020 at 2:43 PM Marco Pivetta <ocramius@gmail.com> wrote:

> It is a BC break because it changes the semantic of `(array) $object`: the
> operation is no longer stable between two versions of the language.
>
> Code relying on `(array)` behaviour (stable) requires additional
> reflection wrappers to check if `$object` is **not** implementing
> `__toArray`, and then the operation can be safely used (or an exception is
> to be thrown).
>
> You can most certainly make a new operation, such as `(toArray) $object`:
> the cast operator is new and isn't changing any existing behaviour.

I believe the definition of BC break is "can an existing code-base run on the new PHP version without changes" and here the answer is yes, since __toArray fn's are not used by any existing code bases.

What you refer to is, can existing code work with new code using this feature, and then the answer is indeed no. But the same was true for example for typed properties, which also requires additional handling in most meta programming libraries, and many other new features that existing libraries needed to adapt to: namespaces, scalar type hints, return types, ....
  108392
February 4, 2020 15:19 internals@lists.php.net ("Levi Morrison via internals")
Sorry if it's been said in the discussion so far, but I do not see why
`print_r` should convert anything to an array. It accepts multiple
kinds of types including strings, numbers, and so on, and I think
adding this behavior to `print_r` is a different thing than wanting a
standard way for objects to be converted into arrays.

And on that note, what is the motivation for wanting a magic method
for converting an object into an array? Why not make it an explicit
operation? I do not see the point of the magic here, except _maybe_
for adding it to `ArrayObject` or something to allow it to work with
the array_* functions that work without references, in which case I
think this RFC is the wrong approach as it is inadequate for
that purpose.
  108394
February 4, 2020 16:04 stevenwadejr@gmail.com (Steven Wade)
> Sorry if it's been said in the discussion so far, but I do not see why
> `print_r` should convert anything to an array. It accepts multiple
> kinds of types including strings, numbers, and so on, and I think
> adding this behavior to `print_r` is a different thing than wanting a
> standard way for objects to be converted into arrays.
You’re right, that’s my bad. I swore I tested print_r() with a class and __toString() and it cast it, but I just ran it again and it just outputs the object. The goal would be to have __toArray() behave on arrays like __toString() does on strings. I’ll update the RFC to remove the reference to print_r(). Thanks!
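
To make the observed behaviour concrete, here is a small sketch of how `__toString()` behaves today; the `Name` class is invented for the example. String contexts such as concatenation call the method, while print_r() accepts any type and simply dumps the object.

```php
<?php
// Hypothetical class, used only for illustration.
class Name
{
    private $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    public function __toString(): string
    {
        return $this->value;
    }
}

$name = new Name('Ada');

// A string context triggers __toString().
$greeting = 'Hello ' . $name;

// print_r() does not need a string, so no conversion happens:
// the second argument makes it return the object dump instead of printing it.
$dumped = print_r($name, true);

echo $greeting, "\n", $dumped;
```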
> And on that note, what is the motivation for wanting a magic method
> for converting an object into an array? Why not make it an explicit
> operation? I do not see the point of the magic here, except _maybe_
> for adding it to `ArrayObject` or something to allow it to work with
> the array_* functions that work without references, in which case I
> think we think this RFC is the wrong approach as it is inadequate for
> that purpose.
PHP’s pretty magical already, so adding another magic method isn’t out of the question and would keep things in line with how some things are done already. The idea for having magical casting is to make it simpler in userland for general array behavior when reading or looping.
  108395
February 4, 2020 17:09 chasepeeler@gmail.com (Chase Peeler)
On Tue, Feb 4, 2020 at 11:04 AM Steven Wade <stevenwadejr@gmail.com> wrote:

> > And on that note, what is the motivation for wanting a magic method
> > for converting an object into an array? Why not make it an explicit
> > operation? I do not see the point of the magic here, except _maybe_
> > for adding it to `ArrayObject` or something to allow it to work with
> > the array_* functions that work without references, in which case I
> > think we think this RFC is the wrong approach as it is inadequate for
> > that purpose.

I think the motivation is exactly what you said. Allowing developers more control over how the object is treated when cast to an array - which would include when it is passed into an array_* function.

Here is a use-case:

    if(is_object($arrayOrObject)){
        $a = array_map($callback,$arrayOrObject->toArray());
    } else {
        $a = array_map($callback,$arrayOrObject);
    }

becomes

    $a = array_map($callback,$arrayOrObject);

I'm not making an argument one way or the other for whether the above is justification, but it does at least allow the above simplification of code.

-- 
Chase Peeler
chasepeeler@gmail.com
  108396
February 4, 2020 17:13 stevenwadejr@gmail.com (Steven Wade)
> I think the motivation is exactly what you said. Allowing developers more control over how the object is treated when casted to an array - which would include when it is passed into an array_* function.
I couldn’t (and didn’t) have said it better myself. My motivation is really just to give developers more control over their code and allow them to have cleaner code.
  108375
February 4, 2020 13:33 kontakt@beberlei.de (Benjamin Eberlei)
On Tue, Feb 4, 2020 at 2:03 PM Steven Wade <stevenwadejr@gmail.com> wrote:

> Hi all,
>
> I’d like to officially open my __toArray() RFC
> <https://wiki.php.net/rfc/to-array> up to discussion. I’ve delayed
> changing the status until I had more time to respond to the discussion, but
> since it’s been brought up again <https://externals.io/message/108351>, I
> figured now is the best time.

I am open to the idea of having __toArray. I just have a few questions about the RFC details.

1. print_r($object) would somehow call __toArray you say. Why would it cause a cast when nothing else is cast? I would prefer print_r((array) $person);

2. In the parameter and return type examples you should add declare(strict_types=0); so it's clear this only works in weak mode.

3. The weak point of this proposal is the by reference handling for sort et al. Counterpoint: if you pass a variable to preg_match, then matches gets converted from anything to array, so I believe by reference casting should change the original value (https://3v4l.org/XUJ5m). This is weird, but consistent.

Cheers,
  108377
February 4, 2020 13:39 stevenwadejr@gmail.com (Steven Wade)
> I am open to the idea of having __toArray. I just have a few questions about the RFC details.
>
> 1. print_r($object) would somehow call __toArray you say. Why would it cause a cast when nothing else is cast? I would prefer print_r((array) $person);
Originally I intended the proposal to be specific to user casting like you suggested but when writing the RFC I decided to pursue making the magic method and casting more in line with the behavior of __toString(). For example, when an object implements the __toString() method and you simply “echo $obj” PHP automatically casts the object for you.
> 2. In the parameter and return type examples you should add declare(strict_types=0); so it's clear this only works in weak mode.
I think this is a good idea. I hadn’t looked into strict_types and how it handles strings and their casting. Thanks!
> 3. The weak point of this proposal is the by reference handling for sort et al. Counterpoint: if you pass a variable to preg_match, then matches gets converted from anything to array, so i believe by reference casting should change the original value (https://3v4l.org/XUJ5m). This is weird, but consistent.
The proposal states that “array functions that operate on an array by reference such as sort or shuffle will not work on an object implementing __toArray() under this proposal”, and IMO that is consistent with other magical casting behaviors and I wouldn’t expect a class implementing __toArray() to be able to be written or referenced like a traditional array.
  108380
February 4, 2020 13:43 kontakt@beberlei.de (Benjamin Eberlei)
On Tue, Feb 4, 2020 at 2:39 PM Steven Wade <stevenwadejr@gmail.com> wrote:

> Originally I intended the proposal to be specific to user casting like you
> suggested but when writing the RFC I decided to pursue making the magic
> method and casting more in line with the behavior of __toString(). For
> example, when an object implements the __toString() method and you simply
> “echo $obj” PHP automatically casts the object for you.

Yes, I see that it auto casts when you typehint for "array". The print_r argument is not array but mixed, so it should not cast the object to array there.

> The proposal states that “array functions that operate on an array by
> reference such as sort or shuffle will not work on an object
> implementing __toArray() under this proposal”, and IMO that is consistent
> with other magical casting behaviors and I wouldn’t expect a class
> implementing __toArray() to be able to be written or referenced like a
> traditional array.

Passing an object with toString by reference will change the original variable: https://3v4l.org/77lov
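
A small sketch of the by-reference behaviour being discussed, written for this thread rather than copied from the linked snippets (their exact contents are not reproduced here). The `Label` class and `shout()` function are hypothetical. In weak mode the object is coerced to a string for the `string &$text` parameter, and because the parameter is a reference, the caller's variable ends up holding a string; preg_match() overwrites its by-reference `$matches` argument in a comparable way.

```php
<?php
// Hypothetical class, used only for illustration.
class Label
{
    public function __toString(): string
    {
        return 'label';
    }
}

// Weak mode (no strict_types declaration): the object is coerced to a
// string, and the by-reference parameter writes back to the caller.
function shout(string &$text): void
{
    $text = strtoupper($text);
}

$value = new Label();
shout($value);
var_dump($value); // no longer a Label instance

// preg_match() replaces whatever $matches held with an array of matches.
$matches = new stdClass();
preg_match('/(\d+)/', 'order 42', $matches);
var_dump($matches);
```

In both cases the original object is not modified; the variable is simply rebound to a new value of a different type.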
  108382
February 4, 2020 13:47 stevenwadejr@gmail.com (Steven Wade)
> Passing an object with toString by reference will change the original variable: https://3v4l.org/77lov

Ah, I think I see what you mean. PHP’s making a copy of the string variable once it’s cast, so it’s no longer referencing the original object that implemented the __toString() method. I imagined the __toArray() would function the same. Once the cast is called on the object, it’s a new array variable and can be acted upon as such.

What I was trying to get at was that the magic method does not give the ability to write to the original object. If that’s not clear then I need to update the RFC.
  108412
February 6, 2020 12:05 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Feb 4, 2020 at 2:03 PM Steven Wade <stevenwadejr@gmail.com> wrote:

> Hi all,
>
> I’d like to officially open my __toArray() RFC
> <https://wiki.php.net/rfc/to-array> up to discussion. I’ve delayed
> changing the status until I had more time to respond to the discussion, but
> since it’s been brought up again <https://externals.io/message/108351>, I
> figured now is the best time.

One of the things that stand out to me in this RFC is that once you set strict_types=1 (which surely we all do :P), not a lot is left of it.

While there are quite a few places where objects can be implicitly converted to strings (say echo, concatenation, etc), there are not a lot of places where objects can be implicitly converted to array. A quick grep has around 500 string conversions in the codebase, and 20 array conversions.

Nikita
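
As a point of comparison, a short sketch of how the existing `__toString()` conversion is split by strict_types today; the `Title` class and `headline()` function are invented for the example. Operator-level string contexts keep working, while passing the object to a `string` parameter is rejected, which is presumably the category most of the proposed array conversions would fall into.

```php
<?php
declare(strict_types=1);

// Hypothetical class and function, used only for illustration.
class Title
{
    public function __toString(): string
    {
        return 'RFC';
    }
}

function headline(string $text): string
{
    return strtoupper($text);
}

$title = new Title();

// Concatenation is a string context and is unaffected by strict_types.
$subject = 'Re: ' . $title;

// A typed parameter does not coerce the object in strict mode.
try {
    headline($title);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($subject, $rejected);
```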
  108416
February 6, 2020 19:03 mike@newclarity.net (Mike Schinkel)
> On Feb 6, 2020 at 7:06 AM, Nikita Popov <nikita.ppv@gmail.com> wrote:
>
> While there are quite a few places where objects can be implicitly
> converted to strings (say echo, concatenation, etc), there are not
> a lot of places where objects can be implicitly converted to array.
> A quick grep has around 500 string conversions in the codebase,
> and 20 array conversions.

Out of curiosity, what did you grep for in both cases? Were those string conversions done with casting to string?

-Mike
  108420
February 6, 2020 22:33 php.lists@allenjb.me.uk (AllenJB)
What happens if I perform array-like operations on an object 
implementing __toArray() from this RFC?

For example:

$obj["foo"] = 42;

or:

$bar = $obj["foo"];

(Imagine these examples are perhaps within a loop iterating a collection 
of objects)

Which operations do and do not work? As they'd be operating on a new
from-cast array, I assume they would not be able to affect the original
object (properties), but is this what people expect?

What I'm trying to get at here is: Does this RFC create opportunities 
for bugs arising because objects are accidentally treated as arrays and 
users would no longer receive any kind of warning or error from such 
code when they would have in the past?
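
For reference, a minimal sketch of what current PHP (7.0 and later) does with these operations on a plain object that does not implement ArrayAccess; the `Settings` class is made up for the example. The array-style write fails with an Error, and writes to an explicit cast only touch the copy.

```php
<?php
// Hypothetical class, used only for illustration.
class Settings
{
    public $foo = 1;
}

$obj = new Settings();

// Array-style access on a plain object is an Error today.
try {
    $obj['foo'] = 42;
    $failed = false;
} catch (Error $e) {
    $failed = true;
}

// An explicit cast produces a separate array; changing it leaves
// the object's property untouched.
$copy = (array) $obj;
$copy['foo'] = 42;

var_dump($failed, $obj->foo, $copy['foo']);
```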

(Aside: Could this get even more interesting where a loop involving 
references being (ab)used is involved? Experienced developers know you 
should avoid references, but many newer developers use them - either 
through misunderstanding the language or copying others and not actually 
understanding what they're doing. While, anecdotally from helping others 
in various channels, this occurs less than it did in the PHP 5 era, it 
does still occur.)

If so, I believe this is obviously bad. If not, I believe this is also
bad because the above examples do not work as someone (particularly newer
users) might expect. The language is creating something which sometimes,
maybe, acts like an array, but not always.


The other problem I have with this type of magic on objects is that 
people don't usually mean "to array", they mean "to array for specific 
purpose" - eg. "to array for database record" or "to array for API 
output". Using a magic method for this obfuscates the purpose of the 
returned array and could lead to problems, such as accidental data 
leakage, from cross-usage.
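
One way to picture the alternative is a pair of explicitly named methods, sketched below. Every name here (`User`, `toDatabaseRecord()`, `toApiOutput()`) is hypothetical and only meant to show how the purpose of each array stays visible at the call site.

```php
<?php
// Hypothetical class: all names are made up for illustration.
class User
{
    private $id;
    private $email;
    private $passwordHash;

    public function __construct(int $id, string $email, string $passwordHash)
    {
        $this->id = $id;
        $this->email = $email;
        $this->passwordHash = $passwordHash;
    }

    // Everything the persistence layer needs, sensitive fields included.
    public function toDatabaseRecord(): array
    {
        return [
            'id' => $this->id,
            'email' => $this->email,
            'password_hash' => $this->passwordHash,
        ];
    }

    // Only the fields that are safe to expose to API consumers.
    public function toApiOutput(): array
    {
        return [
            'id' => $this->id,
            'email' => $this->email,
        ];
    }
}

$user = new User(7, 'ada@example.com', 'not-a-real-hash');
var_dump($user->toApiOutput());
```

With a single magic conversion, one of these two shapes would have to be chosen as "the" array form, and code written for the other purpose could receive it by accident.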

AllenJB

