Re: Proposal for a RFC

  105686
May 13, 2019 13:46 stevenwadejr@gmail.com (Steven Wade)
> On May 4, 2019, at 10:58 AM, Steven Wade <stevenwadejr@gmail.com> wrote:
>
> Hi Internals team!
>
> I have an idea for a feature that I'd love to see in the language one day and wanted to run the idea by you all.
>
> The idea is to add a new magic method "__toArray()" that would allow a developer to specify how a class is cast to an array. The idea is the same mentality of __toString(), but, for arrays.
>
> I would personally love this feature and those I've run it by were also excited by the idea. So I'm soliciting feedback in hopes that things go well and I can officially write the RFC. As for implementation, Sara Golemon is awesome and while chatting a few months back, knocked out a proof-of-concept implementation <https://github.com/sgolemon/php-src/tree/experimental.toarray>. There's still work to be done if this proposal gets to the RFC phase, but again, just gauging interest here.
>
> I appreciate any feedback you all can provide.
>
> Thanks,
>
> - Steven Wade
Hi all, I wanted to re-ping the list to see if there is any more feedback on this proposal? Any technical concerns or true BC changes? This feature wouldn't be as exciting as the others in 7.4, but I think it'd be a nice little helper, and the community feedback I've received from developers has been positive, so I'd like to keep the conversation going. -- Steven Wade stevenwadejr@gmail.com
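To make the proposal concrete, here is a minimal userland sketch of the behaviour under discussion. Released PHP versions do not call `__toArray()` on an `(array)` cast, so the cast is emulated with a helper. `Person` and `cast_to_array()` are hypothetical names invented for this illustration; they are not taken from the proposal or from the proof-of-concept branch.

```php
<?php
declare(strict_types=1);

// Hypothetical class: the name and properties are illustrative only.
final class Person
{
    private string $firstName;
    private string $lastName;

    public function __construct(string $firstName, string $lastName)
    {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
    }

    // Under the proposal, the engine would call this on (array) $person.
    // In released PHP versions it is just an ordinary method.
    public function __toArray(): array
    {
        return ['firstName' => $this->firstName, 'lastName' => $this->lastName];
    }
}

// Hypothetical helper emulating the proposed cast semantics in userland:
// use __toArray() when it is declared, otherwise fall back to the plain cast.
function cast_to_array($value): array
{
    if (is_object($value) && method_exists($value, '__toArray')) {
        return $value->__toArray();
    }

    return (array) $value;
}

$person = new Person('Ada', 'Lovelace');

$proposed = cast_to_array($person); // what the proposal would return
$current = (array) $person;         // what the cast returns today
```

With today's semantics, `$current` holds the private properties under mangled keys such as `"\0Person\0firstName"`, whereas `$proposed` holds whatever shape the class author chose.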
  105688
May 13, 2019 14:17 rowan.collins@gmail.com (Rowan Collins)
On Mon, 13 May 2019 at 14:46, Steven Wade <stevenwadejr@gmail.com> wrote:

> Hi all, I wanted to re-ping the list to see if there is any more feedback on this proposal? Any technical concerns or true BC changes?
I'm personally unconvinced of the value of this, and would probably propose it was blocked by coding standards in my team if it was added, because its meaning is so ambiguous. I actually see quite a lot of classes with normal methods called things like "toArray", and my comment is always "to *what* array?" Most objects do not have a single "natural"/"canonical" array representation, and such a transform is usually actually used as part of some particular helper or code pattern - e.g. an intermediate form for serializing to XML/JSON, or a compatibility-wrapper for legacy code. There's nearly always a better name for the method that properly indicates its intent. As a thought experiment, imagine a similar method which allowed you to overload (object)$foo. Although (array)$foo tells you slightly more than that, I'm not convinced it tells you enough that you're not just hiding meaning behind cute syntax. JsonSerializable actually suffers from similar problems, and is IMO useful only because it's automatically recursive. I presume the proposed mechanism would not be, i.e. return [$foo] would not be interpreted as return [(array)$foo]. Regards, -- Rowan Collins [IMSoP]
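The recursion point can be illustrated with existing PHP behaviour. In this sketch (class names `Post` and `Tag` are hypothetical), `json_encode()` calls `jsonSerialize()` on nested objects on its own, while the array returned by the method itself still contains the nested object untouched.

```php
<?php
declare(strict_types=1);

// Hypothetical classes, used only to show how json_encode() recurses.
final class Tag implements JsonSerializable
{
    private string $label;

    public function __construct(string $label)
    {
        $this->label = $label;
    }

    public function jsonSerialize(): array
    {
        return ['label' => $this->label];
    }
}

final class Post implements JsonSerializable
{
    private string $title;
    private Tag $tag;

    public function __construct(string $title, Tag $tag)
    {
        $this->title = $title;
        $this->tag = $tag;
    }

    public function jsonSerialize(): array
    {
        // The nested Tag is returned as an object, not converted by hand.
        return ['title' => $this->title, 'tag' => $this->tag];
    }
}

$post = new Post('Hello', new Tag('php'));

// json_encode() walks the returned array and serializes the Tag as well.
$json = json_encode($post);

// Calling the method directly is not recursive: 'tag' is still a Tag object.
$shallow = $post->jsonSerialize();
```

A non-recursive `__toArray()` would behave like `$shallow` here, which is the distinction drawn above.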
  105692
May 13, 2019 18:55 stevenwadejr@gmail.com (Steven Wade)
> I'm personally unconvinced of the value of this, and would probably propose it was blocked by coding standards in my team if it was added, because its meaning is so ambiguous.
That's perfectly reasonable. Do you also block use of casting to a string with (string) $foo as well? I ask because this proposal is simply on par with the idea behind string casting for objects. I don't expect that everyone would use custom array casting via __toArray(), but for those that would like to have that control and ease of use, it'd be valuable for them.
> I actually see quite a lot of classes with normal methods called things like "toArray", and my comment is always "to *what* array?" Most objects do not have a single "natural"/"canonical" array representation
I think the same could be said about "__toString()". But with that, some classes can be boiled down to a single representation, such as the Ramsey\Uuid <https://github.com/ramsey/uuid/> package. The same is with arrays. You can have a single entity such as person to where its array representation can be first name, last name, age, race, gender, email, etc..., or you can have a collection of items, to where in that representation as an array, you have control over what information is returned and what isn't.
> As a thought experiment, imagine a similar method which allowed you to overload (object)$foo. Although (array)$foo tells you slightly more than that, I'm not convinced it tells you enough that you're not just hiding meaning behind cute syntax.
I'm confused by the example, as there's no real need to overload casting to an object as a class is already an object. Whereas, a class is not already an array. It's not about "cute syntax", it's honestly about providing a simple clutter free helper for developers to take control over how their classes are transformed to array representations.
> JsonSerializable actually suffers from similar problems, and is IMO useful only because it's automatically recursive. I presume the proposed mechanism would not be, i.e. return [$foo] would not be interpreted as return [(array)$foo].
You bring up a good point. Could you for a moment pretend like you're behind this proposal and expand upon this question? If PHP were to have a __toArray() method, would you see it as being recursive? In your opinion, how should/would it react? I know not everyone will want or use this feature, but for those who would, it'd be a great addition IMO. So my goal of this initial thread was to see, "what if we had this feature, what would we want it to do?", and go from there. -- Steven Wade stevenwadejr@gmail.com
  105694
May 14, 2019 10:11 rowan.collins@gmail.com (Rowan Collins)
On Mon, 13 May 2019 at 19:55, Steven Wade <stevenwadejr@gmail.com> wrote:

> I'm personally unconvinced of the value of this, and would probably propose it was blocked by coding standards in my team if it was added, because its meaning is so ambiguous.
>
> That's perfectly reasonable. Do you also block use of casting to a string with (string) $foo as well? I ask because this proposal is simply on par with the idea behind string casting for objects.
I have seen valid uses of __toString(), but I would certainly approach it cautiously. For a complex object, it's not at all obvious if (string)$foo will give you a debug representation, a JSON serialisation, an HTML rendering, etc.
> Most objects do not have a single "natural"/"canonical" array representation
>
> I think the same could be said about "__toString()". But with that, some classes can be boiled down to a single representation, such as the Ramsey\Uuid <https://github.com/ramsey/uuid/> package.
Indeed it could. I think the difference is that a "one-dimensional" object, like a UUID, probably does lend itself to a single canonical string representation. You wouldn't expect it to return XML, or JSON, or any other string format, so (string)$uuid is fairly unambiguous.
> The same is with arrays. You can have a single entity such as person to where its array representation can be first name, last name, age, race, gender, email, etc...,
This is exactly the kind of place I would *not* want a simple toArray() function. Should (array)$person (or $person->toArray()) return ['firstName'=>'Rowan', 'lastName'=>'Collins'], or ['name' => 'Rowan Collins'], or ['name' => ['Rowan', 'Collins']]? What date format should 'dateOfBirth' be formatted to? If 'address' is an object, should that be converted to an array as well, and into what format? The answers to these questions are going to be different in different contexts, and it doesn't make sense for the Person class to determine the "one true array representation" - the only canonical representation is the object itself.
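One way to picture the alternative argued for here is a class that exposes several intent-revealing conversions instead of a single array form. This is only a sketch: the class, the method names `toApiPayload()` and `toLogContext()`, and the chosen formats are all hypothetical. Typed properties require PHP 7.4 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical class: names, fields and formats are illustrative only.
final class Person
{
    private string $firstName;
    private string $lastName;
    private DateTimeImmutable $dateOfBirth;

    public function __construct(string $firstName, string $lastName, DateTimeImmutable $dateOfBirth)
    {
        $this->firstName = $firstName;
        $this->lastName = $lastName;
        $this->dateOfBirth = $dateOfBirth;
    }

    // Shape for a hypothetical JSON API: split name, Y-m-d date.
    public function toApiPayload(): array
    {
        return [
            'firstName' => $this->firstName,
            'lastName' => $this->lastName,
            'dateOfBirth' => $this->dateOfBirth->format('Y-m-d'),
        ];
    }

    // Shape for a hypothetical log context: one display name, no birth date.
    public function toLogContext(): array
    {
        return ['name' => $this->firstName . ' ' . $this->lastName];
    }
}

$person = new Person('Ada', 'Lovelace', new DateTimeImmutable('1815-12-10'));

$api = $person->toApiPayload();
$log = $person->toLogContext();
```

Both results are arrays built from the same object, yet neither is obviously the one that `(array) $person` should produce.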
> or you can have a collection of items, to where in that representation as an array, you have control over what information is returned and what isn't.
This is a more reasonable case; given that objects can't completely mimic arrays, I can see value in a custom List class implementing an array cast as a quick "back door" for using existing array functionality.
> As a thought experiment, imagine a similar method which allowed you to overload (object)$foo. Although (array)$foo tells you slightly more than that, I'm not convinced it tells you enough that you're not just hiding meaning behind cute syntax.
>
> I'm confused by the example, as there's no real need to overload casting to an object as a class is already an object. Whereas, a class is not already an array.
Sure, it's extra vague because "return $this" would be a valid response, but imagine there was something other than objects - structs, or custom resources, or whatever - and there was special syntax to say "give me an object based on this thing". The immediate question would surely be "what object? what are you using it for?" I feel the same way about "give me an array based on this object" - it tells me very little about what you actually want, and why.
> It's not about "cute syntax", it's honestly about providing a simple clutter free helper for developers to take control over how their classes are transformed to array representations.
If it's not recursive, it's just syntactic sugar - which can be fine, if it serves a common use case, but it adds an extra "trick" that readers need to know about. It doesn't let you do anything you can't already - (array)$foo would just be a funny way of spelling $foo->__toArray()
> JsonSerializable actually suffers from similar problems, and is IMO useful only because it's automatically recursive. I presume the proposed mechanism would not be, i.e. return [$foo] would not be interpreted as return [(array)$foo].
>
> You bring up a good point. Could you for a moment pretend like you're behind this proposal and expand upon this question? If PHP were to have a __toArray() method, would you see it as being recursive? In your opinion, how should/would it react?
A recursive method would certainly have more value, because it actually does something more than translate one syntax to another. On the other hand, the use case that comes to mind is serialization, and we already have more specific methods and systems for that. I guess that's what it comes down to, what *specific* use cases would this feature be intended to help with? Is there some code of your own that inspired you to propose it, or something you've seen publicly that would benefit from it? Regards, -- Rowan Collins [IMSoP]
  105906
June 13, 2019 14:23 stevenwadejr@gmail.com (Steven Wade)
Apologies for the super late response:

> A recursive method would certainly have more value, because it actually does something more than translate one syntax to another. On the other hand, the use case that comes to mind is serialization, and we already have more specific methods and systems for that.
How could this new magic method be recursive? If it only works if you manually declare __toArray() in your class, wouldn't you then as the user be in charge of casting anything manually in your method implementation?
> I guess that's what it comes down to, what *specific* use cases would this feature be intended to help with? Is there some code of your own that inspired you to propose it, or something you've seen publicly that would benefit from it?
Originally, it was inspired by seeing Laravel's use of Arrayable as an interface and if something implements that, calling that class' `toArray()` method, and wishing that was built in so that frameworks didn't re-invent the wheel every time. As far as in my code, collections being cast as an array easily would be nice. Models with relationships, being able to implement that cast and control how your model and its children are (or aren't) represented. That's useful for returning an array in a controller for an API, or for simply adding context to a log message. IMO, the point is, it's another tool in the developer's arsenal that they can use when they see fit. Not everyone will use it and not everyone will see the benefit of it, and that's ok, but for those that would and could, __toArray() is for them (and me). -- Steven Wade stevenwadejr@gmail.com
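The pattern described here, an interface that signals "this object can be turned into an array" plus a check for it, can be sketched without any framework. The `Arrayable` interface below is a hypothetical local stand-in modelled on that idea; it is not Laravel's own contract, and `TagCollection` and `arrayable_to_array()` are likewise invented for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical local interface; it does not depend on Laravel.
interface Arrayable
{
    public function toArray(): array;
}

// Hypothetical collection that controls its own array representation.
final class TagCollection implements Arrayable
{
    /** @var string[] */
    private array $tags;

    public function __construct(string ...$tags)
    {
        $this->tags = $tags;
    }

    public function toArray(): array
    {
        return $this->tags;
    }
}

// Hypothetical helper: convert explicitly when the contract is implemented,
// otherwise return the value untouched.
function arrayable_to_array($value)
{
    return $value instanceof Arrayable ? $value->toArray() : $value;
}

$tags = new TagCollection('php', 'rfc');

$result = arrayable_to_array($tags);

// Once converted, the existing array functions apply directly.
$joined = implode(', ', $result);
```

The proposal would move the `instanceof` check and method call into the engine, so that `(array) $tags` alone would yield the same result.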
  105909
June 13, 2019 14:35 ocramius@gmail.com (Marco Pivetta)
On Thu, Jun 13, 2019 at 4:23 PM Steven Wade <stevenwadejr@gmail.com> wrote:

> > I guess that's what it comes down to, what *specific* use cases would this feature be intended to help with? Is there some code of your own that inspired you to propose it, or something you've seen publicly that would benefit from it?
>
> Originally, it was inspired by seeing Laravel's use of Arrayable as an interface and if something implements that, calling that class' `toArray()` method, and wishing that was built in so that frameworks didn't re-invent the wheel every time.
Interestingly, my work day today is spent mostly removing this kind of behavior from a codebase riddled by it, replacing it with explicit conversions where needed. Marco Pivetta http://twitter.com/Ocramius http://ocramius.github.com/
  105690
May 13, 2019 17:51 ocramius@gmail.com (Marco Pivetta)
Hi Steven,

On Mon, 13 May 2019, 15:46 Steven Wade, <stevenwadejr@gmail.com> wrote:

> > On May 4, 2019, at 10:58 AM, Steven Wade <stevenwadejr@gmail.com> wrote:
> >
> > Hi Internals team!
> >
> > I have an idea for a feature that I'd love to see in the language one day and wanted to run the idea by you all.
> >
> > The idea is to add a new magic method "__toArray()" that would allow a developer to specify how a class is cast to an array. The idea is the same mentality of __toString(), but, for arrays.
> >
> > I would personally love this feature and those I've run it by were also excited by the idea. So I'm soliciting feedback in hopes that things go well and I can officially write the RFC. As for implementation, Sara Golemon is awesome and while chatting a few months back, knocked out a proof-of-concept implementation <https://github.com/sgolemon/php-src/tree/experimental.toarray>. There's still work to be done if this proposal gets to the RFC phase, but again, just gauging interest here.
> >
> > I appreciate any feedback you all can provide.
> >
> > Thanks,
> >
> > - Steven Wade
>
> Hi all, I wanted to re-ping the list to see if there is any more feedback on this proposal? Any technical concerns or true BC changes?
>
> This feature wouldn't be as exciting as the others in 7.4, but I think it'd be a nice little helper, and the community feedback I've received from developers has been positive, so I'd like to keep the conversation going.
I don't think any of the discussion points mentioned above is resolved: a heavy BC break on the `(array)` cast instead of introducing a clear `SomeArrayableInterfaceType#toArray()` (to be explicitly called) is a no-go from my end, as no clear value is added besides more language complexity, and more mixed cast is encouraged. Greets, Marco
  105691
May 13, 2019 18:43 stevenwadejr@gmail.com (Steven Wade)
> I don't think any of the discussion points mentioned above is resolved
I believe that most of the discussion points mentioned have been addressed and resolved. Regarding the concern that an array cast is the only operation capable of exposing object state, the same functionality can be achieved with reflection - as demonstrated here: https://3v4l.org/Dh3PO <https://3v4l.org/Dh3PO>.
> a heavy BC break on the `(array)` cast instead of introducing a clear `SomeArrayableInterfaceType#toArray()` (to be explicitly called) is a no-go from my end
As far as backwards compatibility breaks, I don't believe that's an issue here. My proposal and subsequent conversations state that __toArray() is only called if it is explicitly declared and implemented by a developer, otherwise casting a class to an array behaves exactly as it does today. The PHP documentation on magic methods <https://www.php.net/manual/en/language.oop5.magic.php> even specifically mentions: "PHP reserves all function names starting with __ as magical" and cautions "it is recommended that you do not use function names with __ in PHP unless you want some documented magic functionality." So the language itself reserves the double underscore methods for itself and future versions.
> as no clear value is added besides more language complexity, and more mixed cast is encouraged.
I don't think supporting another magic method and giving developers control over how an object they create is handled when cast to an array adds complexity. It's on par with the already existing __toString() method. Many libraries use this already to determine how their objects are handled as strings, and to make it easy to convert between types without needing to explicitly implement interfaces: such as Ramsey\Uuid <https://github.com/ramsey/uuid>, and League\Uri <https://github.com/thephpleague/uri-components>. The proposal to add a __toArray() isn't meant to add language complexity, it's meant to add an optional helper for developers to simplify converting their objects to arrays, and doing so with magic methods isn't without precedent and is more in line with PHP. -- Steven Wade stevenwadejr@gmail.com
  105693
May 13, 2019 18:55 ocramius@gmail.com (Marco Pivetta)
On Mon, 13 May 2019, 20:43 Steven Wade, <stevenwadejr@gmail.com> wrote:

> I don't think any of the discussion points mentioned above is resolved
>
> I believe that most of the discussion points mentioned have been addressed and resolved. Regarding the concern that an array cast is the only operation capable of exposing object state, the same functionality can be achieved with reflection - as demonstrated here: https://3v4l.org/Dh3PO.
That example misses the fact that `ReflectionProperty#getValue()` triggers property access guards, which is an extremely important detail for unset and typed properties. See https://3v4l.org/BtDs5 for an example of why `(array)` is vital in reflection/serialisation layers/libraries.
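The difference can be reproduced with a small self-contained sketch. It is not the code behind either 3v4l link; the `Account` class is hypothetical, and typed properties require PHP 7.4 or later. The `(array)` cast returns the initialized state without going through property access, while `ReflectionProperty::getValue()` throws when it reads an uninitialized typed property.

```php
<?php
declare(strict_types=1);

// Hypothetical class with one typed property that is never initialized.
final class Account
{
    private int $id;
    private string $name = 'demo';
}

$account = new Account();

// The cast does not trigger property access: the uninitialized typed
// property is simply absent from the resulting array.
$state = (array) $account;

$property = new ReflectionProperty(Account::class, 'id');
if (PHP_VERSION_ID < 80100) {
    $property->setAccessible(true); // only needed before PHP 8.1
}

// Reading through reflection behaves like a normal property read and
// fails for the uninitialized typed property.
$error = null;
try {
    $property->getValue($account);
} catch (Error $e) {
    $error = $e->getMessage();
}
```

A library that inspects object state this way relies on `(array)` returning the raw properties, which is what a user-defined `__toArray()` would replace.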
> a heavy BC break on the `(array)` cast instead of introducing a clear `SomeArrayableInterfaceType#toArray()` (to be explicitly called) is a no-go from my end
>
> As far as backwards compatibility breaks, I don't believe that's an issue here. My proposal and subsequent conversations state that __toArray() is only called if it is explicitly declared and implemented by a developer, otherwise casting a class to an array behaves exactly as it does today. The PHP documentation on magic methods <https://www.php.net/manual/en/language.oop5.magic.php> even specifically mentions: "PHP reserves all function names starting with __ as magical" and cautions "it is recommended that you do not use function names with __ in PHP unless you want some documented magic functionality." So the language itself reserves the double underscore methods for itself and future versions.
See example above on the BC break: existing libraries would need to be adapted to throw eagerly (and reject interactions with) objects implementing `__toArray`.
> as no clear value is added besides more language complexity, and more mixed cast is encouraged.
>
> I don't think supporting another magic method and giving developers control over how an object they create is handled when cast to an array adds complexity. It's on par with the already existing __toString() method. Many libraries use this already to determine how their objects are handled as strings, and to make it easy to convert between types without needing to explicitly implement interfaces: such as Ramsey\Uuid <https://github.com/ramsey/uuid>, and League\Uri <https://github.com/thephpleague/uri-components>.
The added complexity comes from consumers: a `(string) $something` cast is problematic if `$something` is of doubtful type, while `$something->toString()` already restricts the possible types of that `$something`. The same accidental complexity comes with unsafe `(array)` casts.
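The consumer-side contrast can be sketched as follows. The interface `ConvertsToArray`, the class `Point`, and both functions are hypothetical names chosen for this illustration. The cast-based consumer accepts any value and silently produces an array, whereas the type-based consumer rejects anything that does not implement the interface.

```php
<?php
declare(strict_types=1);

// Hypothetical interface, used only for this illustration.
interface ConvertsToArray
{
    public function toArray(): array;
}

final class Point implements ConvertsToArray
{
    private int $x;
    private int $y;

    public function __construct(int $x, int $y)
    {
        $this->x = $x;
        $this->y = $y;
    }

    public function toArray(): array
    {
        return ['x' => $this->x, 'y' => $this->y];
    }
}

// Cast-based consumer: accepts anything and always returns some array.
function describeLoosely($value): array
{
    return (array) $value;
}

// Type-based consumer: the parameter type restricts what can be passed.
function describeStrictly(ConvertsToArray $value): array
{
    return $value->toArray();
}

$fromString = describeLoosely('oops'); // a string is wrapped in an array
$fromNull = describeLoosely(null);     // null becomes an empty array
$fromPoint = describeStrictly(new Point(1, 2));

// Passing the wrong type to the typed consumer fails loudly.
$rejected = false;
try {
    describeStrictly('oops');
} catch (TypeError $e) {
    $rejected = true;
}
```

The loose version hides a wrong argument behind a plausible-looking result, which is the accidental complexity described above.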
> The proposal to add a __toArray() isn't meant to add language complexity, it's meant to add an optional helper for developers to simplify converting their objects to arrays, and doing so with magic methods isn't without precedent and is more in line with PHP.
Helpers need to be helpful: this ain't, as explained above, since a clear type declaration is a more useful and introspectible way of handling type conversions, especially with the already excessively complex and bloated design of objects in the PHP language. Greets, Marco