Re: [PHP-DEV] Concept: "arrayable" pseudo type and \Arrayable interface

  107826
November 19, 2019 15:54 rowan.collins@gmail.com (Rowan Tommins)
On Tue, 19 Nov 2019 at 03:19, Mike Schinkel <mike@newclarity.net> wrote:

> Should we make decisions about future language enhancements based on conflicting and impossible to forecast predictions, or when a significant subset of PHP developers see value in a feature for their use-cases *and* others can simply ignore if they do not want to use it?
Neither. We should discuss the advantages of the feature, the potential costs, and whether other features would be even better.
> I challenge that assertion that having a huge number of magic or standard methods is the *"only"* way this provides benefit.
Not the only way it can bring *any* benefit, but the only way it can bring the *particular* benefit of eliminating the need to agree method names across code bases. I think that's where this conversation has broken down a bit, which may be my fault: I wasn't intending to argue against all the possibilities of the feature, only the specific arguments you were raising. We disagree over whether (array)$foo would be used more consistently than meaningful method names, so maybe we should leave that there, and look at some other pros and cons of the feature.
> Since conversion to array from object is a common need just this one method could provide a similar level of benefit that __toString() already provides.
What makes __toString() worthy of a special case in my mind is that there are fairly common scenarios where variables are _implicitly_ cast to string: in double-quoted strings, echo, etc. Having the language be able to automatically call a particular method in those situations is therefore more valuable. I think I'd actually be more receptive to a proposal to allow _all_ casts to be overloaded, rather than adding array as a second special case, because *implicitly* casting to array doesn't seem like it would be any more common than other types.
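A minimal sketch of the implicit string casts described above. The Greeting class is hypothetical and exists only for illustration; the point is that neither interpolation nor concatenation contains an explicit (string) cast, yet both invoke __toString().

```php
<?php
// Hypothetical class, not taken from the thread.
final class Greeting
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function __toString(): string
    {
        return "Hello, {$this->name}";
    }
}

$greeting = new Greeting('internals');

// No explicit cast appears in either expression; PHP calls __toString() itself.
$interpolated = "$greeting!";
$concatenated = $greeting . '!';
```

There is no comparably frequent construct that implicitly converts a value to an array, which is the asymmetry being argued here.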
> Yes, if your class is named Transaction then isSuccessful() probably *is* a better name than __toBool().
> But if your class name itself is isSuccessful()?
Yes, I would see that as a better example. I think operator overloading in general makes sense when the object itself can be thought of as a special-case of the primitive it's emulating operators for. So in this case, the IsSuccessful class would be "a special kind of boolean"; and the hypothetical List or Collection classes I mentioned a couple of days ago would be "a special kind of array". We even have that in PHP for built-in types: GMP objects can now be used with mathematical operators like $foo * $bar, and those operations do what you'd expect them to do on an integer or float.

Operator overloading can also be used just as a cute way of spelling something - probably most famously, C++ uses the << and >> operators for writing to and reading from streams, even though they're actually overloads of the bit-shift operator. This kind of use is, I would say, more controversial - going back to consistency, it's easier to reason about code if $foo * $bar always does some kind of multiplication than if it's been overloaded to mean "$foo is the star of $bar".

Overloading of cast operators is no different - the clearest use cases for __toString() are where the whole class basically represents a piece of text, and the more controversial are where (string)$foo is actually a cute spelling of one method on a complex class.

That's not necessarily a reason to not add a feature - any feature can be abused - but it potentially makes it harder for users to understand each other's code, and that's a cost we should at least consider.
> Let me count the ways. Here are several examples:
>
> https://gist.github.com/mikeschinkel/361bbcf44da1dac0da6afd786b6b8c3a
>
> A great example of this is the AllowedHtml class I wrote for the above gist whose sole purpose is to streamline the specification of allowed HTML using the KSES library.
This is an interesting example. On the one hand, there is only one array ever going to be produced by that class; but on the other hand, the ultimate use of it is explicitly passing to a function that expects a different type. Assuming the same behaviour as __toString, the code shown would give an error under strict_types=1, and need changing to wp_kses((array)$allowed_html).

A different interpretation would be that this is the Builder Pattern, and the target type happens to be an array rather than an object. So you might decide that the consistency you really needed is that all builders should have a build() method, and the last line of the example would become:

    $clean_html = wp_kses($_POST[ 'content' ] ?? null, $allowed_html_builder->build());

That doesn't mean the version you wrote is *wrong*, but should make us consider why one version deserves special treatment by the language and the other one doesn't.

Regards,
--
Rowan Tommins
[IMSoP]
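A rough sketch of the builder interpretation, under stated assumptions: AllowedHtmlBuilder and filter_html() are hypothetical names invented for this illustration (filter_html() stands in for a third-party function such as wp_kses() that requires a real array), and none of it is taken from the linked gist.

```php
<?php
// Hypothetical stand-in for a third-party function that expects an array.
function filter_html(string $html, array $allowed): string
{
    // Build an allow-list string such as "<a><strong>" from the array keys.
    $tags = '<' . implode('><', array_keys($allowed)) . '>';
    return strip_tags($html, $tags);
}

// Hypothetical builder whose target type happens to be an array.
final class AllowedHtmlBuilder
{
    /** @var array<string, array<string, bool>> */
    private array $tags = [];

    public function allow(string $tag, string ...$attributes): self
    {
        $this->tags[$tag] = array_fill_keys($attributes, true);
        return $this;
    }

    public function build(): array
    {
        return $this->tags;
    }
}

$builder = (new AllowedHtmlBuilder())->allow('a', 'href')->allow('strong');

// The conversion is an ordinary, explicit method call.
$clean = filter_html('<a href="/x">ok</a><em>no</em>', $builder->build());
```

The call site differs from the cast-based version only in spelling out ->build(); the interaction with the third-party function is the same.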
  107835
November 19, 2019 17:58 mike@newclarity.net (Mike Schinkel)
> On Nov 19, 2019, at 10:54 AM, Rowan Tommins <rowan.collins@gmail.com> wrote:
> Neither. We should discuss the advantages of the feature, the potential costs, and whether other features would be even better.
In concept, I agree. But AFAIK no other "better" features were proposed. Do you have an alternate to propose for __toArray()? I am all ears.
> I think that's where this conversation has broken down a bit, which may be my fault: I wasn't intending to argue against all the possibilities of the feature, only the specific arguments you were raising. We disagree over whether (array)$foo would be used more consistently than meaningful method names, so maybe we should leave that there, and look at some other pros and cons of the feature.
I appreciate you elaborating. We can certainly disagree; in both cases it is just our opinion for what might happen if a new feature is introduced, and neither of us can know for certain what the potential future holds. That said, which I think you acknowledged, it is a bit of bikeshedding on both our parts to belabor this one point any longer.
> What makes __toString() worthy of a special case in my mind is that there are fairly common scenarios where variables are _implicitly_ cast to string: in double-quoted strings, echo, etc. Having the language be able to automatically call a particular method in those situations is therefore more valuable.
The same can be said for __toBool(), and could be said for any other data type assuming that the appropriate __toType() were available. Said another way, if __toString() was not in PHP then no implicit casting would be possible and thus nobody would do it. So there really is no difference; if we had __toArray() and __toBool(), people could start using them where they would be implicitly cast to those values.
> I think I'd actually be more receptive to a proposal to allow _all_ casts to be overloaded, rather than adding array as a second special case, because *implicitly* casting to array doesn't seem like it would be any more common than other types.
This feels like "perfect being the enemy of the good" again.

If there is anything I have picked up by studying the nature of enhancements to PHP it is that they are added incrementally at best. Consider type hinting; we have basic type hinting today but we are just now getting union types. It would have been ideal to get all typing needs addressed at once, but if that had been the bar we probably would still be doing without.

Also consider that implementation of _all_ casts would be a heavier lift than implementing just one today, and then hopefully implementing others later. And consider that the nature of PHP with its voting means that larger changes are exponentially harder to get accepted.

So for me, rather than argue against a feature because we don't take all 10 steps at once, I would celebrate moving 1 step at a time in the positive direction, because we would at least be 1 step ahead of where we are today.
> Yes, I would see that as a better example. I think operator overloading in general makes sense when the object itself can be thought of as a special-case of the primitive it's emulating operators for. So in this case, the IsSuccessful class would be "a special kind of boolean"; and the hypothetical List or Collection classes I mentioned a couple of days ago would be "a special kind of array". We even have that in PHP for built-in types: GMP objects can now be used with mathematical operators like $foo * $bar, and those operations do what you'd expect them to do on an integer or float.
Exactly! I agree that for general purpose classes the general array cast is less ideal. But there would really be value in having "cast" magic methods to wrap built-in data types and then add related methods to them.
> Operator overloading can also be used just as a cute way of spelling something - probably most famously, C++ uses the << and >> operators for writing to and reading from streams, even though they're actually overloads of the bit-shift operator.
Interestingly, I am against general operator overloading, because Ruby and Rails. Seriously, when operators can be overloaded you can end up with code that is almost impossible to decipher. But I do not think the same is true for casting as I tried to illustrate by example.
> Overloading of cast operators is no different - the clearest use cases for __toString() are where the whole class basically represents a piece of text, and the more controversial are where (string)$foo is actually a cute spelling of one method on a complex class.
I fully agree with that.
> That's not necessarily a reason to not add a feature - any feature can be abused - but it potentially makes it harder for users to understand each other's code, and that's a cost we should at least consider.
I would like to err on the side of power combined with documentation and education rather than prohibition.
> This is an interesting example. On the one hand, there is only one array ever going to be produced by that class; but on the other hand, the ultimate use of it is explicitly passing to a function that expects a different type. Assuming the same behaviour as __toString, the code shown would give an error under strict_types=1, and need changing to wp_kses((array)$allowed_html)
You assume that my proposed __asArray() was not also implemented. However — per the proposal — when __asArray() returns true PHP would treat the object as an array in any context where an array is expected but still allow methods to be called.
> A different interpretation would be that this is the Builder Pattern, and the target type happens to be an array rather than an object. So you might decide that the consistency you really needed is that all builders should have a build() method, and the last line of the example would become:
>
> $clean_html = wp_kses($_POST[ 'content' ] ?? null, $allowed_html_builder->build());
Totally. But the primary use-case for __toArray() I have identified is to support patterns that exist in other people's code, such as WordPress core, where you cannot easily rearchitect the code to be what a developer thinks they really need; you have to go with what is, not what you want it to be.
> That doesn't mean the version you wrote is *wrong*, but should make us consider why one version deserves special treatment by the language and the other one doesn't.
What other version, and what special treatment is needed?

I think I am sensing a pattern in your objections: For you, providing benefits in one area is to be avoided unless all areas can receive similar benefits? And when providing all areas with similar benefits would mean infinite features, you would prefer that no use cases get improved at all, even if the areas proposed to be improved represent the ~80th percentile of use-cases?

Tell me if I am grasping your objections correctly?

-Mike
  107837
November 19, 2019 19:00 rowan.collins@gmail.com (Rowan Tommins)
On Tue, 19 Nov 2019 at 17:58, Mike Schinkel <mike@newclarity.net> wrote:

> But AFAIK no other "better" features were proposed. Do you have an alternate to propose for __toArray()? I am all ears.
It's hard to know without understanding the use cases, which is why I got so hung up on your "consistency" argument - I wanted to understand what you were trying to achieve, and if this was the best way to achieve it. One thing you mentioned earlier was implicit interfaces, i.e. being able to match "any object that has a foo() method" rather than "any object that opted into the Fooable interface". You seemed to be saying that __toArray() would bring similar advantages, but maybe if we had implicit interfaces, those advantages would already be there? I may have completely misunderstood, but that's just an example of the wider thinking I'm talking about.
> Said another way, if __toString() was not in PHP then no implicit casting would be possible and thus nobody would do it.
Not quite. What I meant was that there are lots of places which implicitly cast *something* to string, and would still do so if we didn't have __toString(). For instance, it's common to use an integer in a double-quoted string or echo statement, which will implicitly cast it to string, so being able to use an object in the same place feels natural and useful.
> This feels like "perfect being the enemy of the good" again.
> ....
> Also consider that implementation of _all_ casts would be a heavier lift than implementing just one today, and then hopefully implementing others later.
Possibly, but it's also about how we *prioritise* features. If the general plan is "eventually we hope to have overloads for all casts, but let's do them one at a time", we should have some reason why array comes next. My gut feel is that implicit casts to boolean (e.g. in if statements) are the next most common after strings, so if asked to prioritise, I would pick that over arrays.

There's also a kind of "tipping point" where splitting up the feature no longer makes sense - it would be weird to have __toString, __toArray, and __toBool; then add __toInt; then wait a release to add __toFloat. So I guess my suggestion is that maybe we've already reached that tipping point, and rather than bikeshedding the order to add things in, we should go ahead and add them all.

But maybe there is a reason that __toArray() is easier / less dangerous / more useful than the rest.
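To make the boolean case concrete, here is a small sketch of how an implicit boolean context treats a userland object today. The Transaction class is hypothetical, borrowing only the name used earlier in the thread; a __toBool() hook does not exist in PHP and is not used here.

```php
<?php
// Hypothetical class for illustration only.
final class Transaction
{
    private bool $successful;

    public function __construct(bool $successful)
    {
        $this->successful = $successful;
    }

    public function isSuccessful(): bool
    {
        return $this->successful;
    }
}

$failed = new Transaction(false);

// An instance of a userland class always converts to true,
// so the object's own state is ignored in a boolean context.
$implicit = $failed ? 'treated as success' : 'treated as failure';

// The named method is currently the only way to ask the object itself.
$explicit = $failed->isSuccessful() ? 'success' : 'failure';
```

Every if ($object) check is an implicit boolean cast of this kind, which is why it is plausibly a more common site for an overload than array conversion.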
> Seriously, when operators can be overloaded you can end up with code that is almost impossible to decipher. But I do not think the same is true for casting as I tried to illustrate by example.
I guess this is where I'm less optimistic; I think overloaded casts could be just as unreadable as any other overloaded operator if abused - and just as powerful if used appropriately. I'd actually really love to be able to write $price = new Money('GBP', 100, 0) + new Money('GBP', 50, 50);
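Userland classes cannot overload +, so the closest equivalent today is a named method. The sketch below is an assumption-laden illustration: it guesses that the three constructor arguments mean currency, major units and minor units, assumes 100 minor units per major unit and non-negative amounts, and the add() and format() methods are invented for the example.

```php
<?php
// Hypothetical Money class; the argument meanings are an assumption.
final class Money
{
    private string $currency;
    private int $minorUnits; // total amount in minor units, e.g. pence

    public function __construct(string $currency, int $major, int $minor)
    {
        $this->currency = $currency;
        $this->minorUnits = $major * 100 + $minor;
    }

    public function add(Money $other): Money
    {
        if ($other->currency !== $this->currency) {
            throw new InvalidArgumentException('Currency mismatch');
        }
        $total = $this->minorUnits + $other->minorUnits;
        return new Money($this->currency, intdiv($total, 100), $total % 100);
    }

    public function format(): string
    {
        return sprintf(
            '%s %d.%02d',
            $this->currency,
            intdiv($this->minorUnits, 100),
            $this->minorUnits % 100
        );
    }
}

// Method-call spelling of the addition written with + above.
$price = (new Money('GBP', 100, 0))->add(new Money('GBP', 50, 50));
$formatted = $price->format();
```

Here the object really is "a special kind of number", which is the kind of case where an overloaded operator would read naturally.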
> I would like to err on the side of power combined with documentation and education rather than prohibition.
Just to play devil's advocate, the same reasoning could be used to dismiss your dislike of full operator overloading.
> You assume that my proposed __asArray() was not also implemented. > > However — per the proposal — when __asArray() returns true PHP would > treat the object as an array in any context where an array is expected but > still allow methods to be called. >
OK, I'd completely missed that proposal. It's an interesting idea, but it feels like there are some rough edges: why would you implement the function but return false, and how would that behave? And what if it returned true but there was no __toArray? Also, would the function actually receive the object, or the result of calling __toArray() on it? Would it be different in strict_types mode?

In keeping with the idea of one feature at a time, it would probably make sense to work out the details of that mechanism for strings first, since we already have __toString(), then extend both parts to arrays at the same time.
> > $allowed_html_builder->build());
> But the primary use-case for __toArray() I have identified is to support patterns that exist in other people's code, such as WordPress core, where you cannot easily rearchitect the code to be what a developer thinks they really need; you have to go with what is, not what you want it to be.
My example requires exactly the same interaction with the third-party code as your example - create a custom object, and call some methods on it. If the __toArray() call is completely automatic, then your example saves the 9 characters of "->build()", but it doesn't change the *architecture* of the solution.
> > That doesn't mean the version you wrote is *wrong*, but should make us consider why one version deserves special treatment by the language and the other one doesn't.
> What other version, and what special treatment is needed?
The version using "$foo->build()" rather than "(array)$foo", and the special treatment of "(array)$foo" actually calling "$foo->__toArray()".
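For reference, a sketch of what "(array)$foo" does today, with no overload available: it exposes the object's properties, including private ones under mangled keys, rather than calling any method. The TagList class is hypothetical and stands in for the gist's class.

```php
<?php
// Hypothetical class for illustration only.
final class TagList
{
    public array $tags = ['a' => ['href' => true]];
    private string $note = 'internal';

    public function build(): array
    {
        return $this->tags;
    }
}

$list = new TagList();

// Current behaviour: the cast yields one entry per property.
// The private property appears under the key "\0TagList\0note".
$cast = (array) $list;

// The explicit method returns only what the class chooses to expose.
$built = $list->build();
```

The "special treatment" under discussion would replace the property dump in $cast with the result of a __toArray() call, while ->build() needs no language change at all.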
> I think I am sensing a pattern in your objections: For you, providing benefits in one area is to be avoided unless all areas can receive similar benefits?
Almost. It's more that I'm against adding a large number of small features that each make one use case slightly easier, but don't generalise very well.
> ... even if the areas proposed to be improved represent the ~80th percentile of use-cases?
Quite the opposite! I'm sceptical of this feature precisely because I think the use cases for it are rather narrow.

Regards,
--
Rowan Tommins
[IMSoP]
  107838
November 19, 2019 20:20 larry@garfieldtech.com ("Larry Garfield")
On Tue, Nov 19, 2019, at 9:54 AM, Rowan Tommins wrote:

> Yes, I would see that as a better example. I think operator overloading in general makes sense when the object itself can be thought of as a special-case of the primitive it's emulating operators for. So in this case, the IsSuccessful class would be "a special kind of boolean"; and the hypothetical List or Collection classes I mentioned a couple of days ago would be "a special kind of array". We even have that in PHP for built-in types: GMP objects can now be used with mathematical operators like $foo * $bar, and those operations do what you'd expect them to do on an integer or float.
>
> Operator overloading can also be used just as a cute way of spelling something - probably most famously, C++ uses the << and >> operators for writing to and reading from streams, even though they're actually overloads of the bit-shift operator. This kind of use is, I would say, more controversial - going back to consistency, it's easier to reason about code if $foo * $bar always does some kind of multiplication than if it's been overloaded to mean "$foo is the star of $bar".
>
> Overloading of cast operators is no different - the clearest use cases for __toString() are where the whole class basically represents a piece of text, and the more controversial are where (string)$foo is actually a cute spelling of one method on a complex class.
I just want to pop in here to highlight this last point, because that's the best heuristic I've seen for when __toString() is appropriate or not. I may have to steal that heuristic because it's spot on.

To the rest of the thread, I think it's wise to split discussion of __toArray() off from "arrayable". They seem like they are addressing different, if related, problems. While an object with __toArray() would be "arrayable", there are already two array-ish things: arrays and the ArrayAccess interface. So some shorthand for array|\ArrayAccess can stand on its own as a proposal. Or, since we now have union types, maybe that makes such a shorthand no longer necessary. I'm not sure there.

--Larry Garfield
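A short sketch of the union-type spelling mentioned above, assuming PHP 8.0 or later, where union types are available. The function name title_of() is hypothetical.

```php
<?php
// Requires PHP 8.0+ for the union type in the parameter list.
// Hypothetical helper accepting either a real array or an ArrayAccess object.
function title_of(array|ArrayAccess $row): string
{
    // Offset reads and ?? work on both arrays and ArrayAccess implementations.
    return (string) ($row['title'] ?? 'untitled');
}

$fromArray = title_of(['title' => 'RFC']);
$fromObject = title_of(new ArrayObject(['title' => 'RFC']));
$fallback = title_of([]);
```

The union only guarantees offset access: array functions that require a genuine array would still reject the ArrayAccess object, so it covers the "array-ish" need without any conversion hook.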