Re: [PHP-DEV] Operator overloading for userspace objects

January 28, 2020 23:47 (Ben Ramsey)
> On Jan 28, 2020, at 17:14, <> wrote:
>
> Hello everybody,
>
> over the last few days I have experimented a bit with operator overloading in userspace classes (redefining the meaning of arithmetic operations like +, -, *, etc. for your own classes).
>
> This could be useful for libraries that implement custom arithmetic objects (like money values, tensors, etc.) or for things like the Symfony string component (concatenation operator), because it improves readability a lot:
>
> $x * ($a + $b) instead of $x->multiply($a->add($b))
>
> Four years ago there was an RFC on this topic (<>), which was discussed a bit (<>), but there was no real outcome.
>
> I have tried to implement a proof of concept of the RFC. I encountered some problems when implementing the operator functions as (non-static) class members that are passed only the "other" argument: what happens when we encounter an expression like 2/$a, and how can the class distinguish this from $a/2? Also, not every operation is commutative on every structure (e.g. for matrices A*B =/= B*A). So I tried a C#-like approach, where the operator implementations are static functions in the class and both arguments are passed. In my PHP implementation this would look something like this:
>
> class X {
>     public static function __add($lhs, $rhs) {
>         //...
>     }
> }
>
> The class function can then decide what to do based on both operands (so it can tell whether the developer wrote 2/$a or $a/2). Also, that way an implementor cannot return $this by accident, which could lead to unintended side effects if the result of the operation is later mutated.
>
> I have taken over the idea of defining a magic function for each operation (like Python does), because I think it is the clearest way to see which operators a class implements (this could be useful for static analysis). The downside of this approach is that it increases the number of magic functions considerably (my PoC code defines 13 additional magic functions, and the unary operators are still missing), so some people in the original discussion suggested defining a single (magic) function that is passed the operator, with the user code deciding what to do. The advantage is that this is very extensible (with the right parser implementation, you could even define your own new operators), at the cost that this method becomes very complex for data structures which use multiple operators (large if-else or switch constructions that delegate the logic to the appropriate functions). Another idea mentioned was to extract interfaces with common functionality (like Arithmetically, Comparable, etc.), as is done with the ArrayAccess or Countable interfaces. The problem I see here is that this approach is rather inflexible, and it would be difficult to extract really universal interfaces (e.g. vectors do not need a division (/) operation, but the concatenation operator . could be really useful for implementing a dot product). This would lead either to interfaces that are only partly implemented (with the remaining methods just throwing exceptions) or to interfaces that contain only one or two functions (so we would end up with many interfaces instead of many magic functions).
>
> On the topic of which operators should be overloadable: my PoC implementation has magic functions for the arithmetic operators (+, -, *, /, %, **), string concatenation (.), and bit operations (>>, <<, &, |, ^). Comparison and equality checks are implemented using a common __compare() function, which acts like an overload of the spaceship operator. The comparison operators (<, >, <=, >=, ==) are evaluated based on whether it returns -1, 0 or +1. I think this way we can enforce that the assumed standard logic of comparison (e.g. !($a<$b)=($a>=$b) and ($a<$b)=($b>$a)) is implemented. Also, I don't think this would restrict real-world applications much (if you have an example where a separate definition of < and >= could be useful, please comment on it).
>
> Unlike the original idea, I don't think it should be possible to overwrite the identity operator (===), because it should always be possible to check whether two objects are really identical (also, every case should be coverable by equality). The same applies to the logic operators (!, ||, &&); I think they should always work as intended (other languages like Python and C# handle it that way too).
>
> For the shorthand assignment operators like += and -= the situation is a bit more complicated: on the one hand, the user has learned that $a+=1 is just an abbreviation of $a=$a+1, so this logic should apply to overloaded operators as well (in C# it is implemented like this). On the other hand, it could be useful to differentiate between the two cases, so that you can mutate the object itself (in the += case) instead of returning a new object instance (the class cannot know that the result is assigned back to its own reference when $a + 1 is called). Personally I don't think that this would be a big problem, so my PoC code does not provide a way to override the shorthand operators. For the increment/decrement operators ($a++) it is similar: it would be nice if it were possible to overload them, but on the other hand the use cases of these operators are really limited besides integer incrementation, and if you want to trigger something more complex, you should call a method to make your intent clear.
>
> On the topic of the order in which the operators should be executed: besides the normal precedence (defined by PHP), my code checks whether the element on the left side is an object and tries to call the appropriate magic function on it. If this is not possible, the same is done for the right argument. This should cover most use cases, with some exceptions: consider an expression like $a / $b, where $a and $b have different classes (class A and class B). If class B knows how to divide class A, but class A does not know about class B, we encounter a problem when evaluating just from left to right (and checking whether the magic method exists). A solution for that would be that object $a can express that it does not know how to handle class B (e.g. by returning null, or throwing a special exception), and PHP can then call the handler on object $b. I'm not sure how common this problem would be, so I don't know how useful this feature would be.
>
> My proof-of-concept implementation can be found here: <>
>
> Here you can find some basic demo code using it: <>
>
> I would be happy to hear some opinions on this concept, and on the idea of overloadable operators in PHP in general.
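For readers who want something concrete, the dispatch order described in the quoted message (left operand first, then the right, with null meaning "I cannot handle this") can be roughly emulated in userland. This is only a sketch under assumptions: the `Ratio` class, the `__div` handler name and the `dispatchBinary()` helper are hypothetical, and PHP itself does not call any of them when it sees an operator.

```php
<?php
// Hypothetical value type; not part of the proposal or of PHP.
final class Ratio
{
    public $num;
    public $den;

    public function __construct(int $num, int $den = 1)
    {
        $this->num = $num;
        $this->den = $den;
    }

    // Static handler in the style of the proposal: both operands are passed,
    // so 2 / $a and $a / 2 can be told apart. Returns null when it cannot
    // handle one of the operands.
    public static function __div($lhs, $rhs): ?Ratio
    {
        $l = self::wrap($lhs);
        $r = self::wrap($rhs);
        if ($l === null || $r === null) {
            return null;
        }
        return new self($l->num * $r->den, $l->den * $r->num);
    }

    // Accepts a Ratio or an int; anything else is "unknown" (null).
    private static function wrap($value): ?Ratio
    {
        if ($value instanceof self) {
            return $value;
        }
        return is_int($value) ? new self($value) : null;
    }
}

// Assumed dispatch rule: ask the left operand's class first, then the right.
// A null result is treated as "not handled", so the next operand is tried.
function dispatchBinary(string $handler, $lhs, $rhs)
{
    foreach ([$lhs, $rhs] as $operand) {
        if (is_object($operand) && method_exists($operand, $handler)) {
            $result = call_user_func([get_class($operand), $handler], $lhs, $rhs);
            if ($result !== null) {
                return $result;
            }
        }
    }
    throw new TypeError("No operand implements $handler for these types");
}

$a = new Ratio(3, 4);
$x = dispatchBinary('__div', 2, $a); // stands in for 2 / $a
$y = dispatchBinary('__div', $a, 2); // stands in for $a / 2
```

Here `$x` holds 8/3 and `$y` holds 3/8, and `$a` itself is never modified because the handler always builds a new object.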
On the subject of mutation, it seems awkward to me that `$a + 1` would alter the value of $a or that `2/$b` should alter $b. Rather, I would expect a new value to be *returned* as a result of this operation.

If you take mutation off the table, then things become easier, IMO. We only need two magic methods:

* __toInteger(): int
* __toFloat(): float

Then, in any mathematical context, PHP could call the appropriate method and use the number returned in the calculation.

So, we could have something like this:

    class MyNumber
    {
        private $number;

        // Constructor implied by `new MyNumber(1)` below.
        public function __construct($number)
        {
            $this->number = $number;
        }

        public function __toInteger(): int
        {
            return (int) $this->number;
        }
    }

    $x = new MyNumber(1);
    $y = $x + 1;

And the value of $y would be 2.

Of course, there's the question of what we do if a class defines both __toInteger() and __toFloat(), so perhaps a __toNumber() is more appropriate, though that leads us into discussions about what the return type of this method should be and whether a `number` scalar type is needed, but I think I'm getting ahead of the discussion here.

TL;DR: mutating an object in the context of a mathematical operation (unless I'm explicitly calling a method on the object with the expectation of mutating it) could result in confusing and unexpected results for programmers.

Cheers,
Ben
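To illustrate the conversion idea above in code that runs today, here is a minimal userland emulation. It assumes a hypothetical helper, `toNumeric()`, standing in for what the engine would do in a mathematical context; `__toInteger()` is not an existing PHP magic method, so nothing calls it automatically.

```php
<?php
final class MyNumber
{
    private $number;

    public function __construct($number)
    {
        $this->number = $number;
    }

    // Proposed conversion hook; PHP does not call this on its own.
    public function __toInteger(): int
    {
        return (int) $this->number;
    }
}

// Hypothetical stand-in for the engine: convert an operand before arithmetic.
function toNumeric($operand)
{
    if (is_object($operand) && method_exists($operand, '__toInteger')) {
        return $operand->__toInteger();
    }
    return $operand;
}

$x = new MyNumber(1);
$y = toNumeric($x) + 1; // emulates $x + 1
```

After this runs, `$y` is the plain integer 2 and `$x` is untouched, which is the no-mutation behaviour argued for above.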
January 28, 2020 23:59 (Michael Cordover)
On Tue, Jan 28, 2020, at 18:47, Ben Ramsey wrote:
> If you take mutation off the table, then things become easier, IMO. We only need two magic methods:
>
> * __toInteger(): int
> * __toFloat(): float
>
> Then, in any mathematical context, PHP could call the appropriate method and use the number returned in the calculation.
I don't think this is enough to make operator overloading useful, even without mutation. For example, the result of TimeInterval(1, 'ms') + TimeInterval(3, 'days') requires more information than we'd get out of __toInteger or __toFloat, but could still be a useful operation to perform, and ought to return a new TimeInterval (not an int).

There are many of these tagged-number types where overloading would be helpful.

- mjec
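A rough sketch of the TimeInterval case, assuming a static `__add` handler in the style discussed earlier in the thread. The class body, the unit table and `toMilliseconds()` are made up for illustration, and the handler is called explicitly because PHP does not route `+` to it.

```php
<?php
final class TimeInterval
{
    // Assumed conversion factors to milliseconds; only the units used here.
    private const MS_PER_UNIT = ['ms' => 1, 's' => 1000, 'days' => 86400000];

    private $ms;

    public function __construct(int $amount, string $unit)
    {
        if (!array_key_exists($unit, self::MS_PER_UNIT)) {
            throw new InvalidArgumentException("Unknown unit: $unit");
        }
        $this->ms = $amount * self::MS_PER_UNIT[$unit];
    }

    // Both operands keep their units until they are combined, and the result
    // is a new TimeInterval rather than a bare int. Neither operand changes.
    public static function __add(TimeInterval $lhs, TimeInterval $rhs): TimeInterval
    {
        return new self($lhs->ms + $rhs->ms, 'ms');
    }

    public function toMilliseconds(): int
    {
        return $this->ms;
    }
}

$short = new TimeInterval(1, 'ms');
$long  = new TimeInterval(3, 'days');
$sum   = TimeInterval::__add($short, $long); // stands in for $short + $long
```

A single `__toInteger()` could not express this, since the bare numbers 1 and 3 say nothing about their units; the handler needs both objects to produce a correct interval.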