[RFC] New custom object serialization mechanism

  103823
January 24, 2019 12:27 nikita.ppv@gmail.com (Nikita Popov)
Hi internals,

I'd like to propose a new custom object serialization mechanism intended to
replace the broken Serializable interface:

https://wiki.php.net/rfc/custom_object_serialization

This was already previously discussed in https://externals.io/message/98834,
this just brings it into RFC form. The latest motivation for this is
https://bugs.php.net/bug.php?id=77302, a compatibility issue in 7.3
affecting Symfony, caused by Serializable. We can't fix Serializable, but
we can at least make sure that a working alternative exists.

Regards,
Nikita
  103828
January 24, 2019 14:09 ocramius@gmail.com (Marco Pivetta)
Not really fussed about having another implicit interface for serialization
(via magic methods).

Wouldn't a new interface make this clear, explicit, and make the
deprecation path easier (together with the migration)?

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/


On Thu, Jan 24, 2019 at 1:27 PM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > I'd like to propose a new custom object serialization mechanism intended to > replace the broken Serializable interface: > > https://wiki.php.net/rfc/custom_object_serialization > > This was already previously discussed in > https://externals.io/message/98834, > this just brings it into RFC form. The latest motivation for this is > https://bugs.php.net/bug.php?id=77302, a compatibility issue in 7.3 > affecting Symfony, caused by Serializable. We can't fix Serializable, but > we can at least make sure that a working alternative exists. > > Regards, > Nikita >
  103829
January 24, 2019 14:18 sebastian@php.net (Sebastian Bergmann)
Am 24.01.2019 um 15:09 schrieb Marco Pivetta:
> Not really fussed about having another implicit interface for serialization > (via magic methods).
I second that emotion.
  103852
January 27, 2019 16:37 nicolas.grekas+php@gmail.com (Nicolas Grekas)
Le jeu. 24 janv. 2019 à 15:18, Sebastian Bergmann <sebastian@php.net> a
écrit :

> Am 24.01.2019 um 15:09 schrieb Marco Pivetta: > > Not really fussed about having another implicit interface for > serialization > > (via magic methods). > > I second that emotion. >
The more I think about it, the more I'm convinced we should *not* add an interface for that. An interface defines a semantics - something that an API tpoa domain and that ppl can type-hint for to get a specific implementation of that API.Here, both aspects are not desired: we don't want ppl to type-hint for e.g. Serializable - and too bad it exists because I've already seen ppl think: "hey, I'll type-hint or extend it to express I want a serializable thing". BUT that's *not* the contract of Serializable or any variant of it because: 1. *any* PHP object is serializable 2. Serializable it well allowed to throw an exception to exactly *forbid* an object from being serialized. From this reasoning, I conclude that we really want magic methods here because what we need is a *behavior*. We want to hook into the engine to benefit from some special features it provides. That's what all magic methods are about and hooking into serialize()/unserialize() is not different. The parallel with the Symfony Serializer is interesting but it stop here: we don't need any help from core to build it. We should not seek for that goal IMHO as it blur the lines of what we're needing: a core primitive to build more useful things. Magic methods would just provide that, without polluting the semantics space. My 2cts, Nicolas
  103858
January 28, 2019 07:58 ocramius@gmail.com (Marco Pivetta)
On Sun, Jan 27, 2019 at 5:37 PM Nicolas Grekas grekas+php@gmail.com>
wrote:

> Le jeu. 24 janv. 2019 à 15:18, Sebastian Bergmann <sebastian@php.net> a > écrit : > > > Am 24.01.2019 um 15:09 schrieb Marco Pivetta: > > > Not really fussed about having another implicit interface for > > serialization > > > (via magic methods). > > > > I second that emotion. > > > > > The more I think about it, the more I'm convinced we should *not* add an > interface for that. > An interface defines a semantics - something that an API tpoa domain and > that ppl can type-hint for to get a specific implementation of that > API.
An interface declares an API surface for an object: in this case, the consumer would be `serialize()` or `unserialize()`, and that's a perfectly valid use-case scenario.
> Here, both aspects are not desired: we don't want ppl to type-hint for > e.g. Serializable - and too bad it exists because I've already seen ppl > think: "hey, I'll type-hint or extend it to express I want a serializable > thing".
That's actually a very correct thing to do: by declaring that something is `Serializable`, you are expressing your intent to anybody inspecting the structure of the object.
> BUT that's *not* the contract of Serializable or any variant of it > because: 1. *any* PHP object is serializable 2. Serializable it well > allowed to throw an exception to exactly *forbid* an object from being > serialized. >
That's a completely different problem, which is that PHP has no way to declare APIs as functionally pure, or exception-less. That's something to be explored, in my opinion, but the lack of it does not warrant dismissing interfaces altogether (your current argument). Serializability is something to be declared: currently, PHP is very much ill-designed on this particular scope, but that doesn't mean that we should make it even worse. From this reasoning, I conclude that we really want magic methods here
> because what we need is a *behavior*. We want to hook into the engine to > benefit from some special features it provides. That's what all magic > methods are about and hooking into serialize()/unserialize() is not > different. >
Yet more magic methods just add an impressive amount of complexity to the language (I'm already thinking of the permutations of edge cases that I'm personally going to have to write). It's yet another way to define something that would work correctly if an interface (existing mechanism defined by the language) was used instead. Magic methods would just provide that, without polluting the semantics
> space. >
The newly introduced magic methods pollute the entire **language semantics**: now you have another edge case in the language, instead of an interface that works in combination with the `serialize()` and `unserialize()` functions. I'll also add that the problem being solved is much smaller than the issues introduced by the proposed additions. Marco Pivetta http://twitter.com/Ocramius http://ocramius.github.com/
  103859
January 28, 2019 08:31 nicolas.grekas+php@gmail.com (Nicolas Grekas)
(I should have reviewed myself in the first message - sentences edited in
this reply)

Le lun. 28 janv. 2019 à 08:58, Marco Pivetta <ocramius@gmail.com> a écrit :

> On Sun, Jan 27, 2019 at 5:37 PM Nicolas Grekas < > nicolas.grekas+php@gmail.com> wrote: > >> Le jeu. 24 janv. 2019 à 15:18, Sebastian Bergmann <sebastian@php.net> a >> écrit : >> >> > Am 24.01.2019 um 15:09 schrieb Marco Pivetta: >> > > Not really fussed about having another implicit interface for >> > serialization >> > > (via magic methods). >> > >> > I second that emotion. >> > >> >> >> The more I think about it, the more I'm convinced we should *not* add an >> interface for that. >> An interface defines a semantics - an API to a domain - and >> that ppl can type-hint for to get a specific implementation of that >> API. > > > An interface declares an API surface for an object: in this case, the > consumer would be `serialize()` or `unserialize()`, and that's a perfectly > valid use-case scenario. >
Implemeting serialize/unserialize methods is meant to hook into the same-name functions. That's a very technical and concrete thing - a behavior - not anything abstract as an interface is. Here, both aspects are not desired: we don't want ppl to type-hint for
>> e.g. Serializable - and too bad it exists because I've already seen ppl >> think: "hey, I'll type-hint or extend it to express I want a serializable >> thing". > > > That's actually a very correct thing to do: by declaring that something is > `Serializable`, you are expressing your intent to anybody inspecting the > structure of the object. >
"intent" is the issue here: these is no such abstract thing here. The PHP engine is a very concrete object that provides behaviors to userland. And we need a proper hook to configure the behavior of the serialize/unserialize functions. There is no such things as interface/intent/abstraction here, just a hook and that's perfect because the engine should define as little semantics as possible: that's the job of userland! BUT that's *not* the contract of Serializable or any variant of it
>> because: 1. *any* PHP object is serializable 2. Serializable it well >> allowed to throw an exception to exactly *forbid* an object from being >> serialized. >> > > That's a completely different problem, which is that PHP has no way to > declare APIs as functionally pure, or exception-less. That's something to > be explored, in my opinion, but the lack of it does not warrant dismissing > interfaces altogether (your current argument). >
You completly ignored 1.: *any* object is serializable. Another hint there isn't any abstraction sitting here: if Serializable was one, it would be like a small wall in wide field. The only way we could make "Serializable2" an abstraction is by fordbidding any object that *does not* impementit to be serializable. That's unrealistic - and uneeded to me.
> Serializability is something to be declared: currently, PHP is very much > ill-designed on this particular scope, but that doesn't mean that we should > make it even worse. > > From this reasoning, I conclude that we really want magic methods here >> because what we need is a *behavior*. We want to hook into the engine to >> benefit from some special features it provides. That's what all magic >> methods are about and hooking into serialize()/unserialize() is not >> different. >> > > Yet more magic methods just add an impressive amount of complexity to the > language (I'm already thinking of the permutations of edge cases that I'm > personally going to have to write). >
Please do, I'd be happy to better understand your pov. It's yet another way to define something that would work correctly if an
> interface (existing mechanism defined by the language) was used instead. >
It would not abstract anything, thus would be just broken syntactic illusional sugar. Magic methods would just provide that, without polluting the semantics
>> space. >> > > The newly introduced magic methods pollute the entire **language > semantics**: now you have another edge case in the language, instead of an > interface that works in combination with the `serialize()` and > `unserialize()` functions. >
"Language semantics" is another thing. I'm talking about "domain semantics". See reasoning above :) I'll also add that the problem being solved is much smaller than the issues
> introduced by the > proposed additions.
>
I promise the contrary: my personal experience is that Serializable does real harm (please work on https://github.com/symfony/symfony/issues/29951 and related issues to get the feeling). Magic methods would solve all this crap. Cheers, Nicolas
  103864
January 28, 2019 13:27 claude.pache@gmail.com (Claude Pache)
> Le 28 janv. 2019 à 08:58, Marco Pivetta <ocramius@gmail.com> a écrit : > >> >> Here, both aspects are not desired: we don't want ppl to type-hint for >> e.g. Serializable - and too bad it exists because I've already seen ppl >> think: "hey, I'll type-hint or extend it to express I want a serializable >> thing". > > > That's actually a very correct thing to do: by declaring that something is > `Serializable`, you are expressing your intent to anybody inspecting the > structure of the object. >
This interface (as well as `JsonSerializable`) is incorrectly named. If I believe the manual, it is an ”interface for customized serializing”; therefore it should have been called `CustomizedSerialization` or something like that. If we use a new interface, at the very least, let its name match its function (or its function match its name). —Claude
  103863
January 28, 2019 11:30 bjorn.x.larsson@telia.com (=?UTF-8?Q?Bj=c3=b6rn_Larsson?=)
Den 2019-01-24 kl. 15:09, skrev Marco Pivetta:
> Not really fussed about having another implicit interface for serialization > (via magic methods). > > Wouldn't a new interface make this clear, explicit, and make the > deprecation path easier (together with the migration)?
How important is the following statement from the RFC regarding magic methods vs Interfaces? - "Using an interface instead requires either raising the version requirement to PHP 7.4, or dealing with the definition of a stub interface in a compatible manner." Sounds like it would be easier to implement this feature in code/libraries targeting 7.x versions. r//Björn Larsson
  103884
January 30, 2019 09:20 nicolas.grekas+php@gmail.com (Nicolas Grekas)
Hi Nikita,

https://wiki.php.net/rfc/custom_object_serialization
>
In the RFC, you mention that "Executing arbitrary code in the middle of unserialization is dangerous and has led to numerous unserialize() vulnerabilities in the past. For this reason __wakeup() calls are now delayed until the end of unserialization." How about destructors? Some vulnerabilities come from destructors doing things with unserialized state. Would it be possible/a good idea to *not* call any destructors unless the "wakeup" stage has been successful? Any exceptions thrown during __wakeup/__unserialize would mean the unserialized data structure should be destroyed without calling any destructors? WDYT? Nicolas
  103885
January 30, 2019 09:27 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Jan 30, 2019 at 10:20 AM Nicolas Grekas <
nicolas.grekas+php@gmail.com> wrote:

> Hi Nikita, > > https://wiki.php.net/rfc/custom_object_serialization >> > > In the RFC, you mention that "Executing arbitrary code in the middle of > unserialization is dangerous and has led to numerous unserialize() > vulnerabilities in the past. For this reason __wakeup() calls are now > delayed until the end of unserialization." > > How about destructors? > Some vulnerabilities come from destructors doing things with unserialized > state. > Would it be possible/a good idea to *not* call any destructors unless the > "wakeup" stage has been successful? Any exceptions thrown during > __wakeup/__unserialize would mean the unserialized data structure should be > destroyed without calling any destructors? > WDYT? >
This is already how it works. If a class has __wakeup and unserialization fails (or call of __wakeup fails), then we will not call the destructor. (The same would be true for __unserialize under this proposal.) Nikita
  104456
February 18, 2019 15:12 nicolas.grekas+php@gmail.com (Nicolas Grekas)
Hi Nikita,

I'd like to propose a new custom object serialization mechanism intended to
> replace the broken Serializable interface: > > https://wiki.php.net/rfc/custom_object_serialization > > This was already previously discussed in > https://externals.io/message/98834, this just brings it into RFC form. > The latest motivation for this is https://bugs.php.net/bug.php?id=77302, > a compatibility issue in 7.3 affecting Symfony, caused by Serializable. We > can't fix Serializable, but we can at least make sure that a working > alternative exists. >
Is there anything we can to do to help the RFC move forward? I'm spending a great deal of time removing Serializable from the Symfony code base. It's not pretty nor fun... but we have no choice since as ppl move to PHP 7.3, they can now spot when the current mechanism is breaking their serialized payloads (typically from user objects put in the session storage). About interface vs magic methods, technicaly, we can work with an interface provided by PHP core, so that if that's a blocker for voters to approve the RFC, let's do it - the current situation is really aweful :). FYI, I found myself implementing some getState()/setState() methods of my own, decoupled from the underlying mechanisms used by PHP. That allows me to still use Serializable for BC reasons when needed, or move to __sleep when possible, then move to the new 7.4 style when ready, without requiring any changes from the consumer POV. That's a good illustration of what I meant in my previous email: domain-specific interfaces in everyone's code is a better design as it allows better decoupling. It's also a better design to express that "instances of this interface of mine MUST be serializable". IMHO. Nicolas
  104472
February 19, 2019 10:31 nikita.ppv@gmail.com (Nikita Popov)
On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas grekas+php@gmail.com>
wrote:

> Hi Nikita, > > I'd like to propose a new custom object serialization mechanism intended >> to replace the broken Serializable interface: >> >> https://wiki.php.net/rfc/custom_object_serialization >> >> This was already previously discussed in >> https://externals.io/message/98834, this just brings it into RFC form. >> The latest motivation for this is https://bugs.php.net/bug.php?id=77302, >> a compatibility issue in 7.3 affecting Symfony, caused by Serializable. We >> can't fix Serializable, but we can at least make sure that a working >> alternative exists. >> > > Is there anything we can to do to help the RFC move forward? I'm spending > a great deal of time removing Serializable from the Symfony code base. It's > not pretty nor fun... but we have no choice since as ppl move to PHP 7.3, > they can now spot when the current mechanism is breaking their serialized > payloads (typically from user objects put in the session storage). > > About interface vs magic methods, technicaly, we can work with an > interface provided by PHP core, so that if that's a blocker for voters to > approve the RFC, let's do it - the current situation is really aweful :). > FYI, I found myself implementing some getState()/setState() methods of my > own, decoupled from the underlying mechanisms used by PHP. That allows me > to still use Serializable for BC reasons when needed, or move to __sleep > when possible, then move to the new 7.4 style when ready, without requiring > any changes from the consumer POV. That's a good illustration of what I > meant in my previous email: domain-specific interfaces in everyone's code > is a better design as it allows better decoupling. It's also a better > design to express that "instances of this interface of mine MUST be > serializable". IMHO. > > Nicolas >
I think for me the main open question here is whether we want to just handle the serialization issue or try to come up with something more general. If we limit this to just serialization, then the design should stay as-is -- for all the reasons you have already outlined, using an interface for this is inappropriate. For a more general mechanism, I think we would need something along the lines of (ignoring naming): interface GetState { public function getState(string $purpose): array; } interface SetState { public function setState(array $data, string $purpose): void; } We need separate interfaces for get/set to properly cater to cases like JSON, where only get is meaningful (right now). We need the $purpose argument to allow different state representations for different purposes, e.g. JSON will often need more preparation than PHP serialization. The $purpose argument is a string, to allow extension for custom purposes. Seeing that, this is really not something I would feel comfortable with introducing in core PHP. Our track record for introducing reusable general-purpose interfaces is not exactly stellar (say, did you hear about SplObserver and SplSubject?) and I wouldn't want to do this without seeing the concept fleshed out extensively in userland first. Nikita
  104473
February 19, 2019 10:42 nicolas.grekas+php@gmail.com (Nicolas Grekas)
Le mar. 19 févr. 2019 à 11:31, Nikita Popov ppv@gmail.com> a écrit :

> On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas < > nicolas.grekas+php@gmail.com> wrote: > >> Hi Nikita, >> >> I'd like to propose a new custom object serialization mechanism intended >>> to replace the broken Serializable interface: >>> >>> https://wiki.php.net/rfc/custom_object_serialization >>> >>> This was already previously discussed in >>> https://externals.io/message/98834, this just brings it into RFC form. >>> The latest motivation for this is https://bugs.php.net/bug.php?id=77302, >>> a compatibility issue in 7.3 affecting Symfony, caused by Serializable. We >>> can't fix Serializable, but we can at least make sure that a working >>> alternative exists. >>> >> >> Is there anything we can to do to help the RFC move forward? I'm spending >> a great deal of time removing Serializable from the Symfony code base. It's >> not pretty nor fun... but we have no choice since as ppl move to PHP 7.3, >> they can now spot when the current mechanism is breaking their serialized >> payloads (typically from user objects put in the session storage). >> >> About interface vs magic methods, technicaly, we can work with an >> interface provided by PHP core, so that if that's a blocker for voters to >> approve the RFC, let's do it - the current situation is really aweful :).. >> FYI, I found myself implementing some getState()/setState() methods of my >> own, decoupled from the underlying mechanisms used by PHP. That allows me >> to still use Serializable for BC reasons when needed, or move to __sleep >> when possible, then move to the new 7.4 style when ready, without requiring >> any changes from the consumer POV. That's a good illustration of what I >> meant in my previous email: domain-specific interfaces in everyone's code >> is a better design as it allows better decoupling. It's also a better >> design to express that "instances of this interface of mine MUST be >> serializable". IMHO. >> >> Nicolas >> > > I think for me the main open question here is whether we want to just > handle the serialization issue or try to come up with something more > general. If we limit this to just serialization, then the design should > stay as-is -- for all the reasons you have already outlined, using an > interface for this is inappropriate. > > For a more general mechanism, I think we would need something along the > lines of (ignoring naming): > > interface GetState { > public function getState(string $purpose): array; > } > interface SetState { > public function setState(array $data, string $purpose): void; > } > > We need separate interfaces for get/set to properly cater to cases like > JSON, where only get is meaningful (right now). We need the $purpose > argument to allow different state representations for different purposes, > e.g. JSON will often need more preparation than PHP serialization. The > $purpose argument is a string, to allow extension for custom purposes. > > Seeing that, this is really not something I would feel comfortable with > introducing in core PHP. Our track record for introducing reusable > general-purpose interfaces is not exactly stellar (say, did you hear about > SplObserver and SplSubject?) and I wouldn't want to do this without seeing > the concept fleshed out extensively in userland first. > > Nikita >
I think the per-purpose representation logic doesn't belong to each classes.. Symfony Serializer does it completly from the outside, and I think that's a much better design. I'm on the side this should not be in PHP code indeed. Nicolas
  104524
February 26, 2019 14:17 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Feb 19, 2019 at 11:42 AM Nicolas Grekas <
nicolas.grekas+php@gmail.com> wrote:

> > > Le mar. 19 févr. 2019 à 11:31, Nikita Popov ppv@gmail.com> a > écrit : > >> On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas < >> nicolas.grekas+php@gmail.com> wrote: >> >>> Hi Nikita, >>> >>> I'd like to propose a new custom object serialization mechanism intended >>>> to replace the broken Serializable interface: >>>> >>>> https://wiki.php.net/rfc/custom_object_serialization >>>> >>>> This was already previously discussed in >>>> https://externals.io/message/98834, this just brings it into RFC form. >>>> The latest motivation for this is https://bugs.php.net/bug.php?id=77302, >>>> a compatibility issue in 7.3 affecting Symfony, caused by Serializable.. We >>>> can't fix Serializable, but we can at least make sure that a working >>>> alternative exists. >>>> >>> >>> Is there anything we can to do to help the RFC move forward? I'm >>> spending a great deal of time removing Serializable from the Symfony code >>> base. It's not pretty nor fun... but we have no choice since as ppl move to >>> PHP 7.3, they can now spot when the current mechanism is breaking their >>> serialized payloads (typically from user objects put in the session >>> storage). >>> >>> About interface vs magic methods, technicaly, we can work with an >>> interface provided by PHP core, so that if that's a blocker for voters to >>> approve the RFC, let's do it - the current situation is really aweful :). >>> FYI, I found myself implementing some getState()/setState() methods of my >>> own, decoupled from the underlying mechanisms used by PHP. That allows me >>> to still use Serializable for BC reasons when needed, or move to __sleep >>> when possible, then move to the new 7.4 style when ready, without requiring >>> any changes from the consumer POV. That's a good illustration of what I >>> meant in my previous email: domain-specific interfaces in everyone's code >>> is a better design as it allows better decoupling. It's also a better >>> design to express that "instances of this interface of mine MUST be >>> serializable". IMHO. >>> >>> Nicolas >>> >> >> I think for me the main open question here is whether we want to just >> handle the serialization issue or try to come up with something more >> general. If we limit this to just serialization, then the design should >> stay as-is -- for all the reasons you have already outlined, using an >> interface for this is inappropriate. >> >> For a more general mechanism, I think we would need something along the >> lines of (ignoring naming): >> >> interface GetState { >> public function getState(string $purpose): array; >> } >> interface SetState { >> public function setState(array $data, string $purpose): void; >> } >> >> We need separate interfaces for get/set to properly cater to cases like >> JSON, where only get is meaningful (right now). We need the $purpose >> argument to allow different state representations for different purposes, >> e.g. JSON will often need more preparation than PHP serialization. The >> $purpose argument is a string, to allow extension for custom purposes. >> >> Seeing that, this is really not something I would feel comfortable with >> introducing in core PHP. Our track record for introducing reusable >> general-purpose interfaces is not exactly stellar (say, did you hear about >> SplObserver and SplSubject?) and I wouldn't want to do this without seeing >> the concept fleshed out extensively in userland first. >> >> Nikita >> > > I think the per-purpose representation logic doesn't belong to each > classes. > Symfony Serializer does it completly from the outside, and I think that's > a much better design. > I'm on the side this should not be in PHP code indeed. >
As there hasn't been further input here, I'd like to go forward with the RFC as-is and plan to start voting by Friday. Nikita
  104574
March 4, 2019 05:30 dev@mabe.berlin (Marc)
(sorry for top posting)

Hi Nikita,

as for the question about magic methods vs. interface.

I have the feeling that it could be done with both in a clean way to 
make both parties happy.


1. Having __serialize/ __unserialize as magic methods - allowing to 
throw exceptions also to explicitly disallow serialization.

2. Having some kind of Serializable interface (I know this is already 
used) which expects both methods and also disallows throwing exceptions 
to force mark an object serializable. It's not possible to define such 
thing in the signature, but it can be documented as well as exceptions 
here could be catched internally and re-thrown as UnexpectedExceptionError.


Another question about static unserialize:

 > Some people have expressed a desire to make |__unserialize()| a 
static method which creates and returns the unserialized object (rather 
than first constructing the object and then calling |__unserialize()| to 
initialize it).

 > This would allow an even greater degree of control over the 
serialization mechanism, for example it would allow to return an already 
existing object from |__unserialize()|.

 > However, allowing this would once again require immediately calling 
|__unserialize()| functions (interleaved with unserialization) to make 
the object available for backreferences, which would reintroduce some of 
the problems that |Serializable| suffers from. As such, this will not be 
supported.

I would be very happy to also have a solution for this as I have a 
library which would benefit from it:

(https://github.com/marc-mabe/php-enum/blob/v3.1.1/src/EnumSerializableTrait.php#L46-L72)

Would it be possible to instead get the new instance as argument and 
expect the instance to be returned but on the same time allow to return 
a self instantiated instance?

Something like this:

public  static function  __unserialize(static $new, array  <http://www.php.net/array>  $data):  static  {
     $new->prop_a = $data['prop_a'];
     return $new;
}

// or ignoring the argument and instantiate a new one

public  static function  __unserialize(static $new, array  <http://www.php.net/array>  $data):  static  {
     $new = (ReflectionClass(__CLASS__))->newInstanceWithoutConstructor();
     $new->prop_a = $data['prop_a'];
     return $new;
}

The second variant sounds like a wast of resources but if it makes an 
edge-case possible it still better then just ignore it.


Thoughts?

Marc


On 26.02.19 15:17, Nikita Popov wrote:
> On Tue, Feb 19, 2019 at 11:42 AM Nicolas Grekas < > nicolas.grekas+php@gmail.com> wrote: > >> >> Le mar. 19 févr. 2019 à 11:31, Nikita Popov ppv@gmail.com> a >> écrit : >> >>> On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas < >>> nicolas.grekas+php@gmail.com> wrote: >>> >>>> Hi Nikita, >>>> >>>> I'd like to propose a new custom object serialization mechanism intended >>>>> to replace the broken Serializable interface: >>>>> >>>>> https://wiki.php.net/rfc/custom_object_serialization >>>>> >>>>> This was already previously discussed in >>>>> https://externals.io/message/98834, this just brings it into RFC form. >>>>> The latest motivation for this is https://bugs.php.net/bug.php?id=77302, >>>>> a compatibility issue in 7.3 affecting Symfony, caused by Serializable. We >>>>> can't fix Serializable, but we can at least make sure that a working >>>>> alternative exists. >>>>> >>>> Is there anything we can to do to help the RFC move forward? I'm >>>> spending a great deal of time removing Serializable from the Symfony code >>>> base. It's not pretty nor fun... but we have no choice since as ppl move to >>>> PHP 7.3, they can now spot when the current mechanism is breaking their >>>> serialized payloads (typically from user objects put in the session >>>> storage). >>>> >>>> About interface vs magic methods, technicaly, we can work with an >>>> interface provided by PHP core, so that if that's a blocker for voters to >>>> approve the RFC, let's do it - the current situation is really aweful :). >>>> FYI, I found myself implementing some getState()/setState() methods of my >>>> own, decoupled from the underlying mechanisms used by PHP. That allows me >>>> to still use Serializable for BC reasons when needed, or move to __sleep >>>> when possible, then move to the new 7.4 style when ready, without requiring >>>> any changes from the consumer POV. That's a good illustration of what I >>>> meant in my previous email: domain-specific interfaces in everyone's code >>>> is a better design as it allows better decoupling. It's also a better >>>> design to express that "instances of this interface of mine MUST be >>>> serializable". IMHO. >>>> >>>> Nicolas >>>> >>> I think for me the main open question here is whether we want to just >>> handle the serialization issue or try to come up with something more >>> general. If we limit this to just serialization, then the design should >>> stay as-is -- for all the reasons you have already outlined, using an >>> interface for this is inappropriate. >>> >>> For a more general mechanism, I think we would need something along the >>> lines of (ignoring naming): >>> >>> interface GetState { >>> public function getState(string $purpose): array; >>> } >>> interface SetState { >>> public function setState(array $data, string $purpose): void; >>> } >>> >>> We need separate interfaces for get/set to properly cater to cases like >>> JSON, where only get is meaningful (right now). We need the $purpose >>> argument to allow different state representations for different purposes, >>> e.g. JSON will often need more preparation than PHP serialization. The >>> $purpose argument is a string, to allow extension for custom purposes. >>> >>> Seeing that, this is really not something I would feel comfortable with >>> introducing in core PHP. Our track record for introducing reusable >>> general-purpose interfaces is not exactly stellar (say, did you hear about >>> SplObserver and SplSubject?) and I wouldn't want to do this without seeing >>> the concept fleshed out extensively in userland first. >>> >>> Nikita >>> >> I think the per-purpose representation logic doesn't belong to each >> classes. >> Symfony Serializer does it completly from the outside, and I think that's >> a much better design. >> I'm on the side this should not be in PHP code indeed. >> > As there hasn't been further input here, I'd like to go forward with the > RFC as-is and plan to start voting by Friday. > > Nikita >
  104575
March 4, 2019 09:08 nikita.ppv@gmail.com (Nikita Popov)
On Mon, Mar 4, 2019 at 6:31 AM Marc <dev@mabe.berlin> wrote:

> (sorry for top posting) > > Hi Nikita, > > as for the question about magic methods vs. interface. > > I have the feeling that it could be done with both in a clean way to > make both parties happy. > > > 1. Having __serialize/ __unserialize as magic methods - allowing to > throw exceptions also to explicitly disallow serialization. > > 2. Having some kind of Serializable interface (I know this is already > used) which expects both methods and also disallows throwing exceptions > to force mark an object serializable. It's not possible to define such > thing in the signature, but it can be documented as well as exceptions > here could be catched internally and re-thrown as UnexpectedExceptionError. > > > Another question about static unserialize: > > > Some people have expressed a desire to make |__unserialize()| a > static method which creates and returns the unserialized object (rather > than first constructing the object and then calling |__unserialize()| to > initialize it). > > > This would allow an even greater degree of control over the > serialization mechanism, for example it would allow to return an already > existing object from |__unserialize()|. > > > However, allowing this would once again require immediately calling > |__unserialize()| functions (interleaved with unserialization) to make > the object available for backreferences, which would reintroduce some of > the problems that |Serializable| suffers from. As such, this will not be > supported. > > I would be very happy to also have a solution for this as I have a > library which would benefit from it: > > ( > https://github.com/marc-mabe/php-enum/blob/v3.1.1/src/EnumSerializableTrait.php#L46-L72 > ) > > Would it be possible to instead get the new instance as argument and > expect the instance to be returned but on the same time allow to return > a self instantiated instance? > > Something like this: > > public static function __unserialize(static $new, array < > http://www.php.net/array> $data): static { > $new->prop_a = $data['prop_a']; > return $new; > } > > // or ignoring the argument and instantiate a new one > > public static function __unserialize(static $new, array < > http://www.php.net/array> $data): static { > $new = (ReflectionClass(__CLASS__))->newInstanceWithoutConstructor(); > $new->prop_a = $data['prop_a']; > return $new; > } > > The second variant sounds like a wast of resources but if it makes an > edge-case possible it still better then just ignore it. >
I would love to have __unserialize() return a new object, I just don't think it's technically possible. Let me explain in more detail what the problem here is: Serialization has a notion of object identity implemented through use of backreferences. In serialize([$obj, $obj]) the first use of the object is serialized as usual, while the second is a backreference to the first. To resolve that backreference, we can either a) construct objects immediately during unserialization, so we can just directly use an existing object when we encounter a backreference, or b) try to keep track of all the places where backreferences are used, so we can fill them out later. If __unserialize() is itself responsible for constructing the object, then a) means that __unserialize() needs to be called in the middle of unserialization. This is problematic, because the backreference mechanism uses direct pointers into structures, which can easily become invalid if the structures are modified inside the __unserialize() implementation. This is why __wakeup() calls are delayed until the end of serialization. Variant b) also runs into the same issue. Keeping pointers to all the places that use backreferences works fine until the first __unserialize() call, which might modify structures and invalidate those pointers. Beyond these issues related to implementation details, you have the more general issue of circular references. If you have an object that has a reference to itself, how do you unserialize that? You need the already instantiated object to pass it to __unserialize(), but only __unserialize() is going to give you the object! You could do something like pass the data with the circular reference nulled out and then patch it after __unserialize() runs, but as you don't really know what happens to the data passed to __unserialize(), you wouldn't know what exactly needs to be changed. As such, I think the only way we could have __unserialize() return a new object is if there was also a way to opt-out from the preservation of object identity. Regards, Nikita On 26.02.19 15:17, Nikita Popov wrote:
> > On Tue, Feb 19, 2019 at 11:42 AM Nicolas Grekas < > > nicolas.grekas+php@gmail.com> wrote: > > > >> > >> Le mar. 19 févr. 2019 à 11:31, Nikita Popov ppv@gmail.com> a > >> écrit : > >> > >>> On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas < > >>> nicolas.grekas+php@gmail.com> wrote: > >>> > >>>> Hi Nikita, > >>>> > >>>> I'd like to propose a new custom object serialization mechanism > intended > >>>>> to replace the broken Serializable interface: > >>>>> > >>>>> https://wiki.php.net/rfc/custom_object_serialization > >>>>> > >>>>> This was already previously discussed in > >>>>> https://externals.io/message/98834, this just brings it into RFC > form. > >>>>> The latest motivation for this is > https://bugs.php.net/bug.php?id=77302, > >>>>> a compatibility issue in 7.3 affecting Symfony, caused by > Serializable. We > >>>>> can't fix Serializable, but we can at least make sure that a working > >>>>> alternative exists. > >>>>> > >>>> Is there anything we can to do to help the RFC move forward? I'm > >>>> spending a great deal of time removing Serializable from the Symfony > code > >>>> base. It's not pretty nor fun... but we have no choice since as ppl > move to > >>>> PHP 7.3, they can now spot when the current mechanism is breaking > their > >>>> serialized payloads (typically from user objects put in the session > >>>> storage). > >>>> > >>>> About interface vs magic methods, technicaly, we can work with an > >>>> interface provided by PHP core, so that if that's a blocker for > voters to > >>>> approve the RFC, let's do it - the current situation is really aweful > :). > >>>> FYI, I found myself implementing some getState()/setState() methods > of my > >>>> own, decoupled from the underlying mechanisms used by PHP. That > allows me > >>>> to still use Serializable for BC reasons when needed, or move to > __sleep > >>>> when possible, then move to the new 7.4 style when ready, without > requiring > >>>> any changes from the consumer POV. That's a good illustration of what > I > >>>> meant in my previous email: domain-specific interfaces in everyone's > code > >>>> is a better design as it allows better decoupling. It's also a better > >>>> design to express that "instances of this interface of mine MUST be > >>>> serializable". IMHO. > >>>> > >>>> Nicolas > >>>> > >>> I think for me the main open question here is whether we want to just > >>> handle the serialization issue or try to come up with something more > >>> general. If we limit this to just serialization, then the design should > >>> stay as-is -- for all the reasons you have already outlined, using an > >>> interface for this is inappropriate. > >>> > >>> For a more general mechanism, I think we would need something along the > >>> lines of (ignoring naming): > >>> > >>> interface GetState { > >>> public function getState(string $purpose): array; > >>> } > >>> interface SetState { > >>> public function setState(array $data, string $purpose): void; > >>> } > >>> > >>> We need separate interfaces for get/set to properly cater to cases like > >>> JSON, where only get is meaningful (right now). We need the $purpose > >>> argument to allow different state representations for different > purposes, > >>> e.g. JSON will often need more preparation than PHP serialization. The > >>> $purpose argument is a string, to allow extension for custom purposes.. > >>> > >>> Seeing that, this is really not something I would feel comfortable with > >>> introducing in core PHP. Our track record for introducing reusable > >>> general-purpose interfaces is not exactly stellar (say, did you hear > about > >>> SplObserver and SplSubject?) and I wouldn't want to do this without > seeing > >>> the concept fleshed out extensively in userland first. > >>> > >>> Nikita > >>> > >> I think the per-purpose representation logic doesn't belong to each > >> classes. > >> Symfony Serializer does it completly from the outside, and I think > that's > >> a much better design. > >> I'm on the side this should not be in PHP code indeed. > >> > > As there hasn't been further input here, I'd like to go forward with the > > RFC as-is and plan to start voting by Friday. > > > > Nikita > > >
  104583
March 4, 2019 20:18 dev@mabe.berlin (Marc)
On 04.03.19 10:08, Nikita Popov wrote:
> On Mon, Mar 4, 2019 at 6:31 AM Marc <dev@mabe.berlin> wrote: > >> (sorry for top posting) >> >> Hi Nikita, >> >> as for the question about magic methods vs. interface. >> >> I have the feeling that it could be done with both in a clean way to >> make both parties happy. >> >> >> 1. Having __serialize/ __unserialize as magic methods - allowing to >> throw exceptions also to explicitly disallow serialization. >> >> 2. Having some kind of Serializable interface (I know this is already >> used) which expects both methods and also disallows throwing exceptions >> to force mark an object serializable. It's not possible to define such >> thing in the signature, but it can be documented as well as exceptions >> here could be catched internally and re-thrown as UnexpectedExceptionError.
Any thoughts about this?
>> >> Another question about static unserialize: >> >> > Some people have expressed a desire to make |__unserialize()| a >> static method which creates and returns the unserialized object (rather >> than first constructing the object and then calling |__unserialize()| to >> initialize it). >> >> > This would allow an even greater degree of control over the >> serialization mechanism, for example it would allow to return an already >> existing object from |__unserialize()|. >> >> > However, allowing this would once again require immediately calling >> |__unserialize()| functions (interleaved with unserialization) to make >> the object available for backreferences, which would reintroduce some of >> the problems that |Serializable| suffers from. As such, this will not be >> supported. >> >> I would be very happy to also have a solution for this as I have a >> library which would benefit from it: >> >> ( >> https://github.com/marc-mabe/php-enum/blob/v3.1.1/src/EnumSerializableTrait.php#L46-L72 >> ) >> >> Would it be possible to instead get the new instance as argument and >> expect the instance to be returned but on the same time allow to return >> a self instantiated instance? >> >> Something like this: >> >> public static function __unserialize(static $new, array < >> http://www.php.net/array> $data): static { >> $new->prop_a = $data['prop_a']; >> return $new; >> } >> >> // or ignoring the argument and instantiate a new one >> >> public static function __unserialize(static $new, array < >> http://www.php.net/array> $data): static { >> $new = (ReflectionClass(__CLASS__))->newInstanceWithoutConstructor(); >> $new->prop_a = $data['prop_a']; >> return $new; >> } >> >> The second variant sounds like a wast of resources but if it makes an >> edge-case possible it still better then just ignore it. >> > I would love to have __unserialize() return a new object, I just don't > think it's technically possible. Let me explain in more detail what the > problem here is: > > Serialization has a notion of object identity implemented through use of > backreferences. In serialize([$obj, $obj]) the first use of the object is > serialized as usual, while the second is a backreference to the first. To > resolve that backreference, we can either > > a) construct objects immediately during unserialization, so we can just > directly use an existing object when we encounter a backreference, or > b) try to keep track of all the places where backreferences are used, so we > can fill them out later. > > If __unserialize() is itself responsible for constructing the object, then > a) means that __unserialize() needs to be called in the middle of > unserialization. This is problematic, because the backreference mechanism > uses direct pointers into structures, which can easily become invalid if > the structures are modified inside the __unserialize() implementation. This > is why __wakeup() calls are delayed until the end of serialization. Variant > b) also runs into the same issue. Keeping pointers to all the places that > use backreferences works fine until the first __unserialize() call, which > might modify structures and invalidate those pointers. > > Beyond these issues related to implementation details, you have the more > general issue of circular references. If you have an object that has a > reference to itself, how do you unserialize that? You need the already > instantiated object to pass it to __unserialize(), but only __unserialize() > is going to give you the object! You could do something like pass the data > with the circular reference nulled out and then patch it after > __unserialize() runs, but as you don't really know what happens to the data > passed to __unserialize(), you wouldn't know what exactly needs to be > changed. > > As such, I think the only way we could have __unserialize() return a new > object is if there was also a way to opt-out from the preservation of > object identity. > > Regards, > Nikita
Thank you for the very detailed explanation! Do I understand it correctly that it's about self references in the serialized object or is that self-reference an internal implementation detail of the serializing logic. In case about self references in the serialized object and without any knowledge about the internal implementation I can just assume and based in that I still thing it should be possible. -> see https://3v4l.org/AbQ2s If my assumptions are just stupid please ignore Thanks, Marc
> > On 26.02.19 15:17, Nikita Popov wrote: >>> On Tue, Feb 19, 2019 at 11:42 AM Nicolas Grekas < >>> nicolas.grekas+php@gmail.com> wrote: >>> >>>> Le mar. 19 févr. 2019 à 11:31, Nikita Popov ppv@gmail.com> a >>>> écrit : >>>> >>>>> On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas < >>>>> nicolas.grekas+php@gmail.com> wrote: >>>>> >>>>>> Hi Nikita, >>>>>> >>>>>> I'd like to propose a new custom object serialization mechanism >> intended >>>>>>> to replace the broken Serializable interface: >>>>>>> >>>>>>> https://wiki.php.net/rfc/custom_object_serialization >>>>>>> >>>>>>> This was already previously discussed in >>>>>>> https://externals.io/message/98834, this just brings it into RFC >> form. >>>>>>> The latest motivation for this is >> https://bugs.php.net/bug.php?id=77302, >>>>>>> a compatibility issue in 7.3 affecting Symfony, caused by >> Serializable. We >>>>>>> can't fix Serializable, but we can at least make sure that a working >>>>>>> alternative exists. >>>>>>> >>>>>> Is there anything we can to do to help the RFC move forward? I'm >>>>>> spending a great deal of time removing Serializable from the Symfony >> code >>>>>> base. It's not pretty nor fun... but we have no choice since as ppl >> move to >>>>>> PHP 7.3, they can now spot when the current mechanism is breaking >> their >>>>>> serialized payloads (typically from user objects put in the session >>>>>> storage). >>>>>> >>>>>> About interface vs magic methods, technicaly, we can work with an >>>>>> interface provided by PHP core, so that if that's a blocker for >> voters to >>>>>> approve the RFC, let's do it - the current situation is really aweful >> :). >>>>>> FYI, I found myself implementing some getState()/setState() methods >> of my >>>>>> own, decoupled from the underlying mechanisms used by PHP. That >> allows me >>>>>> to still use Serializable for BC reasons when needed, or move to >> __sleep >>>>>> when possible, then move to the new 7.4 style when ready, without >> requiring >>>>>> any changes from the consumer POV. That's a good illustration of what >> I >>>>>> meant in my previous email: domain-specific interfaces in everyone's >> code >>>>>> is a better design as it allows better decoupling. It's also a better >>>>>> design to express that "instances of this interface of mine MUST be >>>>>> serializable". IMHO. >>>>>> >>>>>> Nicolas >>>>>> >>>>> I think for me the main open question here is whether we want to just >>>>> handle the serialization issue or try to come up with something more >>>>> general. If we limit this to just serialization, then the design should >>>>> stay as-is -- for all the reasons you have already outlined, using an >>>>> interface for this is inappropriate. >>>>> >>>>> For a more general mechanism, I think we would need something along the >>>>> lines of (ignoring naming): >>>>> >>>>> interface GetState { >>>>> public function getState(string $purpose): array; >>>>> } >>>>> interface SetState { >>>>> public function setState(array $data, string $purpose): void; >>>>> } >>>>> >>>>> We need separate interfaces for get/set to properly cater to cases like >>>>> JSON, where only get is meaningful (right now). We need the $purpose >>>>> argument to allow different state representations for different >> purposes, >>>>> e.g. JSON will often need more preparation than PHP serialization. The >>>>> $purpose argument is a string, to allow extension for custom purposes. >>>>> >>>>> Seeing that, this is really not something I would feel comfortable with >>>>> introducing in core PHP. Our track record for introducing reusable >>>>> general-purpose interfaces is not exactly stellar (say, did you hear >> about >>>>> SplObserver and SplSubject?) and I wouldn't want to do this without >> seeing >>>>> the concept fleshed out extensively in userland first. >>>>> >>>>> Nikita >>>>> >>>> I think the per-purpose representation logic doesn't belong to each >>>> classes. >>>> Symfony Serializer does it completly from the outside, and I think >> that's >>>> a much better design. >>>> I'm on the side this should not be in PHP code indeed. >>>> >>> As there hasn't been further input here, I'd like to go forward with the >>> RFC as-is and plan to start voting by Friday. >>> >>> Nikita >>>
  104601
March 6, 2019 09:28 nikita.ppv@gmail.com (Nikita Popov)
On Mon, Mar 4, 2019 at 9:18 PM Marc <dev@mabe.berlin> wrote:

> > On 04.03.19 10:08, Nikita Popov wrote: > > On Mon, Mar 4, 2019 at 6:31 AM Marc <dev@mabe.berlin> wrote: > > > >> (sorry for top posting) > >> > >> Hi Nikita, > >> > >> as for the question about magic methods vs. interface. > >> > >> I have the feeling that it could be done with both in a clean way to > >> make both parties happy. > >> > >> > >> 1. Having __serialize/ __unserialize as magic methods - allowing to > >> throw exceptions also to explicitly disallow serialization. > >> > >> 2. Having some kind of Serializable interface (I know this is already > >> used) which expects both methods and also disallows throwing exceptions > >> to force mark an object serializable. It's not possible to define such > >> thing in the signature, but it can be documented as well as exceptions > >> here could be catched internally and re-thrown as > UnexpectedExceptionError. > > Any thoughts about this? > > > >> > >> Another question about static unserialize: > >> > >> > Some people have expressed a desire to make |__unserialize()| a > >> static method which creates and returns the unserialized object (rather > >> than first constructing the object and then calling |__unserialize()| to > >> initialize it). > >> > >> > This would allow an even greater degree of control over the > >> serialization mechanism, for example it would allow to return an already > >> existing object from |__unserialize()|. > >> > >> > However, allowing this would once again require immediately calling > >> |__unserialize()| functions (interleaved with unserialization) to make > >> the object available for backreferences, which would reintroduce some of > >> the problems that |Serializable| suffers from. As such, this will not be > >> supported. > >> > >> I would be very happy to also have a solution for this as I have a > >> library which would benefit from it: > >> > >> ( > >> > https://github.com/marc-mabe/php-enum/blob/v3.1.1/src/EnumSerializableTrait.php#L46-L72 > >> ) > >> > >> Would it be possible to instead get the new instance as argument and > >> expect the instance to be returned but on the same time allow to return > >> a self instantiated instance? > >> > >> Something like this: > >> > >> public static function __unserialize(static $new, array < > >> http://www.php.net/array> $data): static { > >> $new->prop_a = $data['prop_a']; > >> return $new; > >> } > >> > >> // or ignoring the argument and instantiate a new one > >> > >> public static function __unserialize(static $new, array < > >> http://www.php.net/array> $data): static { > >> $new = > (ReflectionClass(__CLASS__))->newInstanceWithoutConstructor(); > >> $new->prop_a = $data['prop_a']; > >> return $new; > >> } > >> > >> The second variant sounds like a wast of resources but if it makes an > >> edge-case possible it still better then just ignore it. > >> > > I would love to have __unserialize() return a new object, I just don't > > think it's technically possible. Let me explain in more detail what the > > problem here is: > > > > Serialization has a notion of object identity implemented through use of > > backreferences. In serialize([$obj, $obj]) the first use of the object is > > serialized as usual, while the second is a backreference to the first. To > > resolve that backreference, we can either > > > > a) construct objects immediately during unserialization, so we can just > > directly use an existing object when we encounter a backreference, or > > b) try to keep track of all the places where backreferences are used, so > we > > can fill them out later. > > > > If __unserialize() is itself responsible for constructing the object, > then > > a) means that __unserialize() needs to be called in the middle of > > unserialization. This is problematic, because the backreference mechanism > > uses direct pointers into structures, which can easily become invalid if > > the structures are modified inside the __unserialize() implementation. > This > > is why __wakeup() calls are delayed until the end of serialization. > Variant > > b) also runs into the same issue. Keeping pointers to all the places that > > use backreferences works fine until the first __unserialize() call, which > > might modify structures and invalidate those pointers. > > > > Beyond these issues related to implementation details, you have the more > > general issue of circular references. If you have an object that has a > > reference to itself, how do you unserialize that? You need the already > > instantiated object to pass it to __unserialize(), but only > __unserialize() > > is going to give you the object! You could do something like pass the > data > > with the circular reference nulled out and then patch it after > > __unserialize() runs, but as you don't really know what happens to the > data > > passed to __unserialize(), you wouldn't know what exactly needs to be > > changed. > > > > As such, I think the only way we could have __unserialize() return a new > > object is if there was also a way to opt-out from the preservation of > > object identity. > > > > Regards, > > Nikita > > > Thank you for the very detailed explanation! > > Do I understand it correctly that it's about self references in the > serialized object or is that self-reference an internal implementation > detail of the serializing logic. > > > In case about self references in the serialized object and without any > knowledge about the internal implementation I can just assume and based > in that I still thing it should be possible. > > -> see https://3v4l.org/AbQ2s > > If my assumptions are just stupid please ignore >
If I understand right, your suggestion here is basically to make __unserialize() responsible for dealing with self-references by itself: It it wants to return a new object, then it needs to also take care of replacing any potential self-references. This somewhat works, but only if the references are really self-references. You can also simply have multiple references to the same object without there being an actual cycle, in which case __unserialize() will not have access to those references. An additional issue is that this basically requires the object to be fully aware of it's contents: You can't have an unspecified $data property that holds arbitrary values, because there might be a self-reference *somewhere* in there, but you wouldn't know whether to find it or how to update it. Nikita
> > > > On 26.02.19 15:17, Nikita Popov wrote: > >>> On Tue, Feb 19, 2019 at 11:42 AM Nicolas Grekas < > >>> nicolas.grekas+php@gmail.com> wrote: > >>> > >>>> Le mar. 19 févr. 2019 à 11:31, Nikita Popov ppv@gmail.com> a > >>>> écrit : > >>>> > >>>>> On Mon, Feb 18, 2019 at 4:12 PM Nicolas Grekas < > >>>>> nicolas.grekas+php@gmail.com> wrote: > >>>>> > >>>>>> Hi Nikita, > >>>>>> > >>>>>> I'd like to propose a new custom object serialization mechanism > >> intended > >>>>>>> to replace the broken Serializable interface: > >>>>>>> > >>>>>>> https://wiki.php.net/rfc/custom_object_serialization > >>>>>>> > >>>>>>> This was already previously discussed in > >>>>>>> https://externals.io/message/98834, this just brings it into RFC > >> form. > >>>>>>> The latest motivation for this is > >> https://bugs.php.net/bug.php?id=77302, > >>>>>>> a compatibility issue in 7.3 affecting Symfony, caused by > >> Serializable. We > >>>>>>> can't fix Serializable, but we can at least make sure that a > working > >>>>>>> alternative exists. > >>>>>>> > >>>>>> Is there anything we can to do to help the RFC move forward? I'm > >>>>>> spending a great deal of time removing Serializable from the Symfony > >> code > >>>>>> base. It's not pretty nor fun... but we have no choice since as ppl > >> move to > >>>>>> PHP 7.3, they can now spot when the current mechanism is breaking > >> their > >>>>>> serialized payloads (typically from user objects put in the session > >>>>>> storage). > >>>>>> > >>>>>> About interface vs magic methods, technicaly, we can work with an > >>>>>> interface provided by PHP core, so that if that's a blocker for > >> voters to > >>>>>> approve the RFC, let's do it - the current situation is really > aweful > >> :). > >>>>>> FYI, I found myself implementing some getState()/setState() methods > >> of my > >>>>>> own, decoupled from the underlying mechanisms used by PHP. That > >> allows me > >>>>>> to still use Serializable for BC reasons when needed, or move to > >> __sleep > >>>>>> when possible, then move to the new 7.4 style when ready, without > >> requiring > >>>>>> any changes from the consumer POV. That's a good illustration of > what > >> I > >>>>>> meant in my previous email: domain-specific interfaces in everyone's > >> code > >>>>>> is a better design as it allows better decoupling. It's also a > better > >>>>>> design to express that "instances of this interface of mine MUST be > >>>>>> serializable". IMHO. > >>>>>> > >>>>>> Nicolas > >>>>>> > >>>>> I think for me the main open question here is whether we want to just > >>>>> handle the serialization issue or try to come up with something more > >>>>> general. If we limit this to just serialization, then the design > should > >>>>> stay as-is -- for all the reasons you have already outlined, using an > >>>>> interface for this is inappropriate. > >>>>> > >>>>> For a more general mechanism, I think we would need something along > the > >>>>> lines of (ignoring naming): > >>>>> > >>>>> interface GetState { > >>>>> public function getState(string $purpose): array; > >>>>> } > >>>>> interface SetState { > >>>>> public function setState(array $data, string $purpose): void; > >>>>> } > >>>>> > >>>>> We need separate interfaces for get/set to properly cater to cases > like > >>>>> JSON, where only get is meaningful (right now). We need the $purpose > >>>>> argument to allow different state representations for different > >> purposes, > >>>>> e.g. JSON will often need more preparation than PHP serialization. > The > >>>>> $purpose argument is a string, to allow extension for custom > purposes. > >>>>> > >>>>> Seeing that, this is really not something I would feel comfortable > with > >>>>> introducing in core PHP. Our track record for introducing reusable > >>>>> general-purpose interfaces is not exactly stellar (say, did you hear > >> about > >>>>> SplObserver and SplSubject?) and I wouldn't want to do this without > >> seeing > >>>>> the concept fleshed out extensively in userland first. > >>>>> > >>>>> Nikita > >>>>> > >>>> I think the per-purpose representation logic doesn't belong to each > >>>> classes. > >>>> Symfony Serializer does it completly from the outside, and I think > >> that's > >>>> a much better design. > >>>> I'm on the side this should not be in PHP code indeed. > >>>> > >>> As there hasn't been further input here, I'd like to go forward with > the > >>> RFC as-is and plan to start voting by Friday. > >>> > >>> Nikita > >>> > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php > >