[RFC] Add WeakMap

  107755
November 4, 2019 13:22 nikita.ppv@gmail.com (Nikita Popov)
Hi internals,

This is a follow up to the addition of WeakReference in PHP 7.4.
WeakReference is an important primitive, but what people usually really
need are weak maps, which can't be implemented on top of WeakReference (at
least, not as exposed in PHP).

This RFC proposes to add a native WeakMap type for PHP 8:
https://wiki.php.net/rfc/weak_maps

Regards,
Nikita
  107765
November 5, 2019 12:32 benjamin.morel@gmail.com (Benjamin Morel)
Hi Nikita,

After reading the RFC, I have no comments to make, but I just want to thank
you for working on this. I regretted a lot that this wasn't implemented
together with WeakReference in PHP 7.4 <https://externals.io/message/106373>,
as the use cases for WeakReference vs WeakMap are really narrow.

Cheers,
Benjamin

On Tue, 5 Nov 2019 at 10:24, Nikita Popov <nikita.ppv@gmail.com> wrote:

> Hi internals,
>
> This is a follow up to the addition of WeakReference in PHP 7.4.
> WeakReference is an important primitive, but what people usually really
> need are weak maps, which can't be implemented on top of WeakReference (at
> least, not as exposed in PHP).
>
> This RFC proposes to add a native WeakMap type for PHP 8:
> https://wiki.php.net/rfc/weak_maps
>
> Regards,
> Nikita
  107896
December 4, 2019 18:50 nikita.ppv@gmail.com (Nikita Popov)
On Mon, Nov 4, 2019 at 2:22 PM Nikita Popov <nikita.ppv@gmail.com> wrote:

> Hi internals,
>
> This is a follow up to the addition of WeakReference in PHP 7.4.
> WeakReference is an important primitive, but what people usually really
> need are weak maps, which can't be implemented on top of WeakReference (at
> least, not as exposed in PHP).
>
> This RFC proposes to add a native WeakMap type for PHP 8:
> https://wiki.php.net/rfc/weak_maps
>
> Regards,
> Nikita

Any comments on this proposal? Otherwise this could head to voting...

Nikita
  107899
December 5, 2019 17:09 php@dennis.birkholz.biz (Dennis Birkholz)
Hi Nikita,

On 04.12.19 at 19:50, Nikita Popov wrote:
>> This RFC proposes to add a native WeakMap type for PHP 8:
>> https://wiki.php.net/rfc/weak_maps
>
> Any comments on this proposal? Otherwise this could head to voting...

Thanks for this proposal, it will be really helpful!

The only caveat for me is that WeakMap is not serializable. Wouldn't it
be possible to allow serialization by just serializing it as an array
with all objects that are still valid? This would avoid serialization
errors and make using it with serialization very easy. It would not
unserialize as a WeakMap, but that is not that great a problem for me.

Or maybe it could extend SplObjectStorage (or implement the same [new]
interface) and could be serialized as an SplObjectStorage object instead
of an array.

Thanks for your great work, keep it up!

Greets
Dennis
  107900
December 6, 2019 10:29 nikita.ppv@gmail.com (Nikita Popov)
On Thu, Dec 5, 2019 at 6:09 PM Dennis Birkholz <php@dennis.birkholz.biz>
wrote:

> Hi Nikita,
>
> On 04.12.19 at 19:50, Nikita Popov wrote:
> >> This RFC proposes to add a native WeakMap type for PHP 8:
> >> https://wiki.php.net/rfc/weak_maps
> >
> > Any comments on this proposal? Otherwise this could head to voting...
>
> Thanks for this proposal, it will be really helpful!
>
> The only caveat for me is that WeakMap is not serializable. Wouldn't it
> be possible to allow serialization by just serializing it as an array
> with all objects that are still valid? This would avoid serialization
> errors and make using it with serialization very easy. It would not
> unserialize as a WeakMap, but that is not that great a problem for me.
>

This is not possible. Classes always have to serialize to themselves ;)

> Or maybe it could extend SplObjectStorage (or implement the same [new]
> interface) and could be serialized as an SplObjectStorage object instead
> of an array.
>

Could you provide some context on why you think serialization support for
WeakMap is important? As weak maps are essentially caching structures,
serializing them doesn't seem particularly useful in the first place, but
when combined with the quite unintuitive behavior the serialization would
have, I feel that it is better to leave this to the user (same as
WeakReference).

Specifically what I mean by unintuitive is this: When you do a $s =
serialize($weakMap), you'll get back a large payload string, but when you
then try to do an unserialize($s) you'll get back an empty WeakMap (or
worse: a weak map that will only become empty on the next GC), because all
of those objects will get removed as soon as unserialization is finalized.
That "works", but doesn't seem terribly useful and is likely going to be a
wtf moment.

Nikita
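
To make the "caching structure" point concrete, here is a minimal sketch,
assuming PHP 8.0 or later (where WeakMap eventually shipped). The Node class
and the variable names are hypothetical and only serve the illustration. It
shows an entry disappearing once the last strong reference to its key is
gone, which is the same mechanism that would empty a freshly unserialized
map.

```php
<?php
declare(strict_types=1);

// Hypothetical key class, used only for illustration.
final class Node
{
}

$cache = new WeakMap();

$node = new Node();
$cache[$node] = ['computed' => 42];

// One entry exists while $node is still strongly referenced.
$before = count($cache);

// Drop the only strong reference. The map does not keep its keys alive,
// so the entry is removed together with the object.
unset($node);

$after = count($cache);

printf("before=%d after=%d\n", $before, $after);
```

With the key gone, nothing is left to look the value up by, so the map
behaves like a cache that cleans itself.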
  107909
December 10, 2019 11:03 php@dennis.birkholz.biz (Dennis Birkholz)
Hi Nikita,

On 06.12.19 11:29, Nikita Popov wrote:
> Could you provide some context on why you think serialization support for
> WeakMap is important? As weak maps are essentially caching structures,
> serializing them doesn't seem particularly useful in the first place, but
> when combined with the quite unintuitive behavior the serialization would
> have, I feel that it is better to leave this to the user (same as
> WeakReference).

Structures provided by the PHP core tend to be used in the wild. As PHP
lacks a method to check whether a given object (and all other objects
contained within it) can be serialized (without traversing the complete
object graph), each new always-available data structure that is not
serializable increases the risk of encountering an object that is not
serializable. That is the reason I prefer new data structures to be
serializable.

What I see coming is something like this: some kind of object that can
contain other objects and attach some meta information to those objects
(that is stored as the value in a weak map). When an object is removed
from the collection, the meta information gets removed eventually, no
need to manually clear it in the remove-object method. If there are many
different kinds of meta information this would save a lot of code in the
remove method! Even though this seems to not be the intended use case,
programmers tend to save some keystrokes here and there. That type of
container object is not serializable unless the programmer takes extra
steps and implements serialization him/herself.

> Specifically what I mean by unintuitive is this: When you do a $s =
> serialize($weakMap), you'll get back a large payload string, but when you
> then try to do an unserialize($s) you'll get back an empty WeakMap (or
> worse: a weak map that will only become empty on the next GC), because all
> of those objects will get removed as soon as unserialization is finalized.
> That "works", but doesn't seem terribly useful and is likely going to be a
> wtf moment.

Ok, my intention was to have a more sophisticated approach: when the
WeakMap is serialized, only objects in the object graph that is
serialized are considered alive, all other objects are not serialized.
So directly serializing a WeakMap would result in an empty map, but
serializing an object that contains a list of objects and a WeakMap
containing some of the same child objects would create a meaningful
payload string, and unserialize would reconstruct the same object with
only the objects from the child list still available in the WeakMap.

I understand that this may complicate the implementation a lot (or even
not be possible). But I just want to repeat my main concern: built-in
data structures that are not serializable are a real problem for users
that use serialization extensively. Maybe the solution to that problem
is a method to check whether a provided object graph can be serialized
(which may not be possible due to throwing an exception in __sleep() or
something like that), some way to ignore unserializable elements or some
way to register callback methods to handle unserializable elements.

Anyway, thanks for taking your time and for bringing this proposal forward!

Greets
Dennis
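
The container described above, together with the "extra steps" needed to
serialize it, could look roughly like the sketch below. It assumes PHP 8.0
or later. The Collection class and all of its method names are hypothetical
and not part of the RFC. Serialization is handled by the container through
__serialize() and __unserialize(), so the WeakMap itself is never passed to
serialize().

```php
<?php
declare(strict_types=1);

// Hypothetical container: a strong list of children plus weakly keyed metadata.
final class Collection
{
    /** @var array<int, object> children keyed by spl_object_id() */
    private array $children = [];
    private WeakMap $meta;

    public function __construct()
    {
        $this->meta = new WeakMap();
    }

    public function add(object $child, array $meta): void
    {
        $this->children[spl_object_id($child)] = $child;
        $this->meta[$child] = $meta;
    }

    public function remove(object $child): void
    {
        // No metadata cleanup here: the entry disappears once the object
        // itself is destroyed, which may be later if others still hold it.
        unset($this->children[spl_object_id($child)]);
    }

    /** @return list<object> */
    public function children(): array
    {
        return array_values($this->children);
    }

    public function metaFor(object $child): ?array
    {
        return isset($this->meta[$child]) ? $this->meta[$child] : null;
    }

    public function __serialize(): array
    {
        // Store each child next to its metadata instead of the WeakMap.
        $pairs = [];
        foreach ($this->children as $child) {
            $pairs[] = [$child, $this->metaFor($child)];
        }
        return ['pairs' => $pairs];
    }

    public function __unserialize(array $data): void
    {
        $this->children = [];
        $this->meta = new WeakMap();
        foreach ($data['pairs'] as [$child, $meta]) {
            // The strong reference in $children keeps the restored key alive.
            $this->children[spl_object_id($child)] = $child;
            if ($meta !== null) {
                $this->meta[$child] = $meta;
            }
        }
    }
}

$item = new stdClass();
$item->name = 'a';

$collection = new Collection();
$collection->add($item, ['seen' => 3]);

$copy = unserialize(serialize($collection));
$restored = $copy->children()[0];
$restoredMeta = $copy->metaFor($restored);
```

Only metadata for objects that are still in the child list survives the
round trip, which matches the behavior asked for above, at the cost of
writing the two magic methods by hand.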
  107911
December 12, 2019 17:13 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Dec 10, 2019 at 12:03 PM Dennis Birkholz <php@dennis.birkholz.biz>
wrote:

> Hi Nikita,
>
> On 06.12.19 11:29, Nikita Popov wrote:
> > Could you provide some context on why you think serialization support for
> > WeakMap is important? As weak maps are essentially caching structures,
> > serializing them doesn't seem particularly useful in the first place, but
> > when combined with the quite unintuitive behavior the serialization would
> > have, I feel that it is better to leave this to the user (same as
> > WeakReference).
>
> Structures provided by the PHP core tend to be used in the wild. As PHP
> lacks a method to check whether a given object (and all other objects
> contained within it) can be serialized (without traversing the complete
> object graph), each new always-available data structure that is not
> serializable increases the risk of encountering an object that is not
> serializable. That is the reason I prefer new data structures to be
> serializable.
>
> What I see coming is something like this: some kind of object that can
> contain other objects and attach some meta information to those objects
> (that is stored as the value in a weak map). When an object is removed
> from the collection, the meta information gets removed eventually, no
> need to manually clear it in the remove-object method. If there are many
> different kinds of meta information this would save a lot of code in the
> remove method! Even though this seems to not be the intended use case,
> programmers tend to save some keystrokes here and there. That type of
> container object is not serializable unless the programmer takes extra
> steps and implements serialization him/herself.
>
> > Specifically what I mean by unintuitive is this: When you do a $s =
> > serialize($weakMap), you'll get back a large payload string, but when you
> > then try to do an unserialize($s) you'll get back an empty WeakMap (or
> > worse: a weak map that will only become empty on the next GC), because all
> > of those objects will get removed as soon as unserialization is finalized.
> > That "works", but doesn't seem terribly useful and is likely going to be a
> > wtf moment.
>
> Ok, my intention was to have a more sophisticated approach: when the
> WeakMap is serialized, only objects in the object graph that is
> serialized are considered alive, all other objects are not serialized.
> So directly serializing a WeakMap would result in an empty map, but
> serializing an object that contains a list of objects and a WeakMap
> containing some of the same child objects would create a meaningful
> payload string, and unserialize would reconstruct the same object with
> only the objects from the child list still available in the WeakMap.
>
> I understand that this may complicate the implementation a lot (or even
> not be possible).

This is indeed not possible. When we serialize the WeakMap, we do not know
what else will be serialized as well. We can only serialize everything and
let unserialization discard objects that are no longer live.

> But I just want to repeat my main concern: built-in
> data structures that are not serializable are a real problem for users
> that use serialization extensively. Maybe the solution to that problem
> is a method to check whether a provided object graph can be serialized
> (which may not be possible due to throwing an exception in __sleep() or
> something like that), some way to ignore unserializable elements or some
> way to register callback methods to handle unserializable elements.
>

I must be missing something obvious here: Isn't this a reliable way to
detect whether an object graph is serializable?

    try {
        $serialized = serialize($value);
    } catch (\Throwable $e) {
        // not serializable
    }

Nikita
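
The try/catch check above can be wrapped in a small helper. This is a
sketch, assuming PHP 8.0 or later; the function name isSerializable() is
hypothetical. Note that it still walks and serializes the complete object
graph, so it does not avoid the traversal cost raised earlier in the
thread, and any side effects in __sleep() or __serialize() will run.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reports whether serialize() succeeds for $value.
 * The full serialization is performed and its result thrown away.
 */
function isSerializable(mixed $value): bool
{
    try {
        serialize($value);
        return true;
    } catch (\Throwable $e) {
        // serialize() throws for values such as closures or WeakMap.
        return false;
    }
}

$plainResult   = isSerializable(['a' => 1, 'b' => new stdClass()]);
$closureResult = isSerializable([function (): void {}]);
$weakMapResult = isSerializable(new WeakMap());

var_dump($plainResult, $closureResult, $weakMapResult);
```

In PHP 8 both the closure and the WeakMap make serialize() throw, so the
helper reports them as not serializable, while the plain array passes.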