RE: [PHP-DEV] Exposing object handles to userland

  99751
July 5, 2017 00:01 tysonandre775@hotmail.com (tyson andre)
There was a proposal back in 2015 to implement
a function `spl_object_id(object $o) : int`,
which directly returns the object handle
(similar to `spl_object_hash`, but as an integer, not a string).
I'm interested in finishing the implementation of spl_object_id for PHP 7.2.
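
For illustration, here is a minimal sketch of how the proposed function compares with `spl_object_hash`. It assumes a PHP build that already provides `spl_object_id()` (7.2 or later); the helper name `object_identifiers` is made up for this example and is not part of the proposal.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns both identifiers for one object
 * so they can be compared side by side.
 */
function object_identifiers(object $o): array
{
    return [
        'id'   => spl_object_id($o),   // integer object handle
        'hash' => spl_object_hash($o), // string form of the same identity
    ];
}

$a = new stdClass();
$b = new stdClass();

$idsA = object_identifiers($a);
$idsB = object_identifiers($b);

// The id is a plain integer, so it can be used directly as an array key.
var_dump(is_int($idsA['id']));
// The hash is a string and therefore larger to store and compare.
var_dump(strlen($idsA['hash']));
// Two objects that are alive at the same time have different ids.
var_dump($idsA['id'] !== $idsB['id']);
```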

I already have working code implementing spl_object_id() at
https://github.com/TysonAndre/php-src/pull/1
The implementation XORs the object handle with the 
exact same random bits that `spl_object_hash` would.

Previous emails from 2015 can be seen here:
- https://marc.info/?t=143835274500003&r=1&w=2

Previous comment by a PHP maintainer in support of `spl_object_id()`
- https://marc.info/?l=php-internals&m=143837339210596&w=2

I'm unsure if an RFC is necessary. I have two pending questions.

- Can two objects have the same object id
  but different object handlers?
  (e.g. iterators of some built in classes?)
  I'm not familiar enough with PHP's history to be sure.
- Can the largest object handle be larger
  than the size of `zend_long` in 32-bit systems?

Example places where this would be useful:

1. https://marc.info/?l=php-internals&m=143849841618494&w=2

2. I also recently wanted to track a large number of (cloneable)
   small sets of objects in an application that sometimes used a lot of memory,
   and the fact that arrays support copy on write helped save memory
   relative to SplObjectHash if arrays and integer keys were used.
   See https://github.com/etsy/phan/pull/729#issuecomment-299289378
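
A rough sketch of the second use case, assuming `spl_object_id()` is available: a set of objects stored as a plain array keyed by object id. The function names `object_set_with` and `object_set_contains` are hypothetical and are not taken from the linked pull request.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns a copy of $set with $o added.
 * The set is a plain array keyed by spl_object_id(). Storing the object
 * as the value keeps it alive, so its id cannot be reused while it is
 * a member of the set.
 */
function object_set_with(array $set, object $o): array
{
    $set[spl_object_id($o)] = $o;
    return $set;
}

/** Hypothetical helper: membership test by object id. */
function object_set_contains(array $set, object $o): bool
{
    return isset($set[spl_object_id($o)]);
}

$x = new stdClass();
$y = new stdClass();

$base = object_set_with([], $x);

// Plain assignment shares the underlying array until one side is modified
// (copy on write), so duplicating a set costs little until it diverges.
$derived = $base;
$derived = object_set_with($derived, $y);

var_dump(object_set_contains($base, $y));    // $base was not modified
var_dump(object_set_contains($derived, $y)); // $derived has the new member
```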

- Tyson Andre (tandre)
  99752
July 5, 2017 03:27 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> - Can two objects have the same object id
>   but different object handlers?
>   (e.g. iterators of some built in classes?)
>   I'm not familiar enough with PHP's history to be sure.

Yes, if an extension using non-standard handlers is in use.

> - Can the largest object handle be larger
>   than the size of `zend_long` in 32-bit systems?

The handle is uint32_t, so probably not.

--
Stas Malyshev
smalyshev@gmail.com
  99756
July 5, 2017 08:38 weltling@outlook.de (Anatol Belski)
Hi,

> -----Original Message-----
> From: Stanislav Malyshev [mailto:smalyshev@gmail.com]
> Sent: Wednesday, July 5, 2017 5:28 AM
> To: tyson andre <tysonandre775@hotmail.com>; internals@lists.php.net
> Subject: Re: [PHP-DEV] Exposing object handles to userland
>
> Hi!
>
> > - Can two objects have the same object id
> >   but different object handlers?
> >   (e.g. iterators of some built in classes?)
> >   I'm not familiar enough with PHP's history to be sure.
>
> Yes, if extension using non-standard handlers is in use.
>
> > - Can the largest object handle be larger
> >   than the size of `zend_long` in 32-bit systems?
>
> Handle is uint32_t, so probably no.

On 32-bit, zend_long is a signed 32-bit int, so it can theoretically overflow, while the sizeof is the same.

Regards

Anatol
  99777
July 5, 2017 20:22 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> On 32-bit zend_long is a signed 32-bit int, so it can theoretically overflow, while sizeof is same.
Well, it's the same issue we have when representing any unsigned values, I guess. Since int<->uint in this case is one-to-one, it should be OK to just use the negative numbers, if they are used as IDs and not to calculate anything, etc. So I don't think it's much of a problem here.

--
Stas Malyshev
smalyshev@gmail.com
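
The one-to-one mapping between unsigned and signed 32-bit values can be sketched in userland as follows. This is only an illustration of the arithmetic, not engine code: it assumes a 64-bit PHP build (so that every uint32 value fits in an int), and both function names are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: 64-bit PHP, where 0x100000000 is still an integer literal.

/** Hypothetical helper: reinterpret an unsigned 32-bit value as signed. */
function uint32_to_int32(int $u): int
{
    // Values with the top bit set wrap around to negative numbers.
    return $u >= 0x80000000 ? $u - 0x100000000 : $u;
}

/** Hypothetical helper: the inverse mapping, signed back to unsigned. */
function int32_to_uint32(int $s): int
{
    return $s < 0 ? $s + 0x100000000 : $s;
}

// Distinct unsigned handles stay distinct after the conversion, so the
// negative results are still usable as ids (just not for arithmetic).
var_dump(uint32_to_int32(0x7FFFFFFF));
var_dump(uint32_to_int32(0x80000000));
var_dump(uint32_to_int32(0xFFFFFFFF));
```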
  99755
July 5, 2017 08:23 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Jul 5, 2017 at 2:01 AM, tyson andre <tysonandre775@hotmail.com>
wrote:

> There was a proposal back in 2015 to implement
> a function `spl_object_id(object $o) : int`,
> which directly returns the object handle
> (similar to `spl_object_hash`, but as an integer, not a string).
> I'm interested in finishing implementing spl_object_id for php 7.2
>
> I already have working code implementing spl_object_id() at
> https://github.com/TysonAndre/php-src/pull/1
> The implementation XORs the object handle with the
> exact same random bits that `spl_object_hash` would.

You can drop the masking. It was never effective at what it's supposed to do (hide memory addresses), but as this is the object ID only, it is completely unnecessary here.

> Previous emails from 2015 can be seen here:
> - https://marc.info/?t=143835274500003&r=1&w=2
>
> Previous comment by a PHP maintainer in support of `spl_object_id()`
> - https://marc.info/?l=php-internals&m=143837339210596&w=2
>
> I'm unsure if an RFC is necessary. I have two pending questions.

I'm +1 on the addition and would be fine with including it without RFC, if there are no objections on internals.

> - Can two objects have the same object id
>   but different object handlers?
>   (e.g. iterators of some built in classes?)
>   I'm not familiar enough with PHP's history to be sure.

No: In PHP 7 this is not possible, which is also why spl_object_hash() no longer includes the handlers.

> - Can the largest object handle be larger
>   than the size of `zend_long` in 32-bit systems?

Only in the sense that it could theoretically wrap around to negative numbers. Of course those would still serve as IDs just as well.
  99776
July 5, 2017 20:21 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> No: In PHP 7 this is not possible, which is also why spl_object_hash() no
> longer includes the handlers.

Ah, I missed that part.

--
Stas Malyshev
smalyshev@gmail.com
  99783
July 6, 2017 05:23 tysonandre775@hotmail.com (tyson andre)
Updated https://github.com/TysonAndre/php-src/pull/1 , which is now much shorter.

In response to Nikita Popov's comments:

> I'm +1 on the addition and would be fine with including it without RFC, if
> there are no objections on internals.

How long should I wait to see if there are objections before creating a pull request?

> > - Can two objects have the same object id
> >   but different object handlers?
>
> No: In PHP 7 this is not possible, which is also why spl_object_hash() no
> longer includes the handlers.

True. Checking again, the implementation of spl_object_hash doesn't include the handlers, so I don't need to worry.

> You can drop the masking. It was never effective at what it's supposed to
> do (hide memory addresses), but as this is the object ID only, it is
> completely unnecessary here.
> [...]
> Only in the sense that it could theoretically wrap around to negative
> numbers. Of course those would still serve as IDs just as well.

That makes sense. I'm omitting the XOR (and returning the unobfuscated object handle/id) in the proposed change, then.
  99787
July 6, 2017 06:46 michal.brzuchalski@gmail.com (Michał Brzuchalski)
On 06.07.2017 07:24, "tyson andre" <tysonandre775@hotmail.com> wrote:

> Updated https://github.com/TysonAndre/php-src/pull/1 , which is now much shorter.
>
> In response to Nikita Popov's comments:
>
> > I'm +1 on the addition and would be fine with including it without RFC, if
> > there are no objections on internals.
>
> How long should I wait to see if there are objections before creating a pull request?

AFAIK you can create a PR any time.
  99830
July 11, 2017 04:59 tysonandre775@hotmail.com (tyson andre)
>> How long should I wait to see if there are objections before creating a pull request?
> AFAIK you can create PR any time.

I created a PR implementing spl_object_id(object $o) : int several days ago, at https://github.com/php/php-src/pull/2611
  99831
July 11, 2017 09:52 ocramius@gmail.com (Marco Pivetta)
Asking here, since it's not clear to me: is this a good/fitting replacement
for `spl_object_hash()`?

On 11 Jul 2017 7:00 AM, "tyson andre" <tysonandre775@hotmail.com> wrote:

> >> How long should I wait to see if there are objections before creating a pull request?
> > AFAIK you can create PR any time.
>
> I created a PR implementing spl_object_id(object $o) : int several days ago,
> at https://github.com/php/php-src/pull/2611
  99832
July 11, 2017 10:21 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Jul 11, 2017 at 11:52 AM, Marco Pivetta <ocramius@gmail.com> wrote:

> Asking here, since it's not clear to me: is this a good/fitting
> replacement for `spl_object_hash()`?
>
> On 11 Jul 2017 7:00 AM, "tyson andre" <tysonandre775@hotmail.com> wrote:
>
>> I created a PR implementing spl_object_id(object $o) : int several days ago,
>> at https://github.com/php/php-src/pull/2611

Yes. spl_object_id() provides the same guarantees as spl_object_hash(), but is much faster and has a more compact output (an integer instead of an, iirc, 32-byte string).

Nikita
  99837
July 11, 2017 16:44 kontakt@beberlei.de (Benjamin Eberlei)
On Tue, Jul 11, 2017 at 12:21 PM, Nikita Popov <nikita.ppv@gmail.com> wrote:

> On Tue, Jul 11, 2017 at 11:52 AM, Marco Pivetta <ocramius@gmail.com> wrote:
>
> > Asking here, since it's not clear to me: is this a good/fitting
> > replacement for `spl_object_hash()`?
>
> Yes. spl_object_id() provides the same guarantees as spl_object_hash() --
> but is much faster and has a more compact output (an integer instead of an
> iirc 32 byte string).

Just to clarify, that means object ids are still re-used between objects if one gets destroyed? The initial bug report was about creating a second function that solves this problem.
  99845
July 11, 2017 20:56 tysonandre775@hotmail.com (tyson andre)
> Just to clarify, that means object ids are still re-used between
> objects if one gets destroyed?
> The initial bug report was about creating a second function that solves this problem.

Correct. This is intended to return the object handles.
The most recent discussion from 2015 was only about object handles.
An intended side effect is that object ids are re-used, the same way spl_object_hash values are re-used.

- i.e. this is not intended to solve https://bugs.php.net/bug.php?id=52657 ,
  and NEWS should not link to that bug id.

Unique integers wouldn't work in long-running processes on 32-bit builds:

- After ~4 billion objects were constructed, the 32-bit integer would be reused.

As for a globally unique object id (let's call that spl_object_uuid), I don't think it'll ever get included.
I also don't have any need for it.

- It would require adding an additional field to zend_object (e.g. an increasing 64-bit integer),
  which would hurt performance and increase overall memory usage.
- No other programming languages I'm aware of have that
  (a globally unique object id, even after the object is garbage collected) as a native feature.
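
To make the reuse behaviour concrete, here is a small sketch, assuming `spl_object_id()` is available. The class `ObjectDataRegistry` is hypothetical; it shows one way to attach data to objects by id without being affected by reuse, namely by holding a reference to the object next to the data.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical registry that attaches data to objects by id.
 * Each entry stores the object itself alongside the data, which keeps the
 * object alive and so prevents its id from being given to another object.
 */
final class ObjectDataRegistry
{
    /** @var array<int, array> */
    private $entries = [];

    public function set(object $o, $data): void
    {
        $this->entries[spl_object_id($o)] = [$o, $data];
    }

    public function get(object $o)
    {
        $entry = $this->entries[spl_object_id($o)] ?? null;
        return $entry === null ? null : $entry[1];
    }
}

$first = new stdClass();
$firstId = spl_object_id($first);
unset($first); // the object is destroyed and its handle becomes free

$second = new stdClass();
// The engine may hand the freed handle to the new object, so an id that was
// remembered without a reference to its object can end up naming $second.
var_dump(spl_object_id($second) === $firstId);
```

The trade-off is that the registry keeps its objects from being freed until their entries are removed.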
  99833
July 11, 2017 11:01 pollita@php.net (Sara Golemon)
On Tue, Jul 4, 2017 at 8:01 PM, tyson andre <tysonandre775@hotmail.com> wrote:
> I'm unsure if an RFC is necessary. I have two pending questions.

Unless someone insists upon it, I'll put my release manager hat on and
say I'm fine with it just going in without an RFC.

-Sara