Is reference counting necessary for a PHP implementation?

  100788
September 27, 2017 17:57 sid.kshatriya@gmail.com (Sidharth Kshatriya)
In:

https://github.com/php/php-langspec/blob/master/spec/04-basic-concepts.md#reclamation-and-automatic-memory-management

> Despite the use of the term refcount, conforming implementations are not
> required to use a reference counting-based implementation for automatic
> memory management.

Is this statement correct? If I understand correctly, many PHP projects
depend on the deterministic firing of the `__destruct()` function to clean
up SQL transactions, connections, and so forth. HHVM elaborates on this:

> Eliminating destructors. Deterministic object destruction is the reason
> why nonscalar PHP values require precise reference counting. This
> requirement has long been, and continues to be, a sizable performance
> bottleneck in our optimized JIT-compiled code. Using garbage collection
> instead could unlock measurable performance improvements, and the behavior
> of destructors could be closely imitated by a combination of try/finally
> and other new language constructs.

(from http://hhvm.com/blog/2017/09/18/the-future-of-hhvm.html )

I'm curious about the answer here. Is ref counting necessary for all PHP
implementations, as HHVM claims?

Thanks,

Sidharth

  100789
September 27, 2017 19:34 ajf@ajf.me (Andrea Faulds)
Hi there,

Sidharth Kshatriya wrote:
> In:
>
> https://github.com/php/php-langspec/blob/master/spec/04-basic-concepts.md#reclamation-and-automatic-memory-management
>
>> Despite the use of the term refcount, conforming implementations are not
>> required to use a reference counting-based implementation for automatic
>> memory management.
>
> Is this statement correct? If I understand correctly many PHP projects
> depend on the deterministic firing of `__destruct()` function to cleanup
> SQL transactions or connections and so forth.

It's __destruct() that is the problem, yes. If “running PHP code that
relies on deterministic __destruct()” is what you mean by “a PHP
implementation”, then yes, it's necessary. If not, then no. :)

--
Andrea Faulds
https://ajf.me/

  100790
September 27, 2017 19:39 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> Is this statement correct? If I understand correctly many PHP projects
> depend on the deterministic firing of `__destruct()` function to cleanup
> SQL transactions or connections and so forth.

Yes. But, strictly speaking, you do not have to use refcounting
specifically (i.e. having some value that is incremented by 1 each time a
reference is added and decremented by 1 each time a reference is removed)
to achieve that. Pretty much none of the code requires that there be an
actual counter, only that the system behave as if there were one. Of
course, in this case actually having the counter is the most natural way
of doing it :) But if you find some other way of achieving the same
semantics, I think it'd still be OK.

> instead could unlock measurable performance improvements, and the behavior
> of destructors could be closely imitated by a combination of try/finally
> and other new language constructs.

I am not sure I am convinced by this statement, and I am not sure how RAII
patterns would work in GC-based environments (i.e. it seems to me they
won't). That being said, many resource allocation scenarios can be
implemented without RAII (in fact, C doesn't have RAII at all, and people
still manage to work with it :).

--
Stas Malyshev
smalyshev@gmail.com

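For readers unfamiliar with the try/finally idea from the HHVM quote, here is
a minimal sketch of scope-bound cleanup. It is written in Python only as a
neutral illustration (PHP has try/finally of the same shape); the
`Transaction` class, its methods, and `transfer()` are hypothetical names made
up for this example and are not from HHVM or any PHP library.

```python
class Transaction:
    """Hypothetical stand-in for a database transaction handle."""

    def __init__(self, log):
        self.log = log
        self.state = "open"
        self.log.append("begin")

    def commit(self):
        self.state = "committed"
        self.log.append("commit")

    def close(self):
        # Roll back anything left uncommitted, as a destructor might do.
        if self.state == "open":
            self.state = "rolled back"
            self.log.append("rollback")
        self.log.append("close")


def transfer(log, fail):
    tx = Transaction(log)
    try:
        if fail:
            raise RuntimeError("simulated failure")
        tx.commit()
    finally:
        # Runs on every exit path, at a known point, without depending on
        # when (or whether) the runtime reclaims `tx`.
        tx.close()


ok_log, bad_log = [], []
transfer(ok_log, fail=False)
try:
    transfer(bad_log, fail=True)
except RuntimeError:
    pass
print(ok_log)   # ['begin', 'commit', 'close']
print(bad_log)  # ['begin', 'rollback', 'close']
```

The cleanup here is tied to a lexical block rather than to the object's
lifetime, so it does not cover an object that escapes the function (stored in
a property, returned to the caller), which is roughly the RAII concern raised
above.
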
  100792
September 28, 2017 05:55 sid.kshatriya@gmail.com (Sidharth Kshatriya)
The incrementing of the counter is the easy part.

In ref counting, when the counter for a non-scalar (object, array, etc.)
is decremented and reaches zero, we need to visit every non-scalar it
references and decrement those counters too. If any of them reach zero in
turn, we need to follow the non-scalars they point to, and so forth
recursively. So the number of objects touched by any operation that
results in a counter decrement cannot be bounded in advance. Perhaps that
is why HHVM is claiming that ref counting can be a performance hit?
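
The cascade described above can be sketched with a toy model. This is not
Zend or HHVM code: `Obj`, `incref`, `decref`, and the `destroyed` list are
names invented for this illustration, and cycles are deliberately ignored.

```python
class Obj:
    """Toy refcounted value; `children` are the non-scalars it references."""

    def __init__(self, name, children=()):
        self.name = name
        self.refcount = 0
        self.children = list(children)
        for child in self.children:
            incref(child)


def incref(obj):
    obj.refcount += 1  # the easy part: constant work


def decref(obj, destroyed):
    """Drop one reference; return how many objects had to be touched."""
    touched = 0
    worklist = [obj]  # explicit stack instead of recursion
    while worklist:
        current = worklist.pop()
        current.refcount -= 1
        touched += 1
        if current.refcount == 0:
            destroyed.append(current.name)  # where __destruct() would fire
            worklist.extend(current.children)
    return touched


shared = Obj("shared")
incref(shared)                     # an outside variable also holds `shared`
leaf = Obj("leaf")
mid = Obj("mid", [leaf, shared])
root = Obj("root", [mid])
incref(root)                       # the variable holding `root`

destroyed = []
touched = decref(root, destroyed)  # e.g. that variable goes out of scope
print(touched)          # 4
print(destroyed)        # ['root', 'mid', 'leaf']
print(shared.refcount)  # 1
```

A single decrement on `root` touches four objects here; with a longer chain
the same call would touch correspondingly more, which is the unpredictability
in question.
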

The point here is that, since we need the destructor to fire
deterministically, there seems to be no way to avoid reference counting
semantics. We can, however, choose not to free memory while doing
reference updates and instead wait for a later full garbage collection
pass. I don't know what the specific performance implications of this
might be. That garbage collection pass would also automatically deal with
circular references. (As far as I know, the Zend PHP implementation frees
objects during the reference decrement and only deals with cycles during
its "gc" pass.)
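
One way to picture the "destruct now, free later" split is the toy model
below. It is only a sketch under assumptions: the `Heap` class, its methods,
and the `events` log are hypothetical, and the later pass shown here merely
reclaims already-destructed entries (it does not detect cycles, which a real
collector pass would).

```python
class Heap:
    """Toy heap: destructors run at refcount zero, memory is freed later."""

    def __init__(self):
        self.refcount = {}      # name -> count
        self.children = {}      # name -> names of referenced non-scalars
        self.events = []        # ordered log of what happened
        self.pending_free = []  # destructed but not yet reclaimed

    def new(self, name, children=()):
        self.refcount[name] = 0
        self.children[name] = list(children)
        for child in children:
            self.refcount[child] += 1

    def incref(self, name):
        self.refcount[name] += 1

    def decref(self, name):
        worklist = [name]
        while worklist:
            current = worklist.pop()
            self.refcount[current] -= 1
            if self.refcount[current] == 0:
                self.events.append("destruct " + current)  # deterministic
                self.pending_free.append(current)          # memory kept for now
                worklist.extend(self.children[current])

    def collect(self):
        """Later pass: reclaim everything that was already destructed."""
        for name in self.pending_free:
            self.events.append("free " + name)
            del self.refcount[name], self.children[name]
        self.pending_free.clear()


heap = Heap()
heap.new("conn")
heap.new("tx", ["conn"])
heap.incref("tx")   # a local variable holds tx
heap.decref("tx")   # the variable goes away: destructors fire immediately
heap.events.append("-- more program work --")
heap.collect()      # memory is reclaimed only here
print(heap.events)
# ['destruct tx', 'destruct conn', '-- more program work --', 'free tx', 'free conn']
```

Note that `decref` still walks the whole chain at decrement time; only the
reclamation of memory moves to the later pass.
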

TL;DR: It seems to me that we need to follow full reference counting
semantics, which involves following non-scalar chains of indeterminate
length, even though we may choose not to actually free the memory
occupied by the non-scalars at that point.

Thanks,

Sidharth



