Making all Traversables an Iterator or IteratorAggregate

  108970
March 11, 2020 09:50 nikita.ppv@gmail.com (Nikita Popov)
Hi internals,

Userland classes that implement Traversable must do so either through
Iterator or IteratorAggregate. The same requirement does not exist for
internal classes: They can implement the internal get_iterator mechanism,
without exposing either the Iterator or IteratorAggregate APIs. This makes
them usable in get_iterator(), but incompatible with any Iterator based
APIs.

A lot of internal classes do this, because exposing the userland APIs is
simply a lot of work. I would like to add a general mechanism to make this
simpler: https://github.com/php/php-src/pull/5216 adds a generic
"InternalIterator" class, that essentially converts the internal
get_iterator interface into a proper Iterator. Internal classes then only
need to a) implement the IteratorAggregate interface and b) add a
getIterator() method with an implementation looking like this:

// WeakMap::getIterator(): Iterator
ZEND_METHOD(WeakMap, getIterator)
{
    if (zend_parse_parameters_none() == FAILURE) {
        return;
    }
    zend_create_internal_iterator_zval(return_value, ZEND_THIS);
}

This allows internal classes to trivially implement IteratorAggregate, and
as such allows us to enforce that all Traversables implement Iterator or
IteratorAggregate.

Regards,
Nikita
  110134
May 12, 2020 08:26 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Mar 11, 2020 at 10:50 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > Userland classes that implement Traversable must do so either through > Iterator or IteratorAggregate. The same requirement does not exist for > internal classes: They can implement the internal get_iterator mechanism, > without exposing either the Iterator or IteratorAggregate APIs. This makes > them usable in get_iterator(), but incompatible with any Iterator based > APIs. >
This should have said: "This makes them usable in foreach(), but incompatible with any Iterator based APIs." A lot of internal classes do this, because exposing the userland APIs is
> simply a lot of work. I would like to add a general mechanism to make this > simpler: https://github.com/php/php-src/pull/5216 adds a generic > "InternalIterator" class, that essentially converts the internal > get_iterator interface into a proper Iterator. Internal classes then only > need to a) implement the IteratorAggregate interface and b) add a > getIterator() method with an implementation looking like this: > > // WeakMap::getIterator(): Iterator > ZEND_METHOD(WeakMap, getIterator) > { > if (zend_parse_parameters_none() == FAILURE) { > return; > } > zend_create_internal_iterator_zval(return_value, ZEND_THIS); > } > > This allows internal classes to trivially implement IteratorAggregate, and > as such allows us to enforce that all Traversables implement Iterator or > IteratorAggregate. >
Does anyone have thoughts on this change? Mostly this is a feature for extensions, but also user-visible in that a bunch of classes will switch from being Traversable to being IteratorAggregate. We may also want to convert some existing Iterators to IteratorAggregates. For example SplFixedArray currently implements Iterator, which means that it's not possible to have nested loops over SplFixedArray. We could now easily fix this by switching it to use IteratorAggregate, which will allow multiple parallel iterators to work on the same array. Of course, there is BC break potential in such a change. There's some bikeshed potential here regarding the class name. I picked "InternalIterator" as an iterator for internal classes, but "internal iteration" is also a technical term (the opposite of "external iteration"), so maybe that name isn't ideal. Regards, Nikita
  110136
May 12, 2020 12:49 pollita@php.net (Sara Golemon)
On Tue, May 12, 2020 at 3:26 AM Nikita Popov ppv@gmail.com> wrote:
> // WeakMap::getIterator(): Iterator > ZEND_METHOD(WeakMap, getIterator) > { > if (zend_parse_parameters_none() == FAILURE) { > return; > } > zend_create_internal_iterator_zval(return_value, ZEND_THIS); > } > Given that the body of this method seems to usually (always?) be the same,
why not define it in InternalIterator and allow it to be inherited?
> There's some bikeshed potential here regarding the class name. > Not personally over-picky, but I do agree that "Internal*" is a bit
awkward. Unfortunately there's not much that's better and appropriate for exposing to scripts. This might be one of those rare occasions where exposing "Zend" into userspace makes sense. "PHP" is overloaded into several meanings that are significant for userspace developers, but something like "ZendIterator" might convey the right level of "This has to do with the engine, please move along". Or maybe go verbose: 'IteratorForExtensionClassImplementations'. :p
> ZEND_ASSERT(scope->get_iterator != zend_user_it_get_new_iterator); > Does this mean that if I do `class Foo implements InternalIterator {}` in a
script, I can assert (crash) PHP? I don't see anything obvious (at a glance) preventing me from inheriting from InternalIterator. -Sara
  110138
May 12, 2020 13:08 bobwei9@hotmail.com (Bob Weinand)
Hey Sara,

> Am 12.05.2020 um 14:49 schrieb Sara Golemon <pollita@php.net>: > > On Tue, May 12, 2020 at 3:26 AM Nikita Popov ppv@gmail.com> wrote: >> // WeakMap::getIterator(): Iterator >> ZEND_METHOD(WeakMap, getIterator) >> { >> if (zend_parse_parameters_none() == FAILURE) { >> return; >> } >> zend_create_internal_iterator_zval(return_value, ZEND_THIS); >> } >> > Given that the body of this method seems to usually (always?) be the same, > why not define it in InternalIterator and allow it to be inherited?
Good idea.
>> There's some bikeshed potential here regarding the class name. >> > Not personally over-picky, but I do agree that "Internal*" is a bit > awkward. Unfortunately there's not much that's better and appropriate for > exposing to scripts. This might be one of those rare occasions where > exposing "Zend" into userspace makes sense. "PHP" is overloaded into > several meanings that are significant for userspace developers, but > something like "ZendIterator" might convey the right level of "This has to > do with the engine, please move along". Or maybe go verbose: > 'IteratorForExtensionClassImplementations'. :p
We call the engine Zend, but eih, that's a rather internal detail we mostly don't leak into userland, and I wouldn't do it here either. InternalIterator is the better choice I think.
>> ZEND_ASSERT(scope->get_iterator != zend_user_it_get_new_iterator); > > Does this mean that if I do `class Foo implements InternalIterator {}` in a > script, I can assert (crash) PHP? I don't see anything obvious (at a > glance) preventing me from inheriting from InternalIterator.
The class is marked final, so userland won't be able to directly extend it. (zend_ce_internal_iterator->ce_flags |= ZEND_ACC_FINAL;) Bob
  110139
May 12, 2020 13:36 Danack@basereality.com (Dan Ackroyd)
On Tue, 12 May 2020 at 13:49, Sara Golemon <pollita@php.net> wrote:
> > This might be one of those rare occasions where > exposing "Zend" into userspace makes sense.
I think 'Engine' would be easier for userland people to grok. Also, for those brave souls who are implementing PHP in other technologies, avoiding using an implementation detail of the common internal engine, would be better for them. cheers Dan Ack
  110141
May 12, 2020 13:57 derick@php.net (Derick Rethans)
On Tue, 12 May 2020, Sara Golemon wrote:

> On Tue, May 12, 2020 at 3:26 AM Nikita Popov ppv@gmail.com> wrote: > > // WeakMap::getIterator(): Iterator > > ZEND_METHOD(WeakMap, getIterator) > > { > > if (zend_parse_parameters_none() == FAILURE) { > > return; > > } > > zend_create_internal_iterator_zval(return_value, ZEND_THIS); > > } > > > Given that the body of this method seems to usually (always?) be the same, > why not define it in InternalIterator and allow it to be inherited? > > > There's some bikeshed potential here regarding the class name. > > > Not personally over-picky, but I do agree that "Internal*" is a bit > awkward. Unfortunately there's not much that's better and appropriate for > exposing to scripts. This might be one of those rare occasions where > exposing "Zend" into userspace makes sense. "PHP" is overloaded into > several meanings that are significant for userspace developers, but > something like "ZendIterator" might convey the right level of "This has to > do with the engine, please move along". Or maybe go verbose: > 'IteratorForExtensionClassImplementations'. :p
I do not believe we should expose the Zend term into userland. Let the bike shedding start: Perhaps just "EngineIterator". cheers, Derick -- PHP 7.4 Release Manager Host of PHP Internals News: https://phpinternals.news Like Xdebug? Consider supporting me: https://xdebug.org/support https://derickrethans.nl | https://xdebug.org | https://dram.io twitter: @derickr and @xdebug
  110142
May 12, 2020 14:02 nicolas.grekas+php@gmail.com (Nicolas Grekas)
> I do not believe we should expose the Zend term into userland. >
Dunno if it helps or not, but there is a precedent here: ReflectionZendExtension Nicolas