[RFC] Global functions any() and all() on iterables

  111756
August 31, 2020 23:56 tysonandre775@hotmail.com (tyson andre)
Hi internals,

I've created an RFC for https://wiki.php.net/rfc/any_all_on_iterable

This was proposed 2 days ago in https://externals.io/message/111711 with some interest
("Proposal: Adding functions any(iterable $input, ?callable $cb = null, int $use_flags=0) and all(...)")

- The $use_flags parameter was removed

The primitives any() and all() are a common part of many programming languages and help in avoiding verbosity or unnecessary abstractions.

- https://hackage.haskell.org/package/base-4.14.0.0/docs/Prelude.html#v:any
- https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/some
- https://docs.python.org/3/library/functions.html#all
- https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#allMatch-java.util.function.Predicate-

For example, the following code could be shortened significantly

```
// Old version
$satisfies_predicate = false;
foreach ($item_list as $item) {
    if (API::satisfiesCondition($item)) {
        $satisfies_predicate = true;
        break;
    }
}
if (!$satisfies_predicate) {
    throw new APIException("No matches found");
}

// New version is much shorter and more readable
if (!any($item_list, fn($item) => API::satisfiesCondition($item))) {
    throw new APIException("No matches found");
}
```

That example doesn't have any suitable helpers already in the standard library.
Using array_filter would unnecessarily call satisfiesCondition even after the first item was found,
and array_search doesn't take a callback.
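
The short-circuiting difference described above can be sketched in userland. This is a minimal illustration, not the implementation from the linked PR: it assumes that a null callback means "test each value for truthiness" (by analogy with array_filter()), and the `$isEven` predicate and call counter are made up for the demonstration.

```php
<?php
// Userland sketch of the proposed any()/all(); NOT the code from the PR.
// Assumption: with no callback, each value is tested for truthiness.

function any(iterable $input, ?callable $callback = null): bool
{
    foreach ($input as $value) {
        if ($callback !== null ? $callback($value) : $value) {
            return true; // short-circuit: later elements are never visited
        }
    }
    return false;
}

function all(iterable $input, ?callable $callback = null): bool
{
    foreach ($input as $value) {
        if (!($callback !== null ? $callback($value) : $value)) {
            return false; // short-circuit on the first failing element
        }
    }
    return true;
}

// Hypothetical predicate that counts how often it is invoked.
$calls = 0;
$isEven = function (int $n) use (&$calls): bool {
    $calls++;
    return $n % 2 === 0;
};

$found = any([1, 2, 3, 4], $isEven); // stops once it reaches 2
$callsAny = $calls;

$calls = 0;
$filtered = array_filter([1, 2, 3, 4], $isEven); // visits every element
$callsFilter = $calls;
```

Here the predicate runs twice under any() and four times under array_filter(), which is the extra work the paragraph above refers to.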

A proposed implementation is https://github.com/php/php-src/pull/6053 - it takes similar flags and param orders to array_filter().

Thanks,
- Tyson
  111758
September 1, 2020 01:54 matthewmatthew@gmail.com (Matthew Brown)
This would be a fantastic addition, and it would also alleviate an issue in
static analysis land where it's very tricky (in the general case) to verify
exactly what implications successfully completing a given foreach loop has
on the array being iterated over (e.g.
https://github.com/vimeo/psalm/issues/649).

Using this function would make user intent much clearer to static analysis
tools, and also (as the RFC describes) increase code legibility.


  111760
September 1, 2020 07:30 nikita.ppv@gmail.com (Nikita Popov)
To be in line with naming conventions, I would suggest calling these iter_any() and iter_all(), using iter_* as the prefix for our future additions to the "functions that work on arbitrary iterables" space. iterable_any() and iterable_all() would work as well, though are unnecessarily verbose.

Regards,
Nikita
  111761
September 1, 2020 07:57 kjarli@gmail.com (Lynn)
On Tue, Sep 1, 2020 at 9:31 AM Nikita Popov <nikita.ppv@gmail.com> wrote:

> To be in line with naming conventions, I would suggest calling these
> iter_any() and iter_all(), using iter_* as the prefix for our future
> additions to the "functions that work on arbitrary iterables" space.
> iterable_any() and iterable_all() would work as well, though are
> unnecessarily verbose.
>
> Regards,
> Nikita

Hi,

I would very much prefer the verbose versions of the two, "iter" doesn't mean much to me, and I can imagine it's confusing to newer developers as well. I'd rather not have to guess what it means when reading it, as it could also mean something I've never heard of.

Though if the choice arises, I would prefer "any" and "all" over "iterable_any" and "iterable_all" for the sake of simplicity.
  111771
September 1, 2020 23:54 tysonandre775@hotmail.com (tyson andre)
Hi Lynn and Nikita,

> To be in line with naming conventions, I would suggest calling these
> iter_any() and iter_all(), using iter_* as the prefix for our future
> additions to the "functions that work on arbitrary iterables" space.
> iterable_any() and iterable_all() would work as well, though are
> unnecessarily verbose.

I'd also feel like iter_any() could get confused with iterator_apply() for Traversable in terms of whether it's shorthand for iterator or iterable when learning the language.

There's been more discussion than I expected on this, arguments for both options, and I'm not sure what the overall sentiment of voters is.
I was considering creating a straw poll on the wiki with 2 questions to guide the final version of the RFC/PR:

1. Name choice (iterator_all(), iter_all(), or all(), or no interest in adding the functionality)
2. Interest in future RFCs to extend support to keys and entries

Thanks,
- Tyson
  111814
September 3, 2020 12:18 nikita.ppv@gmail.com (Nikita Popov)
The main thing I'm concerned about is that once we start extending this area (I assume that any & all are not going to be the last additions in this space) we will quickly run into function names that are either too generic or outright collide. For example, what if we want to add an iterator-based version of range()? Do we *really* want to be forced to pull a Python and call it xrange()? That's about as good as real_range()...

As such, I think it's important to prefix these *somehow*, though I don't care strongly how. Could be iter_all() or iterable_all(). We might even make it iterator_all() if we also adjust other existing iterator_* functions to accept iterables. I'd also be happy with iter\all() or iterable\all(), but that gets us back into namespacing discussions :)

Regards,
Nikita
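
To make the range() collision concrete, here is a rough sketch of what an iterator-based range could look like under a prefix. The name iter_range() is purely hypothetical (it is not proposed anywhere in this thread), and the sketch assumes an inclusive end bound like range() and only positive steps.

```php
<?php
// Hypothetical lazy counterpart to range(); the name iter_range() only
// illustrates the prefix idea and is not an existing or proposed function.
function iter_range(int $start, int $end, int $step = 1): Generator
{
    if ($step <= 0) {
        throw new InvalidArgumentException('step must be positive');
    }
    // Values are produced one at a time; no array of the full range is built.
    for ($i = $start; $i <= $end; $i += $step) {
        yield $i;
    }
}

// Only the first three values are ever generated, despite the large bound.
$firstThree = [];
foreach (iter_range(1, 1000000000) as $n) {
    $firstThree[] = $n;
    if (count($firstThree) === 3) {
        break;
    }
}
```

Without a prefix, such a function would have to either replace the eager range() or take an unrelated name, which is the naming problem raised above.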
  111836
September 3, 2020 20:08 kalle@php.net (Kalle Sommer Nielsen)
On Thu, Sep 3, 2020 at 15:18, Nikita Popov <nikita.ppv@gmail.com> wrote:

> The main thing I'm concerned about is that once we start extending this
> area (I assume that any & all are not going to be the last additions in
> this space) we will quickly run into function names that are either too
> generic or outright collide. For example, what if we want to add an
> iterator-based version of range()? Do we *really* want to be forced to pull
> a Python and call it xrange()? That's about as good as real_range()...

I agree with this point, and we already have some `iterator_*` functions in ext/spl, such as `iterator_to_array()`, `iterator_count()` and `iterator_apply()`, so that prefix is somewhat available to use for consistency.

--
regards,

Kalle Sommer Nielsen
kalle@php.net
  111762
September 1, 2020 07:59 ocramius@gmail.com (Marco Pivetta)
On Tue, Sep 1, 2020, 09:31 Nikita Popov <nikita.ppv@gmail.com> wrote:

> To be in line with naming conventions, I would suggest calling these
> iter_any() and iter_all(), using iter_* as the prefix for our future
> additions to the "functions that work on arbitrary iterables" space.
> iterable_any() and iterable_all() would work as well, though are
> unnecessarily verbose.
>
> Regards,
> Nikita

Oof, does it really have to be butchered this way? Or are you trolling? 😬

Did the namespaces ship sail forever? Can we just have those instead, please?
  111763
September 1, 2020 08:08 phpmailinglists@gmail.com (Peter Bowyer)
On Tue, 1 Sep 2020 at 08:59, Marco Pivetta <ocramius@gmail.com> wrote:

> Did the namespaces ship sail forever? Can we just have those instead,
> please?

To mix metaphors: it sailed, shot down in fiery flames.

Unfortunately.

Peter
  111764
September 1, 2020 15:05 internals@lists.php.net ("Levi Morrison via internals")
On Tue, Sep 1, 2020 at 2:08 AM Peter Bowyer <phpmailinglists@gmail.com> wrote:
> On Tue, 1 Sep 2020 at 08:59, Marco Pivetta <ocramius@gmail.com> wrote:
>
> > Did the namespaces ship sail forever? Can we just have those instead,
> > please?
>
> To mix metaphors: it sailed, shot down in fiery flames.
>
> Unfortunately.
>
> Peter

If we add significantly more functions to the proposal (which makes it riskier from a time investment standpoint), then I think the case for a namespace makes sense. Nobody wants to add a namespace for one or two things. If there are 10+ things it's a bit more convincing.

This is a case where having a proposed API and impl in PHP-land is probably sufficient to make an RFC vote; the conversion to C can happen if it passes.

Here is a sampling of functions from my own (private) iterable library:

- all_values
- any_value
- no_value
- one_value
- to_iterator
- filter_values
- for_each_value
- map_values
- reduce_values
- zip_values

The values suffix is because there are variants that work with more than just the values. Being value-centric is great, but we must leave design room for the rest for when they are needed.

Anyway, the key point I'm making is that I think a larger RFC has a better shot of passing if we want to namespace it.
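
As a rough idea of what a few of these could look like in PHP-land: the library mentioned above is private, so the signatures and semantics below are guesses based only on the names (in particular, that no_value means "no element matches" and one_value means "exactly one element matches").

```php
<?php
// Guessed userland versions of four names from the list above. The real
// (private) library may differ; semantics here are assumptions.

function any_value(iterable $input, callable $predicate): bool
{
    foreach ($input as $value) {
        if ($predicate($value)) {
            return true;
        }
    }
    return false;
}

function all_values(iterable $input, callable $predicate): bool
{
    foreach ($input as $value) {
        if (!$predicate($value)) {
            return false;
        }
    }
    return true;
}

// Assumed meaning: true when no element satisfies the predicate.
function no_value(iterable $input, callable $predicate): bool
{
    return !any_value($input, $predicate);
}

// Assumed meaning: true when exactly one element satisfies the predicate.
function one_value(iterable $input, callable $predicate): bool
{
    $matches = 0;
    foreach ($input as $value) {
        if ($predicate($value) && ++$matches > 1) {
            return false; // a second match settles the answer early
        }
    }
    return $matches === 1;
}

$isNegative = fn(int $n): bool => $n < 0;

$none = no_value([1, 2, 3], $isNegative);
$exactlyOne = one_value([1, -2, 3], $isNegative);
$tooMany = one_value([-1, -2, 3], $isNegative);
```

All four accept any iterable, so the same calls work with generators as well as arrays.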
  111766
September 1, 2020 20:10 marandall@php.net (Mark Randall)
On 01/09/2020 16:05, Levi Morrison via internals wrote:
> Anyway, the key point I'm making is that I think a larger RFC has a
> better shot of passing if we want to namespace it.

Try as we might, 4 of us working together couldn't get namespaces accepted. I got the feeling that there seems to be an opposition to them on principle, rather than the merits or drawbacks of the RFCs.

Various libraries do use statics for this:

Iterators::all()
  111767
September 1, 2020 20:41 dik.takken@gmail.com (Dik Takken)
On 01-09-2020 09:30, Nikita Popov wrote:
> To be in line with naming conventions, I would suggest calling these
> iter_any() and iter_all(), using iter_* as the prefix for our future
> additions to the "functions that work on arbitrary iterables" space.
> iterable_any() and iterable_all() would work as well, though are
> unnecessarily verbose.

It would make a lot of sense if we end up having a set of array_* functions and a set of iter_* functions. It would be nice for the RFC to mention this as a possible future result.

However I am equally tempted to reason otherwise: Since the new functions are generalizations of functions which only work for arrays, strip the array_ prefixes off the existing array functions to create a set of generalized counterparts. It does yield nice short function names. :)

Also, it would not surprise me if the generalized functions gradually replace the good old array functions in the future. When that happens, I would rather not have the iter_ prefixes.

Regards,
Dik Takken
  111768
September 1, 2020 20:53 dik.takken@gmail.com (Dik Takken)
On 01-09-2020 01:56, tyson andre wrote:
> Hi internals,
>
> I've created an RFC for https://wiki.php.net/rfc/any_all_on_iterable

Hi Andre,

I like the RFC, it is small and the added value is clear. The RFC mentions the possibility of adding a first() method in the future. I think it would be great to generalize this to the very thing that excites me most about this RFC: It could be the starting point for ultimately arriving at a better set of tools for working with iterables.

There is some discussion about the need for functions that work on keys instead of values or on both keys and values. It was suggested to introduce option flags. It was also suggested to change the function names to allow introducing multiple variants in the future:

  all_values()
  all_keys()

However, as Larry already hinted at, we may be able to do better. If we introduce a keys() function that generates the keys from a given iterable then we can use any value oriented function to work with the keys instead.

There is also the case where both the keys and values are relevant. It may be interesting to look at how Python handles this (I'm sorry, I just happen to know that language well). Python has dictionaries. Dictionaries have an items() method. This method is one of the most commonly used methods in Python. It is a generator that yields key/value pairs, one pair for each item in the dictionary.

Perhaps we could do the same thing by introducing an items() function that takes an iterable and yields one array for each item in the iterable. The arrays contain the key/value pairs. Quoting the example Tyson gave:

  any($itemMap, fn($enabled, $itemId) => $enabled &&
  meetsAdditionalCondition($itemId), ARRAY_FILTER_USE_BOTH)

Using items(), the above can also be written as:

  any(items($itemMap), fn($item) => $item[0] &&
  meetsAdditionalCondition($item[1]))

This way we can do with just one any() variant and without option flags. It makes the example slightly less descriptive though, trading variable names for array indices. In Python I would probably use a comprehension here.

Maybe the following will be possible in PHP at some point:

  any(foreach $itemMap as $enabled => $itemId yield $enabled &&
  meetsAdditionalCondition($itemId))

Regards,
Dik Takken
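
The items() idea can be sketched as a small generator. This is only an illustration under stated assumptions: each pair is yielded as [key, value] (the order Python's items() uses), so the predicate reads $item[1] for the value and $item[0] for the key; the any() helper is repeated here so the snippet stands alone, and meetsAdditionalCondition() is a made-up stand-in.

```php
<?php
// Sketch of the suggested items() helper. Assumption: each pair is
// yielded as [key, value], mirroring Python's dict.items().
function items(iterable $input): Generator
{
    foreach ($input as $key => $value) {
        yield [$key, $value];
    }
}

// Minimal any(), repeated so this snippet is self-contained.
function any(iterable $input, callable $callback): bool
{
    foreach ($input as $value) {
        if ($callback($value)) {
            return true;
        }
    }
    return false;
}

// Hypothetical stand-in for the condition used in the example above.
function meetsAdditionalCondition(string $itemId): bool
{
    return $itemId !== 'a';
}

// Map of item ID => enabled flag.
$itemMap = ['a' => true, 'b' => false];

// With [key, value] pairs: $item[0] is the item ID, $item[1] is the flag.
$predicate = fn(array $item): bool => $item[1] && meetsAdditionalCondition($item[0]);

$before = any(items($itemMap), $predicate);
$itemMap['c'] = true;
$after = any(items($itemMap), $predicate);
```

Because the pairs are yielded lazily, any() still stops at the first match without materializing the whole list of pairs.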
  111769
September 1, 2020 23:35 tysonandre775@hotmail.com (tyson andre)
> Perhaps we could do the same thing by introducing an items() function
> that takes an iterable and yields one array for each item in the
> iterable. The arrays contain the key/value pairs. Quoting the example
> Tyson gave:
>
>   any($itemMap, fn($enabled, $itemId) => $enabled &&
>   meetsAdditionalCondition($itemId), ARRAY_FILTER_USE_BOTH)
>
> Using items(), the above can also be written as:
>
>   any(items($itemMap), fn($item) => $item[0] &&
>   meetsAdditionalCondition($item[1]))

This could be called various things (`items()`, `entries()`, `iterable_items(): Traversable`) and there'd be the question of whether `array_items(array $entries): array` should also be added (return a list of key+value entries).

I'd expect `items(): iterable` to definitely have more performance overhead than `$flag=0` due to the extra method calls to next() and key() on the result but potentially be better for readability or working with generators.

There are many proposed ways to solve that problem (adding `int $flag=0`, `any_item(iterable, callable($value, $key))`, etc.), so I'm leaving it out of the RFC.

Thanks,
- Tyson
  111820
September 3, 2020 14:36 pollita@php.net (Sara Golemon)
On Mon, Aug 31, 2020 at 6:56 PM tyson andre <tysonandre775@hotmail.com>
wrote:

> I've created an RFC for https://wiki.php.net/rfc/any_all_on_iterable

I've probably reached this thread too late, but I'm going to throw out my old chestnut that these things don't belong in the engine. They belong in userspace.

1. Instant forward compatibility (any version can run `composer install`)
2. Instant bug fixes and improvements (no waiting for the next minor version of PHP)
3. Better visibility from the JIT (not having to cross userspace/internals border is good)

And that's all I'm going to say because I'm pretty sure I've lost the argument long ago, but here's my any/all/none (and other) methods from years ago (IN USERSPACE!):
https://github.com/phplang/generator/blob/master/src/iterable.php

-Sara
  111821
September 3, 2020 14:39 david.proweb@gmail.com (David Rodrigues)
Do you think that it could be proxied too? I mean, optimize the foreach
(array_keys()...) syntax so that it does not actually call array_keys(), but
uses an optimized version of foreach that handles keys only.

  111839
September 4, 2020 01:22 internals@lists.php.net ("Levi Morrison via internals")
> 3. Better visibility from the JIT (not having to cross userspace/internals
> border is good)

This is a good point _in theory_, but to me this is just indicative that we need to be able to bundle built-in functions for php-src. Now that we have preloading, I wonder how far off we are from achieving this in some form.
  111842
September 4, 2020 15:32 pollita@php.net (Sara Golemon)
On Thu, Sep 3, 2020 at 8:23 PM Levi Morrison morrison@datadoghq.com>
wrote:

> > 3. Better visibility from the JIT (not having to cross
> > userspace/internals border is good)
>
> This is a good point _in theory_, but to me this is just indicative
> that we need to be able to bundle built-in functions for php-src. Now
> that we have preloading, I wonder how far off we are from achieving
> this in some form.

Mostly an orthogonal topic, though I do agree that as the JIT matures we may find we particularly want to move in this direction. It certainly makes sense to build in the capability sooner rather than later.

Thinking about ways to negate the other two issues I brought up...

1. Forward-compat; We could produce a smaller distributable package containing just the script builtin implementations for use on older versions. Use some mechanism to mark which versions they're introduced in and selectively pull them out or something. Or maybe do that external to the project (anyone could pull these definitions out and package them for composer).

2. Bugfixing; I wonder if it makes sense to allow preload files to override builtin functions. Then a fix in git could be pulled in before a new release is ever minted without the need for rebuilds. Similarly, hooks could be added to any function to customize them for private installs. Again, no rebuilds, no special skills required.

Just thinking out loud.

-Sara