Option for array_column() to preserve keys.

  115998
September 8, 2021 00:19 andreas@dqxtech.net (Andreas Hennings)
Hello internals,

The function array_column() would be much more useful if there was an
option to preserve the original array keys.
I can create an RFC, but I think it is better to first discuss the options.

This is requested in different places on the web, e.g.
https://stackoverflow.com/questions/27204590/php-array-column-how-to-keep-the-keys/39298759

A workaround is proposed here and elsewhere, using array_keys() and
array_combine() to restore the keys.
However, this workaround not only adds complexity, but it breaks down
if some items don't have the value key. See https://3v4l.org/im2gZ.

A more robust workaround would be array_map(), but this is more
complex and probably slower than array_column(), for the given
purpose.
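
To make the two workarounds concrete, here is a small sketch. The sample data and variable names are made up for illustration and are not taken from the 3v4l link above. It shows array_combine() failing when one item lacks the value key, and array_map() keeping the keys but producing NULL for the missing entry.

```php
<?php
// Illustrative data only: item 'b' has no 'name' key.
$rows = [
    'a' => ['id' => 1, 'name' => 'Alice'],
    'b' => ['id' => 2],
    'c' => ['id' => 3, 'name' => 'Carol'],
];

// array_column() skips 'b' and reindexes the result from 0.
$names = array_column($rows, 'name');

// Workaround 1: three keys but only two values, so the counts differ.
// PHP 8 throws a ValueError here; PHP 7 emitted a warning and returned false.
try {
    $combined = array_combine(array_keys($rows), $names);
} catch (\ValueError $e) {
    $combined = false;
}

// Workaround 2: array_map() with a single array preserves the keys,
// but it cannot omit items, so 'b' ends up as null.
$mapped = array_map(function ($row) {
    return $row['name'] ?? null;
}, $rows);
```

With this data, $names has two entries, $combined is false, and $mapped keeps all three keys with NULL under 'b'.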

Some links for your convenience:
The function was introduced in this RFC, https://wiki.php.net/rfc/array_column
It is now documented here,
https://www.php.net/manual/en/function.array-column.php

Some ideas how this could be fixed:
1. Allow a magic value (e.g. TRUE) for the $index_key parameter, that
would cause the assoc behavior. To fully avoid BC break, this must be
a value that previously was completely forbidden. The value TRUE is
currently only forbidden with strict_types=1. A value of e.g. new
\stdClass is fully forbidden, but would be weird. A constant could be
introduced, but this would not prevent the BC concern.
2. Make the function preserve keys if $index_key === NULL. This would
be a full BC break.
3. Add an additional parameter with a boolean option or with integer
flags. This would be weird, because it would make the $index_key
parameter useless.
4. Add a new function.

Personally I would prefer option 1, with value TRUE (I can't think of
anything better).

If I could change history, I would prefer option 2. The current
behavior could still be achieved with array_values(array_column(..)).

Regards,
Andreas
  116001
September 8, 2021 08:12 ocramius@gmail.com (Marco Pivetta)
Heyo,

On Wed, 8 Sep 2021, 02:19 Andreas Hennings, <andreas@dqxtech.net> wrote:

> Hello internals,
>
> The function array_column() would be much more useful if there was an
> option to preserve the original array keys.
> I can create an RFC, but I think it is better to first discuss the options.

New function, please 🙏
  116009
September 8, 2021 14:10 andreas@dqxtech.net (Andreas Hennings)
Thanks for the feedback so far!

On Wed, 8 Sept 2021 at 10:13, Marco Pivetta <ocramius@gmail.com> wrote:
> Heyo,
>
> On Wed, 8 Sep 2021, 02:19 Andreas Hennings, <andreas@dqxtech.net> wrote:
>>
>> Hello internals,
>>
>> The function array_column() would be much more useful if there was an
>> option to preserve the original array keys.
>> I can create an RFC, but I think it is better to first discuss the options.
>
> New function, please 🙏

I am not opposed. But I am also curious what others think.
What I don't like so much is how the situation with two different
functions will have a "historically grown wtf" smell about it.
But this is perhaps preferable to BC breaks or overly "magic"
parameters or overly crowded signatures.

If we go for a new function:
A name could be array_column_assoc().

array_column_assoc(array $array, string $value_key)

This would behave the same as array_column($array, $value_key), but
preserve original keys.
Items which are not arrays or which lack the key will be omitted.
A $value_key === NULL would be useless, because this would simply
return the original array.

The question is, should it do anything beyond the most obvious?
Or should we leave it minimal for now, with the potential for
additional parameters in the future?

Limitations:
If some items are omitted, it will be awkward to restore the missing
items while preserving the order of the array.

Possible ideas for additional functionality:
- Replicate a lot of the behavior of array_column(), e.g. with an
  optional $index_key parameter. This would be mostly redundant.
- Additional functionality for nested arrays?
- Fallback value for entries that don't have the key? Or perhaps even
  a fallback callback like with array_map()?
- Option to capture missing entries e.g. in a by-reference variable?

A benefit of keeping the limited functionality would be that
programming errors are revealed more easily due to the strict
signature.

A question is how we would look at this long term:
Do we want both functions to co-exist long-term, or do we want to
deprecate one of them at some point?
If array_column() is going to stay, then array_column_assoc() only
needs to cover the few use cases that are missing.

-- Andreas
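
For reference, the minimal behavior described for array_column_assoc() can be approximated in userland. The following is only a sketch under assumptions: the function name array_column_assoc_polyfill() and the sample data are hypothetical, and unlike array_column() it does not read properties from objects.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland sketch of the proposed array_column_assoc().
 * Items that are not arrays, or that lack $valueKey, are omitted.
 */
function array_column_assoc_polyfill(array $array, string $valueKey): array
{
    $result = [];
    foreach ($array as $key => $item) {
        if (is_array($item) && array_key_exists($valueKey, $item)) {
            // Keep the original outer key instead of reindexing.
            $result[$key] = $item[$valueKey];
        }
    }
    return $result;
}

// Illustrative data only.
$rows = [
    'a' => ['id' => 1, 'name' => 'Alice'],
    'b' => ['id' => 2],        // lacks 'name': omitted
    'c' => 'not an array',     // not an array: omitted
    'd' => ['id' => 4, 'name' => 'Dave'],
];

$assoc = array_column_assoc_polyfill($rows, 'name');

// Dropping the keys again gives what array_column() returns today.
$same = array_values($assoc) === array_column($rows, 'name');
```

Here $assoc holds only the keys 'a' and 'd', which also illustrates the limitation mentioned above: nothing in the result says where 'b' and 'c' used to be.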
  116010
September 8, 2021 14:48 andreas@dqxtech.net (Andreas Hennings)
If we want to support nested array structures, it could work like this:

NOTE: We actually don't need to squeeze this into array_column_assoc().
We could easily introduce a 3rd function instead, e.g.
array_column_recursive(), if/when we want to have this in the future.
I am only posting this so that we get an idea about the surrounding
design space.

$source['a']['b']['x']['c']['y'] = 5;
$expected['a']['b']['c'] = 5;
assert($expected === array_column_assoc($source, [null, null, 'x', null, 'y']));

Note the first NULL, which only exists to make the system feel more "complete".
This could be useful if the array is coming from a function call.
The following examples show this:

unset($source, $expected); // (reset vars)
$source['a']['x']['b'] = 5;
$expected['a']['b'] = 5;
assert($expected === array_column_assoc($source, [null, 'x']));
assert($expected === array_column_assoc($source, 'x'));

unset($source, $expected); // (reset vars)
$source['x']['a'] = 5;
$expected['a'] = 5;
assert($expected === array_column_assoc($source, ['x']));
assert($expected === ($source['x'] ?? []));

Trailing NULLs do almost nothing, except to ensure that non-arrays are
removed from the tree.
I would have to think more about the details, but I think it would
work like this:

unset($source, $expected); // (reset vars)
$source['a0']['b'] = 5;
$source['a1'] = 5;
$expected = $source;
assert($expected === array_column_assoc($source, []));
assert($expected === array_column_assoc($source, [null]));
unset($expected['a1']);
assert($expected === array_column_assoc($source, [null, null]));
unset($expected['a0']);
assert($expected === array_column_assoc($source, [null, null, null]));

Another idea could be to "collapse" array levels, using a magic value
other than NULL, that does not work as an array key.

unset($source, $expected); // (reset vars)
$source['a0']['b0'] = 5;
$source['a1']['b1'] = 5;
$expected['b0'] = 5;
$expected['b1'] = 5;
assert($expected === array_column_assoc($source, [false]));
unset($expected);
$expected['a0'] = 5;
$expected['a1'] = 5;
assert($expected === array_column_assoc($source, [null, false]));

-- Andreas
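
One possible reading of the path idea, written as a userland sketch. The names column_path() and column_path_walk() are hypothetical, and the rules are an assumption: NULL keeps the keys of a level and descends into every child, any other path element selects that key and drops the level, and entries where the path cannot be followed are omitted. The FALSE "collapse" marker and the exact handling of trailing NULLs are not covered.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical recursive worker. Sets $found to false when $path
 * cannot be followed inside $value.
 */
function column_path_walk($value, array $path, bool &$found)
{
    $found = true;
    if ($path === []) {
        return $value; // path exhausted: keep whatever is here
    }
    if (!is_array($value)) {
        $found = false; // path continues, but there is nothing to descend into
        return null;
    }
    $head = array_shift($path);
    if ($head !== null) {
        // Named step: select this key and drop the level.
        if (!array_key_exists($head, $value)) {
            $found = false;
            return null;
        }
        return column_path_walk($value[$head], $path, $found);
    }
    // NULL step: keep the keys of this level, descend into each child.
    $result = [];
    foreach ($value as $key => $child) {
        $childFound = false;
        $item = column_path_walk($child, $path, $childFound);
        if ($childFound) {
            $result[$key] = $item;
        }
    }
    return $result;
}

/** Hypothetical entry point: always returns an array. */
function column_path(array $source, array $path): array
{
    $found = true;
    $result = column_path_walk($source, $path, $found);
    return ($found && is_array($result)) ? $result : [];
}

$source = [];
$source['a']['b']['x']['c']['y'] = 5;
$picked = column_path($source, [null, null, 'x', null, 'y']);
```

Under these assumptions $picked['a']['b']['c'] is 5, which matches the first example in this message.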
  116011
September 8, 2021 15:07 andreas@dqxtech.net (Andreas Hennings)
Another option to support nested arrays, but simpler.
Some of the functionality I proposed earlier now needs multiple calls,
but I think this is fine.

New signature:

function array_column_assoc(array $source, string $value_key, int $level = 1);

unset($source, $expected, $expected2); // (reset vars)
$source['a']['b']['x']['c']['y'] = 5;
$expected['a']['b']['c']['y'] = 5;
assert($expected === array_column_assoc($source, 'x', 2));
$expected2['a']['b']['c'] = 5;
assert($expected2 === array_column_assoc($expected, 'y', 3));

To collapse array levels, we could introduce a separate function.
Similar to array_merge(), but preserving all keys.

unset($source, $expected); // (reset vars)
$source['a0']['b01'] = '0.01';
$source['a0']['b02'] = '0.02';
$source['a1']['b1'] = '1.1';
$expected['b01'] = '0.01';
$expected['b02'] = '0.02';
$expected['b1'] = '1.1';
assert($expected === array_collapse($source, 0));
unset($expected);
$expected['a0'] = '0.01';
$expected['a1'] = '1.1';
assert($expected === array_collapse($source, 1));
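
The $level variant is simple enough to sketch in userland. The function name column_at_level() is hypothetical, and two details are assumptions rather than part of the proposal: branches that are not arrays are dropped, and a branch that yields no match is kept as an empty array. array_collapse() is not covered here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical userland version of the proposed
 * array_column_assoc($source, $value_key, $level).
 */
function column_at_level(array $source, string $valueKey, int $level = 1): array
{
    if ($level < 1) {
        throw new InvalidArgumentException('$level must be >= 1');
    }
    $result = [];
    foreach ($source as $key => $child) {
        if (!is_array($child)) {
            continue; // assumption: non-array branches are dropped
        }
        if ($level > 1) {
            // Not yet at the target depth: keep this key and go deeper.
            $result[$key] = column_at_level($child, $valueKey, $level - 1);
        } elseif (array_key_exists($valueKey, $child)) {
            // Target depth: replace the child by its $valueKey entry.
            $result[$key] = $child[$valueKey];
        }
    }
    return $result;
}

$source = [];
$source['a']['b']['x']['c']['y'] = 5;

// Two calls reproduce the two-step example from the message.
$step1 = column_at_level($source, 'x', 2);
$step2 = column_at_level($step1, 'y', 3);
```

With the default $level of 1 this behaves like the minimal key-preserving variant discussed earlier in the thread.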
  116002
September 8, 2021 08:14 guilliam.xavier@gmail.com (Guilliam Xavier)
Yes please! This has been requested multiple times, for instance:
- https://bugs.php.net/bug.php?id=64493
- https://bugs.php.net/bug.php?id=66435
- https://bugs.php.net/bug.php?id=73735

Regards,

-- 
Guilliam Xavier
  116019
September 8, 2021 21:31 ramsey@php.net (Ben Ramsey)
We originally had a patch for this while PHP 5.5 was still in beta, but
we decided against merging it, and I can't remember why. :-)

https://github.com/php/php-src/pull/331

Cheers,
Ben
  116021
September 8, 2021 21:37 ramsey@php.net (Ben Ramsey)
Ben Ramsey wrote on 9/8/21 16:31:
> We originally had a patch for this while PHP 5.5 was still in beta, but
> we decided against merging it, and I can't remember why. :-)

This looks like part of the thread. I'm not sure where the rest of it is:

https://externals.io/message/67113

Cheers,
Ben