[RFC] Short Closures 2, aka auto-capture take 3

  117888
June 9, 2022 16:34 larry@garfieldtech.com ("Larry Garfield")
Last year, Nuno Maduro and I put together an RFC for combining the multi-line capabilities of long-closures with the auto-capture compactness of short-closures.  That RFC didn't fully go to completion due to concerns over the performance impact, which Nuno and I didn't have bandwidth to resolve.

Arnaud Le Blanc has now picked up the flag with an improved implementation that includes benchmarks showing an effectively net-zero performance impact, aka, good news as it avoids over-capturing.

The RFC has therefore been overhauled accordingly and is now ready for consideration.

https://wiki.php.net/rfc/auto-capture-closure

-- 
  Larry Garfield
  larry@garfieldtech.com
  117889
June 9, 2022 16:46 ocramius@gmail.com (Marco Pivetta)
Hey Larry,


On Thu, 9 Jun 2022 at 18:34, Larry Garfield <larry@garfieldtech.com> wrote:

> Last year, Nuno Maduro and I put together an RFC for combining the
> multi-line capabilities of long-closures with the auto-capture compactness
> of short-closures. That RFC didn't fully go to completion due to concerns
> over the performance impact, which Nuno and I didn't have bandwidth to
> resolve.
>
> Arnaud Le Blanc has now picked up the flag with an improved implementation
> that includes benchmarks showing an effectively net-zero performance
> impact, aka, good news as it avoids over-capturing.
>
> The RFC has therefore been overhauled accordingly and is now ready for
> consideration.
>
> https://wiki.php.net/rfc/auto-capture-closure

Couple questions:

## nesting these functions within each other

What happens when/if we nest these functions? Take this minimal example:

```php
$a = 'hello world';

(fn () {
    (fn () {
        echo $a;
    })();
})();
```

## capturing `$this`

In the past (also present), I had to type `static fn () => ...` or `static function () { ...` all over the place, to avoid implicitly binding `$this` to a closure, causing hidden memory leaks.

Assuming following:

 * these new closures could capture `$this` automatically, once detected
 * these new closures can optimize away unnecessary variables that aren't captured

Would that allow us to get rid of `static fn () {` declarations, when creating one of these closures in an instance method context?

Greets,

Marco Pivetta

https://twitter.com/Ocramius

https://ocramius.github.io/
  117890
June 9, 2022 18:15 arnaud.lb@gmail.com (Arnaud Le Blanc)
Hi,

On Thursday, 9 June 2022 18:46:53 CEST Marco Pivetta wrote:
> ## nesting these functions within each other
>
> What happens when/if we nest these functions? Take this minimal example:
>
> ```php
> $a = 'hello world';
>
> (fn () {
>     (fn () {
>         echo $a;
>     })();
> })();
> ```

Capture bubbles up. When an inner function uses a variable, the outer function in fact uses it too, so it's captured by both functions, by-value.

This example prints "hello world": The inner function captures $a from the outer function, which captures $a from its declaring scope.

This is equivalent to

```php
(function () use ($a) {
    (function () use ($a) {
        echo $a;
    })();
})();
```

> ## capturing `$this`
>
> In the past (also present), I had to type `static fn () => ...` or `static
> function () { ...` all over the place, to avoid implicitly binding `$this`
> to a closure, causing hidden memory leaks.
>
> Assuming following:
>
> * these new closures could capture `$this` automatically, once detected
> * these new closures can optimize away unnecessary variables that aren't
> captured
>
> Would that allow us to get rid of `static fn () {` declarations, when
> creating one of these closures in an instance method context?

It would be great to get rid of this, but ideally this would apply to Arrow Functions and Anonymous Functions as well. This could be a separate RFC.

--
Arnaud Le Blanc
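
The bubbling behaviour described above can already be observed with today's single-expression arrow functions, which also capture by value. The following is a minimal sketch, assuming PHP 7.4 or later; the variable names `$nested` and `$explicit` are illustrative and do not come from the RFC.

```php
<?php
// Nested single-expression arrow functions, valid since PHP 7.4.
$a = 'hello world';

// The inner function uses $a, so the outer one captures it too (by value).
$nested = fn() => (fn() => $a)();

// The same thing spelled out with explicit by-value capture.
$explicit = function () use ($a) {
    return (function () use ($a) {
        return $a;
    })();
};

// Capture happens when the closure is created, so this reassignment
// is not seen by either closure.
$a = 'changed';

echo $nested(), "\n";   // hello world
echo $explicit(), "\n"; // hello world
```

Both closures keep the value `$a` had when they were created, which is the by-value behaviour the reply describes for the proposed multi-line form.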
  117893
June 9, 2022 19:27 nikita.ppv@gmail.com (Nikita Popov)
On Thu, Jun 9, 2022 at 8:15 PM Arnaud Le Blanc <arnaud.lb@gmail.com> wrote:

> On Thursday, 9 June 2022 18:46:53 CEST Marco Pivetta wrote:
> [...]
> > ## capturing `$this`
> >
> > In the past (also present), I had to type `static fn () => ...` or
> > `static function () { ...` all over the place, to avoid implicitly
> > binding `$this` to a closure, causing hidden memory leaks.
> > [...]
> > Would that allow us to get rid of `static fn () {` declarations, when
> > creating one of these closures in an instance method context?
>
> It would be great to get rid of this, but ideally this would apply to
> Arrow Functions and Anonymous Functions as well. This could be a separate
> RFC.

I've tried this in the past, and this is not possible due to implicit $this uses. See https://wiki.php.net/rfc/arrow_functions_v2#this_binding_and_static_arrow_functions for a brief note on this.

The tl;dr is that if your closure does "fn() => Foo::bar()" and Foo happens to be a parent of your current scope and bar() a non-static method, then this performs a scoped instance call that inherits $this. Not binding $this here would result in an Error exception, but the compiler doesn't have any way to know that $this needs to be bound.

Regards,
Nikita
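
The implicit `$this` use described above can be reproduced with existing arrow functions. This is a minimal sketch, assuming PHP 8; the class `Child` and the method names `bound()` and `unbound()` are hypothetical and exist only for illustration, while `Foo::bar()` mirrors the call named in the message.

```php
<?php
class Foo
{
    public function bar(): string
    {
        // Instance method: it needs $this to report the runtime class.
        return 'bar() on ' . get_class($this);
    }
}

class Child extends Foo
{
    public function bound(): Closure
    {
        // Looks like a static call, but Foo is a parent of the current
        // scope, so this is a scoped instance call that inherits $this.
        return fn() => Foo::bar();
    }

    public function unbound(): Closure
    {
        // "static" prevents $this from being bound to the closure.
        return static fn() => Foo::bar();
    }
}

$child = new Child();

echo ($child->bound())(), "\n"; // bar() on Child

try {
    ($child->unbound())();
} catch (Error $e) {
    // Without $this, the call is a static call to a non-static method.
    echo get_class($e), ': ', $e->getMessage(), "\n";
}
```

Nothing in the text of `fn() => Foo::bar()` mentions `$this`, which is why the compiler cannot decide from the source alone whether binding it is safe to skip.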
  117894
June 9, 2022 19:28 ocramius@gmail.com (Marco Pivetta)
On Thu, 9 Jun 2022 at 21:27, Nikita Popov <nikita.ppv@gmail.com> wrote:

> [...]
> I've tried this in the past, and this is not possible due to implicit
> $this uses. See
> https://wiki.php.net/rfc/arrow_functions_v2#this_binding_and_static_arrow_functions
> for a brief note on this.
>
> The tl;dr is that if your closure does "fn() => Foo::bar()" and Foo
> happens to be a parent of your current scope and bar() a non-static
> method, then this performs a scoped instance call that inherits $this.
> Not binding $this here would result in an Error exception, but the
> compiler doesn't have any way to know that $this needs to be bound.
>
> Regards,
> Nikita

Hey Nikita,

Do you have another example? Calling instance methods statically is... well... deserving a hard crash :|

Marco Pivetta

https://twitter.com/Ocramius

https://ocramius.github.io/
  117895
June 9, 2022 19:35 nikita.ppv@gmail.com (Nikita Popov)
On Thu, Jun 9, 2022 at 9:29 PM Marco Pivetta <ocramius@gmail.com> wrote:

> On Thu, 9 Jun 2022 at 21:27, Nikita Popov <nikita.ppv@gmail.com> wrote:
> [...]
> > The tl;dr is that if your closure does "fn() => Foo::bar()" and Foo
> > happens to be a parent of your current scope and bar() a non-static
> > method, then this performs a scoped instance call that inherits $this.
> > Not binding $this here would result in an Error exception, but the
> > compiler doesn't have any way to know that $this needs to be bound.
>
> Hey Nikita,
>
> Do you have another example? Calling instance methods statically is...
> well... deserving a hard crash :|

Maybe easier to understand if you replace Foo::bar() with parent::bar()? That's the most common spelling for this type of call.

I agree that the syntax we use for this is unfortunate (because it is syntactically indistinguishable from a static method call, which it is *not*), but that's what we have right now, and we can hardly just stop supporting it.

Regards,
Nikita
  117896
June 9, 2022 19:39 ocramius@gmail.com (Marco Pivetta)
Hey Nikita,

On Thu, 9 Jun 2022 at 21:35, Nikita Popov <nikita.ppv@gmail.com> wrote:

> On Thu, Jun 9, 2022 at 9:29 PM Marco Pivetta <ocramius@gmail.com> wrote:
> [...]
> > Do you have another example? Calling instance methods statically is...
> > well... deserving a hard crash :|
>
> Maybe easier to understand if you replace Foo::bar() with parent::bar()?
> That's the most common spelling for this type of call.
>
> I agree that the syntax we use for this is unfortunate (because it is
> syntactically indistinguishable from a static method call, which it is
> *not*), but that's what we have right now, and we can hardly just stop
> supporting it.

Dunno, it's a new construct, so perhaps we could do something about it.

I'm not suggesting we change the existing `fn` or `function` declarations, but in this case, we're introducing a new construct, and some work already went in to do the eager discovery of by-val variables.

Heck, variable variables already wouldn't work here, according to this RFC :D

Marco Pivetta

https://twitter.com/Ocramius

https://ocramius.github.io/
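
The remark about variable variables can be checked against existing arrow functions, which discover captured variables by name at compile time. A minimal sketch, assuming PHP 7.4 or later and assuming the RFC keeps this behaviour as the message suggests; the names `$direct` and `$dynamic` are illustrative.

```php
<?php
$x = 42;
$name = 'x';

// $x is named in the body, so it is captured by value.
$direct = fn() => $x;

// Only $name is named in the body; $x is never captured, so the
// dynamic lookup of $$name finds nothing inside the closure.
$dynamic = fn() => $$name;

var_dump($direct());   // int(42)
var_dump(@$dynamic()); // NULL ("@" hides the undefined-variable warning)
```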
  117897
June 9, 2022 19:42 nikita.ppv@gmail.com (Nikita Popov)
On Thu, Jun 9, 2022 at 9:39 PM Marco Pivetta <ocramius@gmail.com> wrote:

> Hey Nikita,
> [...]
> Dunno, it's a new construct, so perhaps we could do something about it.
>
> I'm not suggesting we change the existing `fn` or `function` declarations,
> but in this case, we're introducing a new construct, and some work already
> went in to do the eager discovery of by-val variables.
>
> Heck, variable variables already wouldn't work here, according to this RFC
> :D

We're not introducing a new construct. We're just extending existing fn() functions to accept {} blocks, with exactly the same semantics as before. I would find it highly concerning if fn() => X and fn() => { return X; } had differences in capture semantics. Those two expressions should be strictly identical -- the former should be desugared to the latter.

Nikita
  117891
June 9, 2022 18:36 michal.brzuchalski@gmail.com (Michał Marcin Brzuchalski)
Hi Larry,

On Thu, 9 Jun 2022 at 18:34, Larry Garfield <larry@garfieldtech.com> wrote:

> Last year, Nuno Maduro and I put together an RFC for combining the
> multi-line capabilities of long-closures with the auto-capture compactness
> of short-closures. That RFC didn't fully go to completion due to concerns
> over the performance impact, which Nuno and I didn't have bandwidth to
> resolve.
>
> Arnaud Le Blanc has now picked up the flag with an improved implementation
> that includes benchmarks showing an effectively net-zero performance
> impact, aka, good news as it avoids over-capturing.
>
> The RFC has therefore been overhauled accordingly and is now ready for
> consideration.
>
> https://wiki.php.net/rfc/auto-capture-closure

Nice work. Well-described behaviors.

One question, more about future scope or related functionality: could a future RFC for "short-methods", as described in your declined RFC https://wiki.php.net/rfc/short-functions, be revived without conflicts in the scope of methods?

    class Foo {
        public string $firstName = 'John';
        public function getFirstName(): string => $this->firstName;
    }

I'm asking whether I understand the scopes of this and the previous RFCs correctly, and whether they leave the door open for future "short-methods".

Cheers,
Michał Marcin Brzuchalski
  117892
June 9, 2022 19:19 larry@garfieldtech.com ("Larry Garfield")
On Thu, Jun 9, 2022, at 1:36 PM, Michał Marcin Brzuchalski wrote:
> Hi Larry,
> [...]
> Nice work. Well-described behaviors.
>
> One question, more around future scope or related functionality in the
> future:
> A future RFC for "short-methods" described here in one of your declined RFC
> https://wiki.php.net/rfc/short-functions in the past could be revived with
> no conflicts in the scope of methods?
>
> class Foo {
>     public string $firstName = 'John';
>     public function getFirstName(): string => $this->firstName;
> }
>
> I'm asking if I understand the scopes of this and previous RFCs correctly
> and if they don't block in future "short-methods"?
>
> Cheers,
> Michał Marcin Brzuchalski

The short-functions RFC is entirely separate. The syntax choices in both that RFC and this one were made to ensure that they don't conflict with each other, and the resulting syntax meaning is consistent across the language. The implementations are independent and should in no way conflict.

(The short-functions RFC would have enabled short-methods too. It was purely a syntax sugar with no additional behavior.)

That's assuming the attitude toward the short-function RFC ever changes enough in the future to make it worth trying again...

--Larry Garfield
  117905
June 11, 2022 21:14 rowan.collins@gmail.com (Rowan Tommins)
On 09/06/2022 17:34, Larry Garfield wrote:
> Last year, Nuno Maduro and I put together an RFC for combining the
> multi-line capabilities of long-closures with the auto-capture compactness
> of short-closures ... Arnaud Le Blanc has now picked up the flag with an
> improved implementation ... The RFC has therefore been overhauled
> accordingly and is now ready for consideration.
>
> https://wiki.php.net/rfc/auto-capture-closure

First of all, thanks to all three of you for the work on this. Although I'm not quite convinced yet, I know a lot of people have expressed desire for this feature over the years.

My main concern is summed up accidentally by your choice of subject line for this thread: is the proposal to add *short closure syntax* or is it to add *auto-capturing closures*?

They may sound like the same thing, but to me "short closure syntax" (and a lot of the current RFC) implies that the new syntax is better for nearly all closures, and that once it is introduced, the old syntax would only really be there for compatibility - similar to how the [] syntax replaces array() and list(). If that is the aim, it's not enough to assert that "the majority" of closures are very short; the syntax should stand up even when used for, say, a middleware handler in a micro-framework. As such, I think we need additional features to opt back out of capturing, and explicitly mark function- or block-scoped variables.

On the other hand, "auto-capturing" could be seen as a feature in its own right; something that users will opt into when it makes sense, while continuing to use explicit capture in others. If that is the aim, the proposed syntax is decidedly sub-optimal: to a new user, there is no obvious reason why "fn" and "function" should imply different semantics, or which one is which. A dedicated syntax such as use(*) or use(...) would be much clearer. We could even separately propose that "fn" and "function" be interchangeable everywhere, allowing combinations such as "fn() use(...) { return $x; }" and "function() => $x;"

To go back to the point about variable scope: right now, if you're in a function, all variables are scoped to that function. With a tiny handful of exceptions (e.g. superglobals), access to variables from any other scope is always explicit - via parameters, "global", "use", "$this", and so on. If we think that should change, we should make that decision explicitly, not treat it as a side-effect of syntax.

I don't find the comparison to a foreach loop very convincing. Loops are still only accessing variables while the function is running, not saving them to be used at some indeterminate later time. And users don't "learn to recognize" that a loop doesn't hide all variables from the parent scope; it would be very peculiar if it did.

This is also where comparison to other languages falls down: most languages which capture implicitly for closures also merge scopes implicitly at other times - e.g. global variables in functions; instance properties in methods; or nested block scopes. Generally they also have a way to opt out of those, and mark a variable as local to a function or block; PHP does not, because it has always required an opt *in*.

Which leads me back to my constructive suggestion: let's introduce a block scoping syntax (e.g. "let $foo;") as a useful feature in its own right, before we introduce short closures. As proposed, users will need to have some idea of what "live variable analysis" means, or add dummy assignments, if they want to be sure a variable is actually local. With a block scoping keyword, they can mark local variables explicitly, as they would in other languages.

Regards,

--
Rowan Tommins
[IMSoP]
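
The point about explicit opt-in can be illustrated with current PHP. The following is a minimal sketch under the assumption of PHP 7.4 or later; the function, class, and variable names (`plain`, `viaParameter`, `Greeter`, `$viaUse`) are hypothetical. It covers three of the mechanisms listed above (parameters, "use", and "$this") and leaves out "global" for brevity.

```php
<?php
$greeting = 'hello';

// A plain function body is its own scope: the outer $greeting is not visible.
function plain(): string
{
    return isset($greeting) ? $greeting : 'not visible';
}

// Explicit opt-in via a parameter.
function viaParameter(string $greeting): string
{
    return $greeting;
}

// Explicit opt-in via "use" (captured by value when the closure is created).
$viaUse = function () use ($greeting): string {
    return $greeting;
};

// Explicit opt-in via $this: object state is always reached through it.
final class Greeter
{
    private string $greeting;

    public function __construct(string $greeting)
    {
        $this->greeting = $greeting;
    }

    public function greet(): string
    {
        return $this->greeting;
    }
}

echo plain(), "\n";                          // not visible
echo viaParameter($greeting), "\n";          // hello
echo $viaUse(), "\n";                        // hello
echo (new Greeter($greeting))->greet(), "\n"; // hello
```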
  117906
June 11, 2022 22:01 deleugyn@gmail.com (Deleu)
On Sat, Jun 11, 2022 at 11:14 PM Rowan Tommins <rowan.collins@gmail.com> wrote:

> On 09/06/2022 17:34, Larry Garfield wrote:
> > Last year, Nuno Maduro and I put together an RFC for combining the
> > multi-line capabilities of long-closures with the auto-capture
> > compactness of short-closures ... Arnaud Le Blanc has now picked up the
> > flag with an improved implementation ... The RFC has therefore been
> > overhauled accordingly and is now ready for consideration.
> >
> > https://wiki.php.net/rfc/auto-capture-closure
>
> They may sound like the same thing, but to me "short closure syntax"
> (and a lot of the current RFC) implies that the new syntax is better for
> nearly all closures, and that once it is introduced, the old syntax
> would only really be there for compatibility - similar to how the []
> syntax replaces array() and list(). If that is the aim, it's not enough
> to assert that "the majority" of closures are very short; the syntax
> should stand up even when used for, say, a middleware handler in a
> micro-framework. As such, I think we need additional features to opt
> back out of capturing, and explicitly mark function- or block-scoped
> variables.

The RFC does mention that the existing Anonymous Function Syntax remains untouched and will not be deprecated. Whether the new syntax is better for nearly all closures will be a personal choice. If the new syntax doesn't suit, say, a middleware handler, then we still can:

- reach for the old syntax
- use invocable classes
- call another method or function which creates a brand new scope and then returns a function/callable.
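
The three fallbacks listed above could look roughly like this in current PHP. This is a minimal sketch, assuming PHP 7.4 or later; `Prefixer`, `makePrefixer`, and the sample data are hypothetical names chosen for illustration.

```php
<?php
$prefix = 'app';

// 1. The existing anonymous function syntax with explicit capture.
$explicit = function (string $name) use ($prefix): string {
    return $prefix . '.' . $name;
};

// 2. An invocable class: state is passed in through the constructor.
final class Prefixer
{
    private string $prefix;

    public function __construct(string $prefix)
    {
        $this->prefix = $prefix;
    }

    public function __invoke(string $name): string
    {
        return $this->prefix . '.' . $name;
    }
}

// 3. A factory function: its body is a brand new scope, so the returned
//    arrow function can only capture what was passed in as a parameter.
function makePrefixer(string $prefix): callable
{
    return fn(string $name): string => $prefix . '.' . $name;
}

$names = ['log', 'cache'];

print_r(array_map($explicit, $names));
print_r(array_map(new Prefixer($prefix), $names));
print_r(array_map(makePrefixer($prefix), $names));
// Each call prints the same two values: app.log and app.cache
```

The factory variant is the one that most directly limits what an auto-capturing closure can see, because the enclosing scope contains nothing but the parameters.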
> On the other hand, "auto-capturing" could be seen as a feature in its
> own right; something that users will opt into when it makes sense, while
> continuing to use explicit capture in others. If that is the aim, the
> proposed syntax is decidedly sub-optimal: to a new user, there is no
> obvious reason why "fn" and "function" should imply different semantics,
> or which one is which. A dedicated syntax such as use(*) or use(...)
> would be much clearer. We could even separately propose that "fn" and
> "function" be interchangeable everywhere, allowing combinations such as
> "fn() use(...) { return $x; }" and "function() => $x;"

The previous discussions talked about use(*) or use(...) and most people I know that would love this RFC to pass would also dislike that alternative. It does not have the greatest asset for short closure: aesthetics. Maybe my personal bubble is not statistically relevant, but this is where PHP Internals is lacking on surveying actual users of the language to help on such matters. All I can say is that use(*) is not a replacement for the RFC.

> To go back to the point about variable scope: right now, if you're in a
> function, all variables are scoped to that function. With a tiny handful
> of exceptions (e.g. superglobals), access to variables from any other
> scope is always explicit - via parameters, "global", "use", "$this", and
> so on. If we think that should change, we should make that decision
> explicitly, not treat it as a side-effect of syntax.

Any attempt to make it explicit defeats the purpose of the RFC. The auto-capturing means we don't have to write awkward code to access variables. The only way we have to avoid awkward syntax (such as use ($var1, $var2)) is to declare an entire new invocable class and send the parameters via the constructor. When many variables are involved, that may still be a great option, but doing that just for 1 variable and 2 lines is quite... sad.

When I think of new accessors for this particular case, they would either be innovative or verbose. If they are verbose, we already have a syntax for that. If they are innovative, it would be an awkward out-of-place situation that doesn't happen elsewhere in the language. Or I lack the imagination to see a different result.

Ultimately, I see fn() as "an opt-in to not create a separate scope for a function". PHP has several language constructs that may or may not create a separate scope.

Delimited scope: function, method, class, procedural file
Shared scope: if, for, foreach, include, require and fn

--
Marco Aurélio Deleu
  117907
June 12, 2022 01:53 larry@garfieldtech.com ("Larry Garfield")
On Sat, Jun 11, 2022, at 5:01 PM, Deleu wrote:
> On Sat, Jun 11, 2022 at 11:14 PM Rowan Tommins <rowan.collins@gmail.com>
> wrote:
> [...]
> The RFC does mention that the existing Anonymous Function Syntax remains
> untouched and will not be deprecated. Whether the new syntax is better for
> nearly all closures will be a personal choice. If the new syntax doesn't
> suit, say, a middleware handler, then we still can:
> - reach for the old syntax
> - use invocable classes
> - call another method or function which creates a brand new scope and then
> returns a function/callable.

Correct. If this RFC passes, there will be three equally supported syntaxes for creating closures:

    function ($a) use ($b) { return $a * $b; };

    fn ($a) => $a * $b;

    fn ($a) { return $a * $b; };

Which one is appropriate in a given situation is left up to developer judgement. My own personal position would be

* use fn => where possible
* use fn {} if going multi-line.
* if the body of the closure is more than ~3 lines and is not virtually the entire wrapping scope, it should be its own named function/method *anyway*, and the new first-class-callable syntax makes that nice and easy to use.

That is, I would probably not use the manual-capture syntax very often at all. However, if someone disagrees with me on case 3 it's still there for them if that's easier in context.

Whether the new syntax is viewed as "adding auto-capture to long closures" or "adding multi-line support to short closures" is, in the end, a mostly academic distinction with no practical difference. The resulting syntax is smack in the middle of the two existing ones. The original RFC from a year ago approached it from the perspective of the first; the rewritten RFC leans on the second perspective. The use of both names is mostly a historical artifact of reusing the old URL. :-) The net result is the same.

--Larry Garfield
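
The first and third recommendations above can be sketched with syntax that exists today (the proposed `fn {}` form is left out because it is not yet valid PHP). A minimal example, assuming PHP 8.1 or later for the first-class callable syntax; the class `Scaler` and its methods are hypothetical.

```php
<?php
// Requires PHP 8.1+ for the first-class callable syntax.
final class Scaler
{
    private int $factor;

    public function __construct(int $factor)
    {
        $this->factor = $factor;
    }

    // A single expression fits the existing arrow function syntax.
    public function scaleShort(array $numbers): array
    {
        return array_map(fn(int $n): int => $n * $this->factor, $numbers);
    }

    // Longer logic gets a name of its own ...
    private function scaleOne(int $n): int
    {
        $scaled = $n * $this->factor;
        return $scaled;
    }

    // ... and is referenced with the first-class callable syntax.
    public function scaleNamed(array $numbers): array
    {
        return array_map($this->scaleOne(...), $numbers);
    }
}

$scaler = new Scaler(3);
print_r($scaler->scaleShort([1, 2, 3])); // 3, 6, 9
print_r($scaler->scaleNamed([1, 2, 3])); // 3, 6, 9
```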
  117912
June 12, 2022 12:29 rowan.collins@gmail.com (Rowan Tommins)
On 11/06/2022 23:01, Deleu wrote:
> The RFC does mention that the existing Anonymous Function Syntax > remains untouched and will not be deprecated. Whether the new syntax > is better for nearly all closures will be a personal choice.
I honestly don't think this is how it will be perceived. If this syntax is approved, people will see "fn" as the "new, better way" and "function" as the "old, annoying way". To put it a different way: imagine we had no closure support at all, and decided that we needed two flavours, one with explicit capture and one with implicit capture. Would we choose "function" and "fn" as keywords?
> The previous discussions talked about use(*) or use(...) and most > people I know that would love this RFC to pass would also dislike that > alternative. It does not have the greatest asset for short closure: > aesthetics. [...] All I can say is that use(*) is not a replacement > for the RFC.
I think you're trying to have it both ways here: if you really believed that the two syntaxes were going to live side by side, there would be no reason for "aesthetics" to be any more important for one than the other. Some people are of the opinion that automatic capture should always have been the default, and the current syntax is a mistake. I'm fine with that opinion, but I want people to be honest about it rather than pretending they're just adding a new option for a narrow use case.
> Any attempt to make it explicit defeats the purpose of the RFC.
That depends what you think the purpose of the RFC is, which is what I want people to be honest about. If the purpose is to replace long lists of captured variables, an explicit "capture all" syntax like "use(*)" achieves that purpose perfectly fine.
> Ultimately, I see fn() as "an opt-in to not create a separate scope > for a function".
I disagree with both parts of this:

1) I don't think users will see "fn" as an "opt-in", they'll see it as "the new normal", and "function" as a rare "opt-out" or a "legacy version".

2) It does still create a separate scope, it just creates a *nested* scope, which combines two sets of variables, in a way that PHP currently never does.

Regards,
-- Rowan Tommins [IMSoP]
  117914
June 12, 2022 15:22 deleugyn@gmail.com (Deleu)
On Sun, Jun 12, 2022 at 2:29 PM Rowan Tommins <rowan.collins@gmail.com> wrote:

> On 11/06/2022 23:01, Deleu wrote: > > The RFC does mention that the existing Anonymous Function Syntax > > remains untouched and will not be deprecated. Whether the new syntax > > is better for nearly all closures will be a personal choice. > > > I honestly don't think this is how it will be perceived. If this syntax > is approved, people will see "fn" as the "new, better way" and > "function" as the "old, annoying way". >
And to me that's not an argument to deny what people want.
> To put it a different way: imagine we had no closure support at all, and > decided that we needed two flavours, one with explicit capture and one > with implicit capture. Would we choose "function" and "fn" as keywords? >
I often don't indulge such hypotheticals because we will never truly be able to make progress based on such an assumption. A breaking change that changes how closures work is just not gonna happen. Given the current state of the world we're in, what can we do to have a better DX on anonymous functions?
> > > The previous discussions talked about use(*) or use(...) and most > > people I know that would love this RFC to pass would also dislike that > > alternative. It does not have the greatest asset for short closure: > > aesthetics. [...] All I can say is that use(*) is not a replacement > > for the RFC. > > > I think you're trying to have it both ways here: if you really believed > that the two syntaxes were going to live side by side, there would be no > reason for "aesthetics" to be any more important for one than the other. > > Some people are of the opinion that automatic capture should always have > been the default, and the current syntax is a mistake. I'm fine with > that opinion, but I want people to be honest about it rather than > pretending they're just adding a new option for a narrow use case. >
Honestly I don't think it was a mistake. It was designed more than a decade ago and there was no way of predicting the future. I've seen code written 20~10 years ago and I've seen code written 5~0 years ago. I think the best decision was taken at the time it was taken and the world of development has changed enough for us to make different decisions now.

It's not that I'm trying to have it both ways, I'm just not assuming my view is the right one. I do believe that if such an RFC is approved, I will almost never reach for `function () use ()` anymore because I will prefer the short syntax. If I need a new scope I will reach for an invocable class. But that doesn't mean other teams/projects/people are forced to agree or follow the same practices as me or my team.
> > > Any attempt to make it explicit defeats the purpose of the RFC. > > > That depends what you think the purpose of the RFC is, which is what I > want people to be honest about. > > If the purpose is to replace long lists of captured variables, an > explicit "capture all" syntax like "use(*)" achieves that purpose > perfectly fine. >
If someone decides to implement `function () use (*)` on a separate RFC, I would abstain from that because it's not something I'm interested in using and it doesn't address the aesthetic issue we have today. I just don't like it being considered an alternative to the current RFC because it's not. The purpose is to replace long lists of captured variables while addressing the aesthetic issue caused by `use ()`, which is the only place in the language we use this construct.

It seems to me that you agree that there is a chance the proposed syntax is going to be perceived as better and people will not want to use the old syntax anymore and that makes you not want to accept the RFC.

Regards,
-- Marco Aurélio Deleu
  117915
June 12, 2022 15:51 rowan.collins@gmail.com (Rowan Tommins)
On 12 June 2022 16:22:15 BST, Deleu <deleugyn@gmail.com> wrote:
>> I honestly don't think this is how it will be perceived. If this syntax >> is approved, people will see "fn" as the "new, better way" and >> "function" as the "old, annoying way". >> > >And to me that's not an argument to deny what people want.
I never said it was. I said that if that is what we expect, we should design the feature with that in mind, rather than relying on the older syntax as a crutch.
> Given the current state in the world we're in, what can we > do to have a better DX on anonymous functions?
I already gave my answer to that: either add implicit capture as an opt-in to the current syntax; or add block scope and treat short closures as consistent with that.
>Honestly I don't think it was a mistake. It was designed more than a decade >ago and there was no way of predicting the future. I've seen code written >20~10 years ago and I've seen code written 5~0 years ago. I think the best >decision was taken at the time it was taken and the world of development >has changed enough for us to make different decisions now.
I've seen that argument before, but it's not clear to me that anything *has* changed. Anonymous functions are used for roughly the same things they always were, so why are the arguments made when they were added, and again when short closures were discussed previously, no longer valid?
>If someone decides to implement `function () use (*)` on a separate RFC, I >would abstain from that because it's not something I'm interested in using
That's fair enough. Just remember that that is *your* opinion of what is important, and others may have different views.
>aesthetic issue caused by `use ()`, which is the only place in the language >we use this construct.
That's like saying we only use the word "class" when declaring classes. It has slightly different syntax, but "use" is exactly the same principle as importing variables into scope with "global", or declaring them "static". It's entirely in keeping with how scope works in PHP.
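A minimal sketch of the parallel being drawn here, using only existing syntax: each of `global`, `static`, and `use` is an explicit statement about where a variable comes from, and a function body that makes no such statement starts with an empty scope. The function and variable names are made up for this illustration, and the script is assumed to run at the top level of a file.

```php
<?php
$greeting = 'hello';

function viaGlobal(): string
{
    global $greeting;          // explicit import from the global scope
    return $greeting;
}

function countCalls(): int
{
    static $calls = 0;         // explicitly declared to persist across calls
    return ++$calls;
}

$viaUse = function () use ($greeting): string {   // explicit import by value
    return $greeting;
};

function noImport(): string
{
    // Nothing was imported, so $greeting is simply unset in this scope.
    return $greeting ?? 'undefined';
}

var_dump(viaGlobal(), $viaUse(), noImport());
```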
>It seems to me that you agree that there is a chance the proposed syntax is >going to be perceived as better and people will not want to use the old >syntax anymore and that makes you not want to accept the RFC.
No, it makes me want to make the new syntax as useful as possible. Regards, -- Rowan Tommins [IMSoP]
  117919
June 12, 2022 18:09 deleugyn@gmail.com (Deleu)
On Sun, Jun 12, 2022 at 6:55 PM Rowan Tommins <rowan.collins@gmail.com> wrote:

> > >It seems to me that you agree that there is a chance the proposed syntax > is > >going to be perceived as better and people will not want to use the old > >syntax anymore and that makes you not want to accept the RFC. > > No, it makes me want to make the new syntax as useful as possible. >
On the sentiment, we can agree. It just happens that from where I'm standing, any change to the proposed syntax will make it less useful. -- Marco Aurélio Deleu
  117908
June 12, 2022 02:01 larry@garfieldtech.com ("Larry Garfield")
On Sat, Jun 11, 2022, at 4:14 PM, Rowan Tommins wrote:
> On 09/06/2022 17:34, Larry Garfield wrote: >> Last year, Nuno Maduro and I put together an RFC for combining the multi-line capabilities of long-closures with the auto-capture compactness of short-closures ... Arnaud Le Blanc has now picked up the flag with an improved implementation ... The RFC has therefore been overhauled accordingly and is now ready for consideration. >> >> https://wiki.php.net/rfc/auto-capture-closure > > > First of all, thanks to all three of you for the work on this. Although > I'm not quite convinced yet, I know a lot of people have expressed > desire for this feature over the years.
> To go back to the point about variable scope: right now, if you're in a > function, all variables are scoped to that function. With a tiny handful > of exceptions (e.g. superglobals), access to variables from any other > scope is always explicit - via parameters, "global", "use", "$this", and > so on. If we think that should change, we should make that decision > explicitly, not treat it as a side-effect of syntax. > > I don't find the comparison to a foreach loop very convincing. Loops are > still only accessing variables while the function is running, not saving > them to be used at some indeterminate later time. And users don't "learn > to recognize" that a loop doesn't hide all variables from the parent > scope; it would be very peculiar if it did.
There are languages that do, however. Some languages have block-scoped variables by default (such as Rust), or are partially block-scoped depending on details. PHP is not one of them, but to someone coming from a language that is, PHP's way of doing things is just as weird and requires learning.

The point here is that "which things create a scope and which don't" is not "intuitive" in any language. Scoping rules are always language-idiomatic, and may or may not be internally consistent, which is the important part. PHP is fairly internally consistent: functions and classes create a scope, nothing else does. This RFC doesn't change that one way or another, so it's not really any harder to learn. Plus, as noted, the `fn` keyword becomes consistently the flag saying "auto-capture happens here, FYI", which is already the case as of 7.4.
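As a small illustration of the "functions create a scope, loops do not" rule in current PHP, here is a sketch with hypothetical names (`lastGuestId`, `$double`, and the guest array shape are assumptions made for the example):

```php
<?php
function lastGuestId(array $guests): ?int
{
    foreach ($guests as $guest) {
        $id = $guest['id'];    // assigned inside the loop body
    }

    // The loop did not create a scope, so $id is still visible here.
    return $id ?? null;
}

$id = 'outer';

// A function body does create a scope: this $id is unrelated to the outer one.
$double = function (int $n): int {
    $id = $n * 2;
    return $id;
};

var_dump(lastGuestId([['id' => 1], ['id' => 7]])); // int(7)
var_dump($double(21));                             // int(42)
var_dump($id);                                     // string(5) "outer"
```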
> Which leads me back to my constructive suggestion: let's introduce a > block scoping syntax (e.g. "let $foo;") as a useful feature in its own > right, before we introduce short closures. > > As proposed, users will need to have some idea of what "live variable > analysis" means, or add dummy assignments, if they want to be sure a > variable is actually local. With a block scoping keyword, they can mark > local variables explicitly, as they would in other languages.
That may be a useful feature on its own, especially for longer loops. I'm definitely open to discussing that. I don't think that is a prerequisite for a nicer lambda syntax, however, as I don't think the confusion potential is anywhere near as large as you apparently fear it is. --Larry Garfield
  117909
June 12, 2022 08:33 mike@newclarity.net (MKS Archive)
> On Jun 11, 2022, at 10:01 PM, Larry Garfield <larry@garfieldtech.com> wrote:
>
>> I don't find the comparison to a foreach loop very convincing. Loops are still only accessing variables while the function is running, not saving them to be used at some indeterminate later time. And users don't "learn to recognize" that a loop doesn't hide all variables from the parent scope; it would be very peculiar if it did.
>
> There are languages that do, however. Some languages have block-scoped variables by default (such as Rust), or partially blocked scoped depending on details. PHP is not one of them, but to someone coming from a language that does, PHP's way of doing things is just as weird and requires learning.
Working in Go now for several years I'd say one of its biggest foot guns that I consistently run into when doing code reviews and even in my own code is block-level scoping where one variable shadows the same named variable outside the block and the inner variable is updated when the intention was to update the outer variable. In short, block level scoping is a convenience that does more harm than good. At least in my experience. Thus I would highly recommend *not* adding block level variable scope to PHP where a block in the middle of a function shadows a variable outside the block, and that variable is used below the block, such as for a return value. #jmtcw #fwiw -Mike
  117913
June 12, 2022 12:58 rowan.collins@gmail.com (Rowan Tommins)
On 12/06/2022 03:01, Larry Garfield wrote:
>> ... users don't "learn to recognize" that a loop doesn't hide all variables from the parent >> scope; it would be very peculiar if it did. > There are languages that do, however. Some languages have block-scoped variables by default (such as Rust), or partially blocked scoped depending on details.
That's not what the RFC example implies, though; it implies that someone might expect $guests and $guestsIds to not be usable *inside* the foreach loop, because they were declared *outside* it. I don't know of any language where entering a loop creates a completely empty symbol table, do you? Whether or not $guest, having been declared *inside* the loop, is visible *after* the loop is a completely different question, and one that doesn't apply to closures - the content of the closure hasn't been executed yet, so it is inevitably a black box to the code after it.
> PHP is fairly internally consistent: functions and classes create a scope, nothing else does. This RFC doesn't change that one way or another...
The RFC fundamentally changes the rule that a function always creates a new, *empty* scope. Every variable that is not local to that scope has to be explicitly imported, one way or another. Every language I know where scopes do *not* start out empty has keywords for marking which variables are definitely local to that block. That's why I think a "var" or "let" equivalent is a natural accompaniment to changing PHP's rules in this way.
> Plus, as noted, the `fn` keyword becomes consistently the flag saying "auto-capture happens here, FYI", which is already the case as of 7.4.
It's a cute idea, but I don't think "if you miss most of the letters out of a word, it means this special thing" is at all memorable. I've never heard the syntax added in 7.4 called "fn functions", but I've frequently heard it called "arrow functions", because what stands out to people is the "=>". The keyword is only there because ($a)=>$b on its own would collide with array syntax. I would much rather see "fn" and "function" become synonyms, so that "public fn foo() {}" is a valid method declaration, and "function() => $foo" is a valid arrow function. Regards, -- Rowan Tommins [IMSoP]
  117931
June 13, 2022 12:57 arnaud.lb@gmail.com (Arnaud Le Blanc)
On samedi 11 juin 2022 23:14:28 CEST Rowan Tommins wrote:
> My main concern is summed up accidentally by your choice of subject line > for this thread: is the proposal to add *short closure syntax* or is it > to add *auto-capturing closures*?
The proposal is to extend the Arrow Functions syntax so that it allows multiple statements. I wanted to give a name to the RFC, so that we could refer to the feature by that name instead of the longer "auto-capture multi-statement closures". But the auto-capture behavior is an important aspect we want to inherit from Arrow Functions.
> As such, I think we need additional features to opt > back out of capturing, and explicitly mark function- or block-scoped > variables.
Currently the `use()` syntax co-exists with auto-capture, but we could change it so that an explicit `use()` list disables auto-capture instead:

```php
fn () use ($a) { } // Would capture $a and disable auto-capture
fn () use () { }   // Would capture nothing and disable auto-capture
```
> On the other hand, "auto-capturing" could be seen as a feature in its > own right; something that users will opt into when it makes sense, while > continuing to use explicit capture in others. If that is the aim, the > proposed syntax is decidedly sub-optimal: to a new user, there is no > obvious reason why "fn" and "function" should imply different semantics, > or which one is which. A dedicated syntax such as use(*) or use(...) > would be much clearer. We could even separately propose that "fn" and > "function" be interchangeable everywhere, allowing combinations such as > "fn() use(...) { return $x; }" and "function() => $x;"
Unfortunately, Arrow Functions already auto-capture today, so requiring a `use(*)` to enable auto-capture would be a breaking change.
> I don't find the comparison to a foreach loop very convincing. Loops are > still only accessing variables while the function is running, not saving > them to be used at some indeterminate later time.
Do you have an example where this would be a problem?
> This is also where comparison to other languages falls down: most > languages which capture implicitly for closures also merge scopes > implicitly at other times - e.g. global variables in functions; instance > properties in methods; or nested block scopes. Generally they also have > a way to opt out of those, and mark a variable as local to a function or > block; PHP does not, because it has always required an opt *in*.
These languages capture/inherit in a read-write fashion. Being able to scope a variable (opt out of capture) is absolutely necessary otherwise there is only one scope. In these languages it is easy to accidentally override/bind a variable from the parent scope by forgetting a variable declaration. Auto-capture in PHP is by-value. This makes this impossible. It also makes explicit declarations non-necessary and much less useful.
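The by-value behaviour described here can already be observed with today's arrow functions. A minimal sketch, with variable names invented for the example: the closure receives a copy taken when it is created, so assigning inside it cannot rebind the parent's variable, whereas an explicit by-reference `use` can.

```php
<?php
$x = 1;

$read  = fn () => $x;        // copy of $x taken when the closure is created
$write = fn () => $x = 99;   // assigns to the closure's own copy only

$x = 2;

var_dump($read());   // int(1): later changes in the parent are not seen
var_dump($write());  // int(99)
var_dump($x);        // int(2): the parent variable was not rebound

// Rebinding the parent variable requires an explicit by-reference capture.
$byRef = function () use (&$x) { $x = 99; };
$byRef();
var_dump($x);        // int(99)
```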
> Which leads me back to my constructive suggestion: let's introduce a > block scoping syntax (e.g. "let $foo;") as a useful feature in its own > right, before we introduce short closures.
I like this, especially if it also allows specifying a type. However, I don't think it's needed before this RFC.
> As proposed, users will need to have some idea of what "live variable > analysis" means, or add dummy assignments, if they want to be sure a > variable is actually local. With a block scoping keyword, they can mark > local variables explicitly, as they would in other languages.
Live-variable analysis is mentioned as part of the implementation details. It should not be necessary to understand these details to understand the behavior of auto-capture. I've updated the "Auto-capture semantics" section of the RFC.

Regards,
-- Arnaud Le Blanc
  117933
June 13, 2022 13:36 rowan.collins@gmail.com (Rowan Tommins)
On 13/06/2022 13:57, Arnaud Le Blanc wrote:
> The proposal is to extend the Arrow Functions syntax so that it allows > multiple statements.
That's one perspective. The other perspective is that the proposal is to extend closure syntax to support automatic capture.
> Currently the `use()` syntax co-exists with auto-capture, but we could change > it so that an explicit `use()` list disables auto-capture instead: > > ```php > fn () use ($a) { } // Would capture $a and disable auto-capture > fn () use () { } // Would capture nothing and disable auto-capture > ```
That's an interesting idea. I was coming from the other direction, but it might make sense I guess.

By the way, the current RFC implies you could write this:

fn() use (&$myRef, $a) { $myRef = $a * $b; }

It's clear that $myRef is captured by reference, and $a by value; but what about $b? Is it local to the closure as it would be in a "long" closure, or implicitly captured by value as it would be with no "use" statement?

It might be best for such mixtures to raise an error.
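For comparison, this sketch shows what the same mix means in an existing "long" closure, which is one of the two readings in question. It uses only current syntax; the `?? 1` fallback is an addition made purely so the unset local can be observed without a warning.

```php
<?php
$a = 2;
$b = 5;
$myRef = 0;

// Existing long-closure semantics: only variables listed in use() are imported.
$f = function () use (&$myRef, $a) {
    // $b is not captured; it is an unset local here, so ?? falls back to 1.
    $myRef = $a * ($b ?? 1);
};
$f();

var_dump($myRef); // int(2), not int(10): the outer $b was never seen
```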
> Unfortunately, Arrow Functions already auto-capture today, so requiring a > `use(*)` to enable auto-capture would be a breaking change.
I'm not suggesting any change to arrow functions, just the ability to write "use(*)" (or "use(...)") in all the places you can write "use($foo)" today. I don't think that introduces any problems, if you think of "fn" as an alternative spelling of "function", and "=>" as expanding to "use(*) { return"
>> I don't find the comparison to a foreach loop very convincing. Loops are >> still only accessing variables while the function is running, not saving >> them to be used at some indeterminate later time. > Do you have an example where this would be a problem?
I didn't say anything was a problem; I just said that the comparison didn't make sense, because the scenarios are so different.
> Auto-capture in PHP is by-value. This makes this impossible. It also makes > explicit declarations non-necessary and much less useful.
> Live-variable analysis is mentioned in as part of implementation details. It > should not be necessary to understand these details to understand the behavior > of auto-capture.
As noted in my other e-mail, by-value capture can still have side effects, so users may still want to ensure that their code is free of such side effects. Currently, the only way to do so is to understand the "implementation details" of which variables will be captured, and perhaps add dummy statements like "$foo = null;" or "unset($foo);" to make sure of it. Regards, -- Rowan Tommins [IMSoP]
  117934
June 13, 2022 13:52 larry@garfieldtech.com ("Larry Garfield")
On Mon, Jun 13, 2022, at 8:36 AM, Rowan Tommins wrote:
> On 13/06/2022 13:57, Arnaud Le Blanc wrote: >> The proposal is to extend the Arrow Functions syntax so that it allows >> multiple statements. > > > That's one perspective. The other perspective is that the proposal is to > extend closure syntax to support automatic capture.
As noted before, this is a distinction without a difference. The proposed syntax brings in one aspect of short-closures and one aspect of long-closures. Which you consider it "coming from" as a starting point is, in practice, irrelevant.
> By the way, the current RFC implies you could write this: > > fn() use (&$myRef, $a) { $myRef = $a * $b; } > > It's clear that $myRef is captured by reference, and $a by value; but > what about $b? Is it local to the closure as it would be in a "long" > closure, or implicitly captured by value as it would be with no "use" > statement? > > It might be best for such mixtures to raise an error.
The RFC already covers that. $b will be auto-captured by value from scope if it exists. See the "Explicit capture" section and its example.
>> Auto-capture in PHP is by-value. This makes this impossible. It also makes >> explicit declarations non-necessary and much less useful. > >> Live-variable analysis is mentioned in as part of implementation details. It >> should not be necessary to understand these details to understand the behavior >> of auto-capture. > > > As noted in my other e-mail, by-value capture can still have side > effects, so users may still want to ensure that their code is free of > such side effects. > > Currently, the only way to do so is to understand the "implementation > details" of which variables will be captured, and perhaps add dummy > statements like "$foo = null;" or "unset($foo);" to make sure of it.
There are two different issues you're raising here that almost seem to be contradictory.

1. Auto-capture could still over-capture without people realizing it. Whether this is actually an issue in practice (or would be) is hard to say with certainty; I'm not sure if it's possible to make an educated guess based on a top-1000 analysis, so we're all trying to predict the future.

Note, however, that this risk is already present for short-closures, as the capture logic is the same. Arguably it's less of an issue only because short-closures are, well, short, so less likely to reuse variables unintentionally. However, based on my top-1000 survey, even today the vast majority of long-closures are only 2-4 lines long. I don't believe that makes it 2-4 times more likely, as it's still trivial for a developer to look at a 2 line closure and say "oh, I'm reusing that variable name, maybe that's not as clear as it could be."

2. The syntactic indicator that "auto capture will happen". The RFC says "fn". You're recommending "use(*)". However, changing the indicator syntax would do nothing to improve point 1. It's just a longer indicator to use the same logic, especially as it would also require the full "function" word. (Making fn and function synonyms sounds like it would have a lot more knock-on effects that feel very out of scope at present.)

--Larry Garfield
  117937
June 13, 2022 17:39 rowan.collins@gmail.com (Rowan Tommins)
On 13/06/2022 14:52, Larry Garfield wrote:
>> That's one perspective. The other perspective is that the proposal is to >> extend closure syntax to support automatic capture. > As noted before, this is a distinction without a difference.
It's a difference in focus which is very evident in some of the comments on this thread. For instance, Arnaud assumed that adding "use(*)" would require a change to arrow functions, whereas that never even occurred to me, because we're looking at the feature through a different lens.
>> By the way, the current RFC implies you could write this: >> >> fn() use (&$myRef, $a) { $myRef = $a * $b; } > The RFC already covers that. $b will be auto-captured by value from scope if it exists. See the "Explicit capture" section and its example.
So it does. I find that extremely confusing; I think it would be clearer to error for that case, changing the proposal to:

> Short Closures support explicit by-reference capture with the `use` keyword. Combining a short closure with explicit by-value capture produces an error.

And the example to:

$a = 1;
fn () use (&$b) {
    return $a + $b; // $a is auto-captured by value
                    // $b is explicitly captured by reference
}

Clearer syntax for this has been cited previously as an advantage of use(*) or use(...):

$a = 1;
function () use (&$b, ...) { // read as "use $b by reference, everything else by value"
    return $a + $b;
}
> 1. Auto-capture could still over-capture without people realizing it. Whether this is actually an issue in practice (or would be) is hard to say with certainty; I'm not sure if it's possible to make an educated guess based on a top-1000 analysis, so we're all trying to predict the future.
I tried to make very explicit what I was and was not disputing:

> Whether the risk of these side effects is a big problem is up for debate, but it's wrong to suggest they don't exist.

The RFC seems to be implying that the implementation removes the side effects, but it does not, it is users paying attention to their code which will remove the side effects.
> Arguably it's less of an issue only because short-closures are, well, short, so less likely to reuse variables unintentionally.
Our current short closures aren't just a single *statement*, they're a single *expression*, and that's a really significant difference, because it means to all intents and purposes *they have no local scope*. (You can create and use a local variable within one expression, but it requires the kind of twisted code that only happens in code golf.) If there are no local variables, there is nothing to be accidentally captured. That's why the current implementation doesn't bother optimising which variables it captures - it's pretty safe to assume that *all* variables in the expression are either parameters or captured.
> 2. The syntactic indicator that "auto capture will happen". The RFC says "fn". You're recommending "use(*)". However, changing the indicator syntax would do nothing to improve point 1.
The reason I think it would be better is because it is a more *intentional* syntax: the author of the code is more likely to think "I'm using an auto-capture closure, rather than an explicit-capture closure, what effect will that have?" and readers of the code are more likely to think "hm, this is using auto-capture, I wonder which variables are local, and which are captured?" Of course they can still guess wrong, but I don't think "fn" vs "function" is a strong enough clue.
> (Making fn and function synonyms sounds like it would have a lot more knock-on effects that feel very out of scope at present.)
Off the top of my head, I can't think of any, but I admit I haven't tried hacking it into the parser to see if anything explodes. Regards, -- Rowan Tommins [IMSoP]
  117944
June 14, 2022 14:28 mike@newclarity.net (Mike Schinkel)
> On Jun 13, 2022, at 1:39 PM, Rowan Tommins <rowan.collins@gmail.com> wrote:
>
> The reason I think it would be better is because it is a more *intentional* syntax: the author of the code is more likely to think "I'm using an auto-capture closure, rather than an explicit-capture closure, what effect will that have?" and readers of the code are more likely to think "hm, this is using auto-capture, I wonder which variables are local, and which are captured?"
>
> Of course they can still guess wrong, but I don't think "fn" vs "function" is a strong enough clue.
"Strong enough" is an opinion, and it seems all who have commented have differing ones of those. But maybe a memory device would help address (some of?) your concerns: - fn() — It is SHORT and implicit. Short is CONVENIENT. Thus Short auto-captures variables because that is the most Convenient thing to do. - function() — It is LONG. Long is more EXPLICIT. Thus Long requires Explicitly declaring variables, which is also more rigorous and robust. Or for the TL;DR crowd: - fn() => SHORT => CONVENIENT => Auto-captures - function() => LONG => EXPLICIT => Requires declaration Hope this helps. #fwiw -Mike
  117941
June 14, 2022 11:18 arnaud.lb@gmail.com (Arnaud Le Blanc)
On lundi 13 juin 2022 15:36:26 CEST Rowan Tommins wrote:
> > Auto-capture in PHP is by-value. This makes this impossible. It also makes > > explicit declarations non-necessary and much less useful. > > > > Live-variable analysis is mentioned in as part of implementation details. > > It should not be necessary to understand these details to understand the > > behavior of auto-capture. > > As noted in my other e-mail, by-value capture can still have side > effects, so users may still want to ensure that their code is free of > such side effects.
My choice of words in this reply was inaccurate when I said "In these languages it is easy to accidentally override/bind a variable from the parent scope by forgetting a variable declaration.", since "override" can be interpreted in different ways.

What I meant here is that it is not possible to accidentally bind a variable in the parent scope. This is actually impossible unless you explicitly capture a variable by-reference. Do you agree with this?

Possible side-effects via object mutations are documented in the "No unintended side-effects" section of the RFC. This assumes that property assignments or method calls to captured objects would be intended, since these assignments/calls would result in an error if the variable was not defined and not captured. Do you have examples where assignments/calls would unintentionally cause a side effect, with code you would actually write?
> As noted in my other e-mail, by-value capture can still have side > effects, so users may still want to ensure that their code is free of > such side effects.
There are two ways for a closure to have a side-effect (already documented in the RFC):

- The closure explicitly captures a variable by reference, and binds it
- The closure mutates a value accessed through a captured variable. Mutable values include objects and resources, but NOT scalars or arrays (since they are copy-on-write).

In the first case, this is entirely explicit.

In the second case, the only thing you need to understand is that if you access a variable you did not define, the variable is either undefined or comes from the declaring scope. Accessing undefined variables is an error, so it must come from the declaring scope.

Your example uses isset(), which is valid code in most circumstances, but as you said it's not particularly good code. Do you have other examples that come to mind?
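A minimal sketch of the two cases above, written with the long-closure syntax that already exists (nothing here depends on the RFC). The variable names are illustrative only.

```php
<?php
$count = 1;
$list  = [1, 2];
$obj   = new stdClass();
$obj->hits = 0;

// By-value capture: the closure gets its own copies of $count and $list,
// but $obj is a handle to the same object as in the declaring scope.
$byValue = function () use ($count, $list, $obj) {
    $count++;       // changes the closure's copy only
    $list[] = 3;    // copy-on-write: the outer array is untouched
    $obj->hits++;   // mutates the shared object
    return [$count, $list];
};

// Explicit by-reference capture: assignments rebind the outer variable.
$byRef = function () use (&$count) {
    $count = 100;
};

$inner        = $byValue();                  // [2, [1, 2, 3]]
$afterByValue = [$count, $list, $obj->hits]; // [1, [1, 2], 1]
$byRef();
$afterByRef   = $count;                      // 100
```

Only the object mutation and the explicit by-reference assignment are visible outside the closures; the scalar and the array in the declaring scope keep their values.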
> Currently, the only way to do so is to understand the "implementation > details"
I'm willing to make changes if that's true, because I definitely don't want this to be the case.
  117945
June 14, 2022 15:45 rowan.collins@gmail.com (Rowan Tommins)
On 14/06/2022 12:18, Arnaud Le Blanc wrote:
> - The closure mutates a value accessed through a captured variable. Mutable > values include objects and resources, but NOT scalars or arrays (since they > are copy-on-write).
It's not something that is used very often, so is often forgotten or ignored, but there is technically an edge case where arrays are mutable: they can contain references, and the references remain "live" even when the array itself is passed by value.

So although unlikely, it is possible for a by-value closure to over-write variables in other scopes: https://3v4l.org/dPZlI

    // plain variable
    $a = 42;
    // array containing a reference
    $b = [
        'a' => &$a
    ];
    // capture the array by-value
    $f = function() use($b) {
        // update the reference from inside the closure
        $b['a'] = 69;
    };
    // call it
    $f();
    // observe that both the array and the plain variable now have the new value
    var_dump($a, $b);

-- 
Rowan Tommins
[IMSoP]
  117946
June 14, 2022 16:04 rowan.collins@gmail.com (Rowan Tommins)
(Sorry for double reply, hit send too soon)

On 14/06/2022 12:18, Arnaud Le Blanc wrote:
> Your example uses isset(), which is valid code in most circumstances, but as > you said it's not particularly good code. Do you have other examples that come > to mind ?
There is plenty of code out there in the real world that is not particularly good, so I think it's a realistic example to think about. The question is, do we think the language can and should help people avoid that mistake?

- Would the explicitness of "use(...)" make it more likely someone would spot it?
- Would people be more likely to write "let $guest=..." (or whatever block-scope keyword we choose) than add "$guest = null;" at the beginning of the closure?

I'm not totally sure, but we should always consider the impact on less-expert users, not just the power-users who are the ones often asking for new features.

Regards,

-- 
Rowan Tommins
[IMSoP]
  117936
June 13, 2022 17:21 Danack@basereality.com (Dan Ackroyd)
Hi Arnaud,

Arnaud Le Blanc <arnaud.lb@gmail.com> wrote:
> > Following your comment, I have clarified a few things in the "Auto-capture > semantics" section. This includes a list of way in which these effects can be > observed. These are really marginal cases that are not relevant for most > programs.
Cool, thanks.
> Unfortunately, Arrow Functions already auto-capture today, so requiring a > `use(*)` to enable auto-capture would be a breaking change.
I think there are two things making this conversation more adversarial than it could be otherwise:

1. Some people really want implicit auto-capture, while others are deeply fearful of it. That comes more from the experience people have from writing/reading different types of code leading them to have different aesthetic preferences. Trying to persuade people their lived experience is wrong is hard.

2. The current situation of having to list all variables is kind of annoying when it doesn't provide much value, e.g. for stuff like:

    function getCallback($foo, $bar, $quux)
    {
        return function($x) use ($foo, $bar, $quux) {
            return $quux($foo, $bar, $x);
        };
    }

Where the code that returns the closure is trivial, having to list out the full set of captured variables does feel tedious, and doesn't provide any value.

I realise it's annoying when people suggest expanding the scope of an RFC, however... how would you feel about adding support for use(*) to the current 'long closures'?

That way, people could choose between:

* Explicit capture of individual variables: function($x) use ($foo, $bar, $quux) {...}
* Explicit capture of all relevant variables: function($x) use (*) {...}
* Implicit capture of all relevant variables, and fewer letters: fn($x) {...}

People who don't want implicit capture would be able to tell their code quality analysis tools to warn on any use of short closures (or possibly better, warn when a variable has been captured).

People who do want implicit capture can use the short closures which always have implicit capture.

cheers
Dan
Ack
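For comparison, a small sketch of the getCallback example in the two forms PHP already supports: a long closure with an explicit use list, and a single-expression arrow function that captures implicitly. The function names getCallbackLong and getCallbackShort and the $join helper are made up for illustration.

```php
<?php
// Long closure: every captured variable is listed in use().
function getCallbackLong(callable $quux, $foo, $bar): Closure
{
    return function ($x) use ($foo, $bar, $quux) {
        return $quux($foo, $bar, $x);
    };
}

// Arrow function (PHP 7.4+): the same variables are captured implicitly,
// by value, but the body is limited to a single expression.
function getCallbackShort(callable $quux, $foo, $bar): Closure
{
    return fn ($x) => $quux($foo, $bar, $x);
}

// Illustrative callable that joins its arguments with a dash.
$join = fn (...$parts) => implode('-', $parts);

$resultLong  = getCallbackLong($join, 'a', 'b')('c');  // "a-b-c"
$resultShort = getCallbackShort($join, 'a', 'b')('c'); // "a-b-c"
```

Both callbacks behave identically here; the difference under discussion is only how the captured variables are declared.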
  117938
June 13, 2022 17:49 Danack@basereality.com (Dan Ackroyd)
Hi Larry, Arnaud,

On Mon, 13 Jun 2022 at 13:57, Arnaud Le Blanc <arnaud.lb@gmail.com> wrote:

> > Auto-capture in PHP is by-value. This makes this impossible. It also makes > explicit declarations non-necessary and much less useful. >
Separating off some pedantism from the hopefully constructive comment, I think some of the words in the RFC are a bit inaccurate:
> A by-value capture means that it is not possible to modify any variables from the outer scope:
> Because variables are bound by-value, the confusing behaviors often associated with closures do not exist.
> Because variables are captured by-value, Short Closures can not have unintended side effects.
Those statements are true for scalar values. They are not true for objects:

    class Foo
    {
        function __construct(public string $value) {}
        function __toString()
        {
            return $this->value;
        }
    }

    $a = new Foo('bar');
    $f = fn() {
        $a->value = 'explicit scope is nice';
    };

    print $a; // prints "bar"
    $f();
    print $a; // prints 'explicit scope is nice';

Yes, I know you can avoid these types of problems by avoiding mutability, and/or avoiding capturing variables that represent services, but sometimes those things are needed.

When you are capturing objects that can have side effects, making that capture be explicit is quite nice (imo). I think the different emphasis on capturing scalar values or objects might come down to a difference in style of how different people use closures.

cheers
Dan
Ack
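The same effect can be reproduced today with a single-expression arrow function, which already auto-captures by value. This is a sketch assuming PHP 8.0+ (for constructor promotion); the multi-statement fn () { ... } form used above is only the proposed syntax.

```php
<?php
class Foo
{
    public function __construct(public string $value) {}

    public function __toString(): string
    {
        return $this->value;
    }
}

$a = new Foo('bar');

// $a is auto-captured by value: the object handle is copied, not the object,
// so the assignment below is visible through the outer $a as well.
$f = fn () => $a->value = 'explicit scope is nice';

$before = (string) $a; // "bar"
$f();
$after = (string) $a;  // "explicit scope is nice"
```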
  117940
June 14, 2022 10:29 arnaud.lb@gmail.com (Arnaud Le Blanc)
Hi Dan,

On lundi 13 juin 2022 19:49:10 CEST Dan Ackroyd wrote:
> > Auto-capture in PHP is by-value. This makes this impossible. It also makes > > explicit declarations non-necessary and much less useful. > > Separating off some pedantism from the hopefully constructive comment, > > I think some of the words in the RFC are a bit inaccurate: > > A by-value capture means that it is not possible to modify any variables > > from the outer scope: > > > > Because variables are bound by-value, the confusing behaviors often > > associated with closures do not exist. > > > > Because variables are captured by-value, Short Closures can not have > > unintended side effects. > Those statements are true for scalar values. They are not true for objects:
This is shown in the "No unintended side-effects" section of the RFC.

I agree that the choice of words is inaccurate, as "modify any variable" could be interpreted not only as "bind a variable", but also as "mutate a value".

The section you have quoted is meant to show how by-value capture, which is the default capture mode in all PHP closures, is less error prone than by-variable/by-reference capture, by a huge margin. Especially since variable bindings do not have side-effects unless a variable was explicitly captured by-reference. Do you agree with this?

The "No unintended side-effects" section assumes that property assignments to captured variables are intended side-effects. In your example, the programmer intended to have a side effect because `$a` can only come from the declaring scope (the code would result in an error otherwise):
> $a = new Foo('bar'); > $f = fn() { > $a->value = 'explicit scope is nice'; > };
Do you have an example where the intent would be less obvious? With code you would actually write?

Cheers,
-- 
Arnaud Le Blanc
  117951
June 15, 2022 12:05 guilliam.xavier@gmail.com (Guilliam Xavier)
> > > Because variables are captured by-value, Short Closures can not have > > > unintended side effects. > > > > Those statements are true for scalar values. They are not true for objects: > > This is shown in the "No unintended side-effects" section of the RFC.
I'm confused by the last example:

    $fn2 = function () use (&$a) { /* code with $a AND $b */ }

Isn't that missing a ", $b" in the `use`?

And like others, I also find that allowing mixing explicit *by-value* capture with auto-capture is not really needed and even confusing; if you "expect that explicitly capturing by value will be rare in practice" you might as well forbid it?

Maybe you don't even need to add explicit [by-reference] capture to short closures at all, but rather extend *long* closures so that we can write things like:

    $val1 = rand();
    $val2 = rand();
    $ref = null;
    $fn1 = function () use (...) { /* do something with $val1 and $val2 */ };
    $fn2 = function () use (&$ref, ...) { $ref = $val1 + $val2; };

(and even if not, at least mention in the RFC that it has been considered)?

By the way, what about *arrow* functions? e.g.

    $fn = fn () use (&$ref) => $ref = $val1 + $val2; // assigns and returns

Would that be allowed? Is it really *desirable*?

Regards,

-- 
Guilliam Xavier
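For reference, a sketch of how the arrow-function case behaves in current PHP, where arrow functions have no use clause and capture by value only. An assignment inside the arrow function affects its own copy, so the outer variable stays untouched; the long closure with an explicit by-reference capture is shown for contrast. Fixed values replace rand() so the results are predictable.

```php
<?php
$val1 = 2;
$val2 = 3;
$ref  = null;

// Arrow function: assigns to a local copy of $ref and returns the value.
$fn = fn () => $ref = $val1 + $val2;
$returned        = $fn(); // 5
$outerAfterArrow = $ref;  // still null

// Long closure with explicit by-reference capture: the outer $ref is rebound.
$fn2 = function () use (&$ref, $val1, $val2) {
    $ref = $val1 + $val2;
};
$fn2();
$outerAfterLong = $ref;   // 5
```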
  117957
June 15, 2022 23:59 larry@garfieldtech.com ("Larry Garfield")
On Wed, Jun 15, 2022, at 7:05 AM, Guilliam Xavier wrote:
>> > > Because variables are captured by-value, Short Closures can not have >> > > unintended side effects. >> > >> > Those statements are true for scalar values. They are not true for objects: >> >> This is shown in the "No unintended side-effects" section of the RFC. > > I'm confused by the last example: > > $fn2 = function () use (&$a) { /* code with $a AND $b */ } > > Isn't that missing a ", $b" in the `use`? > > And like others, I also find that allowing mixing explicit *by-value* > capture with auto-capture is not really needed and even confusing; if > you "expect that explicitly capturing by value will be rare in > practice" you might as well forbid it?
Arnaud and I discussed it, and we're going to drop the mix-autocapture-and-manual functionality. I was tepid on it to begin with, and it can be confusing. RFC will be updated soon.
> By the way, what about *arrow* functions? e.g. > > $fn = fn () use (&$ref) => $ref = $val1 + $val2; // assigns and returns > > Would that be allowed? Is it really *desirable*?
I don't think it's really desirable. By-ref capture is unusual, probably even more so in one-line closures (though I've not checked that specifically), references are usually a bad idea anyway, and in those unusual cases the long form is still there if you want to control everything.

--Larry Garfield
June 14, 2022 21:14 internals@lists.php.net ("Björn Larsson via internals")
Den 2022-06-13 kl. 14:57, skrev Arnaud Le Blanc:
> On samedi 11 juin 2022 23:14:28 CEST Rowan Tommins wrote: >> My main concern is summed up accidentally by your choice of subject line >> for this thread: is the proposal to add *short closure syntax* or is it >> to add *auto-capturing closures*? > > The proposal is to extend the Arrow Functions syntax so that it allows > multiple statements. I wanted to give a name to the RFC, so that we could > refer to the feature by that name instead of the longer "auto-capture multi- > statement closures". But the auto-capture behavior is an important aspect we > want to inherit from Arrow Functions. > >> As such, I think we need additional features to opt >> back out of capturing, and explicitly mark function- or block-scoped >> variables. > > Currently the `use()` syntax co-exists with auto-capture, but we could change > it so that an explicit `use()` list disables auto-capture instead: > > ```php > fn () use ($a) { } // Would capture $a and disable auto-capture > fn () use () { } // Would capture nothing and disable auto-capture > ``` >I like this idea very much. In the RFC two variables are captured explicitly and one implicitly.
    $c = 1;
    fn () use ($a, &$b) { return $a + $b + $c; }

I don't see the use case / value for not specifying $c while specifying $a. I think it's much clearer when capturing variables to either implicitly capture everything or list the ones that should be captured. One only needs to think about which variables are listed, not the ones that might be implicitly captured.

Of course, capturing by reference will always require listing, and if combined with capturing variables by value, those also need to be listed.

Then there is this other proposal to enhance traditional anonymous functions by allowing the syntax use(*), meaning capture everything. Even if it's outside the scope of this RFC it could be mentioned in "What about Anonymous Functions?" or "Future scope".

r//Björn L
  117916
June 12, 2022 17:21 larry@garfieldtech.com ("Larry Garfield")
On Thu, Jun 9, 2022, at 11:34 AM, Larry Garfield wrote:
> Last year, Nuno Maduro and I put together an RFC for combining the > multi-line capabilities of long-closures with the auto-capture > compactness of short-closures. That RFC didn't fully go to completion > due to concerns over the performance impact, which Nuno and I didn't > have bandwidth to resolve. > > Arnaud Le Blanc has now picked up the flag with an improved > implementation that includes benchmarks showing an effectively net-zero > performance impact, aka, good news as it avoids over-capturing. > > The RFC has therefore been overhauled accordingly and is now ready for > consideration. > > https://wiki.php.net/rfc/auto-capture-closure
A little data:

I used Nikita's project analyzer on the top 1000 projects to get a rough sense of how long-closures are used now. All usual caveats apply about such survey data. I was specifically looking at how many `use` statements a closure typically had, and how many statements it typically had. Mainly, I am interested in how common "really long closures where the developer is likely to lose track of what is and isn't closed over" are.

Total closures: 20052
Total used variables: 11534
Avg capture per closure: 0.575
Avg statements per closure: 1.876

Used variable distribution (# of use variables => how many times that happens):

    0 => 12833
    1 => 4585
    2 => 1667
    3 => 591
    4 => 198
    5 => 98
    6 => 43
    7 => 16
    8 => 9
    9 => 6
    10 => 2
    11 => 4

Statement count distribution (# of statements => how many times that happens):

    0 => 266
    1 => 13134
    2 => 2885
    3 => 1598
    4 => 818
    5 => 429
    6 => 284
    7 => 176
    8 => 125
    9 => 88
    10 => 48
    11 => 58
    12 => 25
    13 => 27
    14 => 14
    15 => 16
    16 => 13
    17 => 7
    18 => 3
    19 => 7
    20 => 4
    21 => 5
    22 => 3
    23 => 2
    24 => 3
    26 => 2
    27 => 1
    29 => 1
    30 => 1
    35 => 1
    36 => 1
    42 => 1
    44 => 1
    48 => 1
    59 => 1
    69 => 1
    103 => 1
    122 => 1

Analysis:

* The bulk of closures close over nothing, so are irrelevant for us.
* The bulk of closures use only one statement. That means they could easily be short-lambdas today, and are likely just pre-7.4 code that no one has bothered to update.
* The overwhelming majority of the rest are 2-3 lines long. The dropoff after that is quite steep. (Approximately halving each time, with a few odd exceptions.)
* Similarly, most `use` clauses contain 1-2 variables, and the dropoff after that is also quite steep.
* There's some nitwit out there writing 122 line closures, and closing over 11 variables explicitly. Fortunately it looks like an extremely small number of nitwits. :-)

The primary target of this RFC is people writing 2-4 line closures that import 1-2 variables, both easily small enough that there should be very little risk of developers getting confused by their own code. Based on the data above, I conclude that group is very much the typical case for closures already, and thus the risk of this syntax resulting in harder to follow code where developers get confused about what is imported and what isn't is very low.

--Larry Garfield
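The averages can be recomputed from the two distributions in this message. The following is only a sketch of that arithmetic; the helper name weightedMean is made up, and the figures are copied from the lists above.

```php
<?php
// Distributions copied from the message: value => number of closures.
$captures = [
    0 => 12833, 1 => 4585, 2 => 1667, 3 => 591, 4 => 198, 5 => 98,
    6 => 43, 7 => 16, 8 => 9, 9 => 6, 10 => 2, 11 => 4,
];
$statements = [
    0 => 266, 1 => 13134, 2 => 2885, 3 => 1598, 4 => 818, 5 => 429,
    6 => 284, 7 => 176, 8 => 125, 9 => 88, 10 => 48, 11 => 58,
    12 => 25, 13 => 27, 14 => 14, 15 => 16, 16 => 13, 17 => 7,
    18 => 3, 19 => 7, 20 => 4, 21 => 5, 22 => 3, 23 => 2, 24 => 3,
    26 => 2, 27 => 1, 29 => 1, 30 => 1, 35 => 1, 36 => 1, 42 => 1,
    44 => 1, 48 => 1, 59 => 1, 69 => 1, 103 => 1, 122 => 1,
];

// Mean of the keys, weighted by how often each key occurs.
function weightedMean(array $distribution): float
{
    $total = 0;
    $count = 0;
    foreach ($distribution as $value => $occurrences) {
        $total += $value * $occurrences;
        $count += $occurrences;
    }
    return $total / $count;
}

$totalClosures = array_sum($captures);                    // 20052
$avgCaptures   = round(weightedMean($captures), 3);       // 0.575
$avgStatements = round(weightedMean($statements), 3);     // 1.876
```

Both distributions sum to the same 20052 closures, and the capture distribution reproduces the 11534 used variables, so the lists are internally consistent.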
  117917
June 12, 2022 17:54 mark@demon-angel.eu (Mark Baker)
On 12/06/2022 19:21, Larry Garfield wrote:
> On Thu, Jun 9, 2022, at 11:34 AM, Larry Garfield wrote: > > A little data: > > I used Nikita's project analyzer on the top 1000 projects to get a rough sense of how long-closures are used now. All usual caveats apply about such survey data. I was specifically looking at how many `use` statements a closure typically had, and how many statements it typically had. Mainly, I am interested in how common "really long closures where the developer is likely to lose track of what is and isn't closed over" are. > > Total closures: 20052 > Total used variables: 11534 > Did many of those closures use "pass by reference" in the use clause,
because that's one real differentiator between traditional closures and short lambdas. There's also the fact that use values are bound at the point where the closure is defined, not where it's called (if they even exist at all at that point), although that's probably more difficult to determine.

-- 
Mark Baker
  117920
June 12, 2022 20:25 larry@garfieldtech.com ("Larry Garfield")
On Sun, Jun 12, 2022, at 12:54 PM, Mark Baker wrote:
> On 12/06/2022 19:21, Larry Garfield wrote: >> On Thu, Jun 9, 2022, at 11:34 AM, Larry Garfield wrote: >> >> A little data: >> >> I used Nikita's project analyzer on the top 1000 projects to get a rough sense of how long-closures are used now. All usual caveats apply about such survey data. I was specifically looking at how many `use` statements a closure typically had, and how many statements it typically had. Mainly, I am interested in how common "really long closures where the developer is likely to lose track of what is and isn't closed over" are. >> >> Total closures: 20052 >> Total used variables: 11534 >> > Did many of those closures use "pass by reference" in the use clause, > because that's one real differentiator between traditional closures and > short lambdas. There's also the fact that use values are bound at the > point where the closure is defined, not where it's called (if they even > exist at all at that point), although that's probably more difficult to > determine.
New run to check for that:

Total used variables: 11534
ByRef used variables: 1833

So around 16% of used variables are by-ref, and thus would need to be explicitly used even with the new syntax.
> There's also the fact that use values are bound at the > point where the closure is defined, not where it's called (if they even > exist at all at that point), although that's probably more difficult to > determine.
I... don't see what relevance that has? The potential for confusion is at the definition point, not call point. If a closure is used inline then those are the same place, but if they're not, it's only the definition point that is relevant at the moment. --Larry Garfield
  117925
June 13, 2022 09:35 arnaud.lb@gmail.com (Arnaud Le Blanc)
On dimanche 12 juin 2022 19:54:06 CEST Mark Baker wrote:
> Did many of those closures use "pass by reference" in the use clause, > because that's one real differentiator between traditional closures and > short lambdas. There's also the fact that use values are bound at the > point where the closure is defined, not where it's called (if they even > exist at all at that point), although that's probably more difficult to > determine.
Please note that auto-capture binds variables at function declaration. This is the case in Arrow Functions, and is inherited by this RFC.
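A short sketch of what binding at declaration means with the syntax available today: both the arrow function and the by-value long closure see the value the variable had when the closure was created, while only an explicit by-reference capture observes later changes.

```php
<?php
$x = 1;

$arrow = fn () => $x;                            // auto-capture, by value
$long  = function () use ($x) { return $x; };    // explicit, by value
$byRef = function () use (&$x) { return $x; };   // explicit, by reference

$x = 2; // changed after the closures were declared

$results = [$arrow(), $long(), $byRef()]; // [1, 1, 2]
```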
  117927
June 13, 2022 10:28 landers.robert@gmail.com (Robert Landers)
On Mon, Jun 13, 2022 at 11:36 AM Arnaud Le Blanc <arnaud.lb@gmail.com> wrote:
> > On dimanche 12 juin 2022 19:54:06 CEST Mark Baker wrote: > > Did many of those closures use "pass by reference" in the use clause, > > because that's one real differentiator between traditional closures and > > short lambdas. There's also the fact that use values are bound at the > > point where the closure is defined, not where it's called (if they even > > exist at all at that point), although that's probably more difficult to > > determine. > > Please note that auto-capture binds variables at function declaration. This is > the case in Arrow Functions, and is inherited by this RFC. > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php >
From a maintainer and code review aspect, I prefer the longer syntax because it is 100% clear on which variables are being closed over and utilized in the anonymous function. fn($x) => $x + $y is pretty clear that $y is being pulled in from an outer scope but if you start getting into longer ones, it can get non-obvious pretty quickly...

    $func = fn($x) {
        $y[] = $x;
        // do some stuff
        return $y;
    };

If $y is pulled from the outside scope, it may or may not be intentional but hopefully, it is an array. If anyone uses the name $y outside the lambda, this code may subtly break.

That being said, I'd love this RFC broken into two RFCs, one for generic auto-capturing and one for multi-line fn functions (just to reduce some typing when refactoring). There are times when auto-capturing can be useful for all lambdas, especially when writing some custom middleware.
  117929
June 13, 2022 12:12 arnaud.lb@gmail.com (Arnaud Le Blanc)
On lundi 13 juin 2022 12:28:17 CEST Robert Landers wrote:
> From a maintainer and code review aspect, I prefer the longer syntax > because it is 100% clear on which variables are being closed over and > utilized in the anonymous function. fn($x) => $x + $y is pretty clear > that $y is being pulled in from an outer scope but if you start > getting into longer ones, it can get non-obvious pretty quickly... > > $func = fn($x) { > $y[] = $x; > // do some stuff > return $y; > } > > If $y is pulled from the outside scope, it may or may not be > intentional but hopefully, it is an array. If anyone uses the name $y > outside the lambda, this code may subtly break.
This is true for any function that uses the array-append operator on an undefined variable.
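To illustrate the point with existing syntax: appending to an undefined variable silently creates a new array in any function, whereas a closure that captures a same-named variable appends to a copy of the captured value instead. The names collect and $capturing are illustrative only.

```php
<?php
function collect($x): array
{
    $y[] = $x; // $y does not exist yet, so PHP creates a new array
    return $y;
}

$y = ['from outer scope'];

// Explicit by-value capture of a variable that happens to be called $y.
$capturing = function ($x) use ($y) {
    $y[] = $x; // appends to the closure's copy of the captured array
    return $y;
};

$plain    = collect('a');   // ['a']
$captured = $capturing('a'); // ['from outer scope', 'a']
$outer    = $y;             // unchanged: ['from outer scope']
```

The return value changes depending on whether a $y is captured, but the array in the declaring scope is not modified either way.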
  117921
June 12, 2022 20:44 rowan.collins@gmail.com (Rowan Tommins)
On 12/06/2022 18:21, Larry Garfield wrote:
> The primary target of this RFC is people writing 2-4 line closures that import 1-2 variables
The first half of that sentence I was expecting - although as I've already said, I think the chosen syntax suggests strongly that the RFC is really targeting *all* closures, not any subset of them. The second half makes much less sense. If you are only importing 1 or 2 variables, is writing their names really that big a burden? Several of the conversations I've had on this in the past have been very explicitly about the burden of large numbers of captures; if that's really as rare as you suggest, it makes me wonder why we're even bothering. Regards, -- Rowan Tommins [IMSoP]
  117922
June 12, 2022 21:29 larry@garfieldtech.com ("Larry Garfield")
On Sun, Jun 12, 2022, at 3:44 PM, Rowan Tommins wrote:
> On 12/06/2022 18:21, Larry Garfield wrote: >> The primary target of this RFC is people writing 2-4 line closures that import 1-2 variables > > > The first half of that sentence I was expecting - although as I've > already said, I think the chosen syntax suggests strongly that the RFC > is really targeting *all* closures, not any subset of them. > > The second half makes much less sense. If you are only importing 1 or 2 > variables, is writing their names really that big a burden? > > Several of the conversations I've had on this in the past have been very > explicitly about the burden of large numbers of captures; if that's > really as rare as you suggest, it makes me wonder why we're even bothering. > > Regards,
Disclaimer: My own view, I cannot speak for Nuno or Arnaud.

If you're capturing a very large number of variables, then I would view that as a code smell. "Very large" is subjective, of course, and there's some context to it.

The two main use cases I see myself using are:

A) 2-3 liners that use 1-3 variables from scope, so it's dead obvious what they are. In this case, the extra use clause doesn't really add much beyond visible noise.

B) An entire method body is a closure that is being returned, or inlined into an inTransaction() call or something like that. In this case, basically all method parameters would be captured, and it would be on the very previous line, so no matter how many there are (more than ~5 is probably a problem with the method, not with the closure), they're redundant and don't tell you anything that isn't already self-evident.

So the burden is in having to think about redundant syntax at all, plus having more redundant text that has to be read in the future. Even with use(*) or use(...) or whatever, that's better than the status quo but is still just more boilerplate that would have to be added/removed when switching from a one line short lambda (side note: This is the term I always use; I basically never use "arrow function". I don't know how typical that is) to a 2-line closure when refactoring.

--Larry Garfield
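A rough sketch of use case B with today's syntax. The inTransaction() helper and the transfer() function are hypothetical stand-ins, not any real library's API; the point is only that the use list repeats the parameter list from the line above.

```php
<?php
// Hypothetical helper: a real one would begin, commit or roll back a
// transaction around the callback. Here it just runs the callback.
function inTransaction(callable $work)
{
    return $work();
}

function transfer(array $balances, string $from, string $to, int $amount): array
{
    // The whole method body is a closure; every parameter has to be
    // repeated in the use clause.
    return inTransaction(function () use ($balances, $from, $to, $amount) {
        $balances[$from] -= $amount;
        $balances[$to]   += $amount;
        return $balances;
    });
}

$result = transfer(['alice' => 10, 'bob' => 0], 'alice', 'bob', 4);
// ['alice' => 6, 'bob' => 4]
```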
  117918
June 12, 2022 18:05 Danack@basereality.com (Dan Ackroyd)
On Thu, 9 Jun 2022 at 17:34, Larry Garfield <larry@garfieldtech.com> wrote:
> > That RFC didn't fully go to completion due to concerns over the performance impact
I don't believe that is an accurate summary. There were subtle issues in the previous RFC that should have been addressed. Nikita Popov wrote in https://news-web.php.net/php.internals/114239
> I'm generally in favor of supporting auto-capture for multi-line closures. > > There are some caveats though, which this RFC should address: > > Subtle side-effects, visible in debugging functionality, or through destructor > effects (the fact that a variable is captured may be observable). I think it > nothing else, the RFC should at least make clear that this behavior > is explicitly unspecified, and a future implementation may no longer capture > variables where any path from entry to read passes a write.
To be clear, I don't fully understand all those issues myself (and I have just enough knowledge to know to be scared to look at that part of the engine) but my understanding is that the concerns are not about just performance, they are deep concerns about the behaviour.

It would produce a better discussion if the RFC document either said how those issues are resolved, or detailed how they are still limitations on the implementation.

It also probably would have been better (imo) to create a new RFC document. The previous RFC went to vote, even if the vote was cancelled. Disk space is cheap. Having different (though similar) RFCs under the same URL makes it confusing when trying to understand what happened to particular RFCs.

cheers
Dan
Ack
  117930
June 13, 2022 12:29 arnaud.lb@gmail.com (Arnaud Le Blanc)
On dimanche 12 juin 2022 20:05:02 CEST Dan Ackroyd wrote:
> On Thu, 9 Jun 2022 at 17:34, Larry Garfield <larry@garfieldtech.com> wrote: > > That RFC didn't fully go to completion due to concerns over the > > performance impact > I don't believe that is an accurate summary. There were subtle issues > in the previous RFC that should have been addressed. Nikita Popov > wrote in https://news-web.php.net/php.internals/114239
> It would produce a better discussion if the RFC document either said > how those issues are resolved, or detail how they are still > limitations on the implementation.
> To be clear, I don't fully understand all those issues myself (and I > have just enough knowledge to know to be scared to look at that part > of the engine) but my understanding is that the concerns are not about > just performance, they are deep concerns about the behaviour.
Thank you for pointing this out. Nikita was referring to side-effects of capturing too many variables, and suggested making the capture analysis behavior explicitly unspecified in the RFC so that it could be changed (optimized) later. The new version of the RFC does the optimization.

Following your comment, I have clarified a few things in the "Auto-capture semantics" section. This includes a list of ways in which these effects can be observed. These are really marginal cases that are not relevant for most programs.

Cheers
-- 
Arnaud Le Blanc
  117932
June 13, 2022 13:13 rowan.collins@gmail.com (Rowan Tommins)
On 13/06/2022 13:29, Arnaud Le Blanc wrote:
> Following your comment, I have clarified a few things in the "Auto-capture > semantics" section. This includes a list of way in which these effects can be > observed. These are really marginal cases that are not relevant for most > programs.
I'm not sure I agree that all of these are marginal, or with the way you've characterised them...

> Note that destructor timing is undefined in PHP, especially when reference cycles exist.

Outside of reference cycles, which are pretty rare and generally easy to avoid, PHP's destructors are entirely deterministic. Unlike in fully garbage-collected languages, you can use a plain object to implement an "RAII" pattern - e.g. the constructor locks a file and the destructor unlocks it; or the constructor starts a transaction, and the destructor rolls it back if not yet committed.

A related case is resource lifetime: file and network handles are guaranteed to be closed when they go out of scope, and accidentally taking an extra copy of their "value" can prevent that.

> It ends up capturing the same variables that would have been captured by a manually curated |use| list.

This slightly muddles two different questions:

1) Given a well-written closure, where all variables are either clearly local or clearly intended to be captured, does the implementation do a good job of distinguishing them?
2) Given a badly-written closure, where variables are accidentally ambiguous, what side-effects might the user experience?

The answer to question 1 seems to be yes, the implementation does a good job, and that's good news, and thank you for working on it.

That is not the same, however, as saying that question 2 is never relevant. Consider the following, adapted from an example in the RFC:

    $filter = fn ($user) {
        if ( $user->id !== -1 ) {
            $guest = $repository->findByUserId($user->id);
        }
        return isset($guest) && in_array($guest->id, $guestsIds);
    };

This is not particularly great code, but it works ... unless the parent scope happens to have a variable named $guest, which will then be bound to the closure, since there is a path where it is read before being written.

In this case, side effects include:

* The behaviour will change based on the captured value of $guest
* Any resources held by that value will be held until $filter is destructed, rather than when $guest is destructed

Whether the risk of these side effects is a big problem is up for debate, but it's wrong to suggest they don't exist.

Regards,

-- 
Rowan Tommins
[IMSoP]
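To make the RAII point above concrete, here is a minimal sketch of deterministic destructor timing, and of how an extra copy of the handle delays the release. The class `ScopedLock`, the two functions and the static `$log` array are hypothetical names chosen for illustration; nothing here comes from the RFC or its patch.

```php
<?php
// Hypothetical RAII-style guard: "acquires" in the constructor and
// "releases" in the destructor, recording both events in a log.
final class ScopedLock
{
    /** @var string[] Events in the order they happened. */
    public static array $log = [];

    public function __construct(private string $name)
    {
        self::$log[] = "acquire {$this->name}";
    }

    public function __destruct()
    {
        self::$log[] = "release {$this->name}";
    }
}

function critical_section(): void
{
    $lock = new ScopedLock('a');
    ScopedLock::$log[] = 'work a';
    // $lock is the only handle, so the destructor runs when the function returns.
}

function leaky_section(array &$keep): void
{
    $lock = new ScopedLock('b');
    $keep[] = $lock; // an extra copy of the handle outlives the function
    ScopedLock::$log[] = 'work b';
}

critical_section();
ScopedLock::$log[] = 'after a';

$keep = [];
leaky_section($keep);
ScopedLock::$log[] = 'after b';
$keep = []; // dropping the last handle finally releases 'b'

print_r(ScopedLock::$log);
```

Lock 'a' is released before 'after a' is logged, while lock 'b' is only released once `$keep` is emptied. A closure that captures a value it never really needed behaves like `$keep` here.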
  118123
June 29, 2022 17:30 larry@garfieldtech.com ("Larry Garfield")
On Thu, Jun 9, 2022, at 11:34 AM, Larry Garfield wrote:
> Last year, Nuno Maduro and I put together an RFC for combining the > multi-line capabilities of long-closures with the auto-capture > compactness of short-closures. That RFC didn't fully go to completion > due to concerns over the performance impact, which Nuno and I didn't > have bandwidth to resolve. > > Arnaud Le Blanc has now picked up the flag with an improved > implementation that includes benchmarks showing an effectively net-zero > performance impact, aka, good news as it avoids over-capturing. > > The RFC has therefore been overhauled accordingly and is now ready for > consideration. > > https://wiki.php.net/rfc/auto-capture-closure
The conversation has died down, so we'll be opening the vote for this tomorrow.

Two changes of note since the discussion started:

* The option to mix explicit capture and implicit capture has been removed as too confusing/unpredictable. Either trust the engine to capture the right things (the new syntax proposed here) or explicitly list everything (the existing syntax we've had since 5.3.)
* We added a section discussing the `use(*)` syntax alternative, and why it wasn't, er, used. (Pun only sort of intended.)

--Larry Garfield
  118124
June 29, 2022 18:09 internals@lists.php.net ("Björn Larsson via internals")
On 2022-06-29 at 19:30, Larry Garfield wrote:
> On Thu, Jun 9, 2022, at 11:34 AM, Larry Garfield wrote: >> Last year, Nuno Maduro and I put together an RFC for combining the >> multi-line capabilities of long-closures with the auto-capture >> compactness of short-closures. That RFC didn't fully go to completion >> due to concerns over the performance impact, which Nuno and I didn't >> have bandwidth to resolve. >> >> Arnaud Le Blanc has now picked up the flag with an improved >> implementation that includes benchmarks showing an effectively net-zero >> performance impact, aka, good news as it avoids over-capturing. >> >> The RFC has therefore been overhauled accordingly and is now ready for >> consideration. >> >> https://wiki.php.net/rfc/auto-capture-closure > > The conversation has died down, so we'll be opening the vote for this tomorrow. > > Two changes of note since the discussion started: > > * The option to mix explicit capture and implicit capture has been removed as too confusing/unpredictable. Either trust the engine to capture the right things (the new syntax proposed here) or explicitly list everything (the existing syntax we've had since 5.3.) > * We added a section discussing the `use(*)` syntax alternative, and why it wasn't, er, used. (Pun only sort of intended.) > > --Larry Garfield
Hi,

Would it be an option to include a "Future scope" section with these features:

- Explicit capture that lists only the variables to be captured by value or reference, nothing else.
- Extending the traditional anonymous function with use(*) for capturing everything.

Anyway, hope this passes for PHP 8.2!

Regards
//Björn Larsson
  118125
June 29, 2022 22:31 Danack@basereality.com (Dan Ackroyd)
On Wed, 29 Jun 2022 at 18:30, Larry Garfield <larry@garfieldtech.com> wrote:
> > The conversation has died down, so we'll be opening the vote for this tomorrow.
I think I've just thought of a problem with the optimization bit of 'not capturing variables if they are written to before being used inside the closure'.

Imagine some code that looks like this:

    // Acquire some resource e.g. an exclusive lock.
    $some_resource = acquire_some_resource();

    $fn = fn () {
        // Free that resource
        $some_resource = null;
    };

    // do some stuff that assumes the exclusive
    // lock is still active.

    // call the callback that we 'know' frees the resource
    $fn();

That's a not unreasonable piece of code to write even if it's of a style many people avoid. I believe in C++ it's called "Resource acquisition is initialization", though they're trying to change the name to "Scope-Bound Resource Management" as that is a better description of what it is.

With the optimization in place, that code would not behave consistently with how the rest of PHP works, where the lifetime of an object is reasonably well defined with "The destructor method will be called as soon as there are no other references to a particular object,".

From the RFC:
> This approach would result in a waste of memory or CPU usage.
For the record, all of my previous concerns about scoping rules have been about making code hard to reason about, and behave sanely. Memory itself is cheap.

Although not having that optimization might mean that some variables last longer than they should, that is at least explainable*. Having variables not last as long as they should (because of an optimization) is harder to explain, and harder to explain how to work around.

cheers
Dan
Ack

* either use long closures or change your variable name if you don't want it captured.
  118126
June 30, 2022 08:19 rowan.collins@gmail.com (Rowan Tommins)
On 29/06/2022 23:31, Dan Ackroyd wrote:
> Imagine some code that looks like this: > > // Acquire some resource e.g. an exclusive lock. > $some_resource = acquire_some_resource(); > > $fn = fn () { > // Free that resource > $some_resource = null; > } > > // do some stuff that assumes the exclusive > // lock is still active. > > // call the callback that we 'know' frees the resource > $fn(); > > That's a not unreasonable piece of code to write
For that to work, it would require the variable to be captured by reference, not value. Writing to a variable captured by value, like writing to a parameter passed by value, is just writing to a local variable.

In fact, the "optimisation" is in my opinion a critical part of the semantics, to avoid the opposite problem:

    // Acquire some resource e.g. an exclusive lock.
    $some_resource = acquire_some_resource();

    $fn = fn () {
        // Use a variable that happens to have the same name
        // A naive implementation would see $some_resource mentioned, and capture it
        // Over-writing the local variable here makes no difference; the closure still holds the value for next time
        $some_resource = 'hello';
    };

    // Free what we believe is the last pointer, to trigger the destructor
    unset($some_resource);

    // If $some_resource gets captured, it can only be released by destroying the closure
    unset($fn);

Regards,

-- 
Rowan Tommins
[IMSoP]
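The by-value rule described above can be checked with the existing explicit `use` syntax, which runs on released PHP versions. This is only a sketch with made-up variable names (`$counter`, `$bump`), not code from the RFC: each call starts again from the captured value, and the outer variable is never touched.

```php
<?php
$counter = 10;

$bump = function () use ($counter) {
    // $counter is a fresh local copy of the captured value on every call,
    // exactly like a parameter passed by value.
    $counter = $counter + 1;
    return $counter;
};

$first  = $bump();
$second = $bump(); // starts from the captured 10 again, not from 11

var_dump($first, $second, $counter);
```

Both calls return 11 and the outer `$counter` stays at 10, so an assignment inside the closure cannot be used to clear or free the variable in the parent scope.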
  118127
June 30, 2022 09:20 landers.robert@gmail.com (Robert Landers)
On Thu, Jun 30, 2022 at 10:19 AM Rowan Tommins <rowan.collins@gmail.com> wrote:
> For that to work, it would require the variable to be captured by > reference, not value.
I think their suggested code would work (at least currently in PHP) by the simple fact they would increase the reference count on that object/resource until they set it as null. However, with the "optimization," the reference count will never be incremented and thus fail to work as defined.
  118128
June 30, 2022 10:17 rowan.collins@gmail.com (Rowan Tommins)
On 30/06/2022 10:20, Robert Landers wrote:
> I think their suggested code would work (at least currently in PHP) by > the simple fact they would increase the reference count on that > object/resource until they set it as null. However, with the > "optimization," the reference count will never be incremented and thus > fail to work as defined.
No, the captured value is tied to the lifetime of the closure itself, not the variable inside the closure.

    $some_resource = acquire_some_resource();
    // refcount=1 (outer $some_resource)

    $fn = function() use ($some_resource) {
        $some_resource = null;
    };
    // refcount=2 (outer $some_resource, closure $fn)

    $fn();
    // during execution, refcount is 3 (outer $some_resource, closure $fn, local $some_resource)
    // once the local variable is written to, the refcount goes back to 2 (outer $some_resource, closure $fn)

    unset($some_resource);
    // refcount=1 (closure $fn)

    $fn();
    // the captured variable always starts with its original value, regardless of how many times you execute the function
    // during execution, refcount is now 2 (closure $fn, local $some_resource)
    // after execution, refcount is still 1 (closure $fn)

    unset($fn);
    // only now does the refcount go down to 0 and trigger the destructor

The only way for it to work would be using capture by reference (not supported by the proposed short syntax):

    $some_resource = acquire_some_resource();
    // refcount=1: simple variable

    $fn = function() use (&$some_resource) {
        $some_resource = null;
    };
    // refcount=1: a reference set with 2 members (outer $some_resource, closure $fn)

    $fn();
    // during execution, we have a reference set with 3 members (outer $some_resource, closure $fn, local $some_resource)
    // the assignment assigns to this reference set, changing the value referenced by all 3 members
    // refcount on the resource drops from 1 to 0, triggering the destructor

    $fn();
    // because it was captured by reference, the initial value of $some_resource in the closure has now changed

Regards,

-- 
Rowan Tommins
[IMSoP]
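For anyone who wants to run the two cases rather than read refcounts in comments, the following sketch logs when the destructor actually fires. The class `Res` and its static `$log` are illustrative stand-ins for `acquire_some_resource()`, which is not a real function.

```php
<?php
// Hypothetical resource whose destructor records when it is released.
final class Res
{
    /** @var string[] */
    public static array $log = [];

    public function __construct(private string $name) {}

    public function __destruct()
    {
        self::$log[] = "released {$this->name}";
    }
}

// Case 1, capture by value: the closure holds its own handle.
$byValue = new Res('by-value');
$fn = function () use ($byValue) {
    $byValue = null; // only clears the per-call local copy
};
$fn();
unset($byValue);
Res::$log[] = 'by-value: after $fn() and unset of outer variable';
unset($fn); // destroying the closure drops the last handle
Res::$log[] = 'by-value: after unset($fn)';

// Case 2, capture by reference: the assignment writes through.
$byRef = new Res('by-ref');
$fn = function () use (&$byRef) {
    $byRef = null;
};
$fn();
Res::$log[] = 'by-ref: after $fn()';

print_r(Res::$log);
```

In the by-value case the release is logged only after `unset($fn)`; in the by-reference case it is logged during the call, before the 'after $fn()' entry.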
  118132
June 30, 2022 13:25 Danack@basereality.com (Dan Ackroyd)
Hi Rowan,

Rowan wrote:
> For that to work, it would require the variable to be captured by > reference, not value. > ... > The only way for it to work would be using capture by reference (not > supported by the proposed short syntax):
I wrote about this before. Some of the words in the RFC are, in my opinion, quite inaccurate: Danack wrote in https://news-web.php.net/php.internals/117938 :
> Those statements are true for scalar values. They are not true for objects:
With automatic capturing of variables, for the code example I gave the user would want the variable to be captured, and to them it looks like it should be, but because of an optimization it is not. When the code doesn't work as they expect it to, the programmer is likely to add a var_dump to try to see what is happening. Which makes it look like their code 'should' work, as their resource object is still alive.
> In fact, the "optimisation" is in my opinion a critical part of the > semantics, to avoid the opposite problem:
As I said, I think that problem is a lot easier to explain "either use long closures or change your variable name if you don't want it captured." than trying to explain "yes, the variable is referenced inside the closure, but it's not captured because you aren't reading from it".

cheers
Dan
Ack

For this code, comment the var_dump in/out to affect the lifetime of the object.

    class ResourceType
    {
        public function __destruct() {
            echo "Resource is released.\n";
        }
    }

    function get_callback()
    {
        $some_resource = new ResourceType();
        $fn = fn() {
            // // why is my lock released?
            var_dump($some_resource);
            // "Free that resource"
            $some_resource = null;
        };
        return $fn;
    }

    $fn = get_callback();
    echo "Before callback\n";
    $fn();
    echo "After callback\n";

    // Without var_dump
    Resource is released.
    Before callback
    After callback

    // With var_dump
    Before callback
    object(ResourceType)#1 (0) {
    }
    After callback
    Resource is released.
  118134
June 30, 2022 14:18 landers.robert@gmail.com (Robert Landers)
Rowan wrote:
> No, the captured value is tied to the lifetime of the closure itself, not the variable inside the closure.
With the "optimization," it won't be captured at all by the closure, possibly causing some resources to go out of scope early. Are optimizations going to be applied to single-line arrow functions (I didn't see that in the RFC, but I admittedly didn't look that hard and I vaguely remember reading something about it in one of these threads)? If so, it will probably change some behaviors in existing applications if they were relying on it. Perhaps static analysis tools can detect this and inform the developer. Here's Dan's code: https://3v4l.org/99XUN#v8.1.7 that he just sent, modified to not capture the $some_resource and you can see that it is indeed released earlier than if it were captured.
  118138
June 30, 2022 15:47 guilliam.xavier@gmail.com (Guilliam Xavier)
On Thu, Jun 30, 2022 at 3:26 PM Dan Ackroyd <Danack@basereality.com> wrote:
> > Hi Rowan, > > Rowan wrote: > > For that to work, it would require the variable to be captured by > > reference, not value. > > ... > > The only way for it to work would be using capture by reference (not > > supported by the proposed short syntax): > > I wrote about this before. Some of the words in the RFC are, in my > opinion, quite inaccurate: > > Danack wrote in https://news-web.php.net/php.internals/117938 : > > Those statements are true for scalar values. They are not true for objects:
But the RFC has been updated since (notably the DateTime example); do you find the current wording still inaccurate?
> With automatic capturing of variables, for the code example I gave the > user would want the variable to be captured, and to them it looks like > it should be, but because of an optimization it is not.
Am I missing something here? To me, it has been explained (and shown) by Rowan (and me) that the code example you gave would *not* work as expected *even without the optimization* (for it to work it would need to either capture by *reference*, or use e.g. `$some_resource->close();` [or `close($some_resource);`] instead of a destructor); but maybe we don't "expect" the same behavior in the first place?
> When the code doesn't work as they expect it to, the programmer is > likely to add a var_dump to try to see what is happening. Which makes > it look like their code 'should' work, as their resource object is > still alive.
This indeed seems a valid point (that adding a `var_dump($some_resource);` before the `$some_resource = null;` changes it from "not captured" to "captured", with an effect on its lifetime). But are there "real" cases where it would *actually* matter?
> > In fact, the "optimisation" is in my opinion a critical part of the > > semantics, to avoid the opposite problem: > > As I said, I think that problem is a lot easier to explain "either use > long closures or change your variable name if you don't want it > captured." than trying to explain "yes, the variable is referenced > inside the closure, but it's not captured because you aren't reading > from it".
Same as above.

On Thu, Jun 30, 2022 at 4:19 PM Robert Landers <landers.robert@gmail.com> wrote:
> > Rowan wrote: > > No, the captured value is tied to the lifetime of the closure itself, > not the variable inside the closure. > > With the "optimization," it won't be captured at all by the closure, > possibly causing some resources to go out of scope early.
And it has been explained that conversely, capturing it would possibly cause some resources to "remain in scope" late.
> Are > optimizations going to be applied to single-line arrow functions (I > didn't see that in the RFC, but I admittedly didn't look that hard and > I vaguely remember reading something about it in one of these > threads)?
Seems so: https://github.com/php/php-src/pull/8330/files#diff-85701127596aca0e597bd7961b5d59cdde4f6bb3e2a109a22be859ab7568b4d2R7318-R7320
> If so, it will probably change some behaviors in existing > applications if they were relying on it. Perhaps static analysis tools > can detect this and inform the developer.
Here too, do you have a "real" case where it would *actually* matter?
> Here's Dan's code: https://3v4l.org/99XUN#v8.1.7 that he just sent, > modified to not capture the $some_resource and you can see that it is > indeed released earlier than if it were captured.
And here it is "un-modified": https://3v4l.org/gZai2 where you see that calling $fn() (which internally nullifies *its local copy of* $some_resource) does *not* release; is it really what you expect? are you creating the closure only to extend the lifetime of $some_resource? Regards, -- Guilliam Xavier
  118140
June 30, 2022 16:04 landers.robert@gmail.com (Robert Landers)
On Thu, Jun 30, 2022 at 5:47 PM Guilliam Xavier <guilliam.xavier@gmail.com> wrote:
> And here it is "un-modified": https://3v4l.org/gZai2 where you see that calling $fn() (which internally nullifies *its local copy of* $some_resource) does *not* release; is it really what you expect? are you creating the closure only to extend the lifetime of $some_resource?

Personally, not that I'm aware of, which is the point. This may subtly change code that works just fine today and it will be hard to track it down. Though perhaps static analysis/IDEs will help track it down by pointing out automatically captured vs. non-captured variables.

Ah, I see Arnaud just confirmed that it won't be applied to existing arrow functions. Perhaps this is a moot point and it will be just another quirk to be aware of when writing PHP. I was just worried about it being applied to any existing code.
  118139
June 30, 2022 15:56 arnaud.lb@gmail.com (Arnaud Le Blanc)
Hi,

On Thursday, 30 June 2022 16:18:44 CEST Robert Landers wrote:
> Are > optimizations going to be applied to single-line arrow functions (I > didn't see that in the RFC, but I admittedly didn't look that hard and > I vaguely remember reading something about it in one of these > threads)? If so, it will probably change some behaviors in existing > applications if they were relying on it. Perhaps static analysis tools > can detect this and inform the developer.
It is not planned to change the behavior of arrow functions in this RFC. This optimization is less important for arrow functions because they don't usually assign variables. This could be a follow up RFC though.
  118141
June 30, 2022 16:29 guilliam.xavier@gmail.com (Guilliam Xavier)
On Thu, Jun 30, 2022 at 5:57 PM Arnaud Le Blanc <arnaud.lb@gmail.com> wrote:
> > Hi, > > On jeudi 30 juin 2022 16:18:44 CEST Robert Landers wrote: > > Are > > optimizations going to be applied to single-line arrow functions (I > > didn't see that in the RFC, but I admittedly didn't look that hard and > > I vaguely remember reading something about it in one of these > > threads)? If so, it will probably change some behaviors in existing > > applications if they were relying on it. Perhaps static analysis tools > > can detect this and inform the developer. > > It is not planned to change the behavior of arrow functions in this RFC. This > optimization is less important for arrow functions because they don't usually > assign variables.
Ah? Sorry, I had interpreted https://github.com/php/php-src/pull/8330/files#diff-85701127596aca0e597bd7961b5d59cdde4f6bb3e2a109a22be859ab7568b4d2R7318-R7320 as "capture the *minimal* set of variables for *both* arrow functions and short closures", but I was wrong? I don't see a test like this:

```php
class C {
    public function __destruct() { echo 'destructed', PHP_EOL; }
}
$x = new C();
$fn = fn ($a, $b) => (($x = $a ** 2) + ($y = $b ** 2)) * ($x - $y);
echo '- unsetting $x', PHP_EOL;
unset($x);
echo '- calling $fn', PHP_EOL;
var_dump($fn(3, 2));
echo '- unsetting $fn', PHP_EOL;
unset($fn);
echo '- DONE.', PHP_EOL;
```

with current output (https://3v4l.org/ve3BL#v8.1.7):

```
- unsetting $x
- calling $fn
int(65)
- unsetting $fn
destructed
- DONE.
```

where the optimization would make the "destructed" line move up to just after "- unsetting $x"

-- 
Guilliam Xavier
  118143
June 30, 2022 17:37 arnaud.lb@gmail.com (Arnaud Le Blanc)
On Thursday, 30 June 2022 18:29:44 CEST Guilliam Xavier wrote:
> Ah? Sorry, I had interpreted
> https://github.com/php/php-src/pull/8330/files#diff-85701127596aca0e597bd7961b5d59cdde4f6bb3e2a109a22be859ab7568b4d2R7318-R7320
> as "capture the *minimal* set of variables for *both* arrow functions and short closures", but I was wrong?
No, you are right, the PR changes arrow functions too. But in the RFC we decided to not touch the arrow functions for now.
  118144
June 30, 2022 19:48 rowan.collins@gmail.com (Rowan Tommins)
On 30/06/2022 18:37, Arnaud Le Blanc wrote:
> On Thursday, 30 June 2022 18:29:44 CEST Guilliam Xavier wrote:
>> Ah? Sorry, I had interpreted
>> https://github.com/php/php-src/pull/8330/files#diff-85701127596aca0e597bd7961b5d59cdde4f6bb3e2a109a22be859ab7568b4d2R7318-R7320
>> as "capture the *minimal* set of variables for *both* arrow functions and short closures", but I was wrong?
> No, you are right, the PR changes arrow functions too. But in the RFC we decided to not touch the arrow functions for now.
Personally, I would be in favour of leaving the change in for arrow functions as well. The fact that a variable of the same name, whose value is never actually used, is captured by the closure, is to me a bug, not a feature. It's hard to even contrive an example where this is observable, so I highly doubt anyone is relying on it. Regards, -- Rowan Tommins [IMSoP]
  118136
June 30, 2022 15:36 rowan.collins@gmail.com (Rowan Tommins)
On 30/06/2022 14:25, Dan Ackroyd wrote:
> With automatic capturing of variables, for the code example I gave the > user would want the variable to be captured, and to them it looks like > it should be, but because of an optimization it is not.
Please look again at the detailed explanation I gave, and the examples that Guilliam posted. Your example can *only* work if the variable is captured by reference, because it requires the statement *inside* the closure to have an effect on a variable *outside* the closure. No version of auto-capture has ever proposed capturing by reference.

If instead of

    $some_resource = null;

you wrote

    $some_container->some_resource = null;

then that would have an effect on the object, but the "optimisation" would be irrelevant because the use of $some_container itself is not an assignment.
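A short sketch of the container variant, again with the existing `use` syntax so it runs today. `Handle`, `Container` and `$release` are hypothetical names for illustration. The closure captures the container by value, but both handles point at the same object, so the property write is visible outside and releases the resource.

```php
<?php
// Hypothetical resource that logs its own release.
final class Handle
{
    /** @var string[] */
    public static array $log = [];

    public function __destruct()
    {
        self::$log[] = 'handle released';
    }
}

// Hypothetical container that owns the only reference to the resource.
final class Container
{
    public ?Handle $resource = null;
}

$container = new Container();
$container->resource = new Handle();

// $container is read (not assigned) inside the closure, so it is a genuine
// capture; the write goes to a property of the shared object.
$release = function () use ($container) {
    $container->resource = null;
};

Handle::$log[] = 'before callback';
$release();
Handle::$log[] = 'after callback';

var_dump($container->resource === null);
print_r(Handle::$log);
```

Here the release is logged between 'before callback' and 'after callback', which is the behaviour the original example was hoping for.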
> As I said, I think that problem is a lot easier to explain "either use > long closures or change your variable name if you don't want it > captured." than trying to explain "yes, the variable is referenced > inside the closure, but it's not captured because you aren't reading > from it".
Right now, assigning (or unsetting) a variable is the *only* way to force it to be local. That's why I said I would be more likely to support this feature alongside a "var" or "let" keyword to make such variables explicit. Not being able to have local variables *at all* other than by very careful variable naming is a terrible idea.

Just to re-iterate, here's your new example with explicit capture, to demonstrate that the closure *does not and cannot free the resource*: https://3v4l.org/WrTb5

    class ResourceType
    {
        public function __destruct() {
            echo "Resource is released.\n";
        }
    }

    function get_callback()
    {
        $some_resource = new ResourceType();
        $fn = function() use ($some_resource) {
            // this line does nothing
            // it overwrites a local variable which is never read
            // next time the closure runs, it will start again as the captured value
            $some_resource = null;
        };
        return $fn;
    }

    $fn = get_callback();
    echo "Before callback\n";
    $fn();
    echo "After callback\n";
    unset($some_resource);
    echo "After destroying outer var\n";
    // the captured reference is still live here, no matter how many times we call $fn()
    // only destroying the closure frees it
    unset($fn);
    echo "After destroying closure\n";

One way of thinking of it is that assignments inside a closure are assignments to a local variable, which "shadow" any captured variable with the same name. If all you do with a variable is shadow it, then it is illogical to consider it "used" in that function.

On 30/06/2022 15:18, Robert Landers wrote:
> Are > optimizations going to be applied to single-line arrow functions (I > didn't see that in the RFC, but I admittedly didn't look that hard and > I vaguely remember reading something about it in one of these > threads)?
I would expect so, yes. It could be considered a bug that the arrow function implementation currently "over-captures" variables, and it only wasn't a higher priority in Nikita's RFC because it is extremely rare that a single expression closure would have any local variables. Indeed, that lack of local scope is one of the big reasons why I and others supported that RFC, because it avoids all the confusion evident in today's messages. Regards, -- Rowan Tommins [IMSoP]
  118129
June 30, 2022 10:28 guilliam.xavier@gmail.com (Guilliam Xavier)
On Thu, Jun 30, 2022 at 11:20 AM Robert Landers
robert@gmail.com> wrote:
>
> On Thu, Jun 30, 2022 at 10:19 AM Rowan Tommins collins@gmail.com> wrote:
> >
> > On 29/06/2022 23:31, Dan Ackroyd wrote:
> > > Imagine some code that looks like this:
> > >
> > >     // Acquire some resource e.g. an exclusive lock.
> > >     $some_resource = acquire_some_resource();
> > >
> > >     $fn = fn () {
> > >         // Free that resource
> > >         $some_resource = null;
> > >     }
> > >
> > >     // do some stuff that assumes the exclusive
> > >     // lock is still active.
> > >
> > >     // call the callback that we 'know' frees the resource
> > >     $fn();
> > >
> > > That's a not unreasonable piece of code to write
> >
> >
> > For that to work, it would require the variable to be captured by
> > reference, not value. Writing to a variable captured by value, like
> > writing to a parameter passed by value, is just writing to a local variable.
> >
> >
> > In fact, the "optimisation" is in my opinion a critical part of the
> > semantics, to avoid the opposite problem:
> >
> >     // Acquire some resource e.g. an exclusive lock.
> >     $some_resource = acquire_some_resource();
> >
> >     $fn = fn () {
> >         // Use a variable that happens to have the same name
> >         // A naive implementation would see $some_resource mentioned, and capture it
> >         // Over-writing the local variable here makes no difference; the closure still holds the value for next time
> >         $some_resource = 'hello';
> >     }
> >
> >     // Free what we believe is the last pointer, to trigger the destructor
> >     unset($some_resource);
> >
> >     // If $some_resource gets captured, it can only be released by destroying the closure
> >     unset($fn);
> >
> >
> > Regards,
> >
> > --
> > Rowan Tommins
> > [IMSoP]
> >
> > --
> > PHP Internals - PHP Runtime Development Mailing List
> > To unsubscribe, visit: https://www.php.net/unsub.php
> >
>
> > For that to work, it would require the variable to be captured by
> > reference, not value.
>
> I think their suggested code would work (at least currently in PHP) by
> the simple fact they would increase the reference count on that
> object/resource until they set it as null. However, with the
> "optimization," the reference count will never be incremented and thus
> fail to work as defined.
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>
No offense, but why don't you just try it? Please see equivalents of:

- Dan's code: https://3v4l.org/51jXY => doesn't "work"
- Dan's code with capture by reference (as said by Rowan): https://3v4l.org/JoUVi => "works"
- Rowan's code: https://3v4l.org/7ZVv3 => shows the "problem" with capture

PS: I see that Rowan just replied with refcount explanations. I agree (but am sending this anyway)

Regards,

-- 
Guilliam Xavier
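For readers who cannot open the links, the difference between the by-value and by-reference variants can be sketched with today's explicit `use` syntax. This is not the code behind the 3v4l links; `Log` and `Lock` are hypothetical names, and PHP 8.0+ is assumed for constructor promotion.

```php
<?php
// Hypothetical helper: collects messages so the order of events is visible.
final class Log
{
    public static array $lines = [];
}

// Hypothetical stand-in for "some resource" with an observable release.
final class Lock
{
    public function __construct(private string $name) {}

    public function __destruct()
    {
        Log::$lines[] = "{$this->name} released";
    }
}

// Capture by value: the closure keeps its own reference to the object.
$a = new Lock('by-value');
$byValue = function () use ($a) {
    $a = null; // only overwrites the closure's local copy for this call
};
unset($a);
$byValue();
Log::$lines[] = 'by-value closure called';
unset($byValue); // the object is released only here

// Capture by reference: the closure and the outer scope share one variable.
$b = new Lock('by-reference');
$byRef = function () use (&$b) {
    $b = null; // clears the shared variable, dropping the last reference
};
$byRef();
Log::$lines[] = 'by-reference closure called';

print_r(Log::$lines);
```

In this sketch the by-value object survives the call and is released when the closure is destroyed, while the by-reference object is released during the call itself.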
  118130
June 30, 2022 10:29 arnaud.lb@gmail.com (Arnaud Le Blanc)
Hi,

On jeudi 30 juin 2022 00:31:44 CEST Dan Ackroyd wrote:
> On Wed, 29 Jun 2022 at 18:30, Larry Garfield <larry@garfieldtech.com> wrote:
> > The conversation has died down, so we'll be opening the vote for this
> > tomorrow.
> I think I've just thought of a problem with the optimization bit of
> 'not capturing variables if they are written to before being used
> inside the closure'.
>
> Imagine some code that looks like this:
>
>     // Acquire some resource e.g. an exclusive lock.
>     $some_resource = acquire_some_resource();
>
>     $fn = fn () {
>         // Free that resource
>         $some_resource = null;
>     }
>
>     // do some stuff that assumes the exclusive
>     // lock is still active.
>
>     // call the callback that we 'know' frees the resource
>     $fn();
>
> That's a not unreasonable piece of code to write even if it's of a
> style many people avoid. I believe in C++ it's called "Resource
> acquisition is initialization", though they're trying to change the
> name to "Scope-Bound Resource Management" as that is a better
> description of what it is.
I feel that the RAII pattern aka SBRM / Scope-Bound Resource Management is not relevant in PHP context, and I don't believe that it's commonly used in PHP or in garbage-collected languages.

Also, in this particular code example, using an explicit fclose() would be better in every way, including legibility and reliability, so this doesn't appear to be realistic code. Because of this, I don't think that we should be taking decisions on this feature based on this use case.

I've used the RAII pattern in PHP to manage temporary files, as a best-effort way to remove them (in a destructor) when they are not used anymore. However I would not rely on this for anything more critical or anything that requires predictability in resource release timing.

RAII is useful in C++ because memory is managed manually. This is not the case in PHP. It's also useful in C++ to manage other kinds of resources such as file pointers or locks. In PHP it would be dangerous because you don't realistically control the lifetime of values, so you also don't control the timing at which the resources are closed. It's too easy to extend the lifetime of a value accidentally.

One way the lifetime of a value could be extended is via a reference cycle. These are easy to introduce and difficult to prevent or observe (e.g. in a test or in an assertion). Another way would be by referencing the value somewhere else. You cannot guarantee that the lifetime of a value is unaffected after passing it to a function. In C++ it's different because no code would implicitly keep a reference to a variable passed to it unless it was part of that code's contract, or unless the variable was refcounted.

Another factor that makes RAII un-viable in PHP is that the order of the destructor calls is unspecified. Currently, if multiple objects go out of scope at the same time, they happen to be called in a FIFO order, which is not what is needed when using the RAII pattern [0][1].

I think that RAII can only realistically be used in a non-managed, non-refcounted, non-GC language. GC or reference counting should not be used to manage anything other than memory allocation.

Other languages typically have other ways to explicitly manage the lifetime of resources. Go has `defer()` [2]. Python has context managers / `with` [3], C# has `using` [4]. `with` and `using` can be implemented in userland in PHP.

Because of all these reasons, I don't think that RAII in PHP is practical or actually used. So I don't think that we should be taking decisions on Short Closures based on this use case.
> With the optimization in place, that code would not behave
> consistently with how the rest of PHP works
There exists no circumstance in PHP in which the existence of the statement `$a = null` would extend the lifetime of the value bound to `$a`.

[0] Destructor order PHP: https://3v4l.org/iGAPj
[1] Destructor order C++: https://godbolt.org/z/f78Pa9j69
[2] https://go.dev/doc/effective_go#defer
[3] https://docs.python.org/3/reference/compound_stmts.html#with
[4] https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/using-statement

Cheers,

-- 
Arnaud Le Blanc
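As a rough idea of what a userland `using` could look like, here is a minimal sketch built on `try`/`finally`. The function name `using()` and its three-argument signature are assumptions made for this illustration, not an existing PHP API.

```php
<?php
/**
 * Hypothetical helper: run $body with $resource, then always run $release,
 * whether $body returns normally or throws.
 */
function using($resource, callable $body, callable $release)
{
    try {
        return $body($resource);
    } finally {
        $release($resource);
    }
}

// Normal path: the stream is closed as soon as the body returns.
$handle = fopen('php://memory', 'w+');
$written = using(
    $handle,
    fn($h) => fwrite($h, "hello\n"),
    fn($h) => fclose($h)
);

// Error path: the release callback still runs before the exception propagates.
$released = false;
try {
    using(
        'dummy',
        function ($r) { throw new RuntimeException('boom'); },
        function ($r) use (&$released) { $released = true; }
    );
} catch (RuntimeException $e) {
    // the exception reaches the caller after cleanup
}

var_dump($written, is_resource($handle), $released);
```

The release timing here depends only on control flow, not on reference counts, which is the property asked for above.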
  118131
June 30, 2022 12:20 rowan.collins@gmail.com (Rowan Tommins)
On 30/06/2022 11:29, Arnaud Le Blanc wrote:

> I feel that the RAII pattern aka SBRM / Scope-Bound Resource Management is not
> relevant in PHP context, and I don't believe that it's commonly used in PHP or
> in garbage-collected languages.
I've used a simple version of the pattern effectively to implement transactions: if the Transaction object goes out of scope without being explicitly committed or rolled back, it assumes the program hit an unexpected error condition and rolls back.
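A minimal sketch of that transaction pattern, under the assumption that it works roughly as described. The `Transaction` class here is hypothetical and only records what it would do instead of talking to a database.

```php
<?php
// Hypothetical helper: collects messages so the outcome is visible.
final class Log
{
    public static array $lines = [];
}

// Hypothetical transaction guard: rolls back in the destructor unless
// commit() or rollBack() was called explicitly first.
final class Transaction
{
    private bool $finished = false;

    public function commit(): void
    {
        $this->finished = true;
        Log::$lines[] = 'commit';
    }

    public function rollBack(): void
    {
        $this->finished = true;
        Log::$lines[] = 'rollback';
    }

    public function __destruct()
    {
        if (!$this->finished) {
            $this->rollBack();
        }
    }
}

function transfer(bool $fail): void
{
    $tx = new Transaction();
    if ($fail) {
        // $tx leaves scope without being committed
        throw new RuntimeException('unexpected error');
    }
    $tx->commit();
}

transfer(false);
try {
    transfer(true);
} catch (RuntimeException $e) {
    Log::$lines[] = 'caught';
}

print_r(Log::$lines);
```

The guard only works as long as nothing else keeps a reference to the `Transaction` object, which is the concern raised in the quoted message.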
> One way the lifetime of a value could be extended is via a reference cycle.
> These are easy to introduce and difficult to prevent or observe (e.g. in a
> test or in an assertion).
I would expect reference cycles to be pretty rare in most code, particularly when you're dealing with a value with a short lifetime as is involved in most RAII scenarios. The worst-case release of the cycle can also be made predictable by running gc_collect_cycles().
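To make the cycle case concrete, here is a small sketch assuming the cycle collector is enabled (the default `zend.enable_gc=1`). `Log` and `Node` are hypothetical names for this illustration.

```php
<?php
// Hypothetical helper: collects messages so the order of events is visible.
final class Log
{
    public static array $lines = [];
}

// Hypothetical class that can point at another instance, forming a cycle.
final class Node
{
    public ?Node $other = null;

    public function __destruct()
    {
        Log::$lines[] = 'node destroyed';
    }
}

$a = new Node();
$b = new Node();
$a->other = $b;
$b->other = $a; // $a and $b now keep each other alive

// Reference counting alone cannot free the pair.
unset($a, $b);
Log::$lines[] = 'after unset';

// Forcing a collection run makes the release point predictable.
gc_collect_cycles();
Log::$lines[] = 'after gc_collect_cycles';

print_r(Log::$lines);
```

Neither destructor runs at `unset()`; both run during the explicit `gc_collect_cycles()` call.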
> Another way would be by referencing the value
> somewhere else. You cannot guarantee that the lifetime of a value is
> unaffected after passing it to a function.
Surely the only way to avoid that is with something like Rust's "borrow checker"? Otherwise, any function that has a reference to something can extend the lifetime of that reference by storing it inside some other structure with a longer lifetime. Manually freeing the underlying resource then just leads to a "use after free" error.
> Another factor that makes RAII un-viable in PHP is that the order of the
> destructor calls is unspecified. Currently, if multiple objects go out of
> scope at the same time, they happen to be called in a FIFO order, which is not
> what is needed when using the RAII pattern [0][1].
I can imagine this would be a problem for some advanced uses of the pattern, but for a simple "acquire lock, release on scope exit" or "start transaction, rollback on unexpected scope exit", it's generally not relevant.
> Other languages typically have other ways to explicitly manage the lifetime of
> resources. Go has `defer()` [2]. Python has context managers / `with` [3], C#
> has `using` [4]. `with` and `using` can be implemented in userland in PHP.
My understanding is that C#'s "using" is indeed about deterministic destruction, but Python's "with" is a more powerful inversion-of-control mechanism.

I would actually really love to have some version of Python's context managers in PHP, and think it would be a better alternative to closures in a lot of cases. For instance, a motivation cited in support of auto-capture is something like this:

    function doSomething($a, $b, $c) {
        return $db->doInTransaction(fn() {
            // use $a, $b, and $c
            // roll back on exception, commit otherwise
            return $theActualResult;
        });
    }

But this is actually quite a "heavy" implementation: we create a Closure, capture values, enter a new stack frame, and have two return statements, just to wrap the code in try...catch...finally boilerplate. The equivalent with a context manager would look something like this:

    function doSomething($a, $b, $c) {
        with ( $db->startTransaction() as $transaction ) {
            // use $a, $b, and $c
            // roll back on exception, commit otherwise
            return $theActualResult;
        }
    }

Here, the with statement doesn't create a new stack frame, it just triggers a series of callbacks for the boilerplate at the start and end of the block. No variables need to be captured, because they are all still available, and "return" returns from the doSomething() function, not the transaction wrapper.

The explanation of how Python's implementation works and why is an interesting read: https://peps.python.org/pep-0343/

Regards,

-- 
Rowan Tommins
[IMSoP]
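For comparison, the closure-based version can be written in current PHP with explicit capture. This is a minimal sketch: the `Db` class and its `doInTransaction()` method are hypothetical and only log what a real implementation would send to the database.

```php
<?php
// Hypothetical database wrapper used only for this illustration.
final class Db
{
    public array $log = [];

    // Runs $body between "begin" and "commit", rolling back if it throws.
    public function doInTransaction(callable $body)
    {
        $this->log[] = 'begin';
        try {
            $result = $body();      // second stack frame, second return
            $this->log[] = 'commit';
            return $result;
        } catch (Throwable $e) {
            $this->log[] = 'rollback';
            throw $e;
        }
    }
}

function doSomething(Db $db, int $a, int $b, int $c): int
{
    // Every outer variable the body needs must be listed in use (...).
    return $db->doInTransaction(function () use ($a, $b, $c) {
        $sum = $a + $b;
        return $sum + $c;
    });
}

$db = new Db();
$result = doSomething($db, 1, 2, 3);
var_dump($result, $db->log);
```

The `use ($a, $b, $c)` list is the part auto-capture would remove; the closure object, the extra stack frame and the nested `return` would remain.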