[RFC] Union Types v2

  106844
September 4, 2019 08:26 nikita.ppv@gmail.com (Nikita Popov)
Hi internals,

I'd like to start the discussion on union types again, with a new proposal:

Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md

As an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.

Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in this
proposal are:
 * Updated to specify interaction with new language features, like full
variance and property types.
 * Updated for the use of the ?Type syntax rather than the Type|null syntax.
 * Only supports "false" as a pseudo-type, not "true".
 * Slightly simplified semantics for the coercive typing mode.

Regards,
Nikita
  106845
September 4, 2019 09:15 ocramius@gmail.com (Marco Pivetta)
Hey Nikita,

On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. >
Huge huge huge +1! Reviewing line-by-line allowed me to go much more in detail with the feedback :-) Marco Pivetta http://twitter.com/Ocramius http://ocramius.github.com/
  106846
September 4, 2019 09:36 aegir@aegir.sexy (Aegir Leet)
On the subject of using GitHub for this RFC:

Personally, I think GitHub is a much better platform than the mailing 
list for this kind of discussion. Mailing list threads are just not very 
accessible to the average PHP user. Reading them through externals.io is 
an OK experience, but actually contributing is unnecessarily complicated.

I'd imagine most people aren't really interested in something like the 
recent "Annotating internal function argument and return types​" thread, 
but would very much like to participate in a discussion (even if it's 
just leaving a "+1") about something that affects their day-to-day work, 
like union types.

Generally, I'd split things up like this:
Truly internal work on PHP itself, things only people working *on* PHP 
(as opposed to *with* PHP) really care about -> internals.
Changes affecting regular PHP users, RFCs for adding or removing 
features and things like that -> GitHub or some other platform where 
everyone can easily contribute.


On 04.09.19 10:26, Nikita Popov wrote:
> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. > > Relatively to the previous proposal by Bob&Levi ( > https://wiki.php.net/rfc/union_types), I think the main differences in this > proposal are: > * Updated to specify interaction with new language features, like full > variance and property types. > * Updated for the use of the ?Type syntax rather than the Type|null syntax. > * Only supports "false" as a pseudo-type, not "true". > * Slightly simplified semantics for the coercive typing mode. > > Regards, > Nikita >
  106847
September 4, 2019 09:58 nicolas.grekas+php@gmail.com (Nicolas Grekas)


Thank you Nikita, this would be a hugely welcomed step forward! Can't wait
to get rid of all those docblock annotations!

I've some additional suggestions that would greatly help remove more
docblocks and provide engine-enforced type-safety, maybe for the "future
scope" section. We use the all in Symfony:

   - we miss a stringable union type, for `string|__toString`. This is
   required when an API need lazyness regarding the generation of some
   strings. Right now, we have no other option but using string|object.
   - we use "@return $this" quite often. On the implementation side, I'd
   suggest enforcing this at compile time only (enforce all return points are
   explicit, and written as "return $this", similarly to what is done for
   nullable/void return types.) This makes sense only for return types.
   - we use "@return static" quite often too, typically when a method
   returns a clone of the current instance. If anyone wonders, this is not the
   same as "@return self" of methods overridden in child classes. This makes
   sense only for return types.

About union types with void in them, we do use one in Symfony:
Command::execute() has "@return int|void". The reason is that if we were to
use "?int" instead, the engine would force the community to add "return
null;" where no return statement is needed at all most of the time. Right
now, we consider that the transition cost for the community is not worth
the extra boilerplate this requires. Note that there would be only one
friendly path forward: trigger a deprecation when null is returned, asking
ppl to add  "return 0;". Not sure how this should impact the proposal, but
I thought it could be worth sharing.

Thanks again,
Nicolas
  106848
September 4, 2019 11:20 matthewmatthew@gmail.com (Matthew Brown)
- Agree on the usefulness of a stringable meta-type.

- Hack supports an explicit “: this” return type (without dollar) when returning “new static(...)”. I think I might prefer that to “: static”.

- From a type perspective, I don’t understand the “int|void” idea - it might make your users’ life easier, but doesn’t accord with how PHP works (which treats void as null to consumers).

- if we’re adding to some future wish list, would love to have support for “: noreturn” when a function always throws or exits

On Sep 4, 2019, at 5:58 AM, Nicolas Grekas grekas+php@gmail.com> wrote:

>> >> https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > > Thank you Nikita, this would be a hugely welcomed step forward! Can't wait > to get rid of all those docblock annotations! > > I've some additional suggestions that would greatly help remove more > docblocks and provide engine-enforced type-safety, maybe for the "future > scope" section. We use the all in Symfony: > > - we miss a stringable union type, for `string|__toString`. This is > required when an API need lazyness regarding the generation of some > strings. Right now, we have no other option but using string|object. > - we use "@return $this" quite often. On the implementation side, I'd > suggest enforcing this at compile time only (enforce all return points are > explicit, and written as "return $this", similarly to what is done for > nullable/void return types.) This makes sense only for return types. > - we use "@return static" quite often too, typically when a method > returns a clone of the current instance. If anyone wonders, this is not the > same as "@return self" of methods overridden in child classes. This makes > sense only for return types. > > About union types with void in them, we do use one in Symfony: > Command::execute() has "@return int|void". The reason is that if we were to > use "?int" instead, the engine would force the community to add "return > null;" where no return statement is needed at all most of the time. Right > now, we consider that the transition cost for the community is not worth > the extra boilerplate this requires. Note that there would be only one > friendly path forward: trigger a deprecation when null is returned, asking > ppl to add "return 0;". Not sure how this should impact the proposal, but > I thought it could be worth sharing. > > Thanks again, > Nicolas
  106849
September 4, 2019 11:30 arnold.adaniels.nl@gmail.com (Arnold Daniels)
Instead of using `__toString` as type maybe it's better to introduce a `Stringable` interface, similar to how `__wakeup` and `__sleep` are already superseded by `Serializable`. Of course `__toString` would still continue to work (for bc), but isn't usable for type hinting.

[Arnold Daniels - Chat @ Spike](https://www.spikenow.com/?ref=spike-organic-signature&_ts=5acb3)	[5acb3]

On September 4, 2019 at 9:59 GMT, Nicolas Grekas grekas+php@gmail.com> wrote:
  106850
September 4, 2019 11:45 phpmailinglists@gmail.com (Peter Bowyer)
On Wed, 4 Sep 2019 at 12:30, Arnold Daniels nl@gmail.com>
wrote:

> Instead of using `__toString` as type maybe it's better to introduce a > `Stringable` interface, similar to how `__wakeup` and `__sleep` are already > superseded by `Serializable`.
I support that. I don't like the naming in `string|__toString`. `string|Stringable` is more readable IMO. Peter
  106851
September 4, 2019 11:54 Danack@basereality.com (Dan Ackroyd)
On Wed, 4 Sep 2019 at 10:59, Nicolas Grekas
grekas+php@gmail.com> wrote:

> we use "@return $this" quite often. On the implementation side, I'd > suggest enforcing this at compile time only (enforce all return points are > explicit, and written as "return $this"
That's doing a deeper level of thing than a type system normal does. It's enforcing the internal behaviour of a function, rather than just defining the type of the parameter that is returned.
> if we were to use "?int" instead, the engine would force the > community to add "return null;"
That sounds like a bug to me. The fact that null is returned by any function that lacks an explicit return value, is well-defined, and one of the most underrated aspects of PHP imo. Arnold Daniels wrote:
> Instead of using `__toString` as type maybe it's better to introduce > a `Stringable` interface,
Although casting things to string is probably the most common use case, if you're going to think about an RFC along those lines, covering all of the scalars would probably be a good idea, as that would allow people to use specific types for values, that can then be passed easily to functions that expect a scalar. function setRetryLimit(int $maxRetries) {...} class ImageUploadRetryLimit extends int {...} function processImage(ImageUploadRetryLimit $iurl, ....) { ... setRetryLimit($iurl); ... } That type* of stuff is completely possible currently in PHP, it's just that it's a bit painful to both write and read. cheers Dan Ack *intended
  106854
September 4, 2019 13:23 claude.pache@gmail.com (Claude Pache)
> Le 4 sept. 2019 à 13:54, Dan Ackroyd <danack@basereality.com> a écrit : > >> >> if we were to use "?int" instead, the engine would force the >> community to add "return null;" > > That sounds like a bug to me. The fact that null is returned by any > function that lacks an explicit return value, is well-defined, and one > of the most underrated aspects of PHP imo. >
This is an aspect of the eternal debate between explicit vs. implicit. In another recent thread, another aspect, namely variable initialisation, was debated. There is no “correct” solution, as it pertains much to coding style. —Claude
  106868
September 5, 2019 09:39 come@opensides.be (=?ISO-8859-1?Q?C=F4me?= Chilliet)
Le mercredi 4 septembre 2019, 10:26:09 CEST Nikita Popov a écrit :
> As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual.
Huge "no" from me on using github for discussing RFCs. Huge "yes" on the RFC content.
> * Only supports "false" as a pseudo-type, not "true".
I think it would make sense to support true as well as I’ve seen a lot of cases of functions returning either an error string or TRUE on success. So that would be string|true. Côme
  106870
September 5, 2019 10:04 brendt@stitcher.io (Brent)
Hi Côme

> Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also curious about Nikita's experience with it, as he is the one having to process the feedback. Kind regards Brent On 5 Sep 2019, 11:40 +0200, Côme Chilliet <come@opensides.be>, wrote:
> Le mercredi 4 septembre 2019, 10:26:09 CEST Nikita Popov a écrit : > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > > evaluate whether this might be a better medium for RFC proposals in the > > future. It would be great if we could keep the discussion to the GitHub > > pull request for the purpose of this experiment (keep in mind that you can > > also create comments on specific lines in the proposal, not just the > > overall discussion thread!) Of course, you can also reply to this mail > > instead. The final vote will be held in the wiki as usual. > > Huge "no" from me on using github for discussing RFCs. > > Huge "yes" on the RFC content. > > > * Only supports "false" as a pseudo-type, not "true". > > I think it would make sense to support true as well as I’ve seen a lot of cases of functions returning either an error string or TRUE on success. > So that would be string|true. > > Côme > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php >
  106871
September 5, 2019 10:21 come@opensides.be (=?ISO-8859-1?Q?C=F4me?= Chilliet)
Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
> > Huge "no" from me on using github for discussing RFCs. > > Care to elaborate why? The majority seems to like it. Though I am also curious about Nikita's experience with it, as he is the one having to process the feedback.
Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions, and should not encourage (some would say force) people to use such platforms. PHP is already on github but it’s only a mirror, the main git repository is at git.php.net . Côme
  106872
September 5, 2019 10:50 krakjoe@gmail.com (Joe Watkins)
> Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions
This is obviously your opinion, but you haven't actually told us why this is the case, and it's not at all obvious.
> should not encourage (some would say force) people to use such platforms.
In any case it's not a choice for the contributor, internals chooses the medium and the contributor has to use it. Whether we force them to use a mailing list from last century or something from this century makes no difference with regard to choice for the contributor. Cheers Joe On Thu, 5 Sep 2019 at 12:22, Côme Chilliet <come@opensides.be> wrote:
> Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit : > > > Huge "no" from me on using github for discussing RFCs. > > > > Care to elaborate why? The majority seems to like it. Though I am also > curious about Nikita's experience with it, as he is the one having to > process the feedback. > > Because the PHP project should avoid depending on a privately owned > centralized service for its technical discussions, and should not encourage > (some would say force) people to use such platforms. > > PHP is already on github but it’s only a mirror, the main git repository > is at git.php.net . > > Côme > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php > >
  106874
September 5, 2019 11:27 come@opensides.be (=?ISO-8859-1?Q?C=F4me?= Chilliet)
Le jeudi 5 septembre 2019, 12:50:48 CEST Joe Watkins a écrit :
> > Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions > > This is obviously your opinion, but you haven't actually told us why this > is the case, and it's not at all obvious.
I thought it was obvious, sorry. So, the problem is PHP have no control over github, and cannot know how it will evolve. GAFAM organizations have closed people accounts before, sometimes without giving reasons. I think PHP project should not let other organisations choose who can or cannot participate in PHP development.
> > should not encourage (some would say force) people to use such platforms. > > In any case it's not a choice for the contributor, internals chooses the > medium and the contributor has to use it. Whether we force them to use a > mailing list from last century or something from this century makes no > difference with regard to choice for the contributor.
With a mailing list the contributor can choose its email server and email client. He does not have to agree to term of service of a third party. https://tosdr.org/#github With github, internal chooses to delegate to github, and github (currently microsoft) chooses who can take part in the discussion and how the discussion works. (they can change the platform tomorrow and internal won’t be able to do anything about it). Côme
  106876
September 5, 2019 12:07 Danack@basereality.com (Dan Ackroyd)
On Thu, 5 Sep 2019 at 12:27, Côme Chilliet <come@opensides.be> wrote:
> > PHP have no control over github, and cannot know how it will evolve. > > (they can change the platform tomorrow and internal won’t be able to do anything about it). >
Those are hypothetically problems. But they do not appear to be currently problems. I'm pretty sure that if new problems with a medium were encountered, we could adapt to either work around them or move to a different system. And in case anyone says "some people might not be able to comment on Github" the same is true for our email lists. The signup process was apparently broken for ages, and I've seen multiple people ask for how to persuade the system to accept their messages. Which probably means there are more people who never contributed because they couldn't get past that first barrier. Actual problems I can see with having the discussion on github: i) There is no off-topic space. For example, apparently some people don't understand the RFC and could do with a brief explainer on type systems. Doing that inline to the github comments would make the on-topic discussion harder to read. ii) It means that the discussion is harder to track. However.....that is already a problem. When I was putting together the info for https://github.com/danack/RfcCodex which attempts to document why certain ideas that keep coming up haven't succeeded yet, it was a massive pain trying to track email threads to the RFC. Both of those things are not really technical problems. They are documentation problems. They would be best solved (imo) by having a paid member of staff on the PHP project writing lots of words. cheers Dan Ack
  106877
September 5, 2019 13:13 girgias@php.net (George Peter Banyard)
One idea could be to use a self hosted GitLab instance,

I'm pretty sure there are multiple ways via OAuth to connect to an
independent GitLab instance.

This would allow to have PR like thread on PHP's own infrastructure (even
though it seems the project is pretty bad at keeping it's infrastructure up
to date) while keeping control over it.

Best regards

George Peter Banyard
  106878
September 5, 2019 13:29 brendt@stitcher.io (Brent)
Let's name a few examples of large OSS projects managed on GitHub:

 - ECMA (https://github.com/tc39)
 - Rust (https://github.com/rust-lang/rust)
 - React (https://github.com/facebook/react)
 - Node (https://github.com/nodejs/node)

Also, let's not forget the hunderd of thousands of PHP packages hosted on GitHub. While GitHub might _in theory_ one day decide to stop, there is no way it will happen in practice, GitHub has proven itself as a reliable platform over the past ten years.

On the topic of self hosted GitLab: let's not reinvent the wheel. Managing your own platform will take more manpower and resources, which are better invested somewhere else — the development of PHP perhaps?

I believe GitHub is the way to go. Several large communities manage their OSS on it and have proven it works, PHP should simply do the same.

Kind regards
Brent
On 5 Sep 2019, 15:14 +0200, George Peter Banyard <girgias@php.net>, wrote:
> One idea could be to use a self hosted GitLab instance, > > I'm pretty sure there are multiple ways via OAuth to connect to an > independent GitLab instance. > > This would allow to have PR like thread on PHP's own infrastructure (even > though it seems the project is pretty bad at keeping it's infrastructure up > to date) while keeping control over it. > > Best regards > > George Peter Banyard
  106879
September 5, 2019 13:47 rowan.collins@gmail.com (Rowan Tommins)
On Thu, 5 Sep 2019 at 14:29, Brent <brendt@stitcher.io> wrote:

> I believe GitHub is the way to go. Several large communities manage their > OSS on it and have proven it works, PHP should simply do the same. >
I think this is just as simplistic as saying "never". What are these communities using it for, and what would we want to use it for? Are our requirements the same as theirs, and is GitHub the best solution for those requirements? For instance: - Rust does not use GitHub as its primary co-ordination mechanism, it has an online forum at https://internals.rust-lang.org/ - The ECMA TC39 committee has a specific membership structure and holds regular in-person meetings - There are undoubtedly more open-source communities _not_ using GitHub than who _are_ using it To be clear, I'm not saying these are reasons against GitHub in themselves, but it's a rather huge leap from "here are four repos I found" to "GitHub is the way to go"; we should be making specific arguments for why it will meet our needs, and evaluating it among all the alternatives. Regards, -- Rowan Tommins [IMSoP]
  106880
September 5, 2019 13:49 Danack@basereality.com (Dan Ackroyd)
On Thu, 5 Sep 2019 at 14:29, Brent <brendt@stitcher.io> wrote:
> > I believe GitHub is the way to go. Several large communities manage their OSS on it and have proven it works, PHP should simply do the same.
At the risk of giving advice, you will find conversations are far more productive if you ask why something can't be done, rather than just stating it will be simple. Not only will that elicit more useful information to you, it avoids being subtly insulting, as you're implying that something will be easy and people are being stupid for not doing it*. In this particular case you could have asked "what would be the problems with moving the build systems to github?", and the answers would include: * PHP has karma (aka permissions) system which github could not support. I don't know how that could be solved/avoided. * There is very strong reluctance to be dependent on other people's infrastructure for things that could take a long time to migrate. e.g. we can move discussions from one medium to another, by just telling people to go talk over there. But for actual software CI, it's a big deal to move from one system to another. btw it's not obvious that the projects you linked actually have their build systems integrated with Github. I'm pretty sure you're making assumptions about how simple their systems are. cheers Dan Ack * in pithier form: https://signalvnoise.com/posts/439-four-letter-words
  106922
September 10, 2019 07:36 come@opensides.be (=?ISO-8859-1?Q?C=F4me?= Chilliet)
Le jeudi 5 septembre 2019, 13:07:28 CEST Dan Ackroyd a écrit :
> On Thu, 5 Sep 2019 at 12:27, Côme Chilliet <come@opensides.be> wrote: > > > > PHP have no control over github, and cannot know how it will evolve. > > > > (they can change the platform tomorrow and internal won’t be able to do anything about it). > > Those are hypothetically problems. But they do not appear to be > currently problems.
The fact that PHP has no control over github is current, this is not hypothetical.
> And in case anyone says "some people might not be able to comment on > Github" the same is true for our email lists. The signup process was > apparently broken for ages, and I've seen multiple people ask for how > to persuade the system to accept their messages. Which probably means > there are more people who never contributed because they couldn't get > past that first barrier.
It’s not the same when the project can act to fix it and when the project is powerless. If github blocks someone from commenting we cannot do anything about it.
  106924
September 10, 2019 09:02 rowan.collins@gmail.com (Rowan Tommins)
On Tue, 10 Sep 2019 at 08:37, Côme Chilliet <come@opensides.be> wrote:

> > > PHP have no control over github, and cannot know how it will evolve. > > > > > > (they can change the platform tomorrow and internal won’t be able to > do anything about it). > > > > Those are hypothetically problems. But they do not appear to be > > currently problems. > > The fact that PHP has no control over github is current, this is not > hypothetical. >
The idea that the platform will change overnight in a way that makes it unusable by the project is hypothetical.
> It’s not the same when the project can act to fix it and when the project > is powerless. > If github blocks someone from commenting we cannot do anything about it. >
Are you aware of any heavy-handed moderation on github, or is this, again, a hypothetical problem? As you will see from my other responses on this thread, I'm not totally sold on github in particular, but I can see pros and cons more generally: - our own systems, fully in our control, but used by nobody else, and managed by a handful of volunteers - or: a well-established third-party system, which could change in unpredictable ways, but is widely used, and supported by hundreds of paid staff Even the mailing list relies on third-party software; I presume it gets updated regularly, and those updates could include changes in functionality we disagree with. There is a pragmatic decision to be made between building absolutely everything from scratch, and trusting some third parties, with contingency plans if that trust proves ill-founded. Regards, -- Rowan Tommins [IMSoP]
  106925
September 10, 2019 09:38 kjarli@gmail.com (Lynn)
On Tue, Sep 10, 2019 at 11:02 AM Rowan Tommins collins@gmail.com>
wrote:

> On Tue, 10 Sep 2019 at 08:37, Côme Chilliet <come@opensides.be> wrote: > > > It’s not the same when the project can act to fix it and when the project > > is powerless. > > If github blocks someone from commenting we cannot do anything about it.. > > > Are you aware of any heavy-handed moderation on github, or is this, again, > a hypothetical problem? >
As much as I like Github for these kind of things, we're forgetting about a critical part here; The US trade restrictions. Github being a company in the US, is required to block certain access to users from certain countries.. Regards, Lynn van der Berg
  106926
September 10, 2019 10:13 rowan.collins@gmail.com (Rowan Tommins)
On Tue, 10 Sep 2019 at 10:39, Lynn <kjarli@gmail.com> wrote:

> > Are you aware of any heavy-handed moderation on github, or is this, again, >> a hypothetical problem? >> > > As much as I like Github for these kind of things, we're forgetting about > a critical part here; The US trade restrictions. Github being a company in > the US, is required to block certain access to users from certain countries. >
That's a reasonable point; that (and the inverse: governments blocking access to github in response to some perceived offence) would be a potential issue to weigh up against the risks of running our own infrastructure. Regards, -- Rowan Tommins [IMSoP]
  106927
September 10, 2019 13:31 kalle@php.net (Kalle Sommer Nielsen)
Den tir. 10. sep. 2019 kl. 13.14 skrev Rowan Tommins collins@gmail.com>:
> That's a reasonable point; that (and the inverse: governments blocking > access to github in response to some perceived offence) would be a > potential issue to weigh up against the risks of running our own > infrastructure.
We are already under the US law when it comes to distribution of software as our binaries (like Windows) which contains encryption software for export since we potentially allow countries which the US have an embargo with, such as Iran (or more recently Venezuela). If I remember correctly its partly the reason why PHP is registered as an entity in the US to comply with EAR. How this is dealt with under the hood of the project and all of that I have no idea, but even as it is, we are not currently operating in a noman's land and already live with restrictions imposed by the governments of the world. -- regards, Kalle Sommer Nielsen kalle@php.net
  106894
September 6, 2019 09:28 pierre.php@gmail.com (Pierre Joye)
On Thu, Sep 5, 2019, 5:51 PM Joe Watkins <krakjoe@gmail.com> wrote:

> > Because the PHP project should avoid depending on a privately owned > centralized service for its technical discussions > > This is obviously your opinion, but you haven't actually told us why this > is the case, and it's not at all obvious. > > > should not encourage (some would say force) people to use such platforms. > > In any case it's not a choice for the contributor, internals chooses the > medium and the contributor has to use it. Whether we force them to use a > mailing list from last century or something from this century makes no > difference with regard to choice for the contributor.
I am not worrying when one uses it for draft. However anything after that should happen in the wiki and on our git as it is the correct process. I really like github, or gitlab, however int he current context, almost all contributors may lose access to github (or other US based) companies based on the US government policies or directives. Without starting a political discussion whether or not this is valid, it is definitely a big risk. A risk I am not really willing to take for php itself, if I may say. best,
  106873
September 5, 2019 11:08 rowan.collins@gmail.com (Rowan Tommins)
On Thu, 5 Sep 2019 at 11:22, Côme Chilliet <come@opensides.be> wrote:

> Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit : > > > Huge "no" from me on using github for discussing RFCs. > > > > Care to elaborate why? The majority seems to like it. Though I am also > curious about Nikita's experience with it, as he is the one having to > process the feedback. > > Because the PHP project should avoid depending on a privately owned > centralized service for its technical discussions, and should not encourage > (some would say force) people to use such platforms. > > PHP is already on github but it’s only a mirror, the main git repository > is at git.php.net . >
The "privately owned" and "centralized" parts don't bother me particularly, but there's potentially an issue in splitting the discussion between multiple platforms, with different logins required. An example of this is the discussion on this RFC about type aliases - Nikita requested it to be split into a separate discussion, but the people involved may not be subscribed to this list, and if they are, it's hard to maintain context when jumping between different forums. That conversation also highlighted a limitation of the particular platform: inline comments on GitHub PRs show as threads, but comments on the whole PR don't, so that interleaved discussions are hard to follow. Admittedly, that's true on a lot of e-mail clients as well (thanks to GMail popularising "conversations" rather than "threads"), but at least views like externals.io and news.php.net can let you navigate the tree. I wonder if a hybrid approach would work better - the RFC is a PR (perhaps against the language spec repo, as Andrea suggested) but the main discussion stays on the list. Suggestions to improve the RFC itself could be made inline on the PR by anyone who wanted to, but non-inline PR comments would be heavily discouraged so that wider comments on the proposal would stay here. Either way, I think it's interesting to experiment with different ways of working, and maybe there are other platforms we should trial as well. Regards, -- Rowan Tommins [IMSoP]
  106875
September 5, 2019 12:05 markyr@gmail.com (Mark Randall)
On 05/09/2019 12:08, Rowan Tommins wrote:
> but at least views > like externals.io and news.php.net can let you navigate the tree.
The lack of a full tree-like structure isn't the worst thing in the world. If only because it discourages certain types of individual from wanting to reply to every single sub-branch individually to get the final word. Something I am finding hard on Github, and maybe it's just because I haven't found the option yet, is finding new posts. Mark Randall
  106882
September 5, 2019 21:45 peterkokot@gmail.com (Peter Kokot)
Hello,

On Thu, 5 Sep 2019 at 12:22, Côme Chilliet <come@opensides.be> wrote:
> > Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit : > > > Huge "no" from me on using github for discussing RFCs. > > > > Care to elaborate why? The majority seems to like it. Though I am also curious about Nikita's experience with it, as he is the one having to process the feedback. > > Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions, and should not encourage (some would say force) people to use such platforms. > > PHP is already on github but it’s only a mirror, the main git repository is at git.php.net . > > Côme
If I may put few random thoughts here. GitHub usage is inevitable. The interface is so good with clear discussion and review options that it would be really worthy to check it out for all PHP RFCS in the future. The main worry here is basically that one day GitHub will go offline and that discussion will be lost. Repository however will stay in the Git repo and will be "timeless". Very rarely one will want to look at the old (several 10 years) discussion comments. This is not a problem because even with having an email archive online very rarely someone will return to such discussion. RFC content is there and will be there for PHP to move "elsewhere" though if such hypothetical case comes. Tank you. -- Peter Kokot
  106893
September 6, 2019 09:10 rowan.collins@gmail.com (Rowan Tommins)
On Thu, 5 Sep 2019 at 22:45, Peter Kokot <peterkokot@gmail.com> wrote:

> GitHub usage is inevitable.
Did you use the wrong word here, or are you saying that, of all the hundreds of different platforms we could investigate, there is no chance that we would end up using something other than github?
> The interface is so good with clear discussion and review options >
As my previous message, and those of several other people, show, that is far from an established consensus. The power of an e-mail list is that different users can use different interfaces - I've yet to see a forum suggested that I would find easier than Thunderbird's tree view. There are certainly downsides to e-mail, and upsides to GitHub, but let's stay calm and evaluate our options rather than jumping at the first thing we see. Regards, -- Rowan Tommins [IMSoP]
  106895
September 6, 2019 11:31 peterkokot@gmail.com (Peter Kokot)
On Fri, 6 Sep 2019 at 11:11, Rowan Tommins collins@gmail.com> wrote:
> > On Thu, 5 Sep 2019 at 22:45, Peter Kokot <peterkokot@gmail.com> wrote: > > > GitHub usage is inevitable. > > > > Did you use the wrong word here, or are you saying that, of all the > hundreds of different platforms we could investigate, there is no chance > that we would end up using something other than github?
Plastic analogy - adding "127.0.0.1 github.com" to your /etc/hosts file shows that developer cannot bring most of the today's (PHP) projects to any working state without using it. That's what is meant by inevitable because everything open source today is either on GitHub and some minor ones scattered around custom Git repos and other Git hosting providers. PHP is already using GitHub. Is it moving to something else? No, so let's not complicate things more with other hosting providers now. USA political issues and embargo on some countries are indeed a reason I'm also willing to accept that PHP won't be using GitHub otherwise. For other reasons presented here, none is convincing enough to me honestly. -- Peter Kokot
  106896
September 6, 2019 12:43 rowan.collins@gmail.com (Rowan Tommins)
On Fri, 6 Sep 2019 at 12:31, Peter Kokot <peterkokot@gmail.com> wrote:

> > Plastic analogy - adding "127.0.0.1 github.com" to your /etc/hosts > file shows that developer cannot bring most of the today's (PHP) > projects to any working state without using it. That's what is meant > by inevitable because everything open source today is either on GitHub > and some minor ones scattered around custom Git repos and other Git > hosting providers. >
Ah, I see. Yes, having some usage of GitHub is currently pretty much inevitable in that sense. Of course, that may change eventually, just as SourceForge fell out of favour, but that's not something we need to worry about. However, projects over a certain size generally *don't* use it as their main or only discussion platform, which is what we're talking about here.
> PHP is already using GitHub. Is it moving to > something else? No, so let's not complicate things more with other > hosting providers now. >
The question is not "should PHP ban the use of GitHub for any kind of activity?" it's "should PHP abandon the discussion processes it's been using for most of its history and use GitHub as a discussion forum?". As a code collaboration platform, GitHub is pretty good, but it's not built as a discussion forum, and there are plenty of limitations to using it as one. Regards, -- Rowan Tommins [IMSoP]
  106897
September 6, 2019 13:14 benjamin.morel@gmail.com (Benjamin Morel)
> > As a code collaboration platform, GitHub is pretty good, but it's not built > as a discussion forum, and there are plenty of limitations to using it as > one.
Could we work on agreeing on a set of requirements for such a discussion "forum" to replace mailing lists? That would make it easier for anyone wanting to give it a shot, to come up with a first version that has a chance to convince everyone that this is the direction we want to follow. Using GitHub Issues as a starting point, what would you change? — Benjamin
  106898
September 6, 2019 13:39 rowan.collins@gmail.com (Rowan Tommins)
On Fri, 6 Sep 2019 at 14:14, Benjamin Morel morel@gmail.com>
wrote:

> As a code collaboration platform, GitHub is pretty good, but it's not built >> as a discussion forum, and there are plenty of limitations to using it as >> one. > > > Could we work on agreeing on a set of requirements for such a discussion > "forum" to replace mailing lists? >
That could be an interesting exercise, yes. Ideally, we should consider RFC drafting, voting, and bug tracking as well - not because we have to replace all of them with one platform, but because we might want to divide things up differently from how we do at the moment.
> Using GitHub Issues as a starting point, what would you change? >
That's pretty much the opposite of your previous question. For one thing, it's unanswerable without knowing the scope - e.g. would it just be for RFCs, or all discussions? Besides that, if we're going to introduce an anchor that we compare everything to, surely we should say "using the setup we have as a starting point, what would you change?" Until we know what we're looking for, I'm really not clear why GitHub issues should have any starting advantage over Discourse, or PHPBB, or Trac, or Phabricator, or Bugzilla, or probably hundreds of suggestions we could evaluate. Regards, -- Rowan Tommins [IMSoP]
  106899
September 6, 2019 13:53 benjamin.morel@gmail.com (Benjamin Morel)
> > That's pretty much the opposite of your previous question. For one thing, > it's unanswerable without knowing the scope - e.g. would it just be for > RFCs, or all discussions?
I'm thinking about a generic "forum" for all discussions that happen on the mailing lists right now, something that could be used for internals but also for other PHP mailing lists. Then, its scope can be expanded specifically for internals, to better discuss RFCs, etc., but that's not what I had in mind right now. Until we know what we're looking for, I'm really not clear why GitHub
> issues should have any starting advantage over Discourse, or PHPBB, or > Trac, or Phabricator, or Bugzilla, or probably hundreds of suggestions we > could evaluate.
I chose GitHub because it was mentioned several times in this thread, because it's already used to discuss PRs, and because I suspect pretty much everyone on this list either uses GitHub on a daily basis, or has at least *some *experience with GitHub issues (let's face it, I google stuff every day for many open-source projects, and most of the discussions I stumble upon are on GitHub issues/pulls), so at least we have a* starting point* that everyone knows and has learned to love or hate. Now the whole point is, if you think another software does things better, please share! Now obviously, should you hate GitHub issues from start to finish, then indeed I can understand you consider this starting point a poor choice, in this case I'd be interested to know what you dislike so much! — Benjamin
  106888
September 6, 2019 05:33 kalle@php.net (Kalle Sommer Nielsen)
Den tor. 5. sep. 2019 kl. 13.22 skrev Côme Chilliet <come@opensides.be>:
> Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions, and should not encourage (some would say force) people to use such platforms. > > PHP is already on github but it’s only a mirror, the main git repository is at git.php.net .
As an old timer around here, I feel very strongly about moving the medium and I prefer to be on the PHP.net infrastructure. Clearly one of our biggest issues with that as the PHP organization is that we poorly maintain it, and I think it could be time to rather invest into renewing that effort instead. It seems like many have an issue with subscribing to internals (I know it was broken for the longest time by using the webform), but that is something we can telegraph better on the php.net website for one thing and try to put resources into figuring this problem out to gain momentum for more developers to join the effort of internals development. Using Github for PRs and relevant discussions for that is perfectly fine with me, but switching to Github for RFCs is a big -1 from me, it is really difficult to read new comments if you are not email subscribed and even then it still remains hard to follow. The individual moderation required to also sort out irrelevant comments is also one thing I personally would not want to deal with either. -- regards, Kalle Sommer Nielsen kalle@php.net
  106900
September 6, 2019 14:47 nikita.ppv@gmail.com (Nikita Popov)
Here are my own thoughts on how the pull request discussion for union types
went...

I think the main takeaway for me is that inline comments (on specific lines
in the RFC) were really invaluable. Each comment thread discussed a
specific issue and most of them have resulted in a direct improvement to
the RFC.

Generally there was a lot of discussion of specific technical details that
we very rarely see in RFC discussions. Current RFC discussions on the
mailing list tend to be rather high level (which is fine in itself), with
nobody ever discussing the details (which is very bad).

Thinking back to https://wiki.php.net/rfc/engine_warnings, I think that RFC
could have really benefited from this discussion medium. While the mailing
list discussion ended up talking circles around more or less one single
question (undefined variables), pretty much none of the other parts of the
RFC have seen so much as a comment. I'm sure that there would be a lot more
discussion regarding specific classifications if this went up as a pull
request.

Another nice thing is that it's possible to mark a comment thread as
resolved, once the RFC has been adjusted to address the comments. That way
you don't have to see issues that were already addressed (though you can if
you like).

Having thumbs-up and thumbs-down reactions to comments was also helpful to
judge whether some comment represents a minority opinion or not, something
that is notoriously hard with current mailing list discussions (which are
almost dominated by "negative" opinions which mysteriously don't show up in
voting results).

However, while the inline comments were pleasantly insightful, the same
cannot be said for the top-level comments on the pull request. The majority
of them was borderline off-topic. While some in principle interesting
discussion happened there, it simply didn't belong in the RFC thread for
union types. The top-level comments also suffered from a lack of threading
-- I would have been less bothered about tangential discussions if they
were threaded. (To be fair: I use gmail, so I don't get threading on the
mailing list either.)

If this kind of discussion behavior is representative, then I would suggest
a workflow alone the following lines...

* RFCs are submitted as PRs on GitHub, but must be announced on the mailing
list.
* The PR description should have a fat warning that top-level comments
belong on the mailing list. We can mark all top-level comments on PRs as
"off-topic" as a matter of general policy.
* Top-level commentary stays on the mailing list.

This is a shift from what I originally had in mind (completely moving the
RFC process to GitHub), towards providing a way for more detailed and
specific feedback on the RFC text.

Regarding GitHub as a 3rd party. I think there are a few things to
considered:
 * We're already very heavily reliant on GitHub. Most of my day-to-day
interaction with PHP core development is via GitHub and most of the
day-to-day decisions also happen there. Only the major stuff everhits this
mailing list.
 * The RFC repo would of course be hosted on git.php.net as usual and only
be mirrored to GitHub.
 * GitHub would not be the exclusive venue for RFC discussion. All RFCs are
still announced on internals and the top-level discussion can and should
still happen here.

Disclaimer: I'm trying to draw conclusions here from an experiment with a
sample size of 1, which may not be representative. Union types are a pretty
significant proposal (and also the first one to be on GH), and other,
smaller proposals might well have different discussion dynamics.

Regards,
Nikita

On Thu, Sep 5, 2019 at 12:22 PM Côme Chilliet <come@opensides.be> wrote:

> Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit : > > > Huge "no" from me on using github for discussing RFCs. > > > > Care to elaborate why? The majority seems to like it. Though I am also > curious about Nikita's experience with it, as he is the one having to > process the feedback. > > Because the PHP project should avoid depending on a privately owned > centralized service for its technical discussions, and should not encourage > (some would say force) people to use such platforms. > > PHP is already on github but it’s only a mirror, the main git repository > is at git.php.net . > > Côme > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php > >
  106901
September 8, 2019 10:42 brendt@stitcher.io (Brent)
Happy to read your thoughts on this Nikita, I think you've drawn some good conclusions.

If I may add a thought or two:

I wouldn't make any final decisions based on one experiment, especially a big RFC as this one. I think the GitHub discussion got extra attention because it was the first one, and because of the scope of the RFC. I would try to have two or three more RFCs discussed on GitHub, maybe smaller ones?

Second, there are more things we can do in order to keep the main thread on topic. Three things come to mind:

 - We could add community guidelines, clearly stating that RFC comments should stay on topic
 - People could be appointed to moderate the comments, allowing contributors to focus on the code instead of community management
 - Conversations on GitHub can be locked as a last measurement. Repository members can still comment.

I fear that separating the main discussion from the PR will cause unnecessary confusion: important, generals remarks could be made on the "main thread", and I think there's value in keeping these remarks together with everything else.

Kind regards
Brent
On 6 Sep 2019, 16:48 +0200, Nikita Popov ppv@gmail.com>, wrote:
> Here are my own thoughts on how the pull request discussion for union types > went... > > I think the main takeaway for me is that inline comments (on specific lines > in the RFC) were really invaluable. Each comment thread discussed a > specific issue and most of them have resulted in a direct improvement to > the RFC. > > Generally there was a lot of discussion of specific technical details that > we very rarely see in RFC discussions. Current RFC discussions on the > mailing list tend to be rather high level (which is fine in itself), with > nobody ever discussing the details (which is very bad). > > Thinking back to https://wiki.php.net/rfc/engine_warnings, I think that RFC > could have really benefited from this discussion medium. While the mailing > list discussion ended up talking circles around more or less one single > question (undefined variables), pretty much none of the other parts of the > RFC have seen so much as a comment. I'm sure that there would be a lot more > discussion regarding specific classifications if this went up as a pull > request. > > Another nice thing is that it's possible to mark a comment thread as > resolved, once the RFC has been adjusted to address the comments. That way > you don't have to see issues that were already addressed (though you can if > you like). > > Having thumbs-up and thumbs-down reactions to comments was also helpful to > judge whether some comment represents a minority opinion or not, something > that is notoriously hard with current mailing list discussions (which are > almost dominated by "negative" opinions which mysteriously don't show up in > voting results). > > However, while the inline comments were pleasantly insightful, the same > cannot be said for the top-level comments on the pull request. The majority > of them was borderline off-topic. While some in principle interesting > discussion happened there, it simply didn't belong in the RFC thread for > union types. The top-level comments also suffered from a lack of threading > -- I would have been less bothered about tangential discussions if they > were threaded. (To be fair: I use gmail, so I don't get threading on the > mailing list either.) > > If this kind of discussion behavior is representative, then I would suggest > a workflow alone the following lines... > > * RFCs are submitted as PRs on GitHub, but must be announced on the mailing > list. > * The PR description should have a fat warning that top-level comments > belong on the mailing list. We can mark all top-level comments on PRs as > "off-topic" as a matter of general policy. > * Top-level commentary stays on the mailing list. > > This is a shift from what I originally had in mind (completely moving the > RFC process to GitHub), towards providing a way for more detailed and > specific feedback on the RFC text. > > Regarding GitHub as a 3rd party. I think there are a few things to > considered: > * We're already very heavily reliant on GitHub. Most of my day-to-day > interaction with PHP core development is via GitHub and most of the > day-to-day decisions also happen there. Only the major stuff everhits this > mailing list. > * The RFC repo would of course be hosted on git.php.net as usual and only > be mirrored to GitHub. > * GitHub would not be the exclusive venue for RFC discussion. All RFCs are > still announced on internals and the top-level discussion can and should > still happen here. > > Disclaimer: I'm trying to draw conclusions here from an experiment with a > sample size of 1, which may not be representative. Union types are a pretty > significant proposal (and also the first one to be on GH), and other, > smaller proposals might well have different discussion dynamics. > > Regards, > Nikita > > On Thu, Sep 5, 2019 at 12:22 PM Côme Chilliet <come@opensides.be> wrote: > > > Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit : > > > > Huge "no" from me on using github for discussing RFCs. > > > > > > Care to elaborate why? The majority seems to like it. Though I am also > > curious about Nikita's experience with it, as he is the one having to > > process the feedback. > > > > Because the PHP project should avoid depending on a privately owned > > centralized service for its technical discussions, and should not encourage > > (some would say force) people to use such platforms. > > > > PHP is already on github but it’s only a mirror, the main git repository > > is at git.php.net . > > > > Côme > > > > -- > > PHP Internals - PHP Runtime Development Mailing List > > To unsubscribe, visit: http://www.php.net/unsub.php > > > >
  106902
September 8, 2019 11:55 rowan.collins@gmail.com (Rowan Tommins)
On 8 September 2019 11:42:07 BST, Brent <brendt@stitcher.io> wrote:
> - We could add community guidelines, clearly stating that RFC comments >should stay on topic > - People could be appointed to moderate the comments, allowing >contributors to focus on the code instead of community management > - Conversations on GitHub can be locked as a last measurement. >Repository members can still comment. > >I fear that separating the main discussion from the PR will cause >unnecessary confusion: important, generals remarks could be made on the >"main thread", and I think there's value in keeping these remarks >together with everything else.
I'm sceptical of that as a solution for two reasons: Firstly, the conversations weren't necessarily wrong, they were just a slight drift of topic. The problem is not removing them from the PR, it's encouraging them to move somewhere else. I fear that saying "sign up to the mailing list and repeat that point in a completely different format" will be taken up less than "make a new thread on this same list/forum". Secondly, the problem is partly a technical one: GitHub PRs have very poor support for replies and sub-threads, so even on-topic discussions that don't relate to a specific part of the text are hard to follow. I think Nikita's suggestion is a good one: use a PR for making targeted suggestions to the RFC text itself, but raise the general points on the main list. That might even include saying "I've added a handful of suggestions relating to X" and discussing the wider issue that links them. I agree it would be interesting to experiment further, and I think this hybrid approach would be a good one to try next. Regards, -- Rowan Tommins [IMSoP]
  106923
September 10, 2019 08:02 come@opensides.be (=?ISO-8859-1?Q?C=F4me?= Chilliet)
Le vendredi 6 septembre 2019, 16:47:52 CEST Nikita Popov a écrit :
> * GitHub would not be the exclusive venue for RFC discussion. All RFCs are > still announced on internals and the top-level discussion can and should > still happen here.
My remark on the mailing list regarding string|true was unanswered and I had to go over to github to see that the concern was discussed there. This is the kind of problems splitting the discussion will cause, people will have to check both places or miss things. Regarding the github PR as a practical point of view, I find it hard to check if there are new messages since last time I visited the page. Is there any way to browse messages by time? (backwards ideally). Côme
  107204
September 18, 2019 13:33 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. > > Relatively to the previous proposal by Bob&Levi ( > https://wiki.php.net/rfc/union_types), I think the main differences in > this proposal are: > * Updated to specify interaction with new language features, like full > variance and property types. > * Updated for the use of the ?Type syntax rather than the Type|null > syntax. > * Only supports "false" as a pseudo-type, not "true". > * Slightly simplified semantics for the coercive typing mode. > > Regards, > Nikita >
Heads up, two weeks have passed, so this may now go to voting... I believe relative to my original draft the main change that has happened as a result of the discussion is the use of T1|T2|null syntax instead of ?(T1|T2) syntax for nullable types. ?T becomes an alias for T|null. People felt fairly strongly that while ?T is a nice shorthand for a common case, unions should use the T1|T2|null syntax that both reads better and is already well-established from phpdoc. I figured I should mentioned this for people who haven't been following the GitHub thread... Nikita
  107224
September 18, 2019 23:03 bjorn.x.larsson@telia.com (=?UTF-8?Q?Bj=c3=b6rn_Larsson?=)
Den 2019-09-18 kl. 15:33, skrev Nikita Popov:
> On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote: > >> Hi internals, >> >> I'd like to start the discussion on union types again, with a new proposal: >> >> Pull Request: https://github.com/php/php-rfcs/pull/1 >> Rendered Proposal: >> https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md >> >> As an experiment, I'm submitting this RFC as a GitHub pull request, to >> evaluate whether this might be a better medium for RFC proposals in the >> future. It would be great if we could keep the discussion to the GitHub >> pull request for the purpose of this experiment (keep in mind that you can >> also create comments on specific lines in the proposal, not just the >> overall discussion thread!) Of course, you can also reply to this mail >> instead. The final vote will be held in the wiki as usual. >> >> Relatively to the previous proposal by Bob&Levi ( >> https://wiki.php.net/rfc/union_types), I think the main differences in >> this proposal are: >> * Updated to specify interaction with new language features, like full >> variance and property types. >> * Updated for the use of the ?Type syntax rather than the Type|null >> syntax. >> * Only supports "false" as a pseudo-type, not "true". >> * Slightly simplified semantics for the coercive typing mode. >> >> Regards, >> Nikita >> > Heads up, two weeks have passed, so this may now go to voting... > > I believe relative to my original draft the main change that has happened > as a result of the discussion is the use of T1|T2|null syntax instead of > ?(T1|T2) syntax for nullable types. ?T becomes an alias for T|null. People > felt fairly strongly that while ?T is a nice shorthand for a common case, > unions should use the T1|T2|null syntax that both reads better and is > already well-established from phpdoc. > > I figured I should mentioned this for people who haven't been following the > GitHub thread... > > Nikita
Hi Nikita, Planned to comment on ? vs null on Github, but here it goes. Advantage with ? syntax was in my eyes that it's clear it's not a stand-alone type or what I should call it, but rather a "change/variant" to an existing type. Having it stated as T1|T2|null lead it a bit in the direction of being a more stand-alone type. Anyway, I'm quite happy with the proposal as it is. So thanks for the excellent work! r//Björn L
  107304
September 23, 2019 20:14 benjamin.morel@gmail.com (Benjamin Morel)
Hi Nikita,

Thank you for your work on this, I look forward to being able to use union
types!

The one thing that bothers me a bit is introducing yet another
inconsistency by supporting false and not true as a type. Sure, I do get
the need for false as a valid type for historical reasons, but at the same
time I thought it would be interesting to see if true was being used as
well in the wild, before closing the door on it.

Therefore I reviewed all Composer packages with > 1 million downloads (1211
packages total), and here are the results:

*31 packages
<https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-packages-txt>
use at least one @var,  @param or @return containing true as a type.*

- there are many illegitimate use cases, such as @return true
<https://github.com/drupal/core/blob/c447ebb6a89149ef86e0af06bc00fec63027ec0d/modules/views/src/ViewExecutable.php#L719>
, @return bool|true
<https://github.com/php-http/message/blob/144d13ba024d0c597830636907144bc7c23f50dd/src/Authentication/AutoBasicAuth.php#L23>,
@return true|false
<https://github.com/zendframework/zend-form/blob/ff4c9ec32d141e4ac622c2aa40a8c9eb77a40725/src/Annotation/AnnotationBuilder.php#L412>,
or even @return bool|true|false
<https://github.com/slevomat/coding-standard/blob/964a3ff08d7ee924510c3b705f826165adaa97fe/tests/Sniffs/Namespaces/data/fullyQualifiedClassNameInAnnotationNoErrors.php#L41>
- there are, however, quite a few potentially legitimate use cases. Among
others:

- @return true|WP_Error
<https://github.com/johnpbloch/wordpress-core/blob/4a96ef28019f3f529465eac71c718afea2c88e8b/wp-includes/rest-api/class-wp-rest-request.php#L641>
 (*johnpbloch/wordpress-core*) : "True if the JSON data was passed or no
JSON data was provided, WP_Error if invalid JSON was passed."
- @return true|string
<https://github.com/johnpbloch/wordpress-core/blob/bd27b2bc49483b796fdb228f9b0614f9880823ee/wp-includes/ms-load.php#L72>
 (*johnpbloch/wordpress-core*) : "Returns true on success, or drop-in file
to include."
- @return string|true
<https://github.com/PHP-DI/PHP-DI/blob/b891248f04c594dae4c07650bc0815d0dc81bacb/src/Compiler/Compiler.php#L317>
 (*php/di*) : True if compilable, string explaining why if not
- @return string|true
<https://github.com/zendframework/zend-authentication/blob/86d6553e1cfe5ed2bde53cbea9fd162c6875a9c8/src/Adapter/Ldap.php#L378>
 (*zendframework/zend-authentication*) : True if successfull, error message
if not
- @param int|true
<https://github.com/yiisoft/yii2/blob/fdbe45af3a4f923a97b718d8694981e1e195403d/framework/db/Query.php#L1266>
(*yiisoft/yii2*) : "Use 0 to indicate that..., Use a negative number to
indicate that..., Use boolean `true` to indicate that ..."
- @param callable|resource|string|true|null
<https://github.com/symfony/symfony/blob/242f24427d34bd47ef0e3a027a5a4e979d713c6f/src/Symfony/Component/VarDumper/Dumper/AbstractDumper.php#L117>
(*symfony/symfony*) : "A line dumper callable, an opened stream, an output
path or true to return the dump"

Full list here
<https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-types-txt>
..

So although true as a type does not have the same historical background as
false, it does seem that it's being used by enough packages to be worth
considering; what do you think?

— Benjamin
  107305
September 23, 2019 23:37 andreas@dqxtech.net (Andreas Hennings)
To have any information content, a variable (or return value, or
parameter) needs to have more than one possible value.
If one of those values is TRUE, the most natural alternative value
would be FALSE.

But should this really be baked into the language?
Should really "try to be smart" here?
Isn't it better to aim for completeness and consistency, and provide a
"round" developer experience with the least amount of surprise?

The same could be asked about other restrictions, e.g. false|null.

Ideally we should have a type system with the properties of a
mathematical space of some sort.
Artificial restrictions would ruin this.
It won't be perfect, because the different types are not "symmetric".
But we can at least try to make it as consistent as possible.

Some use cases I can think of:


## Use case: true[] assoc

I like to use true[] for map-like associative arrays.
So, $map[$key] = true;
For each key, the array either has the value TRUE, or no value at all.
No need to store false values.

An extended use case might be a (tree = true|tree[]), where each leaf
would have the value true.

Does this mean we need true for parameters and return types?
It will be rare. Especially in the first case of a simple true[],
there is no point returning the value in a variable, if we not also
allow FALSE for "not found".


## Use case: Generics

However, what if the code is written in a generic way? E.g. if PHP
gets native generics one day, or some userland generics library based
on code generation.

For an iterator over true[], the return value for ->current() would be
"true|null". (I just tried, it is null and not false, see
https://3v4l.org/PvWWl).

For a generic method receiving values from the array, the parameter
type would be "true".

It would be really disappointing if a tool that attempts to give us
generics would have to work around arbitrary limitations of the typing
system.


## Use case: Variance

Imagine a base class or interface method which returns boolean for
success / failure.

We extend this method, and our implementation is always successful, so
we will always return true.
We can indicate this in the type hint.
This indicates that calling code does not need to check the return
value, it can simply assume that it was successful.

interface Operation {
  function run(): bool
}

interface SafeOperation extends Operation {
  function run(): true  // always successful.
}


## Use case: Code generation and reflection tools

I would say that for any automated tool, special cases of disallowed
types make things more complicated.
This could be tools for code generation, parsing, or some kind of
"type reflection" with type arithmetics.


## Use case: Legacy code

Finally, in some cases it is irrelevant whether the combination of
types makes sense or not.
We simply want to type-harden legacy code.


On Mon, 23 Sep 2019 at 22:15, Benjamin Morel morel@gmail.com> wrote:
> > Hi Nikita, > > Thank you for your work on this, I look forward to being able to use union > types! > > The one thing that bothers me a bit is introducing yet another > inconsistency by supporting false and not true as a type. Sure, I do get > the need for false as a valid type for historical reasons, but at the same > time I thought it would be interesting to see if true was being used as > well in the wild, before closing the door on it. > > Therefore I reviewed all Composer packages with > 1 million downloads (1211 > packages total), and here are the results: > > *31 packages > <https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-packages-txt> > use at least one @var, @param or @return containing true as a type.* > > - there are many illegitimate use cases, such as @return true > <https://github.com/drupal/core/blob/c447ebb6a89149ef86e0af06bc00fec63027ec0d/modules/views/src/ViewExecutable.php#L719> > , @return bool|true > <https://github.com/php-http/message/blob/144d13ba024d0c597830636907144bc7c23f50dd/src/Authentication/AutoBasicAuth.php#L23>, > @return true|false > <https://github.com/zendframework/zend-form/blob/ff4c9ec32d141e4ac622c2aa40a8c9eb77a40725/src/Annotation/AnnotationBuilder.php#L412>, > or even @return bool|true|false > <https://github.com/slevomat/coding-standard/blob/964a3ff08d7ee924510c3b705f826165adaa97fe/tests/Sniffs/Namespaces/data/fullyQualifiedClassNameInAnnotationNoErrors.php#L41> > - there are, however, quite a few potentially legitimate use cases. Among > others: > > - @return true|WP_Error > <https://github.com/johnpbloch/wordpress-core/blob/4a96ef28019f3f529465eac71c718afea2c88e8b/wp-includes/rest-api/class-wp-rest-request.php#L641> > (*johnpbloch/wordpress-core*) : "True if the JSON data was passed or no > JSON data was provided, WP_Error if invalid JSON was passed." > - @return true|string > <https://github.com/johnpbloch/wordpress-core/blob/bd27b2bc49483b796fdb228f9b0614f9880823ee/wp-includes/ms-load.php#L72> > (*johnpbloch/wordpress-core*) : "Returns true on success, or drop-in file > to include." > - @return string|true > <https://github.com/PHP-DI/PHP-DI/blob/b891248f04c594dae4c07650bc0815d0dc81bacb/src/Compiler/Compiler.php#L317> > (*php/di*) : True if compilable, string explaining why if not > - @return string|true > <https://github.com/zendframework/zend-authentication/blob/86d6553e1cfe5ed2bde53cbea9fd162c6875a9c8/src/Adapter/Ldap.php#L378> > (*zendframework/zend-authentication*) : True if successfull, error message > if not > - @param int|true > <https://github.com/yiisoft/yii2/blob/fdbe45af3a4f923a97b718d8694981e1e195403d/framework/db/Query.php#L1266> > (*yiisoft/yii2*) : "Use 0 to indicate that..., Use a negative number to > indicate that..., Use boolean `true` to indicate that ..." > - @param callable|resource|string|true|null > <https://github.com/symfony/symfony/blob/242f24427d34bd47ef0e3a027a5a4e979d713c6f/src/Symfony/Component/VarDumper/Dumper/AbstractDumper.php#L117> > (*symfony/symfony*) : "A line dumper callable, an opened stream, an output > path or true to return the dump" > > Full list here > <https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-types-txt> > . > > So although true as a type does not have the same historical background as > false, it does seem that it's being used by enough packages to be worth > considering; what do you think? > > — Benjamin
  107316
September 24, 2019 17:24 claude.pache@gmail.com (Claude Pache)
> Le 23 sept. 2019 à 22:14, Benjamin Morel morel@gmail.com> a écrit : > > > So although true as a type does not have the same historical background as > false, it does seem that it's being used by enough packages to be worth > considering; what do you think? >
Considering to support `true` will raise several questions. They can be resolved, of course, but that will put more design decisions over the shoulders of the Union Types RFC: * The first one, of course, is why the list of possible literal values is *still* restricted. Because now you’ll want to use other literal values, e.g., `0` or `"some-string"`. * Another issue is the relation between `Foo | false | true` and `Foo | bool`: whether the first form is allowed; whether they are equivalent w.r.t. variance rules; how casting to `true | false` works (or doesn’t work) under `declare(strict_types=0)`. * Also, if we support more than two literal values as types (and especially when not all of them indicate failure), we’ll want more strongly to have them to appear as standalone types, e.g., `true | null`; and then, naturally, just `true` and just `null`. At this point, the question will be raised whether it is desirable to have both `null` and `void` as return types. ------------- The choice of supporting precisely the two literal values `null` and `false` is not arbitrary: They are the two values that are the most often used as sentinel values (for indicating failure or absence). It is true that `true` is also sometimes used as sentinel value (more rarely and not among the internal functions), but the same can be said of other literal values (one of your examples includes `0`). —Claude
  107317
September 24, 2019 20:05 pollita@php.net (Sara Golemon)
On Tue, Sep 24, 2019 at 12:24 PM Claude Pache pache@gmail.com>
wrote:
> The choice of supporting precisely the two literal values `null` and `false`
> is not arbitrary: They are the two values that are the most often used as > sentinel values (for indicating failure or absence). It is true that `true` is
> also sometimes used as sentinel value (more rarely and not among the > internal functions), but the same can be said of other literal values > (one of your examples includes `0`). > While I personally think `false` makes sense as an allowed "type", I also
don't want to see the union types RFC get held up on such a tiny detail. I would propose either of the following alternatives: 1/ Remove `false` from the proposal. It can always be added at a later time, but not taken away. 2/ Make this detail a sub-vote. I would suggest that this sub-vote should also be subject to a 2/3 majority in order to pass. -Sara
  107320
September 26, 2019 00:17 larry@garfieldtech.com ("Larry Garfield")
On Tue, Sep 24, 2019, at 3:05 PM, Sara Golemon wrote:
> On Tue, Sep 24, 2019 at 12:24 PM Claude Pache pache@gmail.com> > wrote: > > The choice of supporting precisely the two literal values `null` and > `false` > > is not arbitrary: They are the two values that are the most often used as > > sentinel values (for indicating failure or absence). It is true that > `true` is > > also sometimes used as sentinel value (more rarely and not among the > > internal functions), but the same can be said of other literal values > > (one of your examples includes `0`). > > > While I personally think `false` makes sense as an allowed "type", I also > don't want to see the union types RFC get held up on such a tiny detail. > > I would propose either of the following alternatives: > 1/ Remove `false` from the proposal. It can always be added at a later > time, but not taken away. > 2/ Make this detail a sub-vote. I would suggest that this sub-vote should > also be subject to a 2/3 majority in order to pass. > > -Sara
I would tend to agree with Sara. That seems to be the only issue of contention and it is (AFAIK) reasonably straightforward to add "later" without blocking the rest of Union types. It would mean some functions wouldn't be able to get a fully accurate return type yet but... they've survived this long without them, they can wait a bit longer while this gets settled and/or some even more robust alternative is found. --Larry Garfield
  107323
September 26, 2019 07:42 benjamin.morel@gmail.com (Benjamin Morel)
> I would tend to agree with Sara. That seems to be the only issue of contention and it is (AFAIK) reasonably straightforward to add "later"
without blocking the rest of Union types. It would mean some functions wouldn't be able to get a fully accurate return type yet but... they've survived this long without them, they can wait a bit longer while this gets settled and/or some even more robust alternative is found. Note that I'm personally alright with only bool as a type (i.e. strpos() : int|bool). I was just pointing out that introducing false alone is adding yet another inconsistency to the language, and it may not need this right now. If everything is to be voted at once, I'd suggest a split vote as follows: - vote 1: introduce union types (base proposal with bool only) : yes / no - vote 2: add false/true types: false only / false and true / no — Benjamin
  107324
September 26, 2019 08:06 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Sep 24, 2019 at 10:06 PM Sara Golemon <pollita@php.net> wrote:

> On Tue, Sep 24, 2019 at 12:24 PM Claude Pache pache@gmail.com> > wrote: > > The choice of supporting precisely the two literal values `null` and > `false` > > is not arbitrary: They are the two values that are the most often used as > > sentinel values (for indicating failure or absence). It is true that > `true` is > > also sometimes used as sentinel value (more rarely and not among the > > internal functions), but the same can be said of other literal values > > (one of your examples includes `0`). > > > While I personally think `false` makes sense as an allowed "type", I also > don't want to see the union types RFC get held up on such a tiny detail. > > I would propose either of the following alternatives: > 1/ Remove `false` from the proposal. It can always be added at a later > time, but not taken away. > 2/ Make this detail a sub-vote. I would suggest that this sub-vote should > also be subject to a 2/3 majority in order to pass. > > -Sara >
This RFC is currently held up by a lack of implementation. Once that is done, the RFC will go forward as-is (barring any novel concerns). Because I consider it an important part of the overall proposal (*), I will neither remove the false type, nor split it into a separate vote. People may vote against the whole RFC if they disagree with this aspect, or any other aspect of the proposal. Regards, Nikita (*) While certainly not the primary reason for why we should support union types, the reason why I brought this proposal forward at this time specifically, is that the lack of union types is a blocker for my pet project of providing comprehensive type annotations for internal functions. Supporting "false" is strictly necessary for this purpose, because it is part of nearly all unions as far as internal functions are concerned.
  107326
September 26, 2019 09:21 kjarli@gmail.com (Lynn)
On Thu, Sep 26, 2019 at 10:06 AM Nikita Popov ppv@gmail.com> wrote:

> > This RFC is currently held up by a lack of implementation. Once that is > done, the RFC will go forward as-is (barring any novel concerns). Because I > consider it an important part of the overall proposal (*), I will neither > remove the false type, nor split it into a separate vote. People may vote > against the whole RFC if they disagree with this aspect, or any other > aspect of the proposal. > > Regards, > Nikita > > (*) While certainly not the primary reason for why we should support union > types, the reason why I brought this proposal forward at this time > specifically, is that the lack of union types is a blocker for my pet > project of providing comprehensive type annotations for internal functions. > Supporting "false" is strictly necessary for this purpose, because it is > part of nearly all unions as far as internal functions are concerned. >
Hi, What if a 'new' type is added specifically to identify this case? Instead of giving it `int|false`, it could be `int|failed`, where `failed` would allow a single value only: false. I like it more than naming the return type `false`. I'm not sure this type should be declarable in user-land, meaning it's limited to the core and a marker for legacy. `failed` (or name it whatever) and `false` would still be identical, yet has a more distinct identification and does not imply there could also a type called `true`. Figured I'd toss in here as an out-of-the-box idea. Regards, Lynn van der Berg
  107328
September 26, 2019 09:54 claude.pache@gmail.com (Claude Pache)
> Le 26 sept. 2019 à 11:21, Lynn <kjarli@gmail.com> a écrit : > > > I'm not sure this type should be declarable in user-land, meaning it's limited to the core and a marker for legacy.
In general, we should avoid to give builtin code functionality that can’t be reproduced in userland, unless there is a good reason for it. For the particular case, people may want to reproduce in userland an improved or corrected version of a given internal function, and they need be able to give their function the same return type information than the original. —Claude
  107329
September 26, 2019 10:03 cmbecker69@gmx.de ("Christoph M. Becker")
On 26.09.2019 at 10:06, Nikita Popov wrote:

> This RFC is currently held up by a lack of implementation. Once that is > done, the RFC will go forward as-is (barring any novel concerns). Because I > consider it an important part of the overall proposal (*), I will neither > remove the false type, nor split it into a separate vote. People may vote > against the whole RFC if they disagree with this aspect, or any other > aspect of the proposal. > > (*) While certainly not the primary reason for why we should support union > types, the reason why I brought this proposal forward at this time > specifically, is that the lack of union types is a blocker for my pet > project of providing comprehensive type annotations for internal functions. > Supporting "false" is strictly necessary for this purpose, because it is > part of nearly all unions as far as internal functions are concerned.
Maybe an option would be to introduce "falsable" types by marking them with an exclamation mark in front? While I'm generally not in favor of having failure signalled by returning false (except for bool functions), and it wouldn't solve the issue generally (some functions can return multiple types in addition to false), it appears to fit to be able to mark nullable types with a question mark. By the way, I wouldn't call that a "pet project". In my opinion, that is a really big improvement, and I hope that these stub files can become the single source of truth in the not too distant future. Regards, Christoph
  107333
September 26, 2019 11:17 benjamin.morel@gmail.com (Benjamin Morel)
> While certainly not the primary reason for why we should support union types, the reason why I brought this proposal forward at this time
specifically, is that the lack of union types is a blocker for my pet project of providing comprehensive type annotations for internal functions. Supporting "false" is strictly necessary for this purpose, because it is part of nearly all unions as far as internal functions are concerned. How is that blocked by adding only "bool"? — Benjamin
  107351
September 30, 2019 10:18 bjorn.x.larsson@telia.com (=?UTF-8?Q?Bj=c3=b6rn_Larsson?=)
Den 2019-09-26 kl. 10:06, skrev Nikita Popov:

> On Tue, Sep 24, 2019 at 10:06 PM Sara Golemon <pollita@php.net> wrote: > >> On Tue, Sep 24, 2019 at 12:24 PM Claude Pache pache@gmail.com> >> wrote: >>> The choice of supporting precisely the two literal values `null` and >> `false` >>> is not arbitrary: They are the two values that are the most often used as >>> sentinel values (for indicating failure or absence). It is true that >> `true` is >>> also sometimes used as sentinel value (more rarely and not among the >>> internal functions), but the same can be said of other literal values >>> (one of your examples includes `0`). >>> >> While I personally think `false` makes sense as an allowed "type", I also >> don't want to see the union types RFC get held up on such a tiny detail. >> >> I would propose either of the following alternatives: >> 1/ Remove `false` from the proposal. It can always be added at a later >> time, but not taken away. >> 2/ Make this detail a sub-vote. I would suggest that this sub-vote should >> also be subject to a 2/3 majority in order to pass. >> >> -Sara >> > This RFC is currently held up by a lack of implementation. Once that is > done, the RFC will go forward as-is (barring any novel concerns). Because I > consider it an important part of the overall proposal (*), I will neither > remove the false type, nor split it into a separate vote. People may vote > against the whole RFC if they disagree with this aspect, or any other > aspect of the proposal. > > Regards, > Nikita > > (*) While certainly not the primary reason for why we should support union > types, the reason why I brought this proposal forward at this time > specifically, is that the lack of union types is a blocker for my pet > project of providing comprehensive type annotations for internal functions. > Supporting "false" is strictly necessary for this purpose, because it is > part of nearly all unions as far as internal functions are concerned.
Hi Nikita, Given the feedback on 23/9 from B Morel regarding occurrence of true as a return value, would you then consider adding true as a valid return type in unions? r//Björn L
  107626
October 22, 2019 09:36 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. > > Relatively to the previous proposal by Bob&Levi ( > https://wiki.php.net/rfc/union_types), I think the main differences in > this proposal are: > * Updated to specify interaction with new language features, like full > variance and property types. > * Updated for the use of the ?Type syntax rather than the Type|null > syntax. > * Only supports "false" as a pseudo-type, not "true". > * Slightly simplified semantics for the coercive typing mode. > > Regards, > Nikita >
An implementation of this proposal is now available at https://github.com/php/php-src/pull/4838. As the implementation didn't turn up any new issues, I think it's time to move this RFC forward to voting. Nikita
  107627
October 22, 2019 12:38 dmitry@zend.com (Dmitry Stogov)
Hi Nikita,

Can you please give me one/two days, before starting the voting, for implementation review (at least until October 25),

Thanks. Dmitry.

________________________________
From: Nikita Popov ppv@gmail.com>
Sent: Tuesday, October 22, 2019 12:36
To: PHP internals <internals@lists.php.net>
Subject: [PHP-DEV] Re: [RFC] Union Types v2

On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. > > Relatively to the previous proposal by Bob&Levi ( > https://wiki.php.net/rfc/union_types), I think the main differences in > this proposal are: > * Updated to specify interaction with new language features, like full > variance and property types. > * Updated for the use of the ?Type syntax rather than the Type|null > syntax. > * Only supports "false" as a pseudo-type, not "true". > * Slightly simplified semantics for the coercive typing mode. > > Regards, > Nikita >
An implementation of this proposal is now available at https://github.com/php/php-src/pull/4838. As the implementation didn't turn up any new issues, I think it's time to move this RFC forward to voting. Nikita CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
  107647
October 23, 2019 15:42 dmitry@zend.com (Dmitry Stogov)
Hi Nikita,

I checked the Union Type implementation, and it more or less good. I mean, it implements the RFC in almost the best way.
However, as I don't like the RFC itself. Especially, unions of multiple classes and interference with type variance, I'll vote against this.

Actually, I think PHP already took wrong direction implementing "typed references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference assignment) performance degradation.
Anyone may rerun my benchmarks https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9

Thanks. Dmitry.

________________________________
From: Dmitry Stogov <dmitry@zend.com>
Sent: Tuesday, October 22, 2019 15:38
To: Nikita Popov ppv@gmail.com>; PHP internals <internals@lists.php..net>
Subject: Re: [PHP-DEV] Re: [RFC] Union Types v2

Hi Nikita,

Can you please give me one/two days, before starting the voting, for implementation review (at least until October 25),

Thanks. Dmitry.

________________________________
From: Nikita Popov ppv@gmail.com>
Sent: Tuesday, October 22, 2019 12:36
To: PHP internals <internals@lists.php.net>
Subject: [PHP-DEV] Re: [RFC] Union Types v2

On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. > > Relatively to the previous proposal by Bob&Levi ( > https://wiki.php.net/rfc/union_types), I think the main differences in > this proposal are: > * Updated to specify interaction with new language features, like full > variance and property types. > * Updated for the use of the ?Type syntax rather than the Type|null > syntax. > * Only supports "false" as a pseudo-type, not "true". > * Slightly simplified semantics for the coercive typing mode. > > Regards, > Nikita >
An implementation of this proposal is now available at https://github.com/php/php-src/pull/4838. As the implementation didn't turn up any new issues, I think it's time to move this RFC forward to voting. Nikita CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
  107656
October 24, 2019 12:47 rowan.collins@gmail.com (Rowan Tommins)
Hi Dmitry,

On Wed, 23 Oct 2019 at 16:43, Dmitry Stogov <dmitry@zend.com> wrote:

> Actually, I think PHP already took wrong direction implementing "typed > references" and "type variance". > Introducing more "typing", we suggest using it, but this "typing" comes > with a huge cost of run-time checks. > From 10% (scalar type hint of argument) to 3 times (typed reference > assignment) performance degradation. > Anyone may rerun my benchmarks > https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9 >
I think this performance impact is a real concern; PHP is the only language I know of which implements type checks entirely at run-time in production code, and we should ask ourselves if that's definitely the right approach. Would it be possible, at least in principle, to build a static analysis tool which could prove that certain type checks would never fail, and prime OpCache with code that leaves them out? As I understand it, this is how Dart works: the compiler only emits run-time checks for assertions it can't prove true by static analysis. The simpler idea I had in this area was caching what type checks a value had passed, either on each zval or perhaps just at the class level, so that checking the same type again would be much faster, even if it was a complex union with multiple interfaces. Regards, -- Rowan Tommins [IMSoP]
  107658
October 24, 2019 14:01 marandall@php.net (Mark Randall)
On 24/10/2019 13:47, Rowan Tommins wrote:
> The simpler idea I had in this area was caching what type checks a value > had passed, either on each zval or perhaps just at the class level, so that > checking the same type again would be much faster, even if it was a complex > union with multiple interfaces.
Last night I was playing about with caching the last passed interface as part of the instanceof_function. https://gist.github.com/marandall/38d7dba6600889d897f2c8fc57532f31 When used in the likes of loops which are performing the same check over and over, it yields about a 18% increase when using a 3 layer deep nested interface. It currently just stores a cache of the last match in the class entry so it would hit as many uses as possible. For T_INSTANCEOF it could potentially use an extended cache slot to store a reference to it (would need expanding out into a separate struct as the cache slot is already used for converting the static name to a ce). I'm less clear about how we could handle it for everything else, like parameter type validation. The underlying implementation already seems pretty well optimized so I'm not sure where a performance penalty is to be found. -- Mark Randall
  107690
October 25, 2019 08:08 phpmailinglists@gmail.com (Peter Bowyer)
On Thu, 24 Oct 2019 at 13:47, Rowan Tommins collins@gmail.com> wrote:

> I think this performance impact is a real concern; PHP is the only language > I know of which implements type checks entirely at run-time in production > code, and we should ask ourselves if that's definitely the right approach. >
As they are runtime checks, would an ini setting where they can be completely disabled be feasible? So during development and in production when the performance decrease doesn't matter, I can have the full runtime type checking. But when absolutely needed, the checking can be disabled. Peter
  107703
October 26, 2019 10:44 benjamin.morel@gmail.com (Benjamin Morel)
> > As they are runtime checks, would an ini setting where they can be > completely disabled be feasible? So during development and in production > when the performance decrease doesn't matter, I can have the full runtime > type checking. But when absolutely needed, the checking can be disabled.
Note that I would personally never disable these checks in production, as they may prevent potential bugs down the road, that would not have necessarily been caught in dev. I would rather expect PHP to bring down the cost of these checks in production to a negligible level, as mentioned in my previous message (static analysis and JIT). — Benjamin On Fri, 25 Oct 2019 at 10:09, Peter Bowyer <phpmailinglists@gmail.com> wrote:
> On Thu, 24 Oct 2019 at 13:47, Rowan Tommins collins@gmail.com> > wrote: > > > I think this performance impact is a real concern; PHP is the only > language > > I know of which implements type checks entirely at run-time in production > > code, and we should ask ourselves if that's definitely the right > approach. > > > > As they are runtime checks, would an ini setting where they can be > completely disabled be feasible? So during development and in production > when the performance decrease doesn't matter, I can have the full runtime > type checking. But when absolutely needed, the checking can be disabled. > > Peter >
  107702
October 26, 2019 08:35 benjamin.morel@gmail.com (Benjamin Morel)
> > I think this performance impact is a real concern; PHP is the only language > I know of which implements type checks entirely at run-time in production > code, and we should ask ourselves if that's definitely the right approach..
Would it be possible, at least in principle, to build a static analysis
> tool which could prove that certain type checks would never fail, and prime > OpCache with code that leaves them out? As I understand it, this is how > Dart works: the compiler only emits run-time checks for assertions it can't > prove true by static analysis.
I was wondering the same thing, especially in the light of preloading, where a lot of class/function definitions are known on startup. For example, doing: function x(): int {} function y(int $foo) {} y(x()); Should definitely not redundantly check the type of the y() parameter, if possible. Also, in the case of JIT, shouldn't type hints actually *improve* performance? — Benjamin On Thu, 24 Oct 2019 at 14:47, Rowan Tommins collins@gmail.com> wrote:
> Hi Dmitry, > > On Wed, 23 Oct 2019 at 16:43, Dmitry Stogov <dmitry@zend.com> wrote: > > > Actually, I think PHP already took wrong direction implementing "typed > > references" and "type variance". > > Introducing more "typing", we suggest using it, but this "typing" comes > > with a huge cost of run-time checks. > > From 10% (scalar type hint of argument) to 3 times (typed reference > > assignment) performance degradation. > > Anyone may rerun my benchmarks > > https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9 > > > > > I think this performance impact is a real concern; PHP is the only language > I know of which implements type checks entirely at run-time in production > code, and we should ask ourselves if that's definitely the right approach.. > > Would it be possible, at least in principle, to build a static analysis > tool which could prove that certain type checks would never fail, and prime > OpCache with code that leaves them out? As I understand it, this is how > Dart works: the compiler only emits run-time checks for assertions it can't > prove true by static analysis. > > The simpler idea I had in this area was caching what type checks a value > had passed, either on each zval or perhaps just at the class level, so that > checking the same type again would be much faster, even if it was a complex > union with multiple interfaces. > > Regards, > -- > Rowan Tommins > [IMSoP] >
  107693
October 25, 2019 10:22 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Oct 23, 2019 at 5:42 PM Dmitry Stogov <dmitry@zend.com> wrote:

> Hi Nikita, > > I checked the Union Type implementation, and it more or less good. I mean, > it implements the RFC in almost the best way. > However, as I don't like the RFC itself. Especially, unions of multiple > classes and interference with type variance, I'll vote against this. > > Actually, I think PHP already took wrong direction implementing "typed > references" and "type variance". > Introducing more "typing", we suggest using it, but this "typing" comes > with a huge cost of run-time checks. > From 10% (scalar type hint of argument) to 3 times (typed reference > assignment) performance degradation. > Anyone may rerun my benchmarks > https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9 > > Thanks. Dmitry. >
For reference, here are the results I get with/without JIT: https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33 I think that union types don't really change much in terms of the performance calculus of type checking, because type checking is equally fast (or slow) for a single scalar type, and 5 different scalar types: they are all handled in a single bit check. The only case that is indeed somewhat slow is if multiple class types are used in the union, because we actually have to check each one until we find a match. I do hope that having unions with a large number of classes will not be common. Generally, this area could use some more optimization from the implementation side. I spent a bit of time yesterday optimizing how we perform instanceof checks (the implementation was quite inefficient, especially for interfaces), which had a large positive impact on class type checks. There's more low hanging fruit like this, for example verify_return_type has no JIT implementation yet (which is why with JIT simple arg type checks have nearly zero overhead, while return type checks have a large overhead). Assignments to typed properties are also badly optimized, because everything is punted to the slow path, while we should be able to handle the simple cases with just one extra bit check, without going through the complex implementation that deals with things like weak typing coercions. I do think that performance considerations are important when it comes to new typing features (which is why, for example, we have categorically rejected a "typed arrays" implementation that has to check all array elements), but don't see union types are particular problematic in that regard, beyond what we already have. Nikita ------------------------------
> *From:* Dmitry Stogov <dmitry@zend.com> > *Sent:* Tuesday, October 22, 2019 15:38 > *To:* Nikita Popov ppv@gmail.com>; PHP internals < > internals@lists.php.net> > *Subject:* Re: [PHP-DEV] Re: [RFC] Union Types v2 > > Hi Nikita, > > Can you please give me one/two days, before starting the voting, for > implementation review (at least until October 25), > > Thanks. Dmitry. > > ------------------------------ > *From:* Nikita Popov ppv@gmail.com> > *Sent:* Tuesday, October 22, 2019 12:36 > *To:* PHP internals <internals@lists.php.net> > *Subject:* [PHP-DEV] Re: [RFC] Union Types v2 > > On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com> wrote: > > > Hi internals, > > > > I'd like to start the discussion on union types again, with a new > proposal: > > > > Pull Request: https://github.com/php/php-rfcs/pull/1 > > Rendered Proposal: > > > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > > evaluate whether this might be a better medium for RFC proposals in the > > future. It would be great if we could keep the discussion to the GitHub > > pull request for the purpose of this experiment (keep in mind that you > can > > also create comments on specific lines in the proposal, not just the > > overall discussion thread!) Of course, you can also reply to this mail > > instead. The final vote will be held in the wiki as usual. > > > > Relatively to the previous proposal by Bob&Levi ( > > https://wiki.php.net/rfc/union_types), I think the main differences in > > this proposal are: > > * Updated to specify interaction with new language features, like full > > variance and property types. > > * Updated for the use of the ?Type syntax rather than the Type|null > > syntax. > > * Only supports "false" as a pseudo-type, not "true". > > * Slightly simplified semantics for the coercive typing mode. > > > > Regards, > > Nikita > > > > An implementation of this proposal is now available at > https://github.com/php/php-src/pull/4838. As the implementation didn't > turn > up any new issues, I think it's time to move this RFC forward to voting. > > Nikita > > > CAUTION: This email originated from outside of the organization. Do not > click on links or open attachments unless you recognize the sender and know > the content is safe. >
  107699
October 25, 2019 13:27 dmitry@zend.com (Dmitry Stogov)
Hi Nikita,

in your results, assignment to typed reference is ~3 time slower without JIT and ~10 times slower with JIT.
But this is not the worst case. I can simple craft a script where single assignment will lead to  hundreds type checks.

ref =& $ref;
   }
}
for ($i = 0; $i < 100; $i++) {
   $a[$i] = new A($ref);
}
$ref = new A; // <= 100 type checks on a single assignment
?>

In case we add union types, we can make 200, 300 check...

The worst thing, for me, is variance together with union of multiple class types.
Each such method compatibility check is going to be a puzzle solving, and I imagine people, proud of writing "complex pearls".

My main concern, I don't like to complicate the language with features that shouldn't be often used, and I don't like to complicate implementation and reduce performance without real need.

In my opinion, union of standard types and single class or null should be enough for most typing use cases.

Thanks. Dmitry.
________________________________
From: Nikita Popov ppv@gmail.com>
Sent: Friday, October 25, 2019 13:22
To: Dmitry Stogov <dmitry@zend.com>
Cc: PHP internals <internals@lists.php.net>
Subject: Re: [PHP-DEV] Re: [RFC] Union Types v2

On Wed, Oct 23, 2019 at 5:42 PM Dmitry Stogov <dmitry@zend.com<mailto:dmitry@zend.com>> wrote:
Hi Nikita,

I checked the Union Type implementation, and it more or less good. I mean, it implements the RFC in almost the best way.
However, as I don't like the RFC itself. Especially, unions of multiple classes and interference with type variance, I'll vote against this.

Actually, I think PHP already took wrong direction implementing "typed references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference assignment) performance degradation.
Anyone may rerun my benchmarks https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9

Thanks. Dmitry.

For reference, here are the results I get with/without JIT: https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33

I think that union types don't really change much in terms of the performance calculus of type checking, because type checking is equally fast (or slow) for a single scalar type, and 5 different scalar types: they are all handled in a single bit check. The only case that is indeed somewhat slow is if multiple class types are used in the union, because we actually have to check each one until we find a match. I do hope that having unions with a large number of classes will not be common.

Generally, this area could use some more optimization from the implementation side. I spent a bit of time yesterday optimizing how we perform instanceof checks (the implementation was quite inefficient, especially for interfaces), which had a large positive impact on class type checks. There's more low hanging fruit like this, for example verify_return_type has no JIT implementation yet (which is why with JIT simple arg type checks have nearly zero overhead, while return type checks have a large overhead). Assignments to typed properties are also badly optimized, because everything is punted to the slow path, while we should be able to handle the simple cases with just one extra bit check, without going through the complex implementation that deals with things like weak typing coercions.

I do think that performance considerations are important when it comes to new typing features (which is why, for example, we have categorically rejected a "typed arrays" implementation that has to check all array elements), but don't see union types are particular problematic in that regard, beyond what we already have.

Nikita

________________________________
From: Dmitry Stogov <dmitry@zend.com<mailto:dmitry@zend.com>>
Sent: Tuesday, October 22, 2019 15:38
To: Nikita Popov ppv@gmail.com<mailto:nikita.ppv@gmail.com>>; PHP internals <internals@lists.php.net<mailto:internals@lists.php.net>>
Subject: Re: [PHP-DEV] Re: [RFC] Union Types v2

Hi Nikita,

Can you please give me one/two days, before starting the voting, for implementation review (at least until October 25),

Thanks. Dmitry.

________________________________
From: Nikita Popov ppv@gmail.com<mailto:nikita.ppv@gmail.com>>
Sent: Tuesday, October 22, 2019 12:36
To: PHP internals <internals@lists.php.net<mailto:internals@lists.php.net>>
Subject: [PHP-DEV] Re: [RFC] Union Types v2

On Wed, Sep 4, 2019 at 10:26 AM Nikita Popov ppv@gmail.com<mailto:nikita.ppv@gmail.com>> wrote:

> Hi internals, > > I'd like to start the discussion on union types again, with a new proposal: > > Pull Request: https://github.com/php/php-rfcs/pull/1 > Rendered Proposal: > https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md > > As an experiment, I'm submitting this RFC as a GitHub pull request, to > evaluate whether this might be a better medium for RFC proposals in the > future. It would be great if we could keep the discussion to the GitHub > pull request for the purpose of this experiment (keep in mind that you can > also create comments on specific lines in the proposal, not just the > overall discussion thread!) Of course, you can also reply to this mail > instead. The final vote will be held in the wiki as usual. > > Relatively to the previous proposal by Bob&Levi ( > https://wiki.php.net/rfc/union_types), I think the main differences in > this proposal are: > * Updated to specify interaction with new language features, like full > variance and property types. > * Updated for the use of the ?Type syntax rather than the Type|null > syntax. > * Only supports "false" as a pseudo-type, not "true". > * Slightly simplified semantics for the coercive typing mode. > > Regards, > Nikita >
An implementation of this proposal is now available at https://github.com/php/php-src/pull/4838. As the implementation didn't turn up any new issues, I think it's time to move this RFC forward to voting. Nikita CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe. CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
  107700
October 25, 2019 15:40 Danack@basereality.com (Dan Ackroyd)
On Fri, 25 Oct 2019 at 14:27, Dmitry Stogov <dmitry@zend.com> wrote:
> > // <= 100 type checks on a single assignment
That code contains a reference that is re-used 100 times. I personally prefer not to use references at all, but if people do want to use them, we could put a note that references are bad for performance when used with types. People can then choose between: * using types, and not references for good performance. * using references, and not types for good performance. * using references and types, and getting slightly less good performance.
> I don't like to complicate the language with features > that shouldn't be often used,
Can we start the discussion about deprecating references in PHP 8 then? Very few people are writing code using them currently, and they seem to cause quite a lot of confusion, judging by the bug reports about them. If you're also saying that they are making it difficult to write performant PHP, that seems to be a good argument for looking at what would be needed to be done to remove them. cheers Dan Ack
  107701
October 25, 2019 16:13 dmitry@zend.com (Dmitry Stogov)
Removing references would be great for implementation 🙂 , but this doesn't look realistic in context of PHP language.
HHVM already made steps into this direction with Hack.

Thanks. Dmitry.

________________________________
From: Dan Ackroyd <Danack@basereality.com>
Sent: Friday, October 25, 2019 18:40
To: Dmitry Stogov <dmitry@zend.com>
Cc: Nikita Popov ppv@gmail.com>; PHP internals <internals@lists.php.net>
Subject: Re: [PHP-DEV] Re: [RFC] Union Types v2

On Fri, 25 Oct 2019 at 14:27, Dmitry Stogov <dmitry@zend.com> wrote:
> > // <= 100 type checks on a single assignment
That code contains a reference that is re-used 100 times. I personally prefer not to use references at all, but if people do want to use them, we could put a note that references are bad for performance when used with types. People can then choose between: * using types, and not references for good performance. * using references, and not types for good performance. * using references and types, and getting slightly less good performance.
> I don't like to complicate the language with features > that shouldn't be often used,
Can we start the discussion about deprecating references in PHP 8 then? Very few people are writing code using them currently, and they seem to cause quite a lot of confusion, judging by the bug reports about them. If you're also saying that they are making it difficult to write performant PHP, that seems to be a good argument for looking at what would be needed to be done to remove them. cheers Dan Ack CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
  107737
October 30, 2019 21:11 d.takken@xs4all.nl (Dik Takken)
On 25-10-19 12:22, Nikita Popov wrote:
> I do think that performance considerations are important when it comes to > new typing features (which is why, for example, we have categorically > rejected a "typed arrays" implementation that has to check all array > elements), but don't see union types are particular problematic in that > regard, beyond what we already have.
While union types barely seem to add to the cost of type checking, I also understand the general sentiment: Type checks can be expensive, should we really add more? Perhaps it might be an idea to wait a little before starting the vote to show the cost going down as more optimization work is done. Just to build confidence with those who are hesitant to give it their approval. Regards, Dik Takken
  107738
October 30, 2019 21:22 marandall@php.net (Mark Randall)
On 30/10/2019 21:11, Dik Takken wrote:
> Perhaps it might be an idea to wait a little before starting the vote to > show the cost going down as more optimization work is done. Just to > build confidence with those who are hesitant to give it their approval.
I don't know if this post has been accidentally sitting in an outbox for a bit too long and has just sent, but the vote has been underway for a while now. It currently has more than 90% in favour with good turnout. -- Mark Randall
  107750
November 2, 2019 21:44 d.takken@xs4all.nl (Dik Takken)
On 25-10-19 12:22, Nikita Popov wrote:
> For reference, here are the results I get with/without JIT: > https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33
I toyed a bit with the benchmark script (union_bench.php) as well and wanted to share some observations. First of all I noticed the benchmark script has a typo on line 90 where it is calling the wrong function. It should read: func6(1, 2, 3, 4, 5); When running the corrected script I see that adding 5 argument type checks and a return type check cause almost 4x slowdown. My results (with opcache / jit): func($a,$b,$c,$d,$e) 0.680 0.583 func(int $a,$b,$c,$d,$e): int 2.106 2.009 However, this appears to be entirely due to the return type check lacking a JIT implementation, as pointed out by Nikita. Adding one more test to the benchmark shows this nicely: func($a,$b,$c,$d,$e) 0.675 0.575 func(int $a,$b,$c,$d,$e) 0.574 0.475 func(int $a,$b,$c,$d,$e): int 2.106 2.009 Now we can see that the argument type hint actually improves performance, I guess due to it narrowing down the number of possible types that need to be considered for the function arguments. Union types allow for more accurate type hinting as well as type hinting in places where this is currently not possible. As a result union types can be used to obtain performance gains. As an example, consider the case where the return type hint matches the type information that opcache has inferred about the variable that is returned. In that case, the return type check is optimized away. Let us try and leverage union types to make this happen. From the benchmark script we take func6: function func6(int $a, int $b, int $c, int $d, int $e) : int { return $a + $b + $c + $d + $e; } and adjust it to read: function func6(int $a, int $b, int $c, int $d, int $e) : int|float { return $a + $b + $c + $d + $e; } Now the return type hint matches what opcache infers the result of the expression will be and the cost of return type checking disappears completely: func($a,$b,$c,$d,$e) 0.663 0.568 func(int $a,$b,$c,$d,$e) 0.574 0.475 func(int $a,$b,$c,$d,$e): int|float 0.561 0.466 Then, on to another observation. The SSA forms currently produced by opcache show union types like string|int. This suggests that opcache supports union types for type inference already. It explains why opcache can nicely optimize type checks away even when union types are used. This is not true for unions of classes though. A union type like int|Foo copies into the SSA form just fine while Foo|Bar becomes 'object'. Code like this: class Foo {} class Bar {} function func(): Foo|Bar { return new Foo(); } func(); produces the following SSA form: func: ; (lines=4, args=0, vars=0, tmps=1, ssa_vars=2, no_loops) ; (before dfa pass) ; /php-src/sapi/cli/test.php:6-8 ; return [object] BB0: start exit lines=[0-3] ; level=0 #0.V0 [object (Foo)] = NEW 0 string("Foo") DO_FCALL VERIFY_RETURN_TYPE #0.V0 [object (Foo)] -> #1.V0 [object] RETURN #1.V0 [object] which will still perform a return type check even though the return type hint matches the actual type of the variable. Apparently the union type support in opcache is present but incomplete. So, while union types can incur higher type checking cost they also provide more powerful means to help type inference and improve performance. As opcache improves over time I think we can expect the cost to decrease while the gain increases. Or am I too optimistic here? Regards, Dik Takken
  107758
November 5, 2019 09:46 dmitry@zend.com (Dmitry Stogov)
Hi Dik,

On 11/3/19 12:44 AM, Dik Takken wrote:
> On 25-10-19 12:22, Nikita Popov wrote: >> For reference, here are the results I get with/without JIT: >> https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33 > > I toyed a bit with the benchmark script (union_bench.php) as well and > wanted to share some observations. First of all I noticed the benchmark > script has a typo on line 90 where it is calling the wrong function. It > should read: > > func6(1, 2, 3, 4, 5); > > When running the corrected script I see that adding 5 argument type > checks and a return type check cause almost 4x slowdown. My results > (with opcache / jit): > > func($a,$b,$c,$d,$e) 0.680 0.583 > func(int $a,$b,$c,$d,$e): int 2.106 2.009
Thanks for catching this. At least, now I see 2 times slowdown without JIT, that I expected, but didn't see. func($a,$b,$c,$d,$e) 1.746 1.555 func(int $a,$b,$c,$d,$e): int 3.647 3.455 JIT will able to eliminate type checks only if it exactly knows the called function at caller site. Unfortunately, this is quite rare case, because the functions may be declared in different files, OOP, type variance, etc.
> > However, this appears to be entirely due to the return type check > lacking a JIT implementation, as pointed out by Nikita. Adding one more > test to the benchmark shows this nicely: > > func($a,$b,$c,$d,$e) 0.675 0.575 > func(int $a,$b,$c,$d,$e) 0.574 0.475 > func(int $a,$b,$c,$d,$e): int 2.106 2.009 > > Now we can see that the argument type hint actually improves > performance, I guess due to it narrowing down the number of possible > types that need to be considered for the function arguments. > > Union types allow for more accurate type hinting as well as type hinting > in places where this is currently not possible. As a result union types > can be used to obtain performance gains. As an example, consider the > case where the return type hint matches the type information that > opcache has inferred about the variable that is returned. In that case, > the return type check is optimized away. Let us try and leverage union > types to make this happen. From the benchmark script we take func6: > > function func6(int $a, int $b, int $c, int $d, int $e) : int { > return $a + $b + $c + $d + $e; > } > > and adjust it to read: > > function func6(int $a, int $b, int $c, int $d, int $e) : int|float { > return $a + $b + $c + $d + $e; > } > > Now the return type hint matches what opcache infers the result of the > expression will be and the cost of return type checking disappears > completely: > > func($a,$b,$c,$d,$e) 0.663 0.568 > func(int $a,$b,$c,$d,$e) 0.574 0.475 > func(int $a,$b,$c,$d,$e): int|float 0.561 0.466 > > Then, on to another observation. The SSA forms currently produced by > opcache show union types like string|int. This suggests that opcache > supports union types for type inference already. It explains why opcache > can nicely optimize type checks away even when union types are used. > > This is not true for unions of classes though. A union type like int|Foo > copies into the SSA form just fine while Foo|Bar becomes 'object'. Code > like this: > > class Foo {} > class Bar {} > > function func(): Foo|Bar { > return new Foo(); > } > > func(); > > produces the following SSA form: > > func: ; (lines=4, args=0, vars=0, tmps=1, ssa_vars=2, no_loops) > ; (before dfa pass) > ; /php-src/sapi/cli/test.php:6-8 > ; return [object] > BB0: start exit lines=[0-3] > ; level=0 > #0.V0 [object (Foo)] = NEW 0 string("Foo") > DO_FCALL > VERIFY_RETURN_TYPE #0.V0 [object (Foo)] -> #1.V0 [object] > RETURN #1.V0 [object] > > which will still perform a return type check even though the return type > hint matches the actual type of the variable. Apparently the union type > support in opcache is present but incomplete. > So, while union types can incur higher type checking cost they also > provide more powerful means to help type inference and improve > performance. As opcache improves over time I think we can expect the > cost to decrease while the gain increases. Or am I too optimistic here?
In my experience, static optimizations are not able to eliminate most type checks in PHP. Probably, if we developed more complete type-system and used type declaration everywhere we could achieve better results. Introducing more type checks and more complex rules will increase run-time overhead. I'm currently working on attempt of speculative optimizations based on run-time feedback, and the results might change the whole picture a bit. Anyway, I'm especially against of mixing multiple classes in unions, not because of performance, but because of complex rules of method inheritance compatibility checks in conjunction with type variance. Thanks. Dmitry.
> > Regards, > Dik Takken > > > CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe. >
  107781
November 7, 2019 21:22 d.takken@xs4all.nl (Dik Takken)
On 05-11-19 10:46, Dmitry Stogov wrote:
> I'm currently working on attempt of speculative optimizations based on > run-time feedback, and the results might change the whole picture a bit.
That looks like yet another impressive bit of work. It would be very interesting to see some preliminary results when you're ready to share them. Regards, Dik Takken