>
> > So we'd probably need some built-in definition of a "package", which
> could be analysed and compiled as one unit, and didn't rely on any run-time
> loading.
> That idea of a "package" came up during a debate on this list at least
> once, a few months ago, and I think it makes a lot of sense. And what I
> proposed effectively implies that namespaces would be treated like packages
> from the perspective of the compiler.
Putting aside the idea of distributing pre-compiled PHP scripts, if we're
only debating the precompilation as, notably, a means to reduce the cost of
type checks, I wouldn't mind if the precompilation occurred *only if
preloading is in use*, i.e. if most class definitions are known on server
startup, which is when the compilation / optimization passes could occur.
No preloading = no such optimizations, I could personally live with that.
No need for a package definition, IMO.
-- Benjamin
On Mon, 28 Oct 2019 at 00:56, Mike Schinkel <mike@newclarity.net> wrote:
> > On Oct 27, 2019, at 7:04 PM, Rowan Tommins collins@gmail.com>
> wrote:
>
> Thank you for your comments.
>
> > I chose the phrase "static analysis tool" deliberately, because I wanted
> to think about the minimum requirements for such a tool, rather than its
> long-term possibilities.
>
> Your points are all well-considered.
>
> To be clear, I wasn't stating the idea as an alternative to your idea, I
> was only stating that your comments inspired me to have the idea of a
> pre-compiler.
>
> IOW, I saw no reason both could not be done, one sooner and the other
> later.
>
> > However, combining those usefully may not be that easy.
>
> Also for clarity, I was not assuming existing OpCache would be 100%
> unmodified, I was talking about benefits that a pre-compiler could have and
> was less focused on ensuring it could slot into an existing OpCache
> implementation as-is.
>
> IOW, if it is worth doing it might be worth extending how the OpCache
> works.
>
> > So we'd probably need some built-in definition of a "package", which
> could be analysed and compiled as one unit, and didn't rely on any run-time
> loading.
>
> That idea of a "package" came up during a debate on this list at least
> once, a few months ago, and I think it makes a lot of sense. And what I
> proposed effectively implies that namespaces would be treated like packages
> from the perspective of the compiler.
>
> But then again a new package concept might be needed in addition to
> namespaces, I am not certain either way.
>
> > Unlike P++, Editions, or Strict Mode, this would undeniably define that
> the deprecated features were "the wrong way".
>
> I am not sure I can agree that it would define them as the "wrong way."
>
> The way I would see it is there would be a "strict way" and an "unstrict
> way." If you prefer the simplicity of low strictness and do not need
> more/better performance or the benefits of type-safety that are needed for
> building large applications, then the "right way" would still be the
> "unstrict way."
>
> And the non-strict features would not be "deprecated" per-se, they would
> instead be disallowed for the strict (compiled) way, but still allowed for
> the unstrict (interpreted) way.
>
> > If the engine had to support the feature anyway,
>
> I think we are talking two engines; one for compiling and another for
> interpreting. They could probably share a lot of code, but I would think
> it would still need to be two different engines.
>
> > I'm not sure what the advantage would be of tying it to "compiled vs
> non-compiled", rather than opting in via a declare() statement or package
> config.
>
> The advantage would be two-fold:
>
> 1. Backward compatibility
>
> 2. Allowing PHP to continue to meet the needs of new/less-skilled
> programmers and/or people who want a more productive language for smaller
> projects that do not need or want all the enterprisey type-safe features.
>
> Frankly it is this advantage which is the primary reason I thought to send
> a message to the list. The chance to have the benefit of strictness and
> high performance for more advanced PHP developers while still having full
> BC for existing code and for beginner developers seemed highly compelling
> to me.
>
> -Mike
>
>
On Mon, 28 Oct 2019 at 01:33, Benjamin Morel morel@gmail.com> wrote:
>
> >
> > > So we'd probably need some built-in definition of a "package", which
> > could be analysed and compiled as one unit, and didn't rely on any run-time
> > loading.
> > That idea of a "package" came up during a debate on this list at least
> > once, a few months ago, and I think it makes a lot of sense. And what I
> > proposed effectively implies that namespaces would be treated like packages
> > from the perspective of the compiler.
>
>
>
> Putting aside the idea of distributing pre-compiled PHP scripts, if we're
> only debating the precompilation as, notably, a means to reduce the cost of
> type checks, I wouldn't mind if the precompilation occurred *only if
> preloading is in use*, i.e. if most class definitions are known on server
> startup, which is when the compilation / optimization passes could occur.
> No preloading = no such optimizations, I could personally live with that.
>
> No need for a package definition, IMO.
This would break as soon as we have two versions of a class, and a
runtime choice which of them to use.
(see also Mark Randall's comment)
What about this, instead:
- Instead of a cli command, lazily "compile" in the opcache. So more
or less what we are already doing, I guess.
- Possibility to store/cache multiple versions of a file, depending on
other files it depends on.
Somehow like this, per file:
1. Compile a low-level version of the file, or load it from a cache,
with cache id = file path.
2. Recursively process all the files and classes (autoload) this file
depends on.
3. Generate a hash from the dependencies.
4. Compile the final version of the file, or load it from a cache,
with cache id = file path + dependencies hash.
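
As a rough illustration of the four steps above, here is a minimal sketch. It is written in Python purely as a neutral notation and is not engine or opcache code. The names base_cache, final_cache, compile_base, collect_dependencies, dependencies_hash and compile_final are all hypothetical, "compilation" is reduced to a placeholder tuple, and the dependency graph is assumed to be known up front, which is exactly the part that autoloading makes hard.

```python
import hashlib

# Hypothetical in-memory stand-ins for the two cache levels.
base_cache = {}    # cache id = file path
final_cache = {}   # cache id = (file path, dependencies hash)


def compile_base(path, source):
    """Step 1: low-level compile, cached by file path only.

    Invalidation when the file itself changes is deliberately left out.
    """
    if path not in base_cache:
        # Placeholder for real compilation: just record the source.
        base_cache[path] = ("base", source)
    return base_cache[path]


def collect_dependencies(path, deps, seen=None):
    """Step 2: recursively gather every file this file depends on."""
    seen = set() if seen is None else seen
    for dep in deps.get(path, ()):
        if dep not in seen:
            seen.add(dep)
            collect_dependencies(dep, deps, seen)
    return seen


def dependencies_hash(path, deps, sources):
    """Step 3: hash the names and contents of all transitive dependencies."""
    digest = hashlib.sha256()
    for dep in sorted(collect_dependencies(path, deps)):
        digest.update(dep.encode())
        digest.update(b"\0")
        digest.update(sources[dep].encode())
        digest.update(b"\0")
    return digest.hexdigest()


def compile_final(path, deps, sources):
    """Step 4: final compile, cached by file path + dependencies hash."""
    base = compile_base(path, sources[path])
    dep_hash = dependencies_hash(path, deps, sources)
    key = (path, dep_hash)
    if key not in final_cache:
        # Placeholder for the specialised compile against known dependencies.
        final_cache[key] = ("final", base, dep_hash)
    return final_cache[key]
```

With this shape, changing any transitive dependency produces a new hash and therefore a second "final" entry for the same file, while the "base" entry for that file is shared between them.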
Perhaps this could even be further optimized with some "guessing":
Assume everything is as it was the last time, until we hit a conflict.
This is probably more complicated than I am describing it here.
I kept the term "dependencies" intentionally vague, because I am not
sure what exactly we would need to look at.
Perhaps we would store not just multiple versions of each file, but of
each global symbol (class, function).
- One "base version" for each distinct definition of a symbol in a
distinct file.
- One "specific version" per combination of versions of other symbols
this depends on.
One problem I see is that some of the dependees may be unknown at the
time a file is included.
E.g. a function might call a static method from a class that has not
yet been included, triggering the autoloader.
Since the autoloader can be anything, we have no way to predict which
file will be included, and thus which version of the static method to
typecheck against.
Even if we previously scanned the entire project directory, and found
only one class with the given static method, the autoloader might
instead include a file outside the project directory, or define the
class with eval() or stream wrappers, or dump a generated file in
/tmp.
This would mean we would have to run a non-deterministic model until
all dependees are included.
So perhaps this idea is a dead end :)
-- Andreas
>
> This would break as soon as we have two versions of a class, and a
> runtime choice which of them to use.
> (see also Mark Randall's comment)
That's why I'm suggesting to only make these optimizations when preloading
<https://wiki.php.net/rfc/preload> is in use, which means that you know
ahead of time the class definitions, and you cannot have 2 runtime
definitions of a given class.
No preloading = no optimizations.
Full preloading (whole codebase) = maximum optimizations.
Partial preloading = the compiler should still be able to optimize *some* of
the code involving only the preloaded classes.
We already have, since PHP 7.4, a mechanism to know static class
definitions on startup, so why not build further optimizations on top of it?
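
To make the three cases concrete, here is a small sketch of the eligibility rule only, written in Python as a neutral notation rather than as engine code. The function name optimizable_functions and the idea of a per-function list of referenced classes are assumptions made for illustration; how the compiler would actually discover those references is not addressed.

```python
def optimizable_functions(preloaded_classes, references):
    """Return the functions that could be optimized ahead of time.

    `references` maps a function name to the class names it uses.
    A function qualifies only if every class it uses is preloaded.
    """
    preloaded = set(preloaded_classes)
    if not preloaded:
        # No preloading = no optimizations.
        return []
    return sorted(
        name for name, classes in references.items()
        if set(classes) <= preloaded
    )


# Hypothetical codebase: three functions and the classes they touch.
refs = {
    "render": ["View", "Template"],
    "save": ["Repository"],
    "helper": [],
}
print(optimizable_functions([], refs))
print(optimizable_functions(["View", "Template"], refs))
print(optimizable_functions(["View", "Template", "Repository"], refs))
```

The second call models partial preloading: code that only touches preloaded classes qualifies, and anything that reaches a class outside the preloaded set is left to the normal runtime path.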
-- Benjamin
On 27/10/2019 23:56, Mike Schinkel wrote:
> 2. Allowing PHP to continue to meet the needs of new/less-skilled programmers and/or people who want a more productive language for smaller projects that do not need or want all the enterprisey type-safe features.
This concept of type safety being an enterprise feature needs to die.
Types are a way of preventing your program from getting into states that
you don't expect it to be in, so you don't have to worry about handling
them in the first place.
Scalars, and strict types would have saved me _so much_ time when I
started trying to learn PHP.
Here's a video I stumbled upon recently that helps explain why types
help make coding easier, by reducing the number of possible states an
application can be in:
https://youtu.be/q1Yi-WM7XqQ?t=656
--
Mark Randall
On Sun, 27 Oct 2019 at 23:56, Mike Schinkel <mike@newclarity.net> wrote:
>
> So we'd probably need some built-in definition of a "package", which could
> be analysed and compiled as one unit, and didn't rely on any run-time
> loading.
>
>
> That idea of a "package" came up during a debate on this list at least
> once, a few months ago, and I think it makes a lot of sense. And what I
> proposed effectively implies that namespaces would be treated like packages
> from the perspective of the compiler.
>
> But then again a new package concept might be needed in addition to
> namespaces, I am not certain either way.
>
>
Current tools tend to actually work on a directory level, because you don't
actually know what namespaces are involved until after you've loaded it,
and a file can include code for two completely separate namespaces. My
thinking was that a package would pre-define the full list of files that
define it, with no auto-loader, and no conditional definitions evaluated at
run-time. As Benjamin points out, this is closely related to preloading.
> Unlike P++, Editions, or Strict Mode, this would undeniably define that
> the deprecated features were "the wrong way".
>
>
> I am not sure I can agree that it would define them as the "wrong way."
>
>
> The way I would see it is there would be a "strict way" and an "unstrict
> way." If you prefer the simplicity of low strictness and do not need
> more/better performance or the benefits of type-safety that are needed for
> building large applications, then the "right way" would still be the
> "unstrict way."
>
>
And what if you want simplicity *and* performance? Most of the things
people want to make strict about the language don't make it faster, so if
we limited "pre-compiled mode" to be strict, we'd be making a deliberate
choice to group objectively good things (fast vs slow) with subjective
preferences (strict vs simple). That pretty clearly marks strict mode as
"the better way".
> If the engine had to support the feature anyway,
>
>
> I think we are talking two engines; one for compiling and another for
> interpreting. They could probably share a lot of code, but I would think
> it would still need to be two different engines.
>
>
That sounds like the worst kind of fork: two different engines, running two
different dialects of the language. At that point, you might as well just
switch to Hack.
Note that this was exactly what "P++" was intended to avoid - the two
dialects would exist in the same engine, and get the same performance and
security enhancements.
> I'm not sure what the advantage would be of tying it to "compiled vs
> non-compiled", rather than opting in via a declare() statement or package
> config.
>
> The advantage would be two-fold:
>
> 1. Backward compatibility
>
> 2. Allowing PHP to continue to meet the needs of new/less-skilled
> programmers and/or people who want a more productive language for smaller
> projects that do not need or want all the enterprisey type-safe features.
>
>
Both of these are reasons to have some sort of "strict mode", but not for
tying it to some other feature.
Regards,
--
Rowan Tommins
[IMSoP]
> On Oct 28, 2019, at 6:00 AM, Rowan Tommins collins@gmail.com> wrote:
>
> Current tools tend to actually work on a directory level, because you don't
> actually know what namespaces are involved until after you've loaded it,
> and a file can include code for two completely separate namespaces. My
> thinking was that a package would pre-define the full list of files that
> define it, with no auto-loader, and no conditional definitions evaluated at
> run-time. As Benjamin points out, this is closely related to preloading.
I would rather have a tool that did not require specifying the files. I personally would be fine with one that used a directory as the demarcator, even if that meant it would not work when you put part of your namespace in another directory.
> And what if you want simplicity *and* performance? Most of the things
> people want to make strict about the language don't make it faster, so if
> we limited "pre-compiled mode" to be strict, we'd be making a deliberate
> choice to group objectively good things (fast vs slow) with subjective
> preferences (strict vs simple). That pretty clearly marks strict mode as
> "the better way".
At the risk of being too flippant, I defer to the wisdom of that great philosopher Mick Jagger and say you can't always get what you want...
But seriously, at some point tradeoffs have to be made to see any forward progress. What we have not found before was a good tradeoff between strict and BC. Maybe this is it? After all, while not all strict things are about performance, many things that enable performance are strict.
> That sounds like the worst kind of fork: two different engines, running two
> different dialects of the language. At that point, you might as well just
> switch to Hack.
That feels like an over-reaction. Hack has purposely diverged from PHP and requires a different runtime than PHP.
The idea I was proposing is that there be one PHP runtime that operates in two different modes (one mode per "engine"), and the goal of the two modes would be to stay more similar than different, but allow one of them to have BC breaks.
> Note that this was exactly what "P++" was intended to avoid - the two
> dialects would exist in the same engine, and get the same performance and
> security enhancements.
It could also be one engine, it just seemed like that coupling would be more problematic than separating them.
That said, I'm not skilled enough in PHP internals to implement it (yet?) so I can only speak to it at a high level.
>> The advantage would be two-fold:
>>
>> 1. Backward compatibility
>>
>> 2. Allowing PHP to continue to meet the needs of new/less-skilled
>> programmers and/or people who want a more productive language for smaller
>> projects that do not need or want all the enterprisey type-safe features.
>
> Both of these are reasons to have some sort of "strict mode", but not for
> tying it to some other feature.
I don't understand your reply, but maybe it is moot considering the rest of the dialog?
What we have today is a rock vs a hard-place, and no one wants to give even a millimeter.
So, if this is not a viable solution in your mind to break the logjam between BC and the desire for strictness-in-all-the-things, do you have an alternate, better proposal?
-Mike
On 29/10/2019 19:04, Mike Schinkel wrote:
>> Note that this was exactly what "P++" was intended to avoid - the two
>> dialects would exist in the same engine, and get the same performance and
>> security enhancements.
>
> It could also be one engine, it just seemed like that coupling
> would be more problematic than separating them.
>
I think the problem is that as soon as you have two engines targeting
different feature sets, it will be hard to persuade people to spend
equal attention on both. If all the new features end up being added to
one engine, the other one is going to increasingly feel like "legacy
mode", rather than "equal but different".
>> Both of these are reasons to have some sort of "strict mode", but not for
>> tying it to some other feature.
>
> I don't understand your reply, but maybe it is moot considering
> the rest of the dialog?
>
> What we have today is a rock vs a hard-place, and no one wants to
> give even a millimeter.
>
> So, if this is not a viable solution in your mind to break the
> logjam between BC and the desire for strictness-in-all-the-things,
> do you have an alternate, better proposal?
>
The idea of an "extra strict" and/or "less backwards compatible" mode
has been mentioned on the list several times, but you're the first to
suggest making it mandatory when using an otherwise unrelated
performance feature.
It would be much better to keep it separate, and opt into it via a
declare() statement, or a package configuration, or a file extension.
There have been proposals for a single flag, lots of separate flags, a
complete "P++" dialect, or bundles of settings ("Editions").
Whatever the approach, a key goal in my mind should be to maximise the
compatibility between the two, and share as much implementation as
possible. Both/all modes should get the same performance improvements,
except where the actual features are necessarily slower or faster.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
> On Oct 29, 2019, at 5:49 PM, Rowan Tommins collins@gmail.com> wrote:
>
> I think the problem is that as soon as you have two engines targeting different feature sets, it will be hard to persuade people to spend equal attention on both. If all the new features end up being added to one engine, the other one is going to increasingly feel like "legacy mode", rather than "equal but different".
That is a fair point.
> It would be much better to keep it separate, and opt into it via a declare() statement, or a package configuration, or a file extension. There have been proposals for a single flag, lots of separate flags, a complete "P++" dialect, or bundles of settings ("Editions").
Correct me if I am wrong, but all of those have been objected to, strenuously, by at least several people on the list.
What will it take to finally get enough consensus to move forward?
> Both/all modes should get the same performance improvements, except where the actual features are necessarily slower or faster.
Fine. But a pre-compiler still could have merit.
One of the things I would like to see from a pre-compiler is getting rid of the need to deal with an autoloader, and hence being able to store multiple related classes in the same file.
Primarily I would like this while doing R&D on a project idea, prior to fully understanding what the object hierarchy needs to be. That, of course, would conflict with the non-pre-compiled code by its very nature.
-Mike
On 29/10/2019 21:56, Mike Schinkel wrote:
>
>
>> It would be much better to keep it separate, and opt into it via a
>> declare() statement, or a package configuration, or a file extension.
>> There have been proposals for a single flag, lots of separate flags,
>> a complete "P++" dialect, or bundles of settings ("Editions").
>
> Correct me if I am wrong, but all of those have been objected to,
> strenuously, by at least several people on the list.
>
Indeed, but adding "the strict mode will be faster than the legacy mode"
is likely to make those objections stronger, not resolve them, unless
you can demonstrate _why_ the strict mode needs to be mandatory for the
pre-compiled mode.
>> Both/all modes should get the same performance improvements, except
>> where the actual features are necessarily slower or faster.
>
> Fine. But a pre-compiler still could have merit.
>
Absolutely! In case you've forgotten, it was my remark that started this
whole discussion: https://externals.io/message/106844#107656
> One of the things I would like to see from a pre-compiler is
> getting rid of the need to deal with an autoloader and hence being
> able to store multiple related classes in the same file.
>
Yes, I think moving from auto-loading to eager loading would make sense
for a lot of projects.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]