ZTS improvement idea

  104378
February 13, 2019 08:26 dmitry@zend.com (Dmitry Stogov)
Hi,


After a JIT+ZTS related discussion with Joe and Bob, and some related analysis,
I came to a more or less formed design idea and described it at https://wiki.php.net/zts-improvement

This is not an RFC, and I'm not sure I'd like to implement the TSRM changes myself now.


Comments are welcome.


Thanks. Dmitry.
  104379
February 13, 2019 09:02 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Feb 13, 2019 at 9:26 AM Dmitry Stogov <dmitry@zend.com> wrote:

> Hi,
>
> After JIT+ZTS related discussion with Joe and Bob, and some related
> analyzes.
>
> I came to more or less formed design idea and described it at
> https://wiki.php.net/zts-improvement
>
> This is not an RFC and I'm not sure, if I like to implement TSRM changes
> myself now.
>
> Comments are welcome.

Hi Dmitry,

Thanks for looking into this issue. As a possible alternative I would like
to suggest the use of ZEND_TLS (__thread) for the EG/CG/BG etc globals on
Linux (on Windows this is not possible due to DLL linkage restrictions).

__thread generates very good code (single load over %fs segment with
constant address) if the global is defined and used in an executable. I'm
not sure what kind of code it generates when TLS is declared in an
executable and used in a shared object, but as direct access from
extensions to the engine globals shouldn't be common, it's probably okay
even if it uses __tls_get_addr.

Nikita
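[Editorial sketch] To make the __thread suggestion above concrete, here is a minimal C illustration of a per-thread globals struct reached through an accessor macro. It is only a sketch under stated assumptions: demo_globals, demo_g, DEMO_G and demo_tls_isolated are hypothetical names invented for this example and are not PHP's actual EG/CG/BG definitions. It assumes GCC or Clang on Linux with pthreads, and it says nothing about the machine code the compiler emits.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an engine globals struct (not PHP's real one). */
typedef struct {
    int  error_reporting;
    long request_count;
} demo_globals;

/* __thread gives every thread its own zero-initialized copy of demo_g. */
static __thread demo_globals demo_g;

/* Hypothetical accessor macro in the spirit of EG()/CG(). */
#define DEMO_G(field) (demo_g.field)

/* Each worker increments only its own thread-local counter, then reports
 * the final value back through the argument pointer. */
static void *worker(void *arg)
{
    long *slot = (long *)arg;
    long n = *slot;
    for (long i = 0; i < n; i++) {
        DEMO_G(request_count)++;
    }
    *slot = DEMO_G(request_count);
    return NULL;
}

/* Returns 0 if every thread saw an independent copy of the globals. */
int demo_tls_isolated(void)
{
    pthread_t t1, t2;
    long a = 1000, b = 2500;

    DEMO_G(request_count) = 7; /* the calling thread's private copy */

    if (pthread_create(&t1, NULL, worker, &a) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, worker, &b) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Workers started from zero, and the caller's value is untouched. */
    return (a == 1000 && b == 2500 && DEMO_G(request_count) == 7) ? 0 : 1;
}
```

Calling demo_tls_isolated() from a program built with -pthread should return 0. No locking is needed around the counter because each thread only ever touches its own copy.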
  104380
February 13, 2019 09:26 krakjoe@gmail.com (Joe Watkins)
Morning all,

I'm very pleased to see effort going into this, and the resulting ideas.

I don't have anything to add about the implementation.

Since most people are not interested in ZTS, there aren't going to be many
voices pushing you to actually make changes, so I want to be that voice.

The ZTS build is very commonly used in Windows today, and I'm sure everyone
doing that would appreciate you making these changes as soon as reasonable,
which looks like 7.4. Beyond that, before we can talk about developing JIT
support in Windows, ZTS support must be in place.

Thanks for the effort so far.

Cheers
Joe

  104382
February 13, 2019 09:43 dmitry@zend.com (Dmitry Stogov)
Hi Joe,

On 2/13/19 12:26 PM, Joe Watkins wrote:
> Morning all,
>
> I'm very pleased to see effort going into this, and the resulting ideas.
>
> I don't have anything to add about the implementation.
>
> Since most people are not interested in ZTS, there aren't going to be
> many voices pushing you to actually make changes, so I want to be that
> voice.
>
> The ZTS build is very commonly used in Windows today, and I'm sure
> everyone doing that would appreciate you making these changes as soon as
> reasonable, which looks like 7.4, beyond that before we can talk about
> developing JIT support in Windows, ZTS support must be in place.

There are many things that would be great to get, but with very limited
resources we have to set priorities and select the most desired
functionality to work on first.

Thanks. Dmitry.
  104403
February 14, 2019 11:56 zeev@php.net (Zeev Suraski)
On Wed, Feb 13, 2019 at 11:26 AM Joe Watkins <krakjoe@gmail.com> wrote:

> The ZTS build is very commonly used in Windows today

Any idea why?

Zeev
  104411
February 14, 2019 15:22 rowan.collins@gmail.com (Rowan Collins)
On Thu, 14 Feb 2019 at 11:57, Zeev Suraski <zeev@php.net> wrote:

> On Wed, Feb 13, 2019 at 11:26 AM Joe Watkins <krakjoe@gmail.com> wrote:
>
> > The ZTS build is very commonly used in Windows today
>
> Any idea why?

https://windows.php.net/ currently recommends using an NTS build with
FastCGI, but there is (or was?) also an ISAPI module, i.e. the IIS
equivalent of Apache mod_php.

I think that requires / required ZTS, so that may be where the perception
of "thread-safety is important for Windows users" comes from.

Regards,
--
Rowan Collins
[IMSoP]
  104412
February 14, 2019 15:28 zeev@php.net (Zeev Suraski)
On Thu, Feb 14, 2019 at 5:22 PM Rowan Collins <rowan.collins@gmail.com>
wrote:

> https://windows.php.net/ currently recommends using an NTS build with
> FastCGI, but there is (or was?) also an ISAPI module, i.e. the IIS
> equivalent of Apache mod_php.

Yep, well aware of it, I wrote it ages ago... I believe it's dead, and if
it isn't, it should be...

> I think that requires / required ZTS, so that may be where the perception
> of "thread-safety is important for Windows users" comes from.

It's definitely the original reason for wanting ZTS on Windows (which used
to make sense, until we worked with Microsoft to bring FastCGI to IIS).
But if that's still the only reason, it's a pretty bogus one - there's
virtually nothing but downsides to using the ISAPI module (or ZTS under
Windows in general), at least there aren't any tangible advantages I can
think of. That's why I'm asking...

Zeev
  104413
February 14, 2019 15:39 krakjoe@gmail.com (Joe Watkins)
Packages such as xampp, which are very widely used, bundle a thread safe
interpreter.

It's a fact that ZTS is important on Windows.

Cheers
Joe

  104414
February 14, 2019 15:47 cmbecker69@gmx.de ("Christoph M. Becker")
On 14.02.2019 at 12:56, Zeev Suraski wrote:

> On Wed, Feb 13, 2019 at 11:26 AM Joe Watkins <krakjoe@gmail.com> wrote:
>
>> The ZTS build is very commonly used in Windows today
>
> Any idea why?

windows.php.net:

| With Apache you have to use the Thread Safe (TS) versions of PHP.

--
Christoph M. Becker
  104415
February 14, 2019 15:58 zeev@php.net (Zeev Suraski)
On Thu, Feb 14, 2019 at 5:47 PM Christoph M. Becker <cmbecker69@gmx.de>
wrote:

> windows.php.net:
>
> | With Apache you have to use the Thread Safe (TS) versions of PHP.

That's a great explanation (wasn't aware of it!), but that's actually not
true, it's much better to use FastCGI instead (faster and definitely a lot
more reliable and robust). Looks like XAMPP might be under the same false
impression.

Might be a good opportunity in PHP 8 to change these not-so-healthy
defaults. After all, ISAPI used to be super popular too (under Windows)
until we worked with Microsoft to bring FastCGI into IIS.

Zeev
  104416
February 14, 2019 15:59 rowan.collins@gmail.com (Rowan Collins)
On Thu, 14 Feb 2019 at 15:47, Christoph M. Becker <cmbecker69@gmx.de> wrote:

> windows.php.net:
>
> | With Apache you have to use the Thread Safe (TS) versions of PHP.

Ah, that makes sense; the only Apache MPM supported on Windows is
mpm_winnt, which is thread-based:
http://httpd.apache.org/docs/2.4/mod/mpm_winnt.html

Again, this only makes sense if using server modules (mod_php); anyone
using FastCGI will presumably be unaffected.

All that being said, it would be nice if ZTS became more mainstream, so
more people had access to userland threading / parallel processing
extensions.

Regards,
--
Rowan Collins
[IMSoP]
  104417
February 14, 2019 16:04 zeev@php.net (Zeev Suraski)
On Thu, Feb 14, 2019 at 5:59 PM Rowan Collins <rowan.collins@gmail.com>
wrote:

> All that being said, it would be nice if ZTS became more mainstream, so
> more people had access to userland threading / parallel processing
> extensions.

I think the most promising parallel processing paradigms today are based
on asynchronous IO and not threading. That's generally the direction the
world is going in (Node.js / Swoole / etc.)

Threading (ZTS) actually comes at a fairly high cost, primarily in terms
of performance but also in terms of some implementation complexity, and
has few advantages over async IO.

Zeev
  104418
February 14, 2019 16:13 krakjoe@gmail.com (Joe Watkins)
EXACTLY

Sorry I didn't answer in full, but I've been listening to people say it's
not important for so long, I'm pretty tired of it by now.

We are talking about merging a thing that has the ability to make some
maths faster, at huge cost to the project, in two days I wrote a new
extension called parallel that can make *any* code faster if you have the
cores, which we all do, without complicating anything, and without cost.
Just a bit of thinking necessary.

I don't object to the JIT at all, but without ZTS it's a non-starter for
me. The fact that there are easy improvements that could be done for ZTS
that we seem to be held to ransom for is just awful.

I wish more people would really think ...

https://gist.github.com/krakjoe/254897be71d23b5d5ac2d436f52e8d7d

Cheers
Joe

  104419
February 14, 2019 16:25 zeev@php.net (Zeev Suraski)
On Thu, Feb 14, 2019 at 6:13 PM Joe Watkins <krakjoe@gmail.com> wrote:

> EXACTLY
>
> Sorry I didn't answer in full, but I've been listening to people say its
> not important for so long, I'm pretty tired of it by now.

Joe, all,

Again, ISAPI was super important until one day, it wasn't. If the main
reason ZTS is important is because of a misperception, we need to fix the
misperception instead of just treating it as a God-given commandment.
Perhaps there's a real mainstream use case for ZTS. Perhaps there isn't.
Usage alone of things like mod_php under Windows - which isn't a very good
idea - isn't strong enough an indicator.

I created ZTS mode - taking a lot of trouble to do so, refactoring huge
chunks of the PHP code base at the time, and also wrote the ISAPI module
and IIRC, ported mod_php to Windows (not 100% sure about that last one, it
was 20 years ago). If the technologies that exist today existed back
then - I'm virtually certain I wouldn't have done it. FastCGI/FPM is
superior in every possible way to thread-safe server plugins. Heck, today
it's widely accepted that FPM is even superior to mod_php under Linux in
pretty much every respect. Under Windows the advantages are a lot more
significant, as they're not limited to performance - but also provide much
better robustness/stability.

Zeev
  104420
February 14, 2019 17:27 levim@php.net (Levi Morrison)
To all internals, but especially to Zeev and Dmitry,

Having a JIT which does not support ZTS is a bit short-sighted. Think
about it. You want to add a JIT partly so that PHP will be better at
non-web, CPU intensive tasks. If the JIT is successful, eventually you
will want parallelism in that domain, and if you are going to use
threading then something like ZTS is required.

Additionally, a lot of people are using ZTS builds today. You can
argue all you want, we can't remove it easily. Imagine if I proposed a
breaking change in the language that would affect these same users.
Unless we got a *lot* of value out of it, I don't think it would pass.
And no offense, but in its current state the JIT does not have a lot
of value. It's a promising prototype, but it's not going to help the
vast majority of our users.

So, respectfully, I hope we can quit talking about removing ZTS. It is
required for the present, and it will probably be desirable in the
future. Therefore, *some* mechanism for a thread-safe JIT code
generation is a requirement. It doesn't necessarily have to be the ZTS
we have today, but it needs something.

Regards,

Levi Morrison
  104424
February 14, 2019 22:05 zeev@php.net (Zeev Suraski)
On Thu, Feb 14, 2019 at 7:27 PM Levi Morrison <levim@php.net> wrote:

> To all internals, but especially to Zeev and Dmitry,
>
> Having a JIT which does not support ZTS is a bit short-sighted.

I don't think anybody's advocating for not having ZTS support for JIT. I
am questioning the portrayal of the absence of ZTS support in the current
JIT as an absolute deal breaker. The question is do we need ZTS support
for JIT in February 2019, ~2 years before PHP 8 comes out, or is it OK
that it will be implemented a bit further down the line? Is it a horrific
crime to include an experimental version in 7.4 that won't have ZTS
support?

> Think about it. You want to add a JIT partly so that PHP will be better
> at non-web, CPU intensive tasks. If the JIT is successful, eventually you
> will want parallelism in that domain, and if you are going to use
> threading then something like ZTS is required.

Multithreading isn't the way of choice today for scalable parallelism.
Again, it doesn't mean that ZTS shouldn't be supported - but between Swoole
support and ZTS support - I would rate Swoole as a lot more strategic.

> Additionally, a lot of people are using ZTS builds today.

Which they can continue doing, even if we fast forward to 2020, JIT is in,
and it doesn't support ZTS. It seems as if people think that if JIT
doesn't support ZTS, it means ZTS is no longer viable. That's obviously
not the case. You will be able to continue using PHP just fine with JIT
disabled.

> You can argue all you want, we can't remove it easily. Imagine if I
> proposed a breaking change in the language that would affect these same
> users. Unless we got a *lot* of value out of it, I don't think it would
> pass.

I'm not campaigning for the removal of ZTS; the fact that I wouldn't go
about implementing it from scratch today doesn't mean we need to axe it.
There are two discussions that kind of got intermixed in this thread:

- How high should we prioritize ZTS build support for JIT? I admit I have
  a very hard time accepting that it's as critical as some here portray
  it. That said, Dmitry actually started doing some thinking about how we
  could improve ZTS to make it friendlier to JIT. So I guess the ZTS crowd
  got its way, as it's been de-facto prioritized as high priority,
  although he seems to want to get some help there.
- Separate thread, that spun off from Joe's comment that lots of people
  use ZTS builds under Windows. I asked a simple question - why he thinks
  that is, because to me it sounded anachronistic. While Joe didn't
  respond, others did, suggesting that it's because windows.php.net
  mistakenly tells people that ZTS is the only way to run PHP under Apache
  on Windows, while there's a much better way of doing it that's both
  faster and more reliable (FastCGI). To me, that's something that we
  should tackle, and it's completely independent from the JIT discussion.

> And no offense, but in its current state the JIT does not have a lot
> of value.

None taken. I do think it has a lot of value (covered in the RFC), but as
I've said numerous times in various places - it's not quite the no-brainer
that phpng was (twice the speed, half the memory consumption, no
downsides).

> It's a promising prototype, but it's not going to help the
> vast majority of our users.

That's something I think people don't seem to understand. It's not really
a prototype, and arguably, if you don't like it the way it is - I'm not
sure it would hold much promise from your point of view. It's not a
prototype, but rather the intended base implementation of the feature if
we were to accept it. It's the 3rd attempt at doing this, and I don't
believe there are any intentions to try a 4th (nor should we expect any
radically different results if we were to do a 4th). At this point we
don't believe we'll be able to evolve significantly beyond where we are in
terms of performance, although we can improve things in the way of fixing
bugs, and maybe refactoring certain elements to make it more portable. And
of course there's always room for improvement - but nothing that would
radically change the high level characteristics of this feature. It's not
going to make WordPress - or any other conventional PHP Web app - go 2x
faster in the sense that PHP 7 did.

> So, respectfully, I hope we can quit talking about removing ZTS.

Again, I'm not arguing for the removal of ZTS.

ZTS was born to do one thing - which is allow PHP to run within a
multithreaded Web server. However, it evolved to do something else - which
is allow thread usage from within PHP. These are distinct, and almost
inherently separate use cases. I have nothing against the latter - but I
do think the former is outdated and should be discouraged. Much like we
discontinued the ISAPI module, we may want to discontinue support for
mod_php under Windows and point people to use FastCGI instead (which, in
turn, can be a ZTS build, it's completely orthogonal). Like I mentioned
before, today, mod_php is hard to find in any modern deployment, it's been
pretty much exclusively replaced by fpm and for very good reasons. All
these reasons apply in Windows as well, and they're supplemented by
improved stability and robustness. There's ALWAYS the possibility that I'm
missing something, so I sent a note to one of the XAMPP maintainers to
understand their rationale on this, as well as their take on what we
should consider doing on this front.

I think it is fair to say that in all likelihood, most people who use ZTS
today do so because they think they have to, and not because they are
users of the pthreads extension. It's the former group that I think we
should tackle, in the exact same way that today, not a single person on
the planet is running ISAPI and they've all moved to use FastCGI (or so I
hope, at least, it's been gone for ages). This is something we ought to
consider doing completely independent from this whole JIT discussion.

> It is required for the present, and it will probably be desirable in the
> future.

No argument here.

> Therefore, *some* mechanism for a thread-safe JIT code
> generation is a requirement.

I don't see how one leads to the other. The fact PHP is being used on a
certain platform or in a certain mode does not inherently mean that JIT
must support it, certainly not in this early, pre-approval stage. If we
can indeed do away with the bogus perception that you must use ZTS in
order to use PHP under Apache/Windows, and change XAMPP's deployment model
to use FastCGI (which again we should probably try to do *independently*
from this whole JIT discussion), it would in turn radically reduce the
size of the applicable ZTS install base, and indirectly make this less of
a priority for JIT.

Of course, one thing that can change this whole thing is if we start
shipping internal functions that are implemented in PHP, while relying on
the performance of JIT to make them sufficiently fast. If that happens,
our platform support, as well as ZTS support, must be a lot more
exhaustive than it is today. But we're far from being there at this point.

Zeev
  104431
February 15, 2019 07:26 kontakt@beberlei.de (Benjamin Eberlei)
On Thu, Feb 14, 2019 at 11:05 PM Zeev Suraski <zeev@php.net> wrote:

> On Thu, Feb 14, 2019 at 7:27 PM Levi Morrison <levim@php.net> wrote:
>
> > Think about it. You want to add a JIT partly so that PHP will be better
> > at non-web, CPU intensive tasks. If the JIT is successful, eventually
> > you will want parallelism in that domain, and if you are going to use
> > threading then something like ZTS is required.
>
> Multithreading isn't the way of choice today for scalable parallelism.
> Again, it doesn't mean that ZTS shouldn't be supported - but between
> Swoole support and ZTS support - I would rate Swoole as a lot more
> strategic.

Async I/O is used for scalability in languages with entirely different
architectures than PHP.

Node.js isn't well suited to monolithic applications such as WordPress,
Magento and the vast number of applications that people are building in
PHP right now.

One needs some supporting microservices here and there (to do websockets,
high throughput work with lots of I/O, small daemons, ...), for which
people right now use Go or Node.js next to PHP, Swoole or react-php.

The more monolithic, heavily shared-nothing based use case isn't going
away just because we have JIT and Swoole and now can write our PHP as if
it were Node.js.

If we assume shared-nothing is one of PHP's major unique selling points,
then we want users to do parallel work to improve their code right now.

parallel is a vastly superior threading API to everything that has come
before. How awesome is doing something like this in your WordPress,
Symfony, Magento code:

    $runtime = new \parallel\Runtime();

    $fetchJson = function ($url) {
        return json_decode(file_get_contents($url), true);
    };

    $future1 = $runtime->run($fetchJson, ["http://php.net/releases.json"]);
    $future2 = $runtime->run($fetchJson, ["http://nodejs.org/releases.json"]);

    $runtime->await([$future1, $future2]); // a joined wait is not on parallel yet

    echo $future1->getValue() . $future2->getValue();

This is like the 80% use-case of threading: multiple HTTP requests,
multiple long running SQL queries. An API like parallel would allow each
and every one of us to make controllers faster today without a large
effort.

Swoole is nice, but it will never be the 80% use-case for PHP users.

If we add a JIT, then those non-web use cases which gain the most from it
would still also benefit from threads for parallel I/O.

All async I/O APIs heavily lean on node.js APIs, which is an entirely
different paradigm to program in and not PHP. Plus you have to rewrite all
libraries to support this, instead of being able to re-use the libraries
and style that PHP has championed for all this time.
  104435
February 15, 2019 10:14 pierre.php@gmail.com (Pierre Joye)
Hi Benjamin,

On Fri, Feb 15, 2019, 2:26 PM Benjamin Eberlei <kontakt@beberlei.de> wrote:

> > Async.io is used for scalability in languages with entirely different > architectures than PHP. > > Node.js isn't well suited to monolithic applications such as Wordpress, > Magento and the vast amount of applications that people are building in PHP > right now.
> One needs some supporting microservices here and there (to do websockets, > high throughput work with lots of I/O, small daemons, ...) which right now > people are using Go or Node.js for next to PHP, swoole or react-php. >
This is like the 80% use-case of threading, Multiple HTTP requests,
> multiple long running SQL queries. An API like parallel would allow > each and everyone of us to make controllers faster today without a > large effort. > > Swoole is nice, but it will never be the 80% use-case for PHP users.
All async I/O APIs heavily lean on node.js APIs, which is an entirely
> different paradigm to program on and not PHP. Plus you have to rewrite > all libraries to support this, instead of being able to re-use the > libraries and style that PHP has championed for all this time. >
I think you are right in your analysis. However, I fail to see where you are heading. It is the same as when the Couchbase or MongoDB drivers were created, or OO features were added: 100% of the existing code base was not suited to them. Still, these features opened new doors for PHP as a leading web language. The same applies to async IO and, to some extent, parallelism (in areas other than web servers, though).

And yes, async requires much more effort (in overall application architecture) to port libraries or applications so that they fully use and benefit from async IO.

I am convinced PHP will be able to have these features and remain a leader for a few more years.

PS: I will still use Node, Go or Python as well, as always ;-)

best,
Pierre
  104437
February 15, 2019 11:12 rowan.collins@gmail.com (Rowan Collins)
On Fri, 15 Feb 2019 at 07:26, Benjamin Eberlei <kontakt@beberlei.de> wrote:

> This is like the 80% use-case of threading, Multiple HTTP requests, multiple long running SQL queries. An API like parallel would allow each and everyone of us to make controllers faster today without a large effort.

I think an expanded example would better demonstrate the power of this.

Consider a search API that aggregates other search APIs; for each API, it will need to do several steps:

- Look up code mappings in a DB to convert input into format needed by this API
- Perform an auth request to get a fresh token
- Send the search API call
- Look up more code mappings to convert the response
- Post-process results into a standard form

The challenge is to call as many APIs as possible at once, and get the results to the user. The slow parts of this are mostly I/O bound, but to take advantage of async I/O you need to rewrite not just the I/O parts but the entire process to use some async framework. Maybe there's even parts that are CPU heavy in between the requests - a complex cryptographic algorithm, for instance - that async I/O wouldn't help with at all, but which could happen on separate cores when load was low.

It would be absolutely amazing if you could use something like Joe's new parallel extension to handle this workload: run the handling of each API as a thread, and leave the code inside them entirely unchanged. I realise it's probably not as simple as that in practice, but to me, the ability to adapt existing code is a huge benefit of a straightforward parallel/thread/worker API over one based explicitly on async I/O.

Regards,
--
Rowan Collins
[IMSoP]
  104423
February 14, 2019 17:57 larry@garfieldtech.com (Larry Garfield)
On Thursday, February 14, 2019 9:59:18 AM CST Rowan Collins wrote:
> On Thu, 14 Feb 2019 at 15:47, Christoph M. Becker <cmbecker69@gmx.de> wrote: > > On 14.02.2019 at 12:56, Zeev Suraski wrote: > > > On Wed, Feb 13, 2019 at 11:26 AM Joe Watkins <krakjoe@gmail.com> wrote: > > >> The ZTS build is very commonly used in Windows today > > > > > > Any idea why? > > > > windows.php.net: > > | With Apache you have to use the Thread Safe (TS) versions of PHP. > > Ah, that makes sense; the only Apache MPM supported on Windows is > mpm_winnt, which is thread-based: > http://httpd.apache.org/docs/2.4/mod/mpm_winnt.html > > Again, this only makes sense if using server modules (mod_php); anyone > using FastCGI will presumably be unaffected. > > > All that being said, it would be nice if ZTS became more mainstream, so > more people had access to userland threading / parallel processing > extensions. > > Regards,
Data point: At Platform.sh (web host), we've been running ZTS builds of 7.1, 7.2, and 7.3 exclusively for a while now. We don't even offer non-ZTS versions of those releases. It's been quite solid, and mysteriously even slightly faster than the non-ZTS version of 7.1 on code that wasn't doing anything threaded at all (which surprised me, but hey).

I have to agree with Joe's post yesterday on this front:

https://blog.krakjoe.ninja/2019/02/parallel-php-next-chapter.html

It's not threads that are unsafe for end users; it's badly designed thread APIs that start by pointing a gun at your foot. :-) A well-designed thread-backed concurrency model (goroutines being a good but not the only example) is way better, and at least from a user side I frankly prefer it to async IO. You get many of the same benefits with less need to restructure your code, even with async/await. Plus you can parallelize CPU intensive tasks, something async IO simply cannot do.

I'm not against the efforts to add async IO to PHP, but it's not an either/or with thread-safe code. I'd also love to see more non-IO-bound concurrency added to the language.

--Larry Garfield
  104425
February 14, 2019 22:33 zeev@php.net (Zeev Suraski)
On Thu, Feb 14, 2019 at 7:57 PM Larry Garfield <larry@garfieldtech.com>
wrote:

> Data point: At Platform.sh (web host), we've been running ZTS builds of > 7.1, > 7.2, and 7.3 exclusively for a while now. We don't even offer non-ZTS > versions of those releases.
I presume you haven't been using a threaded Web server module though, right? As I pointed out in my lengthy response to Levi, it's thread-safe Web server plugins that are bad news, not ZTS itself.

> It's been quite solid

It would be solid as long as you don't use a thread-safe Web server plugin. Then, any slight thread-safety issue in any of the underlying libraries could result in your Web server process crashing in its entirety.
> , and mysteriously even > slightly faster than the non-ZTS version of 7.1 on code that wasn't doing > anything threaded at all (which surprised me, but hey). >
That is indeed curious. I'd be interested to follow up with you to better understand how you measured it, as it goes against both my expectations (ZTS code does more compared to non-ZTS code) and my experience. We can maybe take that offline...
> I have to agree with Joe's post yesterday on this front: > > https://blog.krakjoe.ninja/2019/02/parallel-php-next-chapter.html > > It's not threads that are unsafe for end users; it's badly designed thread > APIs that start by pointing a gun at your foot. :-) A well-designed > thread- > backed concurrency model (Go routines being a good but not the only > example) > is way better, and at least from a user side I frankly prefer it to async > IO. > You get much of the same benefits with less need to restructure you're > code, > even with async/await. Plus you can parallelize CPU intensive tasks, > something async IO simply cannot do. >
It's not just a matter of preference. Async IO is significantly more efficient than threads both in terms of speed and memory overhead. It's no coincidence that all high performance Web servers nowadays are async-IO based, and that the most scalable app delivery platforms are also async IO based. In addition to the underlying building blocks being more efficient, the async-IO model, where you have a long running process that is in fact directly responding to HTTP requests - is most likely in itself bringing the biggest performance gains - as effectively you have a 'hot', ready to execute app that can preload all sorts of elements into memory and be ready to go with little to no initialization (a direction we went in with the code preloading feature, but things like Node.js or Swoole go a lot farther than that). All that doesn't come to say we must not implement threads - not at all - just that they're probably not the preferred method of achieving in-request parallelism within the Web environment. Zeev
  104427
February 15, 2019 01:27 larry@garfieldtech.com (Larry Garfield)
On Thursday, February 14, 2019 4:33:27 PM CST Zeev Suraski wrote:
> On Thu, Feb 14, 2019 at 7:57 PM Larry Garfield <larry@garfieldtech.com> > > wrote: > > Data point: At Platform.sh (web host), we've been running ZTS builds of > > 7.1, > > 7.2, and 7.3 exclusively for a while now. We don't even offer non-ZTS > > versions of those releases. > > I presume you haven't been using a threaded Web server module though, right? > As I pointed out in my lengthy response to Levi, it's thread-safe Web > server plugins that are bad news, not ZTS itself. > > It's been quite solid > > > It would be solid as long as you don't use a thread-safe Web server > plugin. Then, any slight thread-safety issue in any of the underlying > libraries could result in your Web server process crashing in its entirety.
Right, we're using typical Nginx/FPM for web requests. ZTS is just there so that CLI users can run their own processes that use pthreads. (I don't know if anyone has done so, but we support it, so yay.)
> > , and mysteriously even > > slightly faster than the non-ZTS version of 7.1 on code that wasn't doing > > anything threaded at all (which surprised me, but hey). > > That is indeed curious. I'd be interested to follow up with you to better > understand how you measured it, as it goes against both my expectations > (ZTS code does more compared to non-ZTS code) and my experience. We can > maybe take that offline...
It's been a while since we ran the tests. I believe it was just for the 7.1 release, and we tested with normal ab or something. Happy to chat off-list if you want but I'm not sure that I can remember much at this point. :-)
> > I have to agree with Joe's post yesterday on this front: > > > > https://blog.krakjoe.ninja/2019/02/parallel-php-next-chapter.html > > > > It's not threads that are unsafe for end users; it's badly designed thread > > APIs that start by pointing a gun at your foot. :-) A well-designed > > thread- > > backed concurrency model (Go routines being a good but not the only > > example) > > is way better, and at least from a user side I frankly prefer it to async > > IO. > > You get much of the same benefits with less need to restructure you're > > code, > > even with async/await. Plus you can parallelize CPU intensive tasks, > > something async IO simply cannot do. > > It's not just a matter of preference. Async IO is significantly more > efficient than threads both in terms of speed and memory overhead. It's no > coincidence that all high performance Web servers nowadays are async-IO > based, and that the most scalable app delivery platforms are also async IO > based. In addition to the underlying building blocks being more efficient, > the async-IO model, where you have a long running process that is in fact > directly responding to HTTP requests - is most likely in itself bringing > the biggest performance gains - as effectively you have a 'hot', ready to > execute app that can preload all sorts of elements into memory and be ready > to go with little to no initialization (a direction we went in with the > code preloading feature, but things like Node.js or Swoole go a lot farther > than that). All that doesn't come to say we must not implement threads - > not at all - just that they're probably not the preferred method of > achieving in-request parallelism within the Web environment. > > Zeev
I'm not saying async IO doesn't have advantages, and having a "hot" running application that persists between requests is something I really want, too. However, async IO by definition doesn't help unless your blocker is IO. Often it is, but often you have a CPU bound problem. Even in PHP. (I used to work on Drupal, remember; it does a LOT of non-IO work...) Async IO isn't going to help you much if IO isn't your bottleneck. I think we're largely on the same page, though: Threads good, threaded webserver host bad. PHP should have good thread support in the future. Whether ZTS specifically counts as "good thread support" at the engine level I have no opinion, as I haven't been anywhere close to the engine to know what I'm talking about. --Larry Garfield
  104428
February 15, 2019 02:33 pierre.php@gmail.com (Pierre Joye)
Good morning,

Again a long reply from me, sorry :)

There are a few things I would like to clarify from my perspective. On
Windows, historically (for years), the only usable webserver was
Apache. And the only way to work with it, for years as well, was
mod_php. Given Apache's design, it requires thread safety. As Zeev
mentioned, ZTS was created to bring a thread safety framework to the
engine and the extensions. It did not get updated for years and relied
on old APIs or ways of doing things. On Windows the performance impact
was even bigger.

With the introduction of thread local storage support
(https://wiki.php.net/rfc/native-tls), things improved drastically. TLS
solved many issues we had with the old ZTS version and also made it
more stable. The stability improvement was due not only to TLS but also
to the great work Anatol put into this RFC by reviewing the entire PHP
code base and extensions (incl. PECL). ZTS mode became quite solid.
From a performance point of view, it matched or beat NTS on Windows,
in some cases by a good margin. I do not have the numbers at hand,
but if someone is interested, looking at the discussions for this RFC
should give them all the details.

All this applied to ISAPI as well for the time it was still part of
the core, only even less stable.

Now, about ZTS usage on Windows being huge. This is correct. Is it the
de facto choice for Windows users? No. It is a communication and
marketing issue. Microsoft introduced FCGI support in IIS a long time
ago, and it became the de facto standard for any kind of PHP (or other)
support with IIS. Application pools and most IIS features are fully
supported with PHP and FCGI.

Many developers use ZTS with Apache thinking it will be better and
faster, or comparable to FPM, because FPM is not supported on
Windows. This is clearly something we need to communicate better,
keeping in mind that FPM (and its key features) cannot be
ported to Windows due to the Windows architecture. However, many
webservers (IIS or others) support similar features natively and with
php-FastCGI.

From a strategy point of view, I agree with Zeev, with a slightly
different approach.

We need thread safety internally for many modern needs (async
processing/IO, for example, uses threads internally). From an end user
perspective, that should be transparent (easier said than done). That
being said, threading in userland is a completely different story.
There are a few great things out there that make this case:

- Joe did a great job with his parallel extension
(https://github.com/krakjoe/parallel).
- AMP has something similar (from a user perspective) with
https://github.com/amphp/parallel.
- Swoole; Zeev mentioned it and I highly recommend looking at it.
- Many other projects are trying to change the way we use PHP, ReactPHP
for example.

You get the idea. There is a need. And I think we should provide the
tools to help them do it nicely and with good performance. ZTS is
not going to help here; it is not adapted to where we have to go (or
should be already). I would rather scratch it and talk with the Swoole,
Joe, AMP or React teams, and take inspiration from other languages, to
see what can be done at the engine level to make these kinds of tasks
possible, portable and fast.

Best,

-- Pierre @pierrejoye | http://www.libgd.org
  104381
February 13, 2019 09:35 dmitry@zend.com (Dmitry Stogov)
Hi Nikita,


On 2/13/19 12:02 PM, Nikita Popov wrote:
> On Wed, Feb 13, 2019 at 9:26 AM Dmitry Stogov <dmitry@zend.com > <mailto:dmitry@zend.com>> wrote: > > Hi, > > > After JIT+ZTS related discussion with Joe and Bob, and some related > analyzes. > > I came to more or less formed design idea and described it at > https://wiki.php.net/zts-improvement > > This is not an RFC and I'm not sure, if I like to implement TSRM > changes myself now. > > > Comments are welcome. > > > Hi Dmitry, > > Thanks for looking into this issue. As a possible alternative I would > like to suggest the use of ZEND_TLS (__thread) for the EG/CG/BG etc > globals on Linux (on Windows this is not possible due to DLL linkage > restrictions).
I played with __thread a long time ago (maybe 10 years), and at that time it didn't work as expected. I can't remember the exact problems. Maybe it caused trouble when used in a DSO PHP build with DSO extensions, maybe the size of the "__thread" segment was limited, maybe both.
> __thread generates very good code (single load over %fs > segment with constant address) if the global is defined and used in an > executable.
I suppose 2 loads:

    movq executor_globals@gottpoff(%rip), %rax
    movq %fs:field_offset(%rax), %rax
> I'm not sure what kind of code it generates when TLS is > declared in an executable and used in a shared object, but as direct > access from extensions to the engine globals shouldn't be common, it's > probably okay even if it uses __tls_get_addr.
The main problem with "__thread" might be portability. I specifically thought about keeping the TSRM layer with the same API to avoid portability issues, but if "__thread" works fine, we definitely should use it.

On the other hand, I'm not sure who uses the ZTS build today and whether they need performance.

Thanks. Dmitry.
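For readers who have not used "__thread", the following is a minimal sketch of the access pattern under discussion, not PHP's actual code. The struct `demo_globals`, the macro `DEMO_G()` and the function `demo_run()` are hypothetical names invented for illustration; only `__thread` (the GCC/Clang storage-class keyword) and the pthread calls are real. Which instruction sequence the compiler emits for the access depends on the TLS model and on whether the code is built position-independent, so it is worth inspecting the generated assembly (for example with `-S`) rather than relying on this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a globals struct such as executor_globals. */
typedef struct {
    int  error_level;
    long counter;
} demo_globals;

/* One zero-initialized instance per thread (GCC/Clang __thread keyword). */
static __thread demo_globals demo_tls;

/* Accessor in the spirit of EG(): plain field access, no table lookup. */
#define DEMO_G(field) (demo_tls.field)

/* Increments this thread's counter n times and reports the result. */
static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        DEMO_G(counter)++;              /* touches only this thread's copy */
    }
    *(long *)arg = DEMO_G(counter);     /* write the thread-local total back */
    return NULL;
}

/* Runs two workers and returns the calling thread's own counter afterwards. */
long demo_run(long *a, long *b)
{
    pthread_t ta, tb;
    int rc;

    DEMO_G(counter) = 7;                /* the caller's copy, untouched by workers */
    rc = pthread_create(&ta, NULL, worker, a);
    assert(rc == 0);
    rc = pthread_create(&tb, NULL, worker, b);
    assert(rc == 0);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return DEMO_G(counter);
}
```

Each worker starts from its own zeroed copy, so the two totals do not interfere with each other or with the caller's value. The sketch keeps definition and use in one binary; the cross-DSO case raised in this thread is exactly what it does not cover.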
> > Nikita
  104422
February 14, 2019 17:55 ab@php.net (Anatol Belski)
Hi Nikita,

> -----Original Message----- > From: Nikita Popov ppv@gmail.com> > Sent: Wednesday, February 13, 2019 1:02 AM > To: Dmitry Stogov <dmitry@zend.com> > Cc: Joe Watkins <krakjoe@gmail.com>; Bob Weinand <bwoebi@php.net>; > Nikita Popov <nikic@php.net>; Anatol Belski (ab@php.net) <ab@php.net>; > zeev@php.net; PHP internals <internals@lists.php.net> > Subject: Re: ZTS improvement idea > > On Wed, Feb 13, 2019 at 9:26 AM Dmitry Stogov <dmitry@zend.com > <mailto:dmitry@zend.com> > wrote: > > > Hi, > > > > > After JIT+ZTS related discussion with Joe and Bob, and some related > analyzes. > > I came to more or less formed design idea and described it at > https://wiki.php.net/zts-improvement > > This is not an RFC and I'm not sure, if I like to implement TSRM > changes myself now. > > > > > Comments are welcome. > > > Hi Dmitry, > > Thanks for looking into this issue. As a possible alternative I would like to > suggest the use of ZEND_TLS (__thread) for the EG/CG/BG etc globals on > Linux (on Windows this is not possible due to DLL linkage restrictions). > __thread generates very good code (single load over %fs segment with > constant address) if the global is defined and used in an executable. I'm not > sure what kind of code it generates when TLS is declared in an executable > and used in a shared object, but as direct access from extensions to the > engine globals shouldn't be common, it's probably okay even if it uses > __tls_get_addr. > TLS data available across shared objects is a GNU extension and AFAIK there's a lot of black magic behind it. Thread local storage should be indeed local to some scope, be it a function or a binary unit, as per design. Like for C++11 as well, it's thread_local we currently use. It'd hurt the portability and likely introduce issues in the future, as it might affect any non GNU systems which we rarely test. Otherwise, of course it would be easy to say, we add ZEND_TLS to the definition, and be good :)
Thanks Anatol
  104430
February 15, 2019 07:21 dmitry@zend.com (Dmitry Stogov)
On 2/14/19 8:55 PM, Anatol Belski wrote:
> Hi Nikita, > >> -----Original Message----- >> From: Nikita Popov ppv@gmail.com> >> Sent: Wednesday, February 13, 2019 1:02 AM >> To: Dmitry Stogov <dmitry@zend.com> >> Cc: Joe Watkins <krakjoe@gmail.com>; Bob Weinand <bwoebi@php.net>; >> Nikita Popov <nikic@php.net>; Anatol Belski (ab@php.net) <ab@php.net>; >> zeev@php.net; PHP internals <internals@lists.php.net> >> Subject: Re: ZTS improvement idea >> >> On Wed, Feb 13, 2019 at 9:26 AM Dmitry Stogov <dmitry@zend.com >> <mailto:dmitry@zend.com> > wrote: >> >> >> Hi, >> >> >> >> >> After JIT+ZTS related discussion with Joe and Bob, and some related >> analyzes. >> >> I came to more or less formed design idea and described it at >> https://wiki.php.net/zts-improvement >> >> This is not an RFC and I'm not sure, if I like to implement TSRM >> changes myself now. >> >> >> >> >> Comments are welcome. >> >> >> Hi Dmitry, >> >> Thanks for looking into this issue. As a possible alternative I would like to >> suggest the use of ZEND_TLS (__thread) for the EG/CG/BG etc globals on >> Linux (on Windows this is not possible due to DLL linkage restrictions). >> __thread generates very good code (single load over %fs segment with >> constant address) if the global is defined and used in an executable. I'm not >> sure what kind of code it generates when TLS is declared in an executable >> and used in a shared object, but as direct access from extensions to the >> engine globals shouldn't be common, it's probably okay even if it uses >> __tls_get_addr. >> > TLS data available across shared objects is a GNU extension and AFAIK there's a lot of black magic behind it. Thread local storage should be indeed local to some scope, be it a function or a binary unit, as per design. Like for C++11 as well, it's thread_local we currently use. It'd hurt the portability and likely introduce issues in the future, as it might affect any non GNU systems which we rarely test. 
Otherwise, of course it would be easy to say, we add ZEND_TLS to the definition, and be good :)
In short, if we make executor_globals "__thread", we will get trouble accessing it from DSO extensions. Right?

Thanks. Dmitry.
> > Thanks > > Anatol >
  104443
February 16, 2019 06:29 ab@php.net (Anatol Belski)
> -----Original Message----- > From: Dmitry Stogov <dmitry@zend.com> > Sent: Thursday, February 14, 2019 11:21 PM > To: Anatol Belski <ab@php.net>; Nikita Popov ppv@gmail.com> > Cc: Joe Watkins <krakjoe@gmail.com>; Bob Weinand <bwoebi@php.net>; > Nikita Popov <nikic@php.net>; zeev@php.net; PHP internals > <internals@lists.php.net> > Subject: Re: ZTS improvement idea > > > > On 2/14/19 8:55 PM, Anatol Belski wrote: > > Hi Nikita, > > > >> -----Original Message----- > >> From: Nikita Popov ppv@gmail.com> > >> Sent: Wednesday, February 13, 2019 1:02 AM > >> To: Dmitry Stogov <dmitry@zend.com> > >> Cc: Joe Watkins <krakjoe@gmail.com>; Bob Weinand > <bwoebi@php.net>; > >> Nikita Popov <nikic@php.net>; Anatol Belski (ab@php.net) > >> <ab@php.net>; zeev@php.net; PHP internals <internals@lists.php.net> > >> Subject: Re: ZTS improvement idea > >> > >> On Wed, Feb 13, 2019 at 9:26 AM Dmitry Stogov <dmitry@zend.com > >> <mailto:dmitry@zend.com> > wrote: > >> > >> > >> Hi, > >> > >> > >> > >> > >> After JIT+ZTS related discussion with Joe and Bob, and some related > >> analyzes. > >> > >> I came to more or less formed design idea and described it at > >> https://wiki.php.net/zts-improvement > >> > >> This is not an RFC and I'm not sure, if I like to implement TSRM > >> changes myself now. > >> > >> > >> > >> > >> Comments are welcome. > >> > >> > >> Hi Dmitry, > >> > >> Thanks for looking into this issue. As a possible alternative I would > >> like to suggest the use of ZEND_TLS (__thread) for the EG/CG/BG etc > >> globals on Linux (on Windows this is not possible due to DLL linkage > restrictions). > >> __thread generates very good code (single load over %fs segment with > >> constant address) if the global is defined and used in an executable. 
> >> I'm not sure what kind of code it generates when TLS is declared in > >> an executable and used in a shared object, but as direct access from > >> extensions to the engine globals shouldn't be common, it's probably > >> okay even if it uses __tls_get_addr. > >> > > TLS data available across shared objects is a GNU extension and AFAIK > > there's a lot of black magic behind it. Thread local storage should be > > indeed local to some scope, be it a function or a binary unit, as per > > design. Like for C++11 as well, it's thread_local we currently use. > > It'd hurt the portability and likely introduce issues in the future, > > as it might affect any non GNU systems which we rarely test. > > Otherwise, of course it would be easy to say, we add ZEND_TLS to the > > definition, and be good :) > > In two words, if we make executor_glabal to be "__thread", we will get > troubles accessing it from DSO extensions. Right? > > Thanks. Dmitry. > Exactly. Plus, it might introduce issues with C++ compatibility.
Thanks Anatol
  104421
February 14, 2019 17:44 ab@php.net (Anatol Belski)
Hi Dmitry,

> -----Original Message-----
> From: Dmitry Stogov <dmitry@zend.com>
> Sent: Wednesday, February 13, 2019 12:26 AM
> To: Joe Watkins <krakjoe@gmail.com>; Bob Weinand <bwoebi@php.net>;
> Nikita Popov <nikic@php.net>; Anatol Belski (ab@php.net) <ab@php.net>;
> zeev@php.net
> Cc: PHP internals <internals@lists.php.net>
> Subject: ZTS improvement idea
>
> Hi,
>
> After JIT+ZTS related discussion with Joe and Bob, and some related
> analyzes.
>
> I came to more or less formed design idea and described it at
> https://wiki.php.net/zts-improvement
>
I thought about it as well. The reason for the additional dereference levels is probably that every globals structure has its own size. That way, it needs to go on the heap. What we could indeed do is handle some specific known structures in a different way. That would be EG and the others that belong to the very core and are always available. Other globals, especially from extensions that can be built shared, would probably still be handled the old way. Maybe speeding up the very core first would be a good start. I wonder which particular data structures and mechanisms you had in mind.
> This is not an RFC and I'm not sure, if I like to implement TSRM changes
> myself now.
>
Certainly not an RFC. I'm short of time as well, perhaps it will change in a couple of months.
Thanks Anatol
  104429
February 15, 2019 07:12 dmitry@zend.com (Dmitry Stogov)
Hi Anatol,

On 2/14/19 8:44 PM, Anatol Belski wrote:
> Hi Dmitry, > >> -----Original Message----- >> From: Dmitry Stogov <dmitry@zend.com> >> Sent: Wednesday, February 13, 2019 12:26 AM >> To: Joe Watkins <krakjoe@gmail.com>; Bob Weinand <bwoebi@php.net>; >> Nikita Popov <nikic@php.net>; Anatol Belski (ab@php.net) <ab@php.net>; >> zeev@php.net >> Cc: PHP internals <internals@lists.php.net> >> Subject: ZTS improvement idea >> >> Hi, >> >> >> >> >> After JIT+ZTS related discussion with Joe and Bob, and some related >> analyzes. >> >> I came to more or less formed design idea and described it at >> https://wiki.php.net/zts-improvement >> > I thought about it as well. The reason for the additional dereference > levels is probably ,that every globals structure has its own size. > That way, it needs to go on the heap.
Not necessarily. If all the structures are known at MINIT time, we may realloc() the whole flattened tsrm_tls_entry and then access data faster. It may be a problem with dl(), but that must already be problematic in a ZTS build.
> What we indeed could do were handling some specific known structures > a different way. It'd be like EG and others, that belong to the very > core and are always available. Other globals, especially from extensions > that can be built shared, would be probably still handled the old way. > Maybe it would be a good start to speedup the very core as first. I'd > wonder which particular data structures and mechanism you had in mind.
In https://wiki.php.net/zts-improvement I proposed:

- make "executor_globals_id" a constant (it's quite easy to do).
- make all "...globals_id" keep offsets instead of indexes (this is a bit more complex and requires changes in the TSRM implementation).

Thanks. Dmitry.
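The difference between ids used as indexes and ids used as offsets can be sketched roughly as follows. This is a simplified model under stated assumptions, not the TSRM implementation: `demo_eg`, `demo_cg`, `indexed_entry`, `flat_entry`, `BY_INDEX()`, `BY_OFFSET()`, `flat_reserve()` and the 16-byte alignment are all hypothetical names and choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_ALIGN 16  /* assumed alignment for every globals struct */

/* Hypothetical globals structs standing in for executor/compiler globals. */
typedef struct { int error_level; long ticks; } demo_eg;
typedef struct { int in_compilation; } demo_cg;

/* Index scheme: a per-thread table of pointers; the id selects a slot,
 * so every access loads the slot pointer before reaching the field. */
typedef struct { void **storage; int count; } indexed_entry;
#define BY_INDEX(entry, id, type) ((type *)(entry)->storage[(id) - 1])

/* Offset scheme: one flat block per thread; the id is a byte offset,
 * so the field address is the block base plus two constants. */
typedef struct { char *block; size_t size; } flat_entry;
#define BY_OFFSET(entry, off, type) ((type *)((entry)->block + (off)))

/* Reserves aligned room for one struct in the flat layout; returns its offset. */
static size_t flat_reserve(size_t *total, size_t size)
{
    size_t off = (*total + DEMO_ALIGN - 1) & ~(size_t)(DEMO_ALIGN - 1);
    *total = off + size;
    return off;
}

/* Stores a value through the index scheme and reads it back. */
long demo_via_index(long value)
{
    demo_eg eg = {0};
    demo_cg cg = {0};
    void *slots[2] = { &eg, &cg };
    indexed_entry entry = { slots, 2 };
    const int eg_id = 1, cg_id = 2;      /* ids handed out at startup */

    BY_INDEX(&entry, eg_id, demo_eg)->ticks = value;
    BY_INDEX(&entry, cg_id, demo_cg)->in_compilation = 1;
    return BY_INDEX(&entry, eg_id, demo_eg)->ticks;
}

/* Stores a value through the offset scheme and reads it back. */
long demo_via_offset(long value)
{
    size_t total = 0;
    const size_t eg_off = flat_reserve(&total, sizeof(demo_eg)); /* first: 0 */
    const size_t cg_off = flat_reserve(&total, sizeof(demo_cg));
    flat_entry entry = { calloc(1, total), total };
    long out;

    assert(entry.block != NULL);
    BY_OFFSET(&entry, eg_off, demo_eg)->ticks = value;
    BY_OFFSET(&entry, cg_off, demo_cg)->in_compilation = 1;
    out = BY_OFFSET(&entry, eg_off, demo_eg)->ticks;
    free(entry.block);
    return out;
}
```

In this model the first struct reserved always lands at offset 0, which is one way to read the "constant id" part of the proposal: a block that is always registered first can be addressed without consulting any runtime id at all.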
> > >> This is not an RFC and I'm not sure, if I like to implement TSRM changes >> myself now. >> > Certainly not an RFC. I'm short of time as well, perhaps it will change in a couple of months. > > Thanks > > Anatol >
  104444
February 16, 2019 06:51 ab@php.net (Anatol Belski)
Hi Dmitry,

> > I thought about it as well. The reason for the additional dereference
> > levels is probably ,that every globals structure has its own size.
> > That way, it needs to go on the heap.
>
> Not necessary. In case all the structures are known at MINIT time, we may
> realloc()-ate the whole flattened tsrm_tls_entry and then access data faster.
>
> It may be a problem with dl(), but it must already be problematic in ZTS build.
>
Perhaps an idea for this could be to generate a special function for every extension to export, like get_module_globals_size(). It could probably be integrated seamlessly into ZEND_DECLARE_MODULE_GLOBALS(), so it would require no change to existing code. Then on module load we could realloc() the flattened structure on demand. It would still be on the heap, but it would be a contiguous chunk of memory.
> > What we indeed could do were handling some specific known structures a
> > different way. It'd be like EG and others, that belong to the very
> > core and are always available. Other globals, especially from
> > extensions that can be built shared, would be probably still handled the old way.
> > Maybe it would be a good start to speedup the very core as first. I'd
> > wonder which particular data structures and mechanism you had in mind.
>
> In https://wiki.php.net/zts-improvement I proposed:
>
> - make "executor_globals_id" to be constant (it's quite easy to do).
> - make all "...globals_id" to keep offsets instead if indexes (this is a bit more
>   complex and require changes in TSRM implementation).

Yeah, gotcha. The very core globals should be easy to "prefill" in the flattened structure, as their sizes are known at compile time. All the others could be appended to them, perhaps the way I've suggested above, or another way if a better one is found.

Thanks

Anatol
  104445
February 16, 2019 11:12 nikita.ppv@gmail.com (Nikita Popov)
On Sat, Feb 16, 2019 at 7:51 AM Anatol Belski <ab@php.net> wrote:

> Hi Dmitry,
>
> > > I thought about it as well. The reason for the additional dereference
> > > levels is probably, that every globals structure has its own size.
> > > That way, it needs to go on the heap.
> >
> > Not necessary. In case all the structures are known at MINIT time, we may
> > realloc()-ate the whole flattened tsrm_tls_entry and then access data faster.
> >
> > It may be a problem with dl(), but it must already be problematic in ZTS build.
>
> Perhaps an idea for this could be to generate a special function for every
> extension to be exported, like get_module_globals_size(). It could be
> probably seamlessly integrated into ZEND_DECLARE_MODULE_GLOBALS(), so
> require no change to the existing code. Then on module load we could
> realloc() the flattened structure on demand. Still it'd be on heap, but it
> would be a contiguous chunk of memory.
>
> > > What we indeed could do were handling some specific known structures a
> > > different way. It'd be like EG and others, that belong to the very
> > > core and are always available. Other globals, especially from
> > > extensions that can be built shared, would be probably still handled the old way.
> > > Maybe it would be a good start to speedup the very core as first. I'd
> > > wonder which particular data structures and mechanism you had in mind.
> >
> > In https://wiki.php.net/zts-improvement I proposed:
> >
> > - make "executor_globals_id" to be constant (it's quite easy to do).
> > - make all "...globals_id" to keep offsets instead if indexes (this is a bit more
> >   complex and require changes in TSRM implementation).
>
> Yeah, gotcha. The very core globals should be easy to "prefill" for the
> flattened structure as the sizes are known at the compile time. All the
> others could be appended to them, perhaps the way I've suggested above or
> another if a better one is found.
I think we need to distinguish two cases:

1. Globals that are local to a DSO. The majority of globals in extensions is of this kind. While it is currently common to declare these globals in an exported header, they really shouldn't be. We should move these towards ZEND_TLS (__thread). This should be fine on all platforms, including Windows. In unusual cases where globals need to be accessed outside the extension, getters/setters can be provided (for parts of the structure, or if necessary the whole structure).

   Unfortunately some of our code is currently written around the assumption that TSRM globals are used, e.g. the STD_INI_ENTRY macros.

2. Globals that are accessed across DSOs. These are the core globals EG, CG, BG, etc. On platforms that support it (e.g. Linux) this should use ZEND_TLS as well. On platforms that don't, we can use the proposed mechanism. We can hardcode the supported globals here, as we don't need to support additional extension globals.

It makes no sense to pay the DSO TLS overhead for the TSRM cache variable, and then add the TSRM indirection overhead on top of that, if we can avoid it.

Nikita
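For case 1, a rough sketch of DSO-local globals with accessors might look like the following. The macro `DEMO_TLS` only stands in for ZEND_TLS (assumed here to be `static __thread`, as with GCC/Clang; MSVC would need `__declspec(thread)`), and the structure and accessor names are hypothetical.

```c
#include <assert.h>

/* Assumption: stand-in for ZEND_TLS on GCC/Clang. */
#define DEMO_TLS static __thread

/* Hypothetical extension globals, private to this translation unit. */
typedef struct {
    long backtrack_limit;
    long recursion_limit;
} demo_ext_globals;

/* One copy per thread, zero-initialized, not visible outside this file. */
DEMO_TLS demo_ext_globals demo_globals;

/* Accessors for the unusual case where other code needs a value.
 * Only selected fields are exposed, not the whole structure. */
long demo_ext_get_backtrack_limit(void)
{
    return demo_globals.backtrack_limit;
}

void demo_ext_set_backtrack_limit(long limit)
{
    assert(limit >= 0);
    demo_globals.backtrack_limit = limit;
}
```

Since the variable is static, the compiler can resolve accesses inside the extension without going through an exported symbol; anything outside has to go through the accessors.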
  104465
February 19, 2019 00:28 ab@php.net (Anatol Belski)
Hi Nikita,

> I think we need to distinguish two cases:
>
> 1. Globals that are local to a DSO. The majority of globals in extensions is of this
> kind. While it is currently common to declare these globals in an exported
> header, they really shouldn't be. We should move these towards ZEND_TLS
> (__thread). This should be fine on all platforms, including Windows. In unusual
> cases where globals need to be accessed outside the extension,
> getters/setters can be provided (for parts of the structure, or if necessary the
> whole structure).
>
> Unfortunately some of our code is currently written around the assumption
> that TSRM globals are used, e.g. the STD_INI_ENTRY macros.

This is correct, and there is more to it. The current expectation is that globals are exported. Take PCRE as an example: while porting to PCRE2, I also used some ZEND_TLS variables, which are still integrated into GINIT. STD_INI_ENTRY, which uses the globals as storage, is useful if one expects the values to be changed or read from somewhere else as well. In ext/pcre all of these use cases are present: real exported globals, local globals, and globals used to store INI values.
Currently, if a module only needs local globals, it would still declare a globals structure to be used with GINIT. Thus, we might need another API like GINIT_LOCAL. To use ZEND_TLS, a structure is not actually necessary; variables can just be put into a source file. A GINIT_LOCAL API would therefore not have to pass any globals. All it would need is to be called once per thread, the same way the current GINIT is.
> 2. Globals that are accessed across DSOs. These are the core globals EG, CG,
> BG, etc. On platforms that support it (e.g. Linux) this should use ZEND_TLS as
> well. On platforms that don't, we can use the proposed mechanism. We can
> hardcode the supported globals here, as we don't need to support additional
> extension globals.
>
> It makes no sense to pay the DSO TLS overhead for the TSRM cache variable,
> and then add the TSRM indirection overhead on top of that, if we can avoid it.

The author of the original RFC, Arnaud Le Blanc, did some more research: https://wiki.php.net/rfc/tls. His latest patch also relied on offsets, but it didn't flatten the data structures as Dmitry proposed, and it didn't remove the additional TSRMLS_* args, as was done for 7.0 by the other RFC. Arnaud's patch also relied on something close to what we currently call ZEND_TLS, while still keeping the old structures.
I'd anticipate that we would end up with two completely different implementations. For example, ZEND_TLS is indeed "static __thread"; exporting it would require differentiating between extern and static, and handling the visibility. As all the globals are currently exported and we don't know how they're used in external modules, there might be BC issues if some are no longer exported. Preferably the current mechanism would be improved first, as suggested; then one could think about complicating things more ;)

There's btw a more detailed doc about TLS in ELF binaries, https://www.akkadia.org/drepper/tls.pdf; we might not be far from that if the offset access is implemented. TLS internally is not much different from what the implementation in PHP does: there are tables per thread which are accessed by some index. Depending on how much improvement the offset access brings, it might be acceptable enough, instead of having multiple implementations and all the maintenance/QA effort.

Regards

Anatol
  104577
March 4, 2019 14:00 ab@php.net (Anatol Belski)
Hi Dmitry,

> -----Original Message-----
> From: Dmitry Stogov <dmitry@zend.com>
> Sent: Friday, February 15, 2019 8:12 AM
> To: Anatol Belski <ab@php.net>; Joe Watkins <krakjoe@gmail.com>; Bob
> Weinand <bwoebi@php.net>; Nikita Popov <nikic@php.net>; zeev@php.net
> Cc: PHP internals <internals@lists.php.net>
> Subject: Re: ZTS improvement idea
>
> Hi Anatol,
>
> On 2/14/19 8:44 PM, Anatol Belski wrote:
> > Hi Dmitry,
> >
> >> -----Original Message-----
> >> From: Dmitry Stogov <dmitry@zend.com>
> >> Sent: Wednesday, February 13, 2019 12:26 AM
> >> To: Joe Watkins <krakjoe@gmail.com>; Bob Weinand <bwoebi@php.net>;
> >> Nikita Popov <nikic@php.net>; Anatol Belski (ab@php.net)
> >> <ab@php.net>; zeev@php.net
> >> Cc: PHP internals <internals@lists.php.net>
> >> Subject: ZTS improvement idea
> >>
> >> Hi,
> >>
> >> After JIT+ZTS related discussion with Joe and Bob, and some related
> >> analyzes.
> >>
> >> I came to more or less formed design idea and described it at
> >> https://wiki.php.net/zts-improvement
> >>
> > I thought about it as well. The reason for the additional dereference
> > levels is probably, that every globals structure has its own size.
> > That way, it needs to go on the heap.
>
> Not necessary. In case all the structures are known at MINIT time, we may
> realloc()-ate the whole flattened tsrm_tls_entry and then access data faster.
>
> It may be a problem with dl(), but it must already be problematic in ZTS build.
I did some research on this. Depending on how far we want to go, there seems to be an issue with the current approach. The current flow is as follows:

- the core allocates a tsrm_tls_entry
- the core puts it into TLS
- the TSRMLS_CACHE pointer gets updated in the core
- any shared ext updates its TSRMLS_CACHE
- all globals are accessed through that same TSRMLS_CACHE pointer, regardless of whether the ext is shared or static

The TSRMLS_CACHE pointer is updated by a tsrm_get_ls_cache() call once, in GINIT or similar. That's a real function, and has to be, as TSRMLS_CACHE can't be exported. Exts that don't implement ZEND_ENABLE_STATIC_TSRMLS_CACHE would make that call on every global access.

The actual issue: assuming we reallocate, the TSRMLS_CACHE pointer would change and we would need to overwrite it in TLS. That's ok for all the static stuff, but dynamic modules would still have the old TSRMLS_CACHE. Dynamic modules would need a way to know that the TLS pointer has changed. There might be a solution, but it requires a more extensive refactoring. For example, a callback might be included in the module entry, so that the core could go through all the modules and update TSRMLS_CACHE automatically once a reallocation happens. That might work, but I need to experiment more to have a proof.

If we don't change the existing mechanics, the structure could still be flattened, but TSRMLS_CACHE would have to stay the same once allocated. The flat structure would then live within TSRMLS_CACHE (say, by turning the current storage array into a contiguous memory chunk) and would need one additional indirection level to be accessed. Perhaps a more serious refactoring would be preferable. Perhaps I'm overlooking something; please let me know.

Regards

Anatol
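The callback idea could be sketched like this. It is a single-threaded illustration under stated assumptions: plain static pointers stand in for the per-module TSRMLS_CACHE copies (which would really be thread-local), and every name (`cache_update_fn`, `demo_register`, `demo_grow`, the `demo_*_cache` variables) is hypothetical rather than part of any existing module entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_MODULES 8

/* Hypothetical callback a module would expose through its module entry. */
typedef void (*cache_update_fn)(void *new_base);

static cache_update_fn demo_callbacks[DEMO_MAX_MODULES];
static int demo_callback_count;

/* Each side keeps its own cached copy of the base pointer. */
static void *demo_core_cache;   /* the core's copy */
static void *demo_ext_cache;    /* a shared extension's private copy */

static void demo_core_update(void *base) { demo_core_cache = base; }
static void demo_ext_update(void *base)  { demo_ext_cache = base; }

static void demo_register(cache_update_fn fn)
{
    assert(demo_callback_count < DEMO_MAX_MODULES);
    demo_callbacks[demo_callback_count++] = fn;
}

/* Grow the block; on success, push the (possibly moved) base address to
 * every registered module so that no stale cached pointer survives. */
static void *demo_grow(void *base, size_t new_size)
{
    void *grown = realloc(base, new_size);
    int i;

    if (grown == NULL) {
        return NULL;            /* old block and cached pointers stay valid */
    }
    for (i = 0; i < demo_callback_count; i++) {
        demo_callbacks[i](grown);
    }
    return grown;
}
```

Without the loop in `demo_grow()`, the extension's copy would keep pointing at the old block after a move, which is the stale-pointer problem described above. The sketch leaves open how the update would be delivered to each thread's own copy.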