Adding a way to disable the stat cache

  115912
September 2, 2021 08:27 kevin@ie.suberic.net (Kevin Lyda)
PHP has a stat cache which is... unfortunate. As noted in this bug from
2004[0] it causes a number of issues for PHP users and is irrelevant
in modern operating systems. Heck, it's not even useful in OS's people
might consider ancient at this point.

I have a PR[1] to fix this, but had not noticed it had been referred to
this mailing list.  I'll need fix the PR at this point as the code has
drifted, but before I do, can whatever discussion happen so that the PR
will be accepted?  The change I've made will allow people to disable the
cache so that it won't cause errors and it leaves the current broken
behaviour in place.  So there's no real change to the user experience
except they now have a tool to disable buggy behaviour.  I feel that
this is the least intrusive way to fix this bug.

In the discussion on the bug there was resistance to getting rid of the
cache so this was the solution proposed in 2007 and seemed reasonable.
The default behaviour remains but can be turned off by people who want
it off - and there are a number of people who would like it off.

Is this an acceptable solution? Would this functionality be accepted
either via my PR or another.

Kevin

[0]: https://bugs.php.net/bug.php?id=28790
[1]: https://github.com/php/php-src/pull/5894

-- 
Kevin Lyda
Galway, Ireland
  115913
September 2, 2021 08:39 kjarli@gmail.com (Lynn)
On Thu, Sep 2, 2021 at 10:27 AM Kevin Lyda <kevin@ie.suberic.net> wrote:

> PHP has a stat cache which is... unfortunate. As noted in this bug from > 2004[0] it causes a number of issues for PHP users and is irrelevant > in modern operating systems. Heck, it's not even useful in OS's people > might consider ancient at this point. >
I have had to add clearstatcache in the past, and probably having a bunch of places where this should be added and isn't yet causing subtle and hard to find bugs, I welcome this option.
  115914
September 2, 2021 08:48 phpmailinglists@gmail.com (Peter Bowyer)
On Thu, 2 Sept 2021 at 09:27, Kevin Lyda <kevin@ie.suberic.net> wrote:

> The change I've made will allow people to disable the > cache so that it won't cause errors and it leaves the current broken > behaviour in place. >
Is there any good reason not to remove it completely? I've been bitten by the stat cache before and would be pleased to see it gone for good. Peter
  115915
September 2, 2021 11:15 kevin@lyda.ie (Kevin Lyda)
Just as an aside, Lynn and Peter are two more people who have had this
issue - in addition to the folks mentioned in the various bugs about this.
Really would like clear guidance on what's needed to move the PR on
once I resolve the current merge conflict.

On Thu, Sep 2, 2021 at 9:49 AM Peter Bowyer <phpmailinglists@gmail.com> wrote:
> On Thu, 2 Sept 2021 at 09:27, Kevin Lyda <kevin@ie.suberic.net> wrote: > > The change I've made will allow people to disable the > > cache so that it won't cause errors and it leaves the current broken > > behaviour in place. > > Is there any good reason not to remove it completely?
Removing it completely would be ideal, however a number of people objected in the linked bug. And while it's not needed in modern Unix operating systems, it's not clear if Windows might benefit from this. Personally I'd argue that due to its buggy behaviour it should be removed. There's no way to make this work correctly in userspace so it's not a good place for it. OS and filesystem developers implemented caching on those levels because they have the state to properly maintain a cache. However if you use the filesystem a certain way and you pretend other processes can't modify a file out from under you, the current caching system could provide a performance benefit on OSs and filesystems that don't offer this sort of caching. And maybe someone depends on this. That's the best argument I can make for retaining it.
> I've been bitten by the stat cache before and would be pleased to see it > gone for good.
As have I. And I had to write some goofy sed scripts to insert clearstatcache() in order to stop rare bugs from popping up. Which means I had to experience it at least twice. Kevin
  115926
September 3, 2021 13:34 cschneid@cschneid.com (Christian Schneider)
Am 02.09.2021 um 13:15 schrieb Kevin Lyda <kevin@lyda.ie>:
> Removing it completely would be ideal, however a number of people objected > in the linked bug. And while it's not needed in modern Unix operating > systems, it's not clear if Windows might benefit from this.
If I remember correctly it was about reducing the number of system calls. Is this no issue any more? Has a quick benchmark been done to see the positive / negative impact of the stat cache for a typical application? Just curious, - Chris
  115930
September 3, 2021 14:07 kevin@lyda.ie (Kevin Lyda)
On Fri, Sep 3, 2021 at 2:34 PM Christian Schneider
<cschneid@cschneid.com> wrote:
> If I remember correctly it was about reducing the number of system calls. Is this no issue any more? > Has a quick benchmark been done to see the positive / negative impact of the stat cache for a typical application?
In the lifespan of php it really wasn't an issue unless someone was doing something that wasn't wise - I can't think of a single reason to stat a file in a tight loop. However more importantly the current behaviour returns bad data for perfectly correct programs. So for example on a unix box... Now this is a silly toy, but imagine using is_file to see if a graphics file exists, running an image processing program on it to modify it, and then using a stat call to get the file length to populate the Content-Length field - it will almost certainly be wrong. Personally I value correctness first and then performance is a priority further down that list of attributes (security and readability would also be higher priorities). However as the current behaviour has existed for several decades, this change makes sure the incorrect historical behaviour is the default. Kevin
  115932
September 3, 2021 14:52 cschneid@cschneid.com (Christian Schneider)
Am 03.09.2021 um 16:07 schrieb Kevin Lyda <kevin@lyda.ie>:
> On Fri, Sep 3, 2021 at 2:34 PM Christian Schneider > <cschneid@cschneid.com> wrote: >> If I remember correctly it was about reducing the number of system calls. Is this no issue any more? >> Has a quick benchmark been done to see the positive / negative impact of the stat cache for a typical application? > > In the lifespan of php it really wasn't an issue unless someone was > doing something that wasn't wise - I can't think of a single reason to > stat a file in a tight loop.
How can you say "it never was a problem" if we never had to live without stat cache? Can you back up that claim with numbers? There are some of us who run high-volume websites where system load increase could be a problem.
> However more importantly the current behaviour returns bad data for > perfectly correct programs. So for example on a unix box... > > passthru('touch foo'); > if (is_file('foo')) { > echo "Correct\n"; > } > passthru('rm foo'); > if (is_file('foo')) { > echo "Incorrect\n"; > } > ?>
I disagree that this (with current PHP) is a correct program. The documentation at https://www.php.net/manual/en/function.is-file.php#refsect1-function.is-file-notes <https://www.php.net/manual/en/function.is-file.php> clearly states that is_file() is cached and clearstatcache() should be used. Yes, the stat cache is a foot gun, but it was put in for a reason back in the day.
> Personally I value correctness first and then performance is a > priority further down that list of attributes (security and > readability would also be higher priorities). However as the current > behaviour has existed for several decades, this change makes sure the > incorrect historical behaviour is the default.
There are two problems here: a) Removing the stat cache altogether (which is not your proposal, I know, but it was mentioned in the thread) could lead to a performance regression. I was asking for data backing up the claim that this regression either does not exist or is small enough to not be a problem. Just saying "it never was a problem" or "modern Linux should handle this" does not give me that certainty. b) Making it an .ini-Option seems to be a BC preserving measure but, in fact, it creates an even bigger issue: Your code snippet is correct or buggy depending on an ini-settings. Something we want to avoid as much as possible, see magic_quotes :-) - Chris
  115933
September 3, 2021 15:16 cmbecker69@gmx.de ("Christoph M. Becker")
On 03.09.2021 at 16:07, Kevin Lyda wrote:

> On Fri, Sep 3, 2021 at 2:34 PM Christian Schneider > <cschneid@cschneid.com> wrote: > >> If I remember correctly it was about reducing the number of system calls. Is this no issue any more? >> Has a quick benchmark been done to see the positive / negative impact of the stat cache for a typical application? > > In the lifespan of php it really wasn't an issue unless someone was > doing something that wasn't wise - I can't think of a single reason to > stat a file in a tight loop.
It seems to me a typical example where the stat cache is useful would be -- Christoph M. Becker
  115934
September 3, 2021 15:23 nikita.ppv@gmail.com (Nikita Popov)
On Fri, Sep 3, 2021 at 4:08 PM Kevin Lyda <kevin@lyda.ie> wrote:

> On Fri, Sep 3, 2021 at 2:34 PM Christian Schneider > <cschneid@cschneid.com> wrote: > > If I remember correctly it was about reducing the number of system > calls. Is this no issue any more? > > Has a quick benchmark been done to see the positive / negative impact of > the stat cache for a typical application? > > In the lifespan of php it really wasn't an issue unless someone was > doing something that wasn't wise - I can't think of a single reason to > stat a file in a tight loop. > > However more importantly the current behaviour returns bad data for > perfectly correct programs. So for example on a unix box... > > passthru('touch foo'); > if (is_file('foo')) { > echo "Correct\n"; > } > passthru('rm foo'); > if (is_file('foo')) { > echo "Incorrect\n"; > } > ?> > > Now this is a silly toy, but imagine using is_file to see if a > graphics file exists, running an image processing program on it to > modify it, and then using a stat call to get the file length to > populate the Content-Length field - it will almost certainly be wrong. >
Just to throw it out there: Maybe we should clear the stat cache when functions in the exec family are used? Even if we allow disabling the stat cache, I think we can easily avoid that particular footgun. And if calls to external binaries are involved we likely don't have to worry about stat overhead. Regards, Nikita
  115935
September 3, 2021 16:02 cschneid@cschneid.com (Christian Schneider)
Am 03.09.2021 um 17:23 schrieb Nikita Popov ppv@gmail.com>:
> Just to throw it out there: Maybe we should clear the stat cache when functions in the exec family are used? Even if we allow disabling the stat cache, I think we can easily avoid that particular footgun. And if calls to external binaries are involved we likely don't have to worry about stat overhead.
While this would make the foot gun a bit smaller it introduces more magic. I'm not completely against it but it feels dirty. On top of that: I hope people using exec and friends are properly escaping parameters. Which in our case is a helper function where a clear_stat_cache() could easily be added in user land, making it explicit. Side-note: We should teach people not to use exec style function when normal PHP functions work :-) The following works fine:
  115938
September 3, 2021 17:09 kevin@lyda.ie (Kevin Lyda)
[sent a second time, now to the list, sorry]

On Fri, Sep 3, 2021 at 3:53 PM Christian Schneider
<cschneid@cschneid.com> wrote:
> How can you say "it never was a problem" if we never had to live without stat cache? > Can you back up that claim with numbers? There are some of us who run high-volume websites where system load increase could be a problem.
Using this bash script: #!/bin/bash echo "Without cache" time ./sapi/cli/php -d enable_stat_cache=3DFalse "$@" echo "With cache" time ./sapi/cli/php "$@" To run this php script: > I disagree that this (with current PHP) is a correct program. The documentation at > https://www.php.net/manual/en/function.is-file.php#refsect1-function.is-file-notes <https://www.php.net/manual/en/function.is-file.php> > clearly states that is_file() is cached and clearstatcache() should be used.
That may be, but it happens for people.
> Yes, the stat cache is a foot gun, but it was put in for a reason back in the day.
Well, we each have opinions on the validity of those reasons. May they bring us joy.
> There are two problems here: > > a) Removing the stat cache altogether (which is not your proposal, I know= , but it was mentioned in the thread) could lead to a performance regressio=
n. I was asking for data backing up the claim that this regression either d= oes not exist or is small enough to not be a problem. Just saying "it never= was a problem" or "modern Linux should handle this" does not give me that = certainty. OK, but as you note, not what this PR is going to do.
> b) Making it an .ini-Option seems to be a BC preserving measure but, in f= act, it creates an even bigger issue: Your code snippet is correct or buggy=
depending on an ini-settings. Something we want to avoid as much as possib= le, see magic_quotes :-) First, it's possible to disable other caches in PHP. In fact, I think this is the only cache you can't disable. So it puts it in line with all other caches. Second, it's not possible to make this cache work correctly. But hey, it's hardly the only cache with that problem. There are several ini-cache settings that will make code behave differently. Pretty much every cache setting for example. Kevin
  115940
September 3, 2021 20:12 cschneid@cschneid.com (Christian Schneider)
Am 03.09.2021 um 18:12 schrieb Kevin Lyda <kevin@lyda.ie <mailto:kevin@lyda.ie>>:
> To run this php script: > > $iterations = 1000000; > function all_the_stats($filename) { > @lstat($filename); > @stat($filename); > } > while ($iterations--) { > all_the_stats(__FILE__); > } > > I see this output: > > Without cache > > real 0m7.326s > user 0m5.877s > sys 0m1.448s > With cache > > real 0m5.010s > user 0m5.009s > sys 0m0.000s
So this is almost a 50% performance regression ;-) And the more interesting number for me here is "sys 0m1.448s" vs "sys 0m0.000s". Which means 1.5s out of the additional 2.3s are spent in system calls. Side-note: Why is the user time without cache higher than with cache? That seems counter-intuitive. Maybe the checking of the ini-settings or some libc-code? I'm interested in the load put on a system with a high request count and a typical application. Reducing system calls used to matter there as the kernel does not multi-process the same way user land does. But then again, maybe I'm overly cautious :-) Regards, - Chris
  115941
September 3, 2021 20:21 kevin@lyda.ie (Kevin Lyda)
On Fri, Sep 3, 2021 at 9:12 PM Christian Schneider
<cschneid@cschneid.com> wrote:
> I'm interested in the load put on a system with a high request count and a typical application. > Reducing system calls used to matter there as the kernel does not multi-process the same way user land does. > > But then again, maybe I'm overly cautious :-)
This PR allows people to do just that experiment. Kevin
  115942
September 3, 2021 21:50 divinity76@gmail.com (Hans Henrik Bergan)
PS i've seen *HORRIBLE* fs performance for php-running-on-windows,
where the same filesystem operations on the same files took like 5 seconds
on linux-running-on-vmware-on-laptop-running-windows-10, versus several
minutes for the same operation on the same laptop on windows 10 directly..
for people looking for best-case-scenario for the stat cache, try looking
at windows fs performance.. (if anyone even cares about that? i personally
don't, i never run anything performance-sensitive-php code on Windows, just
noticed horrible fs performance in the past)

On Fri, 3 Sept 2021 at 22:22, Kevin Lyda <kevin@lyda.ie> wrote:

> On Fri, Sep 3, 2021 at 9:12 PM Christian Schneider > <cschneid@cschneid.com> wrote: > > I'm interested in the load put on a system with a high request count and > a typical application. > > Reducing system calls used to matter there as the kernel does not > multi-process the same way user land does. > > > > But then again, maybe I'm overly cautious :-) > > This PR allows people to do just that experiment. > > Kevin > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
  115943
September 4, 2021 01:49 pierre.php@gmail.com (Pierre Joye)
Good morning,

On Sat, Sep 4, 2021 at 4:51 AM Hans Henrik Bergan <divinity76@gmail.com> wrote:
> > PS i've seen *HORRIBLE* fs performance for php-running-on-windows, > where the same filesystem operations on the same files took like 5 seconds > on linux-running-on-vmware-on-laptop-running-windows-10, versus several > minutes for the same operation on the same laptop on windows 10 directly.. > for people looking for best-case-scenario for the stat cache, try looking > at windows fs performance.. (if anyone even cares about that? i personally > don't, i never run anything performance-sensitive-php code on Windows, just > noticed horrible fs performance in the past)
Windows TS is as far as I remember the main reason for stat cache being there as well. Not necessarily because stats are slower on windows but because a userland stat causes multiple system stat (virtual cwd and related does multiple calls ). It was and most likely still is a performance hit back then. As for the option to disable stat cache, I don't see how it can hurt, as long as it is well documented and off by default. So daemon and similar apps can disable it if desired. Other apps, I honestly do not see any gain :). Best, -- Pierre @pierrejoye | http://www.libgd.org
  115944
September 4, 2021 11:28 kevin@lyda.ie (Kevin Lyda)
On Fri, Sep 3, 2021 at 10:51 PM Hans Henrik Bergan <divinity76@gmail.com> wrote:
> PS i've seen *HORRIBLE* fs performance for php-running-on-windows, > where the same filesystem operations on the same files took like 5 seconds > on linux-running-on-vmware-on-laptop-running-windows-10, versus several > minutes for the same operation on the same laptop on windows 10 directly..
Amusingly Windows FS performance made emigrating easier for me. I've never used or coded on Windows, but one of the job offers I got when I was emigrating back in 1998 was a Word doc. To read it I ran it through strings on Solaris and that's when I discovered Word saves diffs. To read my offer letter I had to reconstruct the previous five offer letters so I had a good view of what compensation was. From talking to Windows people I learned it did diffs due to FS performance. Which seemed like an amusing solution to a problem. Kevin
  115960
September 6, 2021 11:13 kontakt@beberlei.de (Benjamin Eberlei)
On Fri, Sep 3, 2021 at 11:51 PM Hans Henrik Bergan <divinity76@gmail.com>
wrote:

> PS i've seen *HORRIBLE* fs performance for php-running-on-windows, > where the same filesystem operations on the same files took like 5 seconds > on linux-running-on-vmware-on-laptop-running-windows-10, versus several > minutes for the same operation on the same laptop on windows 10 directly.. > for people looking for best-case-scenario for the stat cache, try looking > at windows fs performance.. (if anyone even cares about that? i personally > don't, i never run anything performance-sensitive-php code on Windows, just > noticed horrible fs performance in the past) >
The stat cache does not necessarily solve this issues though, only in very limited cases where you work with the *same* file over and over again. The stat cache only ever has exactly one entry, the *last* file that was accessed. So if you work with many diferent files the stat cache does not do what you would expect, to store the information of all these files. It overwrites the cache with a new file entry when its not the same file as currently stored.
> > On Fri, 3 Sept 2021 at 22:22, Kevin Lyda <kevin@lyda.ie> wrote: > > > On Fri, Sep 3, 2021 at 9:12 PM Christian Schneider > > <cschneid@cschneid.com> wrote: > > > I'm interested in the load put on a system with a high request count > and > > a typical application. > > > Reducing system calls used to matter there as the kernel does not > > multi-process the same way user land does. > > > > > > But then again, maybe I'm overly cautious :-) > > > > This PR allows people to do just that experiment. > > > > Kevin > > > > -- > > PHP Internals - PHP Runtime Development Mailing List > > To unsubscribe, visit: https://www.php.net/unsub.php > > > > >
  115961
September 6, 2021 12:00 pierre.php@gmail.com (Pierre Joye)
On Mon, Sep 6, 2021, 6:14 PM Benjamin Eberlei <kontakt@beberlei.de> wrote:

> On Fri, Sep 3, 2021 at 11:51 PM Hans Henrik Bergan <divinity76@gmail.com> > wrote: > > > The stat cache does not necessarily solve this issues though, only in very > limited cases where you work with the *same* file over and over again. The > stat cache only ever has exactly one entry, the *last* file that was > accessed. So if you work with many diferent files the stat cache does not > do what you would expect, to store the information of all these files. It > overwrites the cache with a new file entry when its not the same file as > currently stored.
that is what happened in TS mode. For every part of a path, every single time a path is used (like source files in a common root path). I did not check it lately but I suppose it still does it. Luckily it should not be an issue anymore as most should be using nts as well and the opcache, maybe even wincache too which implements a smart cache with events (on change).
  115962
September 6, 2021 12:05 cmbecker69@gmx.de ("Christoph M. Becker")
On 06.09.2021 at 14:00, Pierre Joye wrote:

> On Mon, Sep 6, 2021, 6:14 PM Benjamin Eberlei <kontakt@beberlei.de> wrote: > >> On Fri, Sep 3, 2021 at 11:51 PM Hans Henrik Bergan <divinity76@gmail.com> >> wrote: >> >> The stat cache does not necessarily solve this issues though, only in very >> limited cases where you work with the *same* file over and over again. The >> stat cache only ever has exactly one entry, the *last* file that was >> accessed. So if you work with many diferent files the stat cache does not >> do what you would expect, to store the information of all these files. It >> overwrites the cache with a new file entry when its not the same file as >> currently stored. > > that is what happened in TS mode. For every part of a path, every single > time a path is used (like source files in a common root path). I did not > check it lately but I suppose it still does it.
This is the realpath cache, though.
> Luckily it should not be an > issue anymore as most should be using nts as well and the opcache, maybe > even wincache too which implements a smart cache with events (on change)..
FWIW, WinCache has been discontinued[1]. [1] <https://pecl.php.net/package/WinCache> Christoph
  115964
September 6, 2021 14:46 pierre.php@gmail.com (Pierre Joye)
On Mon, Sep 6, 2021 at 7:05 PM Christoph M. Becker <cmbecker69@gmx.de> wrote:
> > On 06.09.2021 at 14:00, Pierre Joye wrote: > > > On Mon, Sep 6, 2021, 6:14 PM Benjamin Eberlei <kontakt@beberlei.de> wrote: > > > >> On Fri, Sep 3, 2021 at 11:51 PM Hans Henrik Bergan <divinity76@gmail.com> > >> wrote: > >> > >> The stat cache does not necessarily solve this issues though, only in very > >> limited cases where you work with the *same* file over and over again. The > >> stat cache only ever has exactly one entry, the *last* file that was > >> accessed. So if you work with many diferent files the stat cache does not > >> do what you would expect, to store the information of all these files. It > >> overwrites the cache with a new file entry when its not the same file as > >> currently stored. > > > > that is what happened in TS mode. For every part of a path, every single > > time a path is used (like source files in a common root path). I did not > > check it lately but I suppose it still does it. > > This is the realpath cache, though.
realpath(path, 1) does not do stat anymore? I mean with virtual_file_ex and co? Not so important, too long since I touched this :) Also as someone mentioned here afterwards, instead of removing it straight away, I would go with the flag first, less risky :) best, -- Pierre @pierrejoye | http://www.libgd.org
  115966
September 6, 2021 15:50 cschneid@cschneid.com (Christian Schneider)
Am 06.09.2021 um 16:46 schrieb Pierre Joye php@gmail.com>:
> Also as someone mentioned here afterwards, instead of removing it > straight away, I would go with the flag first, less risky :)
Out of curiosity: Do you think disabling the stat cache could be harmful? If no, why make it an option? If yes, should we be allowing it? Follow-up question: What would be the default? Cache off which might be harmful to some (assumedly bigger) sites? Cache on which won't fix the bug reports? If you put the burden on the bigger PHP users by making it default to off you're assuming they have more rigorous update procedures, I guess. Probably true, just make sure that it is mentioned prominently enough in UPGRADING. Ok, I've said my piece :-) - Chris
  115969
September 6, 2021 16:29 pierre.php@gmail.com (Pierre Joye)
On Mon, Sep 6, 2021, 10:50 PM Christian Schneider <cschneid@cschneid.com>
wrote:

> Am 06.09.2021 um 16:46 schrieb Pierre Joye php@gmail.com>: > > Also as someone mentioned here afterwards, instead of removing it > > straight away, I would go with the flag first, less risky :) > > Out of curiosity: Do you think disabling the stat cache could be harmful? >
only for performance on windows TS. I think still too much use apache with mod php I think. If no, why make it an option?
> If yes, should we be allowing it? >
yes, off by default first? best Pierre
  115983
September 7, 2021 08:32 nikita.ppv@gmail.com (Nikita Popov)
On Fri, Sep 3, 2021 at 7:10 PM Kevin Lyda <kevin@lyda.ie> wrote:

> [sent a second time, now to the list, sorry] > > On Fri, Sep 3, 2021 at 3:53 PM Christian Schneider > <cschneid@cschneid.com> wrote: > > How can you say "it never was a problem" if we never had to live without > stat cache? > > Can you back up that claim with numbers? There are some of us who run > high-volume websites where system load increase could be a problem. > > Using this bash script: > > #!/bin/bash > echo "Without cache" > time ./sapi/cli/php -d enable_stat_cache=3DFalse "$@" > echo "With cache" > time ./sapi/cli/php "$@" > > To run this php script: > > $iterations =3D 1000000; > function all_the_stats($filename) { > @lstat($filename); > @stat($filename); > } > while ($iterations--) { > all_the_stats(__FILE__); > } > > I see this output: > > Without cache > > real 0m7.326s > user 0m5.877s > sys 0m1.448s > With cache > > real 0m5.010s > user 0m5.009s > sys 0m0.000s > > So that's 2 seconds slower to do 2 million uncached stat calls vs > cached with a 100% cache hit rate (minus the first stat/lstat calls). > > Technically, yes, it's slower, but I'd suggest that making 2 million stat > calls to a single file is a bad idea. And remember, the cache holds *one* > file. If you stat a second file it's a cache miss. >
These numbers look pretty good to me. It would be great if someone on Windows and macos could repeat this experiment, so we have an idea of how other platforms fare in this worst-case scenario. Regards, Nikita
  115987
September 7, 2021 10:31 tekiela246@gmail.com (Kamil Tekiela)
> It would be great if someone on Windows and macos could repeat this experiment
I ran this on Windows and I got the following results: Native Windows build: Without cache real 0m31.170s user 0m0.000s sys 0m0.000s With cache real 0m0.694s user 0m0.000s sys 0m0.000s Ubuntu WSL: Without cache real 3m8.695s user 0m7.009s sys 0m18.819s With cache real 0m0.707s user 0m0.562s sys 0m0.010s I am not sure how to interpret it. I doubt that any system would actually perform 1 million operations on the same file, but I can definitely see a noticeable difference.
  115988
September 7, 2021 10:42 divinity76@gmail.com (Hans Henrik Bergan)
WSL1 or WSL2? afaik they have significant performance differences, and we
should only consider WSL2 as WSL1 is being deprecated afaik

On Tue, 7 Sept 2021 at 12:31, Kamil Tekiela <tekiela246@gmail.com> wrote:

> > It would be great if someone on > Windows and macos could repeat this experiment > > I ran this on Windows and I got the following results: > Native Windows build: > > Without cache > real 0m31.170s > user 0m0.000s > sys 0m0.000s > > With cache > real 0m0.694s > user 0m0.000s > sys 0m0.000s > > Ubuntu WSL: > > Without cache > real 3m8.695s > user 0m7.009s > sys 0m18.819s > > With cache > real 0m0.707s > user 0m0.562s > sys 0m0.010s > > I am not sure how to interpret it. I doubt that any system would actually > perform 1 million operations on the same file, but I can definitely see a > noticeable difference. >
  115989
September 7, 2021 10:49 tekiela246@gmail.com (Kamil Tekiela)
It's WSL2
uname -r
5.4.72-microsoft-standard-WSL2
  115990
September 7, 2021 10:53 pierre.php@gmail.com (Pierre Joye)
Hi,




On Tue, Sep 7, 2021, 5:49 PM Kamil Tekiela <tekiela246@gmail.com> wrote:

> It's WSL2 > uname -r > 5.4.72-microsoft-standard-WSL2 >
no need to test on wsl2, that's a dockerized linux. best, Pierre
>
  115937
September 3, 2021 16:15 kevin@lyda.ie (Kevin Lyda)
On Fri, Sep 3, 2021 at 4:24 PM Nikita Popov ppv@gmail.com> wrote:
> Just to throw it out there: Maybe we should clear the stat cache when functions in the exec family are used? Even if we allow disabling the stat cache, I think we can easily avoid that particular footgun. And if calls to external binaries are involved we likely don't have to worry about stat overhead.
This code also breaks:
  115925
September 3, 2021 12:58 nikita.ppv@gmail.com (Nikita Popov)
On Thu, Sep 2, 2021 at 10:27 AM Kevin Lyda <kevin@ie.suberic.net> wrote:

> PHP has a stat cache which is... unfortunate. As noted in this bug from > 2004[0] it causes a number of issues for PHP users and is irrelevant > in modern operating systems. Heck, it's not even useful in OS's people > might consider ancient at this point. > > I have a PR[1] to fix this, but had not noticed it had been referred to > this mailing list. I'll need fix the PR at this point as the code has > drifted, but before I do, can whatever discussion happen so that the PR > will be accepted? The change I've made will allow people to disable the > cache so that it won't cause errors and it leaves the current broken > behaviour in place. So there's no real change to the user experience > except they now have a tool to disable buggy behaviour. I feel that > this is the least intrusive way to fix this bug. > > In the discussion on the bug there was resistance to getting rid of the > cache so this was the solution proposed in 2007 and seemed reasonable. > The default behaviour remains but can be turned off by people who want > it off - and there are a number of people who would like it off. > > Is this an acceptable solution? Would this functionality be accepted > either via my PR or another. > > Kevin > > [0]: https://bugs.php.net/bug.php?id=28790 > [1]: https://github.com/php/php-src/pull/5894 >
Looks like a reasonable addition to me. Regards, Nikita
  115956
September 5, 2021 11:30 kevin@ie.suberic.net (Kevin Lyda)
Any more thoughts on https://github.com/php/php-src/pull/5894 ?

I've resolved the merge conflict. It would be nice to close out this bug.

Kevin

On Thu, Sep 2, 2021 at 9:27 AM Kevin Lyda <kevin@ie.suberic.net> wrote:
> > PHP has a stat cache which is... unfortunate. As noted in this bug from > 2004[0] it causes a number of issues for PHP users and is irrelevant > in modern operating systems. Heck, it's not even useful in OS's people > might consider ancient at this point. > > I have a PR[1] to fix this, but had not noticed it had been referred to > this mailing list. I'll need fix the PR at this point as the code has > drifted, but before I do, can whatever discussion happen so that the PR > will be accepted? The change I've made will allow people to disable the > cache so that it won't cause errors and it leaves the current broken > behaviour in place. So there's no real change to the user experience > except they now have a tool to disable buggy behaviour. I feel that > this is the least intrusive way to fix this bug. > > In the discussion on the bug there was resistance to getting rid of the > cache so this was the solution proposed in 2007 and seemed reasonable. > The default behaviour remains but can be turned off by people who want > it off - and there are a number of people who would like it off. > > Is this an acceptable solution? Would this functionality be accepted > either via my PR or another. > > Kevin > > [0]: https://bugs.php.net/bug.php?id=28790 > [1]: https://github.com/php/php-src/pull/5894 > > -- > Kevin Lyda > Galway, Ireland
-- Kevin Lyda Galway, Ireland
  115963
September 6, 2021 13:12 cschneid@cschneid.com (Christian Schneider)
Am 05.09.2021 um 13:30 schrieb Kevin Lyda <kevin@ie.suberic.net>:
> Any more thoughts on https://github.com/php/php-src/pull/5894 ? > I've resolved the merge conflict. It would be nice to close out this bug.
For the record in case it was missed: If we deem the stat cache to be useless I'd rather remove it completely to avoid further WTFs based on a hoster's configuration. - Chris