Parallelised run-tests.php (patch)

  100838
October 8, 2017 03:47 ajf@ajf.me (Andrea Faulds)
Hi there,

Do you spend HOURS every day AGONISING over how long run-tests.php takes 
to complete?

Have you long since ABANDONED every test directory besides Zend/tests?

Does it feel like your eight CPU cores are pointlessly WASTED by 
single-threaded PHP test execution?

Are you SADDENED every day at how high-quality the run-tests.php code 
is, DREAMING of an even worse mess?

Then I've got just the trick for you!

…*ahem*. Okay, enough terrible salesmanship. I felt like parallelising 
run-tests.php, so I did it. If you give it the flag -jX, it'll spawn X
worker processes and throw batches of tests at them, and those worker
processes will send back the results of tests for the parent process to
display and collate. To avoid potential problems with tests that step on
each other's toes (e.g. access the same file or port or whatever), it
gives each worker a particular directory of tests and it then executes 
those in parallel. Obviously, this means things aren't as parallel as 
they could be for very large test directories (looking at you, 
Zend/tests/…), though that's a solvable problem. :) Per a suggestion by
Rasmus in an earlier thread, the parent hands out large directories to
children first, to reduce the time spent waiting at the end on children
still occupied with huge directories.
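
A minimal sketch of that hand-out order, written in Python rather than PHP and not taken from the pull request. It assumes the number of test files in a directory is a fair proxy for how long the directory takes, and that the worker with the least work so far is the next one to become free; `group_by_directory` and `schedule` are hypothetical names used only for illustration.

```python
import heapq
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(test_files):
    """Map each directory to the test files it contains."""
    groups = defaultdict(list)
    for path in test_files:
        groups[str(PurePosixPath(path).parent)].append(path)
    return dict(groups)


def schedule(test_files, workers):
    """Return {worker id: [directories]}, handing out the largest directories first."""
    groups = group_by_directory(test_files)
    # Largest directories go to the front of the queue.
    queue = sorted(groups, key=lambda d: len(groups[d]), reverse=True)
    # Heap of (tests assigned so far, worker id); the smallest load is
    # assumed to be the worker that asks for more work next.
    loads = [(0, w) for w in range(workers)]
    heapq.heapify(loads)
    plan = {w: [] for w in range(workers)}
    for directory in queue:
        load, w = heapq.heappop(loads)
        plan[w].append(directory)
        heapq.heappush(loads, (load + len(groups[directory]), w))
    return plan


tests = [
    "Zend/tests/a.phpt", "Zend/tests/b.phpt", "Zend/tests/c.phpt",
    "ext/standard/tests/x.phpt", "ext/standard/tests/y.phpt",
    "tests/lang/z.phpt",
]
print(schedule(tests, 2))
```

With two workers, the biggest directory occupies one worker on its own while the other works through the two smaller ones, which is the effect the ordering is after.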

And what a difference it makes. On my Ubuntu 16.04 VirtualBox VM running 
inside Windows 10 Pro x64 on my Ryzen 1700 8-core 16-thread PC, here's 
vanilla run-tests.php against master on a --disable-all build:

=====================================================================
TEST RESULT SUMMARY
---------------------------------------------------------------------
Exts skipped    :   68
Exts tested     :    7
---------------------------------------------------------------------

Number of tests : 15613              8481
Tests skipped   : 7132 ( 45.7%) --------
Tests warned    :    0 (  0.0%) (  0.0%)
Tests failed    :   30 (  0.2%) (  0.4%)
Expected fail   :   31 (  0.2%) (  0.4%)
Tests passed    : 8420 ( 53.9%) ( 99.3%)
---------------------------------------------------------------------
Time taken      :  317 seconds
=====================================================================

And here's run-tests.php -j16:

=====================================================================
TEST RESULT SUMMARY
---------------------------------------------------------------------
Exts skipped    :   68
Exts tested     :    7
---------------------------------------------------------------------

Number of tests : 15613              8481
Tests skipped   : 7132 ( 45.7%) --------
Tests warned    :    0 (  0.0%) (  0.0%)
Tests failed    :   35 (  0.2%) (  0.4%)
Expected fail   :   31 (  0.2%) (  0.4%)
Tests passed    : 8415 ( 53.9%) ( 99.2%)
---------------------------------------------------------------------
Time taken      :   80 seconds
=====================================================================

That's about a quarter of the single-threaded time. It could get even 
better still if there were more test directories to work with; currently 
the number of alive workers declines as the test run goes on, because 
all the small directories get finished quickly and you're left waiting 
on the Zend/tests-like behemoths. One idea might be to add some sort of 
special file you could add to a directory which tells run-tests.php that 
it is safe to parallelise, in which case it would break it into chunks 
at runtime for you. Though I don't think we should have huge numbers of 
files in a single directory anyway, if possible.
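
For the record, the arithmetic behind "about a quarter" can be checked in a few lines of Python. The two timings and the worker count are copied from the summaries and the -j16 flag above; "parallel efficiency" (speedup divided by worker count) is a standard measure added here for illustration, and it is only a rough figure given that the run was on 16 hardware threads of an 8-core CPU inside a VM.

```python
# Wall-clock times copied from the two "Time taken" lines above.
serial_seconds = 317
parallel_seconds = 80
workers = 16  # from the -j16 flag

speedup = serial_seconds / parallel_seconds
fraction_of_serial = parallel_seconds / serial_seconds
# 1.0 would mean every worker was busy for the whole run.
efficiency = speedup / workers

print(f"speedup: {speedup:.2f}x")
print(f"fraction of serial time: {fraction_of_serial:.1%}")
print(f"parallel efficiency: {efficiency:.1%}")
```

That comes out at a speedup of roughly 3.96x and an efficiency of about 25%, which fits the observation that workers sit idle once the small directories are done.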

Okay, but what's the catch? Well, the code isn't the most… elegant 
thing. I did make run-tests.php faster, but I didn't refactor it to be 
less of a global-variables-dependent mess than it is at present, except
for moving all the top-level code into a single function for the sake of 
my own sanity. Therefore, the worker child processes get their own copy 
of the global state to initialise them. It's a bit hacky, but this is 
basically what UNIX fork() does, so maybe I shouldn't feel so bad. ;) On 
the other hand, the remainder of the design is quite elegant! It's all 
clean message-passing over STDIN and STDOUT.
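
To illustrate the general shape of message-passing over STDIN and STDOUT, here is a small Python sketch. It is not the protocol used in the pull request: the JSON-lines message format, the message types `batch`, `results` and `exit`, and the helper `run_batch` are all assumptions made up for this example, and the "worker" only pretends to run tests.

```python
import json
import subprocess
import sys

# Hypothetical worker: reads one JSON message per line on STDIN and
# answers with one JSON message per line on STDOUT.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    if msg["type"] == "exit":
        break
    # Pretend to run each test; a real worker would execute it here.
    results = [{"test": t, "status": "PASS"} for t in msg["tests"]]
    print(json.dumps({"type": "results", "results": results}), flush=True)
"""


def run_batch(tests):
    """Send one batch to a fresh worker and return the results it reports."""
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    worker.stdin.write(json.dumps({"type": "batch", "tests": tests}) + "\n")
    worker.stdin.flush()
    reply = json.loads(worker.stdout.readline())
    # Ask the worker to stop, then close its STDIN and wait for it to exit.
    worker.stdin.write(json.dumps({"type": "exit"}) + "\n")
    worker.stdin.flush()
    worker.stdin.close()
    worker.wait()
    return reply["results"]


if __name__ == "__main__":
    print(run_batch(["a.phpt", "b.phpt"]))
```

In this sketch the worker's read loop also ends when its STDIN reaches end-of-file, so a worker whose parent has gone away stops by itself.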

This isn't fully tested yet. I haven't actually investigated the 5 extra
test failures in the -j16 run, and my implementation probably breaks some
run-tests.php features right now. I also haven't bothered to try this on a non
--disable-all build yet. That said, I'm impressed how well it actually 
works right now and how effectively it tolerates errors. The child 
processes will even happily kill themselves if the parent does.

The code is here:

https://github.com/php/php-src/pull/2822

Happy Halloween.
-- 
Andrea Faulds
https://ajf.me/
  100846
October 10, 2017 07:01 Michael Wallner <mike@php.net>

Hi!

On 08/10/17 05:47, Andrea Faulds wrote:
> Hi there,
>
> Do you spend HOURS every day AGONISING over how long run-tests.php takes
> to complete?

Yes!

> Have you long since ABANDONED every test directory besides Zend/tests?
>
> Does it feel like your eight CPU cores are pointlessly WASTED by
> single-threaded PHP test execution?

Yes!

> Are you SADDENED every day at how high-quality the run-tests.php code
> is, DREAMING of an even worse mess?
>
> Then I've got just the trick for you!

Ha, welcome to the club! I'm glad someone else feels the need, too.
https://github.com/php/php-src/compare/master...m6w6:parallel-run-tests

But, I just noticed, you've been part of that discussion, too:
https://externals.io/message/75308

-- 
Regards,
Mike
  100847
October 10, 2017 09:09 tyra3l@gmail.com (Ferenc Kovacs)
On Tue, Oct 10, 2017 at 9:01 AM, Michael Wallner <mike@php.net> wrote:

> Hi!
>
> On 08/10/17 05:47, Andrea Faulds wrote:
> > Hi there,
> >
> > Do you spend HOURS every day AGONISING over how long run-tests.php takes
> > to complete?
>
> Yes!
>
> > Have you long since ABANDONED every test directory besides Zend/tests?
> >
> > Does it feel like your eight CPU cores are pointlessly WASTED by
> > single-threaded PHP test execution?
>
> Yes!
>
> > Are you SADDENED every day at how high-quality the run-tests.php code
> > is, DREAMING of an even worse mess?
> >
> > Then I've got just the trick for you!
>
> Ha, welcome to the club! I'm glad someone else feels the need, too.
> https://github.com/php/php-src/compare/master...m6w6:parallel-run-tests
>
> But, I just noticed, you've been part of that discussion, too:
> https://externals.io/message/75308
>
> --
> Regards,
> Mike

thanks for keeping the tradition of calling out the new participant about
the previous attempt :D
https://externals.io/message/75308#75321

-- 
Ferenc Kovács
@Tyr43l - http://tyrael.hu
  100848
October 10, 2017 10:00 ajf@ajf.me (Andrea Faulds)
Hi!

Ferenc Kovacs wrote:
> On Tue, Oct 10, 2017 at 9:01 AM, Michael Wallner <mike@php.net> wrote:
>
>> Hi!
>>
>> On 08/10/17 05:47, Andrea Faulds wrote:
>>> Hi there,
>>>
>>> Then I've got just the trick for you!
>>
>> Ha, welcome to the club! I'm glad someone else feels the need, too.
>> https://github.com/php/php-src/compare/master...m6w6:parallel-run-tests
>>
>> But, I just noticed, you've been part of that discussion, too:
>> https://externals.io/message/75308
>>
>> --
>> Regards,
>> Mike
>
> thanks for keeping the tradition of calling out the new participant about
> the previous attempt :D
> https://externals.io/message/75308#75321

…oh dear. Clearly, we need yet another attempt after mine :p

-- 
Andrea Faulds
https://ajf.me/
  100849
October 10, 2017 12:09 johannes@schlueters.de (Johannes Schlüter)
On Sun, 2017-10-08 at 04:47 +0100, Andrea Faulds wrote:
> Have you long since ABANDONED every test directory besides
> Zend/tests?

... or ran only ext/foo/tests ;)

> …*ahem*. Okay, enough terrible salesmanship. I felt like
> parallelising run-tests.php, so I did it. If you give it the flag -jX,
> it'll spawn X worker processes and throw batches of tests at them,
> and those worker

This is cool! I also see (from a very, very short look at the GitHub
diff) that you have a parallelization protection for some tests, which
might share resources. Very good!

Kind of unrelated: Somewhere on my 10+ years old todo list I also have
the item of using FastCGI or similar for running tests, to avoid running
through MINIT/MSHUTDOWN for each and every test (for some we can't
avoid it due to ini requirements, but well). Maybe a less hackish way of
parallelizing might be using fpm workers and async io (just to spin the
idea, maybe somebody takes it up ...)

johannes
  100850
October 10, 2017 12:35 ajf@ajf.me (Andrea Faulds)
Hi Johannes,

Johannes Schlüter wrote:
> On Sun, 2017-10-08 at 04:47 +0100, Andrea Faulds wrote:
>> Have you long since ABANDONED every test directory besides
>> Zend/tests?
>
> ... or ran only ext/foo/tests ;)
>
>> …*ahem*. Okay, enough terrible salesmanship. I felt like
>> parallelising run-tests.php, so I did it. If you give it the flag -jX,
>> it'll spawn X worker processes and throw batches of tests at them,
>> and those worker
>
> This is cool! I also see (from a very, very short look at the GitHub
> diff) that you have a parallelization protection for some tests, which
> might share resources. Very good!

I'm glad you like it! Although I discover now that I am hardly the first
to attempt this. Maybe I'll be the first to get it merged :p

> Kind of unrelated: Somewhere on my 10+ years old todo list I also have
> the item of using FastCGI or similar for running tests, to avoid running
> through MINIT/MSHUTDOWN for each and every test (for some we can't
> avoid it due to ini requirements, but well). Maybe a less hackish way of
> parallelizing might be using fpm workers and async io (just to spin the
> idea, maybe somebody takes it up ...)

That's a reasonable idea. But I wonder if at that point, we should just
use a “real” unit-testing framework, or at least a stripped-down version
of one, which would run functions rather than files. PHPT is a simple
format, but it requires invoking a PHP interpreter every time. We only
really need to do that if we expect a fatal error or something like
that… :)

-- 
Andrea Faulds
https://ajf.me/
  100853
October 10, 2017 14:04 johannes@schlueters.de (Johannes Schlüter)
On Tue, 2017-10-10 at 13:35 +0100, Andrea Faulds wrote:

> > This is cool! I also see (from a very, very short look at the
> > GitHub diff) that you have a parallelization protection for some
> > tests, which might share resources. Very good!
> I'm glad you like it! Although I discover now that I am hardly the
> first to attempt this.

Hardly. Aside from the ones mentioned, Zoe once did a rewrite which
works in parallel, PHPUnit also has a phpt mode, and I'm sure there are
yet others ;-)

> Maybe I'll be the first to get it merged :p

Keep working toward that goal! We need it. (Your marketing missed the
gcov run, btw. :D)

> > Kind of unrelated: Somewhere on my 10+ years old todo list I also
> > have the item of using FastCGI or similar for running tests, to
> > avoid running through MINIT/MSHUTDOWN for each and every test (for
> > some we can't avoid it due to ini requirements, but well). Maybe a
> > less hackish way of parallelizing might be using fpm workers and
> > async io (just to spin the idea, maybe somebody takes it up ...)
> That's a reasonable idea. But I wonder if at that point, we should
> just use a “real” unit-testing framework, or at least a stripped-down
> version of one, which would run functions rather than files. PHPT is
> a simple format, but it requires invoking a PHP interpreter every
> time. We only really need to do that if we expect a fatal error or
> something like that… :)

We need a PHP interpreter also for leaks, ini settings (we should
actually get rid of most of these ...), everything where we need a clean
space to register classes/functions/..., where shutdown behavior is
tested, and so much else :-) (And yes, detecting leaks via my
fastcgi/fpm approach is not trivial, but might be doable with some help
from engine/sapi.)

PHP has a nice mechanism for providing a clean environment: the
RINIT-SHUTDOWN cycle ... aka request :-)

Anyways, let's not get distracted and work on the proposal at hand and
then do the next iteration (... in ~20 years :-/ or can we speed up?)

johannes