Call for participation: Annotating internal function argument andreturn types

  106522
August 10, 2019 10:56 nikita.ppv@gmail.com (Nikita Popov)
Hi,

Lack of type information for internal functions in Reflection is a
long-standing issue. In PHP 8 we finally have all the necessary technical
groundwork done to support argument and return type annotations on internal
functions at scale.

The only thing left is to actually add those type annotations ... to many
hundreds of functions. This is something everyone can help with, as it does
not need expertise in C.

I've opened a sample PR to show the process:
https://github.com/php/php-src/pull/4499

Here, we take some existing arginfo structures in basic_functions.c and
convert them into stubs in basic_functions.stub.php. We can take the
argument names from the existing arginfo. To figure out the types, we need
to look at the implementation (the php.net documentation tends to lie about
details).

For example, the first function, set_time_limit is defined in
https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501.
We see that it accepts an "l" parameter, which is an int. We see that it
uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a bool.

Once this is done, we need to auto-generate new arginfo data. If you have a
development setup, this is done automatically when running "make".
Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php
ext/standard/basic_functions.stub.php".

Any help would be appreciated :)

Regards,
Nikita
  106523
August 10, 2019 11:21 php@duncanc.co.uk (Craig Duncan)
I'm happy to help.

Am I correct in thinking that the best way to locate outstanding ones is to
to search for any instances of *ZEND_BEGIN_ARG_INFO* or
*ZEND_BEGIN_ARG_INFO_EX*
in *.c files?
As once they've been converted they'll only appear in stub and header files?
  106524
August 10, 2019 12:00 php-lists@koalephant.com (Stephen Reay)
> On 10 Aug 2019, at 17:56, Nikita Popov ppv@gmail.com> wrote: > > Hi, > > Lack of type information for internal functions in Reflection is a > long-standing issue. In PHP 8 we finally have all the necessary technical > groundwork done to support argument and return type annotations on internal > functions at scale. > > The only thing left is to actually add those type annotations ... to many > hundreds of functions. This is something everyone can help with, as it does > not need expertise in C. > > I've opened a sample PR to show the process: > https://github.com/php/php-src/pull/4499 > > Here, we take some existing arginfo structures in basic_functions.c and > convert them into stubs in basic_functions.stub.php. We can take the > argument names from the existing arginfo. To figure out the types, we need > to look at the implementation (the php.net documentation tends to lie about > details). > > For example, the first function, set_time_limit is defined in > https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501. > We see that it accepts an "l" parameter, which is an int. We see that it > uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a bool. > > Once this is done, we need to auto-generate new arginfo data. If you have a > development setup, this is done automatically when running "make". > Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php > ext/standard/basic_functions.stub.php". > > Any help would be appreciated :) > > Regards, > Nikita
I’d like to help if I can. Some extensions have what appear to be conditional arginfo (for example the LDAP extension, which I *assume* is to handle building against different versions of underlying ldap library that support different features) - is there a way / need to handle those in this stub format that the generator script understands? Cheers Stephen
  106525
August 10, 2019 12:58 benjamin.morel@gmail.com (Benjamin Morel)
Hi,

Sorry if I’m missing something, but can’t the process be automated?

i.e. parse the C source files with a script to generate the stubs? If correctly done, this could be much less error-prone than manual handling.

Benjamin 

> Le 10 août 2019 à 14:00, Stephen Reay <php-lists@koalephant.com> a écrit : >
  106526
August 10, 2019 13:11 nikita.ppv@gmail.com (Nikita Popov)
On Sat, Aug 10, 2019 at 2:00 PM Stephen Reay <php-lists@koalephant.com>
wrote:

> > > On 10 Aug 2019, at 17:56, Nikita Popov ppv@gmail.com> wrote: > > > > Hi, > > > > Lack of type information for internal functions in Reflection is a > > long-standing issue. In PHP 8 we finally have all the necessary technical > > groundwork done to support argument and return type annotations on > internal > > functions at scale. > > > > The only thing left is to actually add those type annotations ... to many > > hundreds of functions. This is something everyone can help with, as it > does > > not need expertise in C. > > > > I've opened a sample PR to show the process: > > https://github.com/php/php-src/pull/4499 > > > > Here, we take some existing arginfo structures in basic_functions.c and > > convert them into stubs in basic_functions.stub.php. We can take the > > argument names from the existing arginfo. To figure out the types, we > need > > to look at the implementation (the php.net documentation tends to lie > about > > details). > > > > For example, the first function, set_time_limit is defined in > > > https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501 > . > > We see that it accepts an "l" parameter, which is an int. We see that it > > uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a > bool. > > > > Once this is done, we need to auto-generate new arginfo data. If you > have a > > development setup, this is done automatically when running "make". > > Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php > > ext/standard/basic_functions.stub.php". > > > > Any help would be appreciated :) > > > > Regards, > > Nikita > > I’d like to help if I can. > > Some extensions have what appear to be conditional arginfo (for example > the LDAP extension, which I *assume* is to handle building against > different versions of underlying ldap library that support different > features) - is there a way / need to handle those in this stub format that > the generator script understands? >
You can do the same as in the source code: #ifdef HAVE_LDAP_SASL function ldap_sasl_bind(...) {} #endif These conditions will be carried over in the generated code. Nikita
  106529
August 10, 2019 15:54 markyr@gmail.com (Mark Randall)
On 10/08/2019 11:56, Nikita Popov wrote:
> Hi, > > Lack of type information for internal functions in Reflection is a > long-standing issue. In PHP 8 we finally have all the necessary technical > groundwork done to support argument and return type annotations on internal > functions at scale.
Question - If every function needs looking through and parsing, would this be a suitable time to move from the ZPP varadic to the macros?
  106531
August 10, 2019 16:11 cmbecker69@gmx.de ("Christoph M. Becker")
On 10.08.2019 at 17:54, Mark Randall wrote:

> On 10/08/2019 11:56, Nikita Popov wrote: > >> Lack of type information for internal functions in Reflection is a >> long-standing issue. In PHP 8 we finally have all the necessary technical >> groundwork done to support argument and return type annotations on >> internal >> functions at scale. > > Question - If every function needs looking through and parsing, would > this be a suitable time to move from the ZPP varadic to the macros?
In my opionion, we should only use the macros if the performance improvement matters for the function at hand; otherwise the increased memory footprint of the macros unnecessarily bloats the executables. Thanks, Christoph
  106533
August 10, 2019 19:55 nikita.ppv@gmail.com (Nikita Popov)
On Sat, Aug 10, 2019 at 12:56 PM Nikita Popov ppv@gmail.com> wrote:

> Hi, > > Lack of type information for internal functions in Reflection is a > long-standing issue. In PHP 8 we finally have all the necessary technical > groundwork done to support argument and return type annotations on internal > functions at scale. > > The only thing left is to actually add those type annotations ... to many > hundreds of functions. This is something everyone can help with, as it does > not need expertise in C. > > I've opened a sample PR to show the process: > https://github.com/php/php-src/pull/4499 > > Here, we take some existing arginfo structures in basic_functions.c and > convert them into stubs in basic_functions.stub.php. We can take the > argument names from the existing arginfo. To figure out the types, we need > to look at the implementation (the php.net documentation tends to lie > about details). > > For example, the first function, set_time_limit is defined in > https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501. > We see that it accepts an "l" parameter, which is an int. We see that it > uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a bool. > > Once this is done, we need to auto-generate new arginfo data. If you have > a development setup, this is done automatically when running "make". > Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php > ext/standard/basic_functions.stub.php". > > Any help would be appreciated :) >
A couple of things I didn't mention but noticed during reviews: * We can't add return types to methods (at least for non-final classes), because these would force inheriting classes to specify the return type as well. So return types on methods should be specified using @return for now -- we should discuss if we want to add them as proper types (with the BC break that implies). * For by-reference parameters, very likely no type should be specified. This is because types are always input types. These are usually not constrained for reference parameters. The output type may be constrained, but there's no way to specify it in PHP right now. * Many internal functions have defaults that cannot be expressed as a constant expression, e.g. because they depend on ini settings or other parameters. In this cases by convention "$foo = UNKNOWN" is used, just to mark that the argument is optional, but doesn't have a "proper" default value. Nikita
  106536
August 11, 2019 03:32 markyr@gmail.com (Mark Randall)
On 10/08/2019 11:56, Nikita Popov wrote:
> Lack of type information for internal functions in Reflection is a > long-standing issue. In PHP 8 we finally have all the necessary technical > groundwork done to support argument and return type annotations on internal > functions at scale.
Well I tried to put together a very very ugly hack-job of a C tokeniser/parser that could infer types based on a combination of old-style ZPP and macros used to access them (https://github.com/marandall/cpti) but too many hours later I came to the conclusion that to get the job done properly would probably require building out a full C AST parser to be able to handle things like default values. I did notice a few things in the process of making it (not least that IMO GD needs several hundred things changing to throw UnexpectedValueExceptions). I noticed that in some cases, the internal constants don't match the userland constants, for example the array functions use PHP_SORT_REGULAR but the userland variants are just SORT_REGULAR. Which one would be used for the stubs? In the example from Nikita the flag names do match (PHP_OUTPUT_HANDLER_STDFLAGS). I noticed you're documenting them as string|false rather than string|bool. I assume this is something to do with parsing "false" to be an error condition? Before I re-clone the php-src repo and start editing, I just wanted to check if there had been any consideration to opening up the protocol style to add basic comments? One thing I did see in the GD library with _php_image_create_from is that ZPP is different depending on logical expressions, rather than pre-processor statements. How should these be handled? Mark Randall
  106562
August 12, 2019 18:59 d.takken@xs4all.nl (Dik Takken)
Hi,

I'd be happy to donate some time to this effort as well. If you just
assign me a file or directory (not too much at once please) I can create
a PR that covers that part of the code base.

Would that work for you?

Regards,
Dik

On 10-08-19 12:56, Nikita Popov wrote:
> Hi, > > Lack of type information for internal functions in Reflection is a > long-standing issue. In PHP 8 we finally have all the necessary technical > groundwork done to support argument and return type annotations on internal > functions at scale. > > The only thing left is to actually add those type annotations ... to many > hundreds of functions. This is something everyone can help with, as it does > not need expertise in C. > > I've opened a sample PR to show the process: > https://github.com/php/php-src/pull/4499 > > Here, we take some existing arginfo structures in basic_functions.c and > convert them into stubs in basic_functions.stub.php. We can take the > argument names from the existing arginfo. To figure out the types, we need > to look at the implementation (the php.net documentation tends to lie about > details). > > For example, the first function, set_time_limit is defined in > https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501. > We see that it accepts an "l" parameter, which is an int. We see that it > uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a bool. > > Once this is done, we need to auto-generate new arginfo data. If you have a > development setup, this is done automatically when running "make". > Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php > ext/standard/basic_functions.stub.php". > > Any help would be appreciated :) > > Regards, > Nikita >
  106704
August 26, 2019 13:10 nikita.ppv@gmail.com (Nikita Popov)
On Sat, Aug 10, 2019 at 12:56 PM Nikita Popov ppv@gmail.com> wrote:

> Hi, > > Lack of type information for internal functions in Reflection is a > long-standing issue. In PHP 8 we finally have all the necessary technical > groundwork done to support argument and return type annotations on internal > functions at scale. > > The only thing left is to actually add those type annotations ... to many > hundreds of functions. This is something everyone can help with, as it does > not need expertise in C. > > I've opened a sample PR to show the process: > https://github.com/php/php-src/pull/4499 > > Here, we take some existing arginfo structures in basic_functions.c and > convert them into stubs in basic_functions.stub.php. We can take the > argument names from the existing arginfo. To figure out the types, we need > to look at the implementation (the php.net documentation tends to lie > about details). > > For example, the first function, set_time_limit is defined in > https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501. > We see that it accepts an "l" parameter, which is an int. We see that it > uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a bool. > > Once this is done, we need to auto-generate new arginfo data. If you have > a development setup, this is done automatically when running "make". > Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php > ext/standard/basic_functions.stub.php". > > Any help would be appreciated :) > > Regards, > Nikita >
Thanks everyone for your help with this project! I think at this point we have about half of the extensions covered, here's a hopefully up to date list of the extensions that are done and those that are still missing stubs: Complete: ext/bcmath ext/bz2 ext/calendar ext/com_dotnet ext/ctype ext/curl ext/date ext/enchant ext/filter ext/ftp ext/gd ext/gettext ext/gmp ext/iconv ext/json ext/openssl ext/pcre ext/posix ext/readline ext/shmop ext/sqlite3 ext/sysvmsg ext/sysvsem ext/sysvshm ext/tokenizer ext/zip ext/zlib Pending PRs: ext/exif ext/fileinfo ext/ldap ext/session ext/simplexml Incomplete: ext/standard Missing: ext/dba ext/dom ext/ffi ext/hash ext/imap ext/intl ext/libxml ext/mbstring ext/mysqli ext/mysqlnd ext/oci8 ext/odbc ext/opcache ext/pcntl ext/pdo ext/pdo_* ext/pgsql ext/phar ext/pspell ext/reflection ext/snmp ext/soap ext/sockets ext/sodium ext/spl ext/tidy ext/xml ext/xmlreader ext/xmlrpc ext/xmlwriter ext/xsl Regards, Nikita
  106705
August 26, 2019 16:02 me@ryanmccullagh.com ("Ryan McCullagh")
I would be happy to help annotate the functions from the extensions in the Missing list. One question. Should we be using the documentation as the source of truth for parameter types, and return values?



On Mon, Aug 26, 2019, at 8:11 AM, Nikita Popov wrote:
> On Sat, Aug 10, 2019 at 12:56 PM Nikita Popov ppv@gmail.com> wrote: > > > Hi, > > > > Lack of type information for internal functions in Reflection is a > > long-standing issue. In PHP 8 we finally have all the necessary technical > > groundwork done to support argument and return type annotations on internal > > functions at scale. > > > > The only thing left is to actually add those type annotations ... to many > > hundreds of functions. This is something everyone can help with, as it does > > not need expertise in C. > > > > I've opened a sample PR to show the process: > > https://github.com/php/php-src/pull/4499 > > > > Here, we take some existing arginfo structures in basic_functions.c and > > convert them into stubs in basic_functions.stub.php. We can take the > > argument names from the existing arginfo. To figure out the types, we need > > to look at the implementation (the php.net documentation tends to lie > > about details). > > > > For example, the first function, set_time_limit is defined in > > https://github.com/php/php-src/blob/172c71980df0fe4c9d62c3365f0a2cdb139e3e86/main/main.c#L1501. > > We see that it accepts an "l" parameter, which is an int. We see that it > > uses RETVAL_TRUE and RETVAL_FALSE, which means the return value is a bool. > > > > Once this is done, we need to auto-generate new arginfo data. If you have > > a development setup, this is done automatically when running "make". > > Otherwise, it's possible to manually run "php scripts/dev/gen_stub.php > > ext/standard/basic_functions.stub.php". > > > > Any help would be appreciated :) > > > > Regards, > > Nikita > > > > Thanks everyone for your help with this project! > > I think at this point we have about half of the extensions covered, here's > a hopefully up to date list of the extensions that are done and those that > are still missing stubs: > > Complete: > ext/bcmath > ext/bz2 > ext/calendar > ext/com_dotnet > ext/ctype > ext/curl > ext/date > ext/enchant > ext/filter > ext/ftp > ext/gd > ext/gettext > ext/gmp > ext/iconv > ext/json > ext/openssl > ext/pcre > ext/posix > ext/readline > ext/shmop > ext/sqlite3 > ext/sysvmsg > ext/sysvsem > ext/sysvshm > ext/tokenizer > ext/zip > ext/zlib > > Pending PRs: > ext/exif > ext/fileinfo > ext/ldap > ext/session > ext/simplexml > > Incomplete: > ext/standard > > Missing: > ext/dba > ext/dom > ext/ffi > ext/hash > ext/imap > ext/intl > ext/libxml > ext/mbstring > ext/mysqli > ext/mysqlnd > ext/oci8 > ext/odbc > ext/opcache > ext/pcntl > ext/pdo > ext/pdo_* > ext/pgsql > ext/phar > ext/pspell > ext/reflection > ext/snmp > ext/soap > ext/sockets > ext/sodium > ext/spl > ext/tidy > ext/xml > ext/xmlreader > ext/xmlrpc > ext/xmlwriter > ext/xsl > > Regards, > Nikita >
  106706
August 26, 2019 16:07 george.banyard@gmail.com ("G. P. B.")
On Mon, 26 Aug 2019 at 18:02, Ryan McCullagh <me@ryanmccullagh.com> wrote:

> I would be happy to help annotate the functions from the extensions in the > Missing list. One question. Should we be using the documentation as the > source of truth for parameter types, and return values? > > No, the documentation is at times out of sync or doesn't inform about
failure return types, the source of truth is the implementation of these functions. Best regards George P. Banyard
  106711
August 27, 2019 22:00 d.takken@xs4all.nl (Dik Takken)
On 26-08-19 15:10, Nikita Popov wrote:
> Thanks everyone for your help with this project!
It had been quiet on the list since I offered to help out, so I completely missed people were already working on this. To avoid any duplication of work, just assign me a couple of extensions that nobody is working on yet and I will see what I can do. Regards, Dik Takken
  106712
August 28, 2019 00:42 theodorejb@outlook.com (Theodore Brown)
On Tue, Aug 27, 2019 at 5:00 PM Dik Takken takken@xs4all.nl> wrote:

> It had been quiet on the list since I offered to help out, so I > completely missed people were already working on this. > > To avoid any duplication of work, just assign me a couple of extensions > that nobody is working on yet and I will see what I can do.
There's no need to wait to be assigned - feel free to pick any of the extensions that haven't been migrated yet and don't have pending PRs. For the bigger extensions that take longer to migrate, it's okay to submit a WIP pull request to avoid duplication of work. Best regards, Theodore