[RFC] Escape PDO "?" parameter placeholder

  105810
May 31, 2019 11:26 php@beccati.com (Matteo Beccati)
Hi everyone,

following some recent unrest in the comments of the related PR, I've
decided to invest a bit of time on it and finally move it to draft status:

https://wiki.php.net/rfc/pdo_escape_placeholders

The PR is currently closed, but I will soon rebase and update it.

Any feedback kindly accepted!


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/
  105811
May 31, 2019 14:33 levim@php.net (Levi Morrison)
On Fri, May 31, 2019 at 5:27 AM Matteo Beccati <php@beccati.com> wrote:
> > Hi everyone, > > following some recent unrest in the comments of the related PR, I've > decided to invest a bit of time on it and finally move it to draft status: > > https://wiki.php.net/rfc/pdo_escape_placeholders > > The PR is currently closed, but I will soon rebase and update it. > > Any feedback kindly accepted! > > > Cheers > -- > Matteo Beccati > > Development & Consulting - http://www.beccati.com/ > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php
My initial thought is that unless 2 or more database vendors we support have this feature then it shouldn't get added. From our manual:
> The PHP Data Objects (PDO) extension defines a lightweight, consistent > interface for accessing databases in PHP. Each database driver that > implements the PDO interface can expose database-specific features > as regular extension functions.
As this proposal affects syntax I think such changes are unwise.
  105812
May 31, 2019 14:54 php@beccati.com (Matteo Beccati)
On 31/05/2019 16:33, Levi Morrison wrote:
> My initial thought is that unless 2 or more database vendors we > support have this feature then it shouldn't get added. From our > manual: > >> The PHP Data Objects (PDO) extension defines a lightweight, consistent >> interface for accessing databases in PHP. Each database driver that >> implements the PDO interface can expose database-specific features >> as regular extension functions.
The point is that the driver cannot expose anything in this case as the parser is global.
> As this proposal affects syntax I think such changes are unwise.
Could you please elaborate? Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
  105813
May 31, 2019 16:18 rowan.collins@gmail.com (Rowan Collins)
On Fri, 31 May 2019 at 12:27, Matteo Beccati <php@beccati.com> wrote:

> Hi everyone, > > following some recent unrest in the comments of the related PR, I've > decided to invest a bit of time on it and finally move it to draft status: > > https://wiki.php.net/rfc/pdo_escape_placeholders >
Hi Matteo, This sounds like a very useful change. As you say in the RFC, these operators are going to be more and more common for Postgres users now that they've been defined for JSON types, and users of most other DBMSes will be completely unaffected. I wonder if there's any way we can gauge the BC impact, specifically this case:
> The only exception to that is that Postgres (and possibly other RDMSs) allows the creation of custom operators: anyone having a custom “??”
operator in use would need to escape it as “????”. - Are there any standard or public Postgres extensions that define a "??" operator? e.g. anything published on https://pgxn.org/ - Are there any other databases supported by PDO that allow custom operators, or which might have a "??" operator or syntax token? If there is a risk of this affecting a reasonable number of people, we could add a deprecation notice on encountering "??" in any SQL string in 7.4, and postpone adding the escaping mechanism until 8.0. One other thing worth clarifying is the precise circumstances where ?? will be treated as an escape. Is it only when it's a separate "word" (start/end of string or surrounded by whitespace)? For instance, would "A ??= B" be "unescaped" to "A ?= B" or passed through as is? Note that this changes the scope of the BC break, as well as the usability of the chosen syntax, as both "?=" and "??=" could be defined as operators. Regards, -- Rowan Collins [IMSoP]
  105814
June 1, 2019 06:46 php@beccati.com (Matteo Beccati)
Hi Rowan,

On 31/05/2019 18:18, Rowan Collins wrote:
> This sounds like a very useful change. As you say in the RFC, these > operators are going to be more and more common for Postgres users now that > they've been defined for JSON types, and users of most other DBMSes will be > completely unaffected.
Thanks, the patch is been sitting on my HD since a few years now. Mostly no one complained about the geometric operators containing a "?", but for JSON I guess it's already a bit too late.
> I wonder if there's any way we can gauge the BC impact, specifically this > case: > >> The only exception to that is that Postgres (and possibly other RDMSs) > allows the creation of custom operators: anyone having a custom “??” > operator in use would need to escape it as “????”. > - Are there any standard or public Postgres extensions that define a "??" > operator? e.g. anything published on https://pgxn.org/
I couldn't find anything, but I can try and ask on the postgres mailing lists.
> - Are there any other databases supported by PDO that allow custom > operators, or which might have a "??" operator or syntax token?
Not that I know of. Researching that hasn't been easy, but it seems that most of the databases use "?" for positional parameters in prepared queries. Any additional help would be appreciated.
> If there is a risk of this affecting a reasonable number of people, we > could add a deprecation notice on encountering "??" in any SQL string in > 7.4, and postpone adding the escaping mechanism until 8.0.
Nice catch. I thought one couldn't have "??" but it seems the parser currently doesn't allow two consecutive positional parameter question marks and "??" was sent as-is. I'll try doing more research on the topic.
> One other thing worth clarifying is the precise circumstances where ?? will > be treated as an escape. Is it only when it's a separate "word" (start/end > of string or surrounded by whitespace)? For instance, would "A ??= B" be > "unescaped" to "A ?= B" or passed through as is? Note that this changes the > scope of the BC break, as well as the usability of the chosen syntax, as > both "?=" and "??=" could be defined as operators.
No word boundaries involved. Any occurrence of "??" outside comments will be translated into "?", so that operators containing the "?" ("A ?| B", "A ?& B") can be used. If one had defined both "?=" and "??=", they would need to be escaped in PDO as "??=" and "????=". Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
  106125
July 2, 2019 13:07 nikita.ppv@gmail.com (Nikita Popov)
On Fri, May 31, 2019 at 1:27 PM Matteo Beccati <php@beccati.com> wrote:

> Hi everyone, > > following some recent unrest in the comments of the related PR, I've > decided to invest a bit of time on it and finally move it to draft status: > > https://wiki.php.net/rfc/pdo_escape_placeholders > > The PR is currently closed, but I will soon rebase and update it. > > Any feedback kindly accepted! >
Friendly reminder that this RFC needs to go into voting until Monday (preferably earlier) to make it into 7.4. Here's my feedback: * I would prefer to make escaping not driver-sensitive, as the current implementation is. Whether ?? is interpreted as a single ? or ?? should not depend on the driver. * I would prefer to use \? instead of ?? for escaping. The former is much more easily understood by a PHP developer and has less chance of clashing with operators (PHP itself has a ?? operator, it's not so absurd to think that it also exists elsewhere). The RFC argues against this because it makes writing a literal \? harder (which would be \\\\?), but I think that a) the need for a literal \? seems rather rare and b) double-escaping is already a well-understood problem for anyone who ever used regular expressions. Regards, Nikita
  106126
July 2, 2019 14:41 thruska@cubiclesoft.com (Thomas Hruska)
On 7/2/2019 6:07 AM, Nikita Popov wrote:
> that it also exists elsewhere). The RFC argues against this because it > makes writing a literal \? harder (which would be \\\\?), but I think that
\\? should result in the correct string. \\\\? adds in an extra backslash. -- Thomas Hruska CubicleSoft President I've got great, time saving software that you will find useful. http://cubiclesoft.com/ And once you find my software useful: http://cubiclesoft.com/donate/
  106127
July 2, 2019 14:55 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Jul 2, 2019 at 4:42 PM Thomas Hruska <thruska@cubiclesoft.com>
wrote:

> On 7/2/2019 6:07 AM, Nikita Popov wrote: > > that it also exists elsewhere). The RFC argues against this because it > > makes writing a literal \? harder (which would be \\\\?), but I think > that > > \\? should result in the correct string. \\\\? adds in an extra backslash. >
I guess I should clarify that I meant use in a quoted string. In that case "\\\\?" becomes \\? after PHP is done with it, and \? after PDO is done with it. Of course, if you use nowdoc for SQL queries, then it would be just \\?. In any case, I'm also fine with the ?? escaping, this is just a preference. Nikita
  106129
July 2, 2019 18:22 php@beccati.com (Matteo Beccati)
Hi Nikita,

On 02/07/2019 15:07, Nikita Popov wrote:
> Friendly reminder that this RFC needs to go into voting until Monday > (preferably earlier) to make it into 7.4.
Thanks! Without the reminder, I would probably have missed it.
> Here's my feedback: > > * I would prefer to make escaping not driver-sensitive, as the current > implementation is. Whether ?? is interpreted as a single ? or ?? should not > depend on the driver.
Most of the feedback I had was quite the opposite (fear of desruption in the other drivers). In fact in the latest iteration, I went for PDO API setting, that lets the driver decide whether or not to enable the feature, which means only pdo_pgsql would be affected: https://github.com/mbeccati/php-src/commit/b8a9703b805e0dffd618823656c8610777efdc3e
> * I would prefer to use \? instead of ?? for escaping. The former is much > more easily understood by a PHP developer and has less chance of clashing > with operators (PHP itself has a ?? operator, it's not so absurd to think > that it also exists elsewhere). The RFC argues against this because it > makes writing a literal \? harder (which would be \\\\?), but I think that > a) the need for a literal \? seems rather rare and b) double-escaping is > already a well-understood problem for anyone who ever used regular > expressions.
Fair enough. Tbh, I have no strong preference... Would "\?" require also implementing escape of the escape? Would that require some re2c magic? Cheers -- Matteo Beccati Development & Consulting - http://www.beccati.com/
  106154
July 5, 2019 10:37 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Jul 2, 2019 at 8:22 PM Matteo Beccati <php@beccati.com> wrote:

> Hi Nikita, > > On 02/07/2019 15:07, Nikita Popov wrote: > > Friendly reminder that this RFC needs to go into voting until Monday > > (preferably earlier) to make it into 7.4. > > Thanks! Without the reminder, I would probably have missed it. > > > Here's my feedback: > > > > * I would prefer to make escaping not driver-sensitive, as the current > > implementation is. Whether ?? is interpreted as a single ? or ?? should > not > > depend on the driver. > > Most of the feedback I had was quite the opposite (fear of desruption in > the other drivers). In fact in the latest iteration, I went for PDO API > setting, that lets the driver decide whether or not to enable the > feature, which means only pdo_pgsql would be affected: > > > https://github.com/mbeccati/php-src/commit/b8a9703b805e0dffd618823656c8610777efdc3e
This sounds nice now -- but what if another database adds an operator using ? in the future? We'd have to enable support for ? escaping at that point. This would leave us with a mess where ? escaping is available or not available depending on the specific combination of database driver + PHP version you are using. As the BC concern here seems to be purely theoretical (as far as I can see), it seems better to do this for all drivers at the same time.
> * I would prefer to use \? instead of ?? for escaping. The former is much > > more easily understood by a PHP developer and has less chance of clashing > > with operators (PHP itself has a ?? operator, it's not so absurd to think > > that it also exists elsewhere). The RFC argues against this because it > > makes writing a literal \? harder (which would be \\\\?), but I think > that > > a) the need for a literal \? seems rather rare and b) double-escaping is > > already a well-understood problem for anyone who ever used regular > > expressions. > > Fair enough. Tbh, I have no strong preference... Would "\?" require also > implementing escape of the escape? Would that require some re2c magic? >
Yeah, we'd probably need to support escape of the escape for consistency, even if nobody needs it. Overall I'm okay either way here. I think \? will be more obvious for PHP programmers, but seeing the JDBC document you linked (https://jdbc.postgresql.org/documentation/head/statement.html) there is existing precedent for using ?? and it may be worthwhile to follow it. Nikita
  106155
July 5, 2019 10:51 php@beccati.com (Matteo Beccati)
Hi Nikita,

> On 5 Jul 2019, at 12:37, Nikita Popov ppv@gmail.com> wrote: > >> On Tue, Jul 2, 2019 at 8:22 PM Matteo Beccati <php@beccati.com> wrote: >> Hi Nikita, >> >> On 02/07/2019 15:07, Nikita Popov wrote: >> > Friendly reminder that this RFC needs to go into voting until Monday >> > (preferably earlier) to make it into 7.4. >> >> Thanks! Without the reminder, I would probably have missed it. >> >> > Here's my feedback: >> > >> > * I would prefer to make escaping not driver-sensitive, as the current >> > implementation is. Whether ?? is interpreted as a single ? or ?? should not >> > depend on the driver. >> >> Most of the feedback I had was quite the opposite (fear of desruption in >> the other drivers). In fact in the latest iteration, I went for PDO API >> setting, that lets the driver decide whether or not to enable the >> feature, which means only pdo_pgsql would be affected: >> >> https://github.com/mbeccati/php-src/commit/b8a9703b805e0dffd618823656c8610777efdc3e > > This sounds nice now -- but what if another database adds an operator using ? in the future? We'd have to enable support for ? escaping at that point. This would leave us with a mess where ? escaping is available or not available depending on the specific combination of database driver + PHP version you are using. As the BC concern here seems to be purely theoretical (as far as I can see), it seems better to do this for all drivers at the same time.
Makes sense and it’s a very good point.
>> > * I would prefer to use \? instead of ?? for escaping. The former is much >> > more easily understood by a PHP developer and has less chance of clashing >> > with operators (PHP itself has a ?? operator, it's not so absurd to think >> > that it also exists elsewhere). The RFC argues against this because it >> > makes writing a literal \? harder (which would be \\\\?), but I think that >> > a) the need for a literal \? seems rather rare and b) double-escaping is >> > already a well-understood problem for anyone who ever used regular >> > expressions. >> >> Fair enough. Tbh, I have no strong preference... Would "\?" require also >> implementing escape of the escape? Would that require some re2c magic? > > Yeah, we'd probably need to support escape of the escape for consistency, even if nobody needs it. Overall I'm okay either way here. I think \? will be more obvious for PHP programmers, but seeing the JDBC document you linked (https://jdbc.postgresql.org/documentation/head/statement.html) there is existing precedent for using ?? and it may be worthwhile to follow it.
Would it be possible or even recommended to add a second vote to decide which one to use? Cheers