PCRE partial matching

  104817
March 20, 2019 09:31 nikita.ppv@gmail.com (Nikita Popov)
Hi internals,

PCRE has some very nice partial matching functionality described at
https://www.pcre.org/current/doc/html/pcre2partial.html. This is useful for
streaming processing, as it allows you to distinguish between "there's
definitely no match here" and "this could match starting from position N,
but we need more data to find out".
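
To make the three outcomes concrete, here is a minimal sketch in Python. It is not PCRE and not the API proposed in the PR: it is a hand-rolled stand-in that only handles a literal needle, and the names `Outcome` and `find_literal` are hypothetical. It also ignores PCRE's distinction between soft and hard partial matching.

```python
from enum import Enum


class Outcome(Enum):
    MATCH = "match"        # the needle occurs in the data seen so far
    PARTIAL = "partial"    # the data ends with a proper prefix of the needle
    NO_MATCH = "no match"  # nothing seen so far can become a match


def find_literal(needle: str, data: str):
    """Return (Outcome, position) for a non-empty literal needle."""
    pos = data.find(needle)
    if pos != -1:
        return Outcome.MATCH, pos
    # Look for the earliest suffix of data that is a prefix of needle:
    # a match could still start there once more data arrives.
    first = max(0, len(data) - len(needle) + 1)
    for start in range(first, len(data)):
        if needle.startswith(data[start:]):
            return Outcome.PARTIAL, start
    return Outcome.NO_MATCH, -1
```

With the needle `-->`, the input `abc` is a definite non-match, while `abc--` could still match starting at position 3 once more data arrives.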

Here is a PR to expose this functionality from PHP:
https://github.com/php/php-src/pull/3969. The PR has a basic description of
the API.

What do you think?

Nikita
  104830
March 20, 2019 15:26 cananian@wikimedia.org ("C. Scott Ananian")
Looks nice to me. Combined with the PREG_LENGTH_CAPTURE option floated in
a previous post, this would easily let the wikimedia/remex-html package
parse HTML in a streaming fashion: it would fill up a buffer array, do an
incremental parse that stops as soon as a (hard) partial match is found,
move the prefix returned to the start of the buffer, wait for the buffer
to fill further, and then restart.
 --scott
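
The buffering loop described above can be sketched roughly as follows. This is not remex-html code and does not use PCRE: it is a Python stand-in that scans for a literal needle, and the names `partial_start` and `stream_events` are hypothetical. Text that can no longer be part of a match is emitted right away; only a possible partial match is carried over to the next chunk.

```python
def partial_start(needle, data):
    """Index where an incomplete occurrence of needle could begin.

    Returns len(data) when no suffix of data is a prefix of needle.
    """
    first = max(0, len(data) - len(needle) + 1)
    for start in range(first, len(data)):
        if needle.startswith(data[start:]):
            return start
    return len(data)


def stream_events(chunks, needle):
    """Yield ("text", s) and ("match", needle) events from string chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Consume every complete match currently in the buffer.
        while True:
            pos = buffer.find(needle)
            if pos == -1:
                break
            if pos:
                yield ("text", buffer[:pos])
            yield ("match", needle)
            buffer = buffer[pos + len(needle):]
        # Emit what cannot match any more; keep only the partial prefix.
        keep = partial_start(needle, buffer)
        if keep:
            yield ("text", buffer[:keep])
        buffer = buffer[keep:]
    if buffer:
        # End of stream: the pending partial match never completed.
        yield ("text", buffer)
```

For example, feeding the chunks `"ab-"`, `"->cd-"`, `"x"` with the needle `-->` holds back the trailing `-` of the first chunk until the second chunk completes the match.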

On Wed, Mar 20, 2019 at 5:32 AM Nikita Popov <nikita.ppv@gmail.com> wrote:

> Hi internals,
>
> PCRE has some very nice partial matching functionality described at
> https://www.pcre.org/current/doc/html/pcre2partial.html. This is useful for
> streaming processing, as it allows you to distinguish between "there's
> definitely no match here" and "this could match starting from position N,
> but we need more data to find out".
>
> Here is a PR to expose this functionality from PHP:
> https://github.com/php/php-src/pull/3969 The PR has a basic description of
> the API.
>
> What do you think?
>
> Nikita

-- (http://cscott.net)
  105651
May 9, 2019 10:08 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Mar 20, 2019 at 10:31 AM Nikita Popov <nikita.ppv@gmail.com> wrote:

> Hi internals,
>
> PCRE has some very nice partial matching functionality described at
> https://www.pcre.org/current/doc/html/pcre2partial.html. This is useful for
> streaming processing, as it allows you to distinguish between "there's
> definitely no match here" and "this could match starting from position N,
> but we need more data to find out".
>
> Here is a PR to expose this functionality from PHP:
> https://github.com/php/php-src/pull/3969 The PR has a basic description of
> the API.
>
> What do you think?
>
> Nikita

Any more comments on this one?

Nikita