Re: [PHP-DEV] A validator module for PHP7

  100534
September 12, 2017 04:04 php-lists@koalephant.com (Stephen Reay)
> On 12 Sep 2017, at 04:07, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
>
> Stephen,
>
> On Tue, Sep 12, 2017 at 12:22 AM, Stephen Reay <php-lists@koalephant.com> wrote:
>
>> On 11 Sep 2017, at 17:41, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
>>
>> Hi Stephen,
>>
>> On Mon, Sep 11, 2017 at 6:37 PM, Stephen Reay <php-lists@koalephant.com> wrote:
>>
>>> On 11 Sep 2017, at 15:42, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
>>>
>>> It seems you haven't try to use filter module seriously.
>>> It simply does not have enough feature for input validations.
>>> e.g. You cannot validate "strings".
>>>
>>> Yasuo,
>>>
>>> I’ve asked previously what your proposal actually offers over the filter
>>> functions, and got no response, so please elaborate on this?
>>>
>>> Can you show a concrete example that cannot be validated in user land
>>> currently, using the filter functions as a base?
>>>
>>
>> FILTER_VALIDATE_REGEXP is not good enough simply.
>> PCRE is known that it is vulnerable to regex DoS still. (as well as Oniguruma)
>> Users should avoid regex validation whenever it is possible also to avoid various risks.
>>
>> In addition, current filter module does not provide nested array validation
>> array key validation, etc. It's not true validation neither. It does not provide
>> simple length, min/max validations. It does non explicit conversions (i.e. trim), etc.
>> Length, min/max validation is mandatory validation if you would like to follow
>> ISO 27000 requirement.
>>
>> Regards,
>>
>> --
>> Yasuo Ohgaki
>> yohgaki@ohgaki.net
>
> So, you still didn’t actually provide an example. I *guess* you’re talking about character class validation or something else equally “simple”, because I can’t imagine what else would be a common enough case that you’d want to have built-in rules for, and that you wouldn’t internally use RegExp to test anyway.
>
> Your request is like "Devil's Proof". Example code that cannot do things
> with existing API cannot exist with meaningful manner. It can be explained
> why it cannot, though. Try what "validate" string validator can do,
> Then you'll see.
>
> $input = [
>     'defined_but_should_not_exist' => 'Developer should not allow unwanted value',
>     '_invalid_utf8_key_should_not_be_allowed_' => 'Developer should validate key value as well',
>     'utf8_text' => 'Validator should be able to allow UTF-8 and validate its validity at least',
>     'default_must_be_safe' => 'Crackers send all kinds of chars. CNTRL chars must not be allowed by default',
>     'array' => [
>         'complex' => 1,
>         'nested' => 'any validation rule should be able to be applied',
>         'array' => 1,
>         'key_should_be_validated_also' => 1,
>         'array' => [
>             'any_num_of_nesting' => 'is allowed',
>         ],
>     ],
>     'array_num_elements_must_be_validated' => [
>         "a", "b", "c", "d", "e", "f", "and so on", "values must be able to be validated as user wants",
>     ],
> ];
>
> There is no STRING validation filter currently. This fact alone,
> it could be said "filter cannot do string validation currently".
>
> List of problems in current validation filter
> - no STRING validator currently
> - it allows any inputs by default
> - it does not allow multiple rules that allows complex validation rules for string
> - it does not have callback validator
> - it does not have key value validation (note: PHP's key could be binary)
> - it does not validate num of elements in array.
> - it cannot forbids unwanted elements in array.
> - it cannot validate "char encoding".
> - it does not enforce white listing.
> - and so on
>
> These are the list that "filter" cannot do.
>
> Ok so we can’t use filter_var() rules to validate that a string field is an Alpha or AlphaNum, between 4 and 8 characters long (technically you could pass mb_strlen() to the INT filter with {min,max}_range options set to get the length validation, but I’ll grant you that *is* kind of a crappy workaround right now)
>
> Why not stop trying to re-invent every single feature already present in PHP (yes, I’ve been paying attention to all your other proposals), and just *add* the functionality that’s missing:
>
> https://wiki.php.net/rfc/add_validate_functions_to_filter
> It's _declined_. You should have supported this RFC if you would like to add features to filter.
> (I'm glad there is a new RFC supporter regardless of occasion)
>
> I don't mind this result much.
> Adding features to "filter" has some of shortcomings mentioned above
> even with my proposal.
>
> A `FILTER_VALIDATE_STRING` filter, with “Options” of `min` => ?int, `max` => ?int and “Flags” of FILTER_FLAG_ALPHA, FILTER_FLAG_NUMERIC (possibly a built in bit mask “FILTER_FLAG_ALPHANUMERIC” ?)
>
> Simply adding these wouldn't work well as validator because
>
> - Filter is designed for black listing
>
> As you may know, all of security standards/guidelines require
>
> - White listing for validation
>
> We may change "filter", but it requires BC.
>
> Lastly: it may not be the format you personally want, but the filter extension *does* have the `filter_{input,var}_array` functions. Claiming something doesn’t exist because it doesn’t work exactly how you would like it to, makes you seem immature and petty, IMO.
>
> Discussion is confusing because you ignore this RFC result.
> https://wiki.php.net/rfc/add_validate_functions_to_filter
> This RFC proposes filter module improvement while keeping compatibility.
>
> I understand your point. This exactly the same reason why I proposed
> "improvement" at first, not new extension.
>
> I don't understand why you insist already failed attempt repeatedly.
>
> Would you like me to propose previous RFC again?
> and implement "ture validation" with filter?
> I don't mind implementing it if you would like to update the RFC and it passes.
> I must use "white list" as much as possible.
>
> Regards,
>
> P.S. "Filter" module is black listing module. "Validate" is white listing module.
> Even with BC, mixing them would result in confusing FLAGs and codes.
> Codes may be cleaned up later, but FLAGs cannot.
> We should consider this also.
>
> --
> Yasuo Ohgaki
> yohgaki@ohgaki.net
>
I was going to give a lot of detailed replies inline, but I’ve come to the realisation it’s pointless with you. You don’t really respond to what people say, you just use their comments as jumping off points to re-post your same little rant, ad nauseam.

So here’s the summary. Don’t bother replying, because I won’t be reading it.

- I never asked for a working code example that is impossible with the current extension. I asked for a simple example of what you wanted to achieve.
- More than half the “issues” you claim with the filter extension are only “valid” if you agree that it needs to do complex array structure validation. I do not agree with this. Userland can iterate an array of rules/input and validate quite easily.
- The *actual* issues with the filter extension could be solved by improving/adding filters.
- I already agreed that a string based filter (to test character class and min/max length, etc) would be a good addition. Continuing to bring it up when others have acknowledged something doesn’t help your case, at all.
- ACCEPTING that string validation is missing, number/bool validation is still whitelisting. I don’t know how they’re implemented in C. Maybe that needs improvement. But a rule saying “validate that $X is a number between 10 and 100” or “validate that $Y is an email address” is whitelisting.
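The rules mentioned in this message can be sketched with the existing filter extension. The following is only an illustration, not code from the thread: the helper names `is_int_in_range()` and `is_alnum_with_length()` are hypothetical, and the string check combines `ctype_alnum()` with the length-through-`FILTER_VALIDATE_INT` workaround described in the quoted text. It falls back to `strlen()` when mbstring is not loaded, which gives the same result for ASCII alphanumeric input.

```php
<?php
// Hypothetical helper: "validate that $X is a number between $min and $max".
function is_int_in_range($value, int $min, int $max): bool
{
    $result = filter_var($value, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => $min, 'max_range' => $max],
    ]);
    return $result !== false;
}

// Hypothetical helper: alphanumeric string whose length lies in [$min, $max].
// The length check reuses the INT filter, as the workaround in the thread suggests.
function is_alnum_with_length($value, int $min, int $max): bool
{
    if (!is_string($value) || !ctype_alnum($value)) {
        return false; // only ASCII letters and digits pass
    }
    $length = function_exists('mb_strlen') ? mb_strlen($value, 'UTF-8') : strlen($value);
    return is_int_in_range($length, $min, $max);
}

// Usage: each rule names what is allowed; everything else is rejected.
var_dump(is_int_in_range('42', 10, 100));
var_dump(is_alnum_with_length('abc123', 4, 8));
var_dump(filter_var('user@example.com', FILTER_VALIDATE_EMAIL) !== false);
```

Each rule states what is accepted and rejects the rest, which is the sense of "whitelisting" used in the last bullet above.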
  100554
September 12, 2017 22:10 yohgaki@ohgaki.net (Yasuo Ohgaki)
On Tue, Sep 12, 2017 at 1:04 PM, Stephen Reay <php-lists@koalephant.com>
wrote:

> I was going to give a lot of detailed replies inline, but I’ve come to the
> realisation its pointless with you. You really respond to what people say,
> you just use their comments as jumping off points to re-post your same
> little rant, ad nauseam.
>
Maybe I shouldn't reply when a reply indicates that my previous mails weren't read. I usually reply to everything regardless, and as a result I end up repeating basically the same thing.

Since someone mentioned the hash_hkdf() mess in this thread, a short note on that. It's clearly Nikita's and Andrey's fault. They didn't read the Internet RFC fully. I had no idea why they acted as if they were ignorant of it and kept insisting on a ridiculous/insecure API that clearly violates the RFC, i.e. salt as the last optional parameter. Wrong is wrong. I cannot stop pointing out problems in a security feature (key derivation) until it is fixed. If you are curious, read RFC 5869; then you'll see why the API is so ridiculous/insecure without salt.

> So here’s the summary. Don’t bother replying, because I won’t be reading it.
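For readers who have not seen the API in question: `hash_hkdf()` takes its arguments in the order algorithm, key material, length, info, salt, so the salt is the last and optional parameter. The sketch below re-derives the same output from the two RFC 5869 steps (extract, then expand) to show where the salt enters. It is an illustration under that reading of the RFC; `hkdf_sketch()` is a hypothetical name and the input values are arbitrary.

```php
<?php
// Hypothetical re-implementation of RFC 5869 HKDF, for illustration only.
function hkdf_sketch(string $algo, string $ikm, int $length, string $info = '', string $salt = ''): string
{
    $hashLen = strlen(hash($algo, '', true));
    if ($length < 1 || $length > 255 * $hashLen) {
        throw new InvalidArgumentException('Invalid output length');
    }
    if ($salt === '') {
        // RFC 5869: a missing salt is treated as HashLen zero bytes.
        $salt = str_repeat("\0", $hashLen);
    }
    // Step 1, extract: PRK = HMAC(salt, IKM). The salt is the HMAC key.
    $prk = hash_hmac($algo, $ikm, $salt, true);
    // Step 2, expand: T(i) = HMAC(PRK, T(i-1) | info | i).
    $okm = '';
    $block = '';
    for ($i = 1; strlen($okm) < $length; $i++) {
        $block = hash_hmac($algo, $block . $info . chr($i), $prk, true);
        $okm .= $block;
    }
    return substr($okm, 0, $length);
}

// Usage with arbitrary example inputs: the salt goes last in both calls.
$ikm  = str_repeat("\x0b", 22);
$info = 'example context';
$salt = random_bytes(16);
echo bin2hex(hash_hkdf('sha256', $ikm, 32, $info, $salt)), "\n";
echo bin2hex(hkdf_sketch('sha256', $ikm, 32, $info, $salt)), "\n";
```

Because the salt comes after two other optional parameters, a caller who wants only a salt still has to pass the length and info arguments explicitly.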
> > - I never asked for a working code example that is impossible with the > current extension. I asked for a simple example of what you wanted to > achieve. >
OK. My apologies for the misunderstanding. The unit tests do not cover all features yet, but you can see examples in the *.phpt files of the working "validate" module.
> > - More than half the “issues” you claim with the filter extension, are > only “valid” if you agree that it needs to do complex array structure > validation. I do not agree with this. Userland can iterate an array of > rules/input and validate quite easily. >
I totally agree that "validate" and "filter" are similar. I also totally agree that validation is easily done in scripts. Years ago I thought "proper application-level validation" would be common sense because it is so easy, but it is not. As you can see from this discussion, many people believe that "database" and/or "model" level validation is good enough for apps. This is one of my motivations; another is performance.

Validation should always be done, so module functions suit both performance and documentation purposes. For example, developers provide escape APIs for security purposes even when escaping is trivial with string functions. That is good for documentation as well as performance. It's the same here.

> - The *actual* issues with the filter extension could be solved by
> improving/adding filters.
>
The largest issue with the filter module as a validator is that "filter is made for filtering" and its "blacklisting nature comes from the filtering architecture".

I realized the following issues with my filter module improvement RFC, of course. My approach back then was "it's better than nothing". There are many issues; I picked the 2 most important.

- Although it can be used for whitelisting, whitelisting is optional by design. It does not enforce whitelisting by default. Enforced whitelisting achieves much better results. Therefore, security related features should use whitelisting, e.g. MAC (Mandatory Access Control), SELinux.
- It applies an extremely dangerous default filter which does nothing in case the user sets an invalid filter/validator, i.e. it simply passes inputs through and lets the code use them. No security check at all.

Making filter's validation a true whitelist validator is possible. However, there are issues. Making filter a true validator requires a lot of BC breaks. We may have a "strict option switch", but making filter a whitelist validator isn't as simple as it may seem. A "strict option switch" will add many branches, and the code might become unmaintainable. Even without a "strict option switch", the code is based on "filtering"/"blacklisting" and requires large refactoring.

I've already tried both "filter validate improvement" and "new validation module". From this experience, a MySQL to MySQLi like transition is the best choice, IMO. We can use whatever API/interface (e.g. spec array, flags) is easy to understand/maintain/expand.

But again, I don't mind implementing filter's validation improvement if anyone would like to spend time on an RFC. Even if it would be far from true validation, it's still better than nothing. If you would like to create a filter improvement RFC and it passes, I'll write the code for it.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
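The difference between optional and enforced whitelisting can be illustrated in userland on top of `filter_var()`. This is a minimal sketch, not the "validate" module's API: `validate_strict()` and its rule format are hypothetical, and it only covers scalar leaves, nested arrays, unexpected keys and an element-count limit.

```php
<?php
// Hypothetical strict validator: anything not described by $rules is rejected.
// A rule with a 'filter' key describes a scalar leaf; any other array is a nested
// spec (so a nested field literally named 'filter' is not supported here).
function validate_strict(array $input, array $rules, int $maxElements = 16): array
{
    if (count($input) > $maxElements) {
        throw new InvalidArgumentException('Too many elements');
    }
    $unexpected = array_diff_key($input, $rules);
    if ($unexpected !== []) {
        throw new InvalidArgumentException('Unexpected key(s): ' . implode(', ', array_keys($unexpected)));
    }
    $clean = [];
    foreach ($rules as $key => $rule) {
        if (!array_key_exists($key, $input)) {
            throw new InvalidArgumentException("Missing key: $key");
        }
        $value = $input[$key];
        if (!isset($rule['filter'])) {
            // Nested spec: recurse with the same strictness.
            if (!is_array($value)) {
                throw new InvalidArgumentException("Expected array for: $key");
            }
            $clean[$key] = validate_strict($value, $rule, $maxElements);
            continue;
        }
        if (!is_scalar($value)) {
            throw new InvalidArgumentException("Expected scalar for: $key");
        }
        // FILTER_NULL_ON_FAILURE makes failure distinguishable from a valid false.
        $result = filter_var($value, $rule['filter'], [
            'options' => $rule['options'] ?? [],
            'flags'   => FILTER_NULL_ON_FAILURE,
        ]);
        if ($result === null) {
            throw new InvalidArgumentException("Invalid value for: $key");
        }
        $clean[$key] = $result;
    }
    return $clean;
}

// Usage: every accepted key is declared; undeclared or invalid input throws.
$rules = [
    'age'     => ['filter' => FILTER_VALIDATE_INT, 'options' => ['min_range' => 10, 'max_range' => 100]],
    'contact' => ['email' => ['filter' => FILTER_VALIDATE_EMAIL]],
];
var_dump(validate_strict(['age' => '42', 'contact' => ['email' => 'user@example.com']], $rules));
```

Here rejection is the default and acceptance must be declared, whereas `filter_var_array()` silently drops undeclared keys instead of failing.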