Re: [PHP-DEV] A validator module for PHP7

  100408
September 6, 2017 11:15 rowan.collins@gmail.com (Rowan Collins)
On 6 September 2017 09:29:37 BST, Lester Caine <lester@lsces.co.uk> wrote:
>My only problem with Yasuo's latest offering is once again it adds a
>whole new set of defines that have to be mapped to existing metadata
>definitions ... That and it is a lot of longhand code using a different
>style to existing arrays. We need yet another wrapper to build these
>arrays from existing code ...
The rules have to be defined somehow, and I'm not aware of a standard format that current code is likely to follow. Unless there is already such a standard, I can't see any way to avoid existing code having to be wrapped or amended. Which is why Yasuo and I have both suggested we work together to come up with such a standard format that can be used or adapted for these different parts of the application. If you have suggestions for how the format should look, we are eager to hear them and see some examples. Regards, -- Rowan Collins [IMSoP]
  100409
September 6, 2017 11:38 danack@basereality.com (Dan Ackroyd)
On 6 September 2017 at 12:15, Rowan Collins <rowan.collins@gmail.com> wrote:

> If you have suggestions for how the format should look
Don't use a format. Just write code - see below.
> Which is why Yasuo and I have both suggested we work together
If you're going to work together and continue the conversation, please can you move this conversation elsewhere? It doesn't appear to have anything to do with PHP internals.

On 4 September 2017 at 07:33, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
> > Since I didn't get much feedbacks during the RFC discussion, I cannot tell > what part is disliked.
Yes you did. You got feedback during the discussion and also during the vote. For example: http://news.php.net/php.internals/95164

However, you continually choose to ignore that feedback. I will attempt once more to get the main point through to you. Perhaps a small amount of repetition will get it through:

This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.
This type of library should be done in PHP, not in C.

cheers
Dan
Ack

    function validateOrderAmount($value) : int
    {
        // Look for any non-digit character. The pattern must not end in "*":
        // "/[^0-9]*/" also matches the empty string, so it would reject every input.
        $count = preg_match("/[^0-9]/", $value);
        if ($count) {
            throw new InvalidOrderAmount("The order value must contain only digits.");
        }

        $value = intval($value);

        if ($value < 1) {
            throw new InvalidOrderAmount("The order value must be one or more.");
        }

        if ($value >= MAX_ORDER_AMOUNT) {
            throw new InvalidOrderAmount(
                "Order value too big. Maximum allowed value is ".MAX_ORDER_AMOUNT
            );
        }

        return $value;
    }

(I'd probably recommend not using exceptions, but instead return [$valid, $value] to allow validating multiple items without having to use exceptions for flow control.)
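The closing parenthetical above suggests returning [$valid, $value] instead of throwing. A minimal sketch of that variant follows, assuming PHP 7.1 or later. The value of MAX_ORDER_AMOUNT is hypothetical (the message never defines it), and putting the error message in the second slot on failure is an assumption made for illustration, not something the message specifies.

```php
<?php
// Hypothetical limit; the original message references MAX_ORDER_AMOUNT without defining it.
const MAX_ORDER_AMOUNT = 1000;

/**
 * Returns [true, int $value] on success, or [false, string $message] on failure.
 * Carrying the message in the second slot is an assumption of this sketch.
 */
function validateOrderAmount(string $input): array
{
    // Accept only a non-empty run of ASCII digits.
    if (preg_match('/^[0-9]+$/D', $input) !== 1) {
        return [false, 'The order value must contain only digits.'];
    }

    $value = (int) $input;

    if ($value < 1) {
        return [false, 'The order value must be one or more.'];
    }

    // Same comparison as the exception-based version above.
    if ($value >= MAX_ORDER_AMOUNT) {
        return [false, 'Order value too big. Maximum allowed value is ' . MAX_ORDER_AMOUNT];
    }

    return [true, $value];
}

// Validate several inputs and collect every failure instead of stopping at the first.
$errors = [];
foreach (['first' => '12', 'second' => 'abc', 'third' => '0'] as $field => $raw) {
    [$valid, $result] = validateOrderAmount($raw);
    if (!$valid) {
        $errors[$field] = $result;
    }
}
```

The caller decides what to do with the collected messages, so no exception is used for flow control.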
  100415
September 6, 2017 12:31 rowan.collins@gmail.com (Rowan Collins)
On 6 September 2017 12:38:03 BST, Dan Ackroyd <danack@basereality.com> wrote:
>On 6 September 2017 at 12:15, Rowan Collins <rowan.collins@gmail.com> wrote:
>
>> If you have suggestions for how the format should look
>
>Don't use a format. Just write code - see below.
I'm going to assume that the code you posted was something of a straw man, and you're not actually advocating people copy 20 lines of code for every variable they want to validate. As soon as you have some shared functions, you have some design decisions about how those functions are named, what arguments they take, etc. When used for a lot of variables, that will resemble some kind of DSL, and that's all I meant by "format".
>This type of library should be done in PHP, not in C.
I can certainly see the argument for that. In that case, how would you like to see the language evolve: Should we deprecate ext/filter, to encourage people to use better implementations in userland? Should the language include some set of basic primitives that these userland implementations can use, rather than having to endlessly reimplement things like a regex for detecting a valid integer? Or do you view ext/filter's current scope as about right, and it should be neither extended nor reduced? Regards, -- Rowan Collins [IMSoP]
  100425
September 6, 2017 20:33 danack@basereality.com (Dan Ackroyd)
On 6 September 2017 at 13:31, Rowan Collins <rowan.collins@gmail.com> wrote:
> I'm going to assume that the code you posted was something of a straw
> man, and you're not actually advocating people copy 20 lines of code for
> every variable they want to validate.
You assume wrong. No it's not, and yes I am.

I can point a junior developer at the function and they can understand it.

If I ask that junior developer to add an extra rule that doesn't currently exist, they can without having to dive into a full library of validation code.

If I need to modify the validation based on extra input (e.g. whether the user has already made several purchases, or whether they're a brand new signup), it's trivial to add that to the function.

This is one of the times where code re-use through copying and pasting is far superior to trying to make stuff "simple" by going through an array based 'specification'. It turns out that that doesn't save much time to begin with, and then becomes hard to manage when your requirements get more complicated.
> I can certainly see the argument for that. In that case, how would
> you like to see the language evolve:
I do not believe that trying to design features through an email conversation is a productive use of individual people's time. But also, this list is sent to thousands of people. The number of back-and-forth messages required to (possibly) come up with a decent design, multiplied by thousands of people, is a huge time sink.

I believe it is far more productive to come up with a good idea off-list, and then present an almost finished version for discussion, rather than design features from scratch via this list.

cheers
Dan
Ack
  100426
September 6, 2017 22:47 rowan.collins@gmail.com (Rowan Collins)
On 6 September 2017 21:33:53 BST, Dan Ackroyd <danack@basereality.com> wrote:
>On 6 September 2017 at 13:31, Rowan Collins <rowan.collins@gmail.com> wrote:
>> I'm going to assume that the code you posted was something of a straw
>> man, and you're not actually advocating people copy 20 lines of code for
>> every variable they want to validate.
>
>You assume wrong. No it's not, and yes I am.
>
>I can point a junior developer at the function and they can understand it.
>
>If I ask that junior developer to add an extra rule that doesn't
>currently exist, they can without having to dive into a full library
>of validation code.
I can certainly agree that a complex DSL might be more pain than it's worth, but copying around a regex and all its scaffolding is surely worse than a clearly named function like validate_non_negative_int? That's why I asked in my last email if you thought ext/filter should be removed, and perhaps replaced by some more straightforward primitives, or just left as it is. If a broad strategic question like that feels too much like "wasting time on the list designing features" to you, I'm not sure what you would find acceptable discussion. Regards, -- Rowan Collins [IMSoP]
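As an illustration of the kind of "clearly named function" mentioned above, here is a minimal sketch built on the existing ext/filter primitives. The function name comes from the message; the signature and the choice to return null on failure are assumptions made for this example.

```php
<?php
/**
 * Returns the integer value if $input is a non-negative decimal integer,
 * otherwise null. Signature and null-on-failure are assumptions of this sketch.
 */
function validate_non_negative_int(string $input): ?int
{
    // FILTER_VALIDATE_INT returns false on failure, including values below min_range.
    $result = filter_var(
        $input,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 0]]
    );

    return $result === false ? null : $result;
}
```

A caller can then write `validate_non_negative_int($_GET['page'] ?? '')` and check for null, without repeating a regex and its scaffolding at each call site.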
  100434
September 7, 2017 08:40 TonyMarston@hotmail.com ("Tony Marston")
"Dan Ackroyd"  wrote in message 
news:CA+kxMuSL1kEW60S7DFJb06+r2Q3rC1ueeWU1jAP78FY65aJoDg@mail.gmail.com...
> >On 6 September 2017 at 13:31, Rowan Collins collins@gmail.com> >wrote: >> I'm going to assume that the code you posted was something of a straw >> man, and you're not actually advocating people copy 20 lines of code for >> every variable they want to validate. > >You assume wrong. No it's not, and yes I am. > >I can point a junior developer at the function and they can understand it. > >If I ask that junior developer to add an extra rule that doesn't >currently exist, they can without having to dive into a full library >of validation code. > >If I need to modify the validation based on extra input (e.g whether >the user has already made several purchases, or whether they're a >brand new signup), it's trivial to add that to the function. > >This is one of the times where code re-use through copying and pasting >is far superior to trying to make stuff "simple" by going through an >array based 'specification'. It turns out that that doesn't save much >time to begin with, and then becomes hard to manage when your >requirements get more complication.
As a person who has been developing database applications for several decades and with PHP since 2003 I'd like to chip in with my 2 cents' worth.

Firstly I agree with Dan's statement: This type of library should be done in PHP, not in C.

Secondly, there is absolutely no way that you can construct a standard library which can execute all the possible validation rules that may exist. In my not inconsiderable experience there are two types of validation:

1) Primary validation, where each field is validated against the column specifications in the database to ensure that the value can be written to that column without causing an error. For example this checks that a number is a number, a date is a date, a required field is not null, etc.
2) Secondary validation, where additional validation/business rules are applied such as comparing the values from several fields. For example, to check that START_DATE is not later than END_DATE.

Primary validation is easy to automate. I have a separate class for each database table, and each class contains an array of field specifications. This is never written by hand as it is produced by my Data Dictionary which imports data from the database schema then exports that data in the form of table class files and table structure files. When data is sent to a table class for inserting or updating in the database I have written a standard validation procedure which takes two arrays - an array of field=value pairs and an array of field=specifications - and then checks that each field conforms to its specifications. This validation procedure is built into the framework and executed automatically before any data is written to the database, so requires absolutely no intervention by the developer.

Secondary validation cannot be automated, so it requires additional code to be inserted into the relevant validation method. There are several of these which are defined in my abstract table class and which are executed automatically at a predetermined point in the processing cycle. These methods are defined in the abstract class but are empty. If specific code is required then the empty method can be copied from the abstract class to the concrete class where it can be filled with the necessary code.

If there are any developers out there who are still writing code to perform primary validation then you may learn something from my implementation.

If there are any developers out there who think that secondary validation can be automated I can only say "dream on".

--
Tony Marston
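The two-level scheme described above can be sketched roughly as follows. This is not Tony's framework: the specification format, the class names (Table, ProjectTable), the method names and the field names are all hypothetical, chosen only to show a spec-driven primary check followed by an overridable, empty-by-default hook for secondary rules.

```php
<?php
// Primary validation: check each field against its specification.
// The spec keys ('type', 'size', 'required') are a hypothetical format.
function validatePrimary(array $values, array $specs): array
{
    $errors = [];
    foreach ($specs as $field => $spec) {
        $value = $values[$field] ?? null;
        if ($value === null || $value === '') {
            if (!empty($spec['required'])) {
                $errors[$field] = 'is required';
            }
            continue;
        }
        switch ($spec['type']) {
            case 'integer':
                if (filter_var($value, FILTER_VALIDATE_INT) === false) {
                    $errors[$field] = 'must be an integer';
                }
                break;
            case 'date':
                // Round-trip the value so impossible dates such as 2017-02-30 are rejected.
                $date = DateTime::createFromFormat('!Y-m-d', $value);
                if ($date === false || $date->format('Y-m-d') !== $value) {
                    $errors[$field] = 'must be a date (YYYY-MM-DD)';
                }
                break;
            case 'string':
                if (strlen($value) > $spec['size']) {
                    $errors[$field] = 'is too long';
                }
                break;
        }
    }
    return $errors;
}

abstract class Table
{
    /** @var array Field specifications; in the message these are generated from the schema. */
    protected $fieldSpec = [];

    // Empty hook for secondary (business rule) validation; concrete classes override it.
    protected function validateSecondary(array $values): array
    {
        return [];
    }

    public function validate(array $values): array
    {
        $errors = validatePrimary($values, $this->fieldSpec);
        // Business rules run only on data that already passed the primary checks.
        return $errors ?: $this->validateSecondary($values);
    }
}

class ProjectTable extends Table
{
    protected $fieldSpec = [
        'name'       => ['type' => 'string', 'size' => 40, 'required' => true],
        'start_date' => ['type' => 'date', 'required' => true],
        'end_date'   => ['type' => 'date', 'required' => true],
    ];

    protected function validateSecondary(array $values): array
    {
        // ISO 8601 dates in YYYY-MM-DD form compare correctly as strings.
        if ($values['start_date'] > $values['end_date']) {
            return ['start_date' => 'must not be later than end_date'];
        }
        return [];
    }
}
```

Calling `(new ProjectTable())->validate($row)` returns an empty array when the row passes both levels, or a field-to-message map otherwise.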
  100438
September 7, 2017 11:49 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi Tony,

On Thu, Sep 7, 2017 at 5:40 PM, Tony Marston <TonyMarston@hotmail.com>
wrote:

> "Dan Ackroyd" wrote in message news:CA+kxMuSL1kEW60S7DFJb06+r > 2Q3rC1ueeWU1jAP78FY65aJoDg@mail.gmail.com... > >> >> On 6 September 2017 at 13:31, Rowan Collins collins@gmail.com> >> wrote: >> >>> I'm going to assume that the code you posted was something of a straw >>> man, and you're not actually advocating people copy 20 lines of code for >>> every variable they want to validate. >>> >> >> You assume wrong. No it's not, and yes I am. >> >> I can point a junior developer at the function and they can understand it. >> >> If I ask that junior developer to add an extra rule that doesn't >> currently exist, they can without having to dive into a full library >> of validation code. >> >> If I need to modify the validation based on extra input (e.g whether >> the user has already made several purchases, or whether they're a >> brand new signup), it's trivial to add that to the function. >> >> This is one of the times where code re-use through copying and pasting >> is far superior to trying to make stuff "simple" by going through an >> array based 'specification'. It turns out that that doesn't save much >> time to begin with, and then becomes hard to manage when your >> requirements get more complication. >> > > As a person who has been developing database applications for several > decades and with PHP since 2003 I'd like to chip in with my 2 cent's worth. > Firstly I agree with Dan's statement: > > This type of library should be done in PHP, not in C. > > Secondly, there is absolutely no way that you can construct a standard > library which can execute all the possible validation rules that may exist. > In my not inconsiderable experience there are two types of validation: > 1) Primary validation, where each field is validated against the column > specifications in the database to ensure that the value can be written to > that column without causing an error. For example this checks that a number > is a number, a data is a date, a required field is not null, etc. 
> 2) Secondary validation, where additional validation/business rules are > applied such as comparing the values from several fields. For example, to > check that START_DATE is not later tyhan END_DATE. > > Primary validation is easy to automate. I have a separate class for each > database table, and each class contains an array of field specifications. > This is never written by hand as it is produced by my Data Dictionary which > imports data from the database schema then exports that data in the form of > table class files and table structure files. When data is sent to a table > class for inserting or updating in the database I have written a standard > validation procedure which takes two arrays - an array of field=value pairs > and a array of field=specifications - and then checks that each field > conforms to its specifications. This validation procedure is built into the > framework and executed automatically before any data is written to the > database, so requires absolutely no intervention by the developer. > > Secondary validation cannot be automated, so it requires additional code > to be inserted into the relevant validation method. There are several of > these which are defined in my abstract table class and which are executed > automatically at a predetermined point in the processing cycle. These > methods are defined in the abstract class but are empty. If specific code > is required then the empty class can be copied from the abstract class to > the concrete class where it can be filled with the necessary code. > > If there are any developers out there who are still writing code to > perform primary validation then you may learn something from my > implementation. > > If there are any developers out there who think that secondary validation > can be automated I can only say "dream on". >
Please let me explain the rationale behind input validation at the outermost trust boundary. There are three reasons why I would like to propose the validation module. All three require validation at the outermost trust boundary.

1. Security reasons
Input validation should be done in a fail-fast manner.

2. Design by Contract (DbC or Contract Programming)
For DbC to work, validation at the outermost boundary is mandatory. With DbC, all inputs are validated inside functions/methods to make sure the program executes correctly. However, almost all of these checks (in fact, all checks done by DbC support) are disabled in production. How, then, do we make sure the program works correctly? All input data must be validated at the outermost boundary when DbC is disabled. Otherwise, DbC may not work. (DbC is supposed to achieve both secure and efficient code execution.)

3. Native PHP Types
Although my validate module is designed not to do unwanted conversions, it converts basic types to PHP native types by default. (This can be disabled.) With this conversion at the outermost trust boundary, native PHP types work smoothly.

Although my current primary goal is 1, 2 and 3 are important as well.

2 is especially important. Providing DbC without a proper basic validation feature does not make much sense, and could be a disaster. Users may validate input with their own validation library, but I am pessimistic about that: users won't do proper validation because validation libraries and rules are too loose. There are too few validators that do true validation meeting the requirements for 1 and 2. IMHO, even if there are good enough validators, PHP should provide a usable validator for core features. (DbC is not implemented, though.)

I hope you understand my intentions and accept the feature in core. Features for core should be in core, IMO.
> 1) Primary validation, where each field is validated against the column
> specifications in the database to ensure that the value can be written to
> that column without causing an error. For example this checks that a number
> is a number, a date is a date, a required field is not null, etc.
> 2) Secondary validation, where additional validation/business rules are
> applied such as comparing the values from several fields. For example, to
> check that START_DATE is not later than END_DATE.

Validation rules for input, logic and database may differ. Suppose you validate "user comment" data.

Input: 0 - 10240 bytes - Input might have to allow a larger size than logic (i.e. when client-side validation is lacking).
Logic: 10 - 1024 bytes - Logic may require a smaller range as correct data.
Database: 0 - 102400 bytes - Database may allow a much larger size for future extension.

In an ideal situation all of these would be the same, but in the real world they are not.

I wouldn't aim to consolidate all validations, but I would like to avoid unnecessary incompatibilities so that different validations can cooperate where possible.

I'm very interested in PDO level validation because SQLite3 could be very dangerous (i.e. type affinity allows strings to be stored in int/float/date/etc. columns). It may be useful if PDO can simply use the "validate" module's rules or API.

BTW, input validation should only validate format (characters used, length, range, encoding) if we follow the single responsibility principle. Logical correctness is up to the logic, i.e. the Model in MVC.

Anyway, the goal is to provide a usable basic validator for core features and security. Required trade-offs may be allowed.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
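The "user comment" example in this message separates a loose format check at the input boundary from a stricter check in the application logic. A minimal sketch of that split follows. The constant and function names are hypothetical, the byte limits are the ones quoted in the message, and the UTF-8 requirement is an assumption standing in for the "encoding" check. It does not use the proposed validate module.

```php
<?php
// Limits taken from the "user comment" example; names are hypothetical.
const INPUT_MAX_BYTES = 10240;
const LOGIC_MIN_BYTES = 10;
const LOGIC_MAX_BYTES = 1024;

// Boundary check: format only (size and encoding), no business meaning.
function acceptAtBoundary(string $raw): bool
{
    // preg_match with the /u modifier fails (returns false) on invalid UTF-8.
    return strlen($raw) <= INPUT_MAX_BYTES && preg_match('//u', $raw) === 1;
}

// Logic check: the narrower range the application treats as a correct comment.
function isValidComment(string $comment): bool
{
    $length = strlen($comment);
    return $length >= LOGIC_MIN_BYTES && $length <= LOGIC_MAX_BYTES;
}
```

A five-byte comment passes the boundary check but fails the logic check, which is the kind of difference between layers the message describes.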
  100483
September 9, 2017 09:26 TonyMarston@hotmail.com ("Tony Marston")
"Yasuo Ohgaki"  wrote in message 
news:CAGa2bXa4UvkL-ZsLAB2bF05L4q_oduixSzVvYzu9nddkSVttXw@mail.gmail.com...
> >Hi Tony, >
>> >> As a person who has been developing database applications for several >> decades and with PHP since 2003 I'd like to chip in with my 2 cent's >> worth. >> Firstly I agree with Dan's statement: >> >> This type of library should be done in PHP, not in C. >> >> Secondly, there is absolutely no way that you can construct a standard >> library which can execute all the possible validation rules that may >> exist. >> In my not inconsiderable experience there are two types of validation: >> 1) Primary validation, where each field is validated against the column >> specifications in the database to ensure that the value can be written to >> that column without causing an error. For example this checks that a >> number >> is a number, a data is a date, a required field is not null, etc. >> 2) Secondary validation, where additional validation/business rules are >> applied such as comparing the values from several fields. For example, to >> check that START_DATE is not later than END_DATE. >> >> Primary validation is easy to automate. I have a separate class for each >> database table, and each class contains an array of field specifications. >> This is never written by hand as it is produced by my Data Dictionary >> which >> imports data from the database schema then exports that data in the form >> of >> table class files and table structure files. When data is sent to a table >> class for inserting or updating in the database I have written a standard >> validation procedure which takes two arrays - an array of field=value >> pairs >> and a array of field=specifications - and then checks that each field >> conforms to its specifications. This validation procedure is built into >> the >> framework and executed automatically before any data is written to the >> database, so requires absolutely no intervention by the developer. >> >> Secondary validation cannot be automated, so it requires additional code >> to be inserted into the relevant validation method. 
There are several of >> these which are defined in my abstract table class and which are executed >> automatically at a predetermined point in the processing cycle. These >> methods are defined in the abstract class but are empty. If specific code >> is required then the empty class can be copied from the abstract class to >> the concrete class where it can be filled with the necessary code. >> >> If there are any developers out there who are still writing code to >> perform primary validation then you may learn something from my >> implementation. >> >> If there are any developers out there who think that secondary validation >> can be automated I can only say "dream on". >> > >Please let me explain rationale behind input validation at outermost trust >boundary. There are 3 reasons why I would like propose the validation. All >of 3 >requires validation at outermost trust boundary. > >1. Security reasons >Input validation should be done with Fail Fast manner.
The language should only provide the basic features which allow values to be validated. That is what the filter functions are for. All that is necessary is for user input to be validated before any attempt is made to write it to the database.
>2. Design by Contract (DbC or Contract Programming) >In order DbC to work, validations at outermost boundary is mandatory. >With DbC, all inputs are validated inside functions/methods to make sure >correct program executions.
Irrelevant. DbC is a methodology which PHP was never designed to support, and I see no reason why it should. If you really want DbC then switch to a language which supports it, or use a third-party extension which provides support.
>However, almost all checks (in fact, all checks done by DbC support) >are disabled for production. How to make sure program works correctly? >All inputs data must be validated at outermost boundary when DbC is >disabled. Otherwise, DbC may not work. (DbC is supposed to achieve >both secure and efficient code execution.)
>3. Native PHP Types >Although my validate module is designed not to do unwanted conversions, >but it converts basic types to PHP native types by default. (This can be >disabled) With this conversion at outermost trust boundary, native PHP type >works >fluently.
What is the difference between a basic type and a PHP native type?
>Although, my current primary goal is 1, but 2 and 3 is important as well. > >2 is important especially. Providing DbC without proper basic validation >feature does not make much sense, and could be disaster. >Users may validate input with their own validation library, but my guess >is pessimistic. User wouldn't do proper validation due to too loose >validation libraries and rules. There are too few validators that do >true validations that meet requirements for 1 and 2. IMHO, even if >there are good enough validators, PHP should provide usable validator >for core features. (DbC is not implemented, though)
It does, in the form of the filter functions.
>I hope you understand my intentions and accept the feature in core. >Feature for core should be in core. IMO.
The filter functions are already in core. How these functions are used is down to userland code.
>> 1) Primary validation, where each field is validated against the column >specifications in the database to ensure that the value can be written to >that column without causing an error. For example this checks that a number >is a number, a data is a date, a required field is not null, etc. >> 2) Secondary validation, where additional validation/business rules are >applied such as comparing the values from several fields. For example, to >check that START_DATE is not later than END_DATE. > >Validation rules for input, logic and database may differ. >Suppose you validate "user comment" data. >Input: 0 - 10240 bytes - Input might have to allow larger size >than logic. i.e. lacks client side validation. >Logic: 10 - 1024 bytes - Logic may require smaller range as >correct data. >Database: 0 - 102400 bytes - Database may allow much larger size for future >extension. > >Under ideal situation, all of these may be the same but they are not in >real world. > >I wouldn't aim to consolidate all validations, but I would like to avoid >unnecessary >incompatibilities so that different validations can cooperate if it is >possible.
What exactly are these "unnecessary incompatibilities"?
>I'm very interested in PDO level validation because SQLite3 could be very >dangerous.
Anything which is misused can be dangerous. It is almost impossible to provide a function and prevent stupid people from misusing it.
>(i.e. Type affinity allows store strings in int/float/date/etc) It may be >useful if PDO >can simply use "validate" module's rule or API. > >BTW, Input validation should only validate format(used char, length, range, >encoding) >if we follow single responsibility principle. Logical correctness is upto >logic. i.e. Model in >MVC. > >Anyway, goal is providing usable basic validator for core features and >security.
If you wish to improve the filter functions then go ahead. Anything more than this would be a step too far.
>Required trade offs may be allowed.
Do not waste time by trying to add into core what should be done in userland code.
>Regards, > >-- >Yasuo Ohgaki >yohgaki@ohgaki.net
-- Tony Marston
  100517
September 11, 2017 08:42 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi Tony,

On Sat, Sep 9, 2017 at 6:26 PM, Tony Marston <TonyMarston@hotmail.com>
wrote:

> "Yasuo Ohgaki" wrote in message news:CAGa2bXa4UvkL-ZsLAB2bF05L > 4q_oduixSzVvYzu9nddkSVttXw@mail.gmail.com... > >> >> Hi Tony, >> >> > >> >>> As a person who has been developing database applications for several >>> decades and with PHP since 2003 I'd like to chip in with my 2 cent's >>> worth. >>> Firstly I agree with Dan's statement: >>> >>> This type of library should be done in PHP, not in C. >>> >>> Secondly, there is absolutely no way that you can construct a standard >>> library which can execute all the possible validation rules that may >>> exist. >>> In my not inconsiderable experience there are two types of validation: >>> 1) Primary validation, where each field is validated against the column >>> specifications in the database to ensure that the value can be written to >>> that column without causing an error. For example this checks that a >>> number >>> is a number, a data is a date, a required field is not null, etc. >>> 2) Secondary validation, where additional validation/business rules are >>> applied such as comparing the values from several fields. For example, to >>> check that START_DATE is not later than END_DATE. >>> >>> >>> Primary validation is easy to automate. I have a separate class for each >>> database table, and each class contains an array of field specifications. >>> This is never written by hand as it is produced by my Data Dictionary >>> which >>> imports data from the database schema then exports that data in the form >>> of >>> table class files and table structure files. When data is sent to a table >>> class for inserting or updating in the database I have written a standard >>> validation procedure which takes two arrays - an array of field=value >>> pairs >>> and a array of field=specifications - and then checks that each field >>> conforms to its specifications. 
This validation procedure is built into >>> the >>> framework and executed automatically before any data is written to the >>> database, so requires absolutely no intervention by the developer. >>> >>> Secondary validation cannot be automated, so it requires additional code >>> to be inserted into the relevant validation method. There are several of >>> these which are defined in my abstract table class and which are executed >>> automatically at a predetermined point in the processing cycle. These >>> methods are defined in the abstract class but are empty. If specific code >>> is required then the empty class can be copied from the abstract class to >>> the concrete class where it can be filled with the necessary code. >>> >>> If there are any developers out there who are still writing code to >>> perform primary validation then you may learn something from my >>> implementation. >>> >>> If there are any developers out there who think that secondary validation >>> can be automated I can only say "dream on". >>> >>> >> Please let me explain rationale behind input validation at outermost trust >> boundary. There are 3 reasons why I would like propose the validation. >> All of 3 >> requires validation at outermost trust boundary. >> >> 1. Security reasons >> Input validation should be done with Fail Fast manner. >> > > The language should only provide the basic features which allow values to > be validated. That is what the filter functions are for. All that is > necessary is for user input to be validated before any attempt is made to > write it to the database.
The reason why data should be validated at the outermost trust boundary has been explained by me and others. Validation at the database level is simply too late for security purposes. Input validation must be done at the outermost boundary for the best security. This is a secure coding best practice.

>> 2. Design by Contract (DbC or Contract Programming)
>> In order DbC to work, validations at outermost boundary is mandatory. >> With DbC, all inputs are validated inside functions/methods to make sure >> correct program executions. >> > > Irrelevant. DbC is a methodology which PHP was never designed to support, > and I see no reason why it should. If you really want DbC then switch to a > language which supports it, or use a third-party extension which provides > supports.
DbC is ad hoc. It causes no BC break and has no shortcomings. Almost all languages, including PHP, have supporting features for it in some form. Whether PHP was designed for DbC or not is irrelevant. One can totally ignore DbC support, just as some D users do, yet DbC can achieve both better security and better performance with proper design and usage. DbC is _extremely_ useful for building solid and faster apps when it is used properly.
> However, almost all checks (in fact, all checks done by DbC support) >> are disabled for production. How to make sure program works correctly? >> All inputs data must be validated at outermost boundary when DbC is >> disabled. Otherwise, DbC may not work. (DbC is supposed to achieve >> both secure and efficient code execution.) >> > > 3. Native PHP Types >> Although my validate module is designed not to do unwanted conversions, >> but it converts basic types to PHP native types by default. (This can be >> disabled) With this conversion at outermost trust boundary, native PHP >> type works >> fluently. >> > > What is the difference between a basic type and a PHP native type?
PHP native types are NULL/BOOL/INT/FLOAT/STRING/ARRAY/OBJECT. Almost all inputs in web apps are "text": data that comes from clients is "text", so it arrives as "STRING", while PHP native types are zend_bool/zend_long/double/zend_string/hash/object. A basic type in string form is almost the same as the PHP native type, but there is a small difference, e.g. 't' is TRUE, and '999999999999999999999999' is valid as an integer but not as a PHP int type.
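For comparison, the existing ext/filter validators already treat these two examples in a particular way. The short sketch below shows only current filter_var() behaviour and says nothing about how the proposed validate module would handle them; the variable names are illustrative.

```php
<?php
// A digit string too large for PHP's int is rejected rather than silently clamped.
$tooBig = filter_var('999999999999999999999999', FILTER_VALIDATE_INT);

// A digit string within range is converted to a native int.
$fits = filter_var('42', FILTER_VALIDATE_INT);

// 't' is not one of the boolean spellings ext/filter recognises
// ("1", "true", "on", "yes" and their false counterparts), so with
// FILTER_NULL_ON_FAILURE the result is null rather than true or false.
$shortTrue = filter_var('t', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
$longTrue  = filter_var('true', FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);

var_dump($tooBig, $fits, $shortTrue, $longTrue);
```

So a value that is a valid integer or boolean in some textual convention is not necessarily convertible to the corresponding PHP native type.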
>> Although, my current primary goal is 1, but 2 and 3 is important as well.
>>
>> 2 is important especially. Providing DbC without proper basic validation
>> feature does not make much sense, and could be disaster.
>> Users may validate input with their own validation library, but my guess
>> is pessimistic. User wouldn't do proper validation due to too loose
>> validation libraries and rules. There are too few validators that do
>> true validations that meet requirements for 1 and 2. IMHO, even if
>> there are good enough validators, PHP should provide usable validator
>> for core features. (DbC is not implemented, though)
>
> It does, in the form of the filter functions.
It seems you haven't tried to use the filter module seriously. It simply does not have enough features for input validation; e.g. you cannot validate "strings". I hope you understand my intentions and accept the feature in core.
>> Feature for core should be in core. IMO.
>
> The filter functions are already in core. How these functions are used is
> down to userland code.
I suppose the filter module is not used much for validation, since it cannot validate strings without my RFC for filter.

>>> 1) Primary validation, where each field is validated against the column
>>> specifications in the database to ensure that the value can be written to
>>> that column without causing an error. For example this checks that a
>>> number is a number, a date is a date, a required field is not null, etc.
>>>
>>> 2) Secondary validation, where additional validation/business rules are
>>> applied such as comparing the values from several fields. For example, to
>>> check that START_DATE is not later than END_DATE.
>>
>> Validation rules for input, logic and database may differ.
>> Suppose you validate "user comment" data.
>> Input:    0 - 10240 bytes  - Input might have to allow larger size
>>                              than logic. i.e. lacks client side validation.
>> Logic:    10 - 1024 bytes  - Logic may require smaller range as
>>                              correct data.
>> Database: 0 - 102400 bytes - Database may allow much larger size for
>>                              future extension.
>>
>> Under ideal situation, all of these may be the same but they are not in
>> real world.
>>
>> I wouldn't aim to consolidate all validations, but I would like to avoid
>> unnecessary incompatibilities so that different validations can cooperate
>> if it is possible.
>
> What exactly are these "unnecessary incompatibilities"?
I don't know yet either, but there will be some.
>> I'm very interested in PDO level validation because SQLite3 could be very
>> dangerous.
>
> Anything which is misused can be dangerous. It is almost impossible to
> provide a function and prevent stupid people from misusing it.
Correct. However, too many users ignore the fact that SQLite3 has type affinity, which allows strings to be stored in a column of any type. This is just one example where better security is possible.
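A small sketch of the type affinity behaviour mentioned above, assuming the pdo_sqlite extension is available; the table and column names are made up for illustration. A column declared INTEGER converts numeric-looking text to an integer, but silently stores non-numeric text as text.

```php
<?php
// Assumption: pdo_sqlite is installed; "orders" and "amount" are hypothetical names.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE orders (amount INTEGER)');

$stmt = $db->prepare('INSERT INTO orders (amount) VALUES (?)');
$stmt->execute(['123']);          // numeric-looking text is converted to an integer
$stmt->execute(['not a number']); // non-numeric text is stored as text, without any error

// Ask SQLite which storage class each row actually has.
$types = $db->query('SELECT typeof(amount) FROM orders ORDER BY rowid')
            ->fetchAll(PDO::FETCH_COLUMN);
print_r($types); // integer, then text
```

Checking the value with something like `filter_var($value, FILTER_VALIDATE_INT)` before the insert is one way to keep the second row out.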
>> (i.e. Type affinity allows store strings in int/float/date/etc) It may be
>> useful if PDO can simply use "validate" module's rule or API.
>>
>> BTW, Input validation should only validate format(used char, length,
>> range, encoding) if we follow single responsibility principle. Logical
>> correctness is upto logic. i.e. Model in MVC.
>>
>> Anyway, goal is providing usable basic validator for core features and
>> security.
>
> If you wish to improve the filter functions then go ahead. Anything more
> than this would be a step too far.
I already did that, with an RFC and a PoC patch: https://wiki.php.net/rfc/add_validate_functions_to_filter
>> Required trade offs may be allowed.
>
> Do not waste time by trying to add into core what should be done in
> userland code.
Proper input validation is the most important task in secure coding.
https://www.securecoding.cert.org/confluence/display/seccode/Top+10+Secure+Coding+Practices
Nonetheless, I rarely see an app that has proper input validation. It would be nice to have a module for it, with proper documentation.

"All that is necessary is for user input to be validated before any attempt is made to write it to the database."

This is fine for the database, but not for the app. There is a lot of code that does not even require a database. Even when a database is used, there are too many cases where database-level validation is too late.

BTW, a validator implemented in PHP script cannot be faster than a native C module function. As you know, function call overhead is not cheap. We have a number of array functions for this reason. Why not for validation, which must always be called?

Regards,

P.S. Many of us are confused about what application-level validation is. Application-level input validation is $_GET/$_POST/$_COOKIE/$_SERVER/$_FILES validation _before_ they are used by app code. The "validate" module is intended for this. Logic (Model in MVC) and DB-level validations are separate input validations. With proper design, one cannot be replaced by the others, i.e. Fail Fast and the Single Responsibility Principle.

--
Yasuo Ohgaki
yohgaki@ohgaki.net
  100518
September 11, 2017 09:37 php-lists@koalephant.com (Stephen Reay)
> On 11 Sep 2017, at 15:42, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
>
> It seems you haven't try to use filter module seriously.
> It simply does not have enough feature for input validations.
> e.g. You cannot validate "strings".

Yasuo,

I’ve asked previously what your proposal actually offers over the filter functions, and got no response, so please elaborate on this?

Can you show a concrete example that cannot be validated in user land currently, using the filter functions as a base?

Cheers

Stephen
  100519
September 11, 2017 10:41 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi Stephen,

On Mon, Sep 11, 2017 at 6:37 PM, Stephen Reay <php-lists@koalephant.com>
wrote:

>> On 11 Sep 2017, at 15:42, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
>>
>> It seems you haven't try to use filter module seriously.
>> It simply does not have enough feature for input validations.
>> e.g. You cannot validate "strings".
>
> Yasuo,
>
> I’ve asked previously what your proposal actually offers over the filter
> functions, and got no response, so please elaborate on this?
>
> Can you show a concrete example that cannot be validated in user land
> currently, using the filter functions as a base?
FILTER_VALIDATE_REGEXP is simply not good enough. PCRE (as well as Oniguruma) is still known to be vulnerable to regex DoS. Users should avoid regex validation whenever possible, also to avoid various other risks.

In addition, the current filter module does not provide nested array validation, array key validation, etc. It is not true validation either. It does not provide simple length or min/max validation. It does non-explicit conversions (e.g. trim), etc. Length and min/max validation is mandatory if you would like to follow ISO 27000 requirements.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
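To make the "length plus character whitelist, without regex" idea concrete, here is a minimal userland sketch. The constant and function names are hypothetical and not part of the filter extension or of the proposed "validate" module; it uses only `strlen()` and `strspn()`, so no regex engine is involved.

```php
<?php
// Hypothetical names; only strlen() and strspn() are real PHP functions here.
const ASCII_ALNUM = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

function isAsciiAlnumWithLength(string $value, int $min, int $max): bool
{
    // Length check first (byte length, which equals character count for ASCII).
    $length = strlen($value);
    if ($length < $min || $length > $max) {
        return false;
    }
    // Whitelist check: the leading run of allowed bytes must cover the whole string.
    return strspn($value, ASCII_ALNUM) === $length;
}

var_dump(isAsciiAlnumWithLength('abc123', 4, 8));    // bool(true)
var_dump(isAsciiAlnumWithLength('abc', 4, 8));       // bool(false), too short
var_dump(isAsciiAlnumWithLength('abc-123', 4, 8));   // bool(false), '-' not allowed
var_dump(isAsciiAlnumWithLength('abcdefghi', 4, 8)); // bool(false), too long
```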
  100520
September 11, 2017 15:22 php-lists@koalephant.com (Stephen Reay)
So, you still didn’t actually provide an example. I *guess* you’re talking about character class validation or something else equally “simple”, because I can’t imagine what else would be a common enough case that you’d want to have built-in rules for, and that you wouldn’t internally use RegExp to test anyway.

Ok so we can’t use filter_var() rules to validate that a string field is an Alpha or AlphaNum, between 4 and 8 characters long (technically you could pass mb_strlen() to the INT filter with {min,max}_range options set to get the length validation, but I’ll grant you that *is* kind of a crappy workaround right now)

Why not stop trying to re-invent every single feature already present in PHP (yes, I’ve been paying attention to all your other proposals), and just *add* the functionality that’s missing:

A `FILTER_VALIDATE_STRING` filter, with “Options” of `min` => ?int, `max` => ?int and “Flags” of FILTER_FLAG_ALPHA, FILTER_FLAG_NUMERIC (possibly a built in bit mask “FILTER_FLAG_ALPHANUMERIC” ?)

Lastly: it may not be the format you personally want, but the filter extension *does* have the `filter_{input,var}_array` functions. Claiming something doesn’t exist because it doesn’t work exactly how you would like it to, makes you seem immature and petty, IMO.
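For reference, the mb_strlen() workaround mentioned in this message could look roughly like the sketch below. The helper name is hypothetical, and the mbstring extension is assumed to be loaded; `filter_var()`, `FILTER_VALIDATE_INT`, `FILTER_VALIDATE_REGEXP` and the `min_range`/`max_range`/`regexp` options are the existing filter API.

```php
<?php
// Hypothetical helper; assumes the mbstring and filter extensions are loaded.
function isAlnumBetween4And8(string $value): bool
{
    // Length check: feed the character count to the INT filter with a range.
    $length = filter_var(
        mb_strlen($value, 'UTF-8'),
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 4, 'max_range' => 8]]
    );
    if ($length === false) {
        return false;
    }
    // Character class check via the existing REGEXP filter (anchored, ASCII only).
    $match = filter_var(
        $value,
        FILTER_VALIDATE_REGEXP,
        ['options' => ['regexp' => '/\A[A-Za-z0-9]+\z/']]
    );
    return $match !== false;
}

var_dump(isAlnumBetween4And8('Rowan1'));  // bool(true)
var_dump(isAlnumBetween4And8('abc'));     // bool(false), too short
var_dump(isAlnumBetween4And8('abc def')); // bool(false), space not allowed
```

Two separate calls are needed because no single existing filter covers both length and character class, which is the gap a string filter would close.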
  100523
September 11, 2017 21:07 yohgaki@ohgaki.net (Yasuo Ohgaki)
Stephen,

On Tue, Sep 12, 2017 at 12:22 AM, Stephen Reay <php-lists@koalephant.com>
wrote:

> So, you still didn’t actually provide an example. I *guess* you’re talking
> about character class validation or something else equally “simple”,
> because I can’t imagine what else would be a common enough case that you’d
> want to have built-in rules for, and that you wouldn’t internally use
> RegExp to test anyway.
Your request is like a "Devil's Proof". Example code for things that cannot be done with the existing API cannot exist in a meaningful manner. It can be explained why it cannot be done, though. Try what the "validate" string validator can do, then you'll see.

$input = [
    'defined_but_should_not_exist' => 'Developer should not allow unwanted value',
    '_invalid_utf8_key_should_not_be_allowed_' => 'Developer should validate key value as well',
    'utf8_text' => 'Validator should be able to allow UTF-8 and validate its validity at least',
    'default_must_be_safe' => 'Crackers send all kinds of chars. CNTRL chars must not be allowed by default',
    'array' => [
        'complex' => 1,
        'nested' => 'any validation rule should be able to be applied',
        'array' => 1,
        'key_should_be_validated_also' => 1,
        'array' => [
            'any_num_of_nesting' => 'is allowed',
        ],
    ],
    'array_num_elements_must_be_validated' => [
        "a", "b", "c", "d", "e", "f", "and so on", "values must be able to be validated as user wants",
    ],
];

There is no STRING validation filter currently. From this fact alone, it could be said that "filter cannot do string validation currently".

List of problems in the current validation filter:
 - no STRING validator currently
 - it allows any input by default
 - it does not allow multiple rules, which are needed for complex string validation
 - it does not have a callback validator
 - it does not have key value validation (note: PHP's key could be binary)
 - it does not validate the number of elements in an array
 - it cannot forbid unwanted elements in an array
 - it cannot validate "char encoding"
 - it does not enforce white listing
 - and so on

This is the list of things that "filter" cannot do.

> Ok so we can’t use filter_var() rules to validate that a string field is an
> Alpha or AlphaNum, between 4 and 8 characters long (technically you could
> pass mb_strlen() to the INT filter with {min,max}_range options set to get
> the length validation, but I’ll grant you that *is* kind of a crappy
> workaround right now)
>
> Why not stop trying to re-invent every single feature already present in
> PHP (yes, I’ve been paying attention to all your other proposals), and just
> *add* the functionality that’s missing:
https://wiki.php.net/rfc/add_validate_functions_to_filter
It's _declined_. You should have supported this RFC if you would like to add features to filter. (I'm glad there is a new RFC supporter, whatever the occasion.)

I don't mind this result much. Adding features to "filter" has some of the shortcomings mentioned above, even with my proposal.

> A `FILTER_VALIDATE_STRING` filter, with “Options” of `min` => ?int, `max`
> => ?int and “Flags” of FILTER_FLAG_ALPHA, FILTER_FLAG_NUMERIC (possibly a
> built in bit mask “FILTER_FLAG_ALPHANUMERIC” ?)
Simply adding these wouldn't work well as a validator because

 - Filter is designed for black listing

As you may know, all security standards/guidelines require

 - White listing for validation

We may change "filter", but it requires BC.
> Lastly: it may not be the format you personally want, but the filter
> extension *does* have the `filter_{input,var}_array` functions. Claiming
> something doesn’t exist because it doesn’t work exactly how you would like
> it to, makes you seem immature and petty, IMO.
The discussion is confusing because you ignore this RFC's result.
https://wiki.php.net/rfc/add_validate_functions_to_filter
This RFC proposes filter module improvement while keeping compatibility.

I understand your point. This is exactly the same reason why I proposed an "improvement" at first, not a new extension.

I don't understand why you repeatedly insist on an attempt that has already failed.

Would you like me to propose the previous RFC again, and implement "true validation" with filter? I don't mind implementing it if you would like to update the RFC and it passes. I must use "white list" as much as possible.

Regards,

P.S. The "filter" module is a black listing module. "Validate" is a white listing module. Even with BC, mixing them would result in confusing FLAGs and code. Code may be cleaned up later, but FLAGs cannot. We should consider this also.

--
Yasuo Ohgaki
yohgaki@ohgaki.net
  100524
September 11, 2017 21:35 lists@rhsoft.net ("lists@rhsoft.net")
Am 11.09.2017 um 23:07 schrieb Yasuo Ohgaki
> On Tue, Sep 12, 2017 at 12:22 AM, Stephen Reay <php-lists@koalephant.com>
>> So, you still didn’t actually provide an example. I *guess* you’re talking
>> about character class validation or something else equally “simple”,
>> because I can’t imagine what else would be a common enough case that you’d
>> want to have built-in rules for, and that you wouldn’t internally use
>> RegExp to test anyway.
>
> Your request is like "Devil's Proof". Example code that cannot do things
> with existing API cannot exist with meaningful manner. It can be explained
> why it cannot, though. Try what "validate" string validator can do,
> Then you'll see.
>
> There is no STRING validation filter currently. This fact alone,
> it could be said "filter cannot do string validation currently".
>
> List of problems in current validation filter

but you still fail to explain why in the world you don't try to enhance the existing filter functions, instead of inventing a new beast that finally leads to having the existing filter functions and your new stuff, which share the same intention
  100526
September 11, 2017 21:39 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi,

On Tue, Sep 12, 2017 at 6:35 AM, lists@rhsoft.net <lists@rhsoft.net> wrote:

> Am 11.09.2017 um 23:07 schrieb Yasuo Ohgaki
>
>> On Tue, Sep 12, 2017 at 12:22 AM, Stephen Reay <php-lists@koalephant.com>
>>
>>> So, you still didn’t actually provide an example. I *guess* you’re talking
>>> about character class validation or something else equally “simple”,
>>> because I can’t imagine what else would be a common enough case that you’d
>>> want to have built-in rules for, and that you wouldn’t internally use
>>> RegExp to test anyway.
>>
>> Your request is like "Devil's Proof". Example code that cannot do things
>> with existing API cannot exist with meaningful manner. It can be explained
>> why it cannot, though. Try what "validate" string validator can do,
>> Then you'll see.
>>
>> There is no STRING validation filter currently. This fact alone,
>> it could be said "filter cannot do string validation currently".
>>
>> List of problems in current validation filter
>
> but you still fail to explain why in the world you don't try to enhance
> the existing filter functions instead invent a new beast leading finally to
> have the existing filter functions and your new stuff which share the same
> intention

Why don't you read the previous RFC and the vote result?
https://wiki.php.net/rfc/add_validate_functions_to_filter

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
  100527
September 11, 2017 21:49 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi,

On Tue, Sep 12, 2017 at 6:39 AM, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:

> Why don't you read previous RFC and the vote result?
> https://wiki.php.net/rfc/add_validate_functions_to_filter
I'm a bit surprised by the fact that there are "filter improvement" supporters. You should have participated in the previous RFC discussion.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
  100529
September 11, 2017 21:56 lists@rhsoft.net ("lists@rhsoft.net")
Am 11.09.2017 um 23:49 schrieb Yasuo Ohgaki:
>> but you still fail to explain why in the world you don't try to
>> enhance the existing filter functions instead invent a new beast
>> leading finally to have the existing filter functions and your
>> new stuff which share the same intention
>
> Why don't you read previous RFC and the vote result?
> https://wiki.php.net/rfc/add_validate_functions_to_filter
>
> I'm a bit surprised by the fact there are "filter improvement" supporters.
> You should have participated in the previous RFC discussion

and i am surprised that you act *that* stubborn and obviously think that when you give the bike a new name someone will buy it, instead of really considering the contras of previous proposals
  100528
September 11, 2017 21:54 lists@rhsoft.net ("lists@rhsoft.net")
Am 11.09.2017 um 23:39 schrieb Yasuo Ohgaki:
> On Tue, Sep 12, 2017 at 6:35 AM, lists@rhsoft.net
>> but you still fail to explain why in the world you don't try to
>> enhance the existing filter functions instead invent a new beast
>> leading finally to have the existing filter functions and your new
>> stuff which share the same intention
>
> Why don't you read previous RFC and the vote result?
> https://wiki.php.net/rfc/add_validate_functions_to_filter

and why do you not take the contra arguments against how you think things should be done into consideration, instead of believing that bikeshedding it with a different name will achieve anything?

it's basically the same as your hash_hkdf() related stuff - you just ignore everybody and continue to ride a dead horse up to a level where even pure readers of the internals list have just had enough and only think "stop it guy"
  100530
September 11, 2017 22:16 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi,

On Tue, Sep 12, 2017 at 6:54 AM, lists@rhsoft.net <lists@rhsoft.net> wrote:

> Am 11.09.2017 um 23:39 schrieb Yasuo Ohgaki:
>
>> On Tue, Sep 12, 2017 at 6:35 AM, lists@rhsoft.net
>>> but you still fail to explain why in the world you don't try to
>>> enhance the existing filter functions instead invent a new beast
>>> leading finally to have the existing filter functions and your new
>>> stuff which share the same intention
>>
>> Why don't you read previous RFC and the vote result?
>> https://wiki.php.net/rfc/add_validate_functions_to_filter
>
> and why do you not take the contra arguments against how do you think
> things should be done into your considerations and believe bikeshed it with
> a different name will achieve anything?
If you understand the difference, there are huge differences with respect to behavior. The previous RFC was a halfway finished "validation"; it's far from "true validation".

> it's basically the same as your hash_hkdf() related stuff - you just ignore
> everybody and continue to ride a dead horse up to a level where even pure
> readers of the internals list just have enough and only think "stop it guy"
The hash_hkdf() discussion has finally come to a conclusion, if you haven't noticed. It is clear now that Nikita and Andrey do not understand the algorithm (including the underlying HMAC and crypto hash characteristics) and the RFC. See the relevant thread for the conclusion. (The latest one)

In short, the current hash_hkdf() not only violates RFC 5869, but also encourages extremely insecure usage, and has an unnecessarily incompatible API with respect to other hash functions.

On Tue, Sep 12, 2017 at 6:56 AM, lists@rhsoft.net <lists@rhsoft.net> wrote:

> and i am suprise that you act *that* stubborn and obviously think when you
> give the bike a new name someone will buy it instead really consider the
> contras of previous proposals

"Validate" and "filter improvement" are in fact fundamentally different proposals, i.e. Validate does true white list validation, while filter improvement goes only halfway.

Almost all apps do not implement "proper application level input validation" yet, even though all security guidelines/standards recommend or require it.

What do you mean by "stubborn"? Would you like me to try to remove "input validation" from security guidelines or standards? If you seriously think so, you're the one who should try.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
  100525
September 11, 2017 21:37 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi Stephen,

On Tue, Sep 12, 2017 at 6:07 AM, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:

> Would you like me to propose previous RFC again?
> and implement "ture validation" with filter?
> I don't mind implementing it if you would like to update the RFC and it
> passes.
> It must use "white list" as much as possible.

BTW, the patch mentioned here

https://wiki.php.net/rfc/add_validate_functions_to_filter

was seriously broken by a merge and does not work currently. I didn't look into the details, but it seems some code was lost in the merge. The previous patch was written so that there are fewer changes.

If you're serious about proposing a "filter" improvement, I don't mind making it work again. (And improving it further. It's a PoC in the first place.)

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
  100534
September 12, 2017 04:04 php-lists@koalephant.com (Stephen Reay)
I was going to give a lot of detailed replies inline, but I’ve come to the realisation it’s pointless with you. You don’t really respond to what people say, you just use their comments as jumping off points to re-post your same little rant, ad nauseam.

So here’s the summary. Don’t bother replying, because I won’t be reading it.

- I never asked for a working code example that is impossible with the current extension. I asked for a simple example of what you wanted to achieve.
- More than half the “issues” you claim with the filter extension are only “valid” if you agree that it needs to do complex array structure validation. I do not agree with this. Userland can iterate an array of rules/input and validate quite easily.
- The *actual* issues with the filter extension could be solved by improving/adding filters.
- I already agreed that a string based filter (to test character class and min/max length, etc) would be a good addition. Continuing to bring it up when others have acknowledged something doesn’t help your case, at all.
- ACCEPTING that string validation is missing, number/bool/etc. validation is still whitelisting. I don’t know how they’re implemented in C. Maybe that needs improvement. But a rule saying “validate that $X is a number between 10 and 100” or “validate that $Y is an email address” is whitelisting.
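As a rough illustration of the "userland can iterate an array of rules/input" point above, a minimal sketch follows. The rule format (a closure per key, or a nested rule array) and the function name are assumptions made up for this example, not an existing API; the mbstring extension is assumed for the UTF-8 checks.

```php
<?php
// Hypothetical rule format: key => closure returning bool, or key => nested rule array.
function validateArray(array $input, array $rules): array
{
    $errors = [];
    // Whitelist the keys: anything not named in the rules is rejected.
    foreach (array_diff_key($input, $rules) as $key => $unused) {
        $errors[] = "unexpected key: $key";
    }
    foreach ($rules as $key => $rule) {
        if (!array_key_exists($key, $input)) {
            $errors[] = "missing key: $key";
            continue;
        }
        $value = $input[$key];
        if (is_array($rule)) {
            // Nested rule set: recurse and prefix the child errors with the parent key.
            if (!is_array($value)) {
                $errors[] = "$key must be an array";
                continue;
            }
            foreach (validateArray($value, $rule) as $childError) {
                $errors[] = "$key.$childError";
            }
        } elseif (!$rule($value)) {
            $errors[] = "invalid value: $key";
        }
    }
    return $errors;
}

$rules = [
    // Valid UTF-8 string of at most 8 characters.
    'name' => function ($v) {
        return is_string($v) && mb_check_encoding($v, 'UTF-8') && mb_strlen($v, 'UTF-8') <= 8;
    },
    'address' => [
        // Exactly five ASCII digits.
        'zip' => function ($v) {
            return is_string($v) && strlen($v) === 5 && strspn($v, '0123456789') === 5;
        },
    ],
];

print_r(validateArray(['name' => 'Rowan', 'address' => ['zip' => '12345']], $rules));
print_r(validateArray(['name' => 'Rowan', 'address' => ['zip' => '1234x'], 'extra' => 1], $rules));
```

The first call prints an empty array; the second reports the unexpected `extra` key and the invalid nested `zip` value. Because nested rule sets are plain arrays, array-style callables cannot be used as rules in this sketch, only closures or function names.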
  100554
September 12, 2017 22:10 yohgaki@ohgaki.net (Yasuo Ohgaki)
On Tue, Sep 12, 2017 at 1:04 PM, Stephen Reay <php-lists@koalephant.com>
wrote:

> > On 12 Sep 2017, at 04:07, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote: > > Stephen, > > On Tue, Sep 12, 2017 at 12:22 AM, Stephen Reay <php-lists@koalephant.com> > wrote: > >> >> On 11 Sep 2017, at 17:41, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote: >> >> Hi Stephen, >> >> On Mon, Sep 11, 2017 at 6:37 PM, Stephen Reay <php-lists@koalephant.com> >> wrote: >> >> On 11 Sep 2017, at 15:42, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote: >> >> It seems you haven't try to use filter module seriously. >> It simply does not have enough feature for input validations. >> e.g. You cannot validate "strings". >> >> >> Yasuo, >> >> I’ve asked previously what your proposal actually offers over the filter >> functions, and got no response, so please elaborate on this? >> >> >> >> Can you show a concrete example that cannot be validated in user land >> currently, using the filter functions as a base? >> >> >> FILTER_VALIDATE_REGEXP is not good enough simply. >> PCRE is known that it is vulnerable to regex DoS still. (as well as >> Oniguruma) >> Users should avoid regex validation whenever it is possible also to avoid >> various >> risks. >> >> In addition, current filter module does not provide nested array >> validation >> array key validation, etc. It's not true validation neither. It does not >> provide >> simple length, min/max validations. It does non explicit conversions (i.e. >> trim), etc. >> Length, min/max validation is mandatory validation if you would like to >> follow >> ISO 27000 requirement. >> >> Regards, >> >> -- >> Yasuo Ohgaki >> yohgaki@ohgaki.net >> >> >> >> So, you still didn’t actually provide an example. I *guess* you’re >> talking about character class validation or something else equally >> “simple”, because I can’t imagine what else would be a common enough case >> that you’d want to have built-in rules for, and that you wouldn’t >> internally use RegExp to test anyway. >> > > Your request is like "Devil's Proof". 
> Example code that cannot do things with existing API cannot exist with
> meaningful manner. It can be explained why it cannot, though.
> Try what "validate" string validator can do,
> Then you'll see.
>
> $input = [
>   'defined_but_should_not_exist' => 'Developer should not allow unwanted value',
>   '_invalid_utf8_key_should_not_be_allowed_' => 'Developer should validate key value as well',
>   'utf8_text' => 'Validator should be able to allow UTF-8 and validate its validity at least',
>   'default_must_be_safe' => 'Crackers send all kinds of chars. CNTRL chars must not be allowed by default',
>   'array' => [
>     'complex' => 1,
>     'nested' => 'any validation rule should be able to be applied',
>     'array' => 1,
>     'key_should_be_validated_also' => 1,
>     'array' => [
>       'any_num_of_nesting' => 'is allowed',
>     ],
>   ],
>   'array_num_elements_must_be_validated' => [
>     "a", "b", "c", "d", "e", "f", "and so on", "values must be able to be validated as user wants",
>   ],
> ];
>
> There is no STRING validation filter currently. This fact alone,
> it could be said "filter cannot do string validation currently".
>
> List of problems in current validation filter
> - no STRING validator currently
> - it allows any inputs by default
> - it does not allow multiple rules that allows complex validation rules for string
> - it does not have callback validator
> - it does not have key value validation (note: PHP's key could be binary)
> - it does not validate num of elements in array.
> - it cannot forbids unwanted elements in array.
> - it cannot validate "char encoding".
> - it does not enforce white listing.
> - and so on
>
> These are the list that "filter" cannot do.
> > Ok so we can’t use filter_var() rules to validate that a string field is >> an Alpha or AlphaNum, between 4 and 8 characters long (technically you >> could pass mb_strlen() to the INT filter with {min,max}_range options set >> to get the length validation, but I’ll grant you that *is* kind of a crappy >> workaround right now) >> >> Why not stop trying to re-invent every single feature already present in >> PHP (yes, I’ve been paying attention to all your other proposals), and just >> *add* the functionality that’s missing: >> > > https://wiki.php.net/rfc/add_validate_functions_to_filter > It's _declined_. You should have supported this RFC if you would like to > add features to filter. > (I'm glad there is a new RFC supporter regardless of occasion) > > I don't mind this result much. > Adding features to "filter" has some of shortcomings mentioned above > even with my proposal. > > A `FILTER_VALIDATE_STRING` filter, with “Options” of `min` => ?int, `max` >> => ?int and “Flags” of FILTER_FLAG_ALPHA, FILTER_FLAG_NUMERIC (possibly a >> built in bit mask “FILTER_FLAG_ALPHANUMERIC” ?) >> > > Simply adding these wouldn't work well as validator because > > - Filter is designed for black listing > > As you may know, all of security standards/guidelines require > > - White listing for validation > > We may change "filter", but it requires BC. > > >> >> Lastly: it may not be the format you personally want, but the filter >> extension *does* have the `filter_{input,var}_array` functions. Claiming >> something doesn’t exist because it doesn’t work exactly how you would like >> it to, makes you seem immature and petty, IMO. >> > > Discussion is confusing because you ignore this RFC result. > https://wiki.php.net/rfc/add_validate_functions_to_filter > This RFC proposes filter module improvement while keeping compatibility. > > I understand your point. This exactly the same reason why I proposed > "improvement" at first, not new extension. 
> > I don't understand why you insist already failed attempt repeatedly. > > Would you like me to propose previous RFC again? > and implement "ture validation" with filter? > I don't mind implementing it if you would like to update the RFC and it > passes. > I must use "white list" as much as possible. > > Regards, > > P.S. "Filter" module is black listing module. "Validate" is white listing > module. > Even with BC, mixing them would result in confusing FLAGs and codes. > Codes may be cleaned up later, but FLAGs cannot. > We should consider this also. > > -- > Yasuo Ohgaki > yohgaki@ohgaki.net > > > > I was going to give a lot of detailed replies inline, but I’ve come to the > realisation its pointless with you. You really respond to what people say, > you just use their comments as jumping off points to re-post your same > little rant, ad nauseam. >
Maybe I shouldn't reply when a reply indicates that previous mails weren't read, but I usually reply to everything regardless. As a result, I end up replying with basically the same thing.

Since someone mentioned the hash_hkdf() mess on this thread, a short note on that. It's clearly Nikita's and Andrey's fault. They didn't read the Internet RFC fully. I have no idea why they acted as if they were ignorant of it and kept insisting on a ridiculous/insecure API that clearly violates the RFC, i.e. salt as the last optional parameter. Wrong is wrong. I cannot stop pointing out problems in a security feature (key derivation) until they are fixed. If you are curious, read RFC 5869; then you'll see why the API is so ridiculous/insecure without salt.

> So here’s the summary. Don’t both replying, because I won’t be reading it.
> - I never asked for a working code example that is impossible with the
> current extension. I asked for a simple example of what you wanted to
> achieve.

OK. My apologies for the misunderstanding. The unit tests do not cover all features yet, but you can see examples in the working "validate" module's *.phpt files.

> - More than half the “issues” you claim with the filter extension, are
> only “valid” if you agree that it needs to do complex array structure
> validation. I do not agree with this. Userland can iterate an array of
> rules/input and validate quite easily.

I totally agree that "validate" and "filter" are similar. I also totally agree that it can be done very easily in scripts. Many years ago I thought "proper application level validation" would become common sense, since it is easy, but it has not. As you can see from this discussion, many people think that "database" and/or "model" level validation is good enough for apps. This is one of my motivations; another is performance.

Validation should always be done, so module functions are suitable for both performance and documentation purposes. For example, developers provide escape APIs for security purposes even when escaping is trivial with string functions. That is good for documentation as well as performance. It's the same here.

> - The *actual* issues with the filter extension could be solved by
> improving/adding filters.

The largest issue with the filter module as a validator is that "filter is made for filtering" and that its "blacklisting nature comes from the filtering architecture". I was aware of the following issues with my filter module improvement RFC, of course. My approach back then was "it's better than nothing". There are many issues; I picked the 2 most important.

- Although it can be used for whitelisting, that is optional by design. It does not enforce whitelisting by default. Enforced whitelisting achieves much better results. Therefore, security related features should use whitelisting, e.g. MAC (Mandatory Access Control), SELinux.
- It applies an extremely dangerous default filter, which does nothing, when the user sets an invalid filter/validator, i.e. it simply passes inputs through and lets code use them. No security check at all.

Making filter's validation a true whitelist validator is possible. However, there are issues. Making filter a true validator requires a lot of BC breaks. We may have a "strict option switch", but making filter a whitelist validator isn't as simple as it may seem. A "strict option switch" would add many branches, and the code might become unmaintainable. Even without a "strict option switch", the code is based on "filtering"/"blacklisting" and requires large refactoring.

I've already tried both "filter validate improvement" and a "new validation module". From this experience, a MySQL to MySQLi like transition is the best choice, IMO. We can use whatever API/interface (e.g. spec array, flags) is easy to understand/maintain/expand.

But again, I don't mind implementing filter's validation improvement if anyone would like to spend time on an RFC. Even if it would be far from true validation, it's still better than nothing. If you would like to create a filter improvement RFC and it passes, I'll write the code for it.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
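The two issues named above (whitelisting being optional, and an invalid rule letting input through unchecked) can be avoided by a validator that denies by default. The following is a minimal userland sketch of that idea, not the API of the proposed "validate" module; the function `strictValidate`, the spec layout, and the rule names `alnum` and `digits` are all hypothetical. Lengths are counted in bytes, which equals characters for the ASCII-only rules used here.

```php
<?php

/**
 * Deny-by-default validation of a flat array of string inputs.
 *
 * $spec maps each allowed key to ['rule' => string, 'min' => int, 'max' => int].
 * Throws on an unknown rule instead of falling back to "accept anything".
 * Returns the validated values, or null if any input is rejected.
 */
function strictValidate(array $input, array $spec): ?array
{
    $checks = [
        'alnum'  => 'ctype_alnum',
        'digits' => 'ctype_digit',
    ];

    // Validate the spec first, so a typo in a rule name can never
    // silently disable a check.
    foreach ($spec as $key => $rule) {
        if (!isset($checks[$rule['rule']])) {
            throw new InvalidArgumentException("Unknown rule for '$key': {$rule['rule']}");
        }
    }

    // Whitelist the keys: anything not named in the spec is rejected.
    if (array_diff_key($input, $spec) !== []) {
        return null;
    }

    $valid = [];
    foreach ($spec as $key => $rule) {
        $value = $input[$key] ?? null;
        if (!is_string($value)) {
            return null; // missing, or not a string (e.g. a nested array)
        }
        $length = strlen($value);
        if ($length < $rule['min'] || $length > $rule['max']) {
            return null;
        }
        if (!$checks[$rule['rule']]($value)) {
            return null;
        }
        $valid[$key] = $value;
    }
    return $valid;
}
```

The design choice here is that every failure path rejects: an unexpected key, a missing key, a wrong type, a bad length, or a misconfigured rule all stop the input from reaching application code.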
  100429
September 7, 2017 04:26 yohgaki@ohgaki.net (Yasuo Ohgaki)
Hi Dan


On Wed, Sep 6, 2017 at 8:38 PM, Dan Ackroyd <danack@basereality.com> wrote:

> On 6 September 2017 at 12:15, Rowan Collins <rowan.collins@gmail.com> wrote:
>
> > If you have suggestions for how the format should look
>
> Don't use a format. Just write code - see below.
>
> > Which is why Yasuo and I have both suggested we work together
>
> If you're going to work together and continue the conversation, please
> can you move this conversation elsewhere?
>
> It doesn't appear to be actually anything to do with PHP internals.
>
> On 4 September 2017 at 07:33, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:
> >
> > Since I didn't get much feedbacks during the RFC discussion, I cannot tell
> > what part is disliked.
>
> Yes you did. You got feedback during the discussion and also during
> the vote. For example: http://news.php.net/php.internals/95164
>
> However you continually choose to ignore that feedback.
>
> I will attempt once more, to get the main point through to you.
> Perhaps a small amount of repetition, will get it through:
>
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
> This type of library should be done in PHP, not in C.
>
> cheers
> Dan
> Ack
>
>
> function validateOrderAmount($value) : int {
>     $count = preg_match("/[^0-9]*/", $value);
>
>     if ($count) {
>         throw new InvalidOrderAmount("The order value must contain only digits.");
>     }
>
>     $value = intval($value);
>
>     if ($value < 1) {
>         throw new InvalidOrderAmount("The order value must be one or more.");
>     }
>
>     if ($value >= MAX_ORDER_AMOUNT) {
>         throw new InvalidOrderAmount(
>             "Order value to big. Maximum allowed value is ".MAX_ORDER_AMOUNT
>         );
>     }
>
>     return $value;
> }
You seem to be mixing up two responsibilities:

- Input validation, which should be the input handling code's responsibility.
- Logical validation, which should be the model code's responsibility.

Please stick to the single responsibility principle, which you should be well aware of. Input handling code should only accept valid inputs and let other code use them. Other responsibilities are not the input handling code's responsibilities.

Your example code belongs in the logic, which is written as PHP script, not in input handling.

As I wrote in README.md, there are only 3 types of inputs.

1. Valid data that should be accepted.
2. Valid data that should be accepted, but is the user's mistake, e.g. a logical error like your example above.
3. Invalid. Anything other than 1 and 2 (i.e. the client cannot send these values).

The "validate" module is supposed to take care of 3, which has nothing to do with models, etc. It should validate against the input data spec, not the logical meaning of the input. If a programmer did the latter, the single responsibility principle would be broken.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
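To make the split concrete, here is a sketch of how the quoted order amount example could be divided between the two layers. It is an illustration only, not code from the "validate" module: the function names, the 9-digit input limit, and the value of MAX_ORDER_AMOUNT are assumptions. The format check uses ctype_digit() because the pattern "/[^0-9]*/" in the quoted code also matches the empty string, so preg_match() returns 1 for every input and the quoted function would always throw. Following the aside at the end of Dan's mail, the logical check returns [$valid, $value] instead of throwing, and it treats MAX_ORDER_AMOUNT itself as allowed, in line with the quoted error message.

```php
<?php

define('MAX_ORDER_AMOUNT', 1000); // assumed business limit, for illustration only

/**
 * Input handling layer: does the raw value match the input data spec
 * (1 to 9 ASCII digits)? A legitimate client never sends anything else.
 */
function isWellFormedAmount($raw): bool
{
    return is_string($raw)
        && strlen($raw) >= 1
        && strlen($raw) <= 9
        && ctype_digit($raw);
}

/**
 * Model layer: is a well-formed amount acceptable to the business logic?
 * Returns [true, int $value] on success or [false, string $message] on failure.
 */
function checkOrderAmount(string $raw): array
{
    $value = (int) $raw;

    if ($value < 1) {
        return [false, 'The order value must be one or more.'];
    }
    if ($value > MAX_ORDER_AMOUNT) {
        return [false, 'Order value too big. Maximum allowed value is ' . MAX_ORDER_AMOUNT];
    }
    return [true, $value];
}
```

In this arrangement a failure of `isWellFormedAmount()` corresponds to input type 3 (reject the request outright), while a failure of `checkOrderAmount()` corresponds to type 2 (show the user a message).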
  100433
September 7, 2017 08:08 lester@lsces.co.uk (Lester Caine)
On 07/09/17 05:26, Yasuo Ohgaki wrote:
> As I wrote in README.md, there are only 3 types of inputs.
>
> 1. Valid data should accepted.
> 2. Valid data should accepted, but user's mistake. e.g. Logical error like
> your example above.
> 3. Invalid. Anything other than 1 and 2 (i.e. Client cannot send these
> value)
>
> "validate" module is supposed to take care 3 which is nothing to do with
> models, etc.
> It should validate against input data spec, not logical meaning of the
> input. If programmer did this, single responsibility principle is broken.
BUT you require an accurate 'input data spec' in order to establish what is not part of '3' and this is the same metadata that is needed to ALSO define the 'logical checks'. Once you have established that the input data has a valid set of data you need to VALIDATE that the data is within the limits defined by the 'input data spec' and those checks ALSO apply to any subsequent processing of the data set. The 'input data spec' is important not only to your 'single validation process', but also to further processing that data prior to producing some sort of output. ( No mention of databases but in a lot of cases that is where the key metadata resides? )

My point is that the 'input data spec' is not simply a stand alone array of data only used by the validator. It is something either created by other parts of the 'logic' or it is needed to give individual responses to 'user's mistake' as per '2' ...

I understand that you want to return a 'fail' at the earliest possible point, and a single step 'validate' meets that need, but the bulk of the reasons validation should fail is because someone is trying to hack a site by creating 'user's mistakes' that pass '3' that are not handled correctly by '2'.

I think where the latest offering fails is that it now requires that any 'custom' validation needs to be written in 'C' while that same code may be needed as a PHP version as in Dan's example. The validation processing needs to be ABLE to be iterated through variable by variable once one has established that there IS a valid set of variables to work with.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  100411
September 6, 2017 11:52 lester@lsces.co.uk (Lester Caine)
On 06/09/17 12:15, Rowan Collins wrote:
> On 6 September 2017 09:29:37 BST, Lester Caine <lester@lsces.co.uk> wrote: >> My only problem with Yasuo's latest offering is once again it adds a >> whole new set of defines that have to be mapped to existing metadata >> definitions ... That and it is a lot of longhand code using a different >> style to existing arrays. We need yet another wrapper to build these >> arrays from existing code ... > The rules have to be defined somehow, and I'm not aware of a standard format that current code is likely to follow. Unless there is already such a standard, I can't see any way to avoid existing code having to be wrapped or amended. > > Which is why Yasuo and I have both suggested we work together to come up with such a standard format that can be used or adapted for these different parts of the application. If you have suggestions for how the format should look, we are eager to hear them and see some examples.
The likes of ADOdb datadict are still used as a base for metadata in projects, but PDO destroyed the standardisation that used to exist by spawning a number of competing wrappers. https://github.com/ADOdb/ADOdb has evolved from a private project to being supported by its own community and is worth reconsidering as a proper cross database standard to build on. Validation rules simply build on that base.

--
Lester Caine - G8HFL
  100412
September 6, 2017 12:00 lists@rhsoft.net ("lists@rhsoft.net")
On 06.09.2017 at 13:52, Lester Caine wrote:
> The likes of ADOdb datadict are still used as a base for metadata in
> projects, but PDO destroyed the standardisation that used to exist by
> spawning a number of competing wrappers. https://github.com/ADOdb/ADOdb
> has evolved from a private project to being supported by it's own
> community and is worth reconsidering as a proper cross database standard
> to build on. Validation rules simply build on that base

Frankly, why don't you realize that input validation DOES NOT revolve around databases at all? Databases and SQL injection are *just one* subset of it, and not every application works with a database at all.
  100414
September 6, 2017 12:11 lester@lsces.co.uk (Lester Caine)
On 06/09/17 13:00, lists@rhsoft.net wrote:
> On 06.09.2017 at 13:52, Lester Caine wrote:
>> The likes of ADOdb datadict are still used as a base for metadata in
>> projects, but PDO destroyed the standardisation that used to exist by
>> spawning a number of competing wrappers. https://github.com/ADOdb/ADOdb
>> has evolved from a private project to being supported by it's own
>> community and is worth reconsidering as a proper cross database standard
>> to build on. Validation rules simply build on that base
>
> frankly - why don't you realize that input validation DOES NOT turn
> around databases at all - databases and SQL injection are *just one*
> subset of it and not every application works with databases at all

Metadata describing a data set should be standard across all interfaces. PHP has had standards on the database interfaces for a long time, and writing different standards yet again for other interfaces is all I am objecting to. If you build metadata for a form that can then simply be dropped into the interface for a database, or passed into another storage method, then it will be a lot more usable.

--
Lester Caine - G8HFL
  100416
September 6, 2017 12:41 rowan.collins@gmail.com (Rowan Collins)
On 6 September 2017 12:52:24 BST, Lester Caine <lester@lsces.co.uk> wrote:
>On 06/09/17 12:15, Rowan Collins wrote: >> On 6 September 2017 09:29:37 BST, Lester Caine <lester@lsces.co.uk> >wrote: >> Which is why Yasuo and I have both suggested we work together to come >up with such a standard format that can be used or adapted for these >different parts of the application. If you have suggestions for how the >format should look, we are eager to hear them and see some examples. > >The likes of ADOdb datadict are still used as a base for metadata in >projects
I'm going to cut you off there, and say this one more time: please can you show us an example of how you would like it to look. I have no idea what an "ADOdb datadict" is, so if that's what you're advocating, you'll need to show us what it looks like.

Spare the commentary on what decisions we could have made differently 15 years ago, and concentrate on what we can do right now, specifically, for this particular piece of functionality.

Thank you,

--
Rowan Collins
[IMSoP]
  100413
September 6, 2017 12:05 php-lists@koalephant.com (Stephen Reay)
> On 6 Sep 2017, at 18:15, Rowan Collins <rowan.collins@gmail.com> wrote:
>
>> On 6 September 2017 09:29:37 BST, Lester Caine <lester@lsces.co.uk> wrote:
>> My only problem with Yasuo's latest offering is once again it adds a
>> whole new set of defines that have to be mapped to existing metadata
>> definitions ... That and it is a lot of longhand code using a different
>> style to existing arrays. We need yet another wrapper to build these
>> arrays from existing code ...
>
> The rules have to be defined somehow, and I'm not aware of a standard format that current code is likely to follow. Unless there is already such a standard, I can't see any way to avoid existing code having to be wrapped or amended.
>
> Which is why Yasuo and I have both suggested we work together to come up with such a standard format that can be used or adapted for these different parts of the application. If you have suggestions for how the format should look, we are eager to hear them and see some examples.
>
> Regards,
>
> --
> Rowan Collins
> [IMSoP]
Does this proposal actually offer any improvement over what's available with filter_var/etc and a userland wrapper class?

If there are deficiencies in how the existing filters work, maybe work on detailing the issues and then fixing/improving them? That way, all the userland validation currently built around them needs *possibly* minor changes to make use of new flags/options, as opposed to a completely new API to use.

Cheers

Stephen
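As a rough illustration of the "filter_var/etc and a userland wrapper class" approach, the sketch below wraps filter_var_array(). The class name `InputValidator` and its `failures()` method are hypothetical; only filter_var_array() and the FILTER_* constants are existing PHP APIs. One known limitation of this sketch: it treats a result of false as a failure, so it is not suitable for FILTER_VALIDATE_BOOLEAN, where false can be a valid value.

```php
<?php

final class InputValidator
{
    /** @var array filter_var_array() definition, keyed by field name */
    private $definition;

    public function __construct(array $definition)
    {
        $this->definition = $definition;
    }

    /**
     * Returns the names of fields that are missing or fail their filter.
     * An empty array means every defined field validated.
     */
    public function failures(array $input): array
    {
        // With $add_empty = true, missing keys come back as null and
        // failed validations come back as false.
        $result = filter_var_array($input, $this->definition, true);

        $failed = [];
        foreach ($this->definition as $field => $unused) {
            if ($result[$field] === null || $result[$field] === false) {
                $failed[] = $field;
            }
        }
        return $failed;
    }
}

// Example definition: an age between 18 and 120 and a valid email address.
$validator = new InputValidator([
    'age'   => [
        'filter'  => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 18, 'max_range' => 120],
    ],
    'email' => FILTER_VALIDATE_EMAIL,
]);
```

Calling `$validator->failures($_POST)` would then list the fields to reject or to report back to the user, while new flags or options added to the filter extension could be adopted by changing only the definition array.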