JSON_THROW_ON_ERROR implementation detail

  105653
May 9, 2019 16:06 Danack@basereality.com (Dan Ackroyd)
Hi,

Apparently there is an implementation detail in JSON_THROW_ON_ERROR
that differs in the RFC text, from the discussion on list
http://news.php.net/php.internals/100569:

> I decided to reset it to no error because there's no > previous error that needs handling by error-checking code, the exception > takes care of that.
RFC text:
> When passed this flag, the error behaviour of these functions is changed. > The global error state is left untouched, and if an error occurs that would > otherwise set it, these functions instead throw a JsonException with the > message and code set to whatever json_last_error() and json_last_error_msg() > would otherwise be respectively.
The implementation is highly surprising and can lead to bugs. Imagine this code exists in a code base. --------------------- $foo = json_decode('[bar'); // many lines of code. // many lines of code. // many lines of code. $bar = json_encode('Hello world!'); if (json_last_error() !== JSON_ERROR_NONE) { throw new Exception("encoding went wrong!"); } echo "All is fine"; // output is "All is fine" --------------------- And to start moving to have json_encode throw on error, they start using the flag on the json_encode ---------------------- $foo = json_decode('[bar'); // many lines of code. // many lines of code. // many lines of code. $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); if (json_last_error() !== JSON_ERROR_NONE) { throw new Exception("encoding went wrong!"); } echo "All is fine"; ------------------- The result of this is that the user throw exception will be thrown even though the encode worked correctly. https://3v4l.org/JJD5g This is highly surprising*, and doesn't seem the right thing to be doing. Does anyone have any objection to changing this, so that json_last_error() + json_last_error_msg() always refer to the most recent json_encode()/json_decode() function, as per the discussion. Or does this need an RFC? I've logged a bug for this regardless - https://bugs.php.net/bug.php?id=77997 cheers Dan Ack * https://en.wikipedia.org/wiki/Principle_of_least_astonishment
  105665
May 10, 2019 14:42 bishop@php.net (Bishop Bettini)
On Thu, May 9, 2019 at 12:06 PM Dan Ackroyd <Danack@basereality.com> wrote:

> Apparently there is an implementation detail in JSON_THROW_ON_ERROR > that differs in the RFC text, from the discussion on list > http://news.php.net/php.internals/100569: > > > I decided to reset it to no error because there's no > > previous error that needs handling by error-checking code, the exception > > takes care of that. > > RFC text: > > > When passed this flag, the error behaviour of these functions is changed. > > The global error state is left untouched, and if an error occurs that > would > > otherwise set it, these functions instead throw a JsonException with the > > message and code set to whatever json_last_error() and > json_last_error_msg() > > would otherwise be respectively. > > The implementation is highly surprising and can lead to bugs. > > Imagine this code exists in a code base. > > --------------------- > > $foo = json_decode('[bar'); > // many lines of code. > // many lines of code. > // many lines of code. > > $bar = json_encode('Hello world!'); > if (json_last_error() !== JSON_ERROR_NONE) { > throw new Exception("encoding went wrong!"); > } > > echo "All is fine"; > // output is "All is fine" > --------------------- > > > And to start moving to have json_encode throw on error, they start > using the flag on the json_encode > > ---------------------- > > $foo = json_decode('[bar'); > // many lines of code. > // many lines of code. > // many lines of code. > > $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); > if (json_last_error() !== JSON_ERROR_NONE) { > throw new Exception("encoding went wrong!"); > } > > echo "All is fine"; > > ------------------- > > The result of this is that the user throw exception will be thrown > even though the encode worked correctly. https://3v4l.org/JJD5g > > This is highly surprising*, and doesn't seem the right thing to be doing. >
The new code has what looks like, to me, a refactor bug. I'd expect that, when one adds the JSON_THROW_ON_ERROR flag, one must also remove the subsequent json_last_error() handling, as it's no longer relevant to that call and instead relevant to the most recent prior call. In this case, since the most recent prior call had no error handling itself, it now gets one attached to it. I don't find this action at a distance surprising: it's one of the known intrinsic downsides of a global-last-error mechanism. IMO, that this code now fails is a good thing, because now the author knows to error check after json_decode().
> Does anyone have any objection to changing this, so that > json_last_error() + json_last_error_msg() always refer to the most > recent json_encode()/json_decode() function, as per the discussion. Or > does this need an RFC? >
I think I'd rather see this handled in the documentation. "Note that you should review use of json_last_error() and json_last_error_msg() error handling when adding or removing the JSON_THROW_ON_ERROR flag." If there was to be any language change, I don't think we can't just change the behavior. Extant 7.3 code might be: // Run through an encode/decode cycle to validate the user-supplied data's valid. We don't // support nested arrays on user data, so we set a depth of 1. // @throws \App\Json\DecodeException when the user-supplied data is invalid json or is nested // @throws \JsonException when the data cannot be stored in valid form (eg, bad UTF-8 characters) $valid_data = json_encode(json_decode($user_data, 0, 1), JSON_THROW_ON_ERROR); if (JSON_ERROR_NONE !== json_last_error()) { throw new App\Json\DecodeException(json_last_error_msg(), json_last_error()); } Here, the author expects the error from json_decode() to flow through json_encode(). Yes, this code could be better written to separate out the branches, but it's plausible (IMO) and would be subject to a BC break by a language change on json_last_error() and JSON_THROW_ON_ERROR semantics. We could keep the last error from non-throw, and last error from throw, as separate global states, then provide either new functions to access each, or a boolean flag to json_last_error($includeLastExceptionError = false): int, but both of those seem like fitting a problem to a solution. In summary, mixed error handling interaction has a couple of possible implementations. We chose one, with its pros and cons. The other implementation has its pros and cons, as well. I don't see a reason to swap one set of consequences for another, esp. now that we've already walked down the road of choice. I'd rather we just raise awareness in the form of documentation, and let authors move their code toward one mechanism, or the other.
  105666
May 10, 2019 18:52 riikka.kalliomaki@riimu.net (=?UTF-8?Q?Riikka_Kalliom=C3=A4ki?=)
> The new code has what looks like, to me, a refactor bug. I'd expect that, when one adds the JSON_THROW_ON_ERROR flag, one must also remove the
subsequent json_last_error() handling, as it's no longer relevant to that call and instead relevant to the most recent prior call. This does create a kind of an unfortunate situation with vendor libraries, however. Any library that accepts arbitrary json flags and uses json_last_error() for error handling will be broken if JSON_THROW_ON_ERROR is passed as a flag. This means that library consumers effectively may have to become aware of how a library internally deals with json errors. It kind of turns JSON_THROW_ON_ERROR into unconventional BC break, because any code that takes arbitrary flags and uses json_last_error() needs to account for it. In some cases, I imagine, it can be necessary to add a "json_encode(null);" before the actual json handling just to reset the error state, just in case. I wouldn't be surprised if this becomes a source of unexpected bugs in json handling code down the line. In my opinion, it's very unexpected behaviour that json_last_error() can reflect the state prior to the previous json_encode/json_decode call depending on the flags passed to the function as this has not been the case before. Perhaps it should be documented in the BC breaks section for 7.3 that json_last_error() may now reflect the state of arbitrary previous call? On Fri, May 10, 2019 at 5:43 PM Bishop Bettini <bishop@php.net> wrote:
> > On Thu, May 9, 2019 at 12:06 PM Dan Ackroyd <Danack@basereality.com> wrote: > > > Apparently there is an implementation detail in JSON_THROW_ON_ERROR > > that differs in the RFC text, from the discussion on list > > http://news.php.net/php.internals/100569: > > > > > I decided to reset it to no error because there's no > > > previous error that needs handling by error-checking code, the exception > > > takes care of that. > > > > RFC text: > > > > > When passed this flag, the error behaviour of these functions is changed. > > > The global error state is left untouched, and if an error occurs that > > would > > > otherwise set it, these functions instead throw a JsonException with the > > > message and code set to whatever json_last_error() and > > json_last_error_msg() > > > would otherwise be respectively. > > > > The implementation is highly surprising and can lead to bugs. > > > > Imagine this code exists in a code base. > > > > --------------------- > > > > $foo = json_decode('[bar'); > > // many lines of code. > > // many lines of code. > > // many lines of code. > > > > $bar = json_encode('Hello world!'); > > if (json_last_error() !== JSON_ERROR_NONE) { > > throw new Exception("encoding went wrong!"); > > } > > > > echo "All is fine"; > > // output is "All is fine" > > --------------------- > > > > > > And to start moving to have json_encode throw on error, they start > > using the flag on the json_encode > > > > ---------------------- > > > > $foo = json_decode('[bar'); > > // many lines of code. > > // many lines of code. > > // many lines of code. > > > > $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); > > if (json_last_error() !== JSON_ERROR_NONE) { > > throw new Exception("encoding went wrong!"); > > } > > > > echo "All is fine"; > > > > ------------------- > > > > The result of this is that the user throw exception will be thrown > > even though the encode worked correctly. https://3v4l.org/JJD5g > > > > This is highly surprising*, and doesn't seem the right thing to be doing. > > > > The new code has what looks like, to me, a refactor bug. I'd expect that, > when one adds the JSON_THROW_ON_ERROR flag, one must also remove the > subsequent json_last_error() handling, as it's no longer relevant to that > call and instead relevant to the most recent prior call. > > In this case, since the most recent prior call had no error handling > itself, it now gets one attached to it. I don't find this action at a > distance surprising: it's one of the known intrinsic downsides of a > global-last-error mechanism. IMO, that this code now fails is a good thing, > because now the author knows to error check after json_decode(). > > > > > Does anyone have any objection to changing this, so that > > json_last_error() + json_last_error_msg() always refer to the most > > recent json_encode()/json_decode() function, as per the discussion. Or > > does this need an RFC? > > > > I think I'd rather see this handled in the documentation. "Note that you > should review use of json_last_error() and json_last_error_msg() error > handling when adding or removing the JSON_THROW_ON_ERROR flag." > > If there was to be any language change, I don't think we can't just change > the behavior. Extant 7.3 code might be: > > // Run through an encode/decode cycle to validate the user-supplied data's > valid. We don't > // support nested arrays on user data, so we set a depth of 1. > // @throws \App\Json\DecodeException when the user-supplied data is invalid > json or is nested > // @throws \JsonException when the data cannot be stored in valid form (eg, > bad UTF-8 characters) > $valid_data = json_encode(json_decode($user_data, 0, 1), > JSON_THROW_ON_ERROR); > if (JSON_ERROR_NONE !== json_last_error()) { > throw new App\Json\DecodeException(json_last_error_msg(), > json_last_error()); > } > > Here, the author expects the error from json_decode() to flow through > json_encode(). Yes, this code could be better written to separate out the > branches, but it's plausible (IMO) and would be subject to a BC break by a > language change on json_last_error() and JSON_THROW_ON_ERROR semantics. > > We could keep the last error from non-throw, and last error from throw, as > separate global states, then provide either new functions to access each, > or a boolean flag to json_last_error($includeLastExceptionError = false): > int, but both of those seem like fitting a problem to a solution. > > In summary, mixed error handling interaction has a couple of possible > implementations. We chose one, with its pros and cons. The other > implementation has its pros and cons, as well. I don't see a reason to swap > one set of consequences for another, esp. now that we've already walked > down the road of choice. I'd rather we just raise awareness in the form of > documentation, and let authors move their code toward one mechanism, or the > other.
-- Riikka Kalliomäki https://github.com/Riimu
  105671
May 11, 2019 02:11 bishop@php.net (Bishop Bettini)
On Fri, May 10, 2019 at 2:52 PM Riikka Kalliomäki <
riikka.kalliomaki@riimu.net> wrote:

> > The new code has what looks like, to me, a refactor bug. I'd expect that, > when one adds the JSON_THROW_ON_ERROR flag, one must also remove the > subsequent json_last_error() handling, as it's no longer relevant to that > call and instead relevant to the most recent prior call. > > This does create a kind of an unfortunate situation with vendor > libraries, however. Any library that accepts arbitrary json flags and > uses json_last_error() for error handling will be broken if > JSON_THROW_ON_ERROR is passed as a flag. >
I would expect such libraries to be sensitive to the possible variations implied by accepting arbitrary arguments, and would compensate accordingly. For example: function library_json_decode(string $json, int $options) { if (70300 <= PHP_VERSION_ID) { $options &= ~JSON_THROW_ON_ERROR; } $decoded = json_decode($json, $options); if (JSON_ERROR_NONE !== json_last_error()) { throw new LibraryException(json_last_error_msg(), json_last_error()); } return $decoded; } This means that library consumers effectively may have to become aware
> of how a library internally deals with json errors. > > It kind of turns JSON_THROW_ON_ERROR into unconventional BC break, > because any code that takes arbitrary flags and uses json_last_error() > needs to account for it. In some cases, I imagine, it can be necessary > to add a "json_encode(null);" before the actual json handling just to > reset the error state, just in case. >
I agree this is possible, but I contend any function taking arbitrary options needs to understand the implication of those options. It's in many ways just like any other user-supplied data: authors must understand the possible values available for the supported versions and handle the accordingly.
> I wouldn't be surprised if this becomes a source of unexpected bugs in > json handling code down the line. In my opinion, it's very unexpected > behaviour that json_last_error() can reflect the state prior to the > previous json_encode/json_decode call depending on the flags passed to > the function as this has not been the case before. > > Perhaps it should be documented in the BC breaks section for 7.3 that > json_last_error() may now reflect the state of arbitrary previous > call? >
Yes, better documentation surrounding JSON_THROW_ON_ERROR would be welcomed. The docs are, right now, quite lean.
  105683
May 12, 2019 17:06 Danack@basereality.com (Dan Ackroyd)
On Fri, 10 May 2019 at 15:42, Bishop Bettini <bishop@php.net> wrote:
> > Extant 7.3 code might be:
People would only write code like that if they hadn't read the manual (or code documentation in their IDE) for json_last_error() which clearly states 'Returns the last error (if any) occurred during the last JSON encoding/decoding,'.
> I'd rather we just raise awareness in the form of documentation
If you have to carefully read the documentation for function, and not taking enough care means that you're going to be surprised by the result, then the function is probably poorly designed.
> We chose one, with its pros and cons.
As we've seen recently, people don't always consider the exact detail implications that deeply. Seeing as people can also just make wrong choices, I think this should be changed. Seeing as you've objected, I'll raise an RFC to put it to a vote. cheers Dan Ack
  105784
May 23, 2019 23:55 ajf@ajf.me (Andrea Faulds)
Hi Dan,

Please see the discussion on the original pull request: 
https://github.com/php/php-src/pull/2662

The existing behaviour is deliberate, was already discussed, and was 
part of the RFC that was accepted. So there is a high standard to meet 
for going back on that.

The idea was to keep the two error handling approaches cleanly 
separated. This would mean that we could eventually change the default 
behaviour and deprecate the old error mechanism, if we wanted to.

If you explicitly request the modern exception-based error-handling 
mechanism and catch the exception, which contains the error code and 
message, you have no good reason to then try to retrieve the same error 
code and message via the legacy functions. What's the advantage of 
mixing and matching interfaces like this? It doesn't feel like good 
coding practice.

Sorry for the slow response. I haven't checked the internals list that 
frequently lately.

Andrea
  105785
May 24, 2019 00:33 ajf@ajf.me (Andrea Faulds)
Hi Dan,

Sorry, I think I should directly address your points.

Dan Ackroyd wrote:
> > Apparently there is an implementation detail in JSON_THROW_ON_ERROR > that differs in the RFC text, from the discussion on list > http://news.php.net/php.internals/100569: > >> I decided to reset it to no error because there's no >> previous error that needs handling by error-checking code, the exception >> takes care of that. > > RFC text: > >> When passed this flag, the error behaviour of these functions is changed. >> The global error state is left untouched, and if an error occurs that would >> otherwise set it, these functions instead throw a JsonException with the >> message and code set to whatever json_last_error() and json_last_error_msg() >> would otherwise be respectively.
That difference came about, IIRC, because someone objected to the behaviour in, I think, a pull request comment. I think they were right, because…
> > The implementation is highly surprising and can lead to bugs. > > Imagine this code exists in a code base. > > --------------------- > > $foo = json_decode('[bar'); > // many lines of code. > // many lines of code. > // many lines of code. > > $bar = json_encode('Hello world!'); > if (json_last_error() !== JSON_ERROR_NONE) { > throw new Exception("encoding went wrong!"); > } > > echo "All is fine"; > // output is "All is fine" > --------------------- > > > And to start moving to have json_encode throw on error, they start > using the flag on the json_encode > > ---------------------- > > $foo = json_decode('[bar'); > // many lines of code. > // many lines of code. > // many lines of code. > > $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); > if (json_last_error() !== JSON_ERROR_NONE) { > throw new Exception("encoding went wrong!"); > } > > echo "All is fine"; > > ------------------- > > The result of this is that the user throw exception will be thrown > even though the encode worked correctly. https://3v4l.org/JJD5g > > This is highly surprising*, and doesn't seem the right thing to be doing.
The problem with that code is it mixes two intentionally separate error-handling mechanisms. If you want to use the old last_error system: $bar = json_encode('Hello world!'); if (json_last_error() !== JSON_ERROR_NONE) { // Handle error case } If you want to use the new exception system: try { $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); } catch (JsonException $err) { // Handle error case } These examples are internally consistent. The first (implicitly) requests the old error handling approach, then checks for an error the old way. The second explicitly requests the new error handling approach, then checks for an error the new way. But your new example is not internally consistent, it asks for the new approach then tries to handle it the old way. It's true, it will break if the old-system global error state is dirty. But it will also break if there is an *actual error*: $bar = json_encode($unencodable, JSON_THROW_ON_ERROR); // throws JsonException, unhandled The following line checking json_last_error() would never even be executed. Let's imagine we were to make json_last_error() still be set if JSON_THROW_ON_ERROR was set. It would then be possible to write this: try { $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); } catch (JsonException $err) { if (json_last_error() !== JSON_ERROR_NONE) { // Handle error } } But this is redundant, because when we've caught a JsonException we already know there was an error. Likewise if we just use json_last_error() to find out what type of error, this is also redundant, because $err already contains that information. It would also, in that hypothetical scenario, be possible to write this: try { $bar = json_encode('Hello world!', JSON_THROW_ON_ERROR); } catch (JsonException $err) { } if (json_last_error() !== JSON_ERROR_NONE) { // Handle error } This… would work, but it defeats the purpose of using JSON_THROW_ON_ERROR at all. I agree it might be surprising to have json_last_error() not be touched when you use JSON_THROW_ON_ERROR, but I assign more value to keeping the error mechanisms cleanly separated than to allowing strangely-written self-defeating code to not crash. At first I assumed you were making an argument about compatibility between PHP versions, where on an older version the JSON_THROW_ON_ERROR constant wouldn't do anything. But because the constant wouldn't exist, you couldn't use it without causing some error, so you'd really need to write: if (constant('JSON_THROW_ON_ERROR') !== NULL) { $bar = json_encode($foo, JSON_THROW_ON_ERROR); } else { $bar = json_encode($foo); if (json_last_error() !== JSON_ERROR_NONE) { throw new Exception(…); } } Thanks, Andrea