[RFC] DOM Living Standard API

  104878
March 22, 2019 20:14 kontakt@beberlei.de (Benjamin Eberlei)
Hi Internals,

Thomas and I are working on updating the ext/dom to add support for the
current DOM Living Standard API as standardized here:
https://dom.spec.whatwg.org/

https://wiki.php.net/rfc/dom_living_standard_api

This RFC is targeting 7.4 and contains three independent changes:

- a set of new methods and interfaces that can be added to the existing ext/dom
without breaking BC.
- a removal of a few "dead" classes that are exposed to userland, but
neither documented nor containing implementation code.
- a compatibility layer to switch the implementation between DOM Level 1-3
and the Living Standard in places where BC is not possible.

A pull request includes a nearly complete implementation of the new
methods, but nothing of the cleanup/compatibility yet:
https://github.com/beberlei/php-src/pull/1

We are looking forward to your feedback.

greetings
Benjamin
  104881
March 22, 2019 22:26 claude.pache@gmail.com (Claude Pache)
Beware that behaviour of some methods should differ between HTML and non-HTML documents. For instance, the RFC says:

> DOMElement->nodeName casing was previously undefined, it is now changed to always uppercase.

However, the DOM Living Standard says it is uppercase (more precisely, ASCII-uppercased) only in the HTML namespace. For XML documents, the casing is not modified.

Claude
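The rule described above can be sketched in userland PHP. This is only an illustration: `livingStandardTagName()` and its `$isHtmlDocument` flag are hypothetical names, not part of ext/dom or the RFC, and the flag stands in for the "node document is an HTML document" condition that the standard attaches to the rule.

```php
<?php
// Hypothetical helper illustrating the Living Standard rule; not an ext/dom API.
const HTML_NAMESPACE = 'http://www.w3.org/1999/xhtml';

function livingStandardTagName(DOMElement $element, bool $isHtmlDocument): string
{
    // tagName holds the qualified name ("prefix:local" or just "local").
    $qualifiedName = $element->tagName;

    if ($isHtmlDocument && $element->namespaceURI === HTML_NAMESPACE) {
        // ASCII-only uppercasing, independent of the current locale.
        return strtr(
            $qualifiedName,
            'abcdefghijklmnopqrstuvwxyz',
            'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        );
    }

    // Everything else keeps its original casing.
    return $qualifiedName;
}

$doc       = new DOMDocument();
$htmlDiv   = $doc->createElementNS(HTML_NAMESPACE, 'div');
$svgPath   = $doc->createElementNS('http://www.w3.org/2000/svg', 'svg:path');
$plainItem = $doc->createElement('Item');

echo livingStandardTagName($htmlDiv, true), "\n";   // DIV
echo livingStandardTagName($htmlDiv, false), "\n";  // div
echo livingStandardTagName($svgPath, true), "\n";   // svg:path
echo livingStandardTagName($plainItem, true), "\n"; // Item
```

Only the element that is both in the HTML namespace and treated as part of an HTML document is uppercased; the SVG and namespace-less elements are returned unchanged.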
  104884
March 23, 2019 09:33 kontakt@beberlei.de (Benjamin Eberlei)
On Fri, Mar 22, 2019 at 11:26 PM Claude Pache <claude.pache@gmail.com>
wrote:

> Beware that behaviour of some methods should differ between HTML and
> non-HTML documents. For instance, the RFC says:
>
> > DOMElement->nodeName casing was previously undefined, it is now changed
> > to always uppercase.
>
> However, the DOM Living Standard says it is uppercase (even,
> ASCII-uppercased) only in the HTML namespace. For XML documents, the casing
> is not modified.

You are absolutely right, I missed that in the convoluted description of the behavior :-) I need to rethink how this would fit with the compatibility flags; it might cause a problem, given that loadHTML, for example, doesn't automatically put the elements into the HTML namespace. I updated the RFC to reflect this.

To be honest, the compatibility layer is what I am least sure about, especially whether it should be combined with the new methods and the removal of unused code, or handled separately.

> Claude
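One way to see the loadHTML concern for yourself is to print what ext/dom reports for the same element name loaded through the HTML parser and through the XML parser. This is a small inspection script, not a statement about what any particular PHP version prints for the HTML case; the markup and variable names are made up for the example.

```php
<?php
// Collect parser warnings instead of emitting them.
libxml_use_internal_errors(true);

// Parsed with the HTML parser: no namespace is declared in the markup.
$html = new DOMDocument();
$html->loadHTML('<html><body><p>Hello</p></body></html>');
$htmlP = $html->getElementsByTagName('p')->item(0);

// Parsed with the XML parser: the default namespace is declared explicitly.
$xml = new DOMDocument();
$xml->loadXML('<Root xmlns="http://www.w3.org/1999/xhtml"><P/></Root>');
$xmlP = $xml->documentElement->firstChild;

// Compare name casing and namespace as reported by ext/dom.
var_dump($htmlP->nodeName, $htmlP->namespaceURI);
var_dump($xmlP->nodeName, $xmlP->namespaceURI);
```

If the HTML-parsed element reports no namespace, a rule keyed on "element is in the HTML namespace" would never apply to documents loaded via loadHTML, which is the mismatch discussed above.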
  104976
March 28, 2019 11:57 rrichards@cdatazone.org (Rob Richards)
On 3/23/19 5:33 AM, Benjamin Eberlei wrote:
> On Fri, Mar 22, 2019 at 11:26 PM Claude Pache <claude.pache@gmail.com>
> wrote:
>
>> Beware that behaviour of some methods should differ between HTML and
>> non-HTML documents. For instance, the RFC says:
>>
>>> DOMElement->nodeName casing was previously undefined, it is now changed
>>> to always uppercase.
>>
>> However, the DOM Living Standard says it is uppercase (even,
>> ASCII-uppercased) only in the HTML namespace. For XML documents, the casing
>> is not modified.
>>
> You are absolutely right, i missed that in the convoluted description of
> the behavior :-) I need to rethink how this would fit with the
> compatibility flags, it might cause a problem given that loadHTML for
> example doesn't automatically put the elements into HTML namespace. I
> updated the RFC to reflect this.
>
> To be honest, the compatibility thing is what i am least sure in,
> especially if this should be combined with the new methods + removal of
> unused code or should be handled separately.
>
>> Claude

I'm still running through all the changes but my suggestion would be to start with only new methods and then deal with the rest. At least with the new functionality you don't cause any unintended BC breaks.

Rob
  104978
March 28, 2019 12:20 kontakt@beberlei.de (Benjamin Eberlei)
On Thu, Mar 28, 2019 at 12:57 PM Rob Richards <rrichards@cdatazone.org>
wrote:

> On 3/23/19 5:33 AM, Benjamin Eberlei wrote:
> > On Fri, Mar 22, 2019 at 11:26 PM Claude Pache <claude.pache@gmail.com>
> > wrote:
> >
> >> Beware that behaviour of some methods should differ between HTML and
> >> non-HTML documents. For instance, the RFC says:
> >>
> >>> DOMElement->nodeName casing was previously undefined, it is now changed
> >>> to always uppercase.
> >>
> >> However, the DOM Living Standard says it is uppercase (even,
> >> ASCII-uppercased) only in the HTML namespace. For XML documents, the
> >> casing is not modified.
> >>
> > You are absolutely right, i missed that in the convoluted description of
> > the behavior :-) I need to rethink how this would fit with the
> > compatibility flags, it might cause a problem given that loadHTML for
> > example doesn't automatically put the elements into HTML namespace. I
> > updated the RFC to reflect this.
> >
> > To be honest, the compatibility thing is what i am least sure in,
> > especially if this should be combined with the new methods + removal of
> > unused code or should be handled separately.
> >
> >> Claude
>
> I'm still running through all the changes but my suggestion would be to
> start with only new methods and then deal with the rest. At least with
> the new functionality you don't cause any unintended BC breaks.

Yes, thinking about this more, I am coming to this conclusion myself :-) I think the DOMImplementation and compatibility layer changes should probably not be part of this RFC; it should only be about adding the new functionality.

However, I realized that we will break BC anyway through registerNodeClass: if you provide a subclass of DOMElement that implements next/previousElementSibling (for example) via a __get override, then this RFC, by adding this property with potentially slightly different behavior, will cause a BC break for users subclassing via registerNodeClass.

> Rob
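A minimal sketch of the registerNodeClass scenario, assuming a userland subclass that emulates nextElementSibling through __get. The class name `SiblingAwareElement` and the call counter are invented for the illustration; the point is only that a magic getter stops being consulted once the extension defines a property of the same name.

```php
<?php
// Hypothetical userland subclass that emulates nextElementSibling via __get.
class SiblingAwareElement extends DOMElement
{
    /** Counts how often the userland fallback was actually used. */
    public static $magicCalls = 0;

    public function __get($name)
    {
        if ($name !== 'nextElementSibling') {
            return null;
        }

        self::$magicCalls++;

        // Walk the following siblings and skip text, comments, etc.
        for ($node = $this->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node instanceof DOMElement) {
                return $node;
            }
        }

        return null;
    }
}

$doc = new DOMDocument();
$doc->registerNodeClass('DOMElement', 'SiblingAwareElement');
$doc->loadXML('<list><a/>some text<b/></list>');

$a    = $doc->documentElement->firstChild;
$next = $a->nextElementSibling;

echo get_class($a), ' -> ', $next->nodeName, "\n";
echo 'userland __get calls: ', SiblingAwareElement::$magicCalls, "\n";
```

On a build where ext/dom itself provides nextElementSibling, the counter stays at zero because __get is only invoked for properties the object does not have. Any behavior the subclass layered on top of the lookup silently disappears, which is the BC break described above.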
  107130
September 16, 2019 00:39 kontakt@beberlei.de (Benjamin Eberlei)
On Fri, Mar 22, 2019 at 9:14 PM Benjamin Eberlei <kontakt@beberlei.de>
wrote:

> Hi Internals,
>
> Thomas and I are working on updating the ext/dom to add support for the
> current DOM Living Standard API as standardized here:
> https://dom.spec.whatwg.org/
>
> https://wiki.php.net/rfc/dom_living_standard_api
>
> This RFC is targeting 7.4 and contains three independent changes:
>
> - a set of new methods and interfaces that can be implemented BC as
> addition to the existing ext/dom.
> - a removal of a few "dead" classes that are exposed to userland, but
> neither documented nor containing implementation code.
> - a compatibility layer to switch the implementation between DOM Level 1-3
> and the Living Standard in places where BC is not possible.
>
> A pull request includes a nearly complete implementation of the new
> methods, but nothing of the cleanup/compatibility yet:
> https://github.com/beberlei/php-src/pull/1
>
> We are looking forward for your feedback.
>
> greetings
> Benjamin
Hi internals,

a few months have gone by, and I came back to revisit this RFC and simplify it to get something shipped. I have updated the RFC to only include the set of new methods and interfaces that the DOM Living Standard defines.

https://wiki.php.net/rfc/dom_living_standard_api

I am asking for feedback especially on the section "Implementation Details", which explains some key differences made to "PHPify" the DOM Living Standard API for PHP and ext/dom. Do you have any comments on whether the choices are reasonable?

Also, the section on "Not adopting Nodes for now" is new and I need some feedback on this issue: to keep the proposal slim with respect to changing existing behavior, the improved behavior of the DOM Living Standard over Level 1-2 of automatically adopting nodes instead of throwing a WRONG DOCUMENT DOMException is not considered for now. Do you think this is a reasonable approach to go forward with?

As for the implementation, it is now in this PR: https://github.com/php/php-src/pull/4709

Looking forward to further input.

Benjamin

List of Changes to RFC:

- The RFC has been updated to include a few more details about what each new property or method does.
- The new behavior of the Living Standard of automatically adopting nodes is skipped in the implementation for now, to keep the existing behavior of the other manipulation methods, which throw WRONG DOCUMENT. The workaround is still to importNode first. Adopting nodes can be implemented as an improvement later, because it has no backwards compatibility impact.
- I removed all sections on trying to achieve compatibility between the old DOM Level 1-3 and the Living Standard, especially w.r.t. DOMHtmlDocument (uppercase nodeName, body property, ...). I think we can live with not being 100% compliant; ext/dom never was fully compliant. A new section in the docs that explains our existing differences from the spec would be helpful to users. Specifically, the spec itself says that users should test for the existence of features by checking for properties or methods.
- The cleanup of dead / unimplemented classes is not RFC worthy and can be done without making the RFC more complex.
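The feature-detection advice in the list above could look roughly like this in PHP. The helper `appendAll()` is hypothetical, and `append()` is used as an example of a method from the Living Standard's ParentNode interface; that ext/dom exposes it under exactly this name is an assumption here, which is why the code checks for it and falls back to appendChild().

```php
<?php
// Hypothetical helper: use the newer method when it exists, else fall back.
function appendAll(DOMNode $parent, DOMNode ...$nodes): void
{
    // Assumption: newer ext/dom builds expose ParentNode::append() on this node.
    if (method_exists($parent, 'append')) {
        $parent->append(...$nodes);
        return;
    }

    // DOM Level 1-3 fallback: one appendChild() call per node.
    foreach ($nodes as $node) {
        $parent->appendChild($node);
    }
}

$doc  = new DOMDocument();
$root = $doc->appendChild($doc->createElement('root'));

appendAll($root, $doc->createElement('first'), $doc->createElement('second'));

echo $doc->saveXML($root), "\n"; // <root><first/><second/></root>
```

Both branches produce the same tree for element arguments, so callers do not need to know which ext/dom version they are running on.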
  107144
September 16, 2019 08:00 phpmailinglists@gmail.com (Peter Bowyer)
Hi Benjamin,

I like the proposal.

On Mon, 16 Sep 2019 at 01:40, Benjamin Eberlei <kontakt@beberlei.de> wrote:

> I am asking about feedback especially on the section "Implementation
> Details", that explains some key differences to "PHPify" the DOM Living
> Standard API to PHP and ext/dom. Do you have any comments about the
> reasonability of the choices?

I don't have feedback, other than to ask what choices other languages have made when bringing in the DOM Living Standard API?

> Also the section on "Not adopting Nodes for now" is new and I need some
> feedback on this issue: To keep the proposal slim with respect to changing
> existing behavior, the improved behavior of the DOM Living Standard over
> Level 1-2 of automatically adopting nodes instead of throwing a WRONG
> DOCUMENT DOMexception is not considered for now. Do you think this is a
> reasonable approach to go forward with?

If I understand correctly, the issue is that the behaviour of a method has changed significantly.

If not implemented in PHP 8, where BC breaks are expected, when would be a better time?

Would deviating and implementing the new behaviour under a different method name, e.g. appendAndAdoptChild(), or guarding it with a version flag (so users choose whether they want the 'Living' behaviour or the 'Level 2' behaviour in this method) be options?

Peter
  107197
September 17, 2019 22:12 kontakt@beberlei.de (Benjamin Eberlei)
On Mon, Sep 16, 2019 at 10:01 AM Peter Bowyer <phpmailinglists@gmail.com>
wrote:

> Hi Benjamin,
>
> I like the proposal.
>
> On Mon, 16 Sep 2019 at 01:40, Benjamin Eberlei <kontakt@beberlei.de>
> wrote:
>
>> I am asking about feedback especially on the section "Implementation
>> Details", that explains some key differences to "PHPify" the DOM Living
>> Standard API to PHP and ext/dom. Do you have any comments about the
>> reasonability of the choices?
>
> I don't have feedback, other than to ask what choices other languages have
> made when bringing in the DOM Living Standard API?

Good question! The only other non-JavaScript languages that have ext/dom equivalents (that I found) are Java and Python, and neither has changed its API to the new Living Standard yet.
>> Also the section on "Not adopting Nodes for now" is new and I need some
>> feedback on this issue: To keep the proposal slim with respect to changing
>> existing behavior, the improved behavior of the DOM Living Standard over
>> Level 1-2 of automatically adopting nodes instead of throwing a WRONG
>> DOCUMENT DOMexception is not considered for now. Do you think this is a
>> reasonable approach to go forward with?
>
> If I understand correctly, the issue is the behaviour of a method has
> changed significantly.

No, actually it behaves mostly the same. It only adds a new use case that previously led to an exception: calling appendChild with a node from a different document. Since this is not a line of code whose failure depends on runtime conditions, such code will not exist in the wild; instead you will find the workaround

    $element->appendChild($element->ownerDocument->importNode($otherNode));

which will not be affected by the new behavior at all.
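For readers who have not hit this case, here is a small sketch of the workaround that runs under either behavior. The documents and variable names are made up; the second argument `true` to importNode() requests a deep copy, which the one-liner above omits (it would copy the element without its children).

```php
<?php
$target = new DOMDocument();
$target->loadXML('<root/>');

$source = new DOMDocument();
$source->loadXML('<item>hi</item>');

$root    = $target->documentElement;
$foreign = $source->documentElement;

try {
    // Under auto-adopting behavior this would simply succeed.
    $root->appendChild($foreign);
    $outcome = 'appended directly';
} catch (DOMException $e) {
    // DOM Level 1-3 behavior: copy the node into the target document first.
    $root->appendChild($target->importNode($foreign, true));
    $outcome = 'imported after DOMException code ' . $e->getCode();
}

echo $outcome, "\n";
echo $target->saveXML($root), "\n"; // <root><item>hi</item></root>
```

Code that already imports before appending never reaches the exception path, so a later switch to automatic adoption would not change its result.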
> If not implemented in PHP 8, where BC breaks are expected, when would be a
> better time?

This is not a BC break in my opinion. Changing an exception that essentially says "don't do this, you connected the wrong things" into the behavior that people would expect anyway is not a BC break, and therefore it could be done at any time.

> Would deviating and implementing the new behaviour with a different method
> name e.g. appendAndAdoptChild(); or guarded by a version flag (so users
> choose whether they want the 'Living' behaviour or the 'Level 2' behaviour
> in this method) be options?
>
> Peter