Re: [PHP-DEV] [Discussion] Unifying Documentation and UI-Based Editing

September 23, 2019 06:05 Andreas Heigl <andreas@heigl.org>

Hey Mark

Thanks for the huge amount of work you invested there.

But to be honest: Personally I don't think it helps in moving the
documentation to git for several reasons.

On 22.09.19 at 23:05, Mark Randall wrote:
> Hello all,
>
> After participating in various conversations, and listening to the most
> recent PHP internals podcast about moving the documentation to GIT, and
> the problems with editing, I spent most of the weekend knocking up a
> prototype that I want to present as a possible solution to these problems.
>
> The prototype is available at https://php-doceditor.markrandall.uk/ and
> has some basic functionality. I must stress it is far from complete.
>
> In this PoC I attempted to solve the following issues:
>
>
> ** 1. Difficulty in editing XML **
> Something I hear a lot is that the XML format is difficult to work with.
> The format clearly has its benefits, after all almost all the web is
> written in it, but I think for something like the documentation, a UI
> based editor would be a superior mechanism.
>
> To that end I created a front-end for editing the files, or rather, a
> converted format of the files which takes the XML and ports it into
> JSON, which is programatically MUCH easier to work with.
>
> It in effect still deals with nodes, but they are abstracted out to
> nested UI elements.

XML is much easier to edit than JSON. There is a reason why the
documentation is written in DocBook: it is a widely used format for
writing documentation. At least it was a widely used format. And there
have already been various ideas to move the documentation to a
different format (Markdown, AsciiDoc, reST, to name a few), but JSON
has so far never come up.

JSON is intended for machine-to-machine transfer of information. It is
meant to be easily readable by machines, not by human beings. XML, in
contrast, is easy to understand for machines as well as for human
beings. So programmatically it is as easy to work with as JSON, but
from a human point of view it is much easier to work with than JSON.
And as long as human beings are creating the documentation, we need to
consider them as well.

Also, using JSON means you will need a technical aid for writing the
documentation. With XML all you need is a text editor. Add to that the
possibility of putting comments into XML, which JSON does not offer,
and I doubt that the people writing the documentation will be happy
with that step.

A point that is completely missing here is that changing from one doc
format to another does not help in the transition from SVN to git. On
the contrary: a) it binds additional resources, and b) if the change
were necessary for the transition to git, it would delay the transition
considerably.

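
The point about comments can be illustrated with a small Python sketch.
The element name, the key name and the sample text are made up for
illustration and are not taken from the real documentation sources. It
only shows that a standard XML parser tolerates an editor's comment,
while a standard JSON parser rejects one.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical DocBook-like fragment; the content is illustrative only.
XML_SOURCE = """<para>
  <!-- TODO: mention the new default value -->
  Returns the length of the string.
</para>"""

# Hypothetical JSON equivalent with the same note written as a // comment.
JSON_SOURCE = """{
  // TODO: mention the new default value
  "para": "Returns the length of the string."
}"""


def xml_text(source: str) -> str:
    """Parse the XML and return its text; the default parser skips comments."""
    return "".join(ET.fromstring(source).itertext()).strip()


def json_accepts(source: str) -> bool:
    """Return True if the standard JSON parser accepts the source."""
    try:
        json.loads(source)
        return True
    except json.JSONDecodeError:
        return False
```

Here `xml_text(XML_SOURCE)` yields the paragraph text, whereas
`json_accepts(JSON_SOURCE)` is false because JSON defines no comment
syntax.
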
>
>
>
> ** 2. Immediate Visual Feedback **
> The process for re-creating the documents once they have been uploaded
> to SVN is not an instant one. I wanted a way for the editors to be able
> to see what they would get, and so I provided a very simplistic
> Javascript based renderer.
>
> Naturally it will need a lot of additional work to include all of the
> features we would require of it.

There already *is* a tool to modify the translation and get immediate
feedback. It is available at https://edit.php.net, but the visual
feedback generator has been broken for years. So it looks like it is
not *that* easy, as so far no one has repaired it. Or the immediate
feedback is not necessary.

That is not to say that it would not be a nice add-on to the process,
but for getting the docs from SVN to git as fast as possible (and I
have been on that project for 3 years already) it is not a necessity.
And again: it ties up a lot of workforce for a tool that does not seem
to be necessary at the moment.

>
>
>
> ** 3. Translation **
> This is perhaps the biggest and most controversial of suggestions. At
> present, various translations are kept in completely independent files
> where the structure is re-created in each one.
>
> My proposal and demo turns this on its head. Rather than using multiple
> files, all languages will coexist in a single document that will
> simultaneously act as both the template, the English source and its
> translations.
>
> To achieve this I have introduced the idea of text sections. These are
> effectively containers for a set of paragraphs or examples. Crucially,
> each text section will contain effectively the same text, but in
> multiple languages.
>
> This means that text that effectively says the same thing, will be
> stored next to each other. As you can see on the example, it is easy to
> open both the French and English editing boxes at once.

This would mean breaking with everything that is currently available.
The current project tries to replace the version control system
underneath the documentation while leaving as much as possible of the
existing workflows, tools and processes the way they are. That does not
mean that they are the best processes on earth, but they are the
processes the people creating the documentation are used to, and
changing one bit at a time is easier to grok than changing everything
at once. Especially when the people doing the work are doing it in
their free time.

One of the reasons there are different files, even folders, even
workspaces per language is that not everyone wants to download the
complete documentation. Additionally, the single files would get even
bigger than they already are.

And if you have the English version as well as all the translations in
one file, does that mean that I have to modify every language when I
modify the English version? Otherwise the translations would not be in
sync with the English version. Exactly this is currently solved by the
revision number that is written into the translated file. It is the
revision number of the English file this translation is based on. When
the English file changes, it gets a new revision, and I can see that
the current translated file is no longer in sync and needs to be
revisited, or at least checked. How do you allow for that when all
translations are in one file?

In addition, the current setup with different files, even different
workspaces, allows the PHP project to have different people in charge
of the different translations. And not only in terms of being able to
modify a part of a file, but in actually having the right to merge or
reject changes to a file for a certain translation. That should be
spread over different shoulders, as not everyone is as fluent in a
given language as someone else. How would you achieve that if you have
all translations in one file?

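
The revision-number check described above can be sketched in a few
lines of Python. This is only an illustration under assumptions: the
marker format `EN-Revision: <number>` inside an XML comment and both
function names are hypothetical here, not a description of the
project's actual tooling.

```python
import re
from typing import Optional

# Assumed marker format inside a translated file; illustrative only.
EN_REVISION = re.compile(r"<!--\s*EN-Revision:\s*(\d+)\b")


def recorded_revision(translated_source: str) -> Optional[int]:
    """Return the English revision the translation claims to be based on."""
    match = EN_REVISION.search(translated_source)
    return int(match.group(1)) if match else None


def needs_review(translated_source: str, current_english_revision: int) -> bool:
    """Flag a translation whose recorded revision is older than the English file."""
    recorded = recorded_revision(translated_source)
    # A missing marker is treated as "unknown", so the file is flagged too.
    return recorded is None or recorded < current_english_revision
```

Because the marker lives in each translated file, every translation can
be checked, and maintained, independently of the others.
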
>
>
>
> ** 4. Translation Synchronisation **
> Following one from the UI element of translation, as I understand it,
> one of the biggest problems with moving to GIT is needing to store
> hashes so that the system knows what information is out of date and by
> how many revisions.
>
> The ability for the UI to add metadata to the text sections eliminates
> this problem completely. I propose that each translated section would
> have a button displayed that would indicate that a major change had been
> made, which would update a modified timestamp in the respective
> language’s JSON object for that section.
>
> This would allow easy comparison of when translations were outdated. The
> renderer, having access to all known translations at once, could
> potentially give warnings that a given section was wildly out of date,
> and offer to display the English or other more recent version inline.

The "need" to store the hash is not the biggest problem. It is
*currently* the problem that is challenging *me personally* the most.
And that is because I want to change as little as possible in the
current workflow. Currently people are accustomed to writing the
revision number of the English file into the translated file. So
instead of the revision number they would use the hash. That seems not
to work for some technical reasons (SVN is a bitch here), so we might
have to find a different solution that changes the current workflow as
little as possible. Some ideas are already brewing.

Putting everything into one file doesn't solve the issue. Having a
timestamp doesn't solve it either, as it merely states *when* the last
change was made, not *how many* changes were made since then. Knowing
the latest change was 3 days ago is one thing; knowing there were 15
changes in between says something completely different. Must have been
a major thing...

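
The difference between "when" and "how many" can be made concrete with
a small Python sketch. The history structure and both function names
are hypothetical; the sketch only assumes that each change to the
English file is known as a pair of revision number and commit time.

```python
from datetime import datetime
from typing import List, Tuple

# Hypothetical history of the English file: (revision, commit time) pairs.
History = List[Tuple[int, datetime]]


def last_change(history: History) -> datetime:
    """All a timestamp can tell us: when the most recent change happened."""
    return max(when for _, when in history)


def changes_since(history: History, recorded_revision: int) -> int:
    """What a recorded revision adds: how many changes the translation missed."""
    return sum(1 for revision, _ in history if revision > recorded_revision)
```

Two files can share the same `last_change` value while one of them has
missed a single revision and the other fifteen; only the revision-based
count tells them apart.
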
>
>
>
> ** 5. Improved Type Information **
> With the almost certain addition of unions in 8.0, the types for
> parameters and return types are becoming much more formalised.
>
> Rather than just expecting one type, I have designed the prototype to
> allow specifying multiple parameter types, and multiple return types,
> and providing distinct information for each, with the UI adjusting
> accordingly when union types are present.

This is awesome, but I hope to have finished the transition before we
get to the point where we need that. And I'm sure DocBook has a way to
handle that as well, and I'm equally sure the people doing the
documentation already have their ideas on that topic. I know, for
example, that Nikita has started an initiative to add all the types to
the documentation.

One thing we have not yet spoken about is the complete toolchain in the
back that needs to be adapted to create all the files we know as the
PHP documentation, but also all the files most of us do not know (yet)
that are hidden at http://doc.php.net/

So while I believe that the current way of writing and translating the
documentation can be improved in many ways, such an improvement always
needs to take into account the current situation, the people working
with it, and their willingness to change their beloved workflows.
Especially on OSS projects, where people usually don't get paid for
that work.

In my eyes this attempt is a great idea that should be discussed on the
documentation mailing list as a possible future modification of the
files, processes and workflows. But in the meantime I sadly don't see
it helping in the current project of moving the existing documentation
from SVN to git.

My 0.02 €

Cheers

Andreas

PS:

>
>
> Pre-Emptive Q&A
> =====================================
>
> 1. Why do certain blocks have the ability to add multiple text sections?
> This was a design choice to help with translation. While it would have
> been possible, and frankly easier to make each major part only have one
> translation for each, these could become quite large, and I think it
> makes sense to split them up.
>
> Therefore, sections like notes and examples have the ability to extend
> themselves with a multiple text sections, where each one is tracked and
> translated independently.
>
>
>
> 2. But I love XML
> At present, I have only made a one-way conversion process that takes XML
> and turns it into the necessary JSON for rendering, and it’s my
> intention to improve it some to be able to pull in existing translations
> from multiple languages (using common identifiers such as a parameter
> name as a point of reference to join the data).
>
> It would be possible to write something that did this in reverse, and
> took the JSON and turned it back into valid Docbook XML. If this makes
> any sense in the long run I am not convinced as I think writing the
> renderers is a lot easier in JSON, and it can be committed to GIT all
> the same if it's pretty-printed so it's not all mushed up on one line.

We currently *have* renderers in place. They are working quite well.
Moving to JSON means we have to *rewrite* them, which ties up
resources.

>
>
> 3. Validation?
> Definitely needs to have JSON Schema applied to it before it's put into
> use.

We *have* validation by XML schemata. Moving to JSON means we have to
*redo* it. Which – again – ties up resources.

>
>
> 4. Source?
> Sauce? Tomato Sauce? https://github.com/marandall/phpdoc-editor
>
> Avoid the XML parsing code. It's pure cancer.

A personal note here: An XSLT file would be able to do the transition
without the need for PHP ;-)

-- 
                                                              ,,,
                                                             (o o)
+---------------------------------------------------------ooO-(_)-Ooo-+
| Andreas Heigl                                                       |
| mailto:andreas@heigl.org                   N 50°22'59.5" E 08°23'58" |
| http://andreas.heigl.org                       http://hei.gl/wiFKy7 |
+---------------------------------------------------------------------+
| http://hei.gl/root-ca                                               |
+---------------------------------------------------------------------+