Re: [RFC] token_get_all() TOKEN_AS_OBJECT mode

This is only part of a thread. view whole thread
  108755
February 25, 2020 15:59 nikita.ppv@gmail.com (Nikita Popov)
On Thu, Feb 13, 2020 at 10:47 AM Nikita Popov ppv@gmail.com> wrote:

> Hi internals, > > This has been discussed a while ago already, now as a proper proposal: > https://wiki.php.net/rfc/token_as_object > > tl;dr is that it allows you to get token_get_all() output as an array of > PhpToken objects. This reduces memory usage, improves performance, makes > code more uniform and readable... What's not to like? > > An open question is whether (at least to start with) PhpToken should be > just a data container, or whether we want to add some helper methods to it. > If this generates too much bikeshed, I'll drop methods from the proposal. >
I think this proposal is in a pretty decent shape now, and I'd like to move it to voting soon. The only remaining open question is whether we want to add any additional predefined methods. As the class can now be extended, every library can add their own methods, but there might still be value in providing some things by default, primarily for performance reason. For example, the proposed is() method can be a good bit more efficient when implemented directly in extension code. Any feedback on this point? Nikita
  108756
February 25, 2020 16:58 larry@garfieldtech.com ("Larry Garfield")
On Tue, Feb 25, 2020, at 9:59 AM, Nikita Popov wrote:
> On Thu, Feb 13, 2020 at 10:47 AM Nikita Popov ppv@gmail.com> wrote: > > > Hi internals, > > > > This has been discussed a while ago already, now as a proper proposal: > > https://wiki.php.net/rfc/token_as_object > > > > tl;dr is that it allows you to get token_get_all() output as an array of > > PhpToken objects. This reduces memory usage, improves performance, makes > > code more uniform and readable... What's not to like? > > > > An open question is whether (at least to start with) PhpToken should be > > just a data container, or whether we want to add some helper methods to it. > > If this generates too much bikeshed, I'll drop methods from the proposal. > > > > I think this proposal is in a pretty decent shape now, and I'd like to move > it to voting soon. The only remaining open question is whether we want to > add any additional predefined methods. As the class can now be extended, > every library can add their own methods, but there might still be value in > providing some things by default, primarily for performance reason. For > example, the proposed is() method can be a good bit more efficient when > implemented directly in extension code. > > Any feedback on this point? > > Nikita
Seems good to me in its current state. One last thought I had: in its current form, it *seems* like it would be possible, in concept, to create tokens directly via the constructor (ignoring the positional values), then have a generic render command you can apply to a list of them to use for code generation. I know that's not the intent here, but in concept is what I described possible? --Larry Garfield
  108764
February 25, 2020 21:21 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Feb 25, 2020 at 5:59 PM Larry Garfield <larry@garfieldtech.com>
wrote:

> On Tue, Feb 25, 2020, at 9:59 AM, Nikita Popov wrote: > > On Thu, Feb 13, 2020 at 10:47 AM Nikita Popov ppv@gmail.com> > wrote: > > > > > Hi internals, > > > > > > This has been discussed a while ago already, now as a proper proposal: > > > https://wiki.php.net/rfc/token_as_object > > > > > > tl;dr is that it allows you to get token_get_all() output as an array > of > > > PhpToken objects. This reduces memory usage, improves performance, > makes > > > code more uniform and readable... What's not to like? > > > > > > An open question is whether (at least to start with) PhpToken should be > > > just a data container, or whether we want to add some helper methods > to it. > > > If this generates too much bikeshed, I'll drop methods from the > proposal. > > > > > > > I think this proposal is in a pretty decent shape now, and I'd like to > move > > it to voting soon. The only remaining open question is whether we want to > > add any additional predefined methods. As the class can now be extended, > > every library can add their own methods, but there might still be value > in > > providing some things by default, primarily for performance reason. For > > example, the proposed is() method can be a good bit more efficient when > > implemented directly in extension code. > > > > Any feedback on this point? > > > > Nikita > > Seems good to me in its current state. > > One last thought I had: in its current form, it *seems* like it would be > possible, in concept, to create tokens directly via the constructor > (ignoring the positional values), then have a generic render command you > can apply to a list of them to use for code generation. I know that's not > the intent here, but in concept is what I described possible? >
Sure, that works. You can construct your own tokens and the "render command" should actually be as simple as implode(array_column($tokens, 'text')). Nikita
  108757
February 25, 2020 18:16 theodorejb@outlook.com (Theodore Brown)
On Tue, Feb 25, 2020 at 9:59 AM Nikita Popov ppv@gmail.com> wrote:

> On Thu, Feb 13, 2020 at 10:47 AM Nikita Popov ppv@gmail.com> wrote: > > > This has been discussed a while ago already, now as a proper proposal: > > https://wiki.php.net/rfc/token_as_object > > > > An open question is whether (at least to start with) PhpToken should be > > just a data container, or whether we want to add some helper methods to it. > > If this generates too much bikeshed, I'll drop methods from the proposal. > > I think this proposal is in a pretty decent shape now, and I'd like to move > it to voting soon. The only remaining open question is whether we want to > add any additional predefined methods. As the class can now be extended, > every library can add their own methods, but there might still be value in > providing some things by default, primarily for performance reason. For > example, the proposed is() method can be a good bit more efficient when > implemented directly in extension code. > > Any feedback on this point?
Hi Nikita, Thanks for this RFC. The proposed `PhpToken` class will definitely make it easier to work with parsed tokens (I struggled with this when working on the migration script for deprecated alternate array/string offset syntax). In regards to the open question of additional methods, the `getTokenName()` method would be very welcome. `isIgnorable()` would also be helpful sometimes, though I don't care much one way or another if it's added. But I'm really skeptical about the value of the `is($kind)` method. Code working with tokens should know the type of value being checked (ID int, text content, or array of one or the other), and a method that accepts any of three types can make it harder to understand what's going on. For example, the proposed `isIgnorable()` method doesn't need to use `is()`, it could just check whether the `id` property is in the array of token constants. Best regards, Theodore
  108763
February 25, 2020 21:13 nikita.ppv@gmail.com (Nikita Popov)
On Tue, Feb 25, 2020 at 7:16 PM Theodore Brown <theodorejb@outlook.com>
wrote:

> On Tue, Feb 25, 2020 at 9:59 AM Nikita Popov ppv@gmail.com> wrote: > > > On Thu, Feb 13, 2020 at 10:47 AM Nikita Popov ppv@gmail.com> > wrote: > > > > > This has been discussed a while ago already, now as a proper proposal: > > > https://wiki.php.net/rfc/token_as_object > > > > > > An open question is whether (at least to start with) PhpToken should be > > > just a data container, or whether we want to add some helper methods > to it. > > > If this generates too much bikeshed, I'll drop methods from the > proposal. > > > > I think this proposal is in a pretty decent shape now, and I'd like to > move > > it to voting soon. The only remaining open question is whether we want to > > add any additional predefined methods. As the class can now be extended, > > every library can add their own methods, but there might still be value > in > > providing some things by default, primarily for performance reason. For > > example, the proposed is() method can be a good bit more efficient when > > implemented directly in extension code. > > > > Any feedback on this point? > > Hi Nikita, > > Thanks for this RFC. The proposed `PhpToken` class will definitely make it > easier to work with parsed tokens (I struggled with this when working on > the migration script for deprecated alternate array/string offset syntax). > > In regards to the open question of additional methods, the `getTokenName()` > method would be very welcome. `isIgnorable()` would also be helpful > sometimes, though I don't care much one way or another if it's added. > > But I'm really skeptical about the value of the `is($kind)` method. Code > working with tokens should know the type of value being checked (ID int, > text content, or array of one or the other), and a method that accepts > any of three types can make it harder to understand what's going on. > > For example, the proposed `isIgnorable()` method doesn't need to use > `is()`, it could just check whether the `id` property is in the array > of token constants. >
To provide some context, a method like is() is primarily useful in the implementation of other methods accepting a token-descriptor to search for. Token stream implementation commonly have many helper methods along the lines of public function findRight($pos, $findTokenType) { $tokens = $this->tokens; for ($count = \count($tokens); $pos < $count; $pos++) { if ($tokens[$pos]->is($findTokenType)) { return $pos; } } return -1; } which search for tokens in different directions, skip tokens, check that tokens exist while skipping whitespace etc. Having a fast primitive to perform the token type check would be quite useful for that, I think. Regards, Nikita