Fwd: Re[2]: [PHP-DEV] PHP 8.1 enum const expressions problem

  115609
July 30, 2021 00:01 nesk@xakep.ru (Кирилл Несмеянов)
>> Hello internals! I apologize if such a discussion has already taken
>> place, but I didn't find anything like it.
>>
>> When working with enums, I ran into problems that are currently not
>> resolved. In some cases, enumerations require self references and
>> constant arithmetic expressions:
>>
>> enum Example: int
>> {
>>     case A = 0b0001;
>>     case B = 0b0010;
>>     case C = 0b0100;
>>
>>     public const EXAMPLE_MASK = self::A | self::B; // << Invalid expression
>>     public const EXAMPLE_MASK2 = self::A->value | self::B->value; // << Same
>> }
>>
>> Similar examples can be taken in the existing PHP code, for example,
>> for attributes (Attribute::TARGET_ALL) which is a binary mask for
>> existing «targets». Thus, if targets for attributes were implemented
>> through enumerations, and not through constants, then it would be
>> impossible to implement such functionality.
>>
>> In addition, enumeration values are not part of constant expressions,
>> so implementation through "Example::A->value" is also not available.
>>
>> Can you please tell me, maybe there have already been some discussions
>> of this problem? And if not, then maybe we should solve this problem
>> before PHP 8.1 release? Since it seems to me that the lack of such
>> functionality is quite critical.
>
> What you're describing is enum sets. That is, a value that is itself a union of two enum values. Those are not supported at this time. It's actually a bit tricky to do, since internally enum cases are not a bit value but an actual full PHP object. So self::A | self::B would be trying to bitwise-OR two objects, which is of course meaningless.
>
> The second doesn't work because technically self::A is an object, not a constant, so it cannot be used in a constant expression. The only dynamicness added in 8.1 was for "new" expressions in initializers. The objects don't exist at compile time so they cannot be reduced down to a constant value pre-runtime.
>
> I agree the current situation is suboptimal. I don't think anyone is opposed to enum sets, but they're a non-trivial amount of work to do and we haven't figured out how to do them yet. There's no simple way to address it as a bug, so any larger effort will have to wait for 8.2.
>
> --Larry Garfield
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php

Yes, I saw this short description in your RFC ( https://wiki.php.net/rfc/enumerations ) about «enum sets». However, I do not quite understand why we cannot add a cast to scalars (string and int) and math expressions now, which would solve this problem. This behavior has already been implemented for GMP objects.

$mask = gmp_init(0b0001) | gmp_init(0b0010); // object(GMP) { num = 3; }
echo $mask; // 3

$mask = $mask + 1; // object(GMP) { num = 4; }
$mask instanceof \GMP; // true

I mean, for such cases, we can create a new "virtual enum case" containing a new value instead of a special «EnumSetCase».

enum Some: int
{
    case A = 0b0001;
    case B = 0b0010;
}

var_dump(Some::A | Some::B); // enum(Some::@anonymous) { value = 3; }

I don't think that it is necessary to consider «enum sets» as a separate case, because addition is also a fairly popular case:

case LAST = self::B + 1;

Like any other mathematical operations.

--
Kirill Nesmeyanov
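The restriction Larry describes (cases are objects, and their values cannot be read in a constant expression in 8.1) can be worked around at runtime. The following is only a sketch under that assumption: `exampleMask()` and `isIn()` are hypothetical helper names that do not appear in the thread or in PHP itself.

```php
<?php
declare(strict_types=1);

enum Example: int
{
    case A = 0b0001;
    case B = 0b0010;
    case C = 0b0100;

    // Hypothetical replacement for EXAMPLE_MASK: evaluated at runtime,
    // where reading ->value from a case is allowed.
    public static function exampleMask(): int
    {
        return self::A->value | self::B->value;
    }

    // Hypothetical helper: true when all bits of this case are set in $mask.
    public function isIn(int $mask): bool
    {
        return ($this->value & $mask) === $this->value;
    }
}

var_dump(Example::exampleMask());                   // int(3)
var_dump(Example::C->isIn(Example::exampleMask())); // bool(false)
```

The mask here is a plain int, so it carries no type information about which enum it came from; that loss is the gap an enum set would close.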
  115610
July 30, 2021 10:45 marc@mabe.berlin (Marc Bennewitz)
On 30.07.21 02:01, Кирилл Несмеянов wrote:
> Yes, I saw this short description in your RFC ( https://wiki.php.net/rfc/enumerations ) about «enum sets». However, I do not quite understand why we can not now add a cast to scalars (string and int) and math expressions, which would solve this problem? This behavior has already been implemented for GMP objects.
>
> $mask = gmp_init(0b0001) | gmp_init(0b0010); // object(GMP) { num = 3; }
> echo $mask; // 3
>
> $mask = $mask + 1; // object(GMP) { num = 4; }
> $mask instanceof \GMP; // true
>
> I mean, for such cases, we can create a new "virtual enum case" containing a new value instead special «EnumSetCase».
>
> enum Some: int
> {
>     case A = 0b0001;
>     case B = 0b0010;
> }
>
> var_dump(Some::A | Some::B); // enum(Some::@anonymous) { value = 3; }
>
> I don’t think that it is necessary to consider the «enum sets» as a separate case, cause addition is also a fairly popular case:
>
> case LAST = self::B + 1;
>
> Like any other mathematical operations.
I think you are missing something.

Consider this example:

enum Some: int
{
    case A = 0b0001;
    case B = 0b0010;
    case C = 0b0011;
}

var_dump(Some::B | Some::C)

This should result in a set of B|C, but with your logic it's the same as just C.

And it also needs to work with strings:

enum Some: string
{
    case A = 'a';
    case B = 'b';
    case C = 'c';
}

This is where enum sets come into play.

Without having PHP internals C knowledge, I think it should be possible to introduce an EnumSet which internally handles a bit array where each bit points to the position (ordinal) of an enum case. But I don't think the ordinal position is guaranteed to be stable across processes/versions, so this would not be directly serializable, nor do we have generics to define an enum set of a specific type (ala EnumSet).

Marc
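Marc's point can be illustrated in userland. The sketch below is not the internal design he outlines: instead of a bit array indexed by ordinal, it keys members by case name, which sidesteps the ordinal-stability concern. `EnumSet` and its methods are hypothetical and are not a PHP built-in.

```php
<?php
declare(strict_types=1);

enum Some: int
{
    case A = 0b0001;
    case B = 0b0010;
    case C = 0b0011;
}

// Hypothetical userland class, not part of PHP.
final class EnumSet
{
    /** @var array<string, UnitEnum> members keyed by case name */
    private array $members = [];

    public function __construct(private string $enumClass, UnitEnum ...$cases)
    {
        foreach ($cases as $case) {
            if ($case::class !== $enumClass) {
                throw new InvalidArgumentException("Expected a case of {$enumClass}");
            }
            $this->members[$case->name] = $case;
        }
    }

    public function contains(UnitEnum $case): bool
    {
        return ($this->members[$case->name] ?? null) === $case;
    }

    public function union(self $other): self
    {
        // array_values() avoids spreading string keys as named arguments.
        return new self(
            $this->enumClass,
            ...array_values($this->members),
            ...array_values($other->members)
        );
    }

    /** @return list<string> */
    public function names(): array
    {
        return array_keys($this->members);
    }
}

$set = (new EnumSet(Some::class, Some::B))->union(new EnumSet(Some::class, Some::C));

var_dump($set->names());                   // B and C remain distinct members
var_dump(Some::B->value | Some::C->value); // int(3), indistinguishable from Some::C->value
```

Because membership is tracked per case rather than per bit, the same class would work for string-backed or unbacked enums.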
  115611
July 30, 2021 12:20 nesk@xakep.ru (Kirill Nesmeyanov)
Yes, that's a good example. I agree.

But in addition to bit-masks, there are also at least cases of addition and subtraction (the most popular). Can't we adapt all math expressions existing in nature to specific cases, thereby building our own pseudo-AST?

(Some::A + Some::B) * Some::C | Some::D;

EnumSet of
    - a: Some::D
    - b: EnumMul of
        - a: Some::C
        - b: EnumSum of
            - a: Some::A
            - b: Some::B

Imho, this is a hellish overcomplication. Isn't it?

--
Kirill Nesmeyanov
  115612
July 30, 2021 13:45 rowan.collins@gmail.com (Rowan Tommins)
On 30/07/2021 13:20, Kirill Nesmeyanov wrote:
> But in addition to bit-masks, there are also at least cases of addition and subtraction (the most popular). Can't we adapt all math expressions existing in nature to specific cases, thereby building our own pseudo-AST?
>
> (Some::A + Some::B) * Some::C | Some::D;
If you are performing arithmetic on a value, that value is not of an enum type, or at least not in the way the current feature defines "enum".

If you write an enum for days of the week, you might give them integer values, such that "Day::MONDAY->value === 1" and "Day::SUNDAY->value === 7". However, expressions such as "Day::SATURDAY + Day::SUNDAY" and "Day::MONDAY * Day::TUESDAY" are clearly meaningless.

At a stretch, a language with operator overloading might define "Day::MONDAY + 1" to return "Day::TUESDAY", and "Day::MONDAY + 10" to either error or wrap around to "Day::THURSDAY"; but PHP has covered that use case by allowing methods, e.g. "Day::MONDAY->advanceBy(10)", which is more flexible and arguably more expressive.

On the other hand, the expression "$weekend = Day::SATURDAY | Day::SUNDAY" seems reasonable; but we don't need the result to be an integer - it would have no meaning on its own - we just need to be able to ask things like "is $today in $weekend?" Returning some kind of EnumSet object means we retain the type information, and can keep our values of 1 to 7 (or have no values at all) rather than having to use powers of 2.

Regards,

--
Rowan Tommins
[IMSoP]
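The method-based approach Rowan mentions might look like the sketch below. The body of `advanceBy()` is an assumption: the message names the method but not its implementation, and this version picks the wrap-around behaviour rather than raising an error. The `$weekend` array is a stand-in for the not-yet-existing EnumSet.

```php
<?php
declare(strict_types=1);

enum Day: int
{
    case MONDAY = 1;
    case TUESDAY = 2;
    case WEDNESDAY = 3;
    case THURSDAY = 4;
    case FRIDAY = 5;
    case SATURDAY = 6;
    case SUNDAY = 7;

    // One possible advanceBy(): wraps around the week.
    public function advanceBy(int $days): self
    {
        // Map 1..7 to 0..6, shift, and normalise so negative shifts also work.
        $index = (($this->value - 1 + $days) % 7 + 7) % 7;

        return self::from($index + 1);
    }
}

// A plain array as a stand-in for an enum set; strict in_array() compares cases by identity.
$weekend = [Day::SATURDAY, Day::SUNDAY];
$today = Day::MONDAY->advanceBy(10);

var_dump($today);                          // enum(Day::THURSDAY)
var_dump(in_array($today, $weekend, true)); // bool(false)
```

Note that the membership test never looks at the backing values, so it would work equally well if `Day` had no values at all.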
  115613
July 30, 2021 17:24 nesk@xakep.ru (=?UTF-8?B?S2lyaWxsIE5lc21leWFub3Y=?=)
1) What about this example?

enum ErrorGroup: int
{
    case NOTICE = 0;
    case WARNING = 10;
    case ERROR = 20;
}

enum ErrorCode: int
{
    case UNKNOWN = 0;

    case A = ErrorGroup::NOTICE + 1;
    case B = ErrorGroup::NOTICE + 2;
    case C = ErrorGroup::NOTICE + 3;
    case D = ErrorGroup::WARNING + 1;
    case E = ErrorGroup::ERROR + 1;
}

The same applies to, for example, designing HTTP statuses (however, since they are pre-specified, this does not make sense).

2) OR: As an example from real life, I can show this pseudo-enum: https://github.com/SerafimArts/ffi-sdl/blob/master/src/Kernel/Video/PixelFormat.php

How is this supposed to work in the future? Bit masks are just one case of many others.

--
Kirill Nesmeyanov
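For comparison, one way the ErrorCode example above could be written under the 8.1 rules discussed in this thread is to spell the values out as literals and recover the group with a method. This is a sketch only; `group()` is a hypothetical helper, and mapping UNKNOWN to NOTICE is an arbitrary choice of this sketch.

```php
<?php
declare(strict_types=1);

enum ErrorGroup: int
{
    case NOTICE = 0;
    case WARNING = 10;
    case ERROR = 20;
}

enum ErrorCode: int
{
    case UNKNOWN = 0;

    case A = 1;  // NOTICE + 1, written as a literal
    case B = 2;  // NOTICE + 2
    case C = 3;  // NOTICE + 3
    case D = 11; // WARNING + 1
    case E = 21; // ERROR + 1

    // Hypothetical helper: picks the group with the largest base value
    // that does not exceed this code's value.
    public function group(): ErrorGroup
    {
        $match = ErrorGroup::NOTICE;
        foreach (ErrorGroup::cases() as $group) {
            if ($group->value <= $this->value && $group->value >= $match->value) {
                $match = $group;
            }
        }

        return $match;
    }
}

var_dump(ErrorCode::D->group()); // enum(ErrorGroup::WARNING)
```

The relationship between a code and its group then lives in one method rather than in the arithmetic of each case value.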
  115614
July 30, 2021 20:01 rowan.collins@gmail.com (Rowan Tommins)
On 30/07/2021 18:24, Kirill Nesmeyanov wrote:
> enum ErrorGroup: int
> {
>     case NOTICE = 0;
>     case WARNING = 10;
>     case ERROR = 20;
> }
>
> enum ErrorCode: int
> {
>     case UNKNOWN = 0;
>
>     case A = ErrorGroup::NOTICE + 1;
>     case B = ErrorGroup::NOTICE + 2;
>     case C = ErrorGroup::NOTICE + 3;
>     case D = ErrorGroup::WARNING + 1;
>     case E = ErrorGroup::ERROR + 1;
> }
As I say, those are not enums in the way the current feature defines "enum".

An enum, as implemented in the current 8.1 alphas, is not just a special kind of constant, even if it has an integer or string associated. Every enum is its own type (essentially a class), and every value is a distinct value of that type (essentially an instance), which compares equal only to itself.

var_dump(ErrorGroup::WARNING); // enum(ErrorGroup::WARNING);
var_dump(ErrorGroup::WARNING == 10); // bool(false)
var_dump(get_class(ErrorGroup::WARNING)); // string(10) "ErrorGroup"

There are languages where enums are just fancy names for integers, and you can write things like "if ( ErrorCode::D >= ErrorGroup::WARNING )". In my opinion, that makes enums much less useful, because it means you can also write nonsensical things like "if ( Month::JANUARY <= Day::WEDNESDAY )".

If you want to pass around an integer, but have a convenient name for it, you can still use a class constant. The main value of native enums, in my opinion, is being able to write strongly typed code saying "this accepts a value of type ErrorGroup, and nothing else".

Regards,

--
Rowan Tommins
[IMSoP]
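The two options Rowan contrasts can be put side by side. In this sketch, `ErrorOffsets`, `label()` and `rejectsInt()` are hypothetical names introduced for illustration: the class holds named integers, where constant arithmetic is allowed, while the function signature accepts only ErrorGroup cases.

```php
<?php
declare(strict_types=1);

enum ErrorGroup: int
{
    case NOTICE = 0;
    case WARNING = 10;
    case ERROR = 20;
}

// Hypothetical class of named integers: constant arithmetic is allowed here.
final class ErrorOffsets
{
    public const WARNING = 10;
    public const FIRST_WARNING = self::WARNING + 1;
}

// Accepts a value of type ErrorGroup and nothing else.
function label(ErrorGroup $group): string
{
    return strtolower($group->name);
}

// Reports whether passing a plain integer to label() is rejected.
function rejectsInt(): bool
{
    try {
        label(ErrorOffsets::WARNING); // an int, not an ErrorGroup
    } catch (TypeError) {
        return true;
    }

    return false;
}

echo label(ErrorGroup::WARNING), "\n"; // warning
var_dump(rejectsInt());                // bool(true)
```

The integer constant and the enum case share the value 10, yet only the case satisfies the parameter type, which is the guarantee plain constants cannot give.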