[RFC] Under Discussion: Add Random class and RandomNumberGenerator interface

  114682
June 1, 2021 14:28 zeriyoshi@gmail.com (Go Kudo)
Hello internals.
Thanks for continuing to participate in the discussion.

I've sorted out the proposal, and finished writing and implementing the RFC.
(Funny, I know.) I think it's time to start a discussion.

https://wiki.php.net/rfc/rng_extension
https://github.com/php/php-src/pull/7079

The main changes since last time are as follows:

- The ugly RANDOM_USER has been removed.
- RandomNumberGenerator interface has been added for user-defined RNGs.
- Random class is now final.
- Random class now accepts a RandomNumberGenerator interface other than
string as the first argument to the constructor.
- INI directive has been removed. In 32-bit environments, the result is
always truncated.

What I'm struggling with now is the behavior when calling nextInt() in a
32-bit environment using a 64-bit RNG. It currently returns a truncated
result, which means that the code loses compatibility with the result of
running on a 64-bit machine.
I was also considering throwing an exception, but which would be preferable?

I would like to answer a few unanswered questions.

> What is bias?
I' ve confirmed that the bias issue in mt_rand has already been fixed and is no longer a problem. This implementation copies most of it from mt_rand. Therefore, this problem does not exist. Therefore, an equivalent method to mt_getrandmax() is no longer provided.
> uint64 & right-shift
This is a specification of the RNG implementation. PHP does not have unsigned integers, but RNG handles unsigned integers (to be precise, even the sign bit is random). As mentioned above, PHP does not have unsigned integers, which means that using the results generated by RNGs may cause compatibility problems in the future. To avoid this, a 1-bit right shift is always required (similar to mt_rand()).
  114692
June 2, 2021 08:51 j.boggiano@seld.be (Jordi Boggiano)
On 01/06/2021 16:28, Go Kudo wrote:
> Hello internals. > Thanks for continuing to participate in the discussion. > > I've sorted out the proposal, and finished writing and implementing the RFC. > (Funny, I know.) I think it's time to start a discussion. > > https://wiki.php.net/rfc/rng_extension > https://github.com/php/php-src/pull/7079 > > The main changes since last time are as follows: > > - The ugly RANDOM_USER has been removed. > - RandomNumberGenerator interface has been added for user-defined RNGs. > - Random class is now final. > - Random class now accepts a RandomNumberGenerator interface other than > string as the first argument to the constructor. > - INI directive has been removed. In 32-bit environments, the result is > always truncated.
Overall this looks much better! From a PHP userland perspective I can't see any huge problem at first glance. Just a few notes: - "Random class can be serialized or cloned if the algorithm supports it." It isn't clear to me how that support is defined in a userland implementation? Simply by implementing __serialize/__unserialize? - The __unserialize docblock has two typos (Useri*i*alize and *in* instead of if) - If an object is passed to `new Random($obj)`, probably it should throw an InvalidArgumentException if a seed is also passed in, as I assume in that case it would be otherwise be ignored. Best, Jordi -- Jordi Boggiano @seldaek - https://seld.be
  114693
June 2, 2021 11:55 zeriyoshi@gmail.com (Go Kudo)
It was good! I was finally able to put some things together.
Thank you for participating in the discussion.

> typo
Thanks, I fixed it!
> InvalidArgumentException
I think InvalidArgumentException is included in the ext/spl extension, but is it safe to use it from ext/standard? If it is available, then it is certainly more appropriate. Regards, Go Kudo 2021年6月2日(水) 17:51 Jordi Boggiano boggiano@seld.be>:
> On 01/06/2021 16:28, Go Kudo wrote: > > Hello internals. > > Thanks for continuing to participate in the discussion. > > > > I've sorted out the proposal, and finished writing and implementing the > RFC. > > (Funny, I know.) I think it's time to start a discussion. > > > > https://wiki.php.net/rfc/rng_extension > > https://github.com/php/php-src/pull/7079 > > > > The main changes since last time are as follows: > > > > - The ugly RANDOM_USER has been removed. > > - RandomNumberGenerator interface has been added for user-defined RNGs. > > - Random class is now final. > > - Random class now accepts a RandomNumberGenerator interface other than > > string as the first argument to the constructor. > > - INI directive has been removed. In 32-bit environments, the result is > > always truncated. > > Overall this looks much better! From a PHP userland perspective I can't > see any huge problem at first glance. > > Just a few notes: > > - "Random class can be serialized or cloned if the algorithm supports > it." It isn't clear to me how that support is defined in a userland > implementation? Simply by implementing __serialize/__unserialize? > - The __unserialize docblock has two typos (Useri*i*alize and *in* > instead of if) > - If an object is passed to `new Random($obj)`, probably it should throw > an InvalidArgumentException if a seed is also passed in, as I assume in > that case it would be otherwise be ignored. > > Best, > Jordi > > -- > Jordi Boggiano > @seldaek - https://seld.be > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
  114783
June 8, 2021 13:10 zeriyoshi@gmail.com (Go Kudo)
Sorry, I missed the first question.

> "Random class can be serialized or cloned if the algorithm supports it." It isn't clear to me how that support is defined in a userland
implementation? Simply by implementing __serialize/__unserialize? Userland implementations can be serialize / unserialize using the standard PHP serialization mechanism. For example, the following: ```php class UserRNG implements RandomNumberGenerator { protected int $current = 0; public function generate(): int { return ++$this->current; } } $random = new Random(new UserRNG()); $random->nextInt(); $random_serialized = serialize($random); var_dump($random_serialized); // O:6:"Random":2:{i:0;a:1:{s:11:"Randomrng";O:7:"UserRNG":1:{s:10:"*current";i:1;}}i:1;N;} $random_unserialized = unserialize($random_serialized); var_dump($random->nextInt() === $random_unserialized->nextInt()); // true ``` Regards, Go Kudo 2021年6月2日(水) 17:51 Jordi Boggiano boggiano@seld.be>:
> On 01/06/2021 16:28, Go Kudo wrote: > > Hello internals. > > Thanks for continuing to participate in the discussion. > > > > I've sorted out the proposal, and finished writing and implementing the > RFC. > > (Funny, I know.) I think it's time to start a discussion. > > > > https://wiki.php.net/rfc/rng_extension > > https://github.com/php/php-src/pull/7079 > > > > The main changes since last time are as follows: > > > > - The ugly RANDOM_USER has been removed. > > - RandomNumberGenerator interface has been added for user-defined RNGs. > > - Random class is now final. > > - Random class now accepts a RandomNumberGenerator interface other than > > string as the first argument to the constructor. > > - INI directive has been removed. In 32-bit environments, the result is > > always truncated. > > Overall this looks much better! From a PHP userland perspective I can't > see any huge problem at first glance. > > Just a few notes: > > - "Random class can be serialized or cloned if the algorithm supports > it." It isn't clear to me how that support is defined in a userland > implementation? Simply by implementing __serialize/__unserialize? > - The __unserialize docblock has two typos (Useri*i*alize and *in* > instead of if) > - If an object is passed to `new Random($obj)`, probably it should throw > an InvalidArgumentException if a seed is also passed in, as I assume in > that case it would be otherwise be ignored. > > Best, > Jordi > > -- > Jordi Boggiano > @seldaek - https://seld.be > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
  114782
June 8, 2021 12:32 zeriyoshi@gmail.com (Go Kudo)
Hello iinternals.

There doesn't seem to be much mention of it.
But, that may be because it has been discussed well in advance.
Thank you for participating in the discussion.

Now, if there is no particular discussion on this, I will try to start the
voting phase next week.
Of course, I will contact you separately.

However, I was looking back at the implementation and found only one point
of concern.

With the current implementation, the results of the following example will
match.

```php
$one = new Random();
$one->nextInt();
$two = clone $one;

var_dump($one->nextInt() === $two->nextInt()); // true
```

But, this is not true for user implementations.

```php
class UserRNG implements RandomNumberGenerator
{
    protected int $current = 0;

    public function generate(): int
    {
        return ++$this->current;
    }
}

$rng = new UserRNG();
$one = new Random($rng);
$one->nextInt();
$two = clone $one;

var_dump($one->nextInt() === $two->nextInt()); // false
```

This is because `$rng` is kept as a normal property and is managed by the
standard PHP mechanism. In other words, it is passed by reference.

This is not consistent with the built-in RNG behavior. However, I don't see
a problem with this behavior.
I feel that the standard PHP behavior is preferable to changing the
userland behavior in a specific way.

I would like to solicit opinions.

Regards,
Go Kudo

2021年6月1日(火) 23:28 Go Kudo <zeriyoshi@gmail..com>:

> Hello internals. > Thanks for continuing to participate in the discussion. > > I've sorted out the proposal, and finished writing and implementing the > RFC. > (Funny, I know.) I think it's time to start a discussion. > > https://wiki.php.net/rfc/rng_extension > https://github.com/php/php-src/pull/7079 > > The main changes since last time are as follows: > > - The ugly RANDOM_USER has been removed. > - RandomNumberGenerator interface has been added for user-defined RNGs. > - Random class is now final. > - Random class now accepts a RandomNumberGenerator interface other than > string as the first argument to the constructor. > - INI directive has been removed. In 32-bit environments, the result is > always truncated. > > What I'm struggling with now is the behavior when calling nextInt() in a > 32-bit environment using a 64-bit RNG. It currently returns a truncated > result, which means that the code loses compatibility with the result of > running on a 64-bit machine. > I was also considering throwing an exception, but which would be > preferable? > > I would like to answer a few unanswered questions. > > > What is bias? > > I' ve confirmed that the bias issue in mt_rand has already been fixed and > is no longer a problem. > > This implementation copies most of it from mt_rand. Therefore, this > problem does not exist. > > Therefore, an equivalent method to mt_getrandmax() is no longer provided. > > > uint64 & right-shift > > This is a specification of the RNG implementation. PHP does not have > unsigned integers, but RNG handles unsigned integers (to be precise, even > the sign bit is random). > > As mentioned above, PHP does not have unsigned integers, which means that > using the results generated by RNGs may cause compatibility problems in the > future. To avoid this, a 1-bit right shift is always required (similar to > mt_rand()). > > > >
  114794
June 9, 2021 10:07 guilliam.xavier@gmail.com (Guilliam Xavier)
Hello Go Kudo,

On Tue, Jun 8, 2021 at 2:33 PM Go Kudo <zeriyoshi@gmail.com> wrote:

> Hello iinternals. > > There doesn't seem to be much mention of it. > But, that may be because it has been discussed well in advance. > Thank you for participating in the discussion. >
I'm not sure you really addressed https://externals.io/message/114561#11468 ? especially this part: """ If you pick the second option, then you should consistently separate the source for both extension-provided RNGs and user-provided ones. I don't think it makes sense to have extension provided RNGs use a `new Random(RANDOM_XORSHIFT128PLUS)` interface, while user-provided ones use `new Random(new SomeRandomSource)`. Going down this route, it should be `new Random(new XorShift128Plus)`. This is a pure design, which also means that it's more complicated, and may make simple usages harder (though there is certainly room for a Random::createDefault() here.) """ Now, that might sound like it would take us back to the "full OO" version (with multiple classes), which was deemed too "user-*un*friendly" by several people (including me :s), *but* there are some new ideas to consider, e.g. (for the Random class): - add a `public static function createDefault(): self` as suggested by Nikita, or - change `public function __construct` parameters to e.g. `(string $algo = XorShift128Plus::class, ?int $seed = null, mixed ....$extra_args)` (that would internally do a `new $algo($seed, ...$extra_args)` to store), to be called not as `new Random(new UserDefinedRNG($seed_or_null, $foo, $bar))` but as `new Random(UserDefinedRNG::class, $seed_or_null, $foo, $bar)`. Any new opinions (or other ideas)?
> Now, if there is no particular discussion on this, I will try to start the > voting phase next week. > Of course, I will contact you separately. > > However, I was looking back at the implementation and found only one point > of concern. > > With the current implementation, the results of the following example will > match. > > ```php > $one = new Random(); > $one->nextInt(); > $two = clone $one; > > var_dump($one->nextInt() === $two->nextInt()); // true > ``` > > But, this is not true for user implementations. > > ```php > class UserRNG implements RandomNumberGenerator > { > protected int $current = 0; > > public function generate(): int > { > return ++$this->current; > } > } > > $rng = new UserRNG(); > $one = new Random($rng); > $one->nextInt(); > $two = clone $one; > > var_dump($one->nextInt() === $two->nextInt()); // false > ``` > > This is because `$rng` is kept as a normal property and is managed by the > standard PHP mechanism. In other words, it is passed by reference. > > This is not consistent with the built-in RNG behavior. However, I don't see > a problem with this behavior. > I feel that the standard PHP behavior is preferable to changing the > userland behavior in a specific way. > > I would like to solicit opinions. >
Couldn't the Random class simply implement `public function __clone(): void` (that would internally do like `$this->algo = clone $this->algo;`)?
> > Regards, > Go Kudo > > 2021年6月1日(火) 23:28 Go Kudo <zeriyoshi@gmail.com>: > > > Hello internals. > > Thanks for continuing to participate in the discussion. > > > > I've sorted out the proposal, and finished writing and implementing the > > RFC. > > (Funny, I know.) I think it's time to start a discussion. > > > > https://wiki.php.net/rfc/rng_extension > > https://github.com/php/php-src/pull/7079 > > > > The main changes since last time are as follows: > > > > - The ugly RANDOM_USER has been removed. > > - RandomNumberGenerator interface has been added for user-defined RNGs. > > - Random class is now final. > > - Random class now accepts a RandomNumberGenerator interface other than > > string as the first argument to the constructor. > > - INI directive has been removed. In 32-bit environments, the result is > > always truncated. > > > > What I'm struggling with now is the behavior when calling nextInt() in a > > 32-bit environment using a 64-bit RNG. It currently returns a truncated > > result, which means that the code loses compatibility with the result of > > running on a 64-bit machine. > > I was also considering throwing an exception, but which would be > > preferable? >
Indeed, a decision has to be made. If you throw an exception, we wouldn't have the "issue" of different results on 32- vs 64-bit machines, but XorShift128+ and Secure would be unusable on a 32-bit machine, right? How do other existing implementations handle this question?
> > > I would like to answer a few unanswered questions. > > > > > What is bias? > > > > I' ve confirmed that the bias issue in mt_rand has already been fixed and > > is no longer a problem. > > > > This implementation copies most of it from mt_rand. Therefore, this > > problem does not exist. > > > > Therefore, an equivalent method to mt_getrandmax() is no longer provided. >
Great!
> > > > > uint64 & right-shift > > > > This is a specification of the RNG implementation. PHP does not have > > unsigned integers, but RNG handles unsigned integers (to be precise, even > > the sign bit is random). > > > > As mentioned above, PHP does not have unsigned integers, which means that > > using the results generated by RNGs may cause compatibility problems in > the > > future. To avoid this, a 1-bit right shift is always required (similar to > > mt_rand()). >
Good to know, thanks. Regards, -- Guilliam Xavier
  114795
June 9, 2021 10:15 guilliam.xavier@gmail.com (Guilliam Xavier)
> > Couldn't the Random class simply implement `public function __clone(): > void` (that would internally do like `$this->algo = clone $this->algo;`)? >
Sorry I meant like `$this->rng = clone $this->rng;`. PS: Please don't reply to this erratum, but rather to the previous message ;)
  114796
June 9, 2021 13:16 zeriyoshi@gmail.com (Go Kudo)
Thanks Guilliam.

> I'm not sure you really addressed https://externals.io/message/114561#114680 ?
I thought I had solved this problem by implementing the `RandomNumberGenerator` interface. Accepting strings and objects as arguments is certainly complicated, but I thought it met the necessary requirements. From the aspect of static analysis, there seems to be no problem. Reverting to a full object-based approach would increase the cost of providing new RNGs from extensions. I think the string approach is superior in this respect. Nikita's `createDefault()` approach is certainly good, but I think it is still difficult for beginners to understand. I also thought about it for a while after that, and I think that the current implementation is the most appropriate for now. What do you think?
> Couldn't the Random class simply implement `public function __clone(): void` (that would internally do like `$this->algo = clone $this->algo;`)?
Implicitly cloning in areas where the user cannot interfere may cause problems when using objects that cannot be cloned. However, this may be overthinking it. The reason is that a `RandomNumberGenerator` that cannot be cloned should be implemented as algo in the first place. If there are no objections, I will change the implementation.
> Indeed, a decision has to be made. If you throw an exception, we wouldn't have the "issue" of different results on 32- vs 64-bit machines, but
XorShift128+ and Secure would be unusable on a 32-bit machine, right? I don't see any problem with the current implicit truncation behavior for this. Even if a user switches to a 64-bit environment, compatibility can be maintained by explicitly truncating the generated values. In addition, most of the other methods only use 32bit internally. If you are concerned about this, you should probably be concerned about the incompatibility with MT_RAND_PHP. That problem can also be compensated for with an extension that adds algo. Regards, Go Kudo 2021年6月9日(水) 19:07 Guilliam Xavier xavier@gmail.com>:
> Hello Go Kudo, > > On Tue, Jun 8, 2021 at 2:33 PM Go Kudo <zeriyoshi@gmail.com> wrote: > >> Hello iinternals. >> >> There doesn't seem to be much mention of it. >> But, that may be because it has been discussed well in advance. >> Thank you for participating in the discussion. >> > > I'm not sure you really addressed > https://externals.io/message/114561#114680 ? especially this part: > > """ > If you pick the second option, then you should consistently separate the > source for both extension-provided RNGs and user-provided ones. I don't > think it makes sense to have extension provided RNGs use a `new > Random(RANDOM_XORSHIFT128PLUS)` interface, while user-provided ones use > `new Random(new SomeRandomSource)`. Going down this route, it should be > `new Random(new XorShift128Plus)`. This is a pure design, which also means > that it's more complicated, and may make simple usages harder (though there > is certainly room for a Random::createDefault() here.) > """ > > Now, that might sound like it would take us back to the "full OO" version > (with multiple classes), which was deemed too "user-*un*friendly" by > several people (including me :s), > *but* there are some new ideas to consider, e.g. (for the Random class): > > - add a `public static function createDefault(): self` as suggested by > Nikita, or > > - change `public function __construct` parameters to e.g. > `(string $algo = XorShift128Plus::class, ?int $seed = null, mixed > ...$extra_args)` > (that would internally do a `new $algo($seed, ...$extra_args)` to store), > to be called not as > `new Random(new UserDefinedRNG($seed_or_null, $foo, $bar))` but as > `new Random(UserDefinedRNG::class, $seed_or_null, $foo, $bar)`. > > Any new opinions (or other ideas)? > > >> Now, if there is no particular discussion on this, I will try to start the >> voting phase next week. >> Of course, I will contact you separately. >> >> However, I was looking back at the implementation and found only one point >> of concern. >> >> With the current implementation, the results of the following example will >> match. >> >> ```php >> $one = new Random(); >> $one->nextInt(); >> $two = clone $one; >> >> var_dump($one->nextInt() === $two->nextInt()); // true >> ``` >> >> But, this is not true for user implementations. >> >> ```php >> class UserRNG implements RandomNumberGenerator >> { >> protected int $current = 0; >> >> public function generate(): int >> { >> return ++$this->current; >> } >> } >> >> $rng = new UserRNG(); >> $one = new Random($rng); >> $one->nextInt(); >> $two = clone $one; >> >> var_dump($one->nextInt() === $two->nextInt()); // false >> ``` >> >> This is because `$rng` is kept as a normal property and is managed by the >> standard PHP mechanism. In other words, it is passed by reference. >> >> This is not consistent with the built-in RNG behavior. However, I don't >> see >> a problem with this behavior. >> I feel that the standard PHP behavior is preferable to changing the >> userland behavior in a specific way. >> >> I would like to solicit opinions. >> > > Couldn't the Random class simply implement `public function __clone(): > void` (that would internally do like `$this->algo = clone $this->algo;`)? > > >> >> Regards, >> Go Kudo >> >> 2021年6月1日(火) 23:28 Go Kudo <zeriyoshi@gmail.com>: >> >> > Hello internals. >> > Thanks for continuing to participate in the discussion. >> > >> > I've sorted out the proposal, and finished writing and implementing the >> > RFC. >> > (Funny, I know.) I think it's time to start a discussion. >> > >> > https://wiki.php.net/rfc/rng_extension >> > https://github.com/php/php-src/pull/7079 >> > >> > The main changes since last time are as follows: >> > >> > - The ugly RANDOM_USER has been removed. >> > - RandomNumberGenerator interface has been added for user-defined RNGs.. >> > - Random class is now final. >> > - Random class now accepts a RandomNumberGenerator interface other than >> > string as the first argument to the constructor. >> > - INI directive has been removed. In 32-bit environments, the result is >> > always truncated. >> > >> > What I'm struggling with now is the behavior when calling nextInt() in a >> > 32-bit environment using a 64-bit RNG. It currently returns a truncated >> > result, which means that the code loses compatibility with the result of >> > running on a 64-bit machine. >> > I was also considering throwing an exception, but which would be >> > preferable? >> > > Indeed, a decision has to be made. If you throw an exception, we wouldn't > have the "issue" of different results on 32- vs 64-bit machines, but > XorShift128+ and Secure would be unusable on a 32-bit machine, right? > How do other existing implementations handle this question? > > > >> > I would like to answer a few unanswered questions. >> > >> > > What is bias? >> > >> > I' ve confirmed that the bias issue in mt_rand has already been fixed >> and >> > is no longer a problem. >> > >> > This implementation copies most of it from mt_rand. Therefore, this >> > problem does not exist. >> > >> > Therefore, an equivalent method to mt_getrandmax() is no longer >> provided. >> > > Great! > > >> > >> > > uint64 & right-shift >> > >> > This is a specification of the RNG implementation. PHP does not have >> > unsigned integers, but RNG handles unsigned integers (to be precise, >> even >> > the sign bit is random). >> > >> > As mentioned above, PHP does not have unsigned integers, which means >> that >> > using the results generated by RNGs may cause compatibility problems in >> the >> > future. To avoid this, a 1-bit right shift is always required (similar >> to >> > mt_rand()). >> > > Good to know, thanks. > > Regards, > > -- > Guilliam Xavier >
  114800
June 9, 2021 16:11 guilliam.xavier@gmail.com (Guilliam Xavier)
On Wed, Jun 9, 2021 at 3:16 PM Go Kudo <zeriyoshi@gmail.com> wrote:

> Thanks Guilliam. > > > I'm not sure you really addressed > https://externals.io/message/114561#114680 ? > > I thought I had solved this problem by implementing the > `RandomNumberGenerator` interface. > > Accepting strings and objects as arguments is certainly complicated, but I > thought it met the necessary requirements. > From the aspect of static analysis, there seems to be no problem. > > Reverting to a full object-based approach would increase the cost of > providing new RNGs from extensions. I think the string approach is superior > in this respect. >
How much more "costly" would it be to define a class (implementing RandomNumberGenerator) and use its (full) name as the algo identifier?
> Nikita's `createDefault()` approach is certainly good, but I think it is > still difficult for beginners to understand. > > I also thought about it for a while after that, and I think that the > current implementation is the most appropriate for now. What do you think? >
I still agree with Nikita that there's a discrepancy between e.g. `new Random(RANDOM_XXX, $seed)` and `new Random(new CustomRNG(/*custom args*/))`, which also poses additional questions, e.g.: - (already pointed out by Jordi) `new Random(new CustomRNG(/*custom args*/), $seed)` is "illogical" (the passed $seed won't be used) but "possible" (even if your current implementation throws a ValueError when $seed is non-null); - This opens the possibility of "leaking" the RNG "source" (even if no `public function getRNG()`), e.g.: ``` $rng = new CustomRNG(/*custom args*/); $r1 = new Random($rng); $r2 = new Random($rng); var_dump($r1->nextInt() === $r2->nextInt()); // false (not true), I suppose? ``` or: ``` $rng = new CustomRNG(/*custom args*/); $r = new Random($rng); var_dump($r->nextInt()); // 1st of sequence $rng->generate(); var_dump($r->nextInt()); // 3rd of sequence (not 2nd), I suppose? ``` (possibly in less obvious ways), which may be considered bad or not (but still a discrepancy vs extension-provided algos). Again, taking a class name and optional extra args (or an $options array, like password_hash(), maybe even including 'seed' only for the algos that need one) to construct, rather than an already constructed object, might be better? But that would also "split" the constructor args from the class (increasing the risk of "mismatch"), and make using anonymous classes less easy, and maybe we haven't considered all the uses cases (e.g. what about `Random::getAlgoInfo(CustomRNG::class)`?)... So that might also be worse :/ It would be great to have more insights! (if nobody else speaks up then let's keep it as is of course)
> > Couldn't the Random class simply implement `public function __clone(): > void` (that would internally do like `$this->rng = clone $this->rng;`)? > > Implicitly cloning in areas where the user cannot interfere may cause > problems when using objects that cannot be cloned. > However, this may be overthinking it. The reason is that a > `RandomNumberGenerator` that cannot be cloned should be implemented as algo > in the first place. > If there are no objections, I will change the implementation. >
Ah, true, I hadn't thought about non-clonable implementations... But that's already the case for `__serialize()` and non-serializable implementations, right? (But let's wait a bit for possible objections, indeed)
> > Indeed, a decision has to be made. If you throw an exception, we > wouldn't have the "issue" of different results on 32- vs 64-bit machines, > but XorShift128+ and Secure would be unusable on a 32-bit machine, right? > > I don't see any problem with the current implicit truncation behavior for > this. Even if a user switches to a 64-bit environment, compatibility can be > maintained by explicitly truncating the generated values. In addition, most > of the other methods only use 32bit internally. > If you are concerned about this, you should probably be concerned about > the incompatibility with MT_RAND_PHP. > That problem can also be compensated for with an extension that adds algo. >
I thought you were "struggling with [this implicit truncation] behavior when calling nextInt() in a 32-bit environment using a 64-bit RNG [...] which means that the code loses compatibility with the result of running on a 64-bit machine"? And you asked if throwing an exception would be preferable? Anyway, I personally don't care about 32-bit (but other people may).
> Regards, > Go Kudo >
Regards, -- Guilliam Xavier
  114833
June 12, 2021 10:30 zeriyoshi@gmail.com (Go Kudo)
Sorry, I'm late to reply.

> How much more "costly" would it be to define a class (implementing RandomNumberGenerator) and use its (full) name as the algo identifier?
If the Random class always accepts an instance of the RandomNumberGenerator, it will be necessary to provide a class that implements the RandomNumberGenerator whenever you try to implement a new RNG. The current approach of specifying the algorithm by string does not require the implementation of a class and is more developer friendly. However, I may be the only one who writes an extension that adds RNG in the first place, so the additional cost is well worth it.
> (already pointed out by Jordi) new Random(new CustomRNG(/*custom args*/), $seed) is "illogical" (the passed $seed won't be used) but
"possible" (even if your current implementation throws a ValueError when $seed is non-null); I thought about it carefully after that, and this is indeed a problem. It is difficult to find the problem easily by static analysis or other methods.
> 64-bit problems
I feel like I'm making too big a deal out of this. This kind of incompatibility happens some time in userland. In conclusion, I see no problem with the implicit truncation behavior. Indeed, the current implementation appears to have several problems. So, how about the following implementation? ```php interface RandomNumberGenerator { public function generate(): int; } final class Random { private RandomNumberGenerator $rng; public function __construct(?RandomNumberGenerator $rng = null) { $this->rng = $rng ?? new XorShift128Plus(random_int(PHP_INT_MIN, PHP_INT_MAX)); } public function nextInt(): int {} public function getInt(int $min, int $max): int {} public function getBytes(int $length): string {} public function shuffleArray(array $array): array {} public function shuffleString(string $string): string {} public function __serialize(): array {} public function __unserialize(array $data): void {} } class XorShift128PlusNumberGenerator implements RandomNumberGenerator { public function generate(): int {} public function __serialize(): array {} public function __unserialize(array $data): void {} } class MT19937NumberGenerator implements RandomNumberGenerator { public function generate(): int {} public function __serialize(): array {} public function __unserialize(array $data): void {} } class SecureNumberGenerator implements RandomNumberGenerator { public function generate(): int {} } ``` This is an approach similar to Nikita's `createDefault`. If the constructor has a null argument, it uses the default, XorShift128+, internally. Also, whether the Random class is serializable or clonable depends on the instance of RandomNumberGenerator being used. This means that when the Random class clone is called, the `$rng` member will be implicitly cloned. How about this? Regards, Go Kudo 2021年6月10日(木) 1:11 Guilliam Xavier xavier@gmail.com>:
> > On Wed, Jun 9, 2021 at 3:16 PM Go Kudo <zeriyoshi@gmail.com> wrote: > >> Thanks Guilliam. >> >> > I'm not sure you really addressed >> https://externals.io/message/114561#114680 ? >> >> I thought I had solved this problem by implementing the >> `RandomNumberGenerator` interface. >> >> Accepting strings and objects as arguments is certainly complicated, but >> I thought it met the necessary requirements. >> From the aspect of static analysis, there seems to be no problem. >> >> Reverting to a full object-based approach would increase the cost of >> providing new RNGs from extensions. I think the string approach is superior >> in this respect. >> > > How much more "costly" would it be to define a class (implementing > RandomNumberGenerator) and use its (full) name as the algo identifier? > > >> Nikita's `createDefault()` approach is certainly good, but I think it is >> still difficult for beginners to understand. >> >> I also thought about it for a while after that, and I think that the >> current implementation is the most appropriate for now. What do you think? >> > > I still agree with Nikita that there's a discrepancy between e.g. `new > Random(RANDOM_XXX, $seed)` and `new Random(new CustomRNG(/*custom > args*/))`, which also poses additional questions, e.g.: > > - (already pointed out by Jordi) `new Random(new CustomRNG(/*custom > args*/), $seed)` is "illogical" (the passed $seed won't be used) but > "possible" (even if your current implementation throws a ValueError when > $seed is non-null); > > - This opens the possibility of "leaking" the RNG "source" (even if no > `public function getRNG()`), e.g.: > > ``` > $rng = new CustomRNG(/*custom args*/); > $r1 = new Random($rng); > $r2 = new Random($rng); > var_dump($r1->nextInt() === $r2->nextInt()); // false (not true), I > suppose? > ``` > > or: > > ``` > $rng = new CustomRNG(/*custom args*/); > $r = new Random($rng); > var_dump($r->nextInt()); // 1st of sequence > $rng->generate(); > var_dump($r->nextInt()); // 3rd of sequence (not 2nd), I suppose? > ``` > > (possibly in less obvious ways), which may be considered bad or not (but > still a discrepancy vs extension-provided algos). > > Again, taking a class name and optional extra args (or an $options array, > like password_hash(), maybe even including 'seed' only for the algos that > need one) to construct, rather than an already constructed object, might be > better? > > But that would also "split" the constructor args from the class > (increasing the risk of "mismatch"), and make using anonymous classes less > easy, and maybe we haven't considered all the uses cases (e.g. what about > `Random::getAlgoInfo(CustomRNG::class)`?)... So that might also be worse :/ > > It would be great to have more insights! (if nobody else speaks up then > let's keep it as is of course) > > >> > Couldn't the Random class simply implement `public function __clone(): >> void` (that would internally do like `$this->rng = clone $this->rng;`)? >> >> Implicitly cloning in areas where the user cannot interfere may cause >> problems when using objects that cannot be cloned. >> However, this may be overthinking it. The reason is that a >> `RandomNumberGenerator` that cannot be cloned should be implemented as algo >> in the first place. >> If there are no objections, I will change the implementation. >> > > Ah, true, I hadn't thought about non-clonable implementations... But > that's already the case for `__serialize()` and non-serializable > implementations, right? (But let's wait a bit for possible objections, > indeed) > > >> > Indeed, a decision has to be made. If you throw an exception, we >> wouldn't have the "issue" of different results on 32- vs 64-bit machines, >> but XorShift128+ and Secure would be unusable on a 32-bit machine, right? >> >> I don't see any problem with the current implicit truncation behavior for >> this. Even if a user switches to a 64-bit environment, compatibility can be >> maintained by explicitly truncating the generated values. In addition, most >> of the other methods only use 32bit internally. >> If you are concerned about this, you should probably be concerned about >> the incompatibility with MT_RAND_PHP. >> That problem can also be compensated for with an extension that adds algo. >> > > I thought you were "struggling with [this implicit truncation] behavior > when calling nextInt() in a 32-bit environment using a 64-bit RNG [...] > which means that the code loses compatibility with the result of running on > a 64-bit machine"? And you asked if throwing an exception would be > preferable? > Anyway, I personally don't care about 32-bit (but other people may). > > >> Regards, >> Go Kudo >> > > Regards, > > -- > Guilliam Xavier >
  114854
June 14, 2021 09:31 guilliam.xavier@gmail.com (Guilliam Xavier)
> Indeed, the current implementation appears to have several problems. So, > how about the following implementation? > > ```php > interface RandomNumberGenerator > { > public function generate(): int; > } > > final class Random > { > private RandomNumberGenerator $rng; > public function __construct(?RandomNumberGenerator $rng = null) > { > $this->rng = $rng ?? new XorShift128Plus(random_int(PHP_INT_MIN, > PHP_INT_MAX)); > } > public function nextInt(): int {} > public function getInt(int $min, int $max): int {} > public function getBytes(int $length): string {} > public function shuffleArray(array $array): array {} > public function shuffleString(string $string): string {} > public function __serialize(): array {} > public function __unserialize(array $data): void {} > } > > class XorShift128PlusNumberGenerator implements RandomNumberGenerator > { > public function generate(): int {} > public function __serialize(): array {} > public function __unserialize(array $data): void {} > } > > class MT19937NumberGenerator implements RandomNumberGenerator > { > public function generate(): int {} > public function __serialize(): array {} > public function __unserialize(array $data): void {} > } > > class SecureNumberGenerator implements RandomNumberGenerator > { > public function generate(): int {} > } > ``` > > This is an approach similar to Nikita's `createDefault`. If the > constructor has a null argument, it uses the default, XorShift128+, > internally. > > Also, whether the Random class is serializable or clonable depends on the > instance of RandomNumberGenerator being used. This means that when the > Random class clone is called, the `$rng` member will be implicitly cloned. > > How about this? >
Hello, thanks for thinking again. A few editorial notes: - I guess that would be "new XorShift128PlusNumberGenerator" (like the class name)? - The non-Secure implementations' stubs are probably missing `public function __construct(?int $seed = null) {}`? - The Random class' and non-Secure implementations' stubs are probably missing `public function __clone(): void {}` (like `__serialize()`)? That would let us use the API like this: 1. Construction: one of: - default RNG and seed: `$random = new Random();` - chosen RNG, default seed: e.g. `$random = new Random(new MT19937NumberGenerator());` - chosen RNG and seed: e.g. `$random = new Random(new MT19937NumberGenerator(1234));` (no "default RNG, chosen seed", but the default RNG can be documented, and one could argue that a chosen seed only makes sense with a chosen RNG anyway) 2. Usage: `$int = $random->nextInt();`, `$percent = $random->getInt(0, 100);`, `$dword = $random->getBytes(4);`, `$shuffledList = $random->shuffleArray($list);` etc. I think this is well-consistent with the "pure design" described by Nikita, and I personally find it both flexible/extensible and easy-to-use =) (Just beware that the namespace question will probably pop up again.) PS: I feel like my numerous questions/suggestions (in this thread and the previous ones) may also have caused some deviations, so I hope that I won't need more and that other participants will reach a consensus... Best regards, -- Guilliam Xavier