[RFC] [Draft] Add Randomizer class (before: Add RNG extension)

  114561
May 22, 2021 20:56 zeriyoshi@gmail.com (Go Kudo)
Hi, Internals and all participated in the previous discussion.

RFCs have been cleaned up and the proposal has been substantially changed.

https://wiki.php.net/rfc/rng_extension

First of all, I apologize for not checking out the implementation of
password_hash(), which is a precedent to learn from.

I think I've answered all the questions I've been getting on Open Issues.
If I have left anything out, please let me know.

Regards,
Go Kudo
  114565
May 24, 2021 14:46 guilliam.xavier@gmail.com (Guilliam Xavier)
On Sat, May 22, 2021 at 10:57 PM Go Kudo <zeriyoshi@gmail.com> wrote:

> Hi, Internals and all participated in the previous discussion. > > RFCs have been cleaned up and the proposal has been substantially changed. > > https://wiki.php.net/rfc/rng_extension > > First of all, I apologize for not checking out the implementation of > password_hash(), which is a precedent to learn from. > > I think I've answered all the questions I've been getting on Open Issues. > If I have left anything out, please let me know. > > Regards, > Go Kudo >
Hi, This really looks like a different proposal indeed (simpler and clearer), thanks! A few (new) remarks/questions: - Naming: "Randomizer", but some have suggested just "Random", or "RNG" (or "PRNG"). - [I was about to say "If typical usage is expected to provide a seed (for reproducibility) more often than choosing a non-default algorithm, maybe the [optional] constructor parameters should be in the according order ($seed, $algo)?", but actually I'm not sure of "expected typical usage", and the current order ($algo, $seed) seems more "logical" (e.g. I think RANDOMIZER_SECURE will ignore $seed?), and we can use named arguments like `new Randomizer(seed: 1234)` if needed, so...] - Does `?int $seed = null` default to `time()` internally? or to something else? [not sure if important...] - Does `?int $min = PHP_INT_MIN` default to `PHP_INT_MIN` even if we pass `null`? But, given that we can use named arguments, do $min/$max really need to be nullable? - About `shuffle(array|string $target): array|string`: I think just "value" would be better than "target", and I'm not sure about the union type (notably for static analysis)... Ideally it should be distinct `(array $value): array` and `(string $value): string`, but that probably requires two distinct names? - For internal implementation, isn't there a signed/unsigned "mismatch" between PHP `function next(): int` and C `uint64_t (*next)(void)` return types? Regards, -- Guilliam Xavier
  114567
May 24, 2021 15:16 tekiela246@gmail.com (Kamil Tekiela)
Hi Go Kudo,

This proposal looks much better. Well done.
What would help is if you showed the examples from the introduction section
rewritten using the new API.
I don't think it makes sense to make parameters (?int $min = PHP_INT_MIN,
?int $max = PHP_INT_MAX) nullable.
Regarding the method names, I would be afraid of naming them using PHP
keywords. Maybe they should be called getInt and getBytes?
Why isn't the new class in a new extension? Is there a reason why it should
go into core?
Regarding the name, I think Random or Randomizer is fine, but I would avoid
acronyms.

Regards,
Kamil
  114592
May 25, 2021 14:41 Danack@basereality.com (Dan Ackroyd)
Hi Go,

On Sat, 22 May 2021 at 21:57, Go Kudo <zeriyoshi@gmail.com> wrote:
> > If I have left anything out, please let me know.
I guess the RFC is still being drafted, but some bits are still quite unclear, particularly whether the proposal includes deprecating existing functions. My suggestion would be to not deprecate anything for the time being. Although the new api may be better, it's a small improvement, and would be quite annoying for people who don't need to migrate. Longer term, splitting up/forking the core libraries, so that it is easier to evolve them is something we should be looking at. But that's a big task. cheers Dan Ack
  114594
May 25, 2021 15:35 zeriyoshi@gmail.com (Go Kudo)
Hi, Thanks for the response.

The RFC has been revised based on the points you pointed out.

https://wiki.php.net/rfc/rng_extension

The main changes are as follows:

- Class name has been changed from `Randomizer` to `Random` .
- Added a static method `getNonBiasedMax()` to get the safe range.
- `int()` and `bytes()` have been renamed to `getInt()` and `getBytes()`
for avoid future reserved words.
- `getInt()` arguments no longer accept null.
- `shuffle(array|string $target): array|string` has been separated into
`arrayShuffle(array $array)` and `stringShuffle(string $string): string`
for more comfortable static-analysis.
- fix: php_random_algo struct (included uint64_t -> int64_t)

Answer a few questions:

> When $seed is null, what is used for the seed value?
Depends on the algo's implementation, but basically it is using internal `php_random_int()`. It is similar to `mt_srand()` on PHP 8.1. https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13
> Why cancelled RNG Extension?
As a result of discussions during the draft, the functions became a single class and no longer need to be separated. The functionality for random numbers is now included in ext/standard and will conform to this. (e.g. rand.c random.c)
> Deprecation
Dropped in the last RFC update. These were premature and should not have been included with the RFCs that add new features. If the direction seems generally okay, I'd like to start implementing it to show more details. Regards, Go Kudo 2021年5月23日(日) 5:56 Go Kudo <zeriyoshi@gmail..com>:
> Hi, Internals and all participated in the previous discussion. > > RFCs have been cleaned up and the proposal has been substantially changed.. > > https://wiki.php.net/rfc/rng_extension > > First of all, I apologize for not checking out the implementation of > password_hash(), which is a precedent to learn from. > > I think I've answered all the questions I've been getting on Open Issues. > If I have left anything out, please let me know. > > Regards, > Go Kudo >
  114608
May 26, 2021 10:16 guilliam.xavier@gmail.com (Guilliam Xavier)
On Tue, May 25, 2021 at 5:36 PM Go Kudo <zeriyoshi@gmail.com> wrote:

> Hi, Thanks for the response. > > The RFC has been revised based on the points you pointed out. > > https://wiki.php.net/rfc/rng_extension >
Thanks, it looks like you have addressed all previous points (for me at least). But also introduced a new one ;) with the new `static function getNonBiasedMax(string $algo): int`... (Note: I think some questions below could be answered by the list in general, not only Go Kudo.) Let's compare these two equivalent functions: function f1(int $seed): void { mt_srand($seed); $a = mt_rand(); $b = mt_rand(); var_dump($a, $b); } function f2(int $seed): void { $random = new Random(RANDOM_MT19937, $seed); $max = Random::getNonBiasedMax(RANDOM_MT19937); $a = $random->getInt(0, $max); $b = $random->getInt(0, $max); var_dump($a, $b); } In particular, note that we did *not* need to write the explicit/long version of f1: function f1_explicit(int $seed): void { mt_srand($seed); $max = mt_getrandmax(); $a = mt_rand(0, $max); $b = mt_rand(0, $max); var_dump($a, $b); } But what would happen with the implicit/short version of f2? function f2_implicit(int $seed): void { $random = new Random(RANDOM_MT19937, $seed); $a = $random->getInt(); $b = $random->getInt(); var_dump($a, $b); } Would we get "biased" results (by the way, what does that mean exactly)? like `mt_rand(PHP_INT_MIN, PHP_INT_MAX)`? Couldn't the default min/max be made "safe"? or maybe "dynamic" depending on the algo/implementation? Also, let's consider this: function g(Random $random): void { $max = /* ??? */; $a = $random->getInt(0, $max); $b = $random->getInt(0, $max); var_dump($a, $b); } Here, how to get the "non-biased max" for this Random instance (unknown algo)? Moreover, what would `Random::getNonBiasedMax(RANDOM_USER)` return? I think we would rather want/need to call e.g. `FixedNumberForTest::getNonBiasedMax()` (potentially overridden)? Maybe you could add a (non-static) `function getAlgo(): string`, so we could at least call `$random::getNonBiasedMax($random->getAlgo())`? (maybe it could also be more generally useful, possibly along with a `getSeed()`, akin to `password_get_info(string $hash)`?) or a non-static `function getNonBiasedMax(): int`, and rename the static one? (or even drop it after all? how often will we need it without having an instance? and if needed, is `new Random($algo, 0)` a costly operation?) or some other solution someone can think of? Ah that made me think: should some methods better be `final`? Finally, the current "Open Issues" section should probably renamed to "Discussion" or even "FAQ" here? Regards, -- Guilliam Xavier
  114678
May 31, 2021 14:04 zeriyoshi@gmail.com (Go Kudo)
Hi Internals.

I apologize for the discussion outside the ML. Here's a brief history.

https://github.com/php/php-src/pull/7079

- To test the implementation, I sent a Pull-Request to php-src on GitHub.
- An excellent point was made, and we ended up having a discussion about it
on GitHub.
- I was going to change the current implementation RFC to Under Discussion,
but decided against it.

Also, sorry for the delay in answering your questions. I am not very good
at English and it is taking me a long time.

The current implementation and RFC:

- https://github.com/php/php-src/pull/7079
- https://wiki.php.net/rfc/rng_extension

However, the implementation of this user-defined class is very ugly and we
intend to improve it as follows.

Before:
```php

class Random
{
    public function __construct(string $algo = RANDOM_XORSHIFT128PLUS, ?int
$seed = null) {}
    //....
}

class UserDefinedRandom extends Random
{
    protected function next(): int
    {
        return 1;
     }
}
```

After:
```php
interface RandomNumberGenerator
{
    public function generate(): int;
}


final class Random
{
    public function __construct(string|RandomNumberGenerator $algo =
RANDOM_XORSHIFT128PLUS, ?int $seed = null) {}
}
```

2021年5月26日(水) 0:35 Go Kudo <zeriyoshi@gmail..com>:

> Hi, Thanks for the response. > > The RFC has been revised based on the points you pointed out. > > https://wiki.php.net/rfc/rng_extension > > The main changes are as follows: > > - Class name has been changed from `Randomizer` to `Random` . > - Added a static method `getNonBiasedMax()` to get the safe range. > - `int()` and `bytes()` have been renamed to `getInt()` and `getBytes()` > for avoid future reserved words. > - `getInt()` arguments no longer accept null. > - `shuffle(array|string $target): array|string` has been separated into > `arrayShuffle(array $array)` and `stringShuffle(string $string): string` > for more comfortable static-analysis. > - fix: php_random_algo struct (included uint64_t -> int64_t) > > Answer a few questions: > > > When $seed is null, what is used for the seed value? > > Depends on the algo's implementation, but basically it is using internal > `php_random_int()`. > It is similar to `mt_srand()` on PHP 8.1. > > > https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13 > > > Why cancelled RNG Extension? > > As a result of discussions during the draft, the functions became a single > class and no longer need to be separated. The functionality for random > numbers is now included in ext/standard and will conform to this. (e.g. > rand.c random.c) > > > Deprecation > > Dropped in the last RFC update. > These were premature and should not have been included with the RFCs that > add new features. > > If the direction seems generally okay, I'd like to start implementing it > to show more details. > > Regards, > Go Kudo > > 2021年5月23日(日) 5:56 Go Kudo <zeriyoshi@gmail.com>: > >> Hi, Internals and all participated in the previous discussion. >> >> RFCs have been cleaned up and the proposal has been substantially changed. >> >> https://wiki.php.net/rfc/rng_extension >> >> First of all, I apologize for not checking out the implementation of >> password_hash(), which is a precedent to learn from. >> >> I think I've answered all the questions I've been getting on Open Issues.. >> If I have left anything out, please let me know. >> >> Regards, >> Go Kudo >> >
  114680
June 1, 2021 10:43 nikita.ppv@gmail.com (Nikita Popov)
On Mon, May 31, 2021 at 4:04 PM Go Kudo <zeriyoshi@gmail.com> wrote:

> Hi Internals. >
Hi :) Thanks for your continued work on this RFC. This is a tough one, with a lot of feedback, some of it contradictory. I'm still going to add my 2c.... I think that this proposal can basically work out in two ways: 1. Pragmatic: Have a Random class that represents both the raw randomness source and ways to use it. 2. Pure: Separate the randomness source from transformations based on it. Your current design sits in a weird middle ground between these, where you can extend the Random class to insert a user-provided randomness source. I think you should pick between one or the other option. If you pick the first option, then user-extensibility should simply not be supported at all. The class should be final, and only support a limit set of extension-provided RNGs. This is a pragmatic design that covers most use-cases. If you pick the second option, then you should consistently separate the source for both extension-provided RNGs and user-provided ones. I don't think it makes sense to have extension provided RNGs use a `new Random(RANDOM_XORSHIFT128PLUS)` interface, while user-provided ones use `new Random(new SomeRandomSource)`. Going down this route, it should be `new Random(new XorShift128Plus)`. This is a pure design, which also means that it's more complicated, and may make simple usages harder (though there is certainly room for a Random::createDefault() here.) As a minor note, I don't think we should add something like random.ignore_generated_size_exceeded. Our general policy is to avoid the introduction of ini settings that affect runtime behavior where possible. We should pick one of the behaviors here and stick with it. Regards, Nikita I apologize for the discussion outside the ML. Here's a brief history.
> > https://github.com/php/php-src/pull/7079 > > - To test the implementation, I sent a Pull-Request to php-src on GitHub. > - An excellent point was made, and we ended up having a discussion about it > on GitHub. > - I was going to change the current implementation RFC to Under Discussion, > but decided against it. > > Also, sorry for the delay in answering your questions. I am not very good > at English and it is taking me a long time. > > The current implementation and RFC: > > - https://github.com/php/php-src/pull/7079 > - https://wiki.php.net/rfc/rng_extension > > However, the implementation of this user-defined class is very ugly and we > intend to improve it as follows. > > Before: > ```php > > class Random > { > public function __construct(string $algo = RANDOM_XORSHIFT128PLUS, ?int > $seed = null) {} > //.... > } > > class UserDefinedRandom extends Random > { > protected function next(): int > { > return 1; > } > } > ``` > > After: > ```php > interface RandomNumberGenerator > { > public function generate(): int; > } > > > final class Random > { > public function __construct(string|RandomNumberGenerator $algo = > RANDOM_XORSHIFT128PLUS, ?int $seed = null) {} > } > ``` > > 2021年5月26日(水) 0:35 Go Kudo <zeriyoshi@gmail.com>: > > > Hi, Thanks for the response. > > > > The RFC has been revised based on the points you pointed out. > > > > https://wiki.php.net/rfc/rng_extension > > > > The main changes are as follows: > > > > - Class name has been changed from `Randomizer` to `Random` . > > - Added a static method `getNonBiasedMax()` to get the safe range. > > - `int()` and `bytes()` have been renamed to `getInt()` and `getBytes()` > > for avoid future reserved words. > > - `getInt()` arguments no longer accept null. > > - `shuffle(array|string $target): array|string` has been separated into > > `arrayShuffle(array $array)` and `stringShuffle(string $string): string` > > for more comfortable static-analysis. > > - fix: php_random_algo struct (included uint64_t -> int64_t) > > > > Answer a few questions: > > > > > When $seed is null, what is used for the seed value? > > > > Depends on the algo's implementation, but basically it is using internal > > `php_random_int()`. > > It is similar to `mt_srand()` on PHP 8.1. > > > > > > > https://github.com/php/php-src/commit/53ee3f7f897f7ee33a4c45210014648043386e13 > > > > > Why cancelled RNG Extension? > > > > As a result of discussions during the draft, the functions became a > single > > class and no longer need to be separated. The functionality for random > > numbers is now included in ext/standard and will conform to this. (e.g. > > rand.c random.c) > > > > > Deprecation > > > > Dropped in the last RFC update. > > These were premature and should not have been included with the RFCs that > > add new features. > > > > If the direction seems generally okay, I'd like to start implementing it > > to show more details. > > > > Regards, > > Go Kudo > > > > 2021年5月23日(日) 5:56 Go Kudo <zeriyoshi@gmail.com>: > > > >> Hi, Internals and all participated in the previous discussion. > >> > >> RFCs have been cleaned up and the proposal has been substantially > changed. > >> > >> https://wiki.php.net/rfc/rng_extension > >> > >> First of all, I apologize for not checking out the implementation of > >> password_hash(), which is a precedent to learn from. > >> > >> I think I've answered all the questions I've been getting on Open > Issues. > >> If I have left anything out, please let me know. > >> > >> Regards, > >> Go Kudo > >> > > >