This repository has been archived on 2021-05-25. You can view files and clone it, but cannot push or open issues or pull requests.
pyRCV2/docs/rng.md

14 lines
1.2 KiB
Markdown
Raw Permalink Normal View History

2020-12-24 21:09:47 +11:00
# Deterministic random number generator specification
The deterministic random number generator used in pyRCV2 is based on an algorithm by [Ronald L Rivest](https://people.csail.mit.edu/rivest/sampler.py).
The algorithm takes a *seed* value, which is an arbitrary character string. The algorithm has, in its internal state, a *counter*, whose value is initially 0.
In order to generate a value between 0 (inclusive) and *n* (exclusive), to the state is appended a comma (",") followed by the value of the counter, and a SHA-256 hash *H* is computed of the resulting string encoded using UTF-8. The hash *H* is represented as an unsigned hexadecimal integer, *k*. The counter is incremented by 1.
2021-05-23 01:31:54 +10:00
In order to avoid modulo bias, if *k* ≥ ⌊*M*/*n*⌋ × *n* (where *M* = 2^256), *k* is discarded and the algorithm is repeated.
2020-12-24 21:09:47 +11:00
Otherwise, the result is *k* modulo *n*.
2021-05-23 01:31:54 +10:00
In revision [86695aa](https://yingtongli.me/git/pyRCV2/commit/?id=86695aa67e2af936a84d78cb7a7aed9844f61ca9) and older, a slightly different formula was in use to avoid modulo bias – see the documentation of previous revisions for details. It is theoretically possible, though overwhelmingly unlikely, that this may result in different outcomes.