C++11 introduced robust random numbers library,

I will attempt to create short tutorial/HOWTO for this library.

Perhaps you want random double, not integer?

Or maybe you don't really want every number to have same probability? This is actually often needed.

For example you may want to generate random value for human height. Wikipedia claims that tallest man on earth was Robert Waldow, with height of 272cm (8 ft 11.1 in). Peter Dinklage (Tyrion Lannister) is 135 cm (4 ft 5 in) tall. So random height should be

Only now, it is equally possible to generate number in range 170-190 (normal height) as in range 250-270 (extreme height). For our random variable to match real world, it should favor values around 176 cm (average male height in U.S.).

This is what Normal Distribution is about. Statistics show, that U.S. males have height avarage 175.9 cm with standard deviation 7.4 cm (https://tall.life/height-percentile-calculator-age-country/). This means that majority (68.2%) of randomly returned values should be within range 168,5-183,3.

It is possible to achieve such distribution with

Alternatively you could seed engine with time data, either by

others are also useful, but I don't feel confident enough to give sensible examples for them. List of available distributions is here: http://en.cppreference.com/w/cpp/concept/RandomNumberDistribution
Every distribution gives either real number (

What changed in C++11 is that implementations of generators are standardized and not implementation dependent as was with

*<random>*. Its author does great introduction in the video from Cppcon:## Bad, old times

Before C++11, when you wanted to get random numbers in range {1, 2, ... 10}, you wrote something like this:#include <stdio.h> /* printf, NULL */ #include <stdlib.h> /* srand, rand */ #include <time.h> /* time */ int main () { srand (time(NULL)); int num1 = rand()%10 + 1; int num2 = rand()%10 + 1; return 0; }Several things are wrong here:

- It always gives uniform distribution (1 is as probable as 5). This is usually fine, but when not, you need a lot of additional work.
- Actually, it is not very uniform.
- If you want for some reason to reseed the generator (call
*srand(time(NULL))*again) shortly after first seeding, you may get repeated answers - You only get discreet numbers (integers)
- Random algorithm is not specified, so results are not portable/reproducible
*%10 + 1*is not very pretty- a lot of thread safety problems

### (Not) uniform distribution

Function

*rand()*gives random number in range [0; RAND_MAX], where*RAND_MAX*is at least 32767 (my machine: 2147483647 = 0x7fffffff = max signed int). So this causes two problems- If you want wide range of random numbers (more than 32767), it is not guaranteed that you will get them
- If your range is not power of 2 (10 isn't), your distribution will be skewed toward lower numbers.

*RAND_MAX*was 32700, then operation rand()%10 wold work fine. But for 32767, numbers {0, 1, ... 7} are more probable than {8, 9}.### API limitations

Generating random integer with uniform distribution is quite often needed, but sometimes it is not sufficient.Perhaps you want random double, not integer?

Or maybe you don't really want every number to have same probability? This is actually often needed.

For example you may want to generate random value for human height. Wikipedia claims that tallest man on earth was Robert Waldow, with height of 272cm (8 ft 11.1 in). Peter Dinklage (Tyrion Lannister) is 135 cm (4 ft 5 in) tall. So random height should be

*auto height = 135 + rand() % (272-135);*Only now, it is equally possible to generate number in range 170-190 (normal height) as in range 250-270 (extreme height). For our random variable to match real world, it should favor values around 176 cm (average male height in U.S.).

This is what Normal Distribution is about. Statistics show, that U.S. males have height avarage 175.9 cm with standard deviation 7.4 cm (https://tall.life/height-percentile-calculator-age-country/). This means that majority (68.2%) of randomly returned values should be within range 168,5-183,3.

It is possible to achieve such distribution with

*rand()*, but is quite complicated mathematical procedure.## Modern world of C++11

Since C++11 we are encouraged to use*<random>*library. This is how to do it in good and simple way:#include <random> // only 1 include int random10() { // good, expensive randomness for seed static std::random_device rd; // cheaper source of randomnes, still good; seeded static std::mt19937 e(rd()); // what numbers we want static std::uniform_int_distribution<int> d {1, 10}; // combine randomness and distribution return d(e); }We could replace both rd and e with single static default_random_engine, but we would gain little simplicy and lose a lot of quality.

Alternatively you could seed engine with time data, either by

*time()*or evenunsigned seed = std::chrono::system_clock::now().time_since_epoch().count();but again - this is not actually better. First, bad guys atacking your program can guess your seed and calculate all random numbers and second, if you reseed multiple times (reseed =

*srand()*), many seeds may end up with the same value if your clock has wide time unit (*time(NULL)*has precision of 1 second, so it is actually easy to trigger this with multiple threads).### Engine

First we must create source of randomness, called engine. This can be achieved either by- hardware device that provides true randomness, or by
- pseudo-random generator, that keeps in memory its state and is fully deterministic.

*std::*- truly random device (if available on computer; pseudo-random if not)**random_device**

// uses RDRND (Intel hardware RNG) or /dev/urandom std::random_device rd1a; std::random_device rd1b("/dev/urandom"); // much slower on Linux; rd2 may block // if system runs out of entropy (randomness) std::random_device rd2("/dev/random"); // how much entropy we _currently_ have. // Not always implemented cout << "entropy == " << rd2.entropy() << endl;

*std::linear_congruential_engine*(**minstd_rand**) or*std::subtract_with_carry_engine*(ranlux24_base) - light and fast pseudo random generator of reasonable quality

// good seed, lightweight generator std::minstd_rand e1 { rd2() }; // not bad, nanoseconds precis auto seed_ok = std::chrono::system_clock::now() .time_since_epoch().count(); std::minstd_rand e2 { seed_ok }; ion // not good; seconds precision auto seed_bad = time(nullptr); std::minstd_rand e3 { seed_bad }; //for deterministic simulations const auto seed_simulation = 4; std::minstd_rand e4 { seed_simulation };

*std::mersenne_twister_engine*(**mt19937**) - good quality pseudo-random generator that is not very light//mersenne_twister is heavy; //if you care enough to use it, use it properly std::mt19937 e5 { rd2() }; // also fine if you cannot risk delay // due to lack of entropy std::mt19937 e6 { rd1() };

### Distribution

After selecting engine, we need to decide what distribution fits our needs. I expect that most frequently used ones will be:

**uniform_int_distribution**- e.g.*uniform_int_distribution<int> dis(1, 10)*will generate random int of value {1, 2, ... 10}

// int by default std::uniform_int_distribution<> die {1, 6}; std::uniform_int_distribution<int> coin {0, 1}; std::uniform_int_distribution<int> studentId {MIN_ID, MAX_ID};

**uniform_real_distribution**- e.g.

*uniform_real_distribution<double> dis(0.0, 100.0)*will generate double number in range [0; 100) (0 inclusive, 100 exclusive)

std::uniform_real_distribution<float> degree {0, 360};

**normal_distribution**- for human height it could be

*normal_distribution<double> dis(175.9, 7.4)*, where arguments are of course

*mean value*and

*standard deviation*

// double by default std::normal_distribution<> d(avg, std); std::normal_distribution<float> height(175.9, 7.4); std::normal_distribution<> studentScore(50, 15);

**bernoulli_distribution**-

*bernoulli_distribution dis(0.6)*generates boolean true 60% of time, so can be used to simulate unfair coin toss

// bad coin gives bool true 90% of time std::bernoulli_distribution badCoin(0.9);

**discrete_distribution**- generates random integer in range [0; n) (n is number of possible answers) where probability of every integer is given as double. E.g. to pick random element of

*vector<Obj> vec*, where element 1 is twice as likely as any of other 3 arguments, use

*discrete_distribution dis({2, 1, 1, 1})*. Weights (probabilities of elements) don't have to sum to 100%, unless you want them to actually represent probabilities.

// int by default // random month (0-11), but probability // according to # of days std::discrete_distribution<> month {31, 28.25, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}; // sum of 2 dice minus 1; // return 4 or 6 is 5 times more likely // than return 0 or 12 std::discrete_distribution<> doubleDice {1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1};

// we can use percentage values

std::discreet_distribution<> percentageDistr

{ 60.0, 30.0, 10.0};

*float/double*) or integers (

*int, long, long long, short*). Also every distribution has default value type, either

*int*or

*double*that can be optionally changed with explicit template parameter.

## Other skills

### Shuffle

One feature of C++/STL that should be mentioned is*std::random_shuffle*and

*std::shuffle*functions. They are actually located in

*<algorithm>*and not in

*<random>*. Both functions take a random access container (as iterators) and randomly rearrange values (obviously). They differ in how they handle (optional) source of randomness:

*std::random_shuffle*takes unary functionstd::random_shuffle ( foo.begin(), foo.end(), [](int i) {return rand()%i;});

*std::shuffle*takes generator from*<random>*shuffle (foo.begin(), foo.end(), std::default_random_engine(seed));

#include <algorithm> #include <iostream> #include <list> #include <numeric> #include <random> #include <vector> using namespace std; void lottery() { // generate numbers 1..56 using std::iota vector<int> numbers(56); iota(numbers.begin(), numbers.end(), 1); // randomize their order shuffle(numbers.begin(), numbers.end(), mt19937{random_device{}()}); // pick 5 of them numbers.erase(numbers.begin()+5, numbers.end()); for (auto i : numbers) cout << i << " "; cout <<endl; }

### Deterministic results

Often when doing simulations, on one hand we need randomness for simulation to be realistic, on the other we need to be able to reproduce experiment with exactly same results. Simple solution is to hard-code seed value. This approach works and any engine (except of*std::random_device*) can take integer seed in constructor.

What changed in C++11 is that implementations of generators are standardized and not implementation dependent as was with

*rand()*. It means, that our simulation is guaranteed to behave the same on any platform we deploy to. This may seem as obvious feature, but it was not simple to achieve before - 2 most popular compilers, GCC and VisualStudio had different pseudo random functions and therefore produced different simulation results.

cout << "old rand(): "; // for me prints 83 86 77 15 93, // but may vary depending on compiler srand(0); for (int i=0; i<5; ++i) cout << rand()%100 <<" ";

cout << "new std::random: "; // always prints 54 59 71 84 60 static mt19937 e {0); static uniform_int_distribution<> d {0, 99}; for (int i=0; i<5; ++i) cout << d(e) << " ";

### Thread safety

Classes in*<random>*are NOT thread safe. However, writing thread-safe code is very easy. All we need to do, is make sure that no two threads are using the same random engine at the same time. Easiest way to achieve it, is to make the engine*thread local*.#include <random> // same include /** this function may be used by different threads safely */ int random10() { // engines are expensive and not thread safe // so keep 1 engine for every thread that wants it thread_local std::mt19937 e{std::random_device{}()}; // distributions are cheap, don't need to keep them // compiler will inline it anyway std::uniform_int_distribution<int> d {1, 10}; // combine randomness and distribution return d(e); }If you don't know

*thread_local*keyword, think of it as of*static*that is created separately for every thread that uses it, and is destroyed when thread finishes. Because every thread has its own engine, they will never interfere with one another.## Conclusion

I hope I persuaded you that new*<random>*library is great, and its*srand()/rand()*forbears are better left alone :-)
## Komentarze

## Prześlij komentarz