Generating random numbers in C++


Posted by Diego Assencio on 2015.11.19 under Programming (C/C++)

Before C++11 was released, the easiest way to generate random numbers in C++ was through the std::rand() function. However, C++11 now offers easy-to-use and vastly superior mechanisms for generating random numbers and provides developers with the ability to sample from many commonly used distributions using only machinery available in the STL. In this post, we will show the new C++11 methods for generating random integers from a uniform distribution over a closed interval; sampling from other distributions will be a trivial task after one understands the concepts illustrated below.

Let's solve the following problem: sample ten integer values from the closed interval $[1,6]$ (i.e., the set $\{1,2,3,4,5,6\}$) assuming a uniform distribution. This is equivalent to simulating ten rolls of a dice with six faces. The code below shows how this can be done:

#include <random>
#include <iostream>

int main()
{
	std::random_device device;
	std::mt19937 generator(device());
	std::uniform_int_distribution<int> distribution(1,6);

	/* generate ten random numbers in [1,6] */
	for (size_t i = 0; i < 10; ++i)
	{
		std::cout << distribution(generator) << ' ';
	}
	std::cout << std::endl;

	return 0;
}

Try compiling and running the code above. One possible output is:

1 2 4 5 4 4 2 1 4 6

Let's now discuss what the code above does. The first thing we do is to create an instance of std::random_device:

	std::random_device device;

This is a special class which produces uniformly-distributed unsigned integers with 32 bits of length. It can produce random numbers either by accessing the operational system's entropy pool via system calls or by using hardware random number generators such as Intel's RdRand when available (but not all implementations of std::random_device allow this). Developers must be warned, however, that even though 32-bit unsigned integers are generated, the entropy of this random number generator may be lower than 32 bits.

Unfortunately, it is not advisable to use std::random_device repeatedly as this may quickly deplete the entropy pool and therefore reduce the level of randomness available to the other processes running in the system. Additionally, its reliance on system calls makes it a very slow random number generator. To be precise, the following code also generates random integers in the closed interval $[1,6]$ but is not a good way to do it:

#include <random>
#include <iostream>

int main()
{
	/*
	 * this is a BAD WAY to generate random numbers; never use
	 * std::random_device to repeatedly generate random numbers!
	 */
	std::random_device device;
	std::uniform_int_distribution<int> distribution(1,6);

	/* generate ten random numbers in [1,6] */
	for (size_t i = 0; i < 10; ++i)
	{
		std::cout << distribution(device) << ' ';
	}
	std::cout << std::endl;

	return 0;
}

If we cannot use std::random_device repeatedly, then what should we do to generate multiple random numbers? The answer is simple: use std::random_device to generate a single random number which is then used as a seed for a pseudorandom number generator (PRNG) and then use the PRNG itself to generate as many pseudorandom numbers as we wish. For those who do not know what a PRNG is, think of it as an algorithm that takes an initial seed (a random number) and applies it on an internal mechanism to produce numbers which while not truly random, are still "sufficiently close to random".

The procedure we just described is exactly what we did on our original piece of code: we used std::random_device to generate a seed for a very commonly used PRNG called Mersenne Twister which is implemented by the std::mt19937 class (MT19937 stands for "Mersenne Twister" based on the Mersenne prime $2^{19937}−1$). This PRNG produces sequences of 32-bit integers with a very long period of $2^{19937}−1$, i.e., the sequence will repeat itself only after $2^{19937}−1$ numbers have been generated — an imaginably large number! This line initializes the PRNG with a seed produced by device, which is an object of type std::random_device:

std::mt19937 generator(device());

Let's now restate our initial goal: we need a uniform integer distribution which produces random values only in the closed interval $[1,6]$. The class std::uniform_int_distribution was designed for exactly this type of task. We first create an instance of this class which defines the closed interval over which we wish to sample integer values:

std::uniform_int_distribution<int> distribution(1,6);

Using its overloaded operator(), an object of type std::uniform_int_distribution can take a random number generator and use it to generate a number within its defined target interval. Indeed, the following expression in our code returns a number in $[1,6]$:

distribution(generator)

As a final note, the STL also contains an implementation of the Mersenne Twister PRNG which generates 64-bit pseudorandom integers through a class called std::mt19937_64. Although not strictly necessary, the seed for this PRNG should be a 64-bit integer as well.

What about reproducibility?

If you wish to have your program always generate the same sequence of pseudorandom numbers (e.g. for reproducing scientific experiments), all you have to do is seed your PRNG with a fixed seed. The code below shows an example in which we always seed std::mt19937 with the same value to have it always generate the same sequence of pseudorandom numbers:

#include <random>
#include <iostream>

int main()
{
	/* seed the PRNG (MT19937) using a fixed value (in our case, 0) */
	std::mt19937 generator(0);
	std::uniform_int_distribution<int> distribution(1,6);

	/* generate ten numbers in [1,6] (always the same sequence!) */
	for (size_t i = 0; i < 10; ++i)
	{
		std::cout << distribution(generator) << ' ';
	}
	std::cout << std::endl;

	return 0;
}

On my laptop, this code consistently generates the following sequence:

4 4 5 6 4 6 4 6 3 4

Comments

No comments posted yet.

Leave a reply

NOTE: A name and a comment (max. 1024 characters) must be provided; all other fields are optional. Equations will be processed if surrounded with dollar signs (as in LaTeX). You can post up to 5 comments per day.