Generate Normally Distributed Random Numbers Using np.random.normal

How to Get Normally Distributed Random Numbers With NumPy

Probability distributions describe the likelihood of all possible outcomes of an event or experiment. The normal distribution is one of the most useful probability distributions because it models many natural phenomena very well. With NumPy, you can create random number samples from the normal distribution.

This distribution is also called the Gaussian distribution or simply the bell curve. The latter hints at the shape of the distribution when you plot it: a single peak that tails off symmetrically on both sides.

The normal distribution is symmetrical around its peak. Because of this symmetry, the mean of the distribution, often denoted by μ, is at that peak. The standard deviation, σ, describes how spread out the distribution is.

If some samples are normally distributed, then it’s probable that a random sample has a value close to the mean. In fact, about 68 percent of all samples are within one standard deviation of the mean.

You can interpret the area under the curve as a measure of probability. The area that represents all samples less than one standard deviation from the mean is about 68 percent of the full area under the curve.
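The 68 percent figure can be checked directly. The following is a minimal sketch that uses only Python's standard library. It relies on the identity that the probability of a normal value falling within k standard deviations of the mean equals erf(k / sqrt(2)). The helper name fraction_within is made up for this illustration.

```python
import math


def fraction_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    # Hypothetical helper for illustration; uses P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```

The three printed fractions are roughly 0.6827, 0.9545, and 0.9973, which is why this pattern is often called the 68-95-99.7 rule.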

In this tutorial, you’ll learn how you can use Python’s NumPy library to work with the normal distribution, and in particular how to create random numbers that are normally distributed. Along the way, you’ll get to know NumPy’s random number generator (RNG) and how to ensure that you can work with randomness in a reproducible manner.

You’ll also see how to visualize probability distributions with Matplotlib and histograms, as well as the effect of manipulating the mean and standard deviation. The central limit theorem explains the importance of the normal distribution. It describes how the average of many independent repetitions of an experiment or measurement approximates the normal distribution.

As you’ll learn, you can do some powerful statistical analysis with the right Python code. To follow along, install NumPy, Matplotlib, and SciPy with pip:

$ python -m pip install numpy matplotlib scipy

In addition to grabbing NumPy, you’ve installed Matplotlib and SciPy, so you’re ready to roll.

How to Use NumPy to Generate Normally Distributed Random Numbers

NumPy includes a full subpackage, numpy.random, dedicated to working with random numbers. For historical reasons, this package includes many functions. However, you should usually start by instantiating a default random number generator (RNG):

import numpy as np
rng = np.random.default_rng()

The RNG can generate random numbers from many different probability distributions, including the normal distribution. To generate random numbers from the normal distribution, you can use the .normal() method:

mean = 0
std_dev = 1
sample_size = 1000

random_values = rng.normal(mean, std_dev, sample_size)

In the example above, mean and std_dev refer to the mean and standard deviation of the normal distribution, and sample_size specifies the number of random values to generate. The normal method will generate sample_size random values that follow a normal distribution with the specified mean and standard deviation.
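Because default_rng() picks a fresh seed each time, the values above change from run to run. One way to make the results reproducible is to pass a seed when you create the generator. The sketch below assumes an arbitrary seed value of 2310; any integer would do.

```python
import numpy as np

# The seed value 2310 is an arbitrary choice for this illustration.
rng_a = np.random.default_rng(seed=2310)
rng_b = np.random.default_rng(seed=2310)

values_a = rng_a.normal(0, 1, 5)
values_b = rng_b.normal(0, 1, 5)

# Two generators created with the same seed produce the same sequence.
print(np.array_equal(values_a, values_b))
```

Seeding is mostly useful for tests, tutorials, and debugging, where you want to see the same numbers again. For ordinary simulations you can usually leave the seed out.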

Plot Your Normally Distributed Numbers

After generating the random values, you can use Matplotlib to visualize the distribution by creating a histogram.

import matplotlib.pyplot as plt

plt.hist(random_values, bins=30)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Normal Distribution')
plt.show()

The above code will create a histogram with 30 bins. The number of bins controls the granularity of the histogram. You can adjust this number to suit your needs. The resulting plot will show the distribution of the generated random values.
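To see what the bins argument does without drawing anything, you can compute the same histogram with np.histogram(). This is a small sketch with an arbitrary seed and a fresh sample of 1000 values. plt.hist() accepts the same bins and density arguments.

```python
import numpy as np

rng = np.random.default_rng(seed=2310)  # arbitrary seed, only for repeatability
random_values = rng.normal(0, 1, 1000)

# np.histogram returns the bar heights and the bin edges without plotting.
counts, edges = np.histogram(random_values, bins=30)
print(len(counts), len(edges))  # 30 bins are delimited by 31 edges
print(counts.sum())  # every value lands in exactly one bin

# With density=True the bar areas sum to 1, like a probability density.
density, _ = np.histogram(random_values, bins=30, density=True)
print(np.sum(density * np.diff(edges)))
```

Passing density=True is handy when you want to compare a histogram against a theoretical curve, because both are then on the same vertical scale.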

Specify the Mean and Standard Deviation

By manipulating the mean and standard deviation, you can generate random values that follow a specific normal distribution.

mean = 5
std_dev = 2

random_values = rng.normal(mean, std_dev, sample_size)

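In the above example, the random values are drawn from a normal distribution with a mean of 5 and a standard deviation of 2. The mean and standard deviation of the sample itself will be close to these values, but they won’t match them exactly.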
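You can check how closely a sample follows the requested parameters. The sketch below assumes an arbitrary seed and a larger sample of 100,000 values so that the estimates are stable. It also uses the keyword names loc, scale, and size that .normal() gives to its three parameters.

```python
import numpy as np

rng = np.random.default_rng(seed=2310)  # arbitrary seed, only for repeatability

# loc is the mean and scale is the standard deviation.
random_values = rng.normal(loc=5, scale=2, size=100_000)

print(random_values.mean())  # close to 5, but not exactly 5
print(random_values.std())  # close to 2, but not exactly 2

# Shifting and scaling standard normal values gives the same kind of sample.
shifted = 5 + 2 * rng.standard_normal(100_000)
print(shifted.mean(), shifted.std())
```

The second half shows why the two parameters are called location and scale: the mean moves the bell curve along the axis, and the standard deviation stretches it.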

Work With Random Numbers in NumPy

NumPy provides many functions for working with random numbers. In addition to generating random values from specific probability distributions, you can also perform various statistical operations on the generated random values.

For example, you can compute the mean, standard deviation, and other statistical measures of the generated random values:

sample_mean = np.mean(random_values)  # Average of the sample
sample_std = np.std(random_values)  # Standard deviation (ddof=0 by default)
max_value = np.max(random_values)
min_value = np.min(random_values)

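NumPy also provides functions to calculate percentiles, medians, variances, and various other statistical measures. Higher-order measures such as skewness and kurtosis are available in SciPy’s scipy.stats module. These functions can be helpful when performing data analysis or conducting statistical experiments.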
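As a short illustration of these measures, the sketch below computes the median, two percentiles, and the variance of a standard normal sample. The seed and the sample size of 100,000 are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=2310)  # arbitrary seed, only for repeatability
random_values = rng.normal(0, 1, 100_000)

median = np.median(random_values)
# The 2.5th and 97.5th percentiles enclose the central 95 percent of the sample.
low, high = np.percentile(random_values, [2.5, 97.5])
variance = np.var(random_values)

print(median, low, high, variance)
```

For a standard normal distribution, the median should be near 0, the two percentiles near -1.96 and 1.96, and the variance near 1.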

Iterate Toward Normality With the Central Limit Theorem

The central limit theorem is a fundamental concept in statistics. It states that the average of a large number of independent and identically distributed random variables with finite variance will be approximately normally distributed, regardless of the shape of the original distribution.

To illustrate this concept, you can generate random numbers from a non-normal distribution, compute the averages of those random numbers, and then visualize the distribution of the averages.

sample_size = 500

random_values = rng.uniform(0, 1, (1000, sample_size))
averages = np.mean(random_values, axis=1)

plt.hist(averages, bins=30)
plt.xlabel('Average')
plt.ylabel('Frequency')
plt.title('Distribution of Averages')
plt.show()

In the above example, random numbers are generated from a uniform distribution between 0 and 1. The shape of the uniform distribution is not normal. However, by calculating the averages of the generated random numbers, the resulting distribution of the averages approximates a normal distribution.
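Beyond looking at the histogram, you can compare the averages with what the central limit theorem predicts. A uniform distribution on the interval from 0 to 1 has a mean of 1/2 and a variance of 1/12, so the averages of 500 values should have a mean of about 0.5 and a standard deviation of about 0.0129. The sketch below repeats the experiment with an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(seed=2310)  # arbitrary seed, only for repeatability
sample_size = 500

random_values = rng.uniform(0, 1, (1000, sample_size))
averages = np.mean(random_values, axis=1)

# A uniform distribution on [0, 1] has mean 1/2 and variance 1/12.
expected_mean = 0.5
expected_std = np.sqrt(1 / 12) / np.sqrt(sample_size)

print(averages.mean(), expected_mean)
print(averages.std(), expected_std)
```

The standard deviation of the averages shrinks with the square root of sample_size, so averaging four times as many values halves the spread.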

Conclusion

NumPy provides powerful tools for working with random numbers and probability distributions. In this tutorial, you’ve learned how to generate normally distributed random numbers using NumPy’s random number generator. You’ve also seen how to visualize the generated random values and manipulate the mean and standard deviation to generate specific distributions.

The central limit theorem showcases the importance of the normal distribution in statistics and demonstrates how averages of random variables tend to converge to a normal distribution. By understanding these concepts and utilizing NumPy’s capabilities, you can perform advanced statistical analysis and experiment with random values efficiently and effectively.