Effortlessly Generate Random Numbers from a Normal Distribution with np.random.normal

How to Get Normally Distributed Random Numbers With NumPy

Probability distributions describe the likelihood of all possible outcomes of an event or experiment. The normal distribution is one of the most useful probability distributions because it models many natural phenomena very well. With NumPy, you can create random number samples from the normal distribution.

This distribution is also called the Gaussian distribution or simply the bell curve. The latter hints at the shape of the distribution when you plot it.

The normal distribution is symmetrical around its peak. Because of this symmetry, the mean of the distribution, often denoted by μ, is at that peak. The standard deviation, σ, describes how spread out the distribution is.

If some samples are normally distributed, then it’s probable that a random sample has a value close to the mean. In fact, about 68 percent of all samples are within one standard deviation of the mean.

You can interpret the area under the bell curve as a measure of probability. The region that covers all values less than one standard deviation from the mean makes up about 68 percent of the full area under the curve.
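As a quick check of the 68 percent figure, the sketch below compares the exact probability with an estimate from simulated data. The seed and the sample count are arbitrary choices made for illustration, not values from this tutorial.

```python
import math

import numpy as np

# Exact probability that a normally distributed value lies within one
# standard deviation of the mean: P(|Z| < 1) = erf(1 / sqrt(2)).
exact = math.erf(1 / math.sqrt(2))

# Empirical estimate from simulated standard normal samples.
# The seed is an arbitrary choice that makes the run repeatable.
rng = np.random.default_rng(seed=42)
samples = rng.normal(size=100_000)
empirical = np.mean(np.abs(samples) < 1)

print(f"exact: {exact:.4f}, empirical: {empirical:.4f}")
```

The two numbers should agree to roughly two decimal places, and the agreement tightens as you draw more samples.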

How to Use NumPy to Generate Normally Distributed Random Numbers

NumPy includes a full subpackage, numpy.random, dedicated to working with random numbers. For historical reasons, this package includes many legacy functions, such as np.random.normal(). However, you should usually start by instantiating a default random number generator (RNG):

import numpy as np
rng = np.random.default_rng()

The RNG can generate random numbers from many different distributions, including the normal distribution. To generate random numbers from the standard normal distribution, you can call the generator's .normal() method without specifying a mean or standard deviation:

n = rng.normal(size=1000)

In this example, size=1000 specifies that you want to generate 1000 random numbers, which are returned as a NumPy array. By default, .normal() generates random numbers with a mean of 0 and a standard deviation of 1. These are the parameters of the standard normal distribution.
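The following sketch illustrates how the size argument shapes the result and how a seed makes the output repeatable. Passing a seed to default_rng() is an addition to the tutorial's code, and the seed value itself is arbitrary.

```python
import numpy as np

# A seeded generator produces the same numbers on every run.
rng = np.random.default_rng(seed=2024)

single = rng.normal()              # no size: a single scalar value
vector = rng.normal(size=1000)     # one-dimensional array of 1000 values
matrix = rng.normal(size=(3, 4))   # two-dimensional array, 3 rows by 4 columns

print(vector.shape, matrix.shape)
print(f"mean: {vector.mean():.3f}, std: {vector.std():.3f}")

# Two generators created with the same seed yield identical sequences.
first = np.random.default_rng(seed=2024).normal(size=5)
second = np.random.default_rng(seed=2024).normal(size=5)
print(np.array_equal(first, second))
```

The sample mean and standard deviation will be close to 0 and 1 but not exactly equal to them, because they're estimated from a finite sample.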

Plot Your Normally Distributed Numbers

To visualize the distribution of the random numbers you’ve generated, you can create a histogram using the matplotlib library. Here’s an example:

import matplotlib.pyplot as plt

plt.hist(n, bins=30, density=True, alpha=0.7)
plt.show()

In this code, plt.hist() creates the histogram, bins=30 specifies the number of bins to use, density=True normalizes the histogram so that the total area of the bars equals 1, and alpha=0.7 sets the transparency of the bars.

The resulting histogram will show the distribution of the random numbers. In this case, because you generated random numbers from the standard normal distribution, you should see a bell-shaped curve centered around 0.
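To see what the density normalization means numerically, you can compute the same histogram without plotting it. This sketch uses np.histogram(), which takes the same bins and density arguments. The seed is an arbitrary choice for repeatability.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = rng.normal(size=1000)

# Raw counts: how many of the 1000 values fall into each of the 30 bins.
counts, edges = np.histogram(n, bins=30)

# Density: counts rescaled so that bar height times bar width sums to 1.
density, _ = np.histogram(n, bins=30, density=True)
widths = np.diff(edges)
total_area = np.sum(density * widths)

print(counts.sum())   # every value lands in exactly one bin
print(total_area)     # approximately 1.0
```

Because the bar areas sum to 1, a density histogram can be compared directly with a probability density curve, whatever the sample count.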

Specify the Mean and Standard Deviation

If you want to generate random numbers from a normal distribution with a specific mean and standard deviation, you can pass the loc and scale parameters to .normal(). For example:

n = rng.normal(loc=5, scale=2, size=1000)

In this code, loc=5 sets the mean to 5, and scale=2 sets the standard deviation to 2. The size=1000 parameter specifies that you want to generate 1000 random numbers.
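One way to build intuition for loc and scale is to compare them with shifting and stretching standard normal numbers by hand. The sketch below assumes an arbitrary seed and a large sample count so that the estimates are stable.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Draw directly from a normal distribution with mean 5 and standard deviation 2.
direct = rng.normal(loc=5, scale=2, size=100_000)

# Equivalent in distribution: stretch standard normal values by 2, then shift by 5.
shifted = 5 + 2 * rng.normal(size=100_000)

print(f"direct:  mean={direct.mean():.3f}, std={direct.std():.3f}")
print(f"shifted: mean={shifted.mean():.3f}, std={shifted.std():.3f}")
```

Both arrays should report a mean near 5 and a standard deviation near 2, even though the individual values differ.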

Work With Random Numbers in NumPy

Once you’ve generated random numbers with NumPy, you can perform various operations on them. For example, you can calculate the mean, standard deviation, and other summary statistics:

mean = np.mean(n)
std = np.std(n)

You can also perform mathematical operations on the random numbers, just like you would with any other NumPy array. For example, you can calculate the square of each number:

squared = np.square(n)

NumPy provides a wide range of functions for working with arrays, so you can manipulate your random numbers in any way you like.
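As an illustration of the summary statistics mentioned above, the sketch below computes a few more of them. One detail worth knowing is that np.std() divides by N by default (ddof=0). Passing ddof=1 gives the sample standard deviation, which divides by N - 1. The seed and distribution parameters are illustrative.

```python
import math

import numpy as np

rng = np.random.default_rng(seed=1)
n = rng.normal(loc=5, scale=2, size=1000)

population_std = np.std(n)        # default ddof=0, divides by N
sample_std = np.std(n, ddof=1)    # divides by N - 1

# For normal data, the 16th and 84th percentiles sit roughly one
# standard deviation below and above the median.
p16, p50, p84 = np.percentile(n, [16, 50, 84])

print(f"population std: {population_std:.4f}")
print(f"sample std:     {sample_std:.4f}")
print(f"percentiles:    {p16:.2f}, {p50:.2f}, {p84:.2f}")
```

With 1000 values the two standard deviations differ only slightly, but the gap grows for small samples.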

Iterate Toward Normality With the Central Limit Theorem

The central limit theorem is a fundamental concept in statistics. It states that the average of a large number of independent and identically distributed random variables with finite variance is approximately normally distributed, regardless of the shape of the original distribution.

You can demonstrate the central limit theorem by generating random numbers from a non-normal distribution and calculating the means of many samples. For example, you can draw 1000 samples of 100 numbers each from a uniform distribution and calculate the mean of each sample:

means = []
for _ in range(1000):
    sample = rng.uniform(size=100)
    sample_mean = np.mean(sample)
    means.append(sample_mean)

plt.hist(means, bins=30, density=True, alpha=0.7)
plt.show()

In this code, you generate 1000 random samples of size 100 from a uniform distribution. You calculate the mean of each sample and store it in the means list. Finally, you plot a histogram of the means.

As you increase the size of each sample, which is 100 in this example, the distribution of the means gets closer to a normal distribution. This demonstrates the central limit theorem in action.
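The effect of the sample size can be sketched without plotting. The code below is a vectorized variant of the loop above, not a replacement for it. For each sample size, it measures the spread of the means and the fraction of means that lie within one standard deviation of their average. It assumes the known variance of the uniform distribution on [0, 1), which is 1/12, and uses an arbitrary seed and an arbitrary count of 5000 samples.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

spread = {}   # sample size -> (observed std of means, predicted std)
within = {}   # sample size -> fraction of means within one std of their average

for sample_size in (1, 2, 10, 100):
    # Each row is one sample, so this draws 5000 samples at once.
    samples = rng.uniform(size=(5000, sample_size))
    means = samples.mean(axis=1)

    # The standard deviation of a mean of n uniform values is sqrt(1 / (12 n)).
    predicted = np.sqrt(1 / 12 / sample_size)
    spread[sample_size] = (means.std(), predicted)

    # For a normal distribution this fraction is about 0.68.
    within[sample_size] = np.mean(np.abs(means - means.mean()) < means.std())

    print(
        f"n={sample_size:>3}: std={means.std():.4f} "
        f"(predicted {predicted:.4f}), within one std: {within[sample_size]:.3f}"
    )
```

With a sample size of 1, the means are just uniform numbers, and only about 58 percent of them fall within one standard deviation. As the sample size grows, that fraction moves toward the 68 percent you'd expect from a normal distribution.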

Conclusion

In this tutorial, you learned how to use NumPy to generate random numbers from the normal distribution. You saw how to visualize the distribution of the random numbers using histograms and how to specify the mean and standard deviation of the distribution.

You also learned how to work with the random numbers in NumPy, including calculating summary statistics and performing mathematical operations. Finally, you explored the central limit theorem and how it relates to the normal distribution.

With your newfound knowledge, you can now apply the power of NumPy to a wide range of statistical and scientific applications.