Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Introduction

In the previous chapter, we explored the fundamental concepts of continuous random variables, including Probability Density Functions (PDFs), Cumulative Distribution Functions (CDFs), expected value, and variance. While those concepts apply to any continuous random variable, many real-world phenomena tend to follow specific patterns. Over time, mathematicians and statisticians have identified and characterized several “families” of continuous distributions that appear frequently in practice.

This chapter introduces some of the most common and useful continuous distributions: the Uniform, Exponential, Normal (Gaussian), Gamma, and Beta distributions. For each, we will:

  1. Define its PDF and CDF.

  2. Discuss its key parameters and their interpretation.

  3. Calculate its expected value and variance.

  4. Highlight typical applications and characteristic properties.

  5. Demonstrate how to work with it using Python, primarily the scipy.stats module.

Understanding these distributions provides a powerful toolkit for modeling continuous phenomena, from arrival times and lifetimes to measurement errors and proportions. We’ll also briefly touch upon the relationships between some of these distributions.

Let’s start by importing the necessary libraries.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy import integrate # For potential numerical integration examples
# Configure plots for better readability
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 12

1. Uniform Distribution

The Uniform distribution is perhaps the simplest continuous distribution. It describes a variable whose outcome is equally likely within a specific interval [a,b][a, b].

Definition: A random variable XX follows a Uniform distribution on the interval [a,b][a, b], denoted XU(a,b)X \sim U(a, b), if its PDF is constant over the interval and zero elsewhere.

PDF (Probability Density Function):

f(x;a,b)={1bafor axb0otherwisef(x; a, b) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}

The total area under the PDF must be 1. Here, the area is simply the area of a rectangle with width (ba)(b-a) and height 1ba\frac{1}{b-a}.

CDF (Cumulative Distribution Function):

F(x;a,b)=P(Xx)={0for x<axabafor axb1for x>bF(x; a, b) = P(X \le x) = \begin{cases} 0 & \text{for } x < a \\ \frac{x-a}{b-a} & \text{for } a \le x \le b \\ 1 & \text{for } x > b \end{cases}

The CDF increases linearly from 0 to 1 over the interval [a,b][a, b].

Parameters:

Expected Value and Variance:

Applications:

Example: A train is scheduled to arrive at 3:00 PM, but it can arrive uniformly randomly anytime between 2:55 PM and 3:05 PM. Let XX be the arrival time in minutes past 2:55 PM. Then XU(0,10)X \sim U(0, 10).

Python Implementation (scipy.stats.uniform):

scipy.stats.uniform uses loc for the start point (aa) and scale for the width (bab-a).

# Define the parameters for the train example: a=0, b=10
a = 0
b = 10
loc = a
scale = b - a
# Create a uniform distribution object
uniform_dist = stats.uniform(loc=loc, scale=scale)
# Print key properties
print(f"Uniform Distribution U({a}, {b})")
print(f"Mean (Expected Value): {uniform_dist.mean():.2f} (Theoretical: {(a+b)/2:.2f})")
print(f"Variance: {uniform_dist.var():.2f} (Theoretical: {(b-a)**2/12:.2f})")
print(f"Standard Deviation: {uniform_dist.std():.2f}")
Uniform Distribution U(0, 10)
Mean (Expected Value): 5.00 (Theoretical: 5.00)
Variance: 8.33 (Theoretical: 8.33)
Standard Deviation: 2.89
# Calculate probabilities using PDF and CDF
# P(X <= 3 minutes past 2:55) = P(arrival by 2:58 PM)
prob_le_3 = uniform_dist.cdf(3)
print(f"\nP(X <= 3) = {prob_le_3:.2f}")

P(X <= 3) = 0.30
# P(X > 7 minutes past 2:55) = P(arrival after 3:02 PM)
prob_gt_7 = 1 - uniform_dist.cdf(7) # Or uniform_dist.sf(7)
print(f"P(X > 7) = {prob_gt_7:.2f}")
P(X > 7) = 0.30
# P(2 <= X <= 5) = F(5) - F(2)
prob_between_2_5 = uniform_dist.cdf(5) - uniform_dist.cdf(2)
print(f"P(2 <= X <= 5) = {prob_between_2_5:.2f}")
P(2 <= X <= 5) = 0.30
# Plotting the PDF and CDF
x = np.linspace(a - 2, b + 2, 500) # Range slightly outside [a, b]
pdf_values = uniform_dist.pdf(x)
cdf_values = uniform_dist.cdf(x)

fig, ax = plt.subplots(1, 2, figsize=(14, 5))

ax[0].plot(x, pdf_values, 'r-', lw=2, label='Uniform PDF')
ax[0].set_title(f'Uniform({a},{b}) PDF')
ax[0].set_xlabel('x (Minutes past 2:55 PM)')
ax[0].set_ylabel('Density f(x)')
ax[0].grid(True)
ax[0].legend()

ax[1].plot(x, cdf_values, 'b-', lw=2, label='Uniform CDF')
ax[1].set_title(f'Uniform({a},{b}) CDF')
ax[1].set_xlabel('x (Minutes past 2:55 PM)')
ax[1].set_ylabel('Cumulative Probability F(x)')
ax[1].grid(True)
ax[1].legend()

plt.tight_layout()
plt.show()
<Figure size 1400x500 with 2 Axes>
# Generate random samples
num_samples = 10000
samples = uniform_dist.rvs(size=num_samples)
# Plot histogram of samples vs theoretical PDF
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=20, density=True, alpha=0.6, color='g', label='Sample Histogram')
plt.plot(x, pdf_values, 'r-', lw=2, label='Theoretical PDF')
plt.title(f'Histogram of {num_samples} Uniform Samples vs PDF')
plt.xlabel('x')
plt.ylabel('Density')
plt.legend()
plt.show()
<Figure size 1000x600 with 1 Axes>

2. Exponential Distribution

The Exponential distribution describes the time until an event occurs in a Poisson process (a process where events occur continuously and independently at a constant average rate).

Definition: A random variable TT follows an Exponential distribution with rate parameter λ>0\lambda > 0, denoted TExp(λ)T \sim Exp(\lambda), if its PDF describes the waiting time for the first event.

PDF (Probability Density Function):

f(t;λ)={λeλtfor t00for t<0f(t; \lambda) = \begin{cases} \lambda e^{-\lambda t} & \text{for } t \ge 0 \\ 0 & \text{for } t < 0 \end{cases}

CDF (Cumulative Distribution Function):

F(t;λ)=P(Tt)={1eλtfor t00for t<0F(t; \lambda) = P(T \le t) = \begin{cases} 1 - e^{-\lambda t} & \text{for } t \ge 0 \\ 0 & \text{for } t < 0 \end{cases}

The probability that the event has not occurred by time tt is P(T>t)=1F(t)=eλtP(T > t) = 1 - F(t) = e^{-\lambda t}. This is often called the survival function.

Parameter:

Expected Value and Variance:

Memoryless Property: A key property of the Exponential distribution is that it is memoryless. This means that the probability of the event occurring in the next time interval, given that it hasn’t occurred yet, is independent of how long we’ve already waited. Formally:

P(T>s+tT>s)=P(T>t)P(T > s+t \mid T > s) = P(T > t)

for any s,t0s, t \ge 0.

Applications:

Example: The lifetime of a certain type of electronic component follows an Exponential distribution. On average, a component lasts 1000 hours. What is the probability that a component lasts more than 1200 hours?

Here, the average lifetime is E[T]=1000E[T] = 1000. Since E[T]=1/λE[T] = 1/\lambda, the rate parameter is λ=1/1000\lambda = 1/1000 failures per hour.

Python Implementation (scipy.stats.expon):

scipy.stats.expon uses scale which corresponds to 1/λ1/\lambda. It also has an optional loc parameter, which shifts the start of the distribution (default is 0).

# Define the parameters for the component lifetime example
mean_lifetime = 1000 # E[T] = 1/lambda
lambda_rate = 1 / mean_lifetime
scale_param = 1 / lambda_rate # scale = 1/lambda
# Create an exponential distribution object
exp_dist = stats.expon(scale=scale_param) # loc defaults to 0
# Print key properties
print(f"Exponential Distribution with rate lambda = {lambda_rate:.4f} (mean = {mean_lifetime})")
print(f"Mean (Expected Value): {exp_dist.mean():.2f} (Theoretical: {1/lambda_rate:.2f})")
print(f"Variance: {exp_dist.var():.2f} (Theoretical: {1/lambda_rate**2:.2f})")
print(f"Standard Deviation: {exp_dist.std():.2f}")
Exponential Distribution with rate lambda = 0.0010 (mean = 1000)
Mean (Expected Value): 1000.00 (Theoretical: 1000.00)
Variance: 1000000.00 (Theoretical: 1000000.00)
Standard Deviation: 1000.00
# Calculate probabilities
# P(T > 1200 hours)
prob_gt_1200 = exp_dist.sf(1200) # sf is the survival function (1 - CDF)
print(f"\nP(T > 1200) = {prob_gt_1200:.4f}")

P(T > 1200) = 0.3012
# P(T <= 800 hours)
prob_le_800 = exp_dist.cdf(800)
print(f"P(T <= 800) = {prob_le_800:.4f}")
P(T <= 800) = 0.5507
# P(500 <= T <= 1500) = F(1500) - F(500)
prob_between_500_1500 = exp_dist.cdf(1500) - exp_dist.cdf(500)
print(f"P(500 <= T <= 1500) = {prob_between_500_1500:.4f}")
P(500 <= T <= 1500) = 0.3834
# Demonstrate memoryless property: P(T > 1200 | T > 1000) == P(T > 200)?
# P(T > 1200 | T > 1000) = P(T > 1200 and T > 1000) / P(T > 1000)
#                        = P(T > 1200) / P(T > 1000)
prob_gt_1200_given_gt_1000 = exp_dist.sf(1200) / exp_dist.sf(1000)
prob_gt_200 = exp_dist.sf(200)
print(f"\nP(T > 1200 | T > 1000) = {prob_gt_1200_given_gt_1000:.4f}")
print(f"P(T > 200) = {prob_gt_200:.4f}")
print(f"Memoryless property holds: {np.isclose(prob_gt_1200_given_gt_1000, prob_gt_200)}")

P(T > 1200 | T > 1000) = 0.8187
P(T > 200) = 0.8187
Memoryless property holds: True
# Plotting the PDF and CDF
t = np.linspace(0, 5 * mean_lifetime, 500) # Plot up to 5x the mean lifetime
pdf_values = exp_dist.pdf(t)
cdf_values = exp_dist.cdf(t)
fig, ax = plt.subplots(1, 2, figsize=(14, 5))

ax[0].plot(t, pdf_values, 'r-', lw=2, label='Exponential PDF')
ax[0].set_title(f'Exponential PDF ($\lambda={lambda_rate:.4f}$)')
ax[0].set_xlabel('t (Hours)')
ax[0].set_ylabel('Density f(t)')
ax[0].grid(True)
ax[0].legend()

ax[1].plot(t, cdf_values, 'b-', lw=2, label='Exponential CDF')
ax[1].set_title(f'Exponential CDF ($\lambda={lambda_rate:.4f}$)')
ax[1].set_xlabel('t (Hours)')
ax[1].set_ylabel('Cumulative Probability F(t)')
ax[1].grid(True)
ax[1].legend()

plt.tight_layout()
plt.show()
<Figure size 1400x500 with 2 Axes>
# Generate random samples
num_samples = 10000
samples = exp_dist.rvs(size=num_samples)
# Plot histogram of samples vs theoretical PDF
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=50, density=True, alpha=0.6, color='g', label='Sample Histogram')
plt.plot(t, pdf_values, 'r-', lw=2, label='Theoretical PDF')
plt.title(f'Histogram of {num_samples} Exponential Samples vs PDF')
plt.xlabel('t (Hours)')
plt.ylabel('Density')
plt.xlim(0, 5 * mean_lifetime) # Adjust xlim for better view
plt.legend()
plt.show()
<Figure size 1000x600 with 1 Axes>

3. Normal (Gaussian) Distribution

The Normal distribution, often called the “bell curve,” is arguably the most important distribution in probability and statistics. It arises naturally in many situations due to the Central Limit Theorem (which we’ll cover later) and is used to model numerous physical, biological, and social phenomena.

Definition: A random variable XX follows a Normal distribution with mean μ\mu and variance σ2\sigma^2, denoted XN(μ,σ2)X \sim N(\mu, \sigma^2), if its PDF is given by the characteristic bell-shaped curve centered at μ\mu.

PDF (Probability Density Function):

f(x;μ,σ2)=12πσ2e(xμ)22σ2f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{ - \frac{(x-\mu)^2}{2\sigma^2} }

for <x<-\infty < x < \infty.

CDF (Cumulative Distribution Function): The CDF, F(x)=P(Xx)F(x) = P(X \le x), does not have a simple closed-form expression in terms of elementary functions. It’s typically calculated using numerical methods or tables involving the standard normal distribution. The CDF is often denoted by Φ(z)\Phi(z) for the standard normal case.

Parameters:

Expected Value and Variance:

Standard Normal Distribution: A special case is the Standard Normal Distribution, denoted ZN(0,1)Z \sim N(0, 1), where μ=0\mu = 0 and σ2=1\sigma^2 = 1. Any Normal random variable XN(μ,σ2)X \sim N(\mu, \sigma^2) can be standardized into a standard normal variable ZZ using the transformation:

Z=XμσZ = \frac{X - \mu}{\sigma}

This transformation is crucial for calculations, as probabilities for any Normal distribution can be found using the CDF of the Standard Normal distribution. P(Xx)=P(Xμσxμσ)=P(Zz)P(X \le x) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = P(Z \le z), where z=(xμ)/σz = (x-\mu)/\sigma.

Empirical Rule (68-95-99.7 Rule): For any Normal distribution:

Applications:

Example: Suppose IQ scores in a population are Normally distributed with a mean (μ\mu) of 100 and a standard deviation (σ\sigma) of 15. XN(100,152)X \sim N(100, 15^2).

  1. What is the probability a randomly selected person has an IQ score between 85 and 115?

  2. What is the probability a randomly selected person has an IQ score above 130?

  3. What IQ score marks the 95th percentile?

Python Implementation (scipy.stats.norm):

scipy.stats.norm uses loc for the mean (μ\mu) and scale for the standard deviation (σ\sigma).

# Define the parameters for the IQ score example
mu = 100
sigma = 15
# Create a normal distribution object
norm_dist = stats.norm(loc=mu, scale=sigma)
# Print key properties
print(f"Normal Distribution N(mu={mu}, sigma^2={sigma**2})")
print(f"Mean (Expected Value): {norm_dist.mean():.2f} (Theoretical: {mu:.2f})")
print(f"Variance: {norm_dist.var():.2f} (Theoretical: {sigma**2:.2f})")
print(f"Standard Deviation: {norm_dist.std():.2f} (Theoretical: {sigma:.2f})")
Normal Distribution N(mu=100, sigma^2=225)
Mean (Expected Value): 100.00 (Theoretical: 100.00)
Variance: 225.00 (Theoretical: 225.00)
Standard Deviation: 15.00 (Theoretical: 15.00)
# Calculate probabilities
# 1. P(85 <= X <= 115) = F(115) - F(85)
# Note: 85 is mu - sigma, 115 is mu + sigma. Should be approx 68%.
prob_between_85_115 = norm_dist.cdf(115) - norm_dist.cdf(85)
print(f"\nP(85 <= X <= 115) = {prob_between_85_115:.4f} (Matches empirical rule: ~68%)")

P(85 <= X <= 115) = 0.6827 (Matches empirical rule: ~68%)
# 2. P(X > 130)
# Note: 130 is mu + 2*sigma. P(X > mu+2sigma) should be approx (1-0.95)/2 = 2.5%
prob_gt_130 = norm_dist.sf(130) # sf = 1 - cdf
print(f"P(X > 130) = {prob_gt_130:.4f} (Matches empirical rule: ~2.5%)")
P(X > 130) = 0.0228 (Matches empirical rule: ~2.5%)
# Check the 99.7% rule: P(mu - 3*sigma <= X <= mu + 3*sigma)
lower_3sd = mu - 3 * sigma
upper_3sd = mu + 3 * sigma
prob_within_3sd = norm_dist.cdf(upper_3sd) - norm_dist.cdf(lower_3sd)
print(f"P({lower_3sd:.1f} <= X <= {upper_3sd:.1f}) = {prob_within_3sd:.4f} (Matches empirical rule: ~99.7%)")
P(55.0 <= X <= 145.0) = 0.9973 (Matches empirical rule: ~99.7%)
# 3. Find the IQ score at the 95th percentile (value x such that F(x) = 0.95)
# Use the Percent Point Function (PPF), which is the inverse of the CDF
iq_95th_percentile = norm_dist.ppf(0.95)
print(f"\n95th percentile IQ score: {iq_95th_percentile:.2f}")
# Verification: Calculate CDF at this value
print(f"CDF at {iq_95th_percentile:.2f}: {norm_dist.cdf(iq_95th_percentile):.2f}")

95th percentile IQ score: 124.67
CDF at 124.67: 0.95
# Plotting the PDF and CDF
x = np.linspace(mu - 4*sigma, mu + 4*sigma, 500) # Range covering most of the probability mass
pdf_values = norm_dist.pdf(x)
cdf_values = norm_dist.cdf(x)

fig, ax = plt.subplots(1, 2, figsize=(14, 5))

ax[0].plot(x, pdf_values, 'r-', lw=2, label='Normal PDF')
ax[0].set_title(f'Normal PDF ($\mu={mu}, \sigma={sigma}$)')
ax[0].set_xlabel('x (IQ Score)')
ax[0].set_ylabel('Density f(x)')
# Add lines for mean and std devs
ax[0].axvline(mu, color='k', linestyle='--', label=f'$\mu={mu}$')
ax[0].axvline(mu + sigma, color='gray', linestyle=':', label=f'$\mu+\sigma={mu+sigma}$')
ax[0].axvline(mu - sigma, color='gray', linestyle=':')
ax[0].axvline(mu + 2*sigma, color='lightgray', linestyle=':', label=f'$\mu+2\sigma={mu+2*sigma}$')
ax[0].axvline(mu - 2*sigma, color='lightgray', linestyle=':')
ax[0].grid(True)
ax[0].legend()

ax[1].plot(x, cdf_values, 'b-', lw=2, label='Normal CDF')
ax[1].set_title(f'Normal CDF ($\mu={mu}, \sigma={sigma}$)')
ax[1].set_xlabel('x (IQ Score)')
ax[1].set_ylabel('Cumulative Probability F(x)')
ax[1].grid(True)
ax[1].legend()

plt.tight_layout()
plt.show()
<Figure size 1400x500 with 2 Axes>
# Generate random samples
num_samples = 10000
samples = norm_dist.rvs(size=num_samples)
# Plot histogram of samples vs theoretical PDF
plt.figure(figsize=(10, 6))
plt.hist(samples, bins=50, density=True, alpha=0.6, color='g', label='Sample Histogram')
plt.plot(x, pdf_values, 'r-', lw=2, label='Theoretical PDF')
plt.title(f'Histogram of {num_samples} Normal Samples vs PDF')
plt.xlabel('x (IQ Score)')
plt.ylabel('Density')
plt.legend()
plt.show()
<Figure size 1000x600 with 1 Axes>

4. Gamma Distribution

The Gamma distribution is a flexible distribution defined for non-negative values. It is related to the Exponential distribution and the Chi-squared distribution. It arises naturally as the waiting time until the kk-th event occurs in a Poisson process.

Definition: A random variable XX follows a Gamma distribution with shape parameter k>0k > 0 and rate parameter λ>0\lambda > 0 (or scale parameter θ=1/λ\theta = 1/\lambda), denoted XGamma(k,λ)X \sim Gamma(k, \lambda) or XGamma(k,θ)X \sim Gamma(k, \theta), if its PDF involves the Gamma function Γ(k)\Gamma(k).

The Gamma function is defined as Γ(k)=0xk1exdx\Gamma(k) = \int_0^\infty x^{k-1} e^{-x} dx. For positive integers kk, Γ(k)=(k1)!\Gamma(k) = (k-1)!.

PDF (Probability Density Function): Using shape kk and rate λ\lambda:

f(x;k,λ)=λkxk1eλxΓ(k)for x0f(x; k, \lambda) = \frac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)} \quad \text{for } x \ge 0

Using shape kk and scale θ=1/λ\theta = 1/\lambda:

f(x;k,θ)=1Γ(k)θkxk1ex/θfor x0f(x; k, \theta) = \frac{1}{\Gamma(k)\theta^k} x^{k-1} e^{-x/\theta} \quad \text{for } x \ge 0

CDF (Cumulative Distribution Function): The CDF involves the lower incomplete gamma function and doesn’t have a simple closed form. It is usually computed numerically.

Parameters:

Expected Value and Variance:

Relationship to Exponential:

Applications:

Example: Customers arrive at a store according to a Poisson process with an average rate of λ=2\lambda = 2 customers per minute. What is the probability that we have to wait between 1 and 3 minutes for the 5th customer to arrive?

The waiting time T5T_5 for the 5th customer follows a Gamma distribution with shape k=5k=5 and rate λ=2\lambda=2. So T5Gamma(k=5,λ=2)T_5 \sim Gamma(k=5, \lambda=2).

Python Implementation (scipy.stats.gamma):

scipy.stats.gamma uses a for the shape parameter (kk) and scale for the scale parameter (θ=1/λ\theta = 1/\lambda). It also has an optional loc parameter (default 0).

# Define parameters for the customer arrival example
k_shape = 5 # Number of events (shape parameter a in scipy)
lambda_rate = 2 # Rate of arrivals per minute
theta_scale = 1 / lambda_rate # Scale parameter in scipy
# Create a Gamma distribution object
gamma_dist = stats.gamma(a=k_shape, scale=theta_scale) # loc defaults to 0
# Print key properties
print(f"Gamma Distribution Gamma(k={k_shape}, lambda={lambda_rate}) or Gamma(a={k_shape}, scale={theta_scale})")
print(f"Mean (Expected Value): {gamma_dist.mean():.2f} (Theoretical: {k_shape / lambda_rate:.2f})")
print(f"Variance: {gamma_dist.var():.2f} (Theoretical: {k_shape / lambda_rate**2:.2f})")
print(f"Standard Deviation: {gamma_dist.std():.2f}")
Gamma Distribution Gamma(k=5, lambda=2) or Gamma(a=5, scale=0.5)
Mean (Expected Value): 2.50 (Theoretical: 2.50)
Variance: 1.25 (Theoretical: 1.25)
Standard Deviation: 1.12
# Calculate probability: P(1 <= T_5 <= 3) = F(3) - F(1)
prob_between_1_3 = gamma_dist.cdf(3) - gamma_dist.cdf(1)
print(f"\nP(1 <= T_5 <= 3) = {prob_between_1_3:.4f}")

P(1 <= T_5 <= 3) = 0.6623
# Verify relationship with Exponential: Gamma(k=1, lambda=2) should match Exp(lambda=2)
gamma_k1 = stats.gamma(a=1, scale=1/lambda_rate)
exp_dist_check = stats.expon(scale=1/lambda_rate)
print(f"\nGamma(k=1) Mean: {gamma_k1.mean():.2f}, Exp Mean: {exp_dist_check.mean():.2f}")
print(f"Gamma(k=1) Var: {gamma_k1.var():.2f}, Exp Var: {exp_dist_check.var():.2f}")
test_x = 1.5
print(f"Gamma(k=1) PDF at {test_x}: {gamma_k1.pdf(test_x):.4f}, Exp PDF at {test_x}: {exp_dist_check.pdf(test_x):.4f}")

Gamma(k=1) Mean: 0.50, Exp Mean: 0.50
Gamma(k=1) Var: 0.25, Exp Var: 0.25
Gamma(k=1) PDF at 1.5: 0.0996, Exp PDF at 1.5: 0.0996
# Plotting the PDF and CDF for Gamma(5, 2)
x = np.linspace(0, gamma_dist.ppf(0.999), 500) # Plot up to 99.9th percentile
pdf_values = gamma_dist.pdf(x)
cdf_values = gamma_dist.cdf(x)

fig, ax = plt.subplots(1, 2, figsize=(14, 5))

ax[0].plot(x, pdf_values, 'r-', lw=2, label=f'Gamma PDF (k={k_shape}, $\lambda$={lambda_rate})')
ax[0].set_title('Gamma PDF')
ax[0].set_xlabel('x (Waiting Time in Minutes)')
ax[0].set_ylabel('Density f(x)')
ax[0].grid(True)
ax[0].legend()

ax[1].plot(x, cdf_values, 'b-', lw=2, label=f'Gamma CDF (k={k_shape}, $\lambda$={lambda_rate})')
ax[1].set_title('Gamma CDF')
ax[1].set_xlabel('x (Waiting Time in Minutes)')
ax[1].set_ylabel('Cumulative Probability F(x)')
ax[1].grid(True)
ax[1].legend()

plt.tight_layout()
plt.show()
<Figure size 1400x500 with 2 Axes>
# Plot different shapes of Gamma by varying k (keeping scale=1 for simplicity)
plt.figure(figsize=(10, 6))
shapes_to_plot = [1, 2, 5, 10]
x_gamma_shapes = np.linspace(0, 20, 500)
for k_val in shapes_to_plot:
    plt.plot(x_gamma_shapes, stats.gamma.pdf(x_gamma_shapes, a=k_val, scale=1),
             label=f'k={k_val} ($\lambda=1$)')
plt.title('Gamma PDF for different shape parameters k (scale=1)')
plt.xlabel('x')
plt.ylabel('Density f(x)')
plt.legend()
plt.ylim(bottom=0)
plt.show()
<Figure size 1000x600 with 1 Axes>

5. Beta Distribution

The Beta distribution is defined on the interval [0,1][0, 1], making it ideal for modeling probabilities, proportions, or percentages. Its shape is highly flexible, controlled by two positive shape parameters, α\alpha and β\beta.

Definition: A random variable XX follows a Beta distribution with shape parameters α>0\alpha > 0 and β>0\beta > 0, denoted XBeta(α,β)X \sim Beta(\alpha, \beta), if its PDF involves the Beta function B(α,β)B(\alpha, \beta).

The Beta function is defined as B(α,β)=01tα1(1t)β1dt=Γ(α)Γ(β)Γ(α+β)B(\alpha, \beta) = \int_0^1 t^{\alpha-1} (1-t)^{\beta-1} dt = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.

PDF (Probability Density Function):

f(x;α,β)=1B(α,β)xα1(1x)β1=Γ(α+β)Γ(α)Γ(β)xα1(1x)β1f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha-1} (1-x)^{\beta-1} = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1}

for 0x10 \le x \le 1.

CDF (Cumulative Distribution Function): The CDF is the regularized incomplete beta function, Ix(α,β)I_x(\alpha, \beta). It does not have a simple closed form but is computed numerically.

Parameters:

Expected Value and Variance:

Applications:

Example: Suppose we are modeling the click-through rate (CTR) of a new website banner ad. Based on prior experience with similar ads, we model the CTR (pp) using a Beta distribution with α=2\alpha = 2 and β=8\beta = 8. This suggests a CTR that is likely low (since β>α\beta > \alpha), centered around E[p]=2/(2+8)=0.2E[p] = 2/(2+8) = 0.2. What is the probability that the CTR is between 0.1 and 0.3?

Python Implementation (scipy.stats.beta):

scipy.stats.beta uses a for the first shape parameter (α\alpha) and b for the second shape parameter (β\beta). It also has optional loc and scale parameters to transform the distribution from [0,1][0, 1] to [loc,loc+scale][loc, loc+scale], but the default is the standard [0,1][0, 1] interval.

# Define parameters for the CTR example
alpha_param = 2 # Parameter a in scipy
beta_param = 8  # Parameter b in scipy
# Create a Beta distribution object
beta_dist = stats.beta(a=alpha_param, b=beta_param) # loc=0, scale=1 are defaults
# Print key properties
print(f"Beta Distribution Beta(alpha={alpha_param}, beta={beta_param})")
print(f"Mean (Expected Value): {beta_dist.mean():.2f} (Theoretical: {alpha_param / (alpha_param + beta_param):.2f})")
variance_theoretical = (alpha_param * beta_param) / ((alpha_param + beta_param)**2 * (alpha_param + beta_param + 1))
print(f"Variance: {beta_dist.var():.4f} (Theoretical: {variance_theoretical:.4f})")
print(f"Standard Deviation: {beta_dist.std():.4f}")
Beta Distribution Beta(alpha=2, beta=8)
Mean (Expected Value): 0.20 (Theoretical: 0.20)
Variance: 0.0145 (Theoretical: 0.0145)
Standard Deviation: 0.1206
# Calculate probability: P(0.1 <= p <= 0.3) = F(0.3) - F(0.1)
prob_between_01_03 = beta_dist.cdf(0.3) - beta_dist.cdf(0.1)
print(f"\nP(0.1 <= CTR <= 0.3) = {prob_between_01_03:.4f}")

P(0.1 <= CTR <= 0.3) = 0.5788
# Find the median (50th percentile) CTR
median_ctr = beta_dist.ppf(0.5)
print(f"Median (50th percentile) CTR: {median_ctr:.4f}")
Median (50th percentile) CTR: 0.1796
# Plotting the PDF and CDF for Beta(2, 8)
x = np.linspace(0, 1, 500)
pdf_values = beta_dist.pdf(x)
cdf_values = beta_dist.cdf(x)
fig, ax = plt.subplots(1, 2, figsize=(14, 5))

ax[0].plot(x, pdf_values, 'r-', lw=2, label=f'Beta PDF (alpha={alpha_param}, beta={beta_param})') # Problem: missing backslash before alpha/beta
ax[0].set_title('Beta PDF')
ax[0].set_xlabel('x (Click-Through Rate)')
ax[0].set_ylabel('Density f(x)')
ax[0].grid(True)
ax[0].legend()

ax[1].plot(x, cdf_values, 'b-', lw=2, label=f'Beta CDF (alpha={alpha_param}, beta={beta_param})') # Problem: missing backslash before alpha/beta
ax[1].set_title('Beta CDF')
ax[1].set_xlabel('x (Click-Through Rate)')
ax[1].set_ylabel('Cumulative Probability F(x)')
ax[1].grid(True)
ax[1].legend()

plt.tight_layout()
plt.show()
<Figure size 1400x500 with 2 Axes>
# Plot different shapes of Beta
plt.figure(figsize=(12, 8))
params_to_plot = [(1, 1), (2, 2), (5, 5), (2, 8), (8, 2), (0.5, 0.5), (1, 5), (5, 1)]
x_beta_shapes = np.linspace(0.001, 0.999, 500) # Avoid exact 0 and 1 for shapes < 1
<Figure size 1200x800 with 0 Axes>
for a_val, b_val in params_to_plot:
    plt.plot(x_beta_shapes, stats.beta.pdf(x_beta_shapes, a=a_val, b=b_val),
             label=f'$\\alpha={a_val}, \\beta={b_val}$')

plt.title('Beta PDF for different shape parameters $\\alpha, \\beta$')
plt.xlabel('x')
plt.ylabel('Density f(x)')
plt.legend()
plt.ylim(0, 5) # Adjust ylim as needed, Beta PDF can be > 1
plt.grid(True)
plt.show()
<Figure size 1000x600 with 1 Axes>

6. Relationships Between Distributions

It’s useful to be aware of the connections between these common distributions:

  1. Exponential and Gamma: As mentioned, the Exponential distribution is a special case of the Gamma distribution where the shape parameter k=1k=1.

    • Exp(λ)Gamma(k=1,λ)Exp(\lambda) \equiv Gamma(k=1, \lambda)

    • The sum of kk independent Exp(λ)Exp(\lambda) variables follows a Gamma(k,λ)Gamma(k, \lambda) distribution.

  2. Uniform and Beta: The standard Uniform distribution U(0,1)U(0, 1) is a special case of the Beta distribution where α=1\alpha = 1 and β=1\beta = 1.

    • U(0,1)Beta(α=1,β=1)U(0, 1) \equiv Beta(\alpha=1, \beta=1)

  3. Normal Approximation (via Central Limit Theorem): While not a direct equivalence, the Normal distribution can often approximate other distributions under certain conditions, most notably:

    • Binomial Approximation: A Binomial(n,pn, p) distribution can be approximated by N(np,np(1p))N(np, np(1-p)) when nn is large and pp is not too close to 0 or 1 (rules of thumb like np>5np > 5 and n(1p)>5n(1-p) > 5 are common).

    • Poisson Approximation: A Poisson(λ\lambda) distribution can be approximated by N(λ,λ)N(\lambda, \lambda) when λ\lambda is large (e.g., λ>10\lambda > 10 or 20).

    • Gamma Approximation: A Gamma(k,λk, \lambda) distribution becomes more symmetric and bell-shaped (approaching Normal) as the shape parameter kk increases. (We will explore the Central Limit Theorem in detail in Chapter 15).

  4. Chi-Squared Distribution: The Chi-Squared distribution with kk degrees of freedom, χ2(k)\chi^2(k), is a special case of the Gamma distribution: χ2(k)Gamma(k/2,rate=1/2)\chi^2(k) \equiv Gamma(k/2, \text{rate}=1/2) or Gamma(shape=k/2,scale=2)Gamma(\text{shape}=k/2, \text{scale}=2). It arises frequently in statistical testing.

Understanding these relationships helps build intuition and allows leveraging properties from one distribution to understand another.

7. Working with scipy.stats Continuous Distributions

The scipy.stats module provides a consistent interface for working with these (and many other) distributions. For a continuous distribution object dist created from stats.<distribution_name>():

Key Parameters:

Always check the documentation (help(stats.<distribution_name>)) for the specific parameter names and definitions used by SciPy, as they might differ slightly from textbook notations (especially rate vs. scale).

# %%
# Example: Using PPF for Normal distribution
# Find the interval [x1, x2] centered around the mean that contains 95% of the probability
mu = 100
sigma = 15
norm_dist = stats.norm(loc=mu, scale=sigma)
# The central 95% leaves 2.5% in each tail
lower_bound = norm_dist.ppf(0.025)
upper_bound = norm_dist.ppf(0.975) # Or norm_dist.isf(0.025)
print(f"The central 95% interval for N({mu}, {sigma**2}) is [{lower_bound:.2f}, {upper_bound:.2f}]")
# Check: These values should be approximately mu +/- 1.96*sigma
print(f"Theoretical interval [mu - 1.96*sigma, mu + 1.96*sigma]: [{mu - 1.96*sigma:.2f}, {mu + 1.96*sigma:.2f}]")
The central 95% interval for N(100, 225) is [70.60, 129.40]
Theoretical interval [mu - 1.96*sigma, mu + 1.96*sigma]: [70.60, 129.40]

Chapter Summary

This chapter introduced five fundamental continuous probability distributions:

We explored their PDFs, CDFs, parameters, moments (mean, variance), and typical applications. Crucially, we learned how to use the scipy.stats module to calculate probabilities, find percentiles, generate random samples, and visualize these distributions in Python. Understanding these common distributions provides a robust foundation for modeling a wide variety of continuous phenomena encountered in data analysis and probabilistic modeling.

Exercises

  1. Uniform Wait Time: A bus is scheduled to arrive every 20 minutes starting at the hour (e.g., 6:00, 6:20, 6:40...). You arrive at a random time. Assume your arrival time relative to the scheduled bus departure is uniformly distributed over the 20-minute interval. a. Define the random variable and the parameters of its distribution. b. What is the probability you wait less than 5 minutes for the next bus? c. What is the probability you wait between 10 and 15 minutes? d. What is the expected waiting time? Use scipy.stats.uniform.

  2. Exponential Service Time: The time it takes a bank teller to serve a customer follows an Exponential distribution with a mean service time of 3 minutes. a. What is the rate parameter λ\lambda? b. What is the probability a customer’s service takes more than 5 minutes? c. What is the probability a customer’s service takes less than 2 minutes? d. Given that a customer has already been served for 4 minutes, what is the probability their service will take at least 2 more minutes? (Hint: Memoryless property). Use scipy.stats.expon.

  3. Normal Heights: Assume adult male heights in a certain country are Normally distributed with a mean of 175 cm and a standard deviation of 7 cm. XN(175,72)X \sim N(175, 7^2). a. What percentage of men are taller than 185 cm? b. What percentage of men have heights between 170 cm and 180 cm? c. What height corresponds to the 90th percentile? d. Generate 5000 random samples from this distribution and plot a histogram. Overlay the theoretical PDF. Use scipy.stats.norm.

  4. Gamma Repair Time: The time (in hours) to repair a machine follows a Gamma distribution with shape parameter k=3k=3 and scale parameter θ=2\theta=2 (rate λ=0.5\lambda=0.5). a. What is the expected repair time and the standard deviation? b. What is the probability that the repair takes less than 5 hours? c. What is the probability that the repair takes more than 10 hours? Use scipy.stats.gamma.

  5. Beta Proportion: A chemical process yields a product whose proportion of impurity PP is modeled by a Beta(2,5)Beta(2, 5) distribution. a. What is the expected proportion of impurity? b. What is the probability that the impurity proportion is less than 0.1? c. What is the probability that the impurity proportion is greater than 0.5? d. Plot the PDF of this distribution. Use scipy.stats.beta.

Example Solution Structure for Exercise 1 (Students would fill this in)

# Exercise 1: Uniform Wait Time
wait_interval_max = 20
uniform_wait = stats.uniform(loc=0, scale=wait_interval_max)
# a. RV X = Waiting time (minutes), X ~ U(0, 20)
print("Ex 1: Uniform Wait Time")
print(f"a. X ~ U(0, {wait_interval_max})")
Ex 1: Uniform Wait Time
a. X ~ U(0, 20)
# b. P(X < 5)
prob_lt_5 = uniform_wait.cdf(5)
print(f"b. P(X < 5) = {prob_lt_5:.3f}")
b. P(X < 5) = 0.250
# c. P(10 <= X <= 15)
prob_10_15 = uniform_wait.cdf(15) - uniform_wait.cdf(10)
print(f"c. P(10 <= X <= 15) = {prob_10_15:.3f}")
c. P(10 <= X <= 15) = 0.250
# d. E[X]
expected_wait = uniform_wait.mean()
print(f"d. E[X] = {expected_wait:.2f} minutes")
d. E[X] = 10.00 minutes

(Students would add code for Exercises 2-5 similarly)