Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Welcome to Chapter 20! Throughout this book, we’ve used NumPy and SciPy for numerical computation – calculating probabilities as decimal approximations, simulating random processes, and plotting distributions. This numerical approach is powerful and practical for most real-world applications.

However, there’s another way to work with probability: symbolic computation. Instead of getting 0.16666... when calculating the probability of rolling a 1 on a fair die, we can get the exact answer: 1/6. Instead of numerically evaluating integrals or derivatives, we can manipulate formulas algebraically.

This chapter introduces SymPy (Symbolic Python), a Python library for symbolic mathematics. SymPy allows us to:

Learning Objectives

Why Symbolic Computation?

Numerical vs Symbolic Approaches

Throughout this book, we’ve primarily used the numerical approach:

import numpy as np
from scipy import stats

# Numerical: Probability of getting exactly 3 heads in 5 coin flips
prob_numerical = stats.binom.pmf(k=3, n=5, p=0.5)
print(f"Numerical result: {prob_numerical}")
print(f"As decimal: {prob_numerical:.10f}")
Numerical result: 0.31249999999999983
As decimal: 0.3125000000

The numerical approach gives us a floating-point approximation. For most practical purposes, this is perfect! But what if we want the exact answer?

import sympy as sp
from sympy.stats import Binomial, P, E, variance

# Symbolic: Same problem with exact arithmetic
n, k = 5, 3
p_sym = sp.Rational(1, 2)  # Exact 1/2, not 0.5

# Calculate using combinatorics
prob_symbolic = sp.binomial(n, k) * p_sym**k * (1-p_sym)**(n-k)
print(f"Symbolic result: {prob_symbolic}")
print(f"Simplified: {sp.simplify(prob_symbolic)}")
print(f"As decimal: {float(prob_symbolic)}")
Symbolic result: 5/16
Simplified: 5/16
As decimal: 0.3125

When to Use Each Approach

Use Numerical (NumPy/SciPy) when:

Use Symbolic (SymPy) when:

Best practice: Start with symbolic to understand the mathematics, then use numerical for computation.

SymPy Basics for Probability

Exact Numbers: Rational vs Float

import sympy as sp

# The difference between symbolic and numerical
print("Floating point (numerical):")
print(f"  1/3 = {1/3}")
print(f"  1/3 + 1/3 + 1/3 = {1/3 + 1/3 + 1/3}")

print("\nRational (symbolic):")
one_third = sp.Rational(1, 3)
print(f"  1/3 = {one_third}")
print(f"  1/3 + 1/3 + 1/3 = {one_third + one_third + one_third}")
Floating point (numerical):
  1/3 = 0.3333333333333333
  1/3 + 1/3 + 1/3 = 1.0

Rational (symbolic):
  1/3 = 1/3
  1/3 + 1/3 + 1/3 = 1

This exact arithmetic is crucial for probability calculations!

Symbolic Combinatorics

SymPy provides symbolic versions of the combinatorial functions we’ve been using:

# Comparison with math and scipy.special
import math
from scipy.special import comb, perm

n, k = 10, 3

print("=== Factorials ===")
print(f"math.factorial(10) = {math.factorial(n)}")
print(f"sp.factorial(10) = {sp.factorial(n)}")

# Symbolic with variables
n_sym, k_sym = sp.symbols('n k', integer=True, positive=True)
print(f"\nSymbolic: n! = {sp.factorial(n_sym)}")

print("\n=== Permutations ===")
print(f"scipy perm(10, 3) = {perm(n, k, exact=True)}")
print(f"SymPy P(10, 3) = {sp.factorial(n) / sp.factorial(n-k)}")

print("\n=== Combinations ===")
print(f"scipy comb(10, 3) = {comb(n, k, exact=True)}")
print(f"SymPy C(10, 3) = {sp.binomial(n, k)}")

# With symbolic parameters
print(f"\nSymbolic: C(n, k) = {sp.binomial(n_sym, k_sym)}")
=== Factorials ===
math.factorial(10) = 3628800
sp.factorial(10) = 3628800

Symbolic: n! = factorial(n)

=== Permutations ===
scipy perm(10, 3) = 720
SymPy P(10, 3) = 720

=== Combinations ===
scipy comb(10, 3) = 120
SymPy C(10, 3) = 120

Symbolic: C(n, k) = binomial(n, k)

Simplifying Expressions

One of SymPy’s superpowers is simplifying complex expressions:

n, k = sp.symbols('n k', integer=True, positive=True)

# Probability of exactly k successes in n Bernoulli trials
p = sp.Symbol('p', real=True, positive=True)
q = 1 - p

# Binomial probability formula
binomial_pmf = sp.binomial(n, k) * p**k * q**(n-k)
print("Binomial PMF:")
print(f"  Original: {binomial_pmf}")
print(f"  Expanded: {sp.expand(binomial_pmf)}")

# Verify that probabilities sum to 1 (for small n)
n_val = 3
total_prob = sum(binomial_pmf.subs(n, n_val).subs(k, i) for i in range(n_val + 1))
print(f"\nSum of probabilities (n=3): {sp.simplify(total_prob)}")
Binomial PMF:
  Original: p**k*(1 - p)**(-k + n)*binomial(n, k)
  Expanded: p**k*(1 - p)**(-k + n)*binomial(n, k)

Sum of probabilities (n=3): 1

Symbolic Random Variables with sympy.stats

The sympy.stats module lets us work with random variables symbolically, similar to how we used scipy.stats numerically.

Discrete Distributions

from sympy.stats import Die, Coin, Binomial, Poisson, Geometric
from sympy.stats import P, E, variance, density, sample

# Fair six-sided die
X = Die('X', 6)

print("=== Die Probabilities ===")
print(f"P(X = 3) = {P(X == 3)}")
print(f"P(X > 4) = {P(X > 4)}")
print(f"P(X is even) = {P(X % 2 == 0)}")  # Note: May not simplify automatically

print(f"\nE(X) = {E(X)}")
print(f"Var(X) = {variance(X)}")
=== Die Probabilities ===
P(X = 3) = 0
P(X > 4) = 1/3
P(X is even) = 0

E(X) = 7/2
Var(X) = 35/12
# Binomial distribution
n_trials = sp.Symbol('n', integer=True, positive=True)
p_success = sp.Symbol('p', real=True, positive=True)

# Create symbolic binomial random variable
X_binom = Binomial('X', 10, sp.Rational(1, 2))

print("=== Binomial(10, 1/2) ===")
print(f"P(X = 5) = {P(X_binom == 5)}")
print(f"P(X <= 3) = {P(X_binom <= 3)}")
print(f"E(X) = {E(X_binom)}")
print(f"Var(X) = {variance(X_binom)}")

# With symbolic parameters - expectations and variances
X_sym = Binomial('X', n_trials, p_success)
print(f"\n=== Binomial(n, p) - Symbolic ===")
print(f"E(X) = {E(X_sym)}")
print(f"Var(X) = {variance(X_sym)}")
=== Binomial(10, 1/2) ===
P(X = 5) = 0
P(X <= 3) = 11/64
E(X) = 5
Var(X) = 5/2

=== Binomial(n, p) - Symbolic ===
E(X) = Sum(Piecewise((_k*p**_k*(1 - p)**(-_k + n)*binomial(n, _k), _k <= n), (0, True)), (_k, 0, n))
Var(X) = Sum(Piecewise((p**_k*(1 - p)**(-_k + n)*(_k - Sum(Piecewise((_k*p**_k*(1 - p)**(-_k + n)*binomial(n, _k), _k <= n), (0, True)), (_k, 0, n)))**2*binomial(n, _k), _k <= n), (0, True)), (_k, 0, n))
# Geometric distribution
p_geo = sp.Rational(1, 6)  # Probability of rolling a 6
X_geo = Geometric('X', p_geo)

print("=== Geometric(1/6) - First success ===")
print(f"P(X = 1) = {P(X_geo == 1)}")  # Success on first trial
print(f"P(X = 6) = {P(X_geo == 6)}")  # Success on sixth trial
print(f"E(X) = {E(X_geo)}")
print(f"Var(X) = {variance(X_geo)}")
=== Geometric(1/6) - First success ===
P(X = 1) = 0
P(X = 6) = 0
E(X) = 6
Var(X) = 30

Continuous Distributions

from sympy.stats import Normal, Exponential, Uniform, ContinuousRV
from sympy import exp, pi, sqrt, oo

# Normal distribution with symbolic parameters
mu, sigma = sp.symbols('mu sigma', real=True)
sigma_pos = sp.Symbol('sigma', positive=True, real=True)

X_norm = Normal('X', mu, sigma_pos)

print("=== Normal(μ, σ) - Symbolic ===")
print(f"E(X) = {E(X_norm)}")
print(f"Var(X) = {variance(X_norm)}")

# With specific values
X_standard = Normal('Z', 0, 1)
print(f"\n=== Standard Normal(0, 1) ===")
print(f"E(Z) = {E(X_standard)}")
print(f"Var(Z) = {variance(X_standard)}")

# Probability calculations (may be slow for symbolic results)
X_concrete = Normal('X', 100, 15)
print(f"\n=== Normal(100, 15) ===")
print(f"P(X > 115) = {P(X_concrete > 115)}")  # Returns in terms of erf
=== Normal(μ, σ) - Symbolic ===
E(X) = mu
Var(X) = sigma**2

=== Standard Normal(0, 1) ===
E(Z) = 0
Var(Z) = 1

=== Normal(100, 15) ===
P(X > 115) = sqrt(2)*(-sqrt(2)*sqrt(pi)*erf(sqrt(2)/2) + sqrt(2)*sqrt(pi))/(4*sqrt(pi))
# Exponential distribution
lambda_sym = sp.Symbol('lambda', positive=True)
rate = sp.Rational(1, 2)

X_exp = Exponential('X', rate)
print("=== Exponential(λ=1/2) ===")
print(f"E(X) = {E(X_exp)}")
print(f"Var(X) = {variance(X_exp)}")

# Symbolic rate parameter
X_exp_sym = Exponential('X', lambda_sym)
print(f"\n=== Exponential(λ) - Symbolic ===")
print(f"E(X) = {E(X_exp_sym)}")
print(f"Var(X) = {variance(X_exp_sym)}")
=== Exponential(λ=1/2) ===
E(X) = 2
Var(X) = 4

=== Exponential(λ) - Symbolic ===
E(X) = 1/lambda
Var(X) = lambda**(-2)

Deriving Probability Formulas

One of the most powerful uses of SymPy is deriving and verifying probability formulas.

Moment Generating Functions

# Binomial MGF
n, p, t = sp.symbols('n p t', real=True)
n = sp.Symbol('n', integer=True, positive=True)
p = sp.Symbol('p', real=True, positive=True)

# MGF of Binomial: M(t) = (pe^t + q)^n where q = 1-p
q = 1 - p
mgf_binomial = (p * sp.exp(t) + q)**n

print("=== Binomial MGF ===")
print(f"M(t) = {mgf_binomial}")

# Expected value is the first derivative at t=0
mgf_derivative = sp.diff(mgf_binomial, t)
expected_value = mgf_derivative.subs(t, 0)
print(f"\nM'(t) = {mgf_derivative}")
print(f"E(X) = M'(0) = {sp.simplify(expected_value)}")

# Variance from second moment
mgf_second_derivative = sp.diff(mgf_binomial, t, 2)
second_moment = mgf_second_derivative.subs(t, 0)
variance_binomial = sp.simplify(second_moment - expected_value**2)
print(f"\nM''(0) = {sp.simplify(second_moment)}")
print(f"Var(X) = E(X²) - E(X)² = {variance_binomial}")
=== Binomial MGF ===
M(t) = (p*exp(t) - p + 1)**n

M'(t) = n*p*(p*exp(t) - p + 1)**n*exp(t)/(p*exp(t) - p + 1)
E(X) = M'(0) = n*p

M''(0) = n*p*(n*p - p + 1)
Var(X) = E(X²) - E(X)² = n*p*(1 - p)

Bayes’ Theorem Symbolically

# Symbolic Bayes' Theorem
# P(A|B) = P(B|A) * P(A) / P(B)

P_A, P_B, P_B_given_A = sp.symbols('P(A) P(B) P(B|A)', real=True, positive=True)

# Prior, likelihood, and evidence
prior = P_A
likelihood = P_B_given_A
evidence = P_B

# Posterior
posterior = (likelihood * prior) / evidence

print("=== Bayes' Theorem ===")
print(f"P(A|B) = {posterior}")

# Concrete example: Medical test
# Disease prevalence = 1%
# Test sensitivity (true positive rate) = 95%
# Test specificity (true negative rate) = 90%
# What's P(Disease|Positive) ?

P_disease = sp.Rational(1, 100)
P_no_disease = 1 - P_disease
P_pos_given_disease = sp.Rational(95, 100)
P_pos_given_no_disease = sp.Rational(10, 100)  # False positive = 1 - specificity

# Total probability of testing positive
P_positive = P_pos_given_disease * P_disease + P_pos_given_no_disease * P_no_disease

# Posterior: P(Disease|Positive)
P_disease_given_pos = (P_pos_given_disease * P_disease) / P_positive

print(f"\n=== Medical Test Example ===")
print(f"P(Disease) = {P_disease}")
print(f"P(Positive|Disease) = {P_pos_given_disease}")
print(f"P(Positive|No Disease) = {P_pos_given_no_disease}")
print(f"P(Positive) = {P_positive} = {float(P_positive):.4f}")
print(f"P(Disease|Positive) = {P_disease_given_pos} = {float(P_disease_given_pos):.4f}")
=== Bayes' Theorem ===
P(A|B) = P(A)*P(B|A)/P(B)

=== Medical Test Example ===
P(Disease) = 1/100
P(Positive|Disease) = 19/20
P(Positive|No Disease) = 1/10
P(Positive) = 217/2000 = 0.1085
P(Disease|Positive) = 19/217 = 0.0876

Converting Between Symbolic and Numerical

# Symbolic computation
from sympy.stats import Binomial, P, E
import numpy as np
from scipy import stats

n, p_val = 20, sp.Rational(3, 10)
X_sym = Binomial('X', n, p_val)

# Symbolic probability
prob_sym = P(X_sym == 5)
print(f"Symbolic: P(X=5) = {prob_sym}")
print(f"Exact fraction: {prob_sym}")

# Convert to float
prob_float = float(prob_sym)
print(f"As float: {prob_float}")

# Compare with scipy
prob_scipy = stats.binom.pmf(5, n, float(p_val))
print(f"SciPy numerical: {prob_scipy}")
print(f"Difference: {abs(prob_float - prob_scipy)}")

# Expected value comparison
exp_sym = E(X_sym)
exp_scipy = stats.binom.mean(n, float(p_val))
print(f"\nSymbolic E(X) = {exp_sym} = {float(exp_sym)}")
print(f"SciPy E(X) = {exp_scipy}")
Symbolic: P(X=5) = 0
Exact fraction: 0
As float: 0.0
SciPy numerical: 0.17886305056987964
Difference: 0.17886305056987964

Symbolic E(X) = 6 = 6.0
SciPy E(X) = 6.0

Practical Examples

Example 1: Exact Poker Probabilities

# Calculate exact probability of a full house in 5-card poker

# Total 5-card hands
total_hands = sp.binomial(52, 5)

# Full house: 3 of one rank, 2 of another
# Choose rank for triple: C(13, 1)
# Choose 3 suits from 4: C(4, 3)
# Choose rank for pair: C(12, 1)  # Can't be same as triple
# Choose 2 suits from 4: C(4, 2)

ways_full_house = sp.binomial(13, 1) * sp.binomial(4, 3) * sp.binomial(12, 1) * sp.binomial(4, 2)
prob_full_house = sp.Rational(ways_full_house, total_hands)

print(f"Total 5-card hands: {total_hands}")
print(f"Ways to get full house: {ways_full_house}")
print(f"Probability (exact): {prob_full_house}")
print(f"Probability (decimal): {float(prob_full_house):.6f}")
print(f"Odds: 1 in {float(1/prob_full_house):.2f}")
Total 5-card hands: 2598960
Ways to get full house: 3744
Probability (exact): 6/4165
Probability (decimal): 0.001441
Odds: 1 in 694.17

Example 2: Birthday Problem - Exact Solution

# What's the probability that in a group of n people, at least 2 share a birthday?

def birthday_probability_exact(n):
    """Calculate exact probability using SymPy"""
    # P(at least 2 share) = 1 - P(all different)
    # P(all different) = 365/365 * 364/365 * 363/365 * ... * (365-n+1)/365

    prob_all_different = sp.Rational(1, 1)
    for i in range(n):
        prob_all_different *= sp.Rational(365 - i, 365)

    prob_at_least_two_share = 1 - prob_all_different
    return prob_at_least_two_share

# Test for different group sizes
for n in [10, 20, 23, 30, 50]:
    prob = birthday_probability_exact(n)
    print(f"n={n:2d}: P(at least 2 share) = {float(prob):.6f}")

# The famous n=23 case
prob_23 = birthday_probability_exact(23)
print(f"\nExact probability for n=23: {float(prob_23):.10f}")
print(f"Greater than 50%? {float(prob_23) > 0.5}")
n=10: P(at least 2 share) = 0.116948

n=20: P(at least 2 share) = 0.411438
n=23: P(at least 2 share) = 0.507297
n=30: P(at least 2 share) = 0.706316
n=50: P(at least 2 share) = 0.970374

Exact probability for n=23: 0.5072972343
Greater than 50%? True

Example 3: Deriving Variance Formula

# Prove that Var(X) = E(X²) - E(X)² symbolically

from sympy.stats import DiscreteRV, E, variance

# Create a simple discrete RV
x = sp.Symbol('x')
X = DiscreteRV(x, {1: sp.Rational(1,6),
                    2: sp.Rational(1,6),
                    3: sp.Rational(1,6),
                    4: sp.Rational(1,6),
                    5: sp.Rational(1,6),
                    6: sp.Rational(1,6)})

# Method 1: Using variance function
var_direct = variance(X)

# Method 2: E(X²) - E(X)²
E_X = E(X)
E_X_squared = E(X**2)
var_formula = E_X_squared - E_X**2

print(f"E(X) = {E_X}")
print(f"E(X²) = {E_X_squared}")
print(f"\nVar(X) using variance(): {var_direct}")
print(f"Var(X) using E(X²) - E(X)²: {sp.simplify(var_formula)}")
print(f"Are they equal? {sp.simplify(var_direct - var_formula) == 0}")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[16], line 15
      7 X = DiscreteRV(x, {1: sp.Rational(1,6),
      8                     2: sp.Rational(1,6),
      9                     3: sp.Rational(1,6),
     10                     4: sp.Rational(1,6),
     11                     5: sp.Rational(1,6),
     12                     6: sp.Rational(1,6)})
     14 # Method 1: Using variance function
---> 15 var_direct = variance(X)
     17 # Method 2: E(X²) - E(X)²
     18 E_X = E(X)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/rv_interface.py:80, in variance(X, condition, **kwargs)
     77     from sympy.stats.symbolic_probability import Variance
     78     return Variance(X, condition)
---> 80 return cmoment(X, 2, condition, **kwargs)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/rv_interface.py:242, in cmoment(X, n, condition, evaluate, **kwargs)
    240 from sympy.stats.symbolic_probability import CentralMoment
    241 if evaluate:
--> 242     return CentralMoment(X, n, condition).doit()
    243 return CentralMoment(X, n, condition).rewrite(Integral)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/symbolic_probability.py:688, in CentralMoment.doit(self, **hints)
    687 def doit(self, **hints):
--> 688     return self.rewrite(Expectation).doit(**hints)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/symbolic_probability.py:260, in Expectation.doit(self, **hints)
    257 evaluate = hints.get('evaluate', True)
    259 if deep:
--> 260     expr = expr.doit(**hints)
    262 if not is_random(expr) or isinstance(expr, Expectation):  # expr isn't random?
    263     return expr

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/basic.py:1947, in Basic.doit(self, **hints)
   1928 """Evaluate objects that are not evaluated by default like limits,
   1929 integrals, sums and products. All objects of this kind will be
   1930 evaluated recursively, unless some species were excluded via 'hints'
   (...)   1944 
   1945 """
   1946 if hints.get('deep', True):
-> 1947     terms = [term.doit(**hints) if isinstance(term, Basic) else term
   1948                                  for term in self.args]
   1949     return self.func(*terms)
   1950 else:

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/basic.py:1947, in <listcomp>(.0)
   1928 """Evaluate objects that are not evaluated by default like limits,
   1929 integrals, sums and products. All objects of this kind will be
   1930 evaluated recursively, unless some species were excluded via 'hints'
   (...)   1944 
   1945 """
   1946 if hints.get('deep', True):
-> 1947     terms = [term.doit(**hints) if isinstance(term, Basic) else term
   1948                                  for term in self.args]
   1949     return self.func(*terms)
   1950 else:

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in AssocOp.doit(self, **hints)
    476 def doit(self, **hints):
    477     if hints.get('deep', True):
--> 478         terms = [term.doit(**hints) for term in self.args]
    479     else:
    480         terms = self.args

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in <listcomp>(.0)
    476 def doit(self, **hints):
    477     if hints.get('deep', True):
--> 478         terms = [term.doit(**hints) for term in self.args]
    479     else:
    480         terms = self.args

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in AssocOp.doit(self, **hints)
    476 def doit(self, **hints):
    477     if hints.get('deep', True):
--> 478         terms = [term.doit(**hints) for term in self.args]
    479     else:
    480         terms = self.args

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in <listcomp>(.0)
    476 def doit(self, **hints):
    477     if hints.get('deep', True):
--> 478         terms = [term.doit(**hints) for term in self.args]
    479     else:
    480         terms = self.args

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/symbolic_probability.py:288, in Expectation.doit(self, **hints)
    286     return self.func(expr)
    287 # Otherwise case is simple, pass work off to the ProbabilitySpace
--> 288 result = pspace(expr).compute_expectation(expr, evaluate=evaluate)
    289 if hasattr(result, 'doit') and evaluate:
    290     return result.doit(**hints)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:313, in SingleDiscretePSpace.compute_expectation(self, expr, rvs, evaluate, **kwargs)
    311 x = self.value.symbol
    312 try:
--> 313     return self.distribution.expectation(expr, x, evaluate=evaluate,
    314             **kwargs)
    315 except NotImplementedError:
    316     return Sum(expr * self.pdf, (x, self.set.inf, self.set.sup),
    317             **kwargs)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:160, in SingleDiscreteDistribution.expectation(self, expr, var, evaluate, **kwargs)
    156 p = poly(expr, var)
    158 t = Dummy('t', real=True)
--> 160 mgf = self.moment_generating_function(t)
    161 deg = p.degree()
    162 taylor = poly(series(mgf, t, 0, deg + 1).removeO(), t)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:123, in SingleDiscreteDistribution.moment_generating_function(self, t, **kwargs)
    121     if mgf is not None:
    122         return mgf
--> 123 return self.compute_moment_generating_function(**kwargs)(t)

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/cache.py:72, in __cacheit.<locals>.func_wrapper.<locals>.wrapper(*args, **kwargs)
     69 @wraps(func)
     70 def wrapper(*args, **kwargs):
     71     try:
---> 72         retval = cfunc(*args, **kwargs)
     73     except TypeError as e:
     74         if not e.args or not e.args[0].startswith('unhashable type:'):

File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:112, in SingleDiscreteDistribution.compute_moment_generating_function(self, **kwargs)
    110 x = Dummy('x', integer=True)
    111 pdf = self.pdf(x)
--> 112 mgf = summation(exp(t*x)*pdf, (x, self.set.inf, self.set.sup))
    113 return Lambda(t, mgf)

TypeError: unsupported operand type(s) for *: 'exp' and 'Dict'

Working with Joint Distributions

from sympy.stats import Die, P, E

# Two independent dice
X = Die('X', 6)
Y = Die('Y', 6)

# Sum of two dice
S = X + Y

print("=== Sum of Two Dice ===")
print(f"P(S = 7) = {P(S == 7)}")
print(f"P(S = 12) = {P(S == 12)}")
print(f"P(S > 10) = {P(S > 10)}")

print(f"\nE(X) = {E(X)}")
print(f"E(Y) = {E(Y)}")
print(f"E(X + Y) = {E(S)}")
print(f"Verify: E(X) + E(Y) = {E(X) + E(Y)}")

# Conditional probability
print(f"\n=== Conditional Probabilities ===")
# P(X=6 | X+Y=10)
# Using sympy.stats conditional might be complex, so we'll use the definition
# P(X=6, X+Y=10) / P(X+Y=10)

Limitations and Performance Considerations

While SymPy is powerful, it has limitations:

import time

# SymPy can be slow for complex symbolic computations
print("=== Performance Comparison ===")

# Numerical computation (fast)
start = time.time()
result_numerical = stats.binom.pmf(50, 100, 0.5)
time_numerical = time.time() - start

# Symbolic computation (slower)
start = time.time()
X_sym = Binomial('X', 100, sp.Rational(1, 2))
result_symbolic = float(P(X_sym == 50))
time_symbolic = time.time() - start

print(f"Numerical (SciPy): {result_numerical:.10f} in {time_numerical:.6f}s")
print(f"Symbolic (SymPy):  {result_symbolic:.10f} in {time_symbolic:.6f}s")
print(f"Speedup: {time_symbolic/time_numerical:.1f}x slower")

Key takeaways:

Summary

In this chapter, we explored symbolic probability computation with SymPy:

Key Concepts:

When to Use SymPy:

When to Use NumPy/SciPy:

Best Practice: Use SymPy to understand the mathematics, then use NumPy/SciPy for practical computation.

In the next chapter, we’ll explore SageMath, a comprehensive mathematical software system that combines the power of SymPy, NumPy, and many other tools into a unified environment.

Exercises

  1. Exact Coin Flips: Calculate the exact probability of getting exactly 7 heads in 10 fair coin flips using SymPy. Compare with SciPy.

  2. Symbolic Variance: Create a symbolic random variable following a Poisson distribution with parameter λ. Derive E(X) and Var(X) symbolically.

  3. Poker Hands: Calculate the exact probability of getting a “straight flush” (5 consecutive cards of the same suit) in 5-card poker.

  4. Conditional Probability: Using two symbolic dice X and Y, calculate P(X = 4 | X + Y = 9) exactly.

  5. MGF Derivation: Derive the moment generating function for a Geometric distribution with parameter p, and use it to find E(X) and Var(X).

  6. Birthday Problem Extension: Modify the birthday problem to find the smallest group size where P(at least 2 share) > 0.9.

  7. Bayes Update: A rare disease affects 0.1% of the population. A test is 99% accurate (both sensitivity and specificity). If you test positive, what’s the exact probability you have the disease?

  8. Sum of Uniforms: Create two independent discrete uniform random variables on {1,2,3,4,5,6}. Find the exact PMF of their sum.

Further Reading