Welcome to Chapter 20! Throughout this book, we’ve used NumPy and SciPy for numerical computation – calculating probabilities as decimal approximations, simulating random processes, and plotting distributions. This numerical approach is powerful and practical for most real-world applications.
However, there’s another way to work with probability: symbolic computation. Instead of getting 0.16666... when calculating the probability of rolling a 1 on a fair die, we can get the exact answer: 1/6. Instead of numerically evaluating integrals or derivatives, we can manipulate formulas algebraically.
This chapter introduces SymPy (Symbolic Python), a Python library for symbolic mathematics. SymPy allows us to:
Work with exact fractions instead of decimal approximations
Derive probability formulas with symbolic parameters
Manipulate probability distributions algebraically
Verify textbook formulas and proofs
Bridge numerical computation with pure mathematics
Learning Objectives¶
Understand when to use symbolic vs numerical computation
Work with exact probabilities using SymPy fractions
Use SymPy for symbolic combinatorics (factorials, binomials)
Create and manipulate symbolic random variables with
sympy.statsCompute probabilities, expectations, and variances symbolically
Derive probability formulas with symbolic parameters
Convert between symbolic and numerical representations
Why Symbolic Computation?¶
Numerical vs Symbolic Approaches¶
Throughout this book, we’ve primarily used the numerical approach:
import numpy as np
from scipy import stats
# Numerical: Probability of getting exactly 3 heads in 5 coin flips
prob_numerical = stats.binom.pmf(k=3, n=5, p=0.5)
print(f"Numerical result: {prob_numerical}")
print(f"As decimal: {prob_numerical:.10f}")Numerical result: 0.31249999999999983
As decimal: 0.3125000000
The numerical approach gives us a floating-point approximation. For most practical purposes, this is perfect! But what if we want the exact answer?
import sympy as sp
from sympy.stats import Binomial, P, E, variance
# Symbolic: Same problem with exact arithmetic
n, k = 5, 3
p_sym = sp.Rational(1, 2) # Exact 1/2, not 0.5
# Calculate using combinatorics
prob_symbolic = sp.binomial(n, k) * p_sym**k * (1-p_sym)**(n-k)
print(f"Symbolic result: {prob_symbolic}")
print(f"Simplified: {sp.simplify(prob_symbolic)}")
print(f"As decimal: {float(prob_symbolic)}")Symbolic result: 5/16
Simplified: 5/16
As decimal: 0.3125
When to Use Each Approach¶
Use Numerical (NumPy/SciPy) when:
Working with real-world data or measurements
Running simulations with large datasets
Speed is critical
Decimal approximations are sufficient
Working with continuous distributions that don’t have closed forms
Use Symbolic (SymPy) when:
You need exact answers (fractions, expressions with π, e, √2)
Deriving formulas with unknown parameters
Teaching or learning (exact answers aid understanding)
Verifying mathematical proofs
Working with small discrete problems
You want to manipulate algebraic expressions
Best practice: Start with symbolic to understand the mathematics, then use numerical for computation.
SymPy Basics for Probability¶
Exact Numbers: Rational vs Float¶
import sympy as sp
# The difference between symbolic and numerical
print("Floating point (numerical):")
print(f" 1/3 = {1/3}")
print(f" 1/3 + 1/3 + 1/3 = {1/3 + 1/3 + 1/3}")
print("\nRational (symbolic):")
one_third = sp.Rational(1, 3)
print(f" 1/3 = {one_third}")
print(f" 1/3 + 1/3 + 1/3 = {one_third + one_third + one_third}")Floating point (numerical):
1/3 = 0.3333333333333333
1/3 + 1/3 + 1/3 = 1.0
Rational (symbolic):
1/3 = 1/3
1/3 + 1/3 + 1/3 = 1
This exact arithmetic is crucial for probability calculations!
Symbolic Combinatorics¶
SymPy provides symbolic versions of the combinatorial functions we’ve been using:
# Comparison with math and scipy.special
import math
from scipy.special import comb, perm
n, k = 10, 3
print("=== Factorials ===")
print(f"math.factorial(10) = {math.factorial(n)}")
print(f"sp.factorial(10) = {sp.factorial(n)}")
# Symbolic with variables
n_sym, k_sym = sp.symbols('n k', integer=True, positive=True)
print(f"\nSymbolic: n! = {sp.factorial(n_sym)}")
print("\n=== Permutations ===")
print(f"scipy perm(10, 3) = {perm(n, k, exact=True)}")
print(f"SymPy P(10, 3) = {sp.factorial(n) / sp.factorial(n-k)}")
print("\n=== Combinations ===")
print(f"scipy comb(10, 3) = {comb(n, k, exact=True)}")
print(f"SymPy C(10, 3) = {sp.binomial(n, k)}")
# With symbolic parameters
print(f"\nSymbolic: C(n, k) = {sp.binomial(n_sym, k_sym)}")=== Factorials ===
math.factorial(10) = 3628800
sp.factorial(10) = 3628800
Symbolic: n! = factorial(n)
=== Permutations ===
scipy perm(10, 3) = 720
SymPy P(10, 3) = 720
=== Combinations ===
scipy comb(10, 3) = 120
SymPy C(10, 3) = 120
Symbolic: C(n, k) = binomial(n, k)
Simplifying Expressions¶
One of SymPy’s superpowers is simplifying complex expressions:
n, k = sp.symbols('n k', integer=True, positive=True)
# Probability of exactly k successes in n Bernoulli trials
p = sp.Symbol('p', real=True, positive=True)
q = 1 - p
# Binomial probability formula
binomial_pmf = sp.binomial(n, k) * p**k * q**(n-k)
print("Binomial PMF:")
print(f" Original: {binomial_pmf}")
print(f" Expanded: {sp.expand(binomial_pmf)}")
# Verify that probabilities sum to 1 (for small n)
n_val = 3
total_prob = sum(binomial_pmf.subs(n, n_val).subs(k, i) for i in range(n_val + 1))
print(f"\nSum of probabilities (n=3): {sp.simplify(total_prob)}")Binomial PMF:
Original: p**k*(1 - p)**(-k + n)*binomial(n, k)
Expanded: p**k*(1 - p)**(-k + n)*binomial(n, k)
Sum of probabilities (n=3): 1
Symbolic Random Variables with sympy.stats¶
The sympy.stats module lets us work with random variables symbolically, similar to how we used scipy.stats numerically.
Discrete Distributions¶
from sympy.stats import Die, Coin, Binomial, Poisson, Geometric
from sympy.stats import P, E, variance, density, sample
# Fair six-sided die
X = Die('X', 6)
print("=== Die Probabilities ===")
print(f"P(X = 3) = {P(X == 3)}")
print(f"P(X > 4) = {P(X > 4)}")
print(f"P(X is even) = {P(X % 2 == 0)}") # Note: May not simplify automatically
print(f"\nE(X) = {E(X)}")
print(f"Var(X) = {variance(X)}")=== Die Probabilities ===
P(X = 3) = 0
P(X > 4) = 1/3
P(X is even) = 0
E(X) = 7/2
Var(X) = 35/12
# Binomial distribution
n_trials = sp.Symbol('n', integer=True, positive=True)
p_success = sp.Symbol('p', real=True, positive=True)
# Create symbolic binomial random variable
X_binom = Binomial('X', 10, sp.Rational(1, 2))
print("=== Binomial(10, 1/2) ===")
print(f"P(X = 5) = {P(X_binom == 5)}")
print(f"P(X <= 3) = {P(X_binom <= 3)}")
print(f"E(X) = {E(X_binom)}")
print(f"Var(X) = {variance(X_binom)}")
# With symbolic parameters - expectations and variances
X_sym = Binomial('X', n_trials, p_success)
print(f"\n=== Binomial(n, p) - Symbolic ===")
print(f"E(X) = {E(X_sym)}")
print(f"Var(X) = {variance(X_sym)}")=== Binomial(10, 1/2) ===
P(X = 5) = 0
P(X <= 3) = 11/64
E(X) = 5
Var(X) = 5/2
=== Binomial(n, p) - Symbolic ===
E(X) = Sum(Piecewise((_k*p**_k*(1 - p)**(-_k + n)*binomial(n, _k), _k <= n), (0, True)), (_k, 0, n))
Var(X) = Sum(Piecewise((p**_k*(1 - p)**(-_k + n)*(_k - Sum(Piecewise((_k*p**_k*(1 - p)**(-_k + n)*binomial(n, _k), _k <= n), (0, True)), (_k, 0, n)))**2*binomial(n, _k), _k <= n), (0, True)), (_k, 0, n))
# Geometric distribution
p_geo = sp.Rational(1, 6) # Probability of rolling a 6
X_geo = Geometric('X', p_geo)
print("=== Geometric(1/6) - First success ===")
print(f"P(X = 1) = {P(X_geo == 1)}") # Success on first trial
print(f"P(X = 6) = {P(X_geo == 6)}") # Success on sixth trial
print(f"E(X) = {E(X_geo)}")
print(f"Var(X) = {variance(X_geo)}")=== Geometric(1/6) - First success ===
P(X = 1) = 0
P(X = 6) = 0
E(X) = 6
Var(X) = 30
Continuous Distributions¶
from sympy.stats import Normal, Exponential, Uniform, ContinuousRV
from sympy import exp, pi, sqrt, oo
# Normal distribution with symbolic parameters
mu, sigma = sp.symbols('mu sigma', real=True)
sigma_pos = sp.Symbol('sigma', positive=True, real=True)
X_norm = Normal('X', mu, sigma_pos)
print("=== Normal(μ, σ) - Symbolic ===")
print(f"E(X) = {E(X_norm)}")
print(f"Var(X) = {variance(X_norm)}")
# With specific values
X_standard = Normal('Z', 0, 1)
print(f"\n=== Standard Normal(0, 1) ===")
print(f"E(Z) = {E(X_standard)}")
print(f"Var(Z) = {variance(X_standard)}")
# Probability calculations (may be slow for symbolic results)
X_concrete = Normal('X', 100, 15)
print(f"\n=== Normal(100, 15) ===")
print(f"P(X > 115) = {P(X_concrete > 115)}") # Returns in terms of erf=== Normal(μ, σ) - Symbolic ===
E(X) = mu
Var(X) = sigma**2
=== Standard Normal(0, 1) ===
E(Z) = 0
Var(Z) = 1
=== Normal(100, 15) ===
P(X > 115) = sqrt(2)*(-sqrt(2)*sqrt(pi)*erf(sqrt(2)/2) + sqrt(2)*sqrt(pi))/(4*sqrt(pi))
# Exponential distribution
lambda_sym = sp.Symbol('lambda', positive=True)
rate = sp.Rational(1, 2)
X_exp = Exponential('X', rate)
print("=== Exponential(λ=1/2) ===")
print(f"E(X) = {E(X_exp)}")
print(f"Var(X) = {variance(X_exp)}")
# Symbolic rate parameter
X_exp_sym = Exponential('X', lambda_sym)
print(f"\n=== Exponential(λ) - Symbolic ===")
print(f"E(X) = {E(X_exp_sym)}")
print(f"Var(X) = {variance(X_exp_sym)}")=== Exponential(λ=1/2) ===
E(X) = 2
Var(X) = 4
=== Exponential(λ) - Symbolic ===
E(X) = 1/lambda
Var(X) = lambda**(-2)
Deriving Probability Formulas¶
One of the most powerful uses of SymPy is deriving and verifying probability formulas.
Moment Generating Functions¶
# Binomial MGF
n, p, t = sp.symbols('n p t', real=True)
n = sp.Symbol('n', integer=True, positive=True)
p = sp.Symbol('p', real=True, positive=True)
# MGF of Binomial: M(t) = (pe^t + q)^n where q = 1-p
q = 1 - p
mgf_binomial = (p * sp.exp(t) + q)**n
print("=== Binomial MGF ===")
print(f"M(t) = {mgf_binomial}")
# Expected value is the first derivative at t=0
mgf_derivative = sp.diff(mgf_binomial, t)
expected_value = mgf_derivative.subs(t, 0)
print(f"\nM'(t) = {mgf_derivative}")
print(f"E(X) = M'(0) = {sp.simplify(expected_value)}")
# Variance from second moment
mgf_second_derivative = sp.diff(mgf_binomial, t, 2)
second_moment = mgf_second_derivative.subs(t, 0)
variance_binomial = sp.simplify(second_moment - expected_value**2)
print(f"\nM''(0) = {sp.simplify(second_moment)}")
print(f"Var(X) = E(X²) - E(X)² = {variance_binomial}")=== Binomial MGF ===
M(t) = (p*exp(t) - p + 1)**n
M'(t) = n*p*(p*exp(t) - p + 1)**n*exp(t)/(p*exp(t) - p + 1)
E(X) = M'(0) = n*p
M''(0) = n*p*(n*p - p + 1)
Var(X) = E(X²) - E(X)² = n*p*(1 - p)
Bayes’ Theorem Symbolically¶
# Symbolic Bayes' Theorem
# P(A|B) = P(B|A) * P(A) / P(B)
P_A, P_B, P_B_given_A = sp.symbols('P(A) P(B) P(B|A)', real=True, positive=True)
# Prior, likelihood, and evidence
prior = P_A
likelihood = P_B_given_A
evidence = P_B
# Posterior
posterior = (likelihood * prior) / evidence
print("=== Bayes' Theorem ===")
print(f"P(A|B) = {posterior}")
# Concrete example: Medical test
# Disease prevalence = 1%
# Test sensitivity (true positive rate) = 95%
# Test specificity (true negative rate) = 90%
# What's P(Disease|Positive) ?
P_disease = sp.Rational(1, 100)
P_no_disease = 1 - P_disease
P_pos_given_disease = sp.Rational(95, 100)
P_pos_given_no_disease = sp.Rational(10, 100) # False positive = 1 - specificity
# Total probability of testing positive
P_positive = P_pos_given_disease * P_disease + P_pos_given_no_disease * P_no_disease
# Posterior: P(Disease|Positive)
P_disease_given_pos = (P_pos_given_disease * P_disease) / P_positive
print(f"\n=== Medical Test Example ===")
print(f"P(Disease) = {P_disease}")
print(f"P(Positive|Disease) = {P_pos_given_disease}")
print(f"P(Positive|No Disease) = {P_pos_given_no_disease}")
print(f"P(Positive) = {P_positive} = {float(P_positive):.4f}")
print(f"P(Disease|Positive) = {P_disease_given_pos} = {float(P_disease_given_pos):.4f}")=== Bayes' Theorem ===
P(A|B) = P(A)*P(B|A)/P(B)
=== Medical Test Example ===
P(Disease) = 1/100
P(Positive|Disease) = 19/20
P(Positive|No Disease) = 1/10
P(Positive) = 217/2000 = 0.1085
P(Disease|Positive) = 19/217 = 0.0876
Converting Between Symbolic and Numerical¶
# Symbolic computation
from sympy.stats import Binomial, P, E
import numpy as np
from scipy import stats
n, p_val = 20, sp.Rational(3, 10)
X_sym = Binomial('X', n, p_val)
# Symbolic probability
prob_sym = P(X_sym == 5)
print(f"Symbolic: P(X=5) = {prob_sym}")
print(f"Exact fraction: {prob_sym}")
# Convert to float
prob_float = float(prob_sym)
print(f"As float: {prob_float}")
# Compare with scipy
prob_scipy = stats.binom.pmf(5, n, float(p_val))
print(f"SciPy numerical: {prob_scipy}")
print(f"Difference: {abs(prob_float - prob_scipy)}")
# Expected value comparison
exp_sym = E(X_sym)
exp_scipy = stats.binom.mean(n, float(p_val))
print(f"\nSymbolic E(X) = {exp_sym} = {float(exp_sym)}")
print(f"SciPy E(X) = {exp_scipy}")Symbolic: P(X=5) = 0
Exact fraction: 0
As float: 0.0
SciPy numerical: 0.17886305056987964
Difference: 0.17886305056987964
Symbolic E(X) = 6 = 6.0
SciPy E(X) = 6.0
Practical Examples¶
Example 1: Exact Poker Probabilities¶
# Calculate exact probability of a full house in 5-card poker
# Total 5-card hands
total_hands = sp.binomial(52, 5)
# Full house: 3 of one rank, 2 of another
# Choose rank for triple: C(13, 1)
# Choose 3 suits from 4: C(4, 3)
# Choose rank for pair: C(12, 1) # Can't be same as triple
# Choose 2 suits from 4: C(4, 2)
ways_full_house = sp.binomial(13, 1) * sp.binomial(4, 3) * sp.binomial(12, 1) * sp.binomial(4, 2)
prob_full_house = sp.Rational(ways_full_house, total_hands)
print(f"Total 5-card hands: {total_hands}")
print(f"Ways to get full house: {ways_full_house}")
print(f"Probability (exact): {prob_full_house}")
print(f"Probability (decimal): {float(prob_full_house):.6f}")
print(f"Odds: 1 in {float(1/prob_full_house):.2f}")Total 5-card hands: 2598960
Ways to get full house: 3744
Probability (exact): 6/4165
Probability (decimal): 0.001441
Odds: 1 in 694.17
Example 2: Birthday Problem - Exact Solution¶
# What's the probability that in a group of n people, at least 2 share a birthday?
def birthday_probability_exact(n):
"""Calculate exact probability using SymPy"""
# P(at least 2 share) = 1 - P(all different)
# P(all different) = 365/365 * 364/365 * 363/365 * ... * (365-n+1)/365
prob_all_different = sp.Rational(1, 1)
for i in range(n):
prob_all_different *= sp.Rational(365 - i, 365)
prob_at_least_two_share = 1 - prob_all_different
return prob_at_least_two_share
# Test for different group sizes
for n in [10, 20, 23, 30, 50]:
prob = birthday_probability_exact(n)
print(f"n={n:2d}: P(at least 2 share) = {float(prob):.6f}")
# The famous n=23 case
prob_23 = birthday_probability_exact(23)
print(f"\nExact probability for n=23: {float(prob_23):.10f}")
print(f"Greater than 50%? {float(prob_23) > 0.5}")n=10: P(at least 2 share) = 0.116948
n=20: P(at least 2 share) = 0.411438
n=23: P(at least 2 share) = 0.507297
n=30: P(at least 2 share) = 0.706316
n=50: P(at least 2 share) = 0.970374
Exact probability for n=23: 0.5072972343
Greater than 50%? True
Example 3: Deriving Variance Formula¶
# Prove that Var(X) = E(X²) - E(X)² symbolically
from sympy.stats import DiscreteRV, E, variance
# Create a simple discrete RV
x = sp.Symbol('x')
X = DiscreteRV(x, {1: sp.Rational(1,6),
2: sp.Rational(1,6),
3: sp.Rational(1,6),
4: sp.Rational(1,6),
5: sp.Rational(1,6),
6: sp.Rational(1,6)})
# Method 1: Using variance function
var_direct = variance(X)
# Method 2: E(X²) - E(X)²
E_X = E(X)
E_X_squared = E(X**2)
var_formula = E_X_squared - E_X**2
print(f"E(X) = {E_X}")
print(f"E(X²) = {E_X_squared}")
print(f"\nVar(X) using variance(): {var_direct}")
print(f"Var(X) using E(X²) - E(X)²: {sp.simplify(var_formula)}")
print(f"Are they equal? {sp.simplify(var_direct - var_formula) == 0}")---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[16], line 15
7 X = DiscreteRV(x, {1: sp.Rational(1,6),
8 2: sp.Rational(1,6),
9 3: sp.Rational(1,6),
10 4: sp.Rational(1,6),
11 5: sp.Rational(1,6),
12 6: sp.Rational(1,6)})
14 # Method 1: Using variance function
---> 15 var_direct = variance(X)
17 # Method 2: E(X²) - E(X)²
18 E_X = E(X)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/rv_interface.py:80, in variance(X, condition, **kwargs)
77 from sympy.stats.symbolic_probability import Variance
78 return Variance(X, condition)
---> 80 return cmoment(X, 2, condition, **kwargs)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/rv_interface.py:242, in cmoment(X, n, condition, evaluate, **kwargs)
240 from sympy.stats.symbolic_probability import CentralMoment
241 if evaluate:
--> 242 return CentralMoment(X, n, condition).doit()
243 return CentralMoment(X, n, condition).rewrite(Integral)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/symbolic_probability.py:688, in CentralMoment.doit(self, **hints)
687 def doit(self, **hints):
--> 688 return self.rewrite(Expectation).doit(**hints)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/symbolic_probability.py:260, in Expectation.doit(self, **hints)
257 evaluate = hints.get('evaluate', True)
259 if deep:
--> 260 expr = expr.doit(**hints)
262 if not is_random(expr) or isinstance(expr, Expectation): # expr isn't random?
263 return expr
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/basic.py:1947, in Basic.doit(self, **hints)
1928 """Evaluate objects that are not evaluated by default like limits,
1929 integrals, sums and products. All objects of this kind will be
1930 evaluated recursively, unless some species were excluded via 'hints'
(...) 1944
1945 """
1946 if hints.get('deep', True):
-> 1947 terms = [term.doit(**hints) if isinstance(term, Basic) else term
1948 for term in self.args]
1949 return self.func(*terms)
1950 else:
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/basic.py:1947, in <listcomp>(.0)
1928 """Evaluate objects that are not evaluated by default like limits,
1929 integrals, sums and products. All objects of this kind will be
1930 evaluated recursively, unless some species were excluded via 'hints'
(...) 1944
1945 """
1946 if hints.get('deep', True):
-> 1947 terms = [term.doit(**hints) if isinstance(term, Basic) else term
1948 for term in self.args]
1949 return self.func(*terms)
1950 else:
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in AssocOp.doit(self, **hints)
476 def doit(self, **hints):
477 if hints.get('deep', True):
--> 478 terms = [term.doit(**hints) for term in self.args]
479 else:
480 terms = self.args
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in <listcomp>(.0)
476 def doit(self, **hints):
477 if hints.get('deep', True):
--> 478 terms = [term.doit(**hints) for term in self.args]
479 else:
480 terms = self.args
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in AssocOp.doit(self, **hints)
476 def doit(self, **hints):
477 if hints.get('deep', True):
--> 478 terms = [term.doit(**hints) for term in self.args]
479 else:
480 terms = self.args
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/operations.py:478, in <listcomp>(.0)
476 def doit(self, **hints):
477 if hints.get('deep', True):
--> 478 terms = [term.doit(**hints) for term in self.args]
479 else:
480 terms = self.args
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/symbolic_probability.py:288, in Expectation.doit(self, **hints)
286 return self.func(expr)
287 # Otherwise case is simple, pass work off to the ProbabilitySpace
--> 288 result = pspace(expr).compute_expectation(expr, evaluate=evaluate)
289 if hasattr(result, 'doit') and evaluate:
290 return result.doit(**hints)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:313, in SingleDiscretePSpace.compute_expectation(self, expr, rvs, evaluate, **kwargs)
311 x = self.value.symbol
312 try:
--> 313 return self.distribution.expectation(expr, x, evaluate=evaluate,
314 **kwargs)
315 except NotImplementedError:
316 return Sum(expr * self.pdf, (x, self.set.inf, self.set.sup),
317 **kwargs)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:160, in SingleDiscreteDistribution.expectation(self, expr, var, evaluate, **kwargs)
156 p = poly(expr, var)
158 t = Dummy('t', real=True)
--> 160 mgf = self.moment_generating_function(t)
161 deg = p.degree()
162 taylor = poly(series(mgf, t, 0, deg + 1).removeO(), t)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:123, in SingleDiscreteDistribution.moment_generating_function(self, t, **kwargs)
121 if mgf is not None:
122 return mgf
--> 123 return self.compute_moment_generating_function(**kwargs)(t)
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/core/cache.py:72, in __cacheit.<locals>.func_wrapper.<locals>.wrapper(*args, **kwargs)
69 @wraps(func)
70 def wrapper(*args, **kwargs):
71 try:
---> 72 retval = cfunc(*args, **kwargs)
73 except TypeError as e:
74 if not e.args or not e.args[0].startswith('unhashable type:'):
File /opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/sympy/stats/drv.py:112, in SingleDiscreteDistribution.compute_moment_generating_function(self, **kwargs)
110 x = Dummy('x', integer=True)
111 pdf = self.pdf(x)
--> 112 mgf = summation(exp(t*x)*pdf, (x, self.set.inf, self.set.sup))
113 return Lambda(t, mgf)
TypeError: unsupported operand type(s) for *: 'exp' and 'Dict'Working with Joint Distributions¶
from sympy.stats import Die, P, E
# Two independent dice
X = Die('X', 6)
Y = Die('Y', 6)
# Sum of two dice
S = X + Y
print("=== Sum of Two Dice ===")
print(f"P(S = 7) = {P(S == 7)}")
print(f"P(S = 12) = {P(S == 12)}")
print(f"P(S > 10) = {P(S > 10)}")
print(f"\nE(X) = {E(X)}")
print(f"E(Y) = {E(Y)}")
print(f"E(X + Y) = {E(S)}")
print(f"Verify: E(X) + E(Y) = {E(X) + E(Y)}")
# Conditional probability
print(f"\n=== Conditional Probabilities ===")
# P(X=6 | X+Y=10)
# Using sympy.stats conditional might be complex, so we'll use the definition
# P(X=6, X+Y=10) / P(X+Y=10)Limitations and Performance Considerations¶
While SymPy is powerful, it has limitations:
import time
# SymPy can be slow for complex symbolic computations
print("=== Performance Comparison ===")
# Numerical computation (fast)
start = time.time()
result_numerical = stats.binom.pmf(50, 100, 0.5)
time_numerical = time.time() - start
# Symbolic computation (slower)
start = time.time()
X_sym = Binomial('X', 100, sp.Rational(1, 2))
result_symbolic = float(P(X_sym == 50))
time_symbolic = time.time() - start
print(f"Numerical (SciPy): {result_numerical:.10f} in {time_numerical:.6f}s")
print(f"Symbolic (SymPy): {result_symbolic:.10f} in {time_symbolic:.6f}s")
print(f"Speedup: {time_symbolic/time_numerical:.1f}x slower")Key takeaways:
SymPy is slower than NumPy/SciPy
Use SymPy for exact answers and derivations, not large-scale computation
Convert to numerical when you need speed
Some symbolic operations may not simplify automatically
Summary¶
In this chapter, we explored symbolic probability computation with SymPy:
Key Concepts:
Exact arithmetic with
sp.Rationalvs floating-point approximationSymbolic combinatorics:
sp.factorial,sp.binomialSymbolic random variables with
sympy.statsDeriving formulas with symbolic parameters
Converting between symbolic and numerical representations
When to Use SymPy:
Need exact answers (fractions, expressions)
Deriving formulas with unknown parameters
Verifying mathematical proofs
Teaching and learning probability theory
Small discrete probability problems
When to Use NumPy/SciPy:
Real-world data and measurements
Large-scale simulations
Speed is critical
Continuous distributions without closed forms
Best Practice: Use SymPy to understand the mathematics, then use NumPy/SciPy for practical computation.
In the next chapter, we’ll explore SageMath, a comprehensive mathematical software system that combines the power of SymPy, NumPy, and many other tools into a unified environment.
Exercises¶
Exact Coin Flips: Calculate the exact probability of getting exactly 7 heads in 10 fair coin flips using SymPy. Compare with SciPy.
Symbolic Variance: Create a symbolic random variable following a Poisson distribution with parameter λ. Derive E(X) and Var(X) symbolically.
Poker Hands: Calculate the exact probability of getting a “straight flush” (5 consecutive cards of the same suit) in 5-card poker.
Conditional Probability: Using two symbolic dice X and Y, calculate P(X = 4 | X + Y = 9) exactly.
MGF Derivation: Derive the moment generating function for a Geometric distribution with parameter p, and use it to find E(X) and Var(X).
Birthday Problem Extension: Modify the birthday problem to find the smallest group size where P(at least 2 share) > 0.9.
Bayes Update: A rare disease affects 0.1% of the population. A test is 99% accurate (both sensitivity and specificity). If you test positive, what’s the exact probability you have the disease?
Sum of Uniforms: Create two independent discrete uniform random variables on {1,2,3,4,5,6}. Find the exact PMF of their sum.
Further Reading¶
SymPy Documentation: https://
docs .sympy .org/ SymPy Stats Module: https://
docs .sympy .org /latest /modules /stats .html SymPy Tutorial: https://
docs .sympy .org /latest /tutorial /index .html Computer Algebra Systems in Education: Various papers on using CAS for teaching probability and statistics