Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Welcome to Chapter 21! In the previous chapter, we explored SymPy for symbolic probability computation. Now we’ll discover SageMath (often called just “Sage”), a comprehensive free open-source mathematics software system that combines the power of many specialized libraries including NumPy, SciPy, SymPy, matplotlib, and dozens more.

SageMath is built on Python but extends it with a powerful mathematical syntax and extensive pre-built functionality for:

While this entire book could have been written using SageMath, we’ve focused on standard Python libraries (NumPy, SciPy, SymPy) because they’re more commonly used in data science and machine learning workflows. However, SageMath offers unique advantages for mathematical work and is worth knowing about.

Learning Objectives

What is SageMath?

SageMath vs Python + Libraries

Standard Python Approach (what we’ve used in this book):

import numpy as np
import scipy.stats as stats
import sympy as sp
import matplotlib.pyplot as plt

SageMath Approach:

Key Philosophy: SageMath aims to be a free open-source alternative to Mathematica, Maple, and MATLAB.

Installation and Access

CoCalc (https://cocalc.com) is a free online platform that provides SageMath in your browser:

Option 2: Local Installation

Install SageMath locally:

On Ubuntu/Debian:

sudo apt-get install sagemath

On macOS (using Homebrew):

brew install --cask sagemath

On Windows:

Via Conda (unofficial):

conda install -c conda-forge sage

Option 3: Docker

docker pull sagemath/sagemath
docker run -p 8888:8888 sagemath/sagemath:latest sage-jupyter

Option 4: SageMathCell

For quick one-off calculations: https://sagecell.sagemath.org/

SageMath Basics for Probability

Exact Arithmetic by Default

Unlike Python where 1/3 gives 0.333..., SageMath provides exact arithmetic by default:

# In SageMath (this would give exact result)
sage: 1/3
1/3

sage: 1/3 + 1/3 + 1/3
1

sage: sqrt(2)
sqrt(2)

sage: sqrt(2).n()  # .n() for numerical approximation
1.41421356237310

sage: pi
pi

sage: pi.n(digits=50)
3.1415926535897932384626433832795028841971693993751

This exact arithmetic is similar to SymPy but built into the core language.

Combinatorics in SageMath

SageMath has extensive built-in combinatorics support:

# Factorials
sage: factorial(10)
3628800

sage: factorial(100)  # Handles large numbers easily
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000

# Binomial coefficients
sage: binomial(10, 3)
120

sage: binomial(52, 5)  # Poker hands
2598960

# Permutations
sage: factorial(8) / factorial(8-3)  # P(8,3)
336

# SageMath also has Permutations class for working with permutation objects
sage: Permutations(3).list()
[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]

sage: Permutations(3).cardinality()
6

# Combinations as mathematical objects
sage: Combinations(5, 2).list()
[[1, 2], [1, 3], [1, 4], [1, 5], [2, 3], [2, 4], [2, 5], [3, 4], [3, 5], [4, 5]]

sage: Combinations(5, 2).cardinality()
10

Symbolic Variables and Expressions

# Declare symbolic variables
sage: var('n k p')
(n, k, p)

# Binomial probability formula
sage: prob = binomial(n, k) * p^k * (1-p)^(n-k)
sage: prob
(p - 1)^(-k + n)*p^k*binomial(n, k)

# Substitute values
sage: prob.subs(n=10, k=3, p=1/2)
15/128

# Simplify
sage: expr = (n * (n-1)) / n
sage: expr.simplify()
n - 1

# Expand
sage: expand((p + (1-p))^5)
1

Probability Distributions in SageMath

SageMath provides several ways to work with probability distributions.

Using SciPy Through SageMath

SageMath includes SciPy, so you can use it exactly as we have throughout this book:

sage: import scipy.stats as stats
sage: stats.binom.pmf(3, 10, 0.5)
0.1171875

Symbolic Probability with SageMath

# Exact probability calculations
sage: var('n k p')
sage: binomial_prob = binomial(n,k) * p^k * (1-p)^(n-k)

# Calculate for specific values
sage: binomial_prob(n=5, k=2, p=1/2)
5/16

# Verify sum to 1
sage: n_val = 5
sage: sum(binomial_prob(n=n_val, k=k_val, p=1/2) for k_val in range(n_val+1))
1

Discrete Distributions

# Creating a discrete probability distribution
sage: def coin_flip():
....:     return 'H' if random() < 0.5 else 'T'

sage: # Simulate flips
sage: [coin_flip() for _ in range(10)]
['T', 'H', 'H', 'T', 'H', 'T', 'T', 'H', 'H', 'T']

# Using GeneralDiscreteDistribution
sage: outcomes = [1, 2, 3, 4, 5, 6]
sage: probabilities = [1/6] * 6
sage: die_dist = GeneralDiscreteDistribution(probabilities)

sage: # Sample from distribution
sage: [outcomes[die_dist.get_random_element()] for _ in range(10)]
[4, 1, 6, 2, 3, 5, 1, 6, 4, 2]

Advanced Combinatorics Examples

Example 1: Poker Hand Probabilities

sage: # Total 5-card poker hands
sage: total_hands = binomial(52, 5)
sage: total_hands
2598960

sage: # Royal flush: A,K,Q,J,10 of same suit
sage: royal_flush = 4  # One per suit
sage: prob_royal_flush = royal_flush / total_hands
sage: prob_royal_flush
1/649740

sage: # Full house: 3 of one rank, 2 of another
sage: full_house = binomial(13,1) * binomial(4,3) * binomial(12,1) * binomial(4,2)
sage: full_house
3744
sage: prob_full_house = full_house / total_hands
sage: prob_full_house
6/4165

sage: # Convert to decimal
sage: prob_full_house.n()
0.00144057623049925

Example 2: Birthday Problem

sage: def birthday_probability(n):
....:     """Exact probability that at least 2 people share a birthday"""
....:     prob_all_different = 1
....:     for i in range(n):
....:         prob_all_different *= (365 - i) / 365
....:     return 1 - prob_all_different

sage: # Test various group sizes
sage: for n in [10, 20, 23, 30, 50]:
....:     print(f"n={n:2d}: {birthday_probability(n).n():.6f}")
n=10: 0.116948
n=20: 0.411438
n=23: 0.507297
n=30: 0.706316
n=50: 0.970374

sage: # Find minimum n where probability > 0.5
sage: n = 1
sage: while birthday_probability(n) < 1/2:
....:     n += 1
sage: print(f"Minimum group size for >50%: {n}")
Minimum group size for >50%: 23

Example 3: Multinomial Coefficients

sage: # Multinomial coefficient: n! / (k1! * k2! * ... * km!)
sage: # Example: Arrangements of MISSISSIPPI (11 letters)
sage: # M:1, I:4, S:4, P:2

sage: n = 11
sage: multinomial([1, 4, 4, 2])
34650

sage: # Verify with factorial formula
sage: factorial(11) / (factorial(1) * factorial(4) * factorial(4) * factorial(2))
34650

sage: # Probability that a random arrangement spells MISSISSIPPI
sage: 1 / multinomial([1, 4, 4, 2])
1/34650

Symbolic Probability Theory

Deriving Expected Value Formulas

sage: # Expected value of binomial distribution
sage: var('n p k')
sage: # E(X) = sum(k * P(X=k)) for k=0 to n

sage: # Using symbolic sum (this may be slow for symbolic n)
sage: # We can verify for small concrete values
sage: n_val = 5
sage: p_val = var('p')
sage: expected = sum(k * binomial(n_val, k) * p_val^k * (1-p_val)^(n_val-k)
....:                for k in range(n_val + 1))
sage: expected.simplify()
5*p

sage: # This confirms E(X) = np for Binomial(n,p)

Moment Generating Functions

sage: # MGF of Binomial distribution: M(t) = (pe^t + (1-p))^n
sage: var('t n p')
sage: mgf = (p * exp(t) + (1-p))^n

sage: # First derivative at t=0 gives E(X)
sage: mgf_prime = diff(mgf, t)
sage: expected_value = mgf_prime.subs(t=0).simplify()
sage: expected_value
n*p

sage: # Second derivative for variance
sage: mgf_double_prime = diff(mgf, t, 2)
sage: second_moment = mgf_double_prime.subs(t=0).simplify()
sage: variance = (second_moment - expected_value^2).simplify()
sage: variance
n*p*(p - 1)
sage: # Which simplifies to n*p*(1-p)

Numerical Integration and Probability

SageMath can handle both symbolic and numerical integration:

sage: # Numerical integration for continuous distributions
sage: var('x')

sage: # Standard normal PDF
sage: normal_pdf = 1/sqrt(2*pi) * exp(-x^2/2)

sage: # P(Z < 1) for standard normal
sage: prob = integral(normal_pdf, x, -oo, 1)
sage: prob.n()
0.841344746068543

sage: # P(0 < Z < 1)
sage: prob_range = integral(normal_pdf, x, 0, 1)
sage: prob_range.n()
0.341344746068543

sage: # Verify using scipy for comparison
sage: import scipy.stats as stats
sage: stats.norm.cdf(1)  # Should match first calculation
0.8413447460685429

Plotting Distributions

SageMath has built-in plotting that’s similar to matplotlib but with some enhancements:

sage: # Plot binomial PMF
sage: n, p = 10, 0.5
sage: points = [(k, binomial(n,k) * p^k * (1-p)^(n-k)) for k in range(n+1)]
sage: bar_chart(points, color='blue', width=0.5)

sage: # Plot normal PDF
sage: var('x')
sage: mu, sigma = 0, 1
sage: pdf = 1/(sigma*sqrt(2*pi)) * exp(-(x-mu)^2/(2*sigma^2))
sage: plot(pdf, (x, -4, 4), color='red', thickness=2,
....:      axes_labels=['x', 'PDF'], title='Standard Normal Distribution')

sage: # Multiple distributions on same plot
sage: p1 = plot(pdf, (x, -4, 4), color='blue', legend_label='N(0,1)')
sage: pdf2 = 1/(2*sqrt(2*pi)) * exp(-(x-1)^2/(2*4))  # N(1,2)
sage: p2 = plot(pdf2, (x, -4, 6), color='red', legend_label='N(1,2)')
sage: (p1 + p2).show()

Simulation and Monte Carlo

sage: # Monte Carlo estimation of π
sage: def estimate_pi(n):
....:     inside = 0
....:     for _ in range(n):
....:         x, y = random(), random()
....:         if x^2 + y^2 <= 1:
....:             inside += 1
....:     return 4 * inside / n

sage: estimate_pi(10000)
3.1416  # Will vary

sage: # Simulate dice rolls
sage: rolls = [randint(1,6) for _ in range(1000)]
sage: # Count frequencies
sage: {i: rolls.count(i)/1000 for i in range(1,7)}
{1: 0.163, 2: 0.171, 3: 0.166, 4: 0.168, 5: 0.162, 6: 0.170}  # Approximately uniform

Comparison: SageMath vs NumPy/SciPy/SymPy

When to Use SageMath

✅ Use SageMath when:

❌ Don’t use SageMath when:

Syntax Comparison

# Exact probability: P(X=3) for Binomial(10, 0.5)

# NumPy/SciPy (numerical)
from scipy.stats import binom
prob = binom.pmf(3, 10, 0.5)  # 0.1171875

# SymPy (symbolic)
from sympy.stats import Binomial, P
from sympy import Rational
X = Binomial('X', 10, Rational(1,2))
prob = P(X == 3)  # 15/128

# SageMath (natural mathematical syntax)
binomial(10, 3) * (1/2)^10  # 15/128

Practical Example: Complete Analysis

Let’s solve a problem using SageMath’s full capabilities:

Problem: A fair coin is flipped 100 times. What’s the probability of getting between 45 and 55 heads (inclusive)?

sage: # Method 1: Exact calculation
sage: n = 100
sage: p = 1/2
sage: prob_exact = sum(binomial(n,k) * p^n for k in range(45, 56))
sage: prob_exact.n()
0.728747317564360

sage: # Method 2: Using scipy for comparison
sage: import scipy.stats as stats
sage: prob_scipy = stats.binom.cdf(55, n, 0.5) - stats.binom.cdf(44, n, 0.5)
sage: prob_scipy
0.7287473175643597

sage: # Method 3: Normal approximation
sage: # Binomial(100, 0.5) ≈ Normal(50, 25)
sage: mu = n * p
sage: sigma = sqrt(n * p * (1-p))
sage: sigma.n()
5.00000000000000

sage: # Using continuity correction: P(44.5 < X < 55.5)
sage: from scipy.stats import norm
sage: prob_normal = norm.cdf(55.5, mu, sigma) - norm.cdf(44.5, mu, sigma)
sage: prob_normal
0.7287181077536644

sage: print(f"Exact: {prob_exact.n():.10f}")
sage: print(f"SciPy: {prob_scipy:.10f}")
sage: print(f"Normal approx: {prob_normal:.10f}")

Unique SageMath Features for Probability

1. Built-in Graph Theory (for Markov Chains)

sage: # Create a transition matrix for a Markov chain
sage: P = matrix(QQ, [[1/2, 1/2, 0],
....:                  [1/4, 1/2, 1/4],
....:                  [0, 1/2, 1/2]])

sage: # Visualize as directed graph
sage: G = DiGraph(P, format='weighted_adjacency_matrix')
sage: G.plot(edge_labels=True)

sage: # Find stationary distribution
sage: # Solve π P = π
sage: eigenspaces = P.transpose().eigenspaces_right()
sage: # Extract eigenvector for eigenvalue 1

2. Automatic Simplification

sage: # SageMath often simplifies automatically
sage: var('n')
sage: expr = binomial(n, 0) + binomial(n, n)
sage: expr
2  # Automatically simplified!

sage: expr2 = factorial(n) / (factorial(n-1) * n)
sage: expr2.simplify()
1

3. LaTeX Output

sage: var('n k p')
sage: formula = binomial(n,k) * p^k * (1-p)^(n-k)
sage: latex(formula)
\left(p - 1\right)^{-k + n} p^{k} \binom{n}{k}

# This is great for generating formulas for papers and presentations!

Integration with Jupyter

SageMath works excellently with Jupyter notebooks:

  1. CoCalc: Built-in Jupyter with SageMath kernel

  2. Local Jupyter: After installing SageMath, use sage -n jupyter

  3. SageMath kernel: Select “SageMath” kernel in Jupyter

Advantages:

Limitations and Considerations

Installation Complexity

Ecosystem Compatibility

Performance

Community Size

Summary

SageMath is a powerful, comprehensive mathematical software system built on Python that offers:

Strengths:

Trade-offs:

Recommendation:

Best of both worlds: Learn the concepts with SageMath, implement production code with NumPy/SciPy!

Exercises

  1. Installation: Set up SageMath using your preferred method (CoCalc, local, or Docker) and verify it works.

  2. Exact Probabilities: Calculate the exact probability of getting exactly 5 heads in 12 coin flips. Express as both a fraction and decimal.

  3. Poker Probability: Calculate the exact probability of getting “four of a kind” in 5-card poker using SageMath combinatorics.

  4. Symbolic Variance: Derive the variance formula for a geometric distribution using symbolic computation in SageMath.

  5. Birthday Problem Extension: Find the minimum number of people needed for a >90% chance of a shared birthday.

  6. Markov Chain: Create a simple 3-state Markov chain transition matrix and find its stationary distribution using SageMath’s linear algebra capabilities.

  7. Comparison: Solve the same binomial probability problem using SciPy (numerical), SymPy (symbolic), and SageMath. Compare the results and execution times.

  8. Monte Carlo: Implement a Monte Carlo simulation in SageMath to estimate the probability that the sum of two dice is greater than 8.

Further Reading

Conclusion

You’ve now completed the journey through probability in practice with Python! From basic probability concepts to Monte Carlo methods, Markov chains, and now symbolic computation with SymPy and SageMath, you have a comprehensive toolkit for tackling probability problems.

Key Takeaways from This Book:

Next Steps:

Thank you for joining this hands-on journey through probability with Python. May your p-values be significant and your confidence intervals narrow! 🎲📊🐍