Chapter 2: The Language of Probability: Sets, Sample Spaces, and Events

Welcome to Chapter 2! In the previous chapter, we introduced the ‘why’ of probability and set up our Python environment. Now, we dive into the fundamental vocabulary and concepts that form the bedrock of probability theory. Understanding these core ideas – sample spaces, events, and the rules governing them – is crucial before we can tackle more complex problems and distributions. We’ll use set theory as our language and Python to make these abstract concepts tangible.

Learning Objectives

Experiments, Outcomes, Sample Spaces

In probability, an experiment is any procedure or process that can, at least in principle, be repeated indefinitely and has a well-defined set of possible results. Think of flipping a coin, rolling a die, measuring a patient’s temperature, or recording the time it takes for a website to load.

Each possible result of an experiment is called an outcome.

The sample space, often denoted by $S$ or $\Omega$ (Omega), is the set of all possible outcomes of an experiment.

Discrete vs. Continuous Sample Spaces

Sample spaces can be categorized based on the nature of their outcomes:

  1. Discrete Sample Space: Contains a finite or countably infinite number of outcomes. The outcomes can be listed.

    • Example (Finite): Rolling a standard six-sided die. The sample space is $S = \{1, 2, 3, 4, 5, 6\}$.

    • Example (Countably Infinite): Flipping a coin until the first Head appears. The sample space is $S = \{H, TH, TTH, TTTH, \ldots\}$. Although infinite, each outcome can be matched to a positive integer: the number of flips needed (1 for H, 2 for TH, 3 for TTH, and so on).

  2. Continuous Sample Space: Contains uncountably many outcomes that form a continuum, such as an interval of real numbers. The outcomes cannot be listed one by one, not even as an infinite sequence.

    • Example: Measuring the exact height of a randomly selected adult. The sample space might be all real numbers between 0.5 meters and 3.0 meters, $S = \{h \in \mathbb{R} \mid 0.5 \le h \le 3.0\}$.

    • Example: Recording the time until a component fails. $S = \{t \in \mathbb{R} \mid t \ge 0\}$.

Python Representation (Discrete): We can easily represent finite discrete sample spaces using Python sets or lists. Sets are often conceptually closer as they inherently handle uniqueness and order doesn’t matter.

Python Implementation
# Sample space for rolling a single six-sided die
sample_space_die = {1, 2, 3, 4, 5, 6}

# Sample space for flipping a coin
sample_space_coin = {'Heads', 'Tails'}

print(f"Sample space (Die): {sample_space_die}")
print(f"Sample space (Coin): {sample_space_coin}")
Sample space (Die): {1, 2, 3, 4, 5, 6}
Sample space (Coin): {'Heads', 'Tails'}
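
A countably infinite sample space, such as the "flip until the first Head" example, cannot be stored as a complete Python set, but it can be enumerated lazily. The following is a minimal sketch; the generator name `first_head_outcomes` is an illustrative choice, not something defined elsewhere in this chapter.

```python
from itertools import count, islice

def first_head_outcomes():
    """Lazily yield H, TH, TTH, ... (flip until the first Head)."""
    for n in count(1):              # n = number of flips needed
        yield "T" * (n - 1) + "H"   # n - 1 Tails followed by one Head

# Take only the first five outcomes; the full sample space is never built
first_five = list(islice(first_head_outcomes(), 5))
print(first_five)
```

Because each outcome is produced from the positive integer `n`, the generator also makes the "countable" pairing between outcomes and integers explicit.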

Events as Subsets

An event is any subset of the sample space. It represents a specific outcome or a collection of outcomes of interest. Events are usually denoted by capital letters (A, B, E, etc.).

Python Representation: Events, being subsets, can also be represented using Python sets.

Python Implementation
# Continuing the die roll example
S = {1, 2, 3, 4, 5, 6}

# Event A: Rolling an even number
A = {2, 4, 6}

# Event B: Rolling a number greater than 4
B = {5, 6}

# Check if they are subsets of S
print(f"Is A a subset of S? {A.issubset(S)}")
print(f"Is B a subset of S? {B.issubset(S)}")
print(f"Event A: {A}")
print(f"Event B: {B}")
Is A a subset of S? True
Is B a subset of S? True
Event A: {2, 4, 6}
Event B: {5, 6}
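
Events are often described by a condition ("even", "greater than 4") rather than by listing outcomes. As a small sketch, a set comprehension can build the same events A and B directly from the sample space:

```python
S = {1, 2, 3, 4, 5, 6}

# Build each event by filtering the sample space with a condition
A = {x for x in S if x % 2 == 0}   # even rolls
B = {x for x in S if x > 4}        # rolls greater than 4

print(f"Event A: {A}")
print(f"Event B: {B}")
print(f"Both are subsets of S: {A <= S and B <= S}")  # <= is the operator form of issubset
```

Building events this way guarantees they are subsets of S, since every element is drawn from S.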

Set Theory Refresher

Since events are sets, the language and operations of set theory are fundamental to probability. Let A and B be two events in a sample space S.

  1. Union ($A \cup B$): The set of outcomes that are in A, or in B, or in both. Corresponds to the logical ‘OR’.

    • Example: For the die roll, $A \cup B$ = “Rolling an even number OR a number greater than 4” = $\{2, 4, 5, 6\}$.

  2. Intersection ($A \cap B$): The set of outcomes that are in both A and B. Corresponds to the logical ‘AND’.

    • Example: For the die roll, $A \cap B$ = “Rolling an even number AND a number greater than 4” = $\{6\}$.

  3. Complement ($A'$ or $A^c$): The set of outcomes in the sample space S that are not in A. Corresponds to the logical ‘NOT’.

    • Example: For the die roll, $A'$ = “NOT rolling an even number” = “Rolling an odd number” = $\{1, 3, 5\}$.

Disjoint Events: Two events A and B are disjoint or mutually exclusive if they have no outcomes in common, i.e., their intersection is the empty set ($A \cap B = \emptyset$).

Venn Diagrams: These are useful visual aids. The sample space S is represented by a rectangle, and events are represented by circles or shapes within it. Overlapping areas show intersections, and the area outside a circle represents its complement.

(We won’t draw Venn diagrams directly in code here, but libraries like matplotlib_venn can be used for this. Conceptually, imagine S as a box containing numbers 1-6. Circle A encloses 2, 4, 6. Circle B encloses 5, 6. The overlap contains only 6. The area outside A contains 1, 3, 5.)

Python Set Operations:

Python Implementation
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # Even numbers
B = {5, 6}     # Numbers > 4
C = {1, 3, 5}  # Odd numbers (A's complement)

# Union (A or B)
union_AB = A.union(B)
print(f"A union B: {union_AB}") # Corresponds to {2, 4, 5, 6}

# Intersection (A and B)
intersection_AB = A.intersection(B)
print(f"A intersection B: {intersection_AB}") # Corresponds to {6}

# Complement of A (relative to S)
complement_A = S.difference(A)
print(f"Complement of A: {complement_A}") # Corresponds to {1, 3, 5}
print(f"Is complement of A equal to C? {complement_A == C}")

# Check for disjoint events (A and C)
intersection_AC = A.intersection(C)
print(f"A intersection C: {intersection_AC}") # Empty set; Python prints it as set(), since {} is an empty dict
print(f"Are A and C disjoint? {intersection_AC == set()}") # Empty set means disjoint
A union B: {2, 4, 5, 6}
A intersection B: {6}
Complement of A: {1, 3, 5}
Is complement of A equal to C? True
A intersection C: set()
Are A and C disjoint? True
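
Python sets also support operator forms of these methods (`|`, `&`, `-`) and a direct `isdisjoint` check. The sketch below uses them to confirm De Morgan's laws on the die example; the `complement` helper is a hypothetical convenience function introduced only for this illustration.

```python
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even numbers
B = {5, 6}      # numbers > 4

def complement(event, sample_space=S):
    """Outcomes of sample_space that are not in event."""
    return sample_space - event

union_AB = A | B           # same as A.union(B)
intersection_AB = A & B    # same as A.intersection(B)
a_and_not_a_disjoint = A.isdisjoint(complement(A))

# De Morgan's laws: (A or B)' = A' and B';  (A and B)' = A' or B'
law1 = complement(A | B) == complement(A) & complement(B)
law2 = complement(A & B) == complement(A) | complement(B)

print(union_AB, intersection_AB)
print(a_and_not_a_disjoint, law1, law2)
```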

Axioms of Probability

The entire structure of probability theory is built upon three fundamental axioms, proposed by Andrey Kolmogorov. Let S be a sample space, and P(A) denote the probability of an event A.

  1. Non-negativity: For any event A, the probability of A is greater than or equal to zero.

    $$P(A) \ge 0$$

    Probabilities cannot be negative.

  2. Normalization: The probability of the entire sample space S is equal to 1.

    $$P(S) = 1$$

    This means that some outcome within the realm of possibility must occur. The maximum possible probability is 1.

  3. Additivity for Disjoint Events: If $A_1, A_2, A_3, \ldots$ is a sequence of mutually exclusive (disjoint) events (i.e., $A_i \cap A_j = \emptyset$ for all $i \ne j$), then the probability of their union is the sum of their individual probabilities:

    $$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$$

    For a finite number of disjoint events, say A and B, this simplifies to: If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$.
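
For a finite sample space, the axioms can be checked mechanically once a probability is assigned to each outcome. The sketch below is illustrative only: the function names `satisfies_axioms` and `prob`, and the "loaded die" numbers, are assumptions made for this example. Exact fractions are used to avoid floating-point rounding in the comparisons.

```python
from fractions import Fraction

def satisfies_axioms(assignment):
    """Check Axiom 1 (no negative values) and Axiom 2 (total equals 1)."""
    values = assignment.values()
    return all(p >= 0 for p in values) and sum(values) == 1

def prob(event, assignment):
    """P(event): add up outcome probabilities, as Axiom 3 permits for disjoint outcomes."""
    return sum((assignment[o] for o in event), Fraction(0))

# Hypothetical loaded die: face 6 is three times as likely as each other face
loaded = {face: Fraction(1, 8) for face in range(1, 6)}
loaded[6] = Fraction(3, 8)

# An invalid assignment: the values add up to 5/4
bad = {"H": Fraction(3, 4), "T": Fraction(1, 2)}

print(satisfies_axioms(loaded))   # valid assignment
print(satisfies_axioms(bad))      # violates normalization
print(prob({2, 4, 6}, loaded))    # P(even) under the loaded die
```

Note that the axioms do not require outcomes to be equally likely; the loaded die is a perfectly valid probability assignment.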

Examples based on the axioms:

Consider rolling a fair six-sided die with $S = \{1, 2, 3, 4, 5, 6\}$. Assuming fairness, each outcome has probability 1/6.

Example 1 - Non-negativity (Axiom 1):

Each individual outcome has non-negative probability:

$$P(\{1\}) = 1/6 \ge 0, \quad P(\{2\}) = 1/6 \ge 0, \quad \text{etc.}$$

Example 2 - Normalization (Axiom 2):

The event “Roll < 7” equals the entire sample space: $D = \{1, 2, 3, 4, 5, 6\} = S$.

Since the individual outcomes are disjoint, we can apply Axiom 3:

$$
\begin{aligned}
P(D) = P(S) &= P(\{1\} \cup \{2\} \cup \cdots \cup \{6\}) \\
&= P(\{1\}) + P(\{2\}) + \cdots + P(\{6\}) \\
&= 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 6/6 = 1
\end{aligned}
$$

Example 3 - Impossible events:

The event “Roll > 6” is impossible: $E = \emptyset$.

To find $P(\emptyset)$, note that $S \cup \emptyset = S$ and these sets are disjoint. By Axiom 3:

$$P(S \cup \emptyset) = P(S) + P(\emptyset)$$

Since $P(S) = 1$ (Axiom 2):

$$1 = 1 + P(\emptyset)$$

$$\therefore P(\emptyset) = 0$$

This shows that the probability of any impossible event is 0.
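
The three examples above can be reproduced in a few lines of Python. This is a sketch under the fair-die assumption (every outcome equally likely); the helper name `prob` is an illustrative choice, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

S = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """P(event) for equally likely outcomes: |event| / |S|."""
    return Fraction(len(event & S), len(S))

# Example 1: every single-outcome event has non-negative probability
singletons = [prob({face}) for face in S]
print(all(p >= 0 for p in singletons))

# Example 2: the six disjoint single-outcome probabilities add up to P(S) = 1
print(sum(singletons) == prob(S) == 1)

# Example 3: "Roll > 6" contains no outcomes, so its probability is 0
roll_gt_6 = {x for x in S if x > 6}
print(roll_gt_6 == set(), prob(roll_gt_6))
```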

Basic Probability Rules

Several useful rules can be derived directly from the axioms:

  1. Probability Range: For any event A:

    $$0 \le P(A) \le 1$$

    (The lower bound is Axiom 1. For the upper bound, $A \subseteq S$ means $A$ and $A'$ are disjoint with union $S$, so $P(A) + P(A') = P(S) = 1$ by Axioms 2 and 3, and $P(A') \ge 0$ by Axiom 1.)

  2. Complement Rule: The probability that event A does not occur is 1 minus the probability that it does occur.

    $$P(A') = 1 - P(A)$$

    • Derivation: A and A’ are disjoint ($A \cap A' = \emptyset$) and their union is the entire sample space ($A \cup A' = S$). By Axiom 3, $P(A \cup A') = P(A) + P(A')$. By Axiom 2, $P(S) = 1$. Therefore, $P(A) + P(A') = 1$, which rearranges to the rule.

    • Example: What is the probability of not rolling a 6? Let $A = \{6\}$, so $P(A) = 1/6$. $A'$ = “not rolling a 6” = $\{1, 2, 3, 4, 5\}$. $P(A') = 1 - P(A) = 1 - 1/6 = 5/6$.

  3. Addition Rule (General): For any two events A and B (not necessarily disjoint), the probability that A or B (or both) occurs is:

    $$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

    • Intuition: If we simply add P(A) and P(B), we have double-counted the probability of the outcomes that are in both A and B (the intersection). So, we subtract $P(A \cap B)$ to correct for this. If A and B are disjoint, $A \cap B = \emptyset$ and $P(A \cap B) = 0$, which reduces this rule to Axiom 3 for two events.

    • Example: What is the probability of rolling an even number ($A = \{2, 4, 6\}$) or a number greater than 4 ($B = \{5, 6\}$)? Here $P(A) = 3/6 = 1/2$ and $P(B) = 2/6 = 1/3$. The intersection is $A \cap B = \{6\}$, so $P(A \cap B) = 1/6$. Using the Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 3/6 + 2/6 - 1/6 = 4/6 = 2/3$. Let’s check against the outcomes in $A \cup B = \{2, 4, 5, 6\}$. There are 4 outcomes, each with probability 1/6, so the total probability is indeed $4 \times (1/6) = 4/6 = 2/3$. It works!
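
The Complement Rule and the Addition Rule can be verified exactly for the die example. The following sketch assumes a fair die and uses a small illustrative helper, `prob`, that counts outcomes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number
B = {5, 6}      # number greater than 4

def prob(event):
    """P(event) for a fair die: favorable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

# Complement Rule: P(not 6) = 1 - P(6)
complement_rule_holds = prob(S - {6}) == 1 - prob({6})

# Addition Rule: direct count of A union B versus the formula
direct = prob(A | B)
by_formula = prob(A) + prob(B) - prob(A & B)

print(f"Complement rule holds: {complement_rule_holds}")
print(f"P(A or B) directly: {direct}, by formula: {by_formula}")
```

Dropping the `- prob(A & B)` term would give 5/6 instead of 2/3, which is the double-counting the rule corrects for.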

Hands-on Python Practice

Let’s use Python to solidify these concepts through simulation. We often don’t know the theoretical probabilities beforehand, or the situation is too complex to calculate. Simulation allows us to estimate probabilities by running the experiment many times and observing the outcomes. This estimated probability is called the empirical probability.

Empirical Probability:

$$P_{\text{empirical}}(A) = \frac{\text{Number of times event } A \text{ occurred}}{\text{Total number of trials}}$$

The Law of Large Numbers (which we’ll study later) tells us that as the number of trials increases, the empirical probability converges to the true theoretical probability.
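
To get a feel for this convergence, the sketch below tracks the running empirical probability of "roll > 4" as the number of simulated rolls grows. It uses NumPy's `default_rng` with an arbitrarily chosen seed so the run is repeatable; the seed and the sample sizes are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)       # arbitrary seed, for repeatability
rolls = rng.integers(1, 7, size=100_000)   # integers from 1 to 6 inclusive

# Running estimate after each roll: cumulative successes / rolls so far
is_gt_4 = rolls > 4
running_estimate = np.cumsum(is_gt_4) / np.arange(1, len(rolls) + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7,}: estimate = {running_estimate[n - 1]:.4f}")
print(f"Theoretical value: {1/3:.4f}")
```

Early estimates can be far from 1/3, while estimates based on many rolls tend to sit close to it.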

Setup: We’ll need NumPy for efficient random number generation and Matplotlib for plotting.

Python Implementation
import numpy as np
import matplotlib.pyplot as plt

# Configure plots for better readability
plt.style.use('seaborn-v0_8-whitegrid')

With our libraries imported, we can now work through several examples demonstrating how Python helps us understand probability concepts.

1. Representing Sample Spaces and Events

We’ve already seen how to use Python sets. Let’s reiterate for a coin flip.

Python Implementation
# Sample Space
S_coin = {'H', 'T'} # Using H for Heads, T for Tails

# Events
E_heads = {'H'}
E_tails = {'T'}

print(f"Sample Space (Coin): {S_coin}")
print(f"Event (Heads): {E_heads}")
print(f"Is Heads an event in S_coin? {E_heads.issubset(S_coin)}")
Sample Space (Coin): {'T', 'H'}
Event (Heads): {'H'}
Is Heads an event in S_coin? True

2. Simulating Simple Experiments

Let’s simulate rolling a fair six-sided die many times.

Python Implementation
import numpy as np

# Simulate 1000 dice rolls
num_rolls = 1000
rolls = np.random.randint(1, 7, size=num_rolls) # Generate random integers between 1 (inclusive) and 7 (exclusive)

# Display the first 20 rolls
print(f"First 20 rolls: {rolls[:20]}")
print(f"Total rolls simulated: {len(rolls)}")
First 20 rolls: [2 2 2 6 4 4 4 6 5 5 1 6 3 3 1 3 2 6 4 6]
Total rolls simulated: 1000

3. Calculating Empirical Probabilities

Example: What is the empirical probability of rolling a number greater than 4?

Theoretical answer: Event $B = \{5, 6\}$. $P(B) = 2/6 = 1/3 \approx 0.333$.

Python Implementation
# Define the event B: rolling > 4
# We can count how many rolls satisfy this condition
outcomes_greater_than_4 = rolls[rolls > 4]
num_success = len(outcomes_greater_than_4)

# Calculate empirical probability
empirical_prob_B = num_success / num_rolls

print(f"Number of rolls > 4: {num_success}")
print(f"Total rolls: {num_rolls}")
print(f"Empirical P(Roll > 4): {empirical_prob_B:.4f}")
print(f"Theoretical P(Roll > 4): {1/3:.4f}")
Number of rolls > 4: 324
Total rolls: 1000
Empirical P(Roll > 4): 0.3240
Theoretical P(Roll > 4): 0.3333

Note: The rolls variable used here comes from the dice roll simulation above. Try running the simulation and this calculation multiple times—you’ll notice the empirical probability fluctuates slightly but remains close to the theoretical value of 1/3, especially with a large num_rolls.

Example: What is the empirical probability of rolling an even number?

Theoretical answer: Event $A = \{2, 4, 6\}$. $P(A) = 3/6 = 0.5$.

Python Implementation
# Event A: rolling an even number
# An outcome is even if outcome % 2 == 0
outcomes_even = rolls[rolls % 2 == 0]
num_even = len(outcomes_even)

# Calculate empirical probability
empirical_prob_A = num_even / num_rolls

print(f"Number of even rolls: {num_even}")
print(f"Total rolls: {num_rolls}")
print(f"Empirical P(Roll is Even): {empirical_prob_A:.4f}")
print(f"Theoretical P(Roll is Even): {0.5:.4f}")
Number of even rolls: 473
Total rolls: 1000
Empirical P(Roll is Even): 0.4730
Theoretical P(Roll is Even): 0.5000
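
A compact alternative is to take the mean of a boolean mask, since `True` counts as 1 and `False` as 0. The sketch below does this for both events and also checks that the Addition Rule holds for the empirical probabilities. It generates its own rolls with a seeded generator (an assumption made so the example is self-contained), so its numbers will differ from the output above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # arbitrary seed, for repeatability
rolls = rng.integers(1, 7, size=1000)

is_even = rolls % 2 == 0   # event A
is_gt_4 = rolls > 4        # event B

p_A = is_even.mean()                   # fraction of rolls in A
p_B = is_gt_4.mean()                   # fraction of rolls in B
p_A_and_B = (is_even & is_gt_4).mean() # fraction in both
p_A_or_B = (is_even | is_gt_4).mean()  # fraction in at least one

print(f"Empirical P(A or B):          {p_A_or_B:.4f}")
print(f"P(A) + P(B) - P(A and B):     {p_A + p_B - p_A_and_B:.4f}")
```

The two printed values agree because the Addition Rule is a counting identity: it holds for observed frequencies as well as for theoretical probabilities.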

4. Visualizing Events and Outcomes

We can use a bar chart of outcome frequencies to visualize the distribution of outcomes from our simulation.

Python Implementation
import matplotlib.pyplot as plt

# Plotting the frequency of each outcome
plt.figure(figsize=(8, 5))

# Count occurrences of each die face
faces = [1, 2, 3, 4, 5, 6]
counts = [np.sum(rolls == face) for face in faces]

# Create bar plot
plt.bar(faces, counts, color='skyblue', edgecolor='black', alpha=0.7)
plt.title(f'Frequency of Outcomes for {num_rolls} Die Rolls')
plt.xlabel('Die Face')
plt.ylabel('Frequency Count')
plt.xticks(faces)

# Add a line for the expected frequency for a fair die
expected_frequency = num_rolls / 6
plt.axhline(expected_frequency, color='red', linestyle='--', label=f'Expected Frequency ({expected_frequency:.1f})')
plt.legend()

plt.show()
<Figure size 800x500 with 1 Axes>

The bar chart shows the counts for each outcome (1 through 6). For a fair die and a large number of rolls, we expect the bars to be roughly the same height, close to the theoretical expected frequency (total rolls / 6).
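
The per-face counts plotted above can also be computed in one call and converted to relative frequencies, which are easier to compare with the theoretical value of 1/6. This sketch uses `np.bincount` and its own seeded rolls (an assumption made so it runs on its own).

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed, for repeatability
rolls = rng.integers(1, 7, size=1000)

# bincount tallies values 0..6; drop index 0 because no die face is 0
counts = np.bincount(rolls, minlength=7)[1:]
relative_freq = counts / counts.sum()

for face, freq in enumerate(relative_freq, start=1):
    print(f"Face {face}: relative frequency {freq:.3f} (theoretical {1/6:.3f})")
```

Relative frequencies always sum to 1, mirroring the normalization axiom.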

We can also visualize the outcomes that constitute a specific event. For instance, let’s highlight the rolls that were greater than 4.

Python Implementation
# Create a boolean mask for the event
event_mask_B = rolls > 4 # True if roll > 4, False otherwise

# Simple textual visualization: show rolls and whether they met the condition
print("Roll | > 4?")
print("----|----")
for i in range(15): # Show first 15
    print(f"{rolls[i]:<4}| {event_mask_B[i]}")

# Highlight on a plot (can be more complex, here just re-emphasize the count)
print(f"\nFrom the simulation:")
print(f"- The event 'Roll > 4' (outcomes {{5, 6}}) occurred {num_success} times.") # doubled braces print a literal {5, 6}
print(f"- The empirical probability is {empirical_prob_B:.4f}")
Roll | > 4?
----|----
2   | False
2   | False
2   | False
6   | True
4   | False
4   | False
4   | False
6   | True
5   | True
5   | True
1   | False
6   | True
3   | False
3   | False
1   | False

From the simulation:
- The event 'Roll > 4' (outcomes {5, 6}) occurred 324 times.
- The empirical probability is 0.3240

This visualization shows which specific outcomes from our simulation satisfy the event condition (rolling > 4). It demonstrates how we can programmatically filter and analyze events from our simulated data.

Chapter Summary

This chapter introduced the basic language of probability theory, grounded in set theory.

Understanding this vocabulary and these basic rules is essential. In the next chapter, we will build upon this foundation by learning systematic ways to count the number of outcomes in sample spaces and events, which is often necessary for calculating theoretical probabilities, especially when outcomes are equally likely.

Exercises

  1. Set Operations and Events: In a survey of 100 students:

    • 60 students study Mathematics (M)

    • 50 students study Physics (P)

    • 30 students study both Mathematics and Physics

    a) How many students study Mathematics or Physics (or both)?
    b) How many students study neither Mathematics nor Physics?
    c) How many students study exactly one of the two subjects?

  2. Probability Axioms: A box contains 5 red balls, 3 blue balls, and 2 green balls. You randomly select one ball.

    a) What is the probability of selecting a red ball?
    b) What is the probability of not selecting a red ball?
    c) What is the probability of selecting a red or blue ball?
    d) Verify that the probabilities of all possible outcomes sum to 1.

  3. Addition Rule: A fair six-sided die is rolled. Let $A$ be the event “roll an even number” and $B$ be the event “roll a number greater than 3”.

    a) List the outcomes in events $A$, $B$, and $A \cap B$.
    b) Calculate $P(A)$, $P(B)$, and $P(A \cap B)$.
    c) Use the Addition Rule to find $P(A \cup B)$.
    d) Verify your answer by directly counting favorable outcomes.

  4. De Morgan’s Laws: In a deck of 52 playing cards, let:

    • $H$ = event that the card is a Heart

    • $F$ = event that the card is a face card (Jack, Queen, King)

    Find the probability that a randomly drawn card is:
    a) Neither a Heart nor a face card (use De Morgan’s Law: $(H \cup F)^c = H^c \cap F^c$).
    b) Verify your answer by calculating directly.

  5. Mutually Exclusive vs. Independent: Consider rolling a fair six-sided die.

    • Let $A$ = “roll a 2”

    • Let $B$ = “roll an odd number”

    a) Are events $A$ and $B$ mutually exclusive? Explain.
    b) Calculate $P(A)$, $P(B)$, and $P(A \cap B)$.
    c) If events were independent, what would $P(A \cap B)$ equal? Compare with your answer in (b).