A graphical model that represents the probabilistic relationships among a set of random variables. Uses a directed acyclic graph.
Provides a compact and intuitive representation of joint probability distributions using conditional independence and causal structure.
Node
Represents a random variable.
Each node has a local CPD.
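The overall joint probability of all variables is the product of the local CPDs:
$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parents}(X_i))$$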
Edge
Directed links. A link from node $X$ to node $Y$ means that $X$ has a direct influence on $Y$.
Local Conditional Probability Distribution
Aka. CPD. A function defining the probability of a variable given its parents.
For a node $X_i$ with parents $\mathrm{Parents}(X_i)$, the CPD is:
$$P(X_i \mid \mathrm{Parents}(X_i))$$
Specifies the relationship between a node and its parent nodes. Can be continuous or discrete. For discrete variables, it is represented as a conditional probability table (CPT).
A CPT for a boolean variable with $k$ boolean parents has $2^k$ entries, one probability per combination of parent values.
Joint Probability Distribution
Aka. JPD. The full joint probability for all variables in the network is the product of local conditional probabilities:
$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parents}(X_i))$$
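Example, for a four-variable network with edges $A \rightarrow B$, $A \rightarrow C$, $B \rightarrow D$ and $C \rightarrow D$:
$$P(A, B, C, D) = P(A)\,P(B \mid A)\,P(C \mid A)\,P(D \mid B, C)$$
Here:
- $P(A)$ is used because $A$ has no parents.
- $P(D \mid B, C)$ because $D$ depends on both $B$ and $C$.
- $P(B \mid A)$ and $P(C \mid A)$ because $B$ and $C$ depend only on $A$.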
This decomposition avoids the need to represent all combinations explicitly.
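The factorization can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes a hypothetical boolean network with edges A → B, A → C, B → D and C → D, and every CPT value below is made up.

```python
from itertools import product

# Hypothetical structure: each variable maps to its tuple of parents.
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C")}

# cpt[var][parent values] = P(var = True | parents). All numbers are invented.
cpt = {
    "A": {(): 0.5},
    "B": {(True,): 0.1, (False,): 0.5},
    "C": {(True,): 0.8, (False,): 0.2},
    "D": {(True, True): 0.99, (True, False): 0.9,
          (False, True): 0.9, (False, False): 0.0},
}

def joint(assignment):
    """P(assignment) as the product of local conditional probabilities."""
    p = 1.0
    for var, pars in parents.items():
        p_true = cpt[var][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

# All 16 joint entries are generated from only 1 + 2 + 2 + 4 = 9 stored numbers.
total = sum(joint(dict(zip(parents, values)))
            for values in product((True, False), repeat=len(parents)))
```

Summing `joint` over every assignment should give 1 up to floating-point error, which is a quick sanity check that the CPTs define a valid distribution.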
Inference
The goal is computing probabilities of interest given evidence.
By Enumeration
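$$P(X \mid e) = \alpha \, P(X, e) = \alpha \sum_{y} P(X, e, y)$$
where $\alpha$ is a normalization constant ensuring probabilities sum to 1, $e$ is the observed evidence, and $y$ ranges over the values of the hidden variables.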
This method sums over hidden (unobserved) variables to find the posterior probability.
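Inference by enumeration can be sketched as follows. The network (edges A → B, A → C, B → D, C → D), its CPT values, and the function name `enumerate_query` are all hypothetical and chosen only for illustration. The approach is exponential in the number of hidden variables, so it only suits small networks.

```python
from itertools import product

# Hypothetical boolean network; all probabilities are invented.
parents = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C")}
cpt = {
    "A": {(): 0.5},
    "B": {(True,): 0.1, (False,): 0.5},
    "C": {(True,): 0.8, (False,): 0.2},
    "D": {(True, True): 0.99, (True, False): 0.9,
          (False, True): 0.9, (False, False): 0.0},
}

def joint(assignment):
    """Joint probability of a full assignment via the chain of local CPTs."""
    p = 1.0
    for var, pars in parents.items():
        p_true = cpt[var][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

def enumerate_query(query, evidence):
    """Return P(query = True | evidence) by summing out the hidden variables."""
    hidden = [v for v in parents if v != query and v not in evidence]
    unnormalized = {}
    for q_val in (True, False):
        total = 0.0
        for values in product((True, False), repeat=len(hidden)):
            assignment = {**evidence, query: q_val, **dict(zip(hidden, values))}
            total += joint(assignment)
        unnormalized[q_val] = total
    # alpha rescales the two unnormalized sums so that they add up to 1.
    alpha = 1.0 / (unnormalized[True] + unnormalized[False])
    return alpha * unnormalized[True]

posterior = enumerate_query("A", {"D": True})
```

With no evidence, `enumerate_query("A", {})` simply recovers the prior of A.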
Conditional Independence
Patterns
| Relationship Type | Structure | Independence Property |
|---|---|---|
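| Common cause | $A \leftarrow C \rightarrow B$ | $A$ and $B$ independent given $C$ |
| Common effect | $A \rightarrow C \leftarrow B$ | $A$ and $B$ dependent given $C$ |
| Causal chain | $A \rightarrow B \rightarrow C$ | $A$ and $C$ independent given $B$ |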
Structure
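- A node is conditionally independent of its non-descendants (including its ancestors) given its parents.
- Siblings are conditionally independent given their common parent, provided no other path connects them.
- Parents are generally not conditionally independent given a child: observing the child makes them dependent ("explaining away").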
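The last property can be checked numerically. The sketch below assumes a hypothetical common-effect network X → Z ← Y with invented probabilities, and compares the belief in X under different evidence.

```python
from itertools import product

# Hypothetical priors and CPT for the common effect Z; all values are invented.
p_x, p_y = 0.1, 0.2
p_z = {(True, True): 0.95, (True, False): 0.9,
       (False, True): 0.8, (False, False): 0.01}

def joint(x, y, z):
    """P(x, y, z) = P(x) P(y) P(z | x, y)."""
    px = p_x if x else 1 - p_x
    py = p_y if y else 1 - p_y
    pz = p_z[(x, y)] if z else 1 - p_z[(x, y)]
    return px * py * pz

def prob_x_given(**evidence):
    """P(X = True | evidence), where evidence may fix y and/or z."""
    num = den = 0.0
    for x, y, z in product((True, False), repeat=3):
        world = {"x": x, "y": y, "z": z}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # skip worlds that contradict the evidence
        p = joint(x, y, z)
        den += p
        if x:
            num += p
    return num / den

prior = prob_x_given()                    # P(X)
given_y = prob_x_given(y=True)            # same as the prior: X and Y are marginally independent
given_z = prob_x_given(z=True)            # observing the child raises the belief in X
given_zy = prob_x_given(z=True, y=True)   # Y already explains Z, so the belief in X drops again
```

The gap between `given_z` and `given_zy` shows that X and Y become dependent once Z is observed.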
D-Separation
Used to test conditional independences in a Bayesian network from the graph structure alone.
Steps:
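- List all paths between the 2 variables (ignore arrow direction).
- For each path, identify the type of connection at each intermediate node:
  - Chain: $A \rightarrow B \rightarrow C$ or $A \leftarrow B \leftarrow C$
  - Fork: $A \leftarrow B \rightarrow C$
  - Collider: $A \rightarrow B \leftarrow C$
- Apply the blocking rules:
  - For a chain or fork: the path is blocked if the middle node is conditioned on.
  - For a collider: the path is blocked if neither the collider nor any of its descendants is conditioned on.
- The 2 variables are d-separated, and therefore conditionally independent given the conditioning set, if and only if every path between them is blocked.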
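The steps above can be sketched directly in Python. This is a naive path-enumeration version meant for small graphs (the number of paths can grow exponentially), and it is not an optimized algorithm. The graph encoding (a dict mapping each node to its children) and all function names are assumptions made for this example.

```python
def descendants(graph, node):
    """All nodes reachable from `node` by following directed edges."""
    seen, stack = set(), list(graph.get(node, ()))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(graph.get(n, ()))
    return seen

def undirected_paths(graph, start, goal):
    """Yield every simple path from start to goal, ignoring arrow direction."""
    neighbours = {}
    for u, children in graph.items():
        for v in children:
            neighbours.setdefault(u, set()).add(v)
            neighbours.setdefault(v, set()).add(u)

    def walk(path):
        if path[-1] == goal:
            yield path
            return
        for nxt in neighbours.get(path[-1], ()):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])

def path_blocked(graph, path, given):
    """Apply the blocking rules to each intermediate node of one path."""
    for a, m, b in zip(path, path[1:], path[2:]):
        is_collider = m in graph.get(a, ()) and m in graph.get(b, ())
        if is_collider:
            # Blocked unless the collider or one of its descendants is observed.
            if m not in given and not (descendants(graph, m) & given):
                return True
        elif m in given:
            # Chain or fork: blocked when the middle node is observed.
            return True
    return False

def d_separated(graph, x, y, given=()):
    """True if every path between x and y is blocked by the set `given`."""
    given = set(given)
    return all(path_blocked(graph, p, given)
               for p in undirected_paths(graph, x, y))

# Hypothetical graph: A -> B, A -> C, B -> D, C -> D, D -> E.
g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
```

In this example graph, `d_separated(g, "B", "C", {"A"})` holds, while additionally conditioning on the collider `D` or on its descendant `E` opens the path through `D`.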
Compactness and Efficiency
For $n$ boolean variables, if every node has at most $k$ parents, total storage = $O(n \cdot 2^k)$. This is linear in $n$, compared to $O(2^n)$ for the full joint distribution.
More compact than full joint tables.
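A quick calculation makes the gap concrete. The helper names and the choice of 30 boolean variables with at most 5 parents each are illustrative assumptions.

```python
def network_entries(n, k):
    """Upper bound on stored numbers for n boolean nodes with at most k parents each."""
    return n * 2 ** k

def full_joint_entries(n):
    """Number of entries in the full joint table over n boolean variables."""
    return 2 ** n

compact = network_entries(30, 5)   # 30 * 32 = 960
full = full_joint_entries(30)      # 2 ** 30 = 1073741824
```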
Constructing a Bayesian Network
- Choose variables relevant to the domain.
- Decide an ordering of variables (causes before effects preferred).
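- For each variable $X_i$, in that order:
  - Add a node for $X_i$.
  - Select a minimal set of parents from the preceding variables such that $X_i$ is conditionally independent of its other predecessors given those parents.
  - Define the CPT for $X_i$.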
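The bookkeeping side of this procedure can be sketched as a small container that adds nodes in the chosen order. The class `BayesNet`, its method names, and the example variables are hypothetical. Choosing the minimal parent set still requires domain knowledge and is not automated here.

```python
from itertools import product

class BayesNet:
    """Hypothetical container for a boolean Bayesian network built node by node."""

    def __init__(self):
        self.parents = {}  # variable -> tuple of parent names
        self.cpt = {}      # variable -> {parent values: P(variable = True | parents)}

    def add(self, var, parents, cpt):
        # Parents must already exist, so every edge points from an earlier
        # variable to a later one and the graph cannot contain a cycle.
        missing = [p for p in parents if p not in self.parents]
        if missing:
            raise ValueError(f"{var}: parents {missing} must be added first")
        # A boolean node with k boolean parents needs one row per parent combination.
        expected = set(product((True, False), repeat=len(parents)))
        if set(cpt) != expected:
            raise ValueError(f"{var}: CPT must have exactly {len(expected)} rows")
        if not all(0.0 <= p <= 1.0 for p in cpt.values()):
            raise ValueError(f"{var}: probabilities must lie in [0, 1]")
        self.parents[var] = tuple(parents)
        self.cpt[var] = dict(cpt)

# Invented two-node example: causes are added before their effects.
net = BayesNet()
net.add("Cause", (), {(): 0.3})
net.add("Effect", ("Cause",), {(True,): 0.9, (False,): 0.2})
```

Requiring parents to be present before a node is added is one simple way to enforce the variable ordering, and it guarantees the result is a DAG.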