Create a Bayesian Network and Learn Parameters with Python 3.x

Introduction

Bayesian networks model uncertain relationships with a directed acyclic graph (DAG) and conditional probability tables (CPTs, which pgmpy calls CPDs). In Python, a common workflow is to define the graph structure first and then learn the probability tables from a pandas DataFrame.

That means there are really two separate tasks: structure definition and parameter learning. Many examples blur them together, but keeping them separate makes the process much easier to understand.

What a Bayesian Network Contains

A Bayesian network has:

nodes for random variables
directed edges for conditional dependencies
conditional probability tables for each node

For example, a small weather-style model might say:

Cloudy influences Rain
Rain influences WetGrass

That graph structure says which variables depend on which others. The learned parameters then tell you the actual probabilities.
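To make the split between structure and parameters concrete, here is a minimal pure-Python sketch that uses no library. The probability values are made up for illustration and are not learned from any data, and the names `parents`, `p_true`, and `joint_probability` are hypothetical.

```python
from itertools import product

# Structure: each node maps to the tuple of its parents.
parents = {
    "Cloudy": (),
    "Rain": ("Cloudy",),
    "WetGrass": ("Rain",),
}

# Parameters: for each node, P(node = 1 | parent values).
# These numbers are illustrative placeholders, not learned values.
p_true = {
    "Cloudy": {(): 0.5},
    "Rain": {(0,): 0.1, (1,): 0.7},
    "WetGrass": {(0,): 0.2, (1,): 0.9},
}

def joint_probability(assignment):
    """Chain rule for a Bayesian network: multiply each node's
    probability given the values of its parents."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[p] for p in node_parents)
        p1 = p_true[node][parent_values]
        prob *= p1 if assignment[node] == 1 else 1.0 - p1
    return prob

# The joint distribution over all 2**3 assignments should sum to 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
print(total)
```

The graph decides which factors appear in the product; the tables decide what each factor evaluates to.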

Defining a Network in Python

Using pgmpy, you can define the structure explicitly:

```python
from pgmpy.models import DiscreteBayesianNetwork

# Each tuple is a directed edge (parent, child).
# Older pgmpy releases expose this class as BayesianNetwork.
model = DiscreteBayesianNetwork([
    ("Cloudy", "Rain"),
    ("Rain", "WetGrass"),
])
```

This creates the graph only. At this point, the model knows the dependency directions but does not yet know the probability values.

Learning Parameters from Data

Suppose you have discrete training data:

```python
import pandas as pd
from pgmpy.models import DiscreteBayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator

# One row per observation, one column per network node.
data = pd.DataFrame({
    "Cloudy": [0, 0, 1, 1, 1, 0, 1, 0],
    "Rain": [0, 0, 1, 1, 0, 0, 1, 0],
    "WetGrass": [0, 0, 1, 1, 1, 0, 1, 0],
})

model = DiscreteBayesianNetwork([
    ("Cloudy", "Rain"),
    ("Rain", "WetGrass"),
])

# Estimate one CPD per node from the data.
model.fit(data, estimator=MaximumLikelihoodEstimator)

for cpd in model.get_cpds():
    print(cpd)
```

With MaximumLikelihoodEstimator, fit(...) sets each entry of a conditional probability table to the relative frequency observed in data for that combination of node state and parent states.
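The frequency counting behind maximum likelihood can be reproduced by hand. The following is a minimal sketch in plain Python, not pgmpy's implementation; the helper name `mle_cpt` is hypothetical, and the two lists repeat the Cloudy and Rain columns from the training data.

```python
from collections import Counter

# Same toy columns as the training data in the example.
cloudy = [0, 0, 1, 1, 1, 0, 1, 0]
rain = [0, 0, 1, 1, 0, 0, 1, 0]

def mle_cpt(child, parent):
    """Return {(parent_value, child_value): P(child | parent)} from counts."""
    pair_counts = Counter(zip(parent, child))
    parent_counts = Counter(parent)
    return {
        (p, c): n / parent_counts[p]
        for (p, c), n in pair_counts.items()
    }

rain_given_cloudy = mle_cpt(rain, cloudy)
for (p, c), prob in sorted(rain_given_cloudy.items()):
    print(f"P(Rain={c} | Cloudy={p}) = {prob:.2f}")
```

The pair (Cloudy=0, Rain=1) never occurs in the data, so it has no entry here; under pure counting its estimated probability is zero.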

Using Bayesian Estimation Instead of Pure Counts

Maximum likelihood is simple, but it can produce zero probabilities when some combinations are rare or absent. A Bayesian estimator smooths those tables with priors.

```python
from pgmpy.estimators import BayesianEstimator

# Reuses `model` and `data` from the previous example.
model.fit(
    data,
    estimator=BayesianEstimator,
    prior_type="BDeu",
    equivalent_sample_size=5,
)
```

This is often safer for small or sparse data sets because it avoids overconfident zero estimates.
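To see what the smoothing does numerically, here is a hand-rolled sketch assuming the textbook BDeu definition, in which the equivalent sample size is spread evenly over every cell of a node's table. It is not pgmpy's code, and the helper name `smoothed_cpt` is hypothetical.

```python
from collections import Counter

cloudy = [0, 0, 1, 1, 1, 0, 1, 0]
rain = [0, 0, 1, 1, 0, 0, 1, 0]

def smoothed_cpt(child, parent, equivalent_sample_size=5):
    """Posterior-mean estimate of P(child | parent) with a BDeu-style prior."""
    child_states = sorted(set(child))
    parent_states = sorted(set(parent))
    # Pseudo-count added to every (parent state, child state) cell.
    alpha = equivalent_sample_size / (len(child_states) * len(parent_states))
    pair_counts = Counter(zip(parent, child))
    parent_counts = Counter(parent)
    return {
        (p, c): (pair_counts[(p, c)] + alpha)
        / (parent_counts[p] + alpha * len(child_states))
        for p in parent_states
        for c in child_states
    }

cpt = smoothed_cpt(rain, cloudy)
for (p, c), prob in sorted(cpt.items()):
    print(f"P(Rain={c} | Cloudy={p}) = {prob:.3f}")
```

With these inputs the unseen combination (Cloudy=0, Rain=1) receives a small positive probability instead of zero, because each cell starts with a pseudo-count of 5 / (2 * 2) = 1.25.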

Running Inference After Learning

Once parameters are learned, you can query the network:

```python
from pgmpy.inference import VariableElimination

# `model` must already have fitted CPDs.
inference = VariableElimination(model)

# Distribution of WetGrass given that Cloudy is observed as 1.
result = inference.query(
    variables=["WetGrass"],
    evidence={"Cloudy": 1},
)

print(result)
```

Now the model can answer probabilistic questions using both the graph structure and the learned conditional tables.
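For a chain this small, the same kind of query can be checked by hand. The sketch below estimates the two conditional tables by counting and then sums out Rain, which is valid here because WetGrass depends on Cloudy only through Rain. It illustrates the idea behind the query and is not pgmpy's algorithm; the helper names `conditional` and `wet_given_cloudy` are hypothetical.

```python
from collections import Counter

data = {
    "Cloudy": [0, 0, 1, 1, 1, 0, 1, 0],
    "Rain": [0, 0, 1, 1, 0, 0, 1, 0],
    "WetGrass": [0, 0, 1, 1, 1, 0, 1, 0],
}

def conditional(child, parent):
    """Frequency estimate of P(child | parent) as a nested dict."""
    pair_counts = Counter(zip(data[parent], data[child]))
    parent_counts = Counter(data[parent])
    return {
        p: {c: pair_counts[(p, c)] / n for c in (0, 1)}
        for p, n in parent_counts.items()
    }

rain_given_cloudy = conditional("Rain", "Cloudy")
wet_given_rain = conditional("WetGrass", "Rain")

def wet_given_cloudy(cloudy_value):
    """P(WetGrass | Cloudy) obtained by summing out Rain."""
    return {
        w: sum(
            rain_given_cloudy[cloudy_value][r] * wet_given_rain[r][w]
            for r in (0, 1)
        )
        for w in (0, 1)
    }

posterior = wet_given_cloudy(1)
print(posterior)
```

With maximum likelihood tables from this toy data, P(WetGrass=1 | Cloudy=1) works out to 0.75 * 1.0 + 0.25 * 0.2, which is about 0.8.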

Data Requirements Matter

For basic discrete Bayesian networks in libraries like pgmpy, the data should usually be categorical or encoded as discrete values. Continuous variables need a different modeling strategy, discretization, or a model family designed for continuous distributions.
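As one way to discretize, a continuous column can be mapped to labelled bins before fitting. This is a minimal sketch using only the standard library; the variable `rainfall_mm`, the bin edges, the labels, and the sample readings are all hypothetical and should be replaced with choices that suit your data.

```python
from bisect import bisect_right

# Hypothetical bin edges in millimetres; values below 1.0 are "none",
# values from 1.0 up to (but not including) 10.0 are "light", the rest "heavy".
BIN_EDGES = [1.0, 10.0]
BIN_LABELS = ["none", "light", "heavy"]

def discretize(value):
    """Map a continuous reading to one of the labelled bins."""
    return BIN_LABELS[bisect_right(BIN_EDGES, value)]

# Made-up readings, for illustration only.
rainfall_mm = [0.0, 0.4, 3.2, 12.5, 10.0]
rain_level = [discretize(v) for v in rainfall_mm]
print(rain_level)
```

The resulting labels are categorical states that can be placed in a data frame column like any other discrete variable.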

If your data frame contains strings instead of integers, that is often fine as long as the states are categorical and consistent:

```python
import pandas as pd

# String-valued states work as long as each column uses them consistently.
data = pd.DataFrame({
    "Cloudy": ["yes", "no", "yes", "yes"],
    "Rain": ["yes", "no", "yes", "no"],
    "WetGrass": ["yes", "no", "yes", "yes"],
})
```

What matters is that the states are discrete and meaningful.

Structure Learning Is a Different Problem

The examples above assume you already know the graph edges. That is parameter learning on a fixed structure.

If you do not know the structure, you can attempt structure learning separately with score-based or constraint-based methods. But that is a harder problem and should not be confused with simply fitting CPDs on a known DAG.

In many real applications, the best workflow is:

define or choose the graph structure
verify it is acyclic
fit parameters from data
run inference and validate results
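The acyclicity step in this workflow can be checked before any model is built. The sketch below uses the standard library's `graphlib` module (Python 3.9 or later); the helper name `is_acyclic` is hypothetical, and the cycle in `bad_edges` is added only to show a failing case.

```python
from graphlib import CycleError, TopologicalSorter

def is_acyclic(edges):
    """Return True if the directed edges (parent, child) form a DAG."""
    predecessors = {}
    for parent, child in edges:
        predecessors.setdefault(child, set()).add(parent)
        predecessors.setdefault(parent, set())
    try:
        # Consuming static_order() raises CycleError if a directed cycle exists.
        list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True

good_edges = [("Cloudy", "Rain"), ("Rain", "WetGrass")]
bad_edges = good_edges + [("WetGrass", "Cloudy")]

print(is_acyclic(good_edges))
print(is_acyclic(bad_edges))
```

A topological order exists only when the graph has no directed cycle, so a successful sort doubles as the validity check.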

Common Pitfalls

Discrete Bayesian-network code usually expects discrete states, not raw continuous variables.
fit(...) learns parameters for a chosen graph; it does not automatically discover the structure.
Small data sets can produce brittle probability tables unless you add smoothing.
Directed cycles are invalid in Bayesian networks and must be removed before fitting.

Summary

Creating a Bayesian network involves defining a DAG of dependencies.
Parameter learning fills in the conditional probability tables from data.
In Python, pgmpy provides a practical workflow for both fitting and inference.
Use Bayesian smoothing when data is sparse to avoid brittle zero probabilities.
Keep structure learning separate from parameter learning so the workflow stays clear.