ML Lab Program - VTU

MACHINE LEARNING LABORATORY

Experiment 6

Aim: Demonstrate and analyse the result sets obtained from the Bayesian belief network principle.

Program: Write a program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set. You can use
Python ML library classes/API.

CONCEPT –

A Bayesian network is a directed acyclic graph in which each edge corresponds to a conditional
dependency, and each node corresponds to a unique random variable.

A Bayesian network consists of two major parts: a directed acyclic graph and a set of conditional
probability distributions.
• The directed acyclic graph represents the random variables as nodes and their direct dependencies as edges.
• The conditional probability distribution of a node (random variable) is defined for every
possible combination of outcomes of its parent node(s).
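
The two parts described above can be sketched in plain Python. The edge list below is copied from the program in this experiment; the helper names parents_of and is_acyclic are hypothetical and are not part of any library. The sketch only checks that the structure is acyclic and lists the parents of each node, which are the variables each conditional probability distribution is conditioned on.

```python
from collections import defaultdict, deque

# Directed edges (parent, child), copied from the program in this experiment.
EDGES = [
    ("age", "heartdisease"), ("sex", "heartdisease"),
    ("exang", "heartdisease"), ("cp", "heartdisease"),
    ("heartdisease", "restecg"), ("heartdisease", "chol"),
]


def parents_of(edges):
    """Map every node to the sorted list of its parent nodes."""
    parents = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)
        parents[parent]  # touch the key so root nodes appear with no parents
    return {node: sorted(found) for node, found in parents.items()}


def is_acyclic(edges):
    """Kahn's algorithm: the graph is a DAG if every node can be removed in topological order."""
    indegree = {node: len(found) for node, found in parents_of(edges).items()}
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed == len(indegree)


PARENTS = parents_of(EDGES)
IS_DAG = is_acyclic(EDGES)
print(IS_DAG)
print(PARENTS["heartdisease"])
```

Each node needs one conditional probability distribution over its own values for every combination of values of the parents listed for it; a node without parents needs only a prior distribution.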

For illustration, consider the following example. Suppose we attempt to turn on our computer,
but the computer does not start (observation/evidence). We would like to know which of the
possible causes of computer failure is more likely. In this simplified illustration, we assume
only two possible causes of this misfortune: electricity failure and computer malfunction.
The corresponding directed acyclic graph is depicted in the figure below.

Fig: Directed acyclic graph representing two independent possible causes of a computer failure.

The goal is to calculate the posterior conditional probability distribution of each of the possible
unobserved causes given the observed evidence, i.e. P [Cause | Evidence].
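
As a rough illustration of P[Cause | Evidence] for the computer example, the sketch below enumerates the joint distribution and applies Bayes' rule. All numbers are hypothetical assumptions chosen only for the illustration (the text above gives no probabilities), and the variable and function names are likewise invented here.

```python
from itertools import product

# Hypothetical prior probabilities of the two independent causes (assumed values).
P_ELECTRICITY_FAILURE = 0.1
P_COMPUTER_MALFUNCTION = 0.2

# Hypothetical CPD: P(computer does not start | electricity failure, malfunction).
P_NO_START = {
    (True, True): 1.0,
    (True, False): 1.0,
    (False, True): 0.5,
    (False, False): 0.0,
}


def joint(electricity_failure, malfunction):
    """P(E, M, computer does not start), using the factorisation given by the graph."""
    p_e = P_ELECTRICITY_FAILURE if electricity_failure else 1 - P_ELECTRICITY_FAILURE
    p_m = P_COMPUTER_MALFUNCTION if malfunction else 1 - P_COMPUTER_MALFUNCTION
    return p_e * p_m * P_NO_START[(electricity_failure, malfunction)]


def posterior_of_causes():
    """Return P(cause | computer does not start) for each cause by enumeration."""
    # Probability of the evidence: sum the joint over all cause combinations.
    p_evidence = sum(joint(e, m) for e, m in product([True, False], repeat=2))
    p_e = sum(joint(True, m) for m in (True, False)) / p_evidence
    p_m = sum(joint(e, True) for e in (True, False)) / p_evidence
    return {"electricity_failure": p_e, "computer_malfunction": p_m}


POSTERIOR = posterior_of_causes()
print(POSTERIOR)
```

Enumeration is practical only for very small networks; for the heart disease model the same kind of query is delegated to a library inference routine.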

Deepak D, Assistant Professor, Dept. of AI & ML, Canara Engineering College, Mangaluru

Training Instances: (the data below is saved as the file heart.csv)

Heart Disease Databases


The Cleveland database contains 76 attributes, but all published experiments use a subset of
14 of them. In particular, the Cleveland database is the only one that has been used by ML
researchers to date. The "heartdisease" field indicates the presence of heart disease in the
patient. It is integer valued from 0 (no presence) to 4.

Class distribution (number of patients per heartdisease value):

Database     0    1    2    3    4   Total
Cleveland  164   55   36   35   13     303

Some instances from the dataset:

age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  slope  ca  thal  heartdisease
63   1    1   145       233   1    2        150      0      2.3      3      0   6     0
67   1    4   160       286   0    2        108      1      1.5      2      3   3     2
67   1    4   120       229   0    2        129      1      2.6      2      2   7     1
41   0    2   130       204   0    2        172      0      1.4      1      0   3     0
62   0    4   140       268   0    2        160      0      3.6      3      2   3     3
60   1    4   130       206   0    2        132      1      2.4      2      2   7     4

Attribute Information
1. age: age in years
2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type
• Value 1: typical angina
• Value 2: atypical angina
• Value 3: non-anginal pain
• Value 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
• Value 0: normal
• Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
• Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment
• Value 1: upsloping
• Value 2: flat
• Value 3: downsloping
12. ca: number of major vessels (0-3) colored by fluoroscopy
• 0: No major vessels visible
• 1: One major vessel visible
• 2: Two major vessels visible
• 3: Three major vessels visible
13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
14. heartdisease: integer valued from 0 (no presence) to 4
• 0: No heart disease
• 1: Mild heart disease
• 2: Moderate heart disease
• 3: Severe heart disease
• 4: Very severe heart disease
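
The coded attributes above can be turned back into readable labels with a small lookup table. The sketch below is only an illustration: CODEBOOK and decode are hypothetical names, the labels are shortened versions of the descriptions in the attribute list, and the example record uses some of the fields of the first sample instance.

```python
# Lookup table built from the attribute information above (labels shortened).
CODEBOOK = {
    "sex": {1: "male", 0: "female"},
    "cp": {1: "typical angina", 2: "atypical angina",
           3: "non-anginal pain", 4: "asymptomatic"},
    "restecg": {0: "normal", 1: "ST-T wave abnormality",
                2: "left ventricular hypertrophy"},
    "slope": {1: "upsloping", 2: "flat", 3: "downsloping"},
    "thal": {3: "normal", 6: "fixed defect", 7: "reversible defect"},
}


def decode(record):
    """Replace coded values by labels; attributes without a codebook entry stay unchanged."""
    return {name: CODEBOOK.get(name, {}).get(value, value)
            for name, value in record.items()}


# Some fields of the first sample instance shown earlier.
FIRST_ROW = {"age": 63, "sex": 1, "cp": 1, "restecg": 2, "slope": 3, "thal": 6}
DECODED = decode(FIRST_ROW)
print(DECODED)
```

Numeric attributes such as age pass through untouched because they have no entry in the lookup table.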


Program:

import numpy as np
import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator  # pgmpy: Probabilistic Graphical Models in Python
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination

# Load the data set and mark missing values ('?') as NaN
heartDisease = pd.read_csv("heart.csv")
heartDisease = heartDisease.replace('?', np.nan)

print('Sample instances from the dataset are given below')
print(heartDisease.head())

# Network structure: each tuple is a directed edge (parent, child)
model = BayesianNetwork([('age', 'heartdisease'), ('sex', 'heartdisease'),
                         ('exang', 'heartdisease'), ('cp', 'heartdisease'),
                         ('heartdisease', 'restecg'), ('heartdisease', 'chol')])

# Estimate the conditional probability distribution (CPD) of every node from the data
print('\n Learning CPD using Maximum likelihood estimators')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

# Exact inference by variable elimination
print('\n Inferencing with Bayesian Network:')
HeartDiseasetest_infer = VariableElimination(model)

print('\n 1. Probability of HeartDisease given evidence= restecg')
q1 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'restecg': 1})
print(q1)

print('\n 2. Probability of HeartDisease given evidence= cp ')
q2 = HeartDiseasetest_infer.query(variables=['heartdisease'], evidence={'cp': 2})
print(q2)
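
The idea behind maximum likelihood estimation of a CPD can be sketched without any library: the estimate of P(child | parent) is the relative frequency of each child value among the records that share the same parent value. The records below are hypothetical toy data, not rows of heart.csv, and mle_cpd is an illustrative helper, not the pgmpy implementation.

```python
from collections import Counter, defaultdict

# Hypothetical (heartdisease, restecg) pairs, i.e. (parent value, child value).
RECORDS = [(0, 0), (0, 0), (0, 2), (1, 2), (1, 2), (1, 0), (1, 2), (0, 0)]


def mle_cpd(records):
    """Return {parent_value: {child_value: P(child | parent)}} from relative frequencies."""
    counts = defaultdict(Counter)
    for parent, child in records:
        counts[parent][child] += 1
    return {
        parent: {child: n / sum(child_counts.values())
                 for child, n in child_counts.items()}
        for parent, child_counts in counts.items()
    }


CPD = mle_cpd(RECORDS)
print(CPD)
```

A child value that never occurs together with some parent value receives no probability mass under this estimator, so sparse data can lead to probabilities of exactly zero.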


Output:

Sample instances from the dataset are given below

   age  sex  cp  trestbps  chol  ...  oldpeak  slope  ca  thal  heartdisease
0   63    1   1       145   233  ...      2.3      3   0     6             0
1   67    1   4       160   286  ...      1.5      2   3     3             2
2   67    1   4       120   229  ...      2.6      2   2     7             1
3   37    1   3       130   250  ...      3.5      3   0     3             0
4   41    0   2       130   204  ...      1.4      1   0     3             0

[5 rows x 14 columns]

Learning CPD using Maximum likelihood estimators

Inferencing with Bayesian Network:

1. Probability of HeartDisease given evidence= restecg
+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) |              0.1016 |
+-----------------+---------------------+
| heartdisease(1) |              0.0000 |
+-----------------+---------------------+
| heartdisease(2) |              0.2361 |
+-----------------+---------------------+
| heartdisease(3) |              0.2017 |
+-----------------+---------------------+
| heartdisease(4) |              0.4605 |
+-----------------+---------------------+

2. Probability of HeartDisease given evidence= cp
+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) |              0.3742 |
+-----------------+---------------------+
| heartdisease(1) |              0.2018 |
+-----------------+---------------------+
| heartdisease(2) |              0.1375 |
+-----------------+---------------------+
| heartdisease(3) |              0.1541 |
+-----------------+---------------------+
| heartdisease(4) |              0.1323 |
+-----------------+---------------------+
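
One possible way to analyse the two result sets is to read off the most probable state and the total probability of any heart disease (states 1 to 4). The sketch below copies the probabilities from the output above; the helper names most_likely and prob_any_disease are hypothetical.

```python
# Posterior distributions copied from the printed query results above.
Q1_RESTECG_1 = {0: 0.1016, 1: 0.0000, 2: 0.2361, 3: 0.2017, 4: 0.4605}
Q2_CP_2 = {0: 0.3742, 1: 0.2018, 2: 0.1375, 3: 0.1541, 4: 0.1323}


def most_likely(distribution):
    """Return the state with the highest posterior probability."""
    return max(distribution, key=distribution.get)


def prob_any_disease(distribution):
    """Sum the probabilities of all states other than 0 (no heart disease)."""
    return sum(p for state, p in distribution.items() if state != 0)


for name, dist in (("restecg = 1", Q1_RESTECG_1), ("cp = 2", Q2_CP_2)):
    print(name, most_likely(dist), round(prob_any_disease(dist), 4))
```

With these numbers, evidence restecg = 1 points most strongly to state 4, while evidence cp = 2 leaves state 0 as the single most probable state. The value 0.0000 for heartdisease(1) in the first query may simply reflect how rarely restecg = 1 occurs in the training data, so it should be interpreted with caution.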
