Probabilistic Reasoning (Unit-3)


Unit-3 (AI)

Probabilistic reasoning
Uncertainty:
An agent acting in a real environment almost never has access to the whole truth about that
environment. Therefore the agent needs to work under uncertainty.
With purely logical knowledge representation, we might write A→B, which means if A is true
then B is true. But consider a situation where we are not sure whether A is true or
not: then we cannot express this statement, and this situation is called uncertainty.
Causes of Uncertainty: Uncertainty in AI can arise from various sources, including:
 Information from Unreliable Sources:
AI systems rely on data to make decisions and predictions. However, data obtained from
various sources may not always be reliable. Data can be incomplete, inconsistent, or
biased, leading to uncertainty in the outcomes generated by AI systems.
 Experimental Errors: In scientific research and experimentation, errors can occur at
various stages, such as data collection, measurement, and analysis. These errors can
introduce uncertainty in the results and conclusions drawn from the experiments.
 Equipment Fault: In many AI systems, machines and sensors are used to collect data
and make decisions. However, these machines can be subject to faults, malfunctions,
or inaccuracies, leading to uncertainty in the outcomes generated by AI systems.
 Temperature Variation: Many real-world applications of AI, such as weather
prediction, environmental monitoring, and energy management, are sensitive to
temperature variations. However, temperature measurements can be subject to
uncertainty due to factors such as sensor accuracy, calibration errors, and
environmental fluctuations.
 Climate Change: Climate change is a global phenomenon that introduces uncertainty
in various aspects of our lives. For example, predicting the impacts of climate change
on agriculture, water resources, and infrastructure requires dealing with uncertain data
and models.
 Missing data
 Noisy data
 Incomplete knowledge
 Theoretical ignorance

Probabilistic reasoning:
Probabilistic reasoning is a way of representing knowledge in which we apply the
concept of probability to indicate the uncertainty in that knowledge. In probabilistic
reasoning, we combine probability theory with logic so that we can still reason with
uncertain knowledge.
In the real world, there are many scenarios where the certainty of something is not
confirmed, such as "It will rain today," "the behaviour of someone in some situation," or "a
match between two teams or two players." These are probable sentences for which we
can assume an outcome but cannot be sure of it, so here we use probabilistic
reasoning.
Need of probabilistic reasoning in AI:
 When there are unpredictable outcomes.
 When the specifications or possibilities of predicates become too large to handle.
 When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain
knowledge:
1. Bayes' rule
2. Bayesian Statistics
As probabilistic reasoning uses probability and related terms, so before understanding
probabilistic reasoning, let's understand some common terms:
Probability: Probability can be defined as the chance that an uncertain event will occur. It
is the numerical measure of the likelihood that an event will occur. The value of a
probability always lies between 0 and 1, the two ideal extremes:
0 ≤ P(A) ≤ 1,
where P(A) is the probability of an event A.
P(A) = 0 indicates total certainty that event A will not occur (an impossible event).
P(A) = 1 indicates total certainty that event A will occur.
We can find the probability of an uncertain event by using the formula below:

P(A) = (Number of outcomes favourable to A) / (Total number of possible outcomes)

 P(¬A) = probability of event A not happening.
 P(¬A) + P(A) = 1.
 Event: Each possible outcome of a variable is called an event.
 Sample space: The collection of all possible events is called sample space.
 Random variables: Random variables are used to represent the events and objects in
the real world.
 Prior probability: The prior probability of an event is probability computed before
observing new information.
 Posterior probability: The probability that is calculated after all evidence or
information has been taken into account. It is a combination of the prior probability and
the new information.

Conditional probability:
Conditional probability is the probability of an event occurring given that another event has
already happened. For example, consider rolling two dice; the sample space consists of
the 36 equally likely ordered pairs (1, 1), (1, 2), ..., (6, 6).

Now, consider the events A = getting 3 on the first die and B = getting a sum of 9.
Then the probability of getting a sum of 9 when the first die already shows 3 is P(B | A),
which can be calculated as follows:
All the cases with 3 on the first die are (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6).
Of these six cases, only one has a sum of 9.
Thus, P(B | A) = 1/6.
If instead we have to find P(A | B):
All cases where the sum is 9 are (3, 6), (4, 5), (5, 4), and (6, 3).
Of these four cases, only one has 3 on the first die, i.e., (3, 6).
Thus, P(A | B) = 1/4.
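The two conditional probabilities above can be checked with a short Python sketch that simply enumerates the 36 outcomes (this snippet is illustrative only and is not part of the original example):

from itertools import product

# Enumerate the 36 equally likely outcomes of rolling two dice.
sample_space = list(product(range(1, 7), repeat=2))

A = {o for o in sample_space if o[0] == 3}        # 3 on the first die
B = {o for o in sample_space if sum(o) == 9}      # sum of the two dice is 9

p_A = len(A) / 36
p_B = len(B) / 36
p_A_and_B = len(A & B) / 36

print("P(B|A) =", p_A_and_B / p_A)   # 1/6 ≈ 0.1667
print("P(A|B) =", p_A_and_B / p_B)   # 1/4 = 0.25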

Conditional Probability Formula


Let's suppose we want to calculate the probability of event A when event B has already
occurred, "the probability of A under the condition B". It can be written as:

P(A | B) = P(A ∩ B) / P(B)

 P(A ∩ B) represents the probability of both events A and B occurring simultaneously.
 P(B) represents the probability of event B occurring.
If the probability of A is given and we need to find the probability of B given A, then it is
given as:

P(B | A) = P(A ∩ B) / P(A)
How to Calculate Conditional Probability?


To calculate the conditional probability, we can use the following step-by-step method:
Step 1: Identify the Events. Let’s call them Event A and Event B.
Step 2: Determine the Probability of Event A i.e., P(A)
Step 3: Determine the Probability of Event B i.e., P(B)
Step 4: Determine the Probability of Event A and B i.e., P(A∩B).
Step 5: Apply the Conditional Probability Formula and calculate the required probability.

Example: In a class, 70% of the students like English and 40% of the students like both
English and Mathematics. What percentage of the students who like English also like
Mathematics?
Solution: Let A be the event that a student likes Mathematics and
B be the event that a student likes English.
Then P(B) = 0.70 and P(A ∩ B) = 0.40, so
P(A | B) = P(A ∩ B) / P(B) = 0.40 / 0.70 ≈ 0.57.
Hence, about 57% of the students who like English also like Mathematics.

Bayes' theorem
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which
determines the probability of an event with uncertain knowledge. Bayes' theorem was
named after the British mathematician Thomas Bayes. The Bayesian inference is an
application of Bayes' theorem, which is fundamental to Bayesian statistics.
Bayes’ Theorem is used to determine the conditional probability of an event.

Bayes Theorem Derivation using conditional probability:


From the definition of conditional probability, Bayes theorem can be derived for events
as given below:

P(A|B) = P(A ⋂ B)/ P(B) where P(B) ≠ 0

P(B|A) = P(B ⋂ A)/ P(A) where P(A) ≠ 0

Here, the joint probability P(A ⋂ B) of both events A and B being true is the same however it is written:

P(B ⋂ A) = P(A ⋂ B)

P(A ⋂ B) = P(A | B) P(B) = P(B | A) P(A)

P(A|B) = [P(B|A) P(A)]/ P(B)


Question: From a standard deck of playing cards, a single card is drawn. The
probability that the card is king is 4/52, then calculate posterior probability
P(King|Face), which means the drawn face card is a king card.
Solution:
P(king): probability that the card is King= 4/52= 1/13
P(face): probability that a card is a face card= 3/13
P(Face|King): probability of face card when we assume it is a king = 1
Putting all these values into Bayes' theorem, we get:
P(King | Face) = P(Face | King) · P(King) / P(Face) = (1 × 1/13) / (3/13) = 1/3.
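For completeness, the same calculation as a short Python sketch (illustrative only, using exact fractions):

from fractions import Fraction

p_king = Fraction(4, 52)          # P(King) = 1/13
p_face = Fraction(12, 52)         # P(Face) = 3/13 (J, Q, K in each of the 4 suits)
p_face_given_king = Fraction(1)   # every king is a face card

# Bayes' theorem: P(King | Face) = P(Face | King) * P(King) / P(Face)
p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)          # 1/3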

Difference Between Conditional Probability and Bayes Theorem:


The difference between Conditional Probability and Bayes' Theorem can be understood
with the help of the comparison given below.
Bayes' Theorem:
 It is derived using the definition of conditional probability and is used to find the "reverse" probability.
 Formula: P(A|B) = [P(B|A) P(A)] / P(B)
Conditional Probability:
 It is the probability of event A given that event B has already occurred.
 Formula: P(A|B) = P(A ∩ B) / P(B)

Application of Bayes' theorem in Artificial intelligence:


Following are some applications of Bayes' theorem:
 It is used to calculate the robot's next step when the step already executed is known.
 Bayes' theorem is helpful in weather forecasting.
 It can solve the Monty Hall problem.

Bayesian Belief Network


Bayesian belief networks (BBNs) are probabilistic graphical models that are used to
represent uncertain knowledge and make decisions based on that knowledge.
We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of
variables and their conditional dependencies using a directed acyclic graph." It is also
called a Bayes network, belief network, decision network, or Bayesian model.

Bayesian Network Consists Of Two Parts:


 Directed Acyclic Graph
 Table of conditional probabilities.
Directed Acyclic Graph: This is a graphical representation of the variables in the
network and the causal relationships between them. The nodes in the DAG represent
variables, and the edges represent the dependencies between the variables. The arrows in
the graph indicate the direction of causality.
Table of Conditional Probabilities: For each node in the DAG, there is a corresponding
table of conditional probabilities that specifies the probability of each possible value of
the node given the values of its parents in the DAG. These tables encode the probabilistic
relationships between the variables in the network.
A Bayesian Belief Network Graph :- A Bayesian network graph is made up of
nodes and arcs (directed links), where:

 Each node corresponds to a random variable, which can be continuous or
discrete.
 Arcs or directed arrows represent the causal relationships or conditional dependencies
between random variables. These directed links connect pairs of nodes in
the graph.
A link means that one node directly influences the other node; if there is
no directed link between two nodes, they are independent of each other.
o For example, consider a small graph whose nodes are the random variables
A, B, C, and D.
o If node B is connected to node A by a directed arrow from A to B, then
node A is called the parent of node B.
o Node C, having no link to node A, is independent of node A.

The Bayesian Network has Mainly Two Components


The Bayesian network has two main components:
1. Causal component
2. Actual or Numerical component.
The causal component represents the causal relationships between the variables in the
system, while the numerical component provides the actual probabilities that are used to
make predictions and to calculate probabilities.
Joint probability distribution:
If we have variables x1, x2, x3, ..., xn, then the probability of a particular
combination of values of x1, x2, x3, ..., xn is given by the joint probability distribution.
By the chain rule, P[x1, x2, x3, ..., xn] can be written in terms of conditional probabilities:
= P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
= P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] .... P[xn-1 | xn] P[xn].
In general, for each variable Xi in a Bayesian network we can write:
P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi)),
so the full joint distribution factorises as
P(X1, X2, ..., Xn) = ∏i P(Xi | Parents(Xi)).
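A minimal Python sketch of this factorisation (the network X → Z ← Y and all of its CPT values below are purely hypothetical, chosen only to illustrate the formula):

# Each node maps to (list of parents, CPT giving P(node = True | parent values)).
network = {
    'X': ([], {(): 0.3}),
    'Y': ([], {(): 0.6}),
    'Z': (['X', 'Y'], {(True, True): 0.9, (True, False): 0.5,
                       (False, True): 0.4, (False, False): 0.1}),
}

def joint_probability(assignment):
    # P(x1, ..., xn) = product over i of P(xi | Parents(Xi))
    prob = 1.0
    for var, (parents, cpt) in network.items():
        p_true = cpt[tuple(assignment[p] for p in parents)]
        prob *= p_true if assignment[var] else 1.0 - p_true
    return prob

print(joint_probability({'X': True, 'Y': False, 'Z': True}))  # 0.3 * 0.4 * 0.5 = 0.06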

Explanation of Bayesian network


Let's understand the Bayesian network through an example by creating a directed acyclic
graph:
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm
responds reliably to a burglary but also responds to minor earthquakes. Harry
has two neighbours, David and Sophia, who have taken responsibility for informing Harry at
work when they hear the alarm. David always calls Harry when he hears the alarm, but
sometimes he gets confused by the phone ringing and calls then as well. On the other
hand, Sophia likes to listen to loud music, so sometimes she misses the alarm.
Here we would like to compute the probability of the burglary-alarm events.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary nor an
earthquake has occurred, and both David and Sophia have called Harry.
Solution:
 The Bayesian network for the above problem is described below. The network structure
shows that Burglary and Earthquake are the parent nodes of Alarm and directly
affect the probability of the alarm going off, whereas David's and Sophia's calls depend
only on the Alarm node.
 The network represents our assumptions: the neighbours do not directly perceive the burglary,
do not notice the minor earthquake, and do not confer with each other before calling.
 The conditional distribution for each node is given as a conditional probability table,
or CPT.
 Each row in a CPT must sum to 1, because the entries in the row represent an
exhaustive set of cases for the variable.
 In a CPT, a Boolean variable with k Boolean parents requires 2^k rows of probabilities. Hence, if
there are two parents, the CPT contains 4 rows of probability values.
List of all events occurring in this network:
 Burglary (B)
 Earthquake (E)
 Alarm (A)
 David calls (D)
 Sophia calls (S)
We can write the events of the problem statement in the form of the probability P[D, S, A, B,
E], and rewrite it using the joint probability distribution:
P[D, S, A, B, E]= P[D | S, A, B, E]. P[S, A, B, E]
=P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]
= P [D| A]. P [ S| A, B, E]. P[ A, B, E]
= P[D | A]. P[ S | A]. P[A| B, E]. P[B, E]
= P[D | A ]. P[S | A]. P[A| B, E]. P[B |E]. P[E]

Let's take the observed probability for the Burglary and earthquake component:
P(B= True) = 0.002, which is the probability of burglary.
P(B= False)= 0.998, which is the probability of no burglary.
P(E= True)= 0.001, which is the probability of a minor earthquake
P(E= False)= 0.999, Which is the probability that an earthquake not occurred.
We can provide the conditional probabilities as per the below tables:
Conditional probability table for Alarm A:
The conditional probability of Alarm A depends on Burglary and Earthquake. (The full
table is not reproduced here; the entry used in the solution below is
P(A = True | B = False, E = False) = 0.001.)

Conditional probability table for David Calls:


The conditional probability that David calls depends only on the Alarm node. (The entry
used in the solution below is P(D = True | A = True) = 0.91.)

Conditional probability table for Sophia Calls:


The conditional probability that Sophia calls depends only on her parent node
"Alarm". (The entry used in the solution below is P(S = True | A = True) = 0.75.)

From the formula of joint distribution, we can write the problem statement in the form of
probability distribution:
P(S, D, A, ¬B, ¬E)
= P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using Joint
distribution.
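The same query can be written out as a short Python sketch (using only the CPT entries quoted above; this is an illustration, not a full implementation of the network):

# CPT entries used in the query P(S, D, A, ¬B, ¬E).
P_B_false = 0.998                 # P(Burglary = False)
P_E_false = 0.999                 # P(Earthquake = False)
P_A_given_notB_notE = 0.001       # P(Alarm = True | B = False, E = False)
P_D_given_A = 0.91                # P(David calls | Alarm = True)
P_S_given_A = 0.75                # P(Sophia calls | Alarm = True)

# P(S, D, A, ¬B, ¬E) = P(S|A) P(D|A) P(A|¬B,¬E) P(¬B) P(¬E)
p = P_S_given_A * P_D_given_A * P_A_given_notB_notE * P_B_false * P_E_false
print(round(p, 8))                # ≈ 0.00068045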
Applications of Bayesian Networks in AI
Some of the most common applications of Bayesian networks in AI include:
1. Prediction and classification: Bayesian belief networks can be used to predict the
probability of an event or classify data into different categories based on a set of
inputs. This is useful in areas such as fraud detection, medical diagnosis, and image
recognition.
2. Decision making: Bayesian networks can be used to make decisions based on
uncertain or incomplete information. For example, they can be used to determine the
optimal route for a delivery truck based on traffic conditions and delivery schedules.
3. Risk analysis: Bayesian belief networks can be used to analyze the risks associated
with different actions or events. This is useful in areas such as financial planning,
insurance, and safety analysis.
4. Anomaly detection: Bayesian networks can be used to detect anomalies in data, such
as outliers or unusual patterns. This is useful in areas such as cybersecurity, where
unusual network traffic may indicate a security breach.
5. Natural language processing: Bayesian belief networks can be used to model the
probabilistic relationships between words and phrases in natural language, which is
useful in applications such as language translation and sentiment analysis.
6. Spam Filtering: One of the significant applications of Bayesian networks is in spam
filtering algorithms. By analyzing the content, context, and other relevant features of
emails, Bayesian networks help detect and sort unwanted or malicious emails,
protecting users from potential threats.
7. Image Processing: Bayesian networks also play a crucial role in image processing
tasks. By leveraging their probabilistic modeling capabilities, these networks assist in
converting images into a digital format, allowing for enhanced image analysis, object
recognition, and other image-related tasks.
What Is The Markov Model?

A Markov model in machine learning is a model in which future events are influenced only
by the current state and not by the states that preceded it (the Markov property). The model's
primary purpose is to determine the probability of upcoming events from the present
state.

Two commonly applied types of Markov model are used when the system being
represented is autonomous -- that is, when the system isn't influenced by an external
agent. These are as follows:

1. Markov chains. These are the simplest type of Markov model and are used to
represent systems where all states are observable. Markov chains show all
possible states, and between states, they show the transition rate, which is the
probability of moving from one state to another per unit of time. Applications of
this type of model include prediction of market crashes, speech recognition and
search engine algorithms.
2. Hidden Markov models. These are used to represent systems with some
unobservable states. In addition to showing states and transition rates, hidden
Markov models also represent observations and observation likelihoods for each
state. Hidden Markov models are used for a range of applications, including
thermodynamics, finance and pattern recognition.
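As a minimal sketch of the first type, a two-state Markov chain can be simulated directly from its transition probabilities (the Rainy/Sunny values below are the same ones used in the HMM example later in this unit):

import random

states = ('Rainy', 'Sunny')
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}

def next_state(current):
    # Sample the next state according to the current state's transition row.
    r, cumulative = random.random(), 0.0
    for state, p in transition_probability[current].items():
        cumulative += p
        if r < cumulative:
            return state
    return state  # guard against floating-point round-off

state = 'Sunny'
chain = [state]
for _ in range(10):
    state = next_state(state)
    chain.append(state)
print(chain)   # e.g. ['Sunny', 'Sunny', 'Rainy', ...]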

Hidden Markov Model


A statistical model called a Hidden Markov Model (HMM) is used to describe systems
with changing unobservable states over time. Hidden Markov Models (HMMs) are a type
of probabilistic model that are commonly used in machine learning for tasks such as
speech recognition, natural language processing, and bioinformatics.
A Hidden Markov Model (HMM) is a probabilistic model that consists of a sequence of
hidden states, each of which generates an observation. The hidden states are usually not
directly observable, and the goal of HMM is to estimate the sequence of hidden states
based on a sequence of observations.
An HMM consists of two types of variables: hidden states and observations.
1. The hidden states are the underlying variables that generate the observed data,
but they are not directly observable.
2. The observations are the variables that are measured and observed.

An HMM is defined by the following components:


 A set of N hidden states, S = {s1, s2, ..., sN}.
 A set of M observations, O = {o1, o2, ..., oM}.
 An initial state probability distribution, π = {π1, π2, ..., πN}, which specifies the
probability of starting in each hidden state.
 A transition probability matrix, A = [aij], which defines the probability of moving from
one hidden state to another.
 An emission probability matrix, B = [bjk], which defines the probability of emitting an
observation from a given hidden state.
The relationship between the hidden states and the observations is modelled using
probability distributions. An HMM captures the relationship between the hidden states and
the observations using two sets of probabilities: the transition probabilities and the
emission probabilities.
 The transition probabilities describe the probability of transitioning from one
hidden state to another.
 The emission probabilities describe the probability of observing an output given
a hidden state.

Hidden Markov Model Algorithm


The Hidden Markov Model (HMM) algorithm can be implemented using the following
steps:
Step 1: Define the state space and observation space
The state space is the set of all possible hidden states, and the observation space is the set
of all possible observations.
Step 2: Define the initial state distribution
This is the probability distribution over the initial state.
Step 3: Define the state transition probabilities
These are the probabilities of transitioning from one state to another. This forms the
transition matrix, which describes the probability of moving from one state to another.
Step 4: Define the observation likelihoods:
These are the probabilities of generating each observation from each state. This forms the
emission matrix, which describes the probability of generating each observation from
each state.
Step 5: Train the model
The parameters of the state transition probabilities and the observation likelihoods are
estimated using the Baum-Welch algorithm, an expectation-maximisation procedure built on
the forward-backward algorithm. This is done by iteratively updating the parameters until
convergence.
Step 6: Decode the most likely sequence of hidden states
Given the observed data, the Viterbi algorithm is used to compute the most likely
sequence of hidden states. This can be used to predict future observations, classify
sequences, or detect patterns in sequential data.
Step 7: Evaluate the model
The performance of the HMM can be evaluated using various metrics, such as accuracy,
precision, recall, or F1 score.
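To make Step 6 concrete, here is a minimal sketch of the Viterbi algorithm for an HMM specified by start, transition and emission probability dictionaries (the data structures mirror the worked example in the next section; this is an illustrative sketch, not an optimised implementation):

def viterbi(observations, states, start_p, trans_p, emit_p):
    # V[t][s] = probability of the best hidden-state path ending in state s at time t.
    V = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    path = {s: [s] for s in states}

    for obs in observations[1:]:
        V.append({})
        new_path = {}
        for s in states:
            # Choose the best predecessor state for s given this observation.
            prob, prev = max(
                (V[-2][p] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
            V[-1][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path

    # Best final state and the path that leads to it.
    prob, best = max((V[-1][s], s) for s in states)
    return prob, path[best]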

Hidden Markov Model With an Example:


To explain it further, we can take the example of two friends, Rahul and Ashok. Rahul
plans his daily activities according to the weather conditions. The three main activities
he performs are going jogging, going to the office (work), and cleaning his residence. What
Rahul does on a given day depends on the weather, and whatever he does he tells Ashok.
Ashok has no direct information about the weather, but he can infer the weather
condition from Rahul's activity.
Ashok believes that the weather operates as a discrete Markov chain with only two
states: Rainy and Sunny. The condition of the weather cannot be observed by Ashok; the
weather states are hidden from him. On each day, there is a certain chance that Rahul will
perform one activity from the set {"jog", "work", "clean"}, depending on the
weather. Since Rahul tells Ashok what he has done, his activities are the observations. The
entire system is therefore a hidden Markov model (HMM).
Here the parameters of the HMM are known to Ashok, because he has general
information about the weather and he also knows what Rahul likes to do on average.
Now consider a day on which Rahul calls Ashok and tells him that he has cleaned his
residence. In that scenario, Ashok will believe that a rainy day is more likely; this belief
is the start probability of the HMM, which we can take to be the following.
The states and observations are:
states = ('Rainy', 'Sunny')
observations = ('jog', 'work', 'clean')
And the start probability is:
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
Because this distribution puts more weight on the rainy state, there is a greater chance
that the next day will be rainy as well. The transition probabilities for the next day's
weather state are as follows:
transition_probability = {
'Rainy' : {'Rainy': 0.7, 'Sunny': 0.3},
'Sunny' : {'Rainy': 0.4, 'Sunny': 0.6}, }
The values above, which describe how the weather changes from one day to the next, are the
transition probabilities. Given the weather state, the probabilities of the activity that
Rahul will perform are:
emission_probability = {
'Rainy' : {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
'Sunny' : {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}
These values are the emission probabilities. Using the emission probabilities Ashok can
infer the weather states, and using the transition probabilities he can predict the activity
Rahul is likely to perform the next day.
From this intuition and example we can see how such a probabilistic model is used to make
predictions.
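Putting the pieces together, the viterbi sketch from the algorithm section above can be run directly on these dictionaries (the three-day observation sequence below is made up purely for illustration):

states = ('Rainy', 'Sunny')
start_probability = {'Rainy': 0.6, 'Sunny': 0.4}
transition_probability = {
    'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
    'Sunny': {'Rainy': 0.4, 'Sunny': 0.6},
}
emission_probability = {
    'Rainy': {'jog': 0.1, 'work': 0.4, 'clean': 0.5},
    'Sunny': {'jog': 0.6, 'work': 0.3, 'clean': 0.1},
}

# Ashok observed Rahul's activities over three days.
prob, hidden_states = viterbi(('clean', 'work', 'jog'), states,
                              start_probability, transition_probability,
                              emission_probability)
print(prob, hidden_states)   # probability of, and the most likely, Rainy/Sunny sequence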

Applications of Hidden Markov Models


 Speech Recognition: One of the most well-known applications of HMMs is
speech recognition. In this field, HMMs are used to model the different sounds
and phones that makeup speech. The hidden states, in this case, correspond to the
different sounds or phones, and the observations are the acoustic signals that are
generated by the speech.
 Natural Language Processing: Another important application of HMMs is
natural language processing. In this field, HMMs are used for tasks such as part-
of-speech tagging, named entity recognition, and text classification. In these
applications, the hidden states are typically associated with the underlying
grammar or structure of the text, while the observations are the words in the text.
In natural language processing systems, the HMMs are usually trained on large
datasets of text, and the estimated parameters of the HMMs are used to perform
various NLP tasks, such as text classification, part-of-speech tagging, and named
entity recognition.
 Bioinformatics: HMMs are also widely used in bioinformatics, where they are
used to model sequences of DNA, RNA, and proteins. HMMs are useful in
bioinformatics because they can effectively capture the underlying structure of the
molecule, even when the data is noisy or incomplete. In bioinformatics systems,
the HMMs are usually trained on large datasets of molecular sequences, and the
estimated parameters of the HMMs are used to predict the structure or function of
new molecular sequences.
 Finance: HMMs have also been used in finance, where they are used to model
stock prices, interest rates, and currency exchange rates. Some further applications of
the Hidden Markov Model are listed below.
 Machine Translation: Given a text in one language, a Hidden Markov Model
can be used to translate it into another language.
 Handwriting Recognition: Given an image of a handwritten text, a machine can
use a Hidden Markov Model to figure out the most probable sequence of
characters written in the text.
 Activity Recognition: A Hidden Markov Model can be used to identify the most
probable activity from a time-series data stream.

Limitations of Hidden Markov Models


Now, we will explore some of the key limitations of HMMs and discuss how they can
impact the accuracy and performance of HMM-based systems.
 Limited Modeling Capabilities: One of the key limitations of HMMs is that they
are relatively limited in their modelling capabilities. HMMs are designed to model
sequences of data, where the underlying structure of the data is represented by a
set of hidden states. However, the structure of the data can be quite complex, and
the simple structure of HMMs may not be enough to accurately capture all the
details. For example, in speech recognition, the complex relationship between the
speech sounds and the corresponding acoustic signals may not be fully captured
by the simple structure of an HMM.
 Overfitting: Another limitation of HMMs is that they can be prone to overfitting,
especially when the number of hidden states is large or the amount of training
data is limited. Overfitting occurs when the model fits the training data too well
and is unable to generalize to new data. This can lead to poor performance when
the model is applied to real-world data and can result in high error rates. To avoid
overfitting, it is important to carefully choose the number of hidden states and to
use appropriate regularization techniques.
 Lack of Robustness: HMMs are also limited in their robustness to noise and
variability in the data. For example, in speech recognition, the acoustic signals
generated by speech can be subjected to a variety of distortions and noise, which
can make it difficult for the HMM to accurately estimate the underlying structure
of the data. In some cases, these distortions and noise can cause the HMM to
make incorrect decisions, which can result in poor performance. To address these
limitations, it is often necessary to use additional processing and filtering
techniques, such as noise reduction and normalization, to pre-process the data
before it is fed into the HMM.
 Computational Complexity: Finally, HMMs can also be limited by their
computational complexity, especially when dealing with large amounts of data or
when using complex models. The computational complexity of HMMs is due to
the need to estimate the parameters of the model and to compute the likelihood of
the data given in the model. This can be time-consuming and computationally
expensive, especially for large models or for data that is sampled at a high
frequency. To address this limitation, it is often necessary to use parallel
computing techniques or to use approximations that reduce the computational
complexity of the model.

What is Utility Theory?


Utility theory in artificial intelligence is a mathematical framework used to model
decision-making under uncertainty. It allows one to assign subjective values or
preferences to different outcomes and helps make optimal choices based on these values.
Utility theory is widely used in various AI applications such as game theory, economics,
robotics, and recommendation systems, among others.
At its core, utility theory helps AI systems make decisions that maximize a specific goal,
referred to as utility. The concept of utility is subjective and varies from person to
person or from system to system. It represents the degree of satisfaction associated with
different outcomes. For example, in a recommendation system, the utility could
describe the level of user satisfaction with a particular recommendation. In a robotics
application, the utility could represent the cost or risk of different actions.
Utility theory also provides a way to model decisions in uncertain or probabilistic
environments, where the outcomes are associated with different probabilities. For
example, in a game of poker, the utility of a particular action may depend on the
probabilities of different cards being dealt to the player. We can use the utility function to
calculate the expected utility of each action, which is the average utility weighted by the
corresponding probabilities. The AI system can then choose the action with the highest
expected utility.
Example
Lottery: To understand the concept of utility theory in artificial intelligence, let's
consider a simple example of a lottery. Suppose you are given the option to play a lottery
with two choices:
1. A guaranteed prize of $100
2. A 50% chance of winning $200 and a 50% chance of winning nothing
Which option would you choose? Your decision depends on your risk tolerance,
financial situation, and personal preferences. Utility theory provides a way to model
and quantify these preferences mathematically using a utility function.
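As a minimal sketch of that idea (the utility functions below are illustrative assumptions, not part of the original example), the expected utility of each lottery option can be computed and compared for a risk-neutral and a risk-averse agent:

import math

def expected_utility(outcomes, utility):
    # outcomes: list of (probability, monetary value) pairs.
    return sum(p * utility(value) for p, value in outcomes)

option_1 = [(1.0, 100)]              # guaranteed $100
option_2 = [(0.5, 200), (0.5, 0)]    # 50% chance of $200, 50% chance of nothing

# Risk-neutral agent, U(x) = x: both options have expected utility 100.
print(expected_utility(option_1, lambda x: x),
      expected_utility(option_2, lambda x: x))

# Risk-averse agent, U(x) = sqrt(x): the guaranteed prize is preferred.
print(expected_utility(option_1, math.sqrt),
      expected_utility(option_2, math.sqrt))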

What is Utility Function?


A utility function is a mathematical function used in Artificial Intelligence (AI) to
represent a system's preferences or objectives. It assigns a numerical value, referred to as
utility, to different outcomes based on their satisfaction level. The utility function is a
quantitative measure of the system's subjective preferences. It is used to guide decision-
making in AI systems. An agent or system typically defines the utility function based on
its goals, objectives, and preferences. It maps different outcomes to their corresponding
utility values, where higher utility values represent more desirable outcomes. The
utility function is subjective and can vary from one agent or system to another, depending
on the specific context or domain of the AI application.
Utility Function Representation (denoted by U)
The utility function is typically denoted as U. It is a mathematical function that takes as
input the different features of an outcome and maps them to a real-valued utility value.
We can represent the utility function mathematically as U(x), where x represents
the attributes or features of an outcome. How we define the utility function can vary
depending on the application and the type of decision problem we are trying to solve.
Examples:
Self-Driving Cars: In the self-driving cars application, the utility function may
consider factors such as time taken, fuel consumption, safety, and comfort, and
assign utility values to different routes based on these factors. The self-driving car can
then use the utility values to calculate the expected utility of each route, taking into
account the probabilities of different traffic conditions or road obstacles, and choose
the route with the highest expected utility to reach the destination.
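A hedged sketch of this idea follows; the attribute names, weights, and route data are entirely hypothetical and serve only to show how a utility function and expected utility could drive the route choice:

# Hypothetical weights: shorter time and lower fuel use are better (negative weight),
# higher safety and comfort scores are better (positive weight).
WEIGHTS = {'time': -1.0, 'fuel': -0.5, 'safety': 2.0, 'comfort': 1.0}

def utility(route_attrs):
    # Map a route's attribute values x to a single utility value U(x).
    return sum(WEIGHTS[attr] * route_attrs[attr] for attr in WEIGHTS)

def expected_utility(scenarios):
    # scenarios: list of (probability, route attributes) pairs for one route.
    return sum(p * utility(attrs) for p, attrs in scenarios)

# Two hypothetical routes under light-traffic / heavy-traffic scenarios.
route_a = [(0.8, {'time': 30, 'fuel': 4, 'safety': 9, 'comfort': 7}),
           (0.2, {'time': 55, 'fuel': 6, 'safety': 9, 'comfort': 5})]
route_b = [(0.9, {'time': 25, 'fuel': 5, 'safety': 6, 'comfort': 8}),
           (0.1, {'time': 70, 'fuel': 8, 'safety': 6, 'comfort': 4})]

best = max([('A', route_a), ('B', route_b)], key=lambda r: expected_utility(r[1]))
print("Choose route", best[0])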
