Causes of Uncertainty
So far, we have represented knowledge using first-order logic and propositional
logic with certainty, which means we were sure about the predicates. With this
representation we might write A→B, which means that if A is true then B is true.
But if we are not sure whether A is true, we cannot express this statement. This
situation is called uncertainty.
So to represent uncertain knowledge, where we are not sure about the predicates, we
need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
The following are some leading causes of uncertainty in the real world.
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the
concept of probability to indicate the uncertainty in knowledge. In probabilistic
reasoning, we combine probability theory with logic to handle the uncertainty.
In the real world, there are many scenarios where the outcome is not certain, such
as "It will rain today," "the behavior of someone in some situation," or "a match
between two teams or two players." These are probable statements: we can assume
that they will happen but cannot be sure, so here we use probabilistic reasoning.
In probabilistic reasoning, there are two ways to solve problems with uncertain
knowledge:
o Bayes' rule
o Bayesian Statistics
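Probability: Probability can be defined as the chance that an uncertain event will occur.
It is the numerical measure of the likelihood that an event will occur. The value of a
probability always lies between 0 and 1, where the two endpoints represent ideal
certainty: P(A) = 0 means event A will certainly not occur, and P(A) = 1 means event A
will certainly occur.
When all outcomes are equally likely, we can find the probability of an uncertain event
by using the formula below.
P(A) = Number of desired outcomes / Total number of outcomes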
Sample space: The collection of all possible outcomes is called the sample space. An
event is a subset of the sample space.
Random variables: Random variables are used to represent the events and objects in
the real world.
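As a minimal sketch of these definitions, the following Python snippet treats a fair six-sided die as the sample space and computes the probability of an event by counting outcomes. The die, the `probability` helper, and the "even" event are illustrative assumptions, not part of the original notes, and the counting approach only holds when all outcomes are equally likely.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}

def probability(event, space):
    """Return P(event) as favourable outcomes over total outcomes."""
    favourable = event & space  # ignore outcomes outside the sample space
    return Fraction(len(favourable), len(space))

even = {2, 4, 6}
p_even = probability(even, sample_space)
print(p_even)  # 1/2
```

The impossible event (the empty set) gets probability 0 and the whole sample space gets probability 1, matching the two ends of the probability range.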
Conditional probability:
Suppose we want to calculate the probability of event A when event B has already
occurred, "the probability of A under the conditions of B". It can be written as:
P(A|B) = P(A⋀B) / P(B)
where P(A⋀B) is the joint probability of A and B, and P(B) is the marginal probability
of B, with P(B) > 0.
If instead event A has already occurred and we need to find the probability of B, then
it is given as:
P(B|A) = P(A⋀B) / P(A)
It can be explained by using the Venn diagram below: B is the event that has occurred,
so the sample space is reduced to the set B, and we can calculate the probability of A
given that B has already occurred by dividing P(A⋀B) by P(B).
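Example:
In a class, 70% of the students like English and 40% of the students like both English
and mathematics. What percentage of the students who like English also like
mathematics?
Solution:
Let A be the event that a student likes mathematics and B be the event that a student
likes English.
P(A|B) = P(A⋀B) / P(B) = 0.4 / 0.7 ≈ 0.57
Hence, about 57% of the students who like English also like mathematics.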
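The arithmetic behind this example can be checked with a short Python sketch. The function name `conditional` and the variable names are assumptions made for illustration; the two input values (70% and 40%) come from the example itself.

```python
def conditional(p_joint, p_given):
    """Return P(A|B) = P(A and B) / P(B); P(B) must be positive."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_english = 0.70             # P(B): student likes English
p_english_and_maths = 0.40   # P(A and B): student likes both subjects

p_maths_given_english = conditional(p_english_and_maths, p_english)
print(f"{p_maths_given_english:.0%}")  # 57%
```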
Bayes' theorem:
Bayes' theorem, also known as Bayes' rule or Bayes' law, determines the probability of
an event under uncertain knowledge. It underlies Bayesian reasoning.
Bayes' theorem was named after the British mathematician Thomas Bayes.
Bayesian inference is an application of Bayes' theorem, which is fundamental to
Bayesian statistics.
Example: If the risk of cancer is related to a person's age, then by using Bayes'
theorem we can estimate the probability of cancer more accurately by taking age into
account.
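Bayes' theorem can be derived using the product rule and the conditional probability of
event A with known event B.
From the product rule we can write:
P(A⋀B) = P(A|B) P(B)
Similarly, for the probability of event B with known event A:
P(A⋀B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B)     ...(a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the
basis of most modern AI systems for probabilistic inference.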
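It shows the simple relationship between joint and conditional probabilities. Here,
P(A|B) is called the posterior, the probability of hypothesis A after evidence B has
been observed.
P(B|A) is called the likelihood: assuming that the hypothesis is true, we calculate the
probability of the evidence.
P(A) is called the prior probability, the probability of the hypothesis before
considering the evidence.
P(B) is called the marginal probability, the probability of the evidence alone.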
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai), hence Bayes' rule
can be written as:
P(Ai|B) = P(Ai) P(B|Ai) / Σk P(Ak) P(B|Ak),  k = 1, ..., n
where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.
Question: What is the probability that a patient with a stiff neck has meningitis?
Given data:
A doctor is aware that meningitis causes a patient to have a stiff neck 80% of the
time. He is also aware of some more facts, which are given as follows:
o The known probability that a patient has meningitis is 1/30000.
o The known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition
that the patient has meningitis, so we can calculate the following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 ≈ 0.00133 = 1/750
Hence, we can assume that 1 out of 750 patients with a stiff neck has meningitis.
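The calculation can be reproduced with exact fractions in a few lines of Python. This is a sketch only: the helper name `bayes` is an assumption, while the three inputs are the values P(a|b) = 0.8, P(b) = 1/30000 and P(a) = 0.02 given above.

```python
from fractions import Fraction

def bayes(p_evidence_given_h, p_h, p_evidence):
    """Return P(h|evidence) = P(evidence|h) * P(h) / P(evidence)."""
    return p_evidence_given_h * p_h / p_evidence

p_stiff_given_meningitis = Fraction(8, 10)   # P(a|b)
p_meningitis = Fraction(1, 30000)            # P(b)
p_stiff = Fraction(2, 100)                   # P(a)

p_meningitis_given_stiff = bayes(p_stiff_given_meningitis, p_meningitis, p_stiff)
print(p_meningitis_given_stiff)  # 1/750
```

Using `Fraction` instead of floats keeps the result exact, so the answer appears as 1/750 rather than a rounded decimal.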
Question: From a standard deck of playing cards, a single card is drawn. The
probability that the card is a king is 4/52. Calculate the posterior probability
P(King|Face), which is the probability that a drawn face card is a king.
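One way to work this question is sketched below in Python: build the 52-card deck, count outcomes to obtain P(King), P(Face) and P(Face|King), and then apply Bayes' rule. It assumes the usual convention that the face cards are the jacks, queens and kings (12 cards); the rank and suit labels and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))   # 52 (rank, suit) pairs
FACE_RANKS = {"J", "Q", "K"}         # assumption: jack, queen, king

def is_king(card):
    return card[0] == "K"

def is_face(card):
    return card[0] in FACE_RANKS

def probability(event):
    """P(event) by counting equally likely cards."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))

p_king = probability(is_king)        # 4/52
p_face = probability(is_face)        # 12/52
# Every king is a face card, so P(Face|King) = 1.
p_face_given_king = probability(lambda c: is_king(c) and is_face(c)) / p_king

# Bayes' rule: P(King|Face) = P(Face|King) * P(King) / P(Face)
p_king_given_face = p_face_given_king * p_king / p_face
print(p_king_given_face)  # 1/3
```

The result agrees with direct counting: 4 of the 12 face cards are kings.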
Applications of Bayes' theorem:
o It is used to calculate the next step of a robot when the step already executed
is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
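To illustrate the last item, here is a hedged Python sketch of the Monty Hall problem solved with Bayes' rule over three mutually exclusive and exhaustive hypotheses (the car is behind door 1, 2 or 3). It assumes the standard puzzle setup, which is not spelled out in these notes: the player picks door 1, and the host, who knows where the car is and never reveals it, opens door 3. The `posterior` helper is an illustrative name.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Bayes' rule over mutually exclusive, exhaustive hypotheses.

    priors[h] is P(h); likelihoods[h] is P(evidence | h).
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())   # P(evidence) by total probability
    return {h: joint[h] / evidence for h in joint}

# The car is equally likely to be behind each door.
priors = {door: Fraction(1, 3) for door in (1, 2, 3)}

# Evidence: the host opens door 3 after the player picked door 1.
likelihoods = {
    1: Fraction(1, 2),  # car behind 1: host picks door 2 or 3 at random
    2: Fraction(1),     # car behind 2: host is forced to open door 3
    3: Fraction(0),     # car behind 3: host never reveals the car
}

result = posterior(priors, likelihoods)
print(result[1], result[2])  # 1/3 2/3
```

Under these assumptions, switching to door 2 wins with probability 2/3, while staying with door 1 wins with probability 1/3.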
Frame Representation
A frame is a record-like structure which consists of a collection of attributes and
their values to describe an entity in the world. Frames are an AI data structure which
divides knowledge into substructures by representing stereotyped situations. A frame
consists of a collection of slots and slot values. These slots may be of any type and
size. Slots have names and values, and the various aspects of a slot are called facets.
Facets: Facets are features of frames which enable us to put constraints on the frames.
Example: IF-NEEDED facets are invoked when the data of a particular slot is needed. A
frame may consist of any number of slots, a slot may include any number of facets, and
a facet may have any number of values.
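The slot and facet structure can be sketched in Python as follows. This is one possible illustration under stated assumptions, not a standard frame library: the `Frame` class, the facet names `value`, `default` and `if_needed`, the parent link used for inheritance, and the rectangle example are all hypothetical.

```python
class Frame:
    """A frame: named slots, each holding a dictionary of facets."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # more general frame to inherit from
        self.slots = {}        # slot name -> {facet name: facet value}

    def set_slot(self, slot, **facets):
        self.slots.setdefault(slot, {}).update(facets)

    def get(self, slot):
        """Look up a slot here, then in parent frames.

        Facet priority: an explicit value, then an IF-NEEDED procedure,
        then a default.
        """
        frame = self
        while frame is not None:
            facets = frame.slots.get(slot)
            if facets is not None:
                if "value" in facets:
                    return facets["value"]
                if "if_needed" in facets:
                    # The procedure runs against the frame that was queried.
                    return facets["if_needed"](self)
                if "default" in facets:
                    return facets["default"]
            frame = frame.parent
        raise KeyError(slot)

# Generic frame with a default and an IF-NEEDED facet.
rectangle = Frame("Rectangle")
rectangle.set_slot("sides", default=4)
rectangle.set_slot("area", if_needed=lambda f: f.get("width") * f.get("height"))

# Specific instance frame that fills in its own slot values.
r1 = Frame("r1", parent=rectangle)
r1.set_slot("width", value=3)
r1.set_slot("height", value=2)

print(r1.get("sides"))  # 4
print(r1.get("area"))   # 6
```

The `area` slot is never stored; its IF-NEEDED procedure computes it only when the slot is requested, and `sides` falls back to the default inherited from the parent frame.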
A frame is also known as slot-filler knowledge representation in artificial
intelligence.
Frames are derived from semantic networks and later evolved into our modern-day
classes and objects. A single frame is not very useful. A frame system consists of a
collection of frames which are connected. In a frame, knowledge about an object or
event can be stored together in the knowledge base. The frame is a type
Components of a script
• Speech
• Written Text
Components of NLP
There are two components of NLP as given −
Natural Language Understanding (NLU)
Understanding involves the following tasks −
Reinforcement Learning
Imagine a mouse in a maze trying to find hidden pieces of cheese. At first, the mouse
may move randomly, but after a while, its experience helps it sense which actions
bring it closer to the cheese. The more times we expose the mouse to the maze, the
better it gets at finding the cheese.
What the mouse does mirrors what we do with Reinforcement Learning (RL) to train a
system or game. Generally speaking, RL is a method of machine learning that helps an
agent learn from experience.
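The mouse-and-cheese idea can be sketched with tabular Q-learning, one common RL algorithm. The snippet below is a toy illustration under stated assumptions: the "maze" is reduced to a five-cell corridor with the cheese in the last cell, and the reward of 1, the learning rate, the discount factor, the exploration rate and the function names are all choices made for this example.

```python
import random

N_STATES = 5          # corridor cells 0..4; the cheese is in cell 4
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0                                  # the mouse starts in cell 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:             # explore: random move
                action = rng.randrange(2)
            else:                                  # exploit: best known move
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if nxt == N_STATES - 1 else 0.0
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

def greedy_policy(q):
    """Best action per non-terminal cell according to the learned table."""
    return ["left" if left > right else "right" for left, right in q[:-1]]

q_table = train()
print(greedy_policy(q_table))  # ['right', 'right', 'right', 'right']
```

Early episodes are mostly random wandering. As rewards propagate backwards through the table, the greedy policy settles on always moving toward the cheese.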
You can use RL when you have little or no historical data about a problem, as it does
not require prior information (unlike traditional machine learning methods). In the RL
framework, you learn from the data as you go. Not surprisingly, RL is particularly
successful with games, especially games of "perfect information" such as chess and
Go. With games, feedback from the environment reaches the agent quickly,
allowing the model to learn faster. The downside of RL is that it can take a very long
time to train if the problem is complex.
Just as IBM's Deep Blue beat the best human chess player in 1997, the RL-based
algorithm AlphaGo beat one of the world's best Go players in 2016. Among the current
front-runners in RL are the teams at DeepMind in the UK.
In April 2019, OpenAI Five became the first AI to defeat the world champion team
of the e-sport Dota 2, a very complex video game that the OpenAI Five team chose
because, at the time, there were no RL algorithms capable of winning it. Reinforcement
learning is clearly a particularly powerful form of AI, and we can expect to see more
progress from these teams. Still, it is also worth remembering the limitations
of the method.
Supervised learning
Supervised machine learning creates a model that makes predictions based on
evidence in the presence of uncertainty. A supervised learning algorithm takes a known
set of input data and known responses to the data (output) and trains a model to
generate reasonable predictions for the response to the new data. Use supervised
learning if you have known data for the output you are trying to estimate.
Classification techniques predict discrete responses, for example, whether an email
is genuine or spam, or whether a tumor is cancerous or benign. Classification models
assign the input data to categories. Typical applications include medical imaging,
speech recognition, and credit scoring.
Use classification if your data can be tagged, categorized, or divided into specific
groups or classes. For example, applications for handwriting recognition use
classification to recognize letters and numbers. In image processing and computer
vision, unsupervised pattern recognition techniques are used for object detection and
image segmentation.
Regression techniques predict continuous responses. If you are working with a data
range or if the nature of your response is a real number, such as temperature or the
time until a piece of equipment fails, use regression techniques.
Common regression algorithms include linear and nonlinear models, regularization,
stepwise regression, boosted and bagged decision trees, neural networks, and
adaptive neuro-fuzzy learning.
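As a minimal sketch of the simplest case, a linear model with one input, the following Python code fits a straight line by ordinary least squares and uses it to predict a real-valued response. The function names and the four data points are made up for illustration; they are not taken from any real dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    return slope * x + intercept

# Toy training data (hypothetical): known inputs and known responses.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)                # 2.0 1.0
print(predict(5.0, slope, intercept))  # 11.0
```

The known inputs and responses play the role of the training set, and the fitted line then produces a prediction for an input that was not seen during training.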
Physicians want to predict whether someone will have a heart attack within a year.
They have data on previous patients, including age, weight, height, and blood
pressure. They know whether the previous patients had a heart attack within a year.
So the problem is to combine the existing data into a model that can predict whether
a new person will have a heart attack within a year.
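The shape of this problem can be sketched with a k-nearest-neighbours classifier, one of many possible classification methods. Everything concrete below is an assumption for illustration: the six "patients" are invented numbers, only two features (age and systolic blood pressure) are used, and the result has no medical meaning.

```python
from collections import Counter
from math import dist

# Hypothetical training data: ((age, systolic blood pressure), label),
# where label 1 means "heart attack within a year" and 0 means "none".
PATIENTS = [
    ((35, 118), 0),
    ((42, 125), 0),
    ((50, 130), 0),
    ((61, 155), 1),
    ((67, 160), 1),
    ((72, 170), 1),
]

def predict(features, training, k=3):
    """Label a new patient by majority vote of the k closest known patients."""
    nearest = sorted(training, key=lambda row: dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(predict((65, 158), PATIENTS))  # 1
print(predict((40, 120), PATIENTS))  # 0
```

In practice the features would be scaled so that no single measurement dominates the distance, and the model would be validated on patients held out from training.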
Unsupervised Learning
Unsupervised learning detects hidden patterns or intrinsic structures in data. It is
used to draw inferences from datasets containing input data without labeled responses.
For example, if a cell phone company wants to optimize the locations where it
builds towers, it can use machine learning to estimate the number of clusters of
people relying on its towers.
A phone can only talk to one tower at a time, so the team uses clustering algorithms
to design the best placement of cell towers to optimize signal reception for groups,
or clusters, of customers.
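A clustering step of this kind can be sketched with k-means, one common clustering algorithm. The customer coordinates below are invented, the choice of two towers is an assumption, and starting from the first k points is a simplification that keeps the example deterministic.

```python
from math import dist

def kmeans(points, k, max_iters=100):
    """Group points into k clusters; return (centroids, assignment)."""
    centroids = [points[i] for i in range(k)]   # simple deterministic start
    assignment = None
    for _ in range(max_iters):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:        # nothing moved: converged
            break
        assignment = new_assignment
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return centroids, assignment

# Hypothetical customer locations on a map grid.
customers = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

towers, groups = kmeans(customers, k=2)
print(groups)  # [0, 0, 0, 1, 1, 1]
print(towers)
```

Each centroid is a candidate tower location, and each customer is served by the tower of the cluster it was assigned to, which matches the one-tower-per-phone constraint.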
Expert Systems
An expert system (ES) is a part of AI. The first expert systems were developed around
1970 and were among the first successful applications of artificial intelligence. An
expert system solves complex problems at the level of a human expert by drawing on the
knowledge stored in its knowledge base (KB). The system supports decision making for
complex problems using both facts and heuristics, like a human expert. It is called an
expert system because it contains the expert knowledge of a specific domain and can
solve complex problems within that particular domain. These systems are designed for a
specific domain, such as medicine or science.
The performance of an expert system depends on the expert knowledge stored in its
knowledge base. The more knowledge stored in the KB, the better the system performs.
One common example of an ES is the suggestion of spelling corrections while typing in
the Google search box.
Below is the block diagram that represents the working of an expert system:
Note: It is important to remember that an expert system is not used to replace human
experts; instead, it is used to assist a human in making complex decisions. These
systems do not have the human capability of thinking; they work on the basis of the
knowledge base of the particular domain.
Examples of expert systems:
o DENDRAL: An artificial intelligence project built as a chemical analysis expert
system. It was used in organic chemistry to identify unknown organic molecules with
the help of their mass spectra and a knowledge base of chemistry.
o MYCIN: One of the earliest backward chaining expert systems, designed to identify
the bacteria causing infections such as bacteraemia and meningitis. It was also used
to recommend antibiotics and to diagnose blood clotting diseases.
o PXDES: An expert system used to determine the type and level of lung cancer. To
determine the disease, it takes a picture of the upper body, which looks like a
shadow. This shadow identifies the type and degree of harm.
o CaDet: A diagnostic support system that can detect cancer at early stages.
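Backward chaining, the inference style attributed to MYCIN above, can be sketched in a few lines of Python: to prove a goal, the engine looks for a rule that concludes it and then tries to prove that rule's premises, down to known facts. The rules and fact names below are invented for illustration; they are not MYCIN's actual rules and carry no medical meaning.

```python
# Toy knowledge base (hypothetical): conclusion -> list of alternative
# premise lists. A conclusion holds if all premises of any one list hold.
RULES = {
    "bacterial_infection": [["fever", "high_white_cell_count"]],
    "recommend_antibiotics": [["bacterial_infection", "no_allergy"]],
}

def prove(goal, facts, rules, seen=frozenset()):
    """Backward chaining: work from the goal back to known facts."""
    if goal in facts:
        return True
    if goal in seen:            # guard against circular rules
        return False
    for premises in rules.get(goal, []):
        if all(prove(p, facts, rules, seen | {goal}) for p in premises):
            return True
    return False

facts = {"fever", "high_white_cell_count", "no_allergy"}
print(prove("recommend_antibiotics", facts, RULES))              # True
print(prove("recommend_antibiotics", {"fever", "no_allergy"}, RULES))  # False
```

The rules play the role of the knowledge base and `prove` plays the role of the inference engine. Adding more rules extends what the system can conclude without changing the engine.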
• Track: Variations on the script. Different tracks may share components
of the same scripts.
• Roles: These are the actions that the individual participants perform.