HCS 404 Notes
COURSE OUTLINE:
1. Introduction to AI
Principles & Developments in AI
Brief History of AI
2. Intelligent Agents (Types of)
3. Problem Solving: Search Algorithms & heuristics
Uninformed Search
Informed Search
4. Knowledge and Reasoning
Agents that reason logically – propositional logic
a) Using first-order logic
b) Inference in first-order logic
5. Knowledge Representation
Rules
Frames
Cases
Semantic Nets
6. Rule Based Systems
Forward Chaining
Backward Chaining
7. Reasoning under Uncertainty
Uncertainty
Errors (types of)
Uncertainty in Inference Chains
What is AI
Defn 1.
The science of making machines do things which would require intelligence if they were
done by a human. (Marvin Minsky)
Defn 2.
(a) A set of goals meant to address a (b) class of problems, using (c) a set of methods
by a (d) set of people
(d) A Set of People
X, Y, Z you & me
-An aim to build systems that think and act like humans, and think and act rationally.
          Human    Rational
Think       *         *
Act         *         *
with the world and generate better strategies over time.
Intelligent Agents
• An agent is an entity that perceives and acts. It:
– Exists in an environment
– Has sensors that detect percepts (agent’s perceptual input at any given time) in the
environment
– Has effectors that can carry out actions which may change the environment
– Has goals to achieve
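The percept–action cycle described above can be sketched in a few lines of Python; the class and the two-square vacuum world it acts in are illustrative, not from any standard library:

```python
# A minimal sketch of the agent abstraction: the agent receives a percept
# through its sensors and returns an action for its effectors to carry out.

class ReflexVacuumAgent:
    """Simple reflex agent for a two-square (A, B) vacuum world."""

    def act(self, percept):
        location, dirty = percept          # sensor input
        if dirty:
            return "Suck"                  # act towards the goal: clean squares
        return "Right" if location == "A" else "Left"

agent = ReflexVacuumAgent()
print(agent.act(("A", True)))   # -> Suck
print(agent.act(("A", False)))  # -> Right
```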
2. With internal state (Model-Based)
– Have memory of past percepts & actions. Model – knowledge about how the
world works, e.g. a finite state machine, a vending machine
3. Goal Based
– Explicit representation of desired state(s) of environment, more flexible &
knowledge can be modified.
• Utility-based
– How much is each goal desired. A utility function maps a state onto a real
number, which describes the associated degree of happiness. A complete
specification of the utility function allows rational decisions in two kinds of cases,
where goals are inadequate. First, when there are conflicting goals, the function
specifies the appropriate tradeoff. Second, when there are several goals that the
agent can aim for, none of which can be achieved with certainty, utility provides a
way in which the likelihood of success can be weighed up against the importance
of the goals.
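The tradeoff a utility function makes can be sketched as follows; the actions, outcome probabilities and utility values below are invented purely to illustrate how conflicting goals (speed vs. safety) collapse into a single real number:

```python
# A utility-based agent picks the action whose outcomes have the highest
# expected utility. All names and numbers here are illustrative.

def expected_utility(action, outcomes, utility):
    """outcomes[action]: list of (probability, resulting_state) pairs."""
    return sum(p * utility(s) for p, s in outcomes[action])

def choose_action(actions, outcomes, utility):
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

# Conflicting goals folded into one real number per state:
utility = {"fast_arrival": 0.7, "safe_arrival": 1.0, "crash": 0.0}.get
outcomes = {
    "drive_fast": [(0.8, "fast_arrival"), (0.2, "crash")],   # EU = 0.56
    "drive_slow": [(1.0, "safe_arrival")],                   # EU = 1.0
}
print(choose_action(["drive_fast", "drive_slow"], outcomes, utility))
# -> drive_slow
```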
Environments
• accessible vs. inaccessible (complete state of environment accessible to the agent at
each point)
• deterministic vs. non-deterministic (stochastic), next state based on current state.
• episodic vs. non-episodic (each episode consists of the agent perceiving & then
performing a single independent action)
• static vs. dynamic (environment changes while agent performs action)
• discrete vs. continuous (applies to the way time is handled, and to the percepts & actions
of the agent)
Dartmouth conference by John McCarthy
Presentation of the Logic Theorist (a program for basic logic theorem proving) by
Newell & Simon
C. AI Early Work (1952 – 1969)
Newell & Simon’s General Problem Solver
Gelerter(1959) Geometry Theorem Prover
Definition of LISP by McCarthy
Development of microworlds programs by Minsky’s students
D. Discovery of Reality(1966 – 1973)
Identification of perceptrons' limitations in terms of knowledge representation.
E. Knowledge-Based Systems(1969-1979)
Introduction of the use of domain-specific knowledge to solve complex problems.
Design of DENDRAL used to solve the problem of inferring molecular structure
from information provided by a mass spectrometer
Design of MYCIN used to diagnose blood infections.
Birth of PROLOG
F. AI Industry Boom (1980 - 1990)
Commercial expert system (R1) used to configure orders for new computer systems at
DEC.
US sales of AI-related hardware and software soared to $425 million in 1986.
G. The Return of neural networks(1986 – 1990)
Hopfield neural networks developed
H. AI becomes a science(1987 – Now)
Experiments on Speech recognition & neural networks carried out
Introduction of data mining technology
Bayesian Networks used to reason with uncertain knowledge
Rationally acting expert systems designed
I. Beginning of Intelligent Agents (1995 – Now)
Application Areas of Artificial Intelligence
A computer answering the phone, and replying to a question
understanding the text on a Web page to decide who it might be of interest to
translating a daily newspaper from Japanese to English
understanding text in journals / books and building an expert system based on that
understanding
5. Speech Recognition
Lecture 3
Problem Solving by Search
Problem Solving Agents are goal-based agents, which decide what to do by finding
sequences of actions that lead to desirable states. A problem in state-space search is
defined as a triple (S, O, G), where S is the set of states, O the set of operators, and G the
set of goal states.
The state space is identified by a directed graph in which each node is a state and each
arc represents the application of an operator transforming a state into a successor state. A
solution is a path from a start state to a goal state. A goal state may be defined either
explicitly or as the set of states satisfying a given predicate
Goal Formulation
A goal is a set of world states in which the goal condition is satisfied. Goal formulation is the first
step in problem solving. The agent’s task is to find out which sequence of actions will get
it to a goal state.
Problem formulation is the process of deciding what actions and states to consider,
given a goal. Search is the process of looking for possible sequences of actions that lead
to states of known value, and choosing the best sequence when there are several options.
A search algorithm takes a problem as input and returns a solution in the form of an
action sequence.
Here the environment is assumed to be static and can be viewed as discrete, and the
initial state is known. The environment is also assumed to be deterministic. This is an
oversimplified case.
• Problem formulation as search:
– states: one of the 8 shown above.
– operators: move left, move right, or suck.
– goal test: no dirt left in any square.
– path cost: each action costs 1; path cost=path length.
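The four components above can be turned into a runnable problem formulation, here solved with a simple breadth-first (uninformed) search; the state encoding (location, dirt-in-A, dirt-in-B) is one possible choice:

```python
# Sketch of the (S, O, G) formulation for the two-square vacuum world.
from collections import deque

def successors(state):
    loc, dirt_a, dirt_b = state
    yield "Left",  ("A", dirt_a, dirt_b)           # operators
    yield "Right", ("B", dirt_a, dirt_b)
    if loc == "A":
        yield "Suck", ("A", False, dirt_b)
    else:
        yield "Suck", ("B", dirt_a, False)

def goal_test(state):
    return not state[1] and not state[2]           # no dirt left in any square

def bfs(start):
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if goal_test(state):
            return path                            # path cost = path length
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))

print(bfs(("A", True, True)))  # -> ['Suck', 'Right', 'Suck']
```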
2. The 8-puzzle
Search Method Evaluation
Criteria:
• Correctness: Are all "solutions" found by the algorithm correct?
• Completeness: Does the algorithm find a solution, whenever one exists?
• Termination: Is the algorithm guaranteed to terminate on all problems?
• Solution Optimality: Is the algorithm guaranteed to find optimal solutions?
• Complexity: How much time and space does the algorithm use?
• Algorithm Optimality: Does the algorithm use as little time or space as
possible?
Lecture 4
Informed Search Strategies
-The search space of a problem is generally described using the number of possible states
in each search.
-Some problems are too large to search efficiently using uninformed methods.
-At times we have additional knowledge about the problem that we can use to inform the
agent doing the search.
-Heuristics (informed guesses) are employed to direct the search.
-Heuristics can estimate the goodness of a particular node (or state) n. i.e.
-how close is n to a goal node.
-What might be the minimal cost path from n to a goal node?
-Formally, h(n) >= 0 for all nodes
-h(n) = 0 means n is a goal node.
-h(n) = ∞ implies n is a dead end, from which no goal state can be reached.
-e.g. heuristic hSLD (n) = straight line distance from n to goal.
A/ Best-first search
–General approach of informed search:
-Node is selected for expansion based on an evaluation function f(n)
-Evaluation function measures distance to the goal.
–Choose node, which appears best
-It can be implemented as:
Has variations, that use some estimated measure of the cost of the solution and try to
minimize it, e.g.cost of the path so far from the initial state to the present state. In order to
focus the search, the measure must incorporate some estimate of the cost of the path from
a particular state to the goal.
-Suppose the function f(n) = hSLD(n) is employed in the greedy best-first search algorithm,
where SLD is the straight-line distance from node n to Bucharest as given in the map. The
figure below shows how to get to Bucharest using greedy best-first search with
the prescribed function.
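A minimal sketch of greedy best-first search on a fragment of the Romania map; the adjacency list and hSLD values follow the usual textbook figure, but treat them as illustrative data rather than a complete map:

```python
# Greedy best-first search: always expand the node with the lowest h(n).
import heapq

graph = {
    "Arad": ["Sibiu", "Timisoara", "Zerind"],
    "Sibiu": ["Arad", "Fagaras", "Oradea", "Rimnicu Vilcea"],
    "Fagaras": ["Sibiu", "Bucharest"],
    "Rimnicu Vilcea": ["Sibiu", "Pitesti", "Craiova"],
    "Pitesti": ["Rimnicu Vilcea", "Bucharest", "Craiova"],
    "Timisoara": ["Arad"], "Zerind": ["Arad"], "Oradea": ["Sibiu"],
    "Craiova": ["Rimnicu Vilcea", "Pitesti"], "Bucharest": [],
}
h_sld = {"Arad": 366, "Sibiu": 253, "Fagaras": 176, "Rimnicu Vilcea": 193,
         "Pitesti": 100, "Timisoara": 329, "Zerind": 374, "Oradea": 380,
         "Craiova": 160, "Bucharest": 0}

def greedy_best_first(start, goal):
    frontier = [(h_sld[start], start, [start])]
    visited = set()                               # detect repeated states
    while frontier:
        _, node, path = heapq.heappop(frontier)   # lowest h(n) first
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nbr in graph[node]:
            heapq.heappush(frontier, (h_sld[nbr], nbr, path + [nbr]))

print(greedy_best_first("Arad", "Bucharest"))
# -> ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']  (a path, but not the optimal one)
```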
The algorithm finds a path, though not an optimal path: the path it found via Sibiu and
Fagaras to Bucharest is 32 kilometers longer than the path through Rimnicu Vilcea and Pitesti.
Hence, the algorithm always chooses what looks locally best, rather than worrying about
whether or not it will be best in the long run.
Greedy search is susceptible to false starts. Consider the problem of getting from Iasi to
Fagaras. h suggests that Neamt be expanded first, but it is a dead end. The solution is to
go first to Vaslui and then continue to Urziceni, Bucharest and Fagaras.
NB, if we are not careful to detect repeated states, the solution will never be found - the
search will oscillate between Neamt and Iasi.
Greedy search resembles depth first search (dfs) in the way that it prefers to follow a
single path to the goal and backup only when a dead-end is encountered. It suffers from
the same defects as dfs - it is not optimal and it is incomplete because it can start down an
infinite path and never try other possibilities.
The worst-case complexity for greedy search is O(b^m), where m is the maximum depth of
the search. Its space complexity is the same as its time complexity, but the worst case can
be substantially reduced with a good heuristic function.
ii) A* search
-Combines the use of g(n), the cost of the path so far, and h(n), the cost of moving from
the node n to the goal, simply by summing them: f(n)=g(n)+h(n)
-f(n) = the estimated cost of the cheapest solution through node n .
-Minimizes the total estimated cost, and is optimal if h(n) is an admissible heuristic (i.e.
one that never overestimates the cost of reaching the goal).
-Avoid expanding paths that are already expensive (consider the lowest g(n)).
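A minimal A* sketch on the same Romania fragment, assuming the usual textbook road distances and the straight-line-distance heuristic (treat both as illustrative data):

```python
# A* search: expand the node with the lowest f(n) = g(n) + h(n).
import heapq

roads = {
    "Arad": {"Sibiu": 140, "Timisoara": 118, "Zerind": 75},
    "Sibiu": {"Arad": 140, "Fagaras": 99, "Rimnicu Vilcea": 80, "Oradea": 151},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Rimnicu Vilcea": {"Sibiu": 80, "Pitesti": 97, "Craiova": 146},
    "Pitesti": {"Rimnicu Vilcea": 97, "Bucharest": 101, "Craiova": 138},
    "Timisoara": {"Arad": 118}, "Zerind": {"Arad": 75},
    "Oradea": {"Sibiu": 151}, "Craiova": {"Rimnicu Vilcea": 146, "Pitesti": 138},
    "Bucharest": {},
}
h = {"Arad": 366, "Sibiu": 253, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 100, "Timisoara": 329, "Zerind": 374, "Oradea": 380,
     "Craiova": 160, "Bucharest": 0}

def a_star(start, goal):
    frontier = [(h[start], 0, start, [start])]    # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nbr, cost in roads[node].items():
            g2 = g + cost
            if g2 < best_g.get(nbr, float("inf")):   # keep only cheapest g
                best_g[nbr] = g2
                heapq.heappush(frontier, (g2 + h[nbr], g2, nbr, path + [nbr]))

path, cost = a_star("Arad", "Bucharest")
print(path, cost)
# -> ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'] 418
```

Note how Rimnicu Vilcea (f = 413) is expanded before Fagaras (f = 415), as the text describes.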
The following is a search from Arad to Bucharest using A* Search
Notice that A* prefers to expand from Rimnicu Vilcea rather than Fagaras. Even though
Fagaras is closer to Bucharest, the path taken to get to Fagaras is not as efficient in getting
close to Bucharest as the path taken to get to Rimnicu.
For the A* search above: along any path from the root, the f-cost never decreases. This
fact holds true for almost all admissible heuristics. A heuristic with this property is said to
exhibit monotonicity.
If f never decreases along any path out of the root, we can conceptually draw contours in
the state space.
Because A* expands the leaf node of lowest f, an A* search fans out from the start node,
adding nodes in concentric bands of increasing f-cost.
In A*, the first solution found must be the optimal one because nodes in all subsequent
contours will have higher f-cost and hence higher g-cost.
Complexity of A*
The catch with A* is that even though it's complete, optimal, and optimally efficient, it still
can't always be used, because for most problems, the number of nodes within the goal
contour search space is still exponential in the length of the solution.
Similarly to breadth-first search, however, the major difficulty with A* is the amount of
space that it uses.
(1) Has an f-value worse than the f-limit -> backtrack
-Backtrack over Rimnicu Vilcea and store the f-value of its best child (Pitesti, with value 417)
Best is now Fagaras. Best child value is 450
But 450 is greater than 417 -> backtrack!
Backtrack over Fagaras and store the f-value of its best child (Bucharest = 450)
Best is now Rimnicu Vilcea (again 447).
Subtree is again expanded.
Best alternative subtree is now through Timisoara (447).
Solution is found because 447 (f-value at Timisoara) > 417 (f-value at Pitesti).
Characteristics
-Optimal if the heuristic function is admissible (optimistic)
-May explore same states several times, time consuming
-Saves space
Local Search Algorithms (Iterative improvement search)
-Only interested in solution, not path.
-Employed in optimization problems, e.g. Find the min(max) objective function.
Local search begins from an arbitrary state in the search space and looks for an
improvement in the neighborhood of that state, until no improvement can be found.
(Iterative improvement)
Local search algorithms: keep a single "current" state, try to improve it according to an
objective function.
Local Search Algorithms employ special heuristics.
A heuristic is a simplification or educated guess that limits the search for
solutions in domains that are difficult or poorly understood.
It cannot be computed from the problem definition itself.
A heuristic h(n) is admissible if for every node n,
h(n) ≤ h*(n), where h*(n) is the true cost to reach the goal state from n.
An admissible heuristic never overestimates the cost to reach the goal, i.e., it is
optimistic
Example: hSLD(n) (never overestimates the actual road distance)
Hill climbing search works well (and is fast, and takes little memory) if an accurate
heuristic measure is available in the domain, and if there are no local maxima.
In general the idea is to:
o Start from initial state, loop over operators, generate a state for the current
operator. If the newly generated state is better than the current state, go to that
state, repeat until a goal is found.
o Disadvantages: susceptible to local maxima, plateaus and ridges.
o Remember, from any given node, we go to the next node that is better than the
current node. If there is no node better than the present node, then hill climbing
halts, returning the current node as its solution.
Hill-climbing is a greedy strategy. The upside is it may make rapid progress towards the
best state. The downside is that it can halt at a local maximum. A local maximum is a
state that is better than all its successors but worse than the global maximum.
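The climb-until-no-better-neighbour loop can be sketched as below; the one-dimensional objective function is invented so that it has one local and one global peak, showing how the start state determines which peak is reached:

```python
# Basic hill climbing over the integers with a +/-1 step.

def value(x):
    # Invented objective with a local peak at x=2 (value 0)
    # and a global peak at x=8 (value 5).
    return -min((x - 2) ** 2, (x - 8) ** 2 - 5)

def hill_climb(x, step=1):
    while True:
        best = max([x - step, x + step], key=value)
        if value(best) <= value(x):
            return x                  # no better neighbour: (local) maximum
        x = best

print(hill_climb(0))   # -> 2  (trapped on the local peak)
print(hill_climb(10))  # -> 8  (happens to reach the global peak)
```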
One workaround is to allow sideway moves, i.e. to allow the algorithm to choose a
successor that has the same value as the current state. This can help a search get off a
shoulder (why?), but the search becomes infinite on a plateau (why?). So a limit must be
placed on the number of consecutive sideways moves.
The most sophisticated versions of this idea go under the name tabu search. Tabu search
is hill-climbing search, often with complete-state formulations, in which sideways moves
are allowed. However, a fixed size memory is maintained of the most recently visited
states. This is called the tabu list. The current state may not be replaced by any state that
is currently on the tabu list.
Allowing sideway moves still doesn’t solve the problem of other kinds of local maxima.
One possible solution is random re-start hill-climbing. In random re-start hill-climbing,
a series of hill-climbing searches are executed from randomly-chosen initial states. In the
limit (if enough random re-starts are done), this should find an optimal solution.
Another possible solution is to even allow downwards moves, i.e. if none of the
successors is better than the current state, but the time limit is not yet reached, allow the
algorithm to choose one of these successors that worsens the value. If you do this, you
should store the current state before making the downwards move. Next time you reach a
maximum, you can see whether the new maximum is better or worse than the one you
stored earlier, and work with the better of the two. (This is not the same as having an
agenda. Here we are remembering just one state: the best (local) maximum seen so far.)
Lecture 7
Knowledge & Reasoning
Perspectives of Knowledge Representation
a) KR as applied epistemology (study of the meaning of knowledge)
All intelligent activity presupposes knowledge. Knowledge is represented in a knowledge
base, which consists of knowledge structures (typically symbolic) and programs. The two
central problems are:
(1) how to represent the knowledge one has about a problem domain, and
(2) how to reason using that knowledge in order to answer questions or make
decisions
Knowledge representation deals with the problem of how to model the world
sufficiently for intelligent action
b) Logic: one of the oldest representation languages studied for AI, and the
foundation for many existing systems that use logic either as inspiration or as the basis
for their tools.
c) Frames: Much like a semantic network except each node represents prototypical
concepts and/or situations. Each node has several property slots whose values may be
specified or inherited by default.
person(Socrates).
person(Hillary).
forall X [person(X) ---> mortal(X)]
g) Rules: The use of Production Systems to encode condition-action rules (as in expert
systems).
j) Hybrid Schemes: Any representation formalism employing a combination of KR
schemes.
Representation Languages
A logic includes:
o Syntax: Specifies the symbols in the language and how they can be
combined to form sentences. Hence facts about the world are represented
as sentences in logic
o Semantics: Specifies what facts in the world a sentence refers to. Hence,
also specifies how you assign a truth value to a sentence based on its
meaning in the world. A fact is a claim about the world, and may be true
or false.
o Inference Procedure: Mechanical method for computing (deriving) new
(true) sentences from existing sentences
Properties of Logic
Has very limited expressive power (unlike natural language) e.g., cannot say "pits cause
breezes in adjacent squares" except by writing one sentence for each square.
Facts in logic are claims about the world that are True or False, whereas a
representation is an expression (sentence) in some language that can be encoded
in a computer program and stands for the objects and relations in the world
We need to ensure that the representation is consistent with reality, so that the
following figure holds:
                           entails
Representation:  Sentences ----------> Sentences
                     |                     |
                 Semantics             Semantics
                 (refer to)            (refer to)
                     |                     |
                           follows
World:           Facts --------------->  Facts
Truth: A sentence is True if the state of affairs it describes is actually the case in
the world. So, truth can only be assessed with respect to the semantics. Yet the
computer does not know the semantics of the knowledge representation language,
so we need some way of performing inferences to derive valid conclusions even
when the computer does not know what the semantics (the interpretation) is
In Propositional Logic (PL):
A user defines a set of propositional symbols, like P and Q. User defines the
semantics of each of these symbols. For example,
o P means "It is hot"
o Q means "It is humid"
o R means "It is raining"
A sentence (also called a formula or well-formed formula or wff) is defined as:
1. A symbol
2. If S is a sentence, then ~S is a sentence.
3. If S and T are sentences, then (S ^ T), (S v T), (S => T), and (S <=> T) are also
sentences.
4. The result of a finite number of applications of (1)-(3).
Examples of PL sentences:
o (P ^ Q) => R (here meaning "If it is hot and humid, then it is raining")
o Q => P (here meaning "If it is humid, then it is hot")
o Q (here meaning "It is humid.")
Given the truth values, of all of the constituent symbols in a sentence, that
sentence can be "evaluated" to determine its truth value (True or False). This is
called an interpretation of the sentence.
A model is an interpretation (i.e., an assignment of truth values to symbols) of a
set of sentences such that each sentence is True. A model is just a formal
mathematical structure that "stands in" for the world.
A valid sentence (also called a tautology) is a sentence that is True under all
interpretations. For example, "It's raining or it's not raining."
An inconsistent sentence (also called un-satisfiable or a contradiction) is a
sentence that is False under all interpretations. Hence the world is never like what
it describes. For example, "It's raining and it's not raining."
Sentence P entails sentence Q, written P |= Q, means that whenever P is True, so
is Q. In other words, all models of P are also models of Q
Since the computer doesn't know the interpretation of these sentences in the world, we
don't know whether the constituent symbols represent facts in the world that are True or
False. So, instead, consider all possible combinations of truth values for all the symbols,
hence enumerating all logically distinct cases:
F F ... F
F F ... T
  ...
T T ... T
second-from-last column also has a T. But this is logically equivalent to saying
that the sentence (KB => G) is valid (by definition of the "implies" connective).
In other words, if the last column of the table above contains only True,
then KB entails G; or conclusion G logically follows from the premises in KB,
no matter what the interpretations (i.e., semantics) associated with all of the
sentences!
The truth table method of inference is complete for Propositional Logic because
we can always enumerate all 2^n rows for the n propositional symbols that occur.
But this is exponential in n. In general, it has been shown that the problem of
checking if a set of sentences in PL is satisfiable is NP-complete. (The truth table
method of inference is not complete for FOL (First-Order Logic).)
Recap
Example
Using the "weather" sentences from above, let KB = ((P ^ Q) => R) ^ (Q => P) ^ Q
corresponding to the three facts we know about the weather:
(1)"If it is hot and humid, then it is raining,”
(2)"If it is humid, then it is hot," and
(3) "It is humid."
Now let's ask the query "Is it raining?" That is, is the query sentence R entailed by KB?
Using the truth-table approach to answering this query we have:
P Q R | P^Q | (P^Q)=>R | Q=>P | KB | KB=>R
T T T |  T  |    T     |  T   | T  |  T
T T F |  T  |    F     |  T   | F  |  T
T F T |  F  |    T     |  T   | F  |  T
T F F |  F  |    T     |  T   | F  |  T
F T T |  F  |    T     |  F   | F  |  T
F T F |  F  |    T     |  F   | F  |  T
F F T |  F  |    T     |  T   | F  |  T
F F F |  F  |    T     |  T   | F  |  T
Hence, in this problem there is only one model of KB, when P, Q, and R are all True.
And in this case R is also True, so R is entailed by KB. Also, you can see that the last
column is all True values, so the sentence KB => R is valid.
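The truth-table argument above can be reproduced mechanically; this sketch enumerates all 2^3 interpretations of P, Q, R and checks which are models of KB:

```python
# Truth-table entailment check for KB = ((P ^ Q) => R) ^ (Q => P) ^ Q.
from itertools import product

def implies(a, b):
    return (not a) or b

models_of_kb = []
for P, Q, R in product([True, False], repeat=3):
    kb = implies(P and Q, R) and implies(Q, P) and Q
    if kb:
        models_of_kb.append((P, Q, R))

print(models_of_kb)              # -> [(True, True, True)]  (the single model)
entailed = all(R for (P, Q, R) in models_of_kb)
print("KB |= R:", entailed)      # -> KB |= R: True
```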
Instead of creating a truth table, a proof procedure or inference procedure that uses
sound rules of inference to deduce (i.e., derive) new sentences that are true in all cases
where the premises are true can be used. For example, consider the following:
Since whenever P and P => Q are both true (last row only), Q is true too, Q is said to be
derived from these two premise sentences. We write this as KB |- Q. This local pattern
referencing only two of the M sentences in KB is called the Modus Ponens inference
rule. The truth table shows that this inference rule is sound. It specifies how to make one
kind of step in deriving a conclusion sentence from a KB.
Therefore, given the sentences in KB, construct a proof that a given conclusion sentence
can be derived from KB by applying a sequence of sound inferences using either
sentences in KB or sentences derived earlier in the proof, until the conclusion sentence is
derived.
Resolution: from A v B and ~B v C, derive A v C
1. Q Premise
2. Q => P Premise
3. P Modus Ponens (1,2)
4. (P ^ Q) => R Premise
5. P ^ Q And Introduction (1,3)
6. R Modus Ponens (4,5)
Soundness: If KB |- Q then KB |= Q
That is, if Q is derived from a set of sentences KB using a given set of rules of
inference, then Q is entailed by KB. Hence, inference produces only real
entailments, or any sentence that follows deductively from the premises is valid.
Completeness: If KB |= Q then KB |- Q
That is, if Q is entailed by a set of sentences KB, then Q can be derived from KB
using the rules of inference. Hence, inference produces all entailments, or all
valid sentences can be proved from the premises.
Mortal-Confucius
That is, we have used four symbols to represent the three given sentences. But, given this
representation, the third sentence is not entailed by the first two.
A different representation would be to use three symbols to represent the three sentences
as:
Person => Mortal
Confucius => Person
Confucius => Mortal
In this case the third sentence is entailed by the first two, but we needed an explicit
symbol, Confucius, to represent an individual who is a member of the classes "person"
and "mortal." So, to represent other individuals we must introduce separate symbols for
each one, with means for representing the fact that all individuals who are "people" are
also "mortal." First-Order Logic (abbreviated FOL or FOPC) is expressive enough to
concisely represent this kind of situation. Read about the Wumpus World.
Lecture 8
First-Order Logic (FOL) Syntax
FOL primitives:
Variable symbols. e.g., x, y
Connectives. Same as in PL: not (~), and (^), or (v), implies (=>), if and only if (<=>)
Quantifiers: Universal (∀) and Existential (∃)
Existential quantifier normally used with "and" to specify a list of properties or facts
about an individual.
e.g., (∃x) MIM703student(x) ^ smart(x) means
" There is anMIM703 student who is smart.".
Switching the order of universal quantifiers does not change the meaning: (∀x)(∀y)P(x,y)
is logically equivalent to (∀y)(∀x)P(x,y). Similarly, you can switch the order of existential
quantifiers.
Universal Elimination
If (∀x)P(x) is true, then P(c) is true, where c is a constant in the domain of x.
e.g. from (∀x)eats(Ziggy, x) we can infer eats(Ziggy, IceCream).
The variable symbol can be replaced by any ground term, i.e., any constant symbol or
function symbol applied to ground terms only.
Existential Introduction
If P(c) is true, then (∃x)P(x) is inferred.
e.g., from eats(Ziggy, IceCream) we can infer (∃x)eats(Ziggy, x). All instances of the
given constant symbol are replaced by the new variable symbol. Note that the variable
symbol cannot already exist anywhere in the expression.
Existential Elimination
From (∃x)P(x) infer P(c). For example, from (∃x)eats(Ziggy, x) infer eats(Ziggy, cheese).
Note that the variable is replaced by a brand new constant that does not occur in this or
any other sentence in the Knowledge Base.
In general, given atomic sentences P1, P2, ..., PN, and implication sentence
(Q1 ^ Q2 ^ ... ^ QN) => R, where Q1, ..., QN and R are atomic sentences, and
subst(Theta, Pi) = subst(Theta, Qi) for i=1,...,N, derive new sentence: subst(Theta, R)
subst(Theta, alpha) denotes the result of applying a set of substitutions defined by Theta
to the sentence alpha
A substitution list Theta = {v1/t1, v2/t2, ..., vn/tn} means to replace all occurrences of
variable symbol vi by term ti. Substitutions are made in left-to-right order in the list.
Generalized Modus Ponens is not complete for FOL but is complete for KBs containing
only Horn clauses
Proofs start with the given axioms/premises in KB, deriving new sentences using GMP
until the goal/query sentence is derived.
This defines a forward chaining inference procedure because it moves "forward" from
the KB to the goal.
e.g.: KB = 1. All cats like fish, 2. cats eat everything they like, and 3. Ziggy is a cat.
In FOL, KB = 1. (∀x) cat(x) => likes(x, Fish)
2. (∀x)(∀y) (cat(x) ^ likes(x,y)) => eats(x,y)
3. cat(Ziggy)
Proof:
Use GMP with (1) and (3) to derive: 4. likes(Ziggy, Fish)
Use GMP with (3), (4) and (2) to derive eats(Ziggy, Fish)
So, Yes, Ziggy eats fish.
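A propositional sketch of this forward-chaining proof; the facts here are already ground, so no unification is needed, and the rule and fact names mirror the example:

```python
# Forward chaining over Horn rules: fire rules until the query appears
# or nothing new can be derived.

rules = [
    ({"cat(Ziggy)"}, "likes(Ziggy, Fish)"),                       # (1) with x=Ziggy
    ({"cat(Ziggy)", "likes(Ziggy, Fish)"}, "eats(Ziggy, Fish)"),  # (2) with x=Ziggy, y=Fish
]
facts = {"cat(Ziggy)"}                                            # (3)

def forward_chain(facts, rules, query):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)        # derive a new sentence
                changed = True
    return query in facts

print(forward_chain(facts, rules, "eats(Ziggy, Fish)"))  # -> True
```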
Backward-chaining deduction using GMP is complete for KBs containing only Horn
clauses. Proofs start with the goal query, find implications that would allow you to prove
it, and then prove each of the antecedents in the implication, continuing to work
"backwards" until we get to the axioms, which we know are true.
Proof:
Goal matches RHS of Horn clause (2), so try to prove the new sub-goals cat(Ziggy) and
likes(Ziggy, Fish) that correspond to the LHS of (2).
cat(Ziggy) matches axiom (3), so we've "solved" that sub-goal.
likes(Ziggy, Fish) matches the RHS of (1), so try to prove cat(Ziggy).
cat(Ziggy) matches (as it did earlier) axiom (3), so we've solved this sub-goal.
There are no unsolved sub-goals, so we're done. Yes, Ziggy eats fish.
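The same proof run goal-first can be sketched as a small recursive procedure (again ground facts only, so no unification):

```python
# Backward chaining: prove a goal by proving the premises of a rule
# whose conclusion matches it.

rules = [
    ({"cat(Ziggy)"}, "likes(Ziggy, Fish)"),
    ({"cat(Ziggy)", "likes(Ziggy, Fish)"}, "eats(Ziggy, Fish)"),
]
facts = {"cat(Ziggy)"}

def backward_chain(goal, depth=10):
    if depth == 0:
        return False                      # crude guard against looping
    if goal in facts:
        return True                       # goal matches an axiom
    for premises, conclusion in rules:
        if conclusion == goal and all(backward_chain(p, depth - 1)
                                      for p in premises):
            return True
    return False

print(backward_chain("eats(Ziggy, Fish)"))  # -> True
```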
substitution list Theta, then derive the resolvent sentence:
subst(Theta, P1 v ... v Pj-1 v Pj+1 v ... v Pn v Q1 v ... Qk-1 v Qk+1 v ... v Qm)
Example
From clause P(x, f(a)) v P(x, f(y)) v Q(y) and clause ~P(z, f(a)) v ~Q(z), derive resolvent
clause P(z, f(y)) v Q(y) v ~Q(z) using Theta = {x/z}
Unification
Unification is a "pattern matching" procedure that takes two atomic sentences, called
literals, as input, and returns "failure" if they do not match and a substitution list, Theta,
if they do match. That is, unify(p,q) = Theta means subst(Theta, p) = subst(Theta, q) for
two atomic sentences p and q.
Examples
Literal 1                              Literal 2                              Result of Unify
parents(x, father(x), mother(Bill))    parents(Bill, father(Bill), y)         {x/Bill, y/mother(Bill)}
parents(x, father(x), mother(Bill))    parents(Bill, father(y), z)            {x/Bill, y/Bill, z/mother(Bill)}
parents(x, father(x), mother(Jane))    parents(Bill, father(y), mother(y))    Failure
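A sketch of the unification procedure behind these examples, using the convention that lower-case strings are variables, capitalised strings are constants, and tuples are functor applications (the occurs check is omitted for brevity):

```python
# Unification: returns a substitution dict (Theta) or None for failure.

def is_var(t):
    return isinstance(t, str) and t[0].islower()

def walk(t, theta):
    while is_var(t) and t in theta:       # follow existing bindings
        t = theta[t]
    return t

def unify(p, q, theta=None):
    theta = {} if theta is None else theta
    p, q = walk(p, theta), walk(q, theta)
    if p == q:
        return theta
    if is_var(p):
        return {**theta, p: q}
    if is_var(q):
        return {**theta, q: p}
    if isinstance(p, tuple) and isinstance(q, tuple) and len(p) == len(q):
        for a, b in zip(p, q):            # unify argument-by-argument
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None                           # constant/functor clash: failure

l1 = ("parents", "x", ("father", "x"), ("mother", "Bill"))
l2 = ("parents", "Bill", ("father", "y"), "z")
print(unify(l1, l2))
# -> {'x': 'Bill', 'y': 'Bill', 'z': ('mother', 'Bill')}
```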
3. Semantic Networks
Semantic net is a labelled graph.
Models the associations between ideas that people maintain.
nodes in the graph represent objects, concepts, or situations;
arcs in graph represent relationships between objects.
x is a member of y
x is a y (is-a relationship) e.g. opus is-a penguin
x is R-related to y
e.g. bill friend opus (friend is the relationship)
Inheritance
Inheritance is one of the main kinds of reasoning done in semantic nets
The subset (ako) relation is often used to link a class and its super-class.
Some links (e.g. legs) are inherited along subset paths
The semantics of a semantic net can be relatively informal or very formal
Often defined at the implementation level
A node can have any number of super-classes that contain it, enabling a node to inherit
properties from multiple parent nodes and their ancestors in the network. This can cause
conflicting inheritance.
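Inheritance over a small semantic net can be sketched as a property lookup that climbs is-a/ako links until a value is found; the Opus/penguin net below follows the notes' example, with invented slot values:

```python
# A semantic net as a dict of nodes; properties are inherited along
# is-a and ako links.

net = {
    "opus":    {"is-a": ["penguin"]},
    "penguin": {"ako": ["bird"], "can-fly": False},     # exception to bird
    "bird":    {"ako": ["animal"], "legs": 2, "can-fly": True},
    "animal":  {},
}

def lookup(node, prop):
    frame = net[node]
    if prop in frame:
        return frame[prop]                # local value wins
    for parent in frame.get("is-a", []) + frame.get("ako", []):
        value = lookup(parent, prop)      # climb the subset path
        if value is not None:
            return value
    return None

print(lookup("opus", "legs"))     # -> 2 (inherited from bird)
print(lookup("opus", "can-fly"))  # -> False (penguin overrides bird)
```

With several parents per node, the order in which links are tried decides which of two conflicting inherited values wins.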
Related knowledge is easily clustered.
Efficient in space requirements
Objects represented only once
Frames
Frames are a-kind-of (ako) semantic net with properties and methods
Devised by Marvin Minsky, 1974.
Incorporates certain valuable human thinking characteristics:
Expectations, assumptions, stereotypes. Exceptions. Fuzzy boundaries between classes.
The essence of this form of knowledge representation is typicality, with exceptions, rather
than definition.
Hierarchical structure
How Frames are Organised I
A frame can represent a specific entity, or a general concept
Each frame has:
A name
Slots (attributes) which have values
a specific value
a default value
an inherited value
a pointer to another frame
a procedure that gives the value
Reasoning with Frames
Easy to answer questions such as is x a y?
Simply follow the instance and/or is-a links.
e.g. Is Opus a bird?
Also useful for default reasoning.
Simply inherit all default values that are not explicitly provided.
e.g. How many legs does Opus have?
Frame Organization
In the higher levels of the frame hierarchy, typical knowledge about the class is stored.
In the lower levels, the value in a slot may be a specific value, to overwrite the value
which would otherwise be inherited from a higher frame.
An instance of an object is joined to its class by an instance relationship.
A class is joined to its super class by an is-a relationship.
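A sketch of a frame whose slots may hold a specific value, an inherited default, or an attached procedure that computes the value on demand (all frame and slot names are invented):

```python
# Frames with slot lookup: local value first, else inherit from parent;
# callable slot values act as attached procedures.

class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name, self.parent, self.slots = name, parent, slots

    def get(self, slot):
        if slot in self.slots:
            value = self.slots[slot]
            return value(self) if callable(value) else value  # procedure slot
        if self.parent:                   # inherited / default value
            return self.parent.get(slot)
        return None

bird = Frame("bird", legs=2, flies=True)              # class frame, defaults
penguin = Frame("penguin", parent=bird, flies=False)  # exception overrides
opus = Frame("opus", parent=penguin,
             description=lambda f: f"{f.name} has {f.get('legs')} legs")

print(opus.get("legs"))         # -> 2 (inherited default)
print(opus.get("flies"))        # -> False (penguin overrides bird)
print(opus.get("description"))  # -> opus has 2 legs (computed slot)
```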
Frame Advantages
Fairly intuitive for many applications
Similar to human knowledge organization
Suitable for causal knowledge
Easy to include default information and detect missing values
Easier to understand than logic or rules
Very flexible
Frame Disadvantages
No standards (slot-filler values)
More of a general methodology than a specific representation:
Frame for a class-room will be different for a professor and for a maintenance
worker
No associated reasoning/inference mechanisms
A number of approaches can be employed to deal with errors in the system; examples
are:
1. Certainty Factors
Certainty factors are values used to approximate the degree to which we think a rule in a
rule-based system is correct. These values range from –1 to +1. The negative value
indicates predominance of opposing evidence, while the positive value indicates a
predominance of confirming evidence for the rule being correct.
Min(0.6, 0.8) * 0.9 = 0.54.
If there are two rules talking about the same point then the certainty factor of the point is
considered as the maximum of the CFs of the two rules.
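The two combination rules just described (the minimum over the premise CFs times the rule's CF, then the maximum across rules for the same conclusion) can be sketched directly; the second rule below is hypothetical:

```python
# Certainty-factor arithmetic as described in the notes.

def rule_cf(premise_cfs, rule_strength):
    """CF of a rule's conclusion: min of premise CFs times the rule's CF."""
    return min(premise_cfs) * rule_strength

cf1 = rule_cf([0.6, 0.8], 0.9)      # the worked example: min(0.6, 0.8) * 0.9
cf2 = rule_cf([0.7], 0.5)           # hypothetical second rule for the same point

print(round(cf1, 2))                # -> 0.54
print(round(max(cf1, cf2), 2))      # -> 0.54 (max of the two rules' CFs)
```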
Example 2
certain facts in the light of new evidence. The fundamental notion is that of conditional
probability:
P(H | E)
This is read as the probability that a hypothesis H is true given that we have observed
evidence E for the hypothesis. So, for example, we might be interested in the probability
that a patient has measles given the knowledge that they have spots:
P(patient-has-measles | patient-has-spots)
Sometimes we will know how likely some ``evidence'' is, if some hypothesis is true, but
not the other way around. For example, we may know that 50% of people with measles
have spots. We may also know that:
The only diseases that cause spots are measles, chickenpox and lassa fever.
60% of people with chickenpox have spots.
80% of people with lassa fever have spots.
There is a 1% chance of someone in a given population having measles (given no
evidence for or against).
There is a 1% chance of them having chickenpox.
There is a 0.05% chance of them having lassa fever.
This can be represented more formally as:
P(spots | measles) = 0.5, P(spots | chickenpox) = 0.6, P(spots | lassa fever) = 0.8
P(measles) = 0.01, P(chickenpox) = 0.01, P(lassa fever) = 0.0005
From this we can calculate the probability that they have measles if they have spots (if we
have no other evidence). Abbreviating things somewhat, for the above example we have:
P(m | s) = P(s | m) P(m) / (P(s | m) P(m) + P(s | cp) P(cp) + P(s | lf) P(lf))
         = (0.5)(0.01) / ((0.5)(0.01) + (0.6)(0.01) + (0.8)(0.0005))
         = 0.005 / 0.0114 ≈ 0.44
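The calculation set up above can be carried out directly; the denominator P(spots) sums over the three diseases stated to cause spots:

```python
# Bayes' theorem: P(measles | spots) = P(spots|measles)P(measles) / P(spots).

likelihood = {"measles": 0.5, "chickenpox": 0.6, "lassa_fever": 0.8}
prior = {"measles": 0.01, "chickenpox": 0.01, "lassa_fever": 0.0005}

p_spots = sum(likelihood[d] * prior[d] for d in prior)       # P(spots) = 0.0114
p_measles_given_spots = likelihood["measles"] * prior["measles"] / p_spots
print(round(p_measles_given_spots, 3))   # -> 0.439
```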
Bayes’ theorem is only valid if we know all the conditional probabilities relating to the
evidence in question. In fact, as we consider more and more evidence it quickly becomes
computationally intractable to use Bayes’ theorem, quite apart from the problem of
obtaining and representing all the conditional probabilities. Because of this, Bayes'
theorem is rarely used in practice. However, it is important as it is a well-known, sound way of dealing
with the probabilities of hypotheses given evidence, and as such provides a kind of
standard for assessing other approaches.