First-Order Logic: Representation Revisited
8
In which we notice that the world is blessed with many objects, some of which are
related to other objects, and in which we endeavor to reason about them.
In Chapter 7, we showed how a knowledge-based agent could represent the world in which it
operates and deduce what actions to take. We used propositional logic as our representation
language because it sufficed to illustrate the basic concepts of logic and knowledge-based
agents. Unfortunately, propositional logic is too puny a language to represent knowledge
of complex environments in a concise way. In this chapter, we examine first-order logic,1
which is sufficiently expressive to represent a good deal of our commonsense knowledge.
It also either subsumes or forms the foundation of many other representation languages and
has been studied intensively for many decades. We begin in Section 8.1 with a discussion of
representation languages in general; Section 8.2 covers the syntax and semantics of first-order
logic; Sections 8.3 and 8.4 illustrate the use of first-order logic for simple representations.
In this section, we discuss the nature of representation languages. Our discussion motivates
the development of first-order logic, a much more expressive language than the propositional
logic introduced in Chapter 7. We look at propositional logic and at other kinds of languages
to understand what works and what fails. Our discussion will be cursory, compressing cen-
turies of thought, trial, and error into a few paragraphs.
Programming languages (such as C++ or Java or Lisp) are by far the largest class of
formal languages in common use. Programs themselves represent, in a direct sense, only
computational processes. Data structures within programs can represent facts; for example,
a program could use a 4 × 4 array to represent the contents of the wumpus world. Thus, the
programming language statement World [2,2] ← Pit is a fairly natural way to assert that there
is a pit in square [2,2]. (Such representations might be considered ad hoc; database systems
were developed precisely to provide a more general, domain-independent way to store and
1 Also called first-order predicate calculus, sometimes abbreviated as FOL or FOPC.
286 Chapter 8. First-Order Logic
retrieve facts.) What programming languages lack is any general mechanism for deriving
facts from other facts; each update to a data structure is done by a domain-specific procedure
whose details are derived by the programmer from his or her own knowledge of the domain.
This procedural approach can be contrasted with the declarative nature of propositional logic,
in which knowledge and inference are separate, and inference is entirely domain independent.
A second drawback of data structures in programs (and of databases, for that matter)
is the lack of any easy way to say, for example, “There is a pit in [2,2] or [3,1]” or “If the
wumpus is in [1,1] then he is not in [2,2].” Programs can store a single value for each variable,
and some systems allow the value to be “unknown,” but they lack the expressiveness required
to handle partial information.
Propositional logic is a declarative language because its semantics is based on a truth
relation between sentences and possible worlds. It also has sufficient expressive power to
deal with partial information, using disjunction and negation. Propositional logic has a third
property that is desirable in representation languages, namely, compositionality. In a compositional
language, the meaning of a sentence is a function of the meaning of its parts. For
example, the meaning of “S1,4 ∧ S1,2 ” is related to the meanings of “S1,4 ” and “S1,2 .” It
would be very strange if “S1,4 ” meant that there is a stench in square [1,4] and “S1,2 ” meant
that there is a stench in square [1,2], but “S1,4 ∧ S1,2 ” meant that France and Poland drew 1–1
in last week’s ice hockey qualifying match. Clearly, noncompositionality makes life much
more difficult for the reasoning system.
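Compositionality can be made concrete with a small evaluator in which the truth value of a compound sentence is computed only from the truth values of its parts. The following is a minimal sketch, not notation from this chapter: the tuple encoding of sentences and the names `holds`, `model`, `S14`, and `S12` are assumptions made for illustration.

```python
# Sentences are nested tuples: a proposition symbol is a string, and
# ("not", s), ("and", s1, s2), ("or", s1, s2) are compound sentences.

def holds(sentence, model):
    """Return the truth value of `sentence` in `model` (dict: symbol -> bool)."""
    if isinstance(sentence, str):
        return model[sentence]
    op = sentence[0]
    if op == "not":
        return not holds(sentence[1], model)
    if op == "and":
        return holds(sentence[1], model) and holds(sentence[2], model)
    if op == "or":
        return holds(sentence[1], model) or holds(sentence[2], model)
    raise ValueError(f"unknown operator: {op!r}")

# A possible world with a stench in [1,4] but none in [1,2].
model = {"S14": True, "S12": False}
both_stenches = ("and", "S14", "S12")
either_stench = ("or", "S14", "S12")
```

Here the value of `both_stenches` depends on nothing except the values of `S14` and `S12` in the model, which is exactly what compositionality requires.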
As we saw in Chapter 7, however, propositional logic lacks the expressive power to
concisely describe an environment with many objects. For example, we were forced to write
a separate rule about breezes and pits for each square, such as
B1,1 ⇔ (P1,2 ∨ P2,1 ) .
In English, on the other hand, it seems easy enough to say, once and for all, “Squares adjacent
to pits are breezy.” The syntax and semantics of English somehow make it possible to describe
the environment concisely.
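To see how much repetition the propositional encoding requires, we can generate the per-square rules mechanically. The sketch below assumes a 4 × 4 grid with squares numbered from 1; the helper names `neighbors` and `breeze_rule` are hypothetical and serve only to illustrate the point.

```python
def neighbors(x, y, size=4):
    """Squares orthogonally adjacent to [x, y] that lie inside the grid."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(i, j) for i, j in candidates if 1 <= i <= size and 1 <= j <= size]

def breeze_rule(x, y, size=4):
    """Build the propositional rule for square [x, y] as a string."""
    pits = " ∨ ".join(f"P{i},{j}" for i, j in sorted(neighbors(x, y, size)))
    return f"B{x},{y} ⇔ ({pits})"

# One separate rule is needed for every square of the grid.
rules = [breeze_rule(x, y) for x in range(1, 5) for y in range(1, 5)]
```

The English sentence "Squares adjacent to pits are breezy" says once what this list says sixteen times.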
recover its meaning without also storing a representation of the context—which raises the
question of how the context itself can be represented. Natural languages also suffer from
ambiguity, a problem for a representation language. As Pinker (1995) puts it: “When people
think about spring, surely they are not confused as to whether they are thinking about a season
or something that goes boing—and if one word can correspond to two thoughts, thoughts
can’t be words.”
The famous Sapir–Whorf hypothesis claims that our understanding of the world is
strongly influenced by the language we speak. Whorf (1956) wrote “We cut nature up, orga-
nize it into concepts, and ascribe significances as we do, largely because we are parties to an
agreement to organize it this way—an agreement that holds throughout our speech commu-
nity and is codified in the patterns of our language.” It is certainly true that different speech
communities divide up the world differently. The French have two words, “chaise” and “fauteuil,”
for a concept that English speakers cover with one: “chair.” But English speakers
can easily recognize the category fauteuil and give it a name—roughly “open-arm chair”—so
does language really make a difference? Whorf relied mainly on intuition and speculation,
but in the intervening years real data have become available from anthropological,
psychological, and neurological studies.
For example, can you remember which of the following two phrases formed the opening
of Section 8.1?
“In this section, we discuss the nature of representation languages . . .”
“This section covers the topic of knowledge representation languages . . .”
Wanner (1974) did a similar experiment and found that subjects made the right choice at
chance level—about 50% of the time—but remembered the content of what they read with
better than 90% accuracy. This suggests that people process the words to form some kind of
nonverbal representation.
More interesting is the case in which a concept is completely absent in a language.
Speakers of the Australian aboriginal language Guugu Yimithirr have no words for relative
directions, such as front, back, right, or left. Instead they use absolute directions, saying,
for example, the equivalent of “I have a pain in my north arm.” This difference in language
makes a difference in behavior: Guugu Yimithirr speakers are better at navigating in open
terrain, while English speakers are better at placing the fork to the right of the plate.
Language also seems to influence thought through seemingly arbitrary grammatical
features such as the gender of nouns. For example, “bridge” is masculine in Spanish and
feminine in German. Boroditsky (2003) asked subjects to choose English adjectives to de-
scribe a photograph of a particular bridge. Spanish speakers chose big, dangerous, strong,
and towering, whereas German speakers chose beautiful, elegant, fragile, and slender. Words
can serve as anchor points that affect how we perceive the world. Loftus and Palmer (1974)
showed experimental subjects a movie of an auto accident. Subjects who were asked “How
fast were the cars going when they contacted each other?” reported an average of 32 mph,
while subjects who were asked the question with the word “smashed” instead of “contacted”
reported 41 mph for the same cars in the same movie.
In a first-order logic reasoning system that uses CNF, we can see that the linguistic forms
“¬(A ∨ B)” and “¬A ∧ ¬B” are the same because we can look inside the system and see
that the two sentences are stored in the same canonical CNF form. Can we do that with the
human brain? Until recently the answer was “no,” but now it is “maybe.” Mitchell et al.
(2008) put subjects in an fMRI (functional magnetic resonance imaging) machine, showed
them words such as “celery,” and imaged their brains. The researchers were then able to train
a computer program to predict, from a brain image, what word the subject had been presented
with. Given two choices (e.g., “celery” or “airplane”), the system predicts correctly 77% of
the time. The system can even predict at above-chance levels for words it has never seen
an fMRI image of before (by considering the images of related words) and for people it has
never seen before (proving that fMRI reveals some level of common representation across
people). This type of work is still in its infancy, but fMRI (and other imaging technology
such as intracranial electrophysiology (Sahin et al., 2009)) promises to give us much more
concrete ideas of what human knowledge representations are like.
From the viewpoint of formal logic, representing the same knowledge in two different
ways makes absolutely no difference; the same facts will be derivable from either represen-
tation. In practice, however, one representation might require fewer steps to derive a conclu-
sion, meaning that a reasoner with limited resources could get to the conclusion using one
representation but not the other. For nondeductive tasks such as learning from experience,
outcomes are necessarily dependent on the form of the representations used. We show in
Chapter 18 that when a learning program considers two possible theories of the world, both
of which are consistent with all the data, the most common way of breaking the tie is to choose
the most succinct theory—and that depends on the language used to represent theories. Thus,
the influence of language on thought is unavoidable for any agent that does learning.
The language of first-order logic, whose syntax and semantics we define in the next section,
is built around objects and relations. It has been so important to mathematics, philosophy, and
artificial intelligence precisely because those fields—and indeed, much of everyday human
existence—can be usefully thought of as dealing with objects and the relations among them.
First-order logic can also express facts about some or all of the objects in the universe. This
enables one to represent general laws or rules, such as the statement “Squares neighboring
the wumpus are smelly.”
The primary difference between propositional and first-order logic lies in the ontological
commitment made by each language; that is, what it assumes about the nature of reality.
Mathematically, this commitment is expressed through the nature of the formal models with
respect to which the truth of sentences is defined. For example, propositional logic assumes
that there are facts that either hold or do not hold in the world. Each fact can be in one
of two states: true or false, and each model assigns true or false to each proposition sym-
bol (see Section 7.4.2).2 First-order logic assumes more; namely, that the world consists of
objects with certain relations among them that do or do not hold. The formal models are
correspondingly more complicated than those for propositional logic. Special-purpose logics
make still further ontological commitments; for example, temporal logic assumes that facts
hold at particular times and that those times (which may be points or intervals) are ordered.
Thus, special-purpose logics give certain kinds of objects (and the axioms about them) “first
class” status within the logic, rather than simply defining them within the knowledge base.
Higher-order logic views the relations and functions referred to by first-order logic as objects
in themselves. This allows one to make assertions about all relations; for example, one
could wish to define what it means for a relation to be transitive. Unlike most special-purpose
logics, higher-order logic is strictly more expressive than first-order logic, in the sense that
some sentences of higher-order logic cannot be expressed by any finite number of first-order
logic sentences.
A logic can also be characterized by its epistemological commitments, that is, the possible
states of knowledge that it allows with respect to each fact. In both propositional and first-
order logic, a sentence represents a fact and the agent either believes the sentence to be true,
believes it to be false, or has no opinion. These logics therefore have three possible states
of knowledge regarding any sentence. Systems using probability theory, on the other hand,
2 In contrast, facts in fuzzy logic have a degree of truth between 0 and 1. For example, the sentence “Vienna is
a large city” might be true in our world only to degree 0.6 in fuzzy logic.
can have any degree of belief, ranging from 0 (total disbelief) to 1 (total belief). 3 For ex-
ample, a probabilistic wumpus-world agent might believe that the wumpus is in [1,3] with
probability 0.75. The ontological and epistemological commitments of five different logics
are summarized in Figure 8.1.
Figure 8.1 Formal languages and their ontological and epistemological commitments.
In the next section, we will launch into the details of first-order logic. Just as a student of
physics requires some familiarity with mathematics, a student of AI must develop a talent for
working with logical notation. On the other hand, it is also important not to get too concerned
with the specifics of logical notation—after all, there are dozens of different versions. The
main things to keep hold of are how the language facilitates concise representations and how
its semantics leads to sound reasoning procedures.
We begin this section by specifying more precisely the way in which the possible worlds
of first-order logic reflect the ontological commitment to objects and relations. Then we
introduce the various elements of the language, explaining their semantics as we go along.
objects: Richard the Lionheart, King of England from 1189 to 1199; his younger brother, the
evil King John, who ruled from 1199 to 1215; the left legs of Richard and John; and a crown.
The objects in the model may be related in various ways. In the figure, Richard and
John are brothers. Formally speaking, a relation is just the set of tuples of objects that are
related. (A tuple is a collection of objects arranged in a fixed order and is written with angle
brackets surrounding the objects.) Thus, the brotherhood relation in this model is the set
{ ⟨Richard the Lionheart, King John⟩, ⟨King John, Richard the Lionheart⟩ } . (8.1)
(Here we have named the objects in English, but you may, if you wish, mentally substitute the
pictures for the names.) The crown is on King John’s head, so the “on head” relation contains
just one tuple, ⟨the crown, King John⟩. The “brother” and “on head” relations are binary
relations—that is, they relate pairs of objects. The model also contains unary relations, or
properties: the “person” property is true of both Richard and John; the “king” property is true
only of John (presumably because Richard is dead at this point); and the “crown” property is
true only of the crown.
Certain kinds of relationships are best considered as functions, in that a given object
must be related to exactly one object in this way. For example, each person has one left leg,
so the model has a unary “left leg” function that includes the following mappings:
Richard the Lionheart → Richard’s left leg
King John → John’s left leg .   (8.2)
Strictly speaking, models in first-order logic require total functions, that is, there must be a
value for every input tuple. Thus, the crown must have a left leg and so must each of the left
legs. There is a technical solution to this awkward problem involving an additional “invisible”
Figure 8.2 A model containing five objects, two binary relations, three unary relations
(indicated by labels on the objects), and one unary function, left-leg.
object that is the left leg of everything that has no left leg, including itself. Fortunately, as
long as one makes no assertions about the left legs of things that have no left legs, these
technicalities are of no import.
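The model just described can be written down directly as data. The following is a minimal sketch under stated assumptions: the string labels, the variable names, and the encoding of relations as Python sets of tuples are all illustrative choices, and `NO_LEG` stands for the additional "invisible" object, so the domain here has six elements rather than the five drawn in the figure.

```python
# The five objects of the model, plus the extra "invisible" object.
RICHARD, JOHN = "Richard the Lionheart", "King John"
R_LEG, J_LEG, CROWN = "Richard's left leg", "John's left leg", "the crown"
NO_LEG = "invisible object"  # assumed device that makes left_leg total

domain = {RICHARD, JOHN, R_LEG, J_LEG, CROWN, NO_LEG}

# Binary relations are sets of ordered pairs.
brother = {(RICHARD, JOHN), (JOHN, RICHARD)}
on_head = {(CROWN, JOHN)}

# Unary relations (properties) are sets of 1-tuples.
person = {(RICHARD,), (JOHN,)}
king = {(JOHN,)}
crown = {(CROWN,)}

# A total function has a value for every object in the domain:
# everything without a left leg is mapped to NO_LEG, including NO_LEG itself.
left_leg = {obj: NO_LEG for obj in domain}
left_leg[RICHARD] = R_LEG
left_leg[JOHN] = J_LEG
```

As long as no sentence mentions the left leg of the crown or of a leg, the entries that map to `NO_LEG` never matter.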
So far, we have described the elements that populate models for first-order logic. The
other essential part of a model is the link between those elements and the vocabulary of the
logical sentences, which we explain next.
Term → Function(Term, . . .)
| Constant
| Variable
Quantifier → ∀ | ∃
Constant → A | X1 | John | · · ·
Variable → a | x | s | · · ·
Predicate → True | False | After | Loves | Raining | · · ·
Function → Mother | LeftLeg | · · ·
OPERATOR PRECEDENCE: ¬, =, ∧, ∨, ⇒, ⇔
Figure 8.3 The syntax of first-order logic with equality, specified in Backus–Naur form
(see page 1060 if you are not familiar with this notation). Operator precedences are specified,
from highest to lowest. The precedence of quantifiers is such that a quantifier holds over
everything to the right of it.
Figure 8.4 Some members of the set of all models for a language with two constant symbols,
R and J, and one binary relation symbol. The interpretation of each constant symbol is
shown by a gray arrow. Within each model, the related objects are connected by arrows.
8.2.3 Terms
A term is a logical expression that refers to an object. Constant symbols are therefore terms,
but it is not always convenient to have a distinct symbol to name every object. For example,
in English we might use the expression “King John’s left leg” rather than giving a name
to his leg. This is what function symbols are for: instead of using a constant symbol, we
use LeftLeg (John ). In the general case, a complex term is formed by a function symbol
followed by a parenthesized list of terms as arguments to the function symbol. It is important
to remember that a complex term is just a complicated kind of name. It is not a “subroutine
call” that “returns a value.” There is no LeftLeg subroutine that takes a person as input and
returns a leg. We can reason about left legs (e.g., stating the general rule that everyone has one
and then deducing that John must have one) without ever providing a definition of LeftLeg .
This is something that cannot be done with subroutines in programming languages.5
The formal semantics of terms is straightforward. Consider a term f (t1 , . . . , tn ). The
function symbol f refers to some function in the model (call it F ); the argument terms refer
to objects in the domain (call them d1 , . . . , dn ); and the term as a whole refers to the object
that is the value of the function F applied to d1 , . . . , dn . For example, suppose the LeftLeg
function symbol refers to the function shown in Equation (8.2) and John refers to King John,
then LeftLeg (John ) refers to King John’s left leg. In this way, the interpretation fixes the
referent of every term.
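The semantics of terms can be sketched as a short recursive lookup. In this illustration the interpretation is supplied as two tables, `constants` and `functions`; these names, the tuple encoding of complex terms, and the helper `refer` are assumptions made for the example. Note that `LeftLeg` is given here only as a table of mappings supplied by the model, not as a procedure that computes legs.

```python
def refer(term, constants, functions):
    """Return the domain object that `term` refers to.

    A term is either a constant symbol (a string) or a complex term
    written as a tuple (function_symbol, arg_term, ...).
    """
    if isinstance(term, str):
        return constants[term]
    symbol, *args = term
    # First find the referents d1, ..., dn of the argument terms ...
    objects = tuple(refer(arg, constants, functions) for arg in args)
    # ... then look up the value of the function F applied to them.
    return functions[symbol][objects]

constants = {"John": "King John", "Richard": "Richard the Lionheart"}
functions = {
    "LeftLeg": {
        ("King John",): "John's left leg",
        ("Richard the Lionheart",): "Richard's left leg",
    }
}

johns_leg = refer(("LeftLeg", "John"), constants, functions)
```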
An atomic sentence (or atom for short) is formed from a predicate symbol optionally followed by a
parenthesized list of terms, such as
Brother (Richard , John ).
This states, under the intended interpretation given earlier, that Richard the Lionheart is the
brother of King John.6 Atomic sentences can have complex terms as arguments. Thus,
Married (Father (Richard ), Mother (John ))
states that Richard the Lionheart’s father is married to King John’s mother (again, under a
suitable interpretation).
An atomic sentence is true in a given model if the relation referred to by the predicate
symbol holds among the objects referred to by the arguments.
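This truth condition amounts to a membership test. The sketch below is one possible encoding, assuming an interpretation stored as a dictionary with `constants`, `functions`, and `relations` tables; the short object labels such as `"R"` and `"J-leg"` and the helper names are hypothetical.

```python
def refer(term, interp):
    """Referent of a term: a constant symbol or (function_symbol, args...)."""
    if isinstance(term, str):
        return interp["constants"][term]
    symbol, *args = term
    return interp["functions"][symbol][tuple(refer(a, interp) for a in args)]

def atom_is_true(atom, interp):
    """An atom (predicate_symbol, term, ...) is true if the referents of its
    arguments form a tuple in the relation the predicate symbol refers to."""
    predicate, *args = atom
    referents = tuple(refer(a, interp) for a in args)
    return referents in interp["relations"][predicate]

interp = {
    "constants": {"Richard": "R", "John": "J"},
    "functions": {"LeftLeg": {("R",): "R-leg", ("J",): "J-leg"}},
    "relations": {
        "Brother": {("R", "J"), ("J", "R")},
        "King": {("J",)},
    },
}
```

Under this interpretation, `atom_is_true(("Brother", "Richard", "John"), interp)` holds, while an atom with a complex term such as `("Brother", ("LeftLeg", "John"), "John")` does not.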
8.2.6 Quantifiers
Once we have a logic that allows objects, it is only natural to want to express properties of
entire collections of objects, instead of enumerating the objects by name. Quantifiers let us
do this. First-order logic contains two standard quantifiers, called universal and existential.
Nested quantifiers
We will often want to express more complex sentences using multiple quantifiers. The sim-
plest case is where the quantifiers are of the same type. For example, “Brothers are siblings”
can be written as
∀ x ∀ y Brother (x, y) ⇒ Sibling (x, y) .
7 There is a variant of the existential quantifier, usually written ∃1 or ∃!, that means “There exists exactly one.”
The same meaning can be expressed using equality statements.
Consecutive quantifiers of the same type can be written as one quantifier with several vari-
ables. For example, to say that siblinghood is a symmetric relationship, we can write
∀ x, y Sibling(x, y) ⇔ Sibling(y, x) .
In other cases we will have mixtures. “Everybody loves somebody” means that for every
person, there is someone that person loves:
∀ x ∃ y Loves(x, y) .
On the other hand, to say “There is someone who is loved by everyone,” we write
∃ y ∀ x Loves(x, y) .
The order of quantification is therefore very important. It becomes clearer if we insert paren-
theses. ∀ x (∃ y Loves(x, y)) says that everyone has a particular property, namely, the prop-
erty that they love someone. On the other hand, ∃ y (∀ x Loves(x, y)) says that someone in
the world has a particular property, namely the property of being loved by everybody.
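The effect of quantifier order can be checked on a small finite model. In the sketch below the three people and the `loves` relation are invented for illustration, and the domain is assumed to contain only those three objects; `all` plays the role of ∀ and `any` the role of ∃.

```python
# A hypothetical model in which each person loves the next one in a cycle.
people = ["alice", "bob", "carol"]
loves = {("alice", "bob"), ("bob", "carol"), ("carol", "alice")}

def everybody_loves_somebody(domain, loves):
    # ∀x ∃y Loves(x, y)
    return all(any((x, y) in loves for y in domain) for x in domain)

def someone_loved_by_everyone(domain, loves):
    # ∃y ∀x Loves(x, y)
    return any(all((x, y) in loves for x in domain) for y in domain)

first = everybody_loves_somebody(people, loves)
second = someone_loved_by_everyone(people, loves)
```

In this model the first sentence is true and the second is false, so the two orderings cannot mean the same thing.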
Some confusion can arise when two quantifiers are used with the same variable name.
Consider the sentence
∀ x (Crown (x) ∨ (∃ x Brother (Richard , x))) .
Here the x in Brother (Richard , x) is existentially quantified. The rule is that the variable
belongs to the innermost quantifier that mentions it; then it will not be subject to any other
quantification. Another way to think of it is this: ∃ x Brother (Richard , x) is a sentence
about Richard (that he has a brother), not about x; so putting a ∀ x outside it has no effect. It
could equally well have been written ∃ z Brother (Richard , z). Because this can be a source
of confusion, we will always use different variable names with nested quantifiers.
8.2.7 Equality
First-order logic includes one more way to make atomic sentences, other than using a predicate
and terms as described earlier. We can use the equality symbol to signify that two terms
refer to the same object. For example,
Father (John ) = Henry
says that the object referred to by Father (John ) and the object referred to by Henry are the
same. Because an interpretation fixes the referent of any term, determining the truth of an
equality sentence is simply a matter of seeing that the referents of the two terms are the same
object.
The equality symbol can be used to state facts about a given function, as we just did for
the Father symbol. It can also be used with negation to insist that two terms are not the same
object. To say that Richard has at least two brothers, we would write
∃ x, y Brother (x, Richard ) ∧ Brother (y, Richard ) ∧ ¬(x = y) .
The sentence
∃ x, y Brother (x, Richard ) ∧ Brother (y, Richard )
does not have the intended meaning. In particular, it is true in the model of Figure 8.2, where
Richard has only one brother. To see this, consider the extended interpretation in which both
x and y are assigned to King John. The addition of ¬(x = y) rules out such models. The
notation x ≠ y is sometimes used as an abbreviation for ¬(x = y).
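The difference between the two sentences can be verified by brute force over the model of Figure 8.2. This is a sketch with assumed object labels; the invisible extra object is omitted because it takes part in no brotherhood tuple.

```python
from itertools import product

domain = ["Richard", "John", "Richard's left leg", "John's left leg", "crown"]
brother = {("Richard", "John"), ("John", "Richard")}

def without_inequality():
    # ∃x,y Brother(x, Richard) ∧ Brother(y, Richard)
    return any(
        (x, "Richard") in brother and (y, "Richard") in brother
        for x, y in product(domain, repeat=2)
    )

def with_inequality():
    # ∃x,y Brother(x, Richard) ∧ Brother(y, Richard) ∧ ¬(x = y)
    return any(
        (x, "Richard") in brother and (y, "Richard") in brother and x != y
        for x, y in product(domain, repeat=2)
    )
```

The first function returns true because both variables may be assigned to John; the second returns false, as intended for a model in which Richard has only one brother.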
Figure 8.5 Some members of the set of all models for a language with two constant symbols,
R and J, and one binary relation symbol, under database semantics. The interpretation
of the constant symbols is fixed, and there is a distinct object for each constant symbol.
contains no more domain elements than those named by the constant symbols. Under the
resulting semantics, which we call database semantics to distinguish it from the standard
semantics of first-order logic, the sentence Equation (8.3) does indeed state that Richard’s
two brothers are John and Geoffrey. Database semantics is also used in logic programming
systems, as explained in Section 9.4.5.
It is instructive to consider the set of all possible models under database semantics for
the same case as shown in Figure 8.4. Figure 8.5 shows some of the models, ranging from
the model with no tuples satisfying the relation to the model with all tuples satisfying the
relation. With two objects, there are four possible two-element tuples, so there are 2^4 = 16
different subsets of tuples that can satisfy the relation. Thus, there are 16 possible models in
all—a lot fewer than the infinitely many models for the standard first-order semantics. On the
other hand, the database semantics requires definite knowledge of what the world contains.
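The count of 16 can be reproduced by enumerating every subset of the four ordered pairs. A minimal sketch, assuming the two objects are labeled `"R"` and `"J"` and that a model is identified with the set of tuples satisfying the one binary relation:

```python
from itertools import combinations, product

objects = ["R", "J"]
pairs = list(product(objects, repeat=2))  # the four ordered pairs

# Each subset of the pairs is one possible extension of the relation,
# and hence one model under database semantics.
models = [
    frozenset(subset)
    for k in range(len(pairs) + 1)
    for subset in combinations(pairs, k)
]
```

The list runs from the empty relation to the relation containing all four pairs.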
This example brings up an important point: there is no one “correct” semantics for
logic. The usefulness of any proposed semantics depends on how concise and intuitive it
makes the expression of the kinds of knowledge we want to write down, and on how easy
and natural it is to develop the corresponding rules of inference. Database semantics is most
useful when we are certain about the identity of all the objects described in the knowledge
base and when we have all the facts at hand; in other cases, it is quite awkward. For the rest
of this chapter, we assume the standard semantics while noting instances in which this choice
leads to cumbersome expressions.
Now that we have defined an expressive logical language, it is time to learn how to use it. The
best way to do this is through examples. We have seen some simple sentences illustrating the
various aspects of logical syntax; in this section, we provide more systematic representations
of some simple domains. In knowledge representation, a domain is just some part of the
world about which we wish to express some knowledge.
We begin with a brief description of the TELL/ASK interface for first-order knowledge
bases. Then we look at the domains of family relationships, numbers, sets, and lists, and at
Section 8.3. Using First-Order Logic 301
the wumpus world. The next section contains a more substantial example (electronic circuits)
and Chapter 12 covers everything in the universe.
We can go through each function and predicate, writing down what we know in terms
of the other symbols. For example, one’s mother is one’s female parent:
∀ m, c Mother (c) = m ⇔ Female (m) ∧ Parent (m, c) .
One’s husband is one’s male spouse:
∀ w, h Husband (h, w) ⇔ Male (h) ∧ Spouse (h, w) .
Male and female are disjoint categories:
∀ x Male(x) ⇔ ¬Female (x) .
Parent and child are inverse relations:
∀ p, c Parent (p, c) ⇔ Child (c, p) .
A grandparent is a parent of one’s parent:
∀ g, c Grandparent (g, c) ⇔ ∃ p Parent (g, p) ∧ Parent (p, c) .
A sibling is another child of one’s parents:
∀ x, y Sibling(x, y) ⇔ x ≠ y ∧ ∃ p Parent (p, x) ∧ Parent (p, y) .
We could go on for several more pages like this, and Exercise 8.14 asks you to do just that.
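Some of these definitions can be exercised on a tiny family. The sketch below is an illustration only: the four names and the `parent` facts are invented, and the quantifiers range over just the people mentioned in those facts, which is an assumption closer to database semantics than to the standard semantics. A sibling is another child of one's parents, so the two arguments of `sibling` must be distinct.

```python
# Hypothetical Parent facts, written as (parent, child) pairs.
parent = {("Ann", "Bob"), ("Bob", "Cid"), ("Bob", "Dee")}
people = {name for pair in parent for name in pair}

# Parent and child are inverse relations.
child = {(c, p) for (p, c) in parent}

def grandparent(g, c):
    # Grandparent(g, c) ⇔ ∃p Parent(g, p) ∧ Parent(p, c)
    return any((g, p) in parent and (p, c) in parent for p in people)

def sibling(x, y):
    # Sibling(x, y) ⇔ x ≠ y ∧ ∃p Parent(p, x) ∧ Parent(p, y)
    return x != y and any((p, x) in parent and (p, y) in parent for p in people)

# Check that siblinghood is symmetric for every pair in this family.
symmetric = all(sibling(x, y) == sibling(y, x) for x in people for y in people)
```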
Each of these sentences can be viewed as an axiom of the kinship domain, as explained
in Section 7.1. Axioms are commonly associated with purely mathematical domains—we
will see some axioms for numbers shortly—but they are needed in all domains. They provide
the basic factual information from which useful conclusions can be derived. Our kinship
axioms are also definitions; they have the form ∀ x, y P (x, y) ⇔ . . .. The axioms define
the Mother function and the Husband , Male , Parent , Grandparent , and Sibling predicates
in terms of other predicates. Our definitions “bottom out” at a basic set of predicates (Child ,
Spouse , and Female ) in terms of which the others are ultimately defined. This is a natural
way in which to build up the representation of a domain, and it is analogous to the way in
which software packages are built up by successive definitions of subroutines from primitive
library functions. Notice that there is not necessarily a unique set of primitive predicates;
we could equally well have used Parent , Spouse , and Male. In some domains, as we show,
there is no clearly identifiable basic set.
Not all logical sentences about a domain are axioms. Some are theorems—that is, they
are entailed by the axioms. For example, consider the assertion that siblinghood is symmetric:
∀ x, y Sibling(x, y) ⇔ Sibling(y, x) .
Is this an axiom or a theorem? In fact, it is a theorem that follows logically from the axiom
that defines siblinghood. If we ASK the knowledge base this sentence, it should return true.
From a purely logical point of view, a knowledge base need contain only axioms and
no theorems, because the theorems do not increase the set of conclusions that follow from
the knowledge base. From a practical point of view, theorems are essential to reduce the
computational cost of deriving new sentences. Without them, a reasoning system has to start
from first principles every time, rather like a physicist having to rederive the rules of calculus
for every new problem.
Not all axioms are definitions. Some provide more general information about certain
predicates without constituting a definition. Indeed, some predicates have no complete defi-
nition because we do not know enough to characterize them fully. For example, there is no
obvious definitive way to complete the sentence
∀ x Person(x) ⇔ . . .
Fortunately, first-order logic allows us to make use of the Person predicate without com-
pletely defining it. Instead, we can write partial specifications of properties that every person
has and properties that make something a person:
∀ x Person(x) ⇒ . . .
∀ x . . . ⇒ Person(x) .
Axioms can also be “just plain facts,” such as Male (Jim) and Spouse (Jim, Laura ).
Such facts form the descriptions of specific problem instances, enabling specific questions
to be answered. The answers to these questions will then be theorems that follow from
the axioms. Often, one finds that the expected answers are not forthcoming—for example,
from Spouse (Jim, Laura ) one expects (under the laws of many countries) to be able to infer
¬Spouse (George, Laura ); but this does not follow from the axioms given earlier—even after
we add Jim ≠ George as suggested in Section 8.2.8. This is a sign that an axiom is missing.
Exercise 8.8 asks the reader to supply it.
logic is called prefix.) To make our sentences about numbers easier to read, we allow the use
of infix notation. We can also write S(n) as n + 1, so the second axiom becomes
∀ m, n NatNum(m) ∧ NatNum(n) ⇒ (m + 1) + n = (m + n) + 1 .
This axiom reduces addition to repeated application of the successor function.
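The reduction can be sketched by representing each natural number as a nest of successor applications. The recursive case below follows the axiom (m + 1) + n = (m + n) + 1 directly. The base case, 0 + n = n, is an assumption here, since the corresponding first axiom is not shown in this passage; the names `ZERO`, `S`, `add`, `from_int`, and `to_int` are likewise illustrative.

```python
ZERO = "0"  # the single constant symbol

def S(n):
    """Successor function: S(n) stands for n + 1."""
    return ("S", n)

def add(m, n):
    if m == ZERO:
        return n  # assumed base case: 0 + n = n
    _, m_pred = m  # m is S(m_pred), i.e. m_pred + 1
    # (m_pred + 1) + n = (m_pred + n) + 1
    return S(add(m_pred, n))

def from_int(k):
    """Build the term S(S(...S(0)...)) with k successor applications."""
    n = ZERO
    for _ in range(k):
        n = S(n)
    return n

def to_int(n):
    """Count the successor applications in a term."""
    count = 0
    while n != ZERO:
        _, n = n
        count += 1
    return count

five = add(from_int(2), from_int(3))
```

Each recursive call peels one successor off the first argument, so computing m + n applies the successor function m times to n.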
The use of infix notation is an example of syntactic sugar, that is, an extension to or
abbreviation of the standard syntax that does not change the semantics. Any sentence that
uses sugar can be “desugared” to produce an equivalent sentence in ordinary first-order logic.
Once we have addition, it is straightforward to define multiplication as repeated addi-
tion, exponentiation as repeated multiplication, integer division and remainders, prime num-
bers, and so on. Thus, the whole of number theory (including cryptography) can be built up
from one constant, one function, one predicate and four axioms.
The domain of sets is also fundamental to mathematics as well as to commonsense
reasoning. (In fact, it is possible to define number theory in terms of set theory.) We want to
be able to represent individual sets, including the empty set. We need a way to build up sets
by adding an element to a set or taking the union or intersection of two sets. We will want
to know whether an element is a member of a set and we will want to distinguish sets from
objects that are not sets.
We will use the normal vocabulary of set theory as syntactic sugar. The empty set is a
constant written as { }. There is one unary predicate, Set , which is true of sets. The binary
predicates are x ∈ s (x is a member of set s) and s1 ⊆ s2 (set s1 is a subset, not necessarily
proper, of set s2 ). The binary functions are s1 ∩ s2 (the intersection of two sets), s1 ∪ s2
(the union of two sets), and {x|s} (the set resulting from adjoining element x to set s). One
possible set of axioms is as follows:
1. The only sets are the empty set and those made by adjoining something to a set:
∀ s Set (s) ⇔ (s = { }) ∨ (∃ x, s2 Set (s2 ) ∧ s = {x|s2 }) .
2. The empty set has no elements adjoined into it. In other words, there is no way to
decompose { } into a smaller set and an element:
¬∃ x, s {x|s} = { } .
3. Adjoining an element already in the set has no effect:
∀ x, s x ∈ s ⇔ s = {x|s} .
4. The only members of a set are the elements that were adjoined into it. We express
this recursively, saying that x is a member of s if and only if s is equal to some set s2
adjoined with some element y, where either y is the same as x or x is a member of s2 :
∀ x, s x ∈ s ⇔ ∃ y, s2 (s = {y|s2 } ∧ (x = y ∨ x ∈ s2 )) .
5. A set is a subset of another set if and only if all of the first set’s members are members
of the second set:
∀ s1 , s2 s1 ⊆ s2 ⇔ (∀ x x ∈ s1 ⇒ x ∈ s2 ) .
6. Two sets are equal if and only if each is a subset of the other:
∀ s1 , s2 (s1 = s2 ) ⇔ (s1 ⊆ s2 ∧ s2 ⊆ s1 ) .
7. An object is in the intersection of two sets if and only if it is a member of both sets:
∀ x, s1 , s2 x ∈ (s1 ∩ s2 ) ⇔ (x ∈ s1 ∧ x ∈ s2 ) .
8. An object is in the union of two sets if and only if it is a member of either set:
∀ x, s1 , s2 x ∈ (s1 ∪ s2 ) ⇔ (x ∈ s1 ∨ x ∈ s2 ) .
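One way to get a feel for these axioms is to build a small executable model of them. The sketch below is an assumption-laden illustration, not the text's own construction: a set term {x|s} is represented as a Python pair (x, s), the empty set as None, and all function names are hypothetical. Membership follows axiom 4, subset follows axiom 5, and equality follows axiom 6.

```python
EMPTY = None  # stands for the constant { }

def adjoin(x, s):
    """The term {x|s}: element x adjoined to set s."""
    return (x, s)

def elements(s):
    """Walk the nested term, yielding each adjoined element in turn."""
    while s is not EMPTY:
        y, s = s
        yield y

def member(x, s):
    """Axiom 4: x is in s iff s = {y|s2} and (x = y or x is in s2)."""
    return any(x == y for y in elements(s))

def subset(s1, s2):
    """Axiom 5: every member of s1 is a member of s2."""
    return all(member(x, s2) for x in elements(s1))

def set_equal(s1, s2):
    """Axiom 6: two sets are equal iff each is a subset of the other."""
    return subset(s1, s2) and subset(s2, s1)

def intersection(s1, s2):
    """Axiom 7: keep exactly the elements that are members of both sets."""
    result = EMPTY
    for x in elements(s1):
        if member(x, s2):
            result = adjoin(x, result)
    return result

def union(s1, s2):
    """Axiom 8: an element is in the result iff it is a member of either set."""
    result = s2
    for x in elements(s1):
        result = adjoin(x, result)
    return result

a = adjoin(1, adjoin(2, EMPTY))
b = adjoin(2, adjoin(3, EMPTY))
print(list(elements(intersection(a, b))))
print(set_equal(adjoin(1, a), a))  # axiom 3: adjoining an existing element has no effect
```

Note that adjoin(1, a) and a are different Python terms. They count as the same set only under set_equal, which is how this sketch reflects axiom 3.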
Lists are similar to sets. The differences are that lists are ordered and the same element can
appear more than once in a list. We can use the vocabulary of Lisp for lists: Nil is the constant
list with no elements; Cons, Append, First, and Rest are functions; and Find is the predicate
that does for lists what Member does for sets. List? is a predicate that is true only of
lists. As with sets, it is common to use syntactic sugar in logical sentences involving lists. The
empty list is [ ]. The term Cons(x, y), where y is a nonempty list, is written [x|y]. The term
Cons(x, Nil) (i.e., the list containing the element x) is written as [x]. A list of several elements,
such as [A, B, C], corresponds to the nested term Cons(A, Cons(B, Cons(C, Nil))).
Exercise 8.16 asks you to write out the axioms for lists.
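The correspondence between the sugared form [A, B, C] and the nested Cons term can be made concrete with a short sketch. It assumes that Cons(x, y) is modeled as a Python pair and Nil as None; the function names are hypothetical, and the recursive definitions of find and append are plausible readings of the vocabulary, not the axioms that the exercise asks for.

```python
NIL = None  # stands for the constant Nil, the empty list [ ]

def cons(x, y):
    """The term Cons(x, y): element x placed in front of list y."""
    return (x, y)

def first(l):
    """First(Cons(x, y)) = x."""
    return l[0]

def rest(l):
    """Rest(Cons(x, y)) = y."""
    return l[1]

def desugar(items):
    """Turn the sugared list [A, B, C] into Cons(A, Cons(B, Cons(C, Nil)))."""
    term = NIL
    for x in reversed(items):
        term = cons(x, term)
    return term

def find(x, l):
    """Find(x, l) holds iff x is the first element of l or is found in the rest of l."""
    while l is not NIL:
        if first(l) == x:
            return True
        l = rest(l)
    return False

def append(l1, l2):
    """Append(Nil, l2) = l2 and Append(Cons(x, y), l2) = Cons(x, Append(y, l2))."""
    if l1 is NIL:
        return l2
    return cons(first(l1), append(rest(l1), l2))

print(desugar(["A", "B", "C"]))
```

Unlike the set sketch, two list terms are equal exactly when the nested pairs are equal, because order and repetition both matter for lists.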
Given the percept and rules from the preceding paragraphs, this would yield the desired con-
clusion BestAction (Grab, 5)—that is, Grab is the right thing to do.
We have represented the agent’s inputs and outputs; now it is time to represent the
environment itself. Let us begin with objects. Obvious candidates are squares, pits, and the
wumpus. We could name each square—Square 1,2 and so on—but then the fact that Square 1,2
and Square 1,3 are adjacent would have to be an “extra” fact, and we would need one such
fact for each pair of squares. It is better to use a complex term in which the row and column
appear as integers; for example, we can simply use the list term [1, 2]. Adjacency of any two
squares can be defined as
∀ x, y, a, b Adjacent ([x, y], [a, b]) ⇔
(x = a ∧ (y = b − 1 ∨ y = b + 1)) ∨ (y = b ∧ (x = a − 1 ∨ x = a + 1)) .
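The definition translates almost symbol for symbol into an executable check. The following sketch assumes squares are given as two-element Python lists, mirroring the list terms [x, y]; the function name adjacent is hypothetical.

```python
def adjacent(s, r):
    """True iff squares s = [x, y] and r = [a, b] satisfy the definition of Adjacent."""
    (x, y), (a, b) = s, r
    same_column = x == a and (y == b - 1 or y == b + 1)
    same_row = y == b and (x == a - 1 or x == a + 1)
    return same_column or same_row

print(adjacent([1, 2], [1, 3]))  # the two squares named in the text
print(adjacent([1, 2], [2, 3]))  # diagonal neighbors
```

Because the row and column appear as integers inside the term, no per-pair adjacency facts are needed, which is the point of using a complex term instead of a separate name for every square.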
We could name each pit, but this would be inappropriate for a different reason: there is no
reason to distinguish among pits.¹⁰ It is simpler to use a unary predicate Pit that is true of
squares containing pits. Finally, since there is exactly one wumpus, a constant Wumpus is
just as good as a unary predicate (and perhaps more dignified from the wumpus’s viewpoint).
The agent’s location changes over time, so we write At(Agent , s, t) to mean that the
agent is at square s at time t. We can fix the wumpus’s location with ∀t At(Wumpus, [2, 2], t).
We can then say that objects can only be at one location at a time:
∀ x, s1 , s2 , t At(x, s1 , t) ∧ At(x, s2 , t) ⇒ s1 = s2 .
Given its current location, the agent can infer properties of the square from properties of its
current percept. For example, if the agent is at a square and perceives a breeze, then that
square is breezy:
∀ s, t At(Agent , s, t) ∧ Breeze(t) ⇒ Breezy (s) .
It is useful to know that a square is breezy because we know that the pits cannot move about.
Notice that Breezy has no time argument.
Having discovered which places are breezy (or smelly) and, very important, not breezy
(or not smelly), the agent can deduce where the pits are (and where the wumpus is). Whereas
propositional logic necessitates a separate axiom for each square (see R2 and R3 on page 247)
and would need a different set of axioms for each geographical layout of the world, first-order
logic just needs one axiom:
∀ s Breezy (s) ⇔ ∃ r Adjacent (r, s) ∧ Pit (r) . (8.4)
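To see that this single axiom covers every square and every layout, we can evaluate its right-hand side in a small finite model. The sketch below makes several assumptions that are not in the text: squares are Python tuples, not list terms (so that they can be stored in a set), the world is 4 × 4, and the pit layout PITS is hypothetical.

```python
def adjacent(r, s):
    """Adjacency of squares r = (x, y) and s = (a, b), as defined for Adjacent."""
    (x, y), (a, b) = r, s
    return (x == a and abs(y - b) == 1) or (y == b and abs(x - a) == 1)

def breezy(s, pits, size=4):
    """Right-hand side of Equation (8.4): some square r adjacent to s contains a pit."""
    squares = [(x, y) for x in range(1, size + 1) for y in range(1, size + 1)]
    return any(adjacent(r, s) and r in pits for r in squares)

# Hypothetical pit layout, chosen only for illustration.
PITS = {(3, 1), (3, 3), (4, 4)}

for square in [(1, 1), (2, 1), (4, 3)]:
    print(square, breezy(square, PITS))
```

Changing PITS or size requires no change to breezy, which mirrors the claim that the first-order axiom is independent of the geographical layout.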
Similarly, in first-order logic we can quantify over time, so we need just one successor-state
axiom for each predicate, rather than a different copy for each time step. For example, the
axiom for the arrow (Equation (7.2) on page 267) becomes
∀ t HaveArrow (t + 1) ⇔ (HaveArrow (t) ∧ ¬Action (Shoot , t)) .
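Because the axiom is quantified over t, one rule applied repeatedly determines HaveArrow at every time step. A minimal sketch follows, assuming that actions are given as a list of strings indexed by time and that the agent has the arrow at time 0; both the function name and the initial condition are assumptions made for illustration.

```python
def have_arrow_trace(actions, have_arrow_at_0=True):
    """Apply HaveArrow(t + 1) <=> (HaveArrow(t) and not Action(Shoot, t)) for each t."""
    trace = [have_arrow_at_0]
    for action in actions:
        trace.append(trace[-1] and action != "Shoot")
    return trace

# trace[t] is the truth value of HaveArrow(t)
print(have_arrow_trace(["Forward", "Shoot", "Forward"]))
```

Once the arrow has been shot, the right-hand side can never become true again, so every later entry of the trace is false.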
From these two example sentences, we can see that the first-order logic formulation is no
less concise than the original English-language description given in Chapter 7. The reader
is invited to construct analogous axioms for the agent’s location and orientation; in these
cases, the axioms quantify over both space and time. As in the case of propositional state
estimation, an agent can use logical inference with axioms of this kind to keep track of aspects
of the world that are not directly observed. Chapter 10 goes into more depth on the subject of
first-order successor-state axioms and their uses for constructing plans.
¹⁰ Similarly, most of us do not name each bird that flies overhead as it migrates to warmer regions in winter. An
ornithologist wishing to study migration patterns, survival rates, and so on does name each bird, by means of a
ring on its leg, because individual birds must be tracked.
The preceding section illustrated the use of first-order logic to represent knowledge in three
simple domains. This section describes the general process of knowledge-base construction,
a process called knowledge engineering. A knowledge engineer is someone who investigates
a particular domain, learns what concepts are important in that domain, and creates a formal
representation of the objects and relations in the domain. We illustrate the knowledge engi-
neering process in an electronic circuit domain that should already be fairly familiar, so that
we can concentrate on the representational issues involved. The approach we take is suitable
for developing special-purpose knowledge bases whose domain is carefully circumscribed
and whose range of queries is known in advance. General-purpose knowledge bases, which
cover a broad range of human knowledge and are intended to support tasks such as natural
language understanding, are discussed in Chapter 12.
3. Decide on a vocabulary of predicates, functions, and constants. That is, translate the
important domain-level concepts into logic-level names. This involves many questions
of knowledge-engineering style. Like programming style, this can have a significant
impact on the eventual success of the project. For example, should pits be represented
by objects or by a unary predicate on squares? Should the agent’s orientation be a
function or a predicate? Should the wumpus’s location depend on time? Once the
choices have been made, the result is a vocabulary that is known as the ontology of
the domain. The word ontology means a particular theory of the nature of being or
existence. The ontology determines what kinds of things exist, but does not determine
their specific properties and interrelationships.
4. Encode general knowledge about the domain. The knowledge engineer writes down
the axioms for all the vocabulary terms. This pins down (to the extent possible) the
meaning of the terms, enabling the expert to check the content. Often, this step reveals
misconceptions or gaps in the vocabulary that must be fixed by returning to step 3 and
iterating through the process.
5. Encode a description of the specific problem instance. If the ontology is well thought
out, this step will be easy. It will involve writing simple atomic sentences about in-
stances of concepts that are already part of the ontology. For a logical agent, problem
instances are supplied by the sensors, whereas a “disembodied” knowledge base is sup-
plied with additional sentences in the same way that traditional programs are supplied
with input data.
6. Pose queries to the inference procedure and get answers. This is where the reward is:
we can let the inference procedure operate on the axioms and problem-specific facts to
derive the facts we are interested in knowing. Thus, we avoid the need for writing an
application-specific solution algorithm.
7. Debug the knowledge base. Alas, the answers to queries will seldom be correct on
the first try. More precisely, the answers will be correct for the knowledge base as
written, assuming that the inference procedure is sound, but they will not be the ones
that the user is expecting. For example, if an axiom is missing, some queries will not be
answerable from the knowledge base. A considerable debugging process could ensue.
Missing axioms or axioms that are too weak can be easily identified by noticing places
where the chain of reasoning stops unexpectedly. For example, if the knowledge base
includes a diagnostic rule (see Exercise 8.13) for finding the wumpus,
∀ s Smelly(s) ⇒ Adjacent (Home(Wumpus ), s) ,
instead of the biconditional, then the agent will never be able to prove the absence of
wumpuses. Incorrect axioms can be identified because they are false statements about
the world. For example, the sentence
∀ x NumOfLegs(x, 4) ⇒ Mammal (x)
is false for reptiles, amphibians, and, more importantly, tables. The falsehood of this
sentence can be determined independently of the rest of the knowledge base. In contrast,
Figure 8.6 A digital circuit C1, purporting to be a one-bit full adder. The first two inputs
are the two bits to be added, and the third input is a carry bit. The first output is the sum, and
the second output is a carry bit for the next adder. The circuit contains two XOR gates, two
AND gates, and one OR gate.
signal on the output terminal that flows along another wire. To determine what these signals
will be, we need to know how the gates transform their input signals. There are four types
of gates: AND, OR, and XOR gates have two input terminals, and NOT gates have one. All
gates have one output terminal. Circuits, like gates, have input and output terminals.
To reason about functionality and connectivity, we do not need to talk about the wires
themselves, the paths they take, or the junctions where they come together. All that matters
is the connections between terminals—we can say that one output terminal is connected to
another input terminal without having to say what actually connects them. Other factors such
as the size, shape, color, or cost of the various components are irrelevant to our analysis.
If our purpose were something other than verifying designs at the gate level, the ontol-
ogy would be different. For example, if we were interested in debugging faulty circuits, then
it would probably be a good idea to include the wires in the ontology, because a faulty wire
can corrupt the signal flowing along it. For resolving timing faults, we would need to include
gate delays. If we were interested in designing a product that would be profitable, then the
cost of the circuit and its speed relative to other products on the market would be important.
Decide on a vocabulary
We now know that we want to talk about circuits, terminals, signals, and gates. The next step
is to choose functions, predicates, and constants to represent them. First, we need to be able
to distinguish gates from each other and from other objects. Each gate is represented as an
object named by a constant, about which we assert that it is a gate with, say, Gate(X1). The
behavior of each gate is determined by its type: one of the constants AND, OR, XOR, or
NOT. Because a gate has exactly one type, a function is appropriate: Type(X1) = XOR.
Circuits, like gates, are identified by a predicate: Circuit(C1).
Next we consider terminals, which are identified by the predicate Terminal(x). A gate
or circuit can have one or more input terminals and one or more output terminals. We use the
function In(1, X1) to denote the first input terminal for gate X1. A similar function Out is
used for output terminals. The predicate Arity(c, i, j) says that circuit c has i input and j output
terminals. The connectivity between gates can be represented by a predicate, Connected,
which takes two terminals as arguments, as in Connected(Out(1, X1), In(1, X2)).
Finally, we need to know whether a signal is on or off. One possibility is to use a unary
predicate, On (t), which is true when the signal at a terminal is on. This makes it a little
difficult, however, to pose questions such as “What are all the possible values of the signals
at the output terminals of circuit C1 ?” We therefore introduce as objects two signal values, 1
and 0, and a function Signal (t) that denotes the signal value for the terminal t.
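The vocabulary chosen above is already enough to describe and simulate a small circuit. The following sketch is illustrative only. Terms such as In(1, X1) are modeled as Python tuples, Type as a dictionary, Connected as a set of (source, destination) pairs, and Signal as a recursive function. The wiring listed in CONNECTED is an assumption (a standard full-adder layout consistent with the caption of Figure 8.6), since the connections are not spelled out in this section, and NOT gates are omitted because the circuit contains none.

```python
def In(n, g):
    """The term In(n, g): the n-th input terminal of gate or circuit g."""
    return ("In", n, g)

def Out(n, g):
    """The term Out(n, g): the n-th output terminal of gate or circuit g."""
    return ("Out", n, g)

# Type(g) for each gate named in the circuit.
TYPE = {"X1": "XOR", "X2": "XOR", "A1": "AND", "A2": "AND", "O1": "OR"}

# Connected(source, destination) facts. ASSUMED wiring of a standard full adder.
CONNECTED = {
    (In(1, "C1"), In(1, "X1")), (In(1, "C1"), In(1, "A1")),
    (In(2, "C1"), In(2, "X1")), (In(2, "C1"), In(2, "A1")),
    (In(3, "C1"), In(2, "X2")), (In(3, "C1"), In(1, "A2")),
    (Out(1, "X1"), In(1, "X2")), (Out(1, "X1"), In(2, "A2")),
    (Out(1, "A2"), In(1, "O1")), (Out(1, "A1"), In(2, "O1")),
    (Out(1, "X2"), Out(1, "C1")), (Out(1, "O1"), Out(2, "C1")),
}

# How each two-input gate type transforms its input signals.
GATE_FN = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

def signal(t, inputs):
    """Signal(t): the value 0 or 1 at terminal t, given values at the circuit inputs."""
    if t in inputs:
        return inputs[t]
    kind, _, g = t
    if kind == "Out" and g in TYPE:
        # A gate output is determined by the gate's type and its two input signals.
        return GATE_FN[TYPE[g]](signal(In(1, g), inputs), signal(In(2, g), inputs))
    # Otherwise the terminal carries the signal of whatever it is connected to.
    for source, destination in CONNECTED:
        if destination == t:
            return signal(source, inputs)
    raise ValueError(f"terminal {t} is not connected to anything")

def adder_outputs(a, b, carry_in):
    """Return (sum bit, carry-out bit) at the two output terminals of C1."""
    inputs = {In(1, "C1"): a, In(2, "C1"): b, In(3, "C1"): carry_in}
    return signal(Out(1, "C1"), inputs), signal(Out(2, "C1"), inputs)

print(adder_outputs(1, 1, 0))
```

Because signal values are objects (0 and 1) and Signal is a function, a question such as "what are all the possible values of the signals at the output terminals" becomes a simple enumeration over the eight input combinations.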
8.5 SUMMARY
This chapter has introduced first-order logic, a representation language that is far more pow-
erful than propositional logic. The important points are as follows:
• Knowledge representation languages should be declarative, compositional, expressive,
context independent, and unambiguous.
• Logics differ in their ontological commitments and epistemological commitments.
While propositional logic commits only to the existence of facts, first-order logic com-
mits to the existence of objects and relations and thereby gains expressive power.
• The syntax of first-order logic builds on that of propositional logic. It adds terms to
represent objects, and has universal and existential quantifiers to construct assertions
about all or some of the possible values of the quantified variables.
• A possible world, or model, for first-order logic includes a set of objects and an inter-
pretation that maps constant symbols to objects, predicate symbols to relations among
objects, and function symbols to functions on objects.
• An atomic sentence is true just when the relation named by the predicate holds between
the objects named by the terms. Extended interpretations, which map quantifier vari-
ables to objects in the model, define the truth of quantified sentences.
• Developing a knowledge base in first-order logic requires a careful process of analyzing
the domain, choosing a vocabulary, and encoding the axioms required to support the
desired inferences.
Although Aristotle’s logic deals with generalizations over objects, it fell far short of the ex-
pressive power of first-order logic. A major barrier to its further development was its concen-
tration on one-place predicates to the exclusion of many-place relational predicates. The first
systematic treatment of relations was given by Augustus De Morgan (1864), who cited the
following example to show the sorts of inferences that Aristotle’s logic could not handle: “All
horses are animals; therefore, the head of a horse is the head of an animal.” This inference
is inaccessible to Aristotle because any valid rule that can support this inference must first
analyze the sentence using the two-place predicate “x is the head of y.” The logic of relations
was studied in depth by Charles Sanders Peirce (1870, 2004).
True first-order logic dates from the introduction of quantifiers in Gottlob Frege’s (1879)
Begriffsschrift (“Concept Writing” or “Conceptual Notation”). Peirce (1883) also developed
first-order logic independently of Frege, although slightly later. Frege’s ability to nest quan-
tifiers was a big step forward, but he used an awkward notation. The present notation for
first-order logic is due substantially to Giuseppe Peano (1889), but the semantics is virtually
identical to Frege’s. Oddly enough, Peano’s axioms were due in large measure to Grassmann
(1861) and Dedekind (1888).
Leopold Löwenheim (1915) gave a systematic treatment of model theory for first-order
logic, including the first proper treatment of the equality symbol. Löwenheim’s results were
further extended by Thoralf Skolem (1920). Alfred Tarski (1935, 1956) gave an explicit
definition of truth and model-theoretic satisfaction in first-order logic, using set theory.
McCarthy (1958) was primarily responsible for the introduction of first-order logic as a
tool for building AI systems. The prospects for logic-based AI were advanced significantly by
Robinson’s (1965) development of resolution, a complete procedure for first-order inference
described in Chapter 9. The logicist approach took root at Stanford University. Cordell Green
(1969a, 1969b) developed a first-order reasoning system, QA3, leading to the first attempts to
build a logical robot at SRI (Fikes and Nilsson, 1971). First-order logic was applied by Zohar
Manna and Richard Waldinger (1971) for reasoning about programs and later by Michael
Genesereth (1984) for reasoning about circuits. In Europe, logic programming (a restricted
form of first-order reasoning) was developed for linguistic analysis (Colmerauer et al., 1973)
and for general declarative systems (Kowalski, 1974). Computational logic was also well
entrenched at Edinburgh through the LCF (Logic for Computable Functions) project (Gordon
et al., 1979). These developments are chronicled further in Chapters 9 and 12.
Practical applications built with first-order logic include a system for evaluating the
manufacturing requirements for electronic products (Mannion, 2002), a system for reasoning
about policies for file access and digital rights management (Halpern and Weissman, 2008),
and a system for the automated composition of Web services (McIlraith and Zeng, 2001).
Reactions to the Whorf hypothesis (Whorf, 1956), and to the problem of language and
thought in general, appear in several recent books (Gumperz and Levinson, 1996; Bowerman
and Levinson, 2001; Pinker, 2003; Gentner and Goldin-Meadow, 2003). The “theory” theory
(Gopnik and Glymour, 2002; Tenenbaum et al., 2007) views children’s learning about the
world as analogous to the construction of scientific theories. Just as the predictions of a
machine learning algorithm depend strongly on the vocabulary supplied to it, so will the
child’s formulation of theories depend on the linguistic environment in which learning occurs.
There are a number of good introductory texts on first-order logic, including some by
leading figures in the history of logic: Alfred Tarski (1941), Alonzo Church (1956), and
W.V. Quine (1982) (which is one of the most readable). Enderton (1972) gives a more math-
ematically oriented perspective. A highly formal treatment of first-order logic, along with
many more advanced topics in logic, is provided by Bell and Machover (1977). Manna and
Waldinger (1985) give a readable introduction to logic from a computer science perspec-
tive, as do Huth and Ryan (2004), who concentrate on program verification. Barwise and
Etchemendy (2002) take an approach similar to the one used here. Smullyan (1995) presents
results concisely, using the tableau format. Gallier (1986) provides an extremely rigorous
mathematical exposition of first-order logic, along with a great deal of material on its use in
automated reasoning. Logical Foundations of Artificial Intelligence (Genesereth and Nilsson,
1987) is both a solid introduction to logic and the first systematic treatment of logical agents
with percepts and actions, and there are two good handbooks: van Benthem and ter Meulen
(1997) and Robinson and Voronkov (2001). The journal of record for the field of pure math-
ematical logic is the Journal of Symbolic Logic, whereas the Journal of Applied Logic deals
with concerns closer to those of artificial intelligence.
EXERCISES
8.1 A logical knowledge base represents the world using a set of sentences with no explicit
structure. An analogical representation, on the other hand, has physical structure that corre-
sponds directly to the structure of the thing represented. Consider a road map of your country
as an analogical representation of facts about the country—it represents facts with a map lan-
guage. The two-dimensional structure of the map corresponds to the two-dimensional surface
of the area.
a. Give five examples of symbols in the map language.
b. An explicit sentence is a sentence that the creator of the representation actually writes
down. An implicit sentence is a sentence that results from explicit sentences because
of properties of the analogical representation. Give three examples each of implicit and
explicit sentences in the map language.
c. Give three examples of facts about the physical structure of your country that cannot be
represented in the map language.
d. Give two examples of facts that are much easier to express in the map language than in
first-order logic.
e. Give two other examples of useful analogical representations. What are the advantages
and disadvantages of each of these languages?
8.2 Consider a knowledge base containing just two sentences: P (a) and P (b). Does this
knowledge base entail ∀ x P (x)? Explain your answer in terms of models.
8.3 Is the sentence ∃ x, y x = y valid? Explain.
8.4 Write down a logical sentence such that every world in which it is true contains exactly
one object.
8.5 Consider a symbol vocabulary that contains c constant symbols, pk predicate symbols
of each arity k, and fk function symbols of each arity k, where 1 ≤ k ≤ A. Let the domain
size be fixed at D. For any given model, each predicate or function symbol is mapped onto a
relation or function, respectively, of the same arity. You may assume that the functions in the
model allow some input tuples to have no value for the function (i.e., the value is the invisible
object). Derive a formula for the number of possible models for a domain with D elements.
Don’t worry about eliminating redundant combinations.
8.6 Which of the following are valid (necessarily true) sentences?
a. (∃x x = x) ⇒ (∀ y ∃z y = z).
b. ∀ x P (x) ∨ ¬P (x).
c. ∀ x Smart (x) ∨ (x = x).
8.7 Consider a version of the semantics for first-order logic in which models with empty
domains are allowed. Give at least two examples of sentences that are valid according to the
standard semantics but not according to the new semantics. Discuss which outcome makes
more intuitive sense for your examples.
8.8 Does the fact ¬Spouse(George, Laura) follow from the facts Jim ≠ George and
Spouse(Jim, Laura)? If so, give a proof; if not, supply additional axioms as needed. What
happens if we use Spouse as a unary function symbol instead of a binary predicate?
8.9 This exercise uses the function MapColor and predicates In (x, y), Borders (x, y), and
Country (x), whose arguments are geographical regions, along with constant symbols for
various regions. In each of the following we give an English sentence and a number of can-
didate logical expressions. For each of the logical expressions, state whether it (1) correctly
expresses the English sentence; (2) is syntactically invalid and therefore meaningless; or (3)
is syntactically valid but does not express the meaning of the English sentence.
a. Paris and Marseilles are both in France.
(i) In (Paris ∧ Marseilles, France ).
(ii) In (Paris , France ) ∧ In (Marseilles, France ).
(iii) In (Paris , France ) ∨ In (Marseilles, France ).
b. There is a country that borders both Iraq and Pakistan.
(i) ∃c Country (c) ∧ Border (c, Iraq ) ∧ Border (c, Pakistan ).
(ii) ∃c Country (c) ⇒ [Border (c, Iraq ) ∧ Border (c, Pakistan )].
(iii) [∃ c Country (c)] ⇒ [Border (c, Iraq ) ∧ Border (c, Pakistan )].
(iv) ∃c Border (Country (c), Iraq ∧ Pakistan ).
c. All countries that border Ecuador are in South America.
(i) ∀c Country(c) ∧ Border (c, Ecuador ) ⇒ In (c, SouthAmerica ).
(ii) ∀c Country (c) ⇒ [Border (c, Ecuador ) ⇒ In (c, SouthAmerica )].
(iii) ∀c [Country (c) ⇒ Border (c, Ecuador )] ⇒ In (c, SouthAmerica ).
(iv) ∀c Country(c) ∧ Border (c, Ecuador ) ∧ In (c, SouthAmerica ).
d. No region in South America borders any region in Europe.
(i) ¬[∃ c, d In(c, SouthAmerica ) ∧ In (d, Europe ) ∧ Borders (c, d)].
(ii) ∀ c, d [In (c, SouthAmerica ) ∧ In(d, Europe )] ⇒ ¬Borders (c, d)].
(iii) ¬∀ c In (c, SouthAmerica ) ⇒ ∃ d In (d, Europe ) ∧ ¬Borders (c, d).
(iv) ∀ c In (c, SouthAmerica ) ⇒ ∀ d In (d, Europe ) ⇒ ¬Borders (c, d).
e. No two adjacent countries have the same map color.
(i) ∀ x, y ¬Country (x) ∨ ¬Country (y) ∨ ¬Borders (x, y) ∨
¬(MapColor (x) = MapColor (y)).
(ii) ∀ x, y (Country (x) ∧ Country (y) ∧ Borders (x, y) ∧ ¬(x = y)) ⇒
¬(MapColor (x) = MapColor (y)).
(iii) ∀ x, y Country (x) ∧ Country (y) ∧ Borders (x, y) ∧
¬(MapColor (x) = MapColor (y)).
(iv) ∀ x, y (Country (x) ∧ Country (y) ∧ Borders (x, y)) ⇒ MapColor (x = y).
8.12 Rewrite the first two Peano axioms in Section 8.3.3 as a single axiom that defines
NatNum(x) so as to exclude the possibility of natural numbers except for those generated by
the successor function.
8.13 Equation (8.4) on page 306 defines the conditions under which a square is breezy. Here
we consider two other ways to describe this aspect of the wumpus world.
a. We can write diagnostic rules leading from observed effects to hidden causes. For finding
pits, the obvious diagnostic rules say that if a square is breezy, some adjacent square
must contain a pit; and if a square is not breezy, then no adjacent square contains a pit.
Write these two rules in first-order logic and show that their conjunction is logically
equivalent to Equation (8.4).
b. We can write causal rules leading from cause to effect. One obvious causal rule is that
a pit causes all adjacent squares to be breezy. Write this rule in first-order logic, explain
why it is incomplete compared to Equation (8.4), and supply the missing axiom.
Figure 8.7 A typical family tree. The symbol “ ” connects spouses and arrows point to
children.
8.16 Using the set axioms as examples, write axioms for the list domain, including all the
constants, functions, and predicates mentioned in the chapter.
8.17 Explain what is wrong with the following proposed definition of adjacent squares in
the wumpus world:
∀ x, y Adjacent ([x, y], [x + 1, y]) ∧ Adjacent ([x, y], [x, y + 1]) .
8.18 Write out the axioms required for reasoning about the wumpus’s location, using a
constant symbol Wumpus and a binary predicate At(Wumpus, Location ). Remember that
there is only one wumpus.
8.19 Assuming predicates Parent (p, q) and Female (p) and constants Joan and Kevin,
with the obvious meanings, express each of the following sentences in first-order logic. (You
may use the abbreviation ∃1 to mean “there exists exactly one.”)
a. Joan has a daughter (possibly more than one, and possibly sons as well).
b. Joan has exactly one daughter (but may have sons as well).
c. Joan has exactly one child, a daughter.
d. Joan and Kevin have exactly one child together.
e. Joan has at least one child with Kevin, and no children with anyone else.
8.20 Arithmetic assertions can be written in first-order logic with the predicate symbol <,
the function symbols + and ×, and the constant symbols 0 and 1. Additional predicates can
also be defined with biconditionals.
a. Represent the property “x is an even number.”
b. Represent the property “x is prime.”
c. Goldbach’s conjecture is the conjecture (unproven as yet) that every even number is
equal to the sum of two primes. Represent this conjecture as a logical sentence.
8.21 In Chapter 6, we used equality to indicate the relation between a variable and its value.
For instance, we wrote WA = red to mean that Western Australia is colored red. Repre-
senting this in first-order logic, we must write more verbosely ColorOf (WA) = red . What
incorrect inference could be drawn if we wrote sentences such as WA = red directly as logical
assertions?
8.22 Write in first-order logic the assertion that every key and at least one of every pair of
socks will eventually be lost forever, using only the following vocabulary: Key(x), x is a key;
Sock (x), x is a sock; Pair (x, y), x and y are a pair; Now , the current time; Before(t1 , t2 ),
time t1 comes before time t2 ; Lost (x, t), object x is lost at time t.
8.23 For each of the following sentences in English, decide if the accompanying first-order
logic sentence is a good translation. If not, explain why not and correct it. (Some sentences
may have more than one error!)
a. No two people have the same social security number.
¬∃ x, y, n Person(x) ∧ Person(y) ⇒ [HasSS #(x, n) ∧ HasSS #(y, n)].
b. John’s social security number is the same as Mary’s.
∃ n HasSS #(John , n) ∧ HasSS #(Mary , n).
c. Everyone’s social security number has nine digits.
∀ x, n Person(x) ⇒ [HasSS #(x, n) ∧ Digits(n, 9)].
d. Rewrite each of the above (uncorrected) sentences using a function symbol SS # instead
of the predicate HasSS #.
8.24 Represent the following sentences in first-order logic, using a consistent vocabulary
(which you must define):
a. Some students took French in spring 2001.
b. Every student who takes French passes it.
c. Only one student took Greek in spring 2001.
d. The best score in Greek is always higher than the best score in French.
e. Every person who buys a policy is smart.
f. No person buys an expensive policy.
g. There is an agent who sells policies only to people who are not insured.
Figure 8.8 A four-bit adder. Each Ad_i is a one-bit adder, as in Figure 8.6 on page 309.
h. There is a barber who shaves all men in town who do not shave themselves.
i. A person born in the UK, each of whose parents is a UK citizen or a UK resident, is a
UK citizen by birth.
j. A person born outside the UK, one of whose parents is a UK citizen by birth, is a UK
citizen by descent.
k. Politicians can fool some of the people all of the time, and they can fool all of the people
some of the time, but they can’t fool all of the people all of the time.
l. All Greeks speak the same language. (Use Speaks (x, l) to mean that person x speaks
language l.)
8.25 Write a general set of facts and axioms to represent the assertion “Wellington heard
about Napoleon’s death” and to correctly answer the question “Did Napoleon hear about
Wellington’s death?”
8.26 Extend the vocabulary from Section 8.4 to define addition for n-bit binary numbers.
Then encode the description of the four-bit adder in Figure 8.8, and pose the queries needed
to verify that it is in fact correct.
8.27 Obtain a passport application for your country, identify the rules determining eligi-
bility for a passport, and translate them into first-order logic, following the steps outlined in
Section 8.4.
8.28 Consider a first-order logical knowledge base that describes worlds containing people,
songs, albums (e.g., “Meet the Beatles”) and disks (i.e., particular physical instances of CDs).
The vocabulary contains the following symbols:
CopyOf (d, a): Predicate. Disk d is a copy of album a.
Owns(p, d): Predicate. Person p owns disk d.
Sings(p, s, a): Album a includes a recording of song s sung by person p.
Wrote(p, s): Person p wrote song s.
McCartney , Gershwin , BHoliday, Joe, EleanorRigby , TheManILove, Revolver :
Constants with the obvious meanings.