
Universal Algebra and Applications in Theoretical Computer Science

S. L. Wismath, K. Denecke

April 25, 2025


Introduction

An algebra is a structure consisting of one or more sets of objects and one or
more operations on the objects. Such structures occur in all areas of Mathematics
and Science: sets of numbers with operations such as addition and
multiplication, sets of relations or functions with the operation of compo-
sition, sets of matrices with operations of addition and multiplication, sets
of propositions with operations of conjunction, disjunction, and negation,
finite automata with a set of inputs, a set of outputs and a set of states,
with state transition and output functions, and so on. In each case we are
interested in the properties of the operations involved, what their arities are
and what laws or axioms they satisfy. In particular, any algebra has a type
associated with it, which specifies the number of operations involved and
the arity of each one. Most students of abstract algebra begin their study
with groups, which are sets with one binary operation which satisfies four
specific axioms or properties. The example of groups shows that not only
single algebras but classes of similar algebras having certain properties in
common are interesting and important.

General or universal algebra is the study of algebras and classes of algebras of
arbitrary types. There are two main approaches to the study of such abstract
algebras. In the first approach, we look for constructions on algebras which
produce new algebras of the same type. For instance, we can look for subsets
of the original set which inherit the same operations and properties, and for
mappings between sets which preserve the operational structure. There are
three main constructions available for producing new algebras from given
ones: the construction of subalgebras, homomorphic images and product al-
gebras. These concepts occur in specific algebraic theories such as group
theory, ring or field theory, but they can be formulated in a more general
way as well. This allows us to prove theorems such as the usual Homomor-
phism or Isomorphism Theorems for groups in a more abstract setting which
covers all types of algebras at once.

The formation of subalgebras, homomorphic images and products gives us
three operators on classes of algebras, and we can look for classes of algebras
which are closed under these operators. Such classes of algebras are called
varieties.

The second approach to the study of abstract algebras involves the study of
terms and identities. Here we classify algebras according to the identities or
axioms they satisfy. Given any set of identities, of a particular type, we can
form the class of all algebras of that type which satisfy the given identities.
Such a class is called an equational class. This point of view is quite different
from the classical structure theoretical approach, and combines logic, model
theory and abstract algebra.

In the first six chapters of this book we develop these two approaches to
general algebra. We begin in Chapter 1 with a detailed list of examples of
various algebras, arising from a number of areas of Mathematics and Com-
puter Science. The subalgebra, homomorphic image, product and quotient
constructions are described in Chapters 1, 3 and 4. In Chapter 5 we intro-
duce terms and identities, to be used in the second approach. This treatment
culminates in Chapter 6 with a complete proof of Birkhoff’s Theorem, which
relates our two approaches: this theorem says that any variety is an equa-
tional class, and vice versa, so that the two approaches lead us to exactly
the same classes of algebras.

These first six chapters provide a solid foundation in the core material of
universal algebra. With a leisurely pace, careful exposition, and lots of ex-
amples and exercises, this section is designed as an introduction suitable for
beginning graduate students or researchers from other areas.

A unique feature of this book is the use of Galois-connections as a main
theme. The concept of a Galois-connection, with the related topics of closure
operators, closure systems and lattices of closed sets, is introduced early
on, in Chapter 2, and used throughout the book. The classical Galois theory,
the relation between groups of automorphisms and fixed points, is touched
on in Chapter 3. The main purpose of abstract Galois theory is to develop
the interplay between two different sets or classes of objects, on the basis
of a binary relation between the sets or classes. This allows us to tackle a
problem involving one kind of object by using the theory of the second kind
of objects.

We make use of our early introduction of Galois-connections and lattices of
closed sets in two main examples. The first of these is the connection Id-Mod,
between sets of identities and classes of algebras, which in Chapter 6 gives us
the complete lattice of all varieties of a given type. The other main example
is the connection Pol-Inv, between sets of relations and sets of operations on
a fixed base set. From this connection we obtain the complete lattice of all
clones of operations on a base set. Both of these examples are followed up
in Chapter 14.

The study of clones is another important feature of this work. Clones (closed
sets of functions) occur in universal algebra as clones of term operations or
of polynomial operations of an algebra, in automata theory as combinations
of elementary switching circuits, and in logic as sets of truth-value functions
generated by elementary truth-value functions.

Chapters 7 and 8 deal with applications of general algebra to theoretical
Computer Science. The terms and free algebras from Chapters 5 and 6 are
used in Chapter 7 to study term rewriting systems, including the important
properties of confluence and termination. Having shown in Chapter 1 that
finite automata may be regarded as heterogeneous or multi-based algebras,
in Chapter 8 we develop the theory of automata and Turing machines as
algebraic machines. Finite automata recognize languages made up of words
which are in fact terms of the free monoid; so they are a special case of more
general machines which recognize sets of terms from the free algebra of any
fixed type. Such general machines are called tree-recognizers, since terms are
often referred to as trees. In Chapter 8 we also use Turing machines to look
at decidability and algorithmic problems in general algebra. Thus these two
chapters give some concrete application of the more theoretical material of
the first six chapters.

Chapters 9 to 12 cover more advanced topics of universal algebra, including
Mal’cev conditions, tame congruence theory and commutators. An important
feature is the study of clones of operations, primality and functional
completeness, which are also important in Computer Science. We study fi-
nite algebras and the varieties they generate, and give an algebraic proof of
Post’s characterization of the lattice of all clones on a two-element set.

The last three chapters, Chapters 13, 14 and 15, tie together the main themes
of Galois-connections, clones and varieties, and algebraic machines. In Chap-
ter 13 we return to the study of complete lattices of closed sets, obtained
from Galois-connections. We describe several methods for obtaining com-
plete sublattices of such complete lattices, and in Chapter 14 illustrate these
methods on our two main examples, the lattice of all varieties of a given
type and the lattice of all clones of operations on a fixed set. This leads to
complete sublattices of M-solid varieties (using M-hyperidentities to obtain
new closure operators) and G-clones and H-clones. We then return to appli-
cations to theoretical Computer Science, by looking at hypersubstitutions as
tree-recognizers. Hypersubstitutions involve replacing not only the leaves of
a tree by elements but also the nodes by term operations. The main result
here is the proof of the equivalence of this “parallel” replacement with the
linear approach. Hypersubstitutions can also be applied to the tree trans-
formations and tree transducers of Chapter 8, and to the syntactical and
semantical hyperunification problems.

An even more general approach to algebraic structures and to structural
thinking is the category-theoretical one. In this approach, clones and equational
theories are combined into the concept of an algebraic theory, and
some parts of the theory become clearer. We believe however that for a be-
ginning student, the universal algebraic approach is a necessary first step in
reaching such higher levels of abstraction.

The material of this book is based on lectures given by the authors at the
University of Potsdam (Germany), Chiangmai University and KhonKaen
University (Thailand) and the University of Blagoevgrad (Bulgaria), and in
research seminars with our students. The authors are grateful for the critical
input of a number of students.

The work of Shelly Wismath on this book was supported by a research leave
from the University of Lethbridge (January to July 2001) and by funding
from the Natural Sciences and Engineering Research Council of Canada. The
hospitality of the Institute of Mathematics of the University of Potsdam, and
particularly the General Algebra Research Group, during the year July 2000
to June 2001 is also gratefully acknowledged. Special thanks to Stephen and
Alice for accompanying me to Germany for a year.
Contents

Introduction v

1 Basic Concepts 1
1.1 Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Subalgebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Congruence Relations and Quotients . . . . . . . . . . . . . . 21
1.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2 Galois Connections and Closures 31


2.1 Closure Operators . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2 Galois Connections . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3 Concept Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

3 Homomorphisms and Isomorphisms 47


3.1 The Homomorphism Theorem . . . . . . . . . . . . . . . . . . 49
3.2 The Isomorphism Theorems . . . . . . . . . . . . . . . . . . . 58
3.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

4 Direct and Subdirect Products 63


4.1 Direct Products . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.2 Subdirect Products . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5 Terms, Trees, and Polynomials 75


5.1 Terms and Trees . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 Term Operations . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.3 Polynomials and Polynomial Operations . . . . . . . . . . . . 85


5.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

6 Identities and Varieties 91


6.1 The Galois Connection (Id, Mod) . . . . . . . . . . . . . . . . 91
6.2 Fully Invariant Congruence Relations . . . . . . . . . . . . . . 95
6.3 The Algebraic Consequence Relation . . . . . . . . . . . . . . 97
6.4 Relatively Free Algebras . . . . . . . . . . . . . . . . . . . . . 98
6.5 Varieties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.6 The Lattice of All Varieties . . . . . . . . . . . . . . . . . . . 109
6.7 Finite Axiomatizability . . . . . . . . . . . . . . . . . . . . . 110
6.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

7 Term Rewriting Systems 115


7.1 Confluence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.2 Reduction Systems . . . . . . . . . . . . . . . . . . . . . . . . 123
7.3 Term Rewriting . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.4 Termination of Term Rewriting Systems . . . . . . . . . . . . 141
7.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

8 Algebraic Machines 147


8.1 Regular Languages . . . . . . . . . . . . . . . . . . . . . . . . 148
8.2 Finite Automata . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.3 Algebraic Operations on Finite Automata . . . . . . . . . . . 159
8.4 Tree Recognizers . . . . . . . . . . . . . . . . . . . . . . . . . 165
8.5 Regular Tree Grammars . . . . . . . . . . . . . . . . . . . . . 169
8.6 Operations on Tree Languages . . . . . . . . . . . . . . . . . 174
8.7 Minimal Tree Recognizers . . . . . . . . . . . . . . . . . . . . 176
8.8 Tree Transducers . . . . . . . . . . . . . . . . . . . . . . . . . 182
8.9 Turing Machines . . . . . . . . . . . . . . . . . . . . . . . . . 185
8.10 Undecidable Problems . . . . . . . . . . . . . . . . . . . . . . 187
8.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

9 Mal’cev-Type Conditions 193


9.1 Congruence Permutability . . . . . . . . . . . . . . . . . . . . 193
9.2 Congruence Distributivity . . . . . . . . . . . . . . . . . . . . 195
9.3 Arithmetical Varieties . . . . . . . . . . . . . . . . . . . . . . 201
9.4 n-Modularity and n-Permutability . . . . . . . . . . . . . . . 203
9.5 Congruence Regular Varieties . . . . . . . . . . . . . . . . . . 205
9.6 Two-Element Algebras . . . . . . . . . . . . . . . . . . . . . . 206

9.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

10 Clones and Completeness 215


10.1 Clones as Algebraic Structures . . . . . . . . . . . . . . . . . 215
10.2 Operations and Relations . . . . . . . . . . . . . . . . . . . . 217
10.3 The Lattice of All Boolean Clones . . . . . . . . . . . . . . . 219
10.4 The Functional Completeness Problem . . . . . . . . . . . . . 228
10.5 Primal Algebras . . . . . . . . . . . . . . . . . . . . . . . . . 231
10.6 Different Generalizations of Primality . . . . . . . . . . . . . 240
10.7 Preprimal Algebras . . . . . . . . . . . . . . . . . . . . . . . . 245
10.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249

11 Tame Congruence Theory 251


11.1 Minimal Algebras . . . . . . . . . . . . . . . . . . . . . . . . . 251
11.2 Tame Congruence Relations . . . . . . . . . . . . . . . . . . . 262
11.3 Permutation Algebras . . . . . . . . . . . . . . . . . . . . . . 269
11.4 The Types of Minimal Algebras . . . . . . . . . . . . . . . . . 276
11.5 Mal’cev Conditions and Omitting Types . . . . . . . . . . . . 281
11.6 Residually Small Varieties . . . . . . . . . . . . . . . . . . . . 286
11.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

12 Term Condition and Commutator 289


12.1 The Term Condition . . . . . . . . . . . . . . . . . . . . . . . 289
12.2 The Commutator . . . . . . . . . . . . . . . . . . . . . . . . . 293
12.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

13 Complete Sublattices 301


13.1 Conjugate Pairs of Closure Operators . . . . . . . . . . . . . 301
13.2 Galois Closed Subrelations . . . . . . . . . . . . . . . . . . . . 308
13.3 Closure Operators on Complete Lattices . . . . . . . . . . . . 316
13.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

14 G-Clones and M -Solid Varieties 325


14.1 G-Clones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
14.2 H-clones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
14.3 M -Solid Varieties . . . . . . . . . . . . . . . . . . . . . . . . . 334
14.4 Intervals in the Lattice L(τ ) . . . . . . . . . . . . . . . . . . . 342
14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

15 Hypersubstitutions and Machines 347


15.1 The Hyperunification Problem . . . . . . . . . . . . . . . . . 347
15.2 Hyper Tree Recognizers . . . . . . . . . . . . . . . . . . . . . 349
15.3 Tree Transformations . . . . . . . . . . . . . . . . . . . . . . . 357
15.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

Bibliography 363

Index 373
Chapter 1

Basic Concepts

In this first chapter we shall introduce the two basic concepts of Universal
Algebra, operations and algebras, and provide a number of examples of al-
gebras. One of these examples, lattices, is useful in a theoretical way as well:
any algebra of any type is accompanied by some lattices, making the theory
of lattices important in the study of all algebras. In Section 3 we examine
the concept of a subalgebra, and the generation of subalgebras. In Section 4
we look at congruence relations and quotient algebras. Congruence relations
on algebras generalize the well-known notion of the congruence modulo n
defined on the ring of all integers, and quotient algebras generalize the con-
struction of quotient or residue rings modulo n.

1.1 Algebras
As a branch of Mathematics, Algebra is the study of algebraic structures, or
sets of objects with operations defined on them. These sets and operations
often arise in other fields of Mathematics and in applications. Let us consider
the following example:

Example 1.1.1 Let D(Φ) be the set of all mappings of symmetry of a
rectangle Φ. As usual, a mapping of symmetry of a rectangle in a plane is
understood to be a motion of the plane under which the rectangle is invari-
ant. We will denote by g1 and g2 the two axes of symmetry of the rectangle.
The set D(Φ) of all mappings of symmetry of the rectangle Φ consists ex-
actly of the four mappings e, s1 , s2 and z0 , where


[Figure: a rectangle with corners labelled 1, 2, 3, 4 (corners 4 and 3 on top, 1 and 2 below), centre O, and the two axes of symmetry g1 and g2.]

e is the identity mapping,
s1 is the reflection of Φ through the line g1,
s2 is the reflection of Φ through the line g2, and
z0 is the rotation around the center point O by π radians.

If we label the corners of the rectangle by 1, 2, 3 and 4, then the four map-
pings of symmetry can be described by the following permutations, in the
usual cyclic notation:

e by (1), s1 by (14)(23),
s2 by (12)(34), z0 by (13)(24).

Then the set of all mappings of symmetry of the rectangle Φ is given by

D(Φ) = {(1), (14)(23), (12)(34), (13)(24)}.

The composition of any two mappings of symmetry of the rectangle Φ is
again such a mapping, so we have a binary operation

◦ : D(Φ) × D(Φ) → D(Φ)

defined on our set D(Φ). Thus we consider the pair (D(Φ); ◦), consisting of
the set of mappings with this binary operation. This is what is called an
algebra: a base set of objects, together with a set of one or more operations
which are defined on this base set. In this example the set of operations
contains only one element, the binary operation of composition of mappings.
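The closure of D(Φ) under composition can be checked mechanically. The following sketch (our own illustration, not from the text) encodes each symmetry mapping as a permutation of the corner labels 1, 2, 3, 4:

```python
# Hypothetical sketch: the four symmetry mappings of the rectangle,
# represented as permutations of the corner labels 1, 2, 3, 4.
e  = {1: 1, 2: 2, 3: 3, 4: 4}   # identity (1)
s1 = {1: 4, 2: 3, 3: 2, 4: 1}   # reflection through g1: (14)(23)
s2 = {1: 2, 2: 1, 3: 4, 4: 3}   # reflection through g2: (12)(34)
z0 = {1: 3, 2: 4, 3: 1, 4: 2}   # rotation by pi:        (13)(24)

D = [e, s1, s2, z0]

def compose(f, g):
    """(f o g)(x) = f(g(x)) -- the binary operation of the algebra (D(Phi); o)."""
    return {x: f[g[x]] for x in g}

# Closure: composing any two symmetries yields another symmetry in D.
assert all(compose(f, g) in D for f in D for g in D)

# Every element is its own inverse, so (D(Phi); o) is the Klein four-group.
assert all(compose(f, f) == e for f in D)
```

For instance, `compose(s1, s2)` works out to the rotation z0, since performing both reflections in succession is the same as rotating by π.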

Notice that we may use the same example to produce a different algebra, as
follows. We can use as our base set the set {1, 2, 3, 4} of the four corners of
the rectangle, and as our set of operations the set

{(1), (14)(23), (12)(34), (13)(24)}

of four unary operations. This gives us the algebra

({1, 2, 3, 4}; {(1), (14)(23), (12)(34), (13)(24)}),

which is completely different from the first one but which can also be used
to describe the geometric situation.

These examples show that in defining an algebra, we must specify in addition
to the base set of objects the set of operations: both how many operations
there are, and what their arities are. We begin by defining formally the
concept of an operation on a set.

Definition 1.1.2 Let A be a set, and let n ≥ 1 be a natural number. A
function f : A^n → A is called an n-ary operation defined on A, and is said
to have arity n. We let O_n(A) be the set of all n-ary operations defined on
A, and let O(A) := ⋃_{n≥1} O_n(A) be the set of all finitary operations defined on
A.

Remark 1.1.3 1. Any n-ary operation f on A can be regarded as an (n+1)-ary
relation defined on A, called the graph of f. This relation is defined by
{(a_1, . . . , a_{n+1}) ∈ A^{n+1} : f(a_1, . . . , a_n) = a_{n+1}}.

2. Definition 1.1.2 can be extended in the following way to the special case
that n = 0, for a nullary operation. We define A^0 := {∅}. A nullary operation
is defined as a function f : {∅} → A. This means that a nullary operation
on A is uniquely determined by the element f (∅) ∈ A. For every element
a ∈ A there is exactly one mapping fa : {∅} → A with fa (∅) = a. Therefore
a nullary operation may be thought of as selecting an element from the set
A. If A = ∅ then there are no nullary operations on A.

3. If A is the two element set {0, 1}, operations on A are called Boolean
operations. If we associate the truth-values false with 0 and true with 1, then
the Boolean operations are just the truth-value functions of the Classical
Propositional Logic. For instance the functions of negation, conjunction,
disjunction, implication, and equivalence are given by the truth-value tables

x  ¬x      ∧ | 0 1      ∨ | 0 1      ⇒ | 0 1      ⇔ | 0 1
0   1      0 | 0 0      0 | 0 1      0 | 1 1      0 | 1 0
1   0      1 | 0 1      1 | 1 1      1 | 0 1      1 | 0 1

respectively.
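These truth-value tables can be reproduced by a short sketch (our own illustration), encoding false as 0 and true as 1:

```python
# Sketch: the Boolean operations on A = {0, 1} as truth-value functions
# (0 = false, 1 = true), matching the tables above.
NOT = lambda x: 1 - x
AND = lambda x, y: x & y
OR  = lambda x, y: x | y
IMP = lambda x, y: OR(NOT(x), y)       # x => y  is  (not x) or y
EQV = lambda x, y: 1 if x == y else 0  # x <=> y

assert [NOT(0), NOT(1)] == [1, 0]
assert [IMP(0, 0), IMP(0, 1), IMP(1, 0), IMP(1, 1)] == [1, 1, 0, 1]
```

Any n-ary Boolean operation can be built from these by composition, a point taken up again when clones are studied in Chapter 10.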
Using the concept of an operation we can now define an algebra. As we have
seen, an algebra should be a pair consisting of a base set of objects and a
set of operations defined on the base set. The set of operations is usually
indexed by some index set I, so we will define an indexed algebra here;
the more general setting of non-indexed algebras will be considered later, in
Chapter 10.

Definition 1.1.4 Let A be a non-empty set. Let I be some non-empty index
set, and let (f_i^A)_{i∈I} be a function which assigns to every element i of I an
n_i-ary operation f_i^A defined on A. Then the pair A = (A; (f_i^A)_{i∈I}) is called
an (indexed) algebra (indexed by the set I). The set A is called the base or
carrier set or universe of A, and (f_i^A)_{i∈I} is called the sequence of fundamental
operations of A. For each i ∈ I the natural number n_i is called the arity of
f_i^A. The sequence τ := (n_i)_{i∈I} of all the arities is called the type of the
algebra A. We use the name Alg(τ) for the class of all algebras of a given
type τ.
Notice that in our definition we do not allow the base set A of an algebra to
be the empty set. It is possible to define an empty algebra with the empty
set as a base, but many theorems about algebras then have to include a
separate case to discuss what happens when the algebra is empty. Thus we
have chosen to exclude this case from our discussion.
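As an illustration of Definition 1.1.4 (our own modelling, not a construction from the book), an indexed algebra can be represented as a base set together with its indexed operations and their arities:

```python
# Hypothetical sketch: an indexed algebra as a base set plus operations
# indexed by I; the type tau records the arity of each fundamental operation.
from dataclasses import dataclass
from typing import Callable, Mapping

@dataclass(frozen=True)
class Algebra:
    base: frozenset
    ops: Mapping[str, Callable]   # the index set I is the set of keys
    arities: Mapping[str, int]    # the type tau = (n_i), i in I

# The group (Z_4; +) as an algebra of type (2):
Z4 = Algebra(frozenset(range(4)),
             {"+": lambda x, y: (x + y) % 4},
             {"+": 2})
assert Z4.ops["+"](3, 2) == 1
```

The same base set with a different choice of operations gives a different algebra, just as the rectangle example produced two different algebras from one geometric situation.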

1.2 Examples
In this section we illustrate our basic definition of an algebra, by presenting
a number of examples, many of which we will also use later. We note that
an operation f^A on a set A will often be denoted simply by f, omitting the
superscript. Also, operations of arities zero, one and two are often said to be
nullary, unary and binary respectively.

Example 1.2.1 As we saw in the previous section, we may describe the set
of symmetry mappings of a rectangle by two different algebras:

A1 = ({1, 2, 3, 4}; (1), (14)(23), (12)(34), (13)(24)),

of type τ1 = (1, 1, 1, 1), or

A2 = ({(1), (14)(23), (12)(34), (13)(24)}; ◦),


of type τ2 = (2). Note that these are both examples of finite algebras, where
the base set is a finite set.

Example 1.2.2 A unar is an algebra U = (U; g^U) of type τ = (1), with
one unary operation.

Example 1.2.3 An algebra (G; ·) of type τ = (2), with one binary oper-
ation, is called a groupoid. Here the single binary operation f is denoted by
· or even just by juxtaposition, so we write x · y or just xy instead of f (x, y).

A groupoid (G; ·) is called abelian or commutative if it also satisfies

(G0) ∀x, y ∈ G (x · y = y · x) (commutative law).

Example 1.2.4 A groupoid (G; ·) is called a semigroup if the binary operation
· is associative; that is, if G satisfies

(G1) ∀x, y, z ∈ G (x · (y · z) = (x · y) · z) (associative law).

Example 1.2.5 An algebra M = (M; ·, e) of type (2, 0) is called a monoid,
if the associative law (G1) and

(G2′) ∀x ∈ M (x · e = e · x = x) (identity law)

are satisfied.

Example 1.2.6 A group is an algebra G = (G; ·) of type (2), which satisfies
the axioms (defining identities) (G1) and

(G2) ∀a, b ∈ G ∃x, y ∈ G (a · x = b and y · a = b) (invertibility).

A group can also be regarded as an algebra G = (G; ·, ⁻¹, e) of type (2, 1, 0),
where the laws (G1), (G2′) and

(G2″) ∀x ∈ G (x · x⁻¹ = x⁻¹ · x = e) (inverse law)

are satisfied.

Example 1.2.7 An algebra Q = (Q; ·) of type (2) is called a quasigroup,
if · is a uniquely invertible, but not necessarily associative binary operation
on the set Q.

A quasigroup can also be characterized by the property that for all a ∈ Q
the following mappings of Q onto Q are bijections:

x ↦ a · x, left multiplication by a,
x ↦ x · a, right multiplication by a.


Quasigroups can also be defined as algebras of type (2, 2, 2) with the operations
·, \ and /, where the following axioms are satisfied:

(Q1) ∀x, y ∈ Q (x \ (x · y) = y),
(Q2) ∀x, y ∈ Q ((x · y)/y = x),
(Q3) ∀x, y ∈ Q (x · (x \ y) = y), and
(Q4) ∀x, y ∈ Q ((x/y) · y = x).
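A small sketch (our own example, not from the book) can verify the axioms (Q1)-(Q4) for a concrete quasigroup: subtraction modulo 5, whose operation is uniquely invertible but not associative:

```python
# Hypothetical example: (Z_5; .) with x . y = (x - y) mod 5 is a quasigroup;
# \ and / are the corresponding left and right division operations.
n = 5
Z = range(n)
mul  = lambda x, y: (x - y) % n
ldiv = lambda x, y: (x - y) % n   # x \ y: the unique z with x . z = y
rdiv = lambda x, y: (x + y) % n   # x / y: the unique z with z . y = x

# Axioms (Q1)-(Q4) hold for all pairs:
assert all(ldiv(x, mul(x, y)) == y and rdiv(mul(x, y), y) == x and
           mul(x, ldiv(x, y)) == y and mul(rdiv(x, y), y) == x
           for x in Z for y in Z)

# The operation is not associative, so this quasigroup is not a semigroup:
assert mul(mul(3, 1), 1) != mul(3, mul(1, 1))
```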

Example 1.2.8 An algebra R = (R; +, ·, −, 0) of type (2, 2, 1, 0)
is called a ring if (R; +, −, 0) is an abelian (commutative) group (with
addition as the binary operation and 0 as an identity element) and (R; ·) is
a semigroup, and also the two distributive laws

(D1) ∀x, y, z ∈ R (x · (y + z) = x · y + x · z) and
(D2) ∀x, y, z ∈ R ((x + y) · z = x · z + y · z)

are satisfied.

A ring (R; +, ·, −, 0, e) with an identity element e for multiplication is
called a skew field if (R \ {0}; ·) is a group. When in addition the operation
· is commutative, the ring R is called a field.

Example 1.2.9 An algebra V = (V; ∧, ∨) of type (2, 2) is called a lattice,
if the following equations are satisfied by its two binary operations, which
are usually called meet and join:

(V1) ∀x, y ∈ V (x ∨ y = y ∨ x),
(V1′) ∀x, y ∈ V (x ∧ y = y ∧ x),
(V2) ∀x, y, z ∈ V (x ∨ (y ∨ z) = (x ∨ y) ∨ z),
(V2′) ∀x, y, z ∈ V (x ∧ (y ∧ z) = (x ∧ y) ∧ z),
(V3) ∀x ∈ V (x ∨ x = x),
(V3′) ∀x ∈ V (x ∧ x = x) (idempotency),
(V4) ∀x, y ∈ V (x ∨ (x ∧ y) = x),
(V4′) ∀x, y ∈ V (x ∧ (x ∨ y) = x) (absorption laws).

If in addition the lattice satisfies the following distributive laws,

(V5) ∀x, y, z ∈ V (x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)),
(V5′) ∀x, y, z ∈ V (x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)),

then the lattice is said to be distributive. We remark that (V3) and (V3′)
follow from (V4) and (V4′) respectively.

A lattice is called modular if it satisfies the modular law

(V6) ∀x, y, z ∈ V (z ≤ x ⇒ x ∧ (y ∨ z) = (x ∧ y) ∨ z).
Lattices are important both as examples of a kind of algebra, and also in the
study of all other kinds of algebras too, since any algebra turns out to have
some lattices associated with it. We shall study such associated lattices in
the coming sections. For now, we show how lattices are connected to partially
ordered sets. A binary relation ≤ on a set A is called a partial order on A
if it is reflexive, anti-symmetric and transitive. There is a close connection
between lattices and partially ordered sets, in the sense that each determines
the other, as the following theorem shows.

Theorem 1.2.10 Let (L; ≤) be a partially ordered set in which for all x, y ∈
L both the infimum ⋀{x, y} and the supremum ⋁{x, y} exist. Then the binary
infimum and supremum operations make (L; ∧, ∨) a lattice. Conversely,
every lattice defines a partially ordered set in which for all x, y the infimum
⋀{x, y} and the supremum ⋁{x, y} exist.

Proof: Let (L; ≤) be a partially ordered set, in which for any two elements
x, y ∈ L the infimum ⋀{x, y} and the supremum ⋁{x, y} exist. If we put
x ∧ y = ⋀{x, y} and x ∨ y = ⋁{x, y}, then the required identities (V1) -
(V4) and (V1′) - (V4′) are easy to verify.

If conversely (L; ∧, ∨) is a lattice, then we define

x ≤ y :⇔ x ∧ y = x,

and show that this gives a partial order relation on L. Reflexivity follows
from (V3′), x ∧ x = x; for antisymmetry we see that

(x ≤ y) ∧ (y ≤ x) ⇔ (x ∧ y = x) ∧ (y ∧ x = y),

and so from commutativity we get

x = x ∧ y = y ∧ x = y;

and transitivity follows from (x ∧ y = x) ∧ (y ∧ z = y) ⇒
x = x ∧ y = x ∧ (y ∧ z) = (x ∧ y) ∧ z = x ∧ z, i.e. x ≤ z. It is
easy to check that ⋀{x, y} = x ∧ y and ⋁{x, y} = x ∨ y are satisfied.
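Theorem 1.2.10 can be illustrated on a small finite example of our own choosing: the divisors of 12 ordered by divisibility, with meet = gcd and join = lcm:

```python
# Hedged illustration of Theorem 1.2.10 (our example, not the book's):
# the divisors of 12 under divisibility form a lattice with
# meet = gcd and join = lcm.
from math import gcd

L = [1, 2, 3, 4, 6, 12]
meet = gcd
join = lambda x, y: x * y // gcd(x, y)   # least common multiple

# The order recovered from the meet, x <= y iff x ^ y = x, is divisibility:
leq = lambda x, y: meet(x, y) == x
assert all(leq(x, y) == (y % x == 0) for x in L for y in L)

# The absorption laws (V4) and (V4') hold:
assert all(join(x, meet(x, y)) == x and meet(x, join(x, y)) == x
           for x in L for y in L)
```

This lattice is in fact distributive; replacing 12 by a number such as 36 changes the base set but not the construction.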

Example 1.2.11 A lattice (or partially ordered set) L in which for all sets
B ⊆ L the infimum ⋀B and the supremum ⋁B exist is called a complete
lattice. Obviously, any finite lattice is complete.


A bounded lattice (V ; ∧, ∨, 0, 1) is an algebra of type (2, 2, 0, 0) which is a
lattice, with two additional nullary operations 0 and 1, which satisfy

(V6) ∀x ∈ V (x ∧ 0 = 0) and
(V7) ∀x ∈ V (x ∨ 1 = 1).

These axioms tell us that in the partial order determined by the lattice, 0
acts as the least element and 1 as the greatest element.

Example 1.2.12 An algebra S = (S; ·) of type (2) is called a semilattice,
if the operation · is an associative, commutative, and idempotent (satisfying
x · x = x) binary operation on S. This means that a semilattice is a
particular kind of semigroup.

Example 1.2.13 An algebra B = (B; ∧, ∨, ¬, 0, 1) of type (2, 2, 1, 0, 0)
is called a Boolean algebra, if (B; ∧, ∨, 0, 1) is a bounded distributive lattice
with an additional unary operation ¬ called complementation, which also
satisfies the two identities

(B1) ∀x ∈ B (x ∧ ¬x = 0) and
(B2) ∀x ∈ B (x ∨ ¬x = 1) (complement laws).

As an example of a Boolean algebra, we recall the set of all Boolean functions
on the two-element set {0, 1} from Remark 1.1.3, 3. This gives us the algebra
({0, 1}; ∧, ∨, ¬, 0, 1), usually denoted by the name 2B . Another example
of a Boolean algebra is the power set of a set A, with the operations of
intersection, union, complementation, empty set and A.
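The power-set example can be spot-checked with a short sketch (our own illustration, using a three-element base set):

```python
# Sketch: (P(A); intersection, union, complement, empty set, A) is a
# Boolean algebra; here we check the complement laws (B1) and (B2).
from itertools import combinations

A = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(len(A) + 1)
           for c in combinations(A, r)]

comp = lambda X: A - X
assert all((X & comp(X) == frozenset()) and (X | comp(X) == A)
           for X in subsets)
```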

Example 1.2.14 For applications in Logic and the theory of switching cir-
cuits, algebras whose carrier sets are sets of operations on a base set play an
important role. We recall from Definition 1.1.2 that for a non-empty base set
A, we denote by O(A) the set of all finitary operations on A (with nullary
operations regarded as constant unary operations).

We can make the set O(A) into an algebra in several ways, by defining various
operations on O(A). We first define the following composition operation
on O(A). If f^A ∈ O_n(A) and g_1^A, . . . , g_n^A ∈ O_m(A), we obtain a new operation
f^A(g_1^A, . . . , g_n^A) in O_m(A), by setting f^A(g_1^A, . . . , g_n^A)(a_1, . . . , a_m) to be
f^A(g_1^A(a_1, . . . , a_m), . . . , g_n^A(a_1, . . . , a_m)), for all a_1, . . . , a_m ∈ A. This gives an
operation

O_n(A) × (O_m(A))^n → O_m(A),

which is called composition or superposition. Clearly, the whole set O(A) is
closed under arbitrary compositions.

The set O(A) also contains certain special elements called the projections.
For each n ≥ 1 and each 1 ≤ j ≤ n, the n-ary function enj defined on A
by enj (a1 , . . . , an ) = aj is called the j-th projection mapping of arity n. Any
subset of O(A) which is closed under composition and contains all these
projection mappings is called a clone on A. Of course O(A) itself has these
properties, and it is called the full clone on A.
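Superposition and the projections are easy to realize concretely. In the Python sketch below (the names compose and proj are our own), operations are ordinary functions, and the composed operation f(g1, . . . , gn) is built exactly as in the definition above:

```python
# Superposition: an n-ary f applied to n m-ary operations g1, ..., gn yields
# the m-ary operation (a1,...,am) ↦ f(g1(a1,...,am), ..., gn(a1,...,am)).
def compose(f, gs):
    return lambda *args: f(*(g(*args) for g in gs))

def proj(n, j):
    """The j-th projection mapping e^n_j of arity n (j is 1-indexed)."""
    return lambda *args: args[j - 1]

# Example on the base set {0, 1}: composing ∧ with swapped projections.
def conj(x, y): return x & y
swapped = compose(conj, [proj(2, 2), proj(2, 1)])   # (x, y) ↦ y ∧ x
assert swapped(0, 1) == 0 and swapped(1, 1) == 1
```

A clone is then just a set of such functions containing every proj(n, j) and closed under compose.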

There is another way to define an algebraic structure on the set O(A), using
the following operations ∗, ξ, τ and ∆ on O(A):

∗ : On (A) × Om (A) → Om+n−1 (A) defined by (f, g) 7→ f ∗ g



with
(f ∗ g)(x1 , . . . , xm , xm+1 , . . . , xm+n−1 ) : =
f (g(x1 , . . . , xm ), xm+1 , . . . , xm+n−1 ),
for all x1 , . . . , xm+n−1 ∈ A;

ξ : On (A) → On (A) defined by f 7→ ξ(f )


with
ξ(f )(x1 , . . . , xn ) := f (x2 , . . . , xn , x1 );

τ : On (A) → On (A) defined by f 7→ τ (f )

with
τ (f )(x1 , . . . , xn ) := f (x2 , x1 , x3 , . . . , xn );

∆: On (A) → On−1 (A) defined by f 7→ ∆(f )

with
∆(f )(x1 , . . . , xn−1 ) := f (x1 , x1 , x2 , . . . , xn−1 )
for every n > 1 and for all x1 , . . . , xn−1 ∈ A; and

ξ(f ) = τ (f ) = ∆(f ) = f, if n = 1.
We also use the nullary operation e21 which picks out the binary projection
on the first coordinate; as defined above, e21 (a1 , a2 ) = a1 , for all a1 , a2 ∈ A.
With these five operations, we obtain an algebra (O(A); ∗, ξ, τ, ∆, e21 ) of
type (2, 1, 1, 1, 0). This algebra is called the full iterative algebra on A, or
sometimes also a clone on A. We shall study clones in more detail in Chapter
10.
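As a concrete sketch of these five operations (in Python; representing an n-ary operation as a pair (n, f) so that arities stay explicit is our own choice, not the text's):

```python
# A sketch of the operations ∗, ξ, τ, ∆ and e21 of the full iterative algebra;
# an n-ary operation is represented as a pair (arity, function).
def star(f, g):                      # f ∗ g has arity m + n - 1
    n, fa = f
    m, ga = g
    return (m + n - 1, lambda *xs: fa(ga(*xs[:m]), *xs[m:]))

def xi(f):                           # cyclic shift: f(x2, ..., xn, x1)
    n, fa = f
    return f if n == 1 else (n, lambda *xs: fa(*xs[1:], xs[0]))

def tau(f):                          # swap the first two arguments
    n, fa = f
    return f if n == 1 else (n, lambda *xs: fa(xs[1], xs[0], *xs[2:]))

def delta(f):                        # identify the first two arguments
    n, fa = f
    return f if n == 1 else (n - 1, lambda *xs: fa(xs[0], xs[0], *xs[1:]))

e21 = (2, lambda x, y: x)            # the binary first projection

# τ swaps the arguments of a non-commutative operation; ∆ diagonalizes it.
minus = (2, lambda x, y: x - y)
assert tau(minus)[1](1, 0) == -1     # f(x2, x1) = 0 - 1
assert delta(minus)[1](5) == 0       # f(x1, x1) = 5 - 5
```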

Example 1.2.15 Let R = (R; +, ·, −, 0) be a ring such that R is infinite.


In this example we consider an infinitary type of algebra, one with an infinite
number of operation symbols, one for each element of the ring R. An algebra
(M ; +, −, 0, R) of type (2, 1, 0, (1)r∈R ) is called an R-module (or a module
over R) if (M ; +, −, 0) is an abelian group and the following identities are
satisfied, for all r and s in R and x and y in M :

(M1) r(x + y) = r(x) + r(y),


(M2) (r + s)(x) = r(x) + s(x),

(M3) (r · s)(x) = r(s(x)).

If R also has an identity element 1, then the following additional identity


must also hold in an R-module:

(M4) 1(x) = x.

In the special case that R is a field, an R-module is usually called an R-


vector space. The elements of the module are then called vectors, while the
field elements are called scalars.

Thus modules and vector spaces are algebras with infinitely many opera-
tions. There is another way to describe modules and vector spaces alge-
braically, which avoids this infinitary type, and which is more often used.
But it involves a modification of our definition of an algebra. We have so far
considered only what are called homogeneous or one-based algebras, where
we have only one base set of objects and all operations are defined on this
set. One can also define what are called heterogeneous or multi-sorted or
multi-based algebras, where there are two or more base sets of objects and
operations are allowed on more than one kind of object. We can use this
approach here, by allowing the set V of all vectors and the universe of the
field R as two different base sets in such a heterogeneous algebra. Then we
have two operations, + : V × V → V for the usual vector addition and
· : R × V → V for the scalar multiplication of a scalar from R times a
vector from V . Then (V, R; +, ·) is an example of a heterogeneous algebra.
Although we will focus on homogeneous or one-based algebras in this book,
we will consider the multi-based algebras described in Example 1.2.16 below
in more detail in Chapter 8, and we remark that the algebraic theory for
multi-based algebras can be developed in a completely analogous way.

Example 1.2.16 This example shows how multi-based algebras may be


used in theoretical Computer Science. An automaton without output, or an
acceptor or a recognizer, is a multi-based algebra H = (Z, X; δ), where Z
and X are non-empty sets called the sets of states and inputs, respectively,
and where δ : Z × X → Z is called the state transition function. An au-
tomaton (with output) is a quintuple A = (Z, X, B; δ, λ), where (Z, X; δ) is
an acceptor, the non-empty set B is called the set of outputs, and where
λ : Z × X → B is called the output function. For any state z ∈ Z and any

input x ∈ X, the value δ(z, x) is the state which results when the input x
is read in when the machine is in the state z. The element λ(z, x) is the
output element which is produced by input x if the automaton is in state z.
If all the sets Z, X and B are finite, then the automaton is said to be finite;
otherwise it is infinite. When δ and λ are functions, and so have exactly one
image for each state-input pair (z, x), the automaton is called deterministic;
in the non-deterministic case we allow δ and λ to take on more than one
value, or to be undefined, for a given input pair. Sometimes a certain state
z0 ∈ Z is selected as an initial state, and in this case we write (Z, X; δ, z0 ) or
(Z, X, B; δ, λ, z0 ) for the automaton. In the finite case, an automaton may
be fully described by means of tables for δ and λ, as shown below.

δ x1 ... xn λ x1 ... xn
z1 δ(z1 , x1 ) . . . δ(z1 , xn ) z1 λ(z1 , x1 ) . . . λ(z1 , xn )
· · · · · ·
· · · · · ·
· · · · · ·
zk δ(zk , x1 ) . . . δ(zk , xn ) zk λ(zk , x1 ) . . . λ(zk , xn )

We shall study finite automata in detail in Chapter 8. We conclude here


with the remark that a finite automaton can also be described by a directed
graph. The vertices of the graph correspond to states, and there is an edge
labelled by x going from vertex z to vertex y when δ(z, x) = y. We can
also label the vertex by both x and λ(z, x) when the output is λ(z, x). For
example, the graph below

[Figure: a directed graph on the states z1 , z2 , z3 . Each edge from z to δ(z, x) is labelled by the pair x; λ(z, x). For example, the edge from z2 to z1 carries the label x2 ; b1 .]

corresponds to the automaton with Z = {z1 , z2 , z3 }, X = {x1 , x2 , x3 }, and


B = {b1 , b2 }, with the tables shown below for δ and λ.

δ x1 x2 x3 λ x1 x2 x3
z1 z1 z1 z3 z1 b2 b1 b2
z2 z2 z1 z3 z2 b2 b1 b2
z3 z3 z2 z1 z3 b1 b1 b2
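Such tables translate directly into a small program. The Python sketch below (class and method names are our own) encodes the example automaton and runs it from state z1:

```python
# A deterministic finite automaton with output, given by its δ- and λ-tables.
class Automaton:
    def __init__(self, delta, lam, state):
        self.delta, self.lam, self.state = delta, lam, state

    def step(self, x):
        """Read input x: emit λ(z, x) and move to the state δ(z, x)."""
        out = self.lam[(self.state, x)]
        self.state = self.delta[(self.state, x)]
        return out

# The δ- and λ-tables of the example above.
delta = {("z1", "x1"): "z1", ("z1", "x2"): "z1", ("z1", "x3"): "z3",
         ("z2", "x1"): "z2", ("z2", "x2"): "z1", ("z2", "x3"): "z3",
         ("z3", "x1"): "z3", ("z3", "x2"): "z2", ("z3", "x3"): "z1"}
lam   = {("z1", "x1"): "b2", ("z1", "x2"): "b1", ("z1", "x3"): "b2",
         ("z2", "x1"): "b2", ("z2", "x2"): "b1", ("z2", "x3"): "b2",
         ("z3", "x1"): "b1", ("z3", "x2"): "b1", ("z3", "x3"): "b2"}

a = Automaton(delta, lam, "z1")
outputs = [a.step(x) for x in ("x3", "x2", "x1")]  # z1 → z3 → z2 → z2
assert a.state == "z2" and outputs == ["b2", "b1", "b2"]
```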

1.3 Subalgebras
The set of all symmetry mappings of an equilateral triangle Φ,

[Figure: an equilateral triangle with vertices 1, 2, 3, centre O, and the three reflection axes g1 , g2 , g3 .]

described by permutations of the set {1, 2, 3}, is the set

D(Φ) = {(1), (123), (132), (12), (13), (23)}.

Here (12), (13) and (23) correspond to reflections at the lines g1 , g3 , and
g2 , respectively; and (1), (123) and (132) correspond to rotations around
the point O by 0, 120 and 240 degrees, respectively. We notice that D(Φ)
forms a group with respect to the composition of these mappings; that is,
with respect to the multiplication of the corresponding permutations.

Since the composition of two rotations around the point O is again a rotation
around O, our operation preserves the subset of D(Φ) consisting of all rota-
tions around O, and we say that this subset is closed under our operation.
This closure can also be seen if we consider the multiplication table of the
corresponding permutations:

◦ (1) (123) (132)


(1) (1) (123) (132)
(123) (123) (132) (1)
(132) (132) (1) (123)

However, we can see that the composition of reflections through different


lines gives a permutation which is not a reflection; so the set of reflections
does not have this closure property. To describe this property algebraically,
we use the concept of a subalgebra.

Definition 1.3.1 Let B = (B; (fiB )i∈I ) be an algebra of type τ . Then


an algebra A is called a subalgebra of B, written as A ⊆ B, if the following
conditions are satisfied:

(i) A = (A; (fiA )i∈I ) is an algebra of type τ ;

(ii) A ⊆ B;

(iii) ∀i ∈ I, the graph of fiA is a subset of the graph of fiB .

Remark 1.3.2 1. Condition (iii) of the Definition refers to the graph of


an operation, as defined in Remark 1.1.3. This condition means that the
graph of fiA is the restriction of the graph of fiB to Ani ⊆ B ni . We write fiA = fiB | Ani , for all i ∈ I, using fiB | Ani , or just fiB | A, to denote the restriction of fiB to Ani .

2. If fiA is a nullary operation, then


fiA = (A0 × B) ∩ fiB = ({∅} × B) ∩ fiB
= {(∅, fiB (∅))} = fiB .
That is, a nullary operation fi must designate the same element in each
subalgebra A of B.

Whether one algebra is a subalgebra of another algebra can be checked by


the following criterion:

Lemma 1.3.3 (Subalgebra Criterion) Let B = (B; (fiB )i∈I ) be an algebra


of type τ and let A ⊆ B be a subset of B for which fiA = fiB | A for all
i ∈ I. Then A = (A, (fiA )i∈I ) is a subalgebra of B = (B, (fiB )i∈I ) iff A is
closed with respect to all the operations fiB for i ∈ I; that is, if fiB (Ani ) ⊆ A
for all i ∈ I.

Proof: Assume that A ⊆ B. Then fiA = fiB | A is an operation in A for


every i ∈ I, and so for all (a1 , . . . , ani ) ∈ Ani we have

fiB (a1 , . . . , ani ) = fiA (a1 , . . . , ani ) ∈ A.



Thus the application of any operation fiB of B to elements of A always gives


elements of A.

If conversely A is closed with respect to fiB for all i ∈ I, then we have


fiB | Ani ⊆ Ani × A. This makes each fiB | Ani an operation on Ani in A,
and all the conditions of Definition 1.3.1 are satisfied.
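For a finite algebra the criterion can be checked mechanically. A minimal Python sketch (the helper is_closed is our own), with each operation given as a pair (arity, function):

```python
from itertools import product

def is_closed(subset, ops):
    """Subalgebra Criterion: check fi(a1, ..., ani) ∈ A for all tuples over A.

    `ops` is a list of pairs (ni, fi), where fi is an ni-ary operation on B.
    """
    return all(f(*args) in subset
               for n, f in ops
               for args in product(subset, repeat=n))

# In ({0,...,5}; + mod 6): the subset {0, 2, 4} is closed, but {1, 2} is not,
# since 1 + 2 = 3 falls outside it.
plus6 = (2, lambda x, y: (x + y) % 6)
assert is_closed({0, 2, 4}, [plus6])
assert not is_closed({1, 2}, [plus6])
```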

Corollary 1.3.4 Let A, B and C be algebras of type τ. Then:

(i) (A ⊆ B) ∧ (B ⊆ C) ⇒ A ⊆ C;

(ii) (A ⊆ B ⊆ C) ∧ (A ⊆ C) ∧ (B ⊆ C) ⇒ A ⊆ B.

Proof: (i) Clearly, for all i ∈ I we have fiA ⊆ fiB ⊆ fiC for the graphs, and
thus fiA ⊆ fiC .

(ii) Applying Criterion 1.3.3 to an arbitrary ni − tuple (a1 , . . . , ani ) ∈ Ani ,


we get fiB (a1 , . . . , ani ) = fiC (a1 , . . . , ani ) = fiA (a1 , . . . , ani ) ∈ A. This shows
that A is closed with respect to all operations of B.

Now suppose that B = (B; (fiB )i∈I ) is an algebra of type τ and that {Aj | j ∈ J} is a family of subalgebras of B. We define a set A := ⋂_{j∈J} Aj and operations fiA := fiB | A for all i ∈ I. We will verify that when the intersection set A is non-empty, the algebra A = (A; (fiA )i∈I ) is indeed an algebra of type τ , to be called the intersection of the algebras Aj , and denoted by A := ⋂_{j∈J} Aj .

Corollary 1.3.5 The non-empty intersection A of a non-empty family


{Aj | j ∈ J} of subalgebras of an algebra B of type τ is a subalgebra of
B.

Proof: We will show that A is closed under all fiB , for all i ∈ I. Consider an element (a1 , . . . , ani ) ∈ Ani ; then a1 , . . . , ani ∈ Aj for every j ∈ J. Since each Aj is the carrier set of a subalgebra of B, we have fiB (a1 , . . . , ani ) ∈ Aj for all j ∈ J and i ∈ I. Hence fiB (a1 , . . . , ani ) also belongs to the intersection A = ⋂_{j∈J} Aj , and A ⊆ B.

Now we consider an algebra B of type τ and a non-empty subset X of


its carrier set B. There is at least one subalgebra of B whose carrier set
contains X, since B itself is one such algebra. This means we can look for
the intersection of all such subalgebras of B. We define

hXiB : = ∩ {A | A ⊆ B and X ⊆ A}

and call this subalgebra hXiB of B the subalgebra of B generated by X, and


X a generating system of this algebra.

Notice that hXiB is the least (with respect to subset inclusion) subalgebra
of B to contain the set X. In particular, X might generate the whole algebra
B, if hXiB = B.

Example 1.3.6 As an example we will consider the algebra

Z6 = ({[0]6 , [1]6 , [2]6 , [3]6 , [4]6 , [5]6 }; +, −, [0]6 ),

where the elements of the base set are the equivalence classes of the integers
modulo 6 and +, − denote addition and subtraction of equivalence classes.
This algebra is a group, with [0]6 as an identity element. For each one-element
subset X of Z6 we calculate the subalgebra generated by X:

h{[0]6 }iZ6 = ({[0]6 }; +, −, [0]6 ), h{[1]6 }iZ6 = h{[5]6 }iZ6 = Z6 ,

h{[3]6 }iZ6 = ({[0]6 , [3]6 }; +, −, [0]6 ), h{[2]6 }iZ6 = h{[4]6 }iZ6 =


({[0]6 , [2]6 , [4]6 }; +, −, [0]6 ), h{[2]6 , [3]6 }iZ6 = h{[4]6 , [3]6 }iZ6 = Z6 .
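These generated subalgebras can be computed by brute force. The following Python sketch (the helper generate is our own) closes a subset of Z6 under +, − and the constant [0]6 , with classes represented by their least representatives 0, . . . , 5:

```python
# The subalgebra of Z6 = ({0,...,5}; +, -, 0) generated by a subset X.
def generate(X):
    current = set(X) | {0}                 # the nullary operation puts 0 in
    while True:
        bigger = (current
                  | {(a + b) % 6 for a in current for b in current}   # +
                  | {(-a) % 6 for a in current})                      # -
        if bigger == current:              # fixed point reached: closed set
            return current
        current = bigger

# The generated subalgebras of Example 1.3.6:
assert generate({1}) == {0, 1, 2, 3, 4, 5}
assert generate({3}) == {0, 3}
assert generate({2}) == generate({4}) == {0, 2, 4}
assert generate({2, 3}) == {0, 1, 2, 3, 4, 5}
```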
Our next theorem shows that this process of subalgebra generation satisfies
three very important properties, called the closure properties. This makes
the subalgebra generation process an example of a closure operator, which
we shall consider in more detail in Chapter 2.

Theorem 1.3.7 Let B be an algebra. For all subsets X and Y of B, the


following closure properties hold:

(i) X ⊆ hXiB , (extensivity);


(ii) X ⊆ Y ⇒ hXiB ⊆ hY iB , (monotonicity);
(iii) hXiB = hhXiB iB , (idempotency).

Proof: These properties follow immediately from the properties of the op-
erator h i; we leave the details to the reader.

An important problem is the following: given an algebra B of type τ , and a


subset X of B, determine all elements of the carrier set of the subalgebra of
B which is generated by X. To do this, we set

E(X) : = X ∪ {fiB (a1 , . . . , ani ) | i ∈ I, a1 , . . . , ani ∈ X}.

Then we inductively define E^0 (X) := X, and E^{k+1} (X) := E(E^k (X)), for all k ∈ N. With this notation we can describe our subalgebra.

Theorem 1.3.8 For any algebra B of type τ and for any non-empty subset X ⊆ B, we have hXiB = ⋃_{k=0}^{∞} E^k (X).

Proof: We first give a proof by induction on k that hXiB ⊇ E^k (X) for every k ∈ N, which gives hXiB ⊇ ⋃_{k=0}^{∞} E^k (X). For k = 0 it is clear that E^0 (X) = X ⊆ hXiB . For the inductive step, assume that the proposition hXiB ⊇ E^k (X) is true. Let a ∈ E^{k+1} (X); we may assume that a ∉ E^k (X). Then there exists an element i ∈ I and elements a1 , . . . , ani ∈ E^k (X) with a = fiB (a1 , . . . , ani ). Since E^k (X) ⊆ hXiB and hXiB is the carrier set of a subalgebra of B, we have a ∈ hXiB too. Thus E^{k+1} (X) ⊆ hXiB , and therefore by induction on k we have hXiB ⊇ ⋃_{k=0}^{∞} E^k (X).

Now we show that hXiB ⊆ ⋃_{k=0}^{∞} E^k (X), by showing that ⋃_{k=0}^{∞} E^k (X) is the carrier set of a subalgebra of B. Let i ∈ I and a1 , . . . , ani ∈ ⋃_{k=0}^{∞} E^k (X). Then for every l ∈ {1, . . . , ni } there is a minimal k(l) ∈ N with al ∈ E^{k(l)} (X). Take m to be the maximum of these k(l), for 1 ≤ l ≤ ni . Then al ∈ E^m (X) for all l = 1, . . . , ni and thus

fiB (a1 , . . . , ani ) ∈ E^{m+1} (X) ⊆ ⋃_{k=0}^{∞} E^k (X).

This shows that ⋃_{k=0}^{∞} E^k (X) is closed under fiB for i ∈ I. By Lemma 1.3.3, ⋃_{k=0}^{∞} E^k (X) is the carrier set of a subalgebra of B which contains X. Since by definition hXiB is the least (with respect to subset inclusion) subalgebra of B which contains X, we get the desired inclusion. Finally from both inclusions we have the equality hXiB = ⋃_{k=0}^{∞} E^k (X).
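For a finite algebra this construction is directly executable: the chain E^0(X) ⊆ E^1(X) ⊆ . . . stabilizes after finitely many steps. A Python sketch (the helpers E_step and generated are our own):

```python
from itertools import product

def E_step(X, ops):
    """E(X): X together with all values fi(a1, ..., ani) with each aj in X."""
    return set(X) | {f(*args) for n, f in ops for args in product(X, repeat=n)}

def generated(X, ops):
    """hXiB as the union of the chain E^0(X) ⊆ E^1(X) ⊆ ... (Theorem 1.3.8)."""
    current = set(X)
    while True:
        nxt = E_step(current, ops)
        if nxt == current:       # the chain has stabilized
            return current
        current = nxt

# In ({0,...,5}; + mod 6): E^0({2}) = {2}, E^1 = {2,4}, E^2 = {0,2,4} = h{2}i.
ops = [(2, lambda x, y: (x + y) % 6)]
assert E_step({2}, ops) == {2, 4}
assert generated({2}, ops) == {0, 2, 4}
```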

The proof of Theorem 1.3.8 gives an internal description of the generation process for algebras. In contrast to such a “bottom-up” construction, we have a “Principle of Structural Induction,” which can be regarded as a “top-down” description.

Theorem 1.3.9 (Principle of Structural Induction) Let A = (A; (fi )i∈I ) be


an algebra of type τ , generated by a subset X of A. To prove that a property
P holds for all elements of A, it suffices to show the validity of the following
two conditions:

(1) Base of the induction: P holds for all elements of X.


(2) Induction step: If P holds for elements a1 , . . ., ani of A (the induction hypothesis), then P holds for fiA (a1 , . . . , ani ), for all i ∈ I.
Proof: Let us denote by AP the set of all elements of A for which P holds. By (1) we know that X is contained in AP . By (2), AP is closed under all fundamental operations of A, and hence it is the carrier set of a subalgebra of A. Since hXiA is the least subalgebra of A which contains X, we get A = hXiA ⊆ AP ⊆ A. This tells us that A = AP ; so property P holds in all of
A.

Now we can consider the structural properties of the set of all subalgebras
of an algebra B of type τ . We denote this set by Sub(B). We know by 1.3.5
that the intersection of two subalgebras of B, if it is non-empty, is again a
subalgebra of B, and we would like to use this to define a binary operation
on Sub(B). That is, we want to define a binary operation ∧ on Sub(B) by
∧ : (A1 , A2 ) 7→ A1 ∧ A2 : = A1 ∩ A2 .
However, this operation is not defined on all of Sub(B), because it does not
deal with the case where A1 ∩ A2 is the empty set. But we can fix this, either
by allowing the empty set to be considered as an algebra (see the remark at
the end of Section 1.1), or by adjoining some new element to the set Sub(B)
and defining A1 ∧ A2 to be this new element, whenever A1 ∩ A2 = ∅.

A second binary operation ∨ can be obtained if we map (A1 , A2 ) to the


subalgebra of B which is generated by the union A1 ∪ A2 :

∨ : (A1 , A2 ) 7→ A1 ∨ A2 : = hA1 ∪ A2 iB .

Our goal in defining these two operations was of course to make Sub(B) into
a lattice, and this turns out to be the case.

Theorem 1.3.10 For every algebra B, the algebra (Sub(B); ∧, ∨) is a lat-


tice, called the subalgebra lattice of B.

Proof: By 1.3.5 and by definition of the subalgebra generated by a set, the


symbols ∧ and ∨ define binary operations on Sub(B). It is easy to check that
all of the axioms required for a lattice are satisfied by these operations.

The subalgebra lattice corresponding to an algebra B can be a useful tool in


studying B itself. We will also see, in the next chapter, how we can generalize
the process we used here to construct a lattice, based on the fact that the
intersection of any subalgebras gives us a subalgebra.

We have been considering the problem of finding, for a given algebra B and a
given subset X of B, the subalgebra of B generated by X. We can also look at
the subalgebra generation question from a different point of view, as follows.
Given an algebra B, can we find a (proper) subset X of B which generates
B? In particular, we are often interested in whether we can find a finite such
generating set X. If we can, then B is said to be finitely generated, with X
as a finite generating system; otherwise B is said to be infinitely generated.
Of course any finite algebra has a finite generating system (the carrier
set itself), so this question is only interesting for infinite algebras. Before
addressing the question of when a finite generating system is possible, we
consider two examples.

Example 1.3.11 Consider the algebra (N; ·) of type (2), where · is the usual
multiplication on the set N of all natural numbers. Is this finitely generated?
Since an arbitrary generating system would have to contain the infinite set
of all prime numbers, the answer in this case is no. But the infinite set of
all prime numbers is a generating system of this algebra, since every natural
number has a representation as a product of powers of prime numbers.

Example 1.3.12 Recall from Example 1.2.14 in Section 1.2 that for any
base set A, we can define an algebra of type (2, 1, 1, 1, 0) on the set O(A) of
all finitary operations on A. This is the algebra (O(A); ∗, ξ, τ, ∆, e21 ). It is well
known in Logic that for the special case when A is the two-element set {0, 1},
this algebra is finitely generated. The two-element set X = {∧, ¬} (conjunc-
tion and negation) acts as a generating set for the algebra, since all Boolean
functions can be generated from these two using our five operations. In the
general case, it is an important problem to decide whether a given set X is
a generating system of a subalgebra C of the algebra (O(A); ∗, ξ, τ, ∆, e21 ).

To answer the question of whether a given set X ⊆ A is a generating system


of an algebra A of type τ , we define the concept of a maximal subalgebra of
an algebra.

Definition 1.3.13 An algebra A of type τ is called a maximal subalgebra


of an algebra B of type τ if there is no subalgebra C with A ⊂ C ⊂ B.

Corollary 1.3.14 A is a maximal subalgebra of B iff for all q ∈ B \ A, we


have hA ∪ {q}iB = B.

Proof: “⇒”: Let A be a maximal subalgebra of B. Since q ∉ A we have


A ⊂ hA ∪ {q}iB ⊆ B. From the maximality of A we obtain our proposition.

“⇐”: Assume A is not maximal in B. Then there is an algebra C with


A ⊂ C ⊂ B. But then for an arbitrary element q ∈ C \ A we have
hA ∪ {q}iB ⊆ C ⊂ B, and the set A ∪ {q} does not generate the alge-
bra B.

As an important example of a maximal subalgebra, we will return to our


example of the algebra of truth-value functions in classical two-valued logic.

Example 1.3.15 Let A be the two-element set {0, 1}, and let

C2 : = {f ∈ On (A) | f (0, . . . , 0) = 0, n ∈ N}.

We will use the criterion from Corollary 1.3.14 to verify that


C2 = (C2 ; ∗, ξ, τ, ∆, e21 ) is a maximal subalgebra of the algebra
(O({0, 1}); ∗, ξ, τ, ∆, e21 ) from Example 1.2.14. For any f ∈ O({0, 1}) \ C2 , we must have f (0, . . . , 0) = 1, and applying the operations ∗, ξ, τ, ∆, e21 (by superposition) we can produce from f a unary operation g with g(0) = 1. For this g there are only two possibilities: g must be either the negation operation ¬ or the constant function with value 1. Since the conjunction ∧ obviously belongs to C2 , in the first case we have

hC2 ∪ {f }iO(A) ⊇ h{∧, ¬}i = O(A),

and thus hC2 ∪ {f }iO(A) = O(A).


For the second case, we use the fact that since 0 + 0 = 0, the addition
modulo 2 (denoted by +) also belongs to C2 . From the constant 1 and the
addition modulo 2 we get ¬x = x + 1 for all x ∈ {0, 1}. Now we have both
¬ and ∧ in C2 , and we can conclude as we did in the first case.
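The completeness fact used in the first case, that ∧ and ¬ generate all Boolean functions, can be spot-checked at arity 2 by a closure computation on truth tables (a Python sketch of our own, not part of the text):

```python
from itertools import product

# Represent a binary Boolean function by its truth table: the 4-tuple of its
# values at (0,0), (0,1), (1,0), (1,1). Close {e1, e2} under ¬ and ∧.
inputs = list(product((0, 1), repeat=2))
e1 = tuple(x for x, y in inputs)          # first binary projection
e2 = tuple(y for x, y in inputs)          # second binary projection

known = {e1, e2}
while True:
    new = {tuple(1 - v for v in f) for f in known}              # ¬f
    new |= {tuple(a & b for a, b in zip(f, g))                  # f ∧ g
            for f in known for g in known}
    if new <= known:                      # closure reached
        break
    known |= new

# All 2^4 = 16 binary Boolean functions are generated from {∧, ¬}.
assert len(known) == 16
```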

Every subalgebra A of a finitely generated algebra B = (B; (fiB )i∈I ) can


be extended to a subalgebra C which is maximal in B:

A ⊆ ··· ⊆ C ⊆ B

(see for instance Gluschkow, Zeitlin and Justschenko, [49]). Using this fact
we have the following general completeness criterion for finitely generated
algebras.

Theorem 1.3.16 (General Completeness Criterion for finitely generated


algebras) Let B be a finitely generated algebra. A set X ⊆ B is a gener-
ating system of B iff there is no maximal subalgebra of B whose universe
contains the set X.
Proof: “⇒”: Assume that hXiB = B. If X ⊆ M for some maximal subal-
gebra M of B, then by definition of hXiB we have the inclusion hXiB ⊆ M,
which contradicts hXiB = B.

“⇐”: If X is not a generating system of the algebra B, then hXiB can be


extended to a maximal subalgebra M of B. But then X ⊆ M for this maxi-
mal subalgebra.

1.4 Congruence Relations and Quotients


Every function ϕ : A → B from a set A into a set B defines a partition of
the set A into classes of elements having the same image. The equivalence

relation kerϕ corresponding to this partition of A is called the kernel of the


function ϕ. That is, for elements a, b ∈ A we define
(a, b) ∈ kerϕ : ⇔ ϕ(a) = ϕ(b).
This definition of the kernel makes it clear that the kernel is an equivalence
relation on A. It is also easy to show that kerϕ is equal to the composition
ϕ−1 ◦ ϕ.

For instance, let Z be the set of all integers, let m be a natural number, and
let Z/(m) be the set of all equivalence classes modulo m. Then the kernel
of the function ϕ : Z → Z/(m) which is defined by a 7→ [a]m , that is,
which maps every integer to its class modulo m, is obviously the well-known
equivalence or congruence modulo m :
a ≡ b (m) : ⇔ ∃g ∈ Z (a − b = gm).
(This is also often written as a ≡ b mod m). With respect to addition
and multiplication in the ring (Z; +, ·, −, 0) of all integers, this congruence
modulo m has the following additional property:
a1 ≡ b1 (m) ∧ a2 ≡ b2 (m) ⇒ a1 + a2 ≡ b1 + b2 (m) ∧ a1 · a2 ≡ b1 · b2 (m).
This tells us that in this example the kernel of the function ϕ is compatible
with the operations of the ring. This observation motivates the following
more general definition.

Definition 1.4.1 Let A be a set, let θ ⊆ A × A be an equivalence relation


on A, and let f be an n-ary operation from On (A). Then f is said to be
compatible with θ, or to preserve θ, if for all a1 , . . . , an , b1 , . . . , bn ∈ A,

(a1 , b1 ) ∈ θ, . . . , (an , bn ) ∈ θ
implies (f (a1 , . . . , an ), f (b1 , . . . , bn )) ∈ θ.

Remark 1.4.2 This definition can also be applied to arbitrary relations,


not just to binary equivalence relations, and we can speak of a function
preserving, or being compatible with, any relation. For instance, the mono-
tone increasing real functions f : R → R are exactly the unary operations
on R which are compatible with the relation ≤ on R, since x1 ≤ x2 ⇒
f (x1 ) ≤ f (x2 ) for all x1 , x2 for which f is defined. The preservation of rela-
tions by functions will be studied in more detail in Chapter 2.

Definition 1.4.3 Let A = (A; (fiA )i∈I ) be an algebra of type τ . An equiva-


lence relation θ on A is called a congruence relation on A if all the fundamen-
tal operations fiA are compatible with θ. We denote by ConA the set of all
congruence relations of the algebra A. For every algebra A = (A; (fiA )i∈I )
the trivial equivalence relations

∆A : = {(a, a) | a ∈ A} and ∇A = A × A

are congruence relations. An algebra which has no congruence relations


except ∆A and ∇A is called simple.

Example 1.4.4 1. Let (Z; +, ·, −, 0) be the ring of all integers and let
m ∈ Z be an integer with m ≥ 0. Then the congruence modulo m defined
on Z is a congruence relation on (Z; +, ·, −, 0). Note that for m = 0 or m
= 1 we get the trivial relations on Z.

2. On every set A, the constant operations fc : An → A, defined by


fc (x1 , . . . , xn ) = c for all x1 , . . . , xn ∈ A, and the identical operation
idA : A → A are compatible with all equivalence relations defined on A.
The projections eni : An → A, defined by eni (a1 , . . . , an ) = ai , for all a1 ,. . .,
an ∈ A and 1 ≤ i ≤ n, also have the same property.

3. Let A = ({a, b, c, d}; f A ) be an algebra of type τ = (1). Let the unary


operation f A be given by the table

x       a b c d
f A (x) b a d c

To determine all congruence relations on A, we consider all possible par-


titions on A (and their corresponding equivalence relations), and look for
those partitions having the property that the image of every class is also
a class. This is easily seen to be the case for the two trivial partitions
{a} ∪ {b} ∪ {c} ∪ {d} and {a, b, c, d}, and for the five non-trivial ones

{a} ∪ {b} ∪ {c, d}, {a, b} ∪ {c} ∪ {d}, {a, b} ∪ {c, d},

{a, c} ∪ {b, d}, and {a, d} ∪ {b, c}.
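This search can be automated for small algebras. The Python sketch below (the partition enumeration is our own) brute-forces all 15 partitions of {a, b, c, d} and recovers exactly the seven congruences listed above:

```python
# Brute-force the congruences of A = ({a,b,c,d}; f) with f: a↔b, c↔d.
f = {"a": "b", "b": "a", "c": "d", "d": "c"}

def partitions(xs):
    """Yield every partition of the list xs as a list of blocks."""
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for part in partitions(rest):
        for i in range(len(part)):                 # put `first` into a block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part                     # or into a new block

def is_congruence(part):
    """A partition is a congruence iff f maps each block into one block."""
    block_of = {x: i for i, blk in enumerate(part) for x in blk}
    return all(len({block_of[f[x]] for x in blk}) == 1 for blk in part)

congs = [p for p in partitions(["a", "b", "c", "d"]) if is_congruence(p)]
assert len(congs) == 7          # the two trivial and five non-trivial ones
```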


Congruence relations have been defined as equivalence relations which are
compatible with all the fundamental operations of an algebra. They may also
be characterized in other ways, as we shall see at the end of Section 5.3. Here

we show that congruences may also be characterized as equivalence relations


which are compatible with certain unary functions on an algebra. For A =
(A; (fiA )i∈I ), a translation of A is a mapping of the form

x → fiA (a1 , . . . , ai−1 , x, ai+1 , . . . , ani ),

where fiA is ni -ary and a1 , . . ., ai−1 , ai+1 , . . ., ani are fixed elements of A.

Theorem 1.4.5 An equivalence relation θ on an algebra A is a congruence


relation on A iff θ is compatible with all translations of A.
Proof: It is easy to check that any congruence is compatible with all transla-
tions of A. Now suppose that θ is an equivalence relation defined on A which
is compatible with all translations. Let fiA be ni -ary, and let (aj , bj ) be in θ
for j = 1, . . . , ni . Then repeated use of the compatibility with translations
gives
(fiA (a1 , a2 , a3 , . . . , ani ), fiA (b1 , a2 , . . . , ani )) ∈ θ,
(fiA (b1 , a2 , a3 , . . . , ani ), fiA (b1 , b2 , a3 , . . . , ani )) ∈ θ,
...,
(fiA (b1 , b2 , . . . , bni−1 , ani ), fiA (b1 , b2 , . . . , bni )) ∈ θ.
By the transitivity of θ, it follows that (fiA (a1 , . . . , ani ), fiA (b1 , . . . , bni )) ∈ θ, and hence θ is a congruence on A.

We want now to define lattice operations on the set ConA of all congruence
relations of an algebra A, much as we did on the set Sub(A) for subalgebras.
As was the case there, our basic tool is the fact that the intersection of
congruences is again a congruence.

Theorem 1.4.6 The intersection θ1 ∩ θ2 of two congruence relations on an


algebra A = (A; (fiA )i∈I ) is again a congruence relation on A.
Proof: Obviously, the intersection of two equivalence relations on A is again
an equivalence relation defined on A. To see that the intersection is again a
congruence, let (a1 , b1 ), . . . , (ani , bni ) be pairs belonging to the intersection
θ1 ∩ θ2 and let fiA , for i ∈ I, be an arbitrary fundamental operation of A.
Then
(fiA (a1 , . . . , ani ), fiA (b1 , . . . , bni )) ∈ θl , for l = 1, 2
and thus
(fiA (a1 , . . . , ani ), fiA (b1 , . . . , bni )) ∈ θ1 ∩ θ2 .

Therefore θ1 ∩ θ2 is a congruence relation on A.

Remark 1.4.7 Theorem 1.4.6 is also satisfied for arbitrary families of con-
gruence relations on A. But in general, the union of two congruence relations
of an algebra A is not a congruence relation, since this does not hold even
for equivalence relations, as the following example shows. Take

A = {1, 2, 3}, θ1 = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)},

θ2 = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)}.
The relations θ1 and θ2 are equivalence relations, but

θ1 ∪ θ2 = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1), (2, 3), (3, 2)}

is not an equivalence relation on A, since it is not transitive:

(1, 2) ∈ θ1 ∪ θ2 and (2, 3) ∈ θ1 ∪ θ2 , but (1, 3) ∉ θ1 ∪ θ2 .

But although the union of two congruence relations θ1 and θ2 need not be
a congruence relation, as in the subalgebra case we can use intersections of
congruences to define a smallest congruence generated by the union. This
motivates the following definition.

Definition 1.4.8 Let A be an algebra, and let θ be a binary relation on A.


We define the congruence relation hθiConA on A generated by θ to be the
intersection of all congruence relations θ0 on A which contain θ:

hθiConA : = ∩ {θ0 | θ0 ∈ ConA and θ ⊆ θ0 }.

It is easy to see that hθiConA has the three important properties of a closure
operator:

θ ⊆ hθiConA (extensivity),
θ1 ⊆ θ2 ⇒ hθ1 iConA ⊆ hθ2 iConA (monotonicity),
hhθiConA iConA = hθiConA (idempotency).

Remark 1.4.9 As we did in Section 1.3 for subalgebras, we can ask how
to derive, from a binary relation θ defined on A, the congruence relation of
the algebra A generated by θ. First we have to enlarge the relation θ to its

reflexive and symmetric closure ∆A ∪ θ ∪ θ−1 . Then we form the compatible


closure h∆A ∪ θ ∪ θ−1 i. Finally the set produced in this way has to be com-
pleted by all pairs needed for transitivity. This is done by using the transitive
hull operator, which maps any relation α to

αT : = ∩ {β ⊆ A2 | β ⊇ α and β is transitive}.

Altogether, then, we need to use

hθiConA = (h∆A ∪ θ ∪ θ−1 i)T .
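For a finite algebra this recipe is directly executable: iterate the reflexive, symmetric, compatibility, and transitive closure steps until a fixed point is reached. A Python sketch (the helper and the sample unary algebra are our own):

```python
# The congruence on A = ({a,b,c,d}; f), f: a↔b, c↔d, generated by a relation θ.
A = ["a", "b", "c", "d"]
f = {"a": "b", "b": "a", "c": "d", "d": "c"}

def generated_congruence(theta):
    rel = set(theta) | {(x, x) for x in A}              # reflexive closure
    while True:
        bigger = rel | {(y, x) for (x, y) in rel}       # symmetric step
        bigger |= {(f[x], f[y]) for (x, y) in bigger}   # compatibility step
        bigger |= {(x, z) for (x, y1) in bigger         # one transitive step
                          for (y2, z) in bigger if y1 == y2}
        if bigger == rel:                               # fixed point: done
            return rel
        rel = bigger

# The single pair (a, c) forces (b, d) by compatibility with f; the generated
# congruence has the blocks {a, c} and {b, d}.
theta = generated_congruence({("a", "c")})
assert ("b", "d") in theta and ("a", "b") not in theta
```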

This construction allows us to produce, for two congruences on A, the least


congruence on A which contains them both. This gives us the two binary
operations on Con(A) we wanted,

∧: ConA × ConA → ConA defined by (θ1 , θ2 ) 7→ θ1 ∩ θ2 ,


∨: ConA × ConA → ConA defined by (θ1 , θ2 ) 7→ hθ1 ∪ θ2 iConA .
Now we must verify that these operations do make our set into a lattice.

Theorem 1.4.10 For every algebra A the structure (ConA; ∧, ∨) is a


lattice, called the congruence lattice Con(A) of A.
Proof: We leave it as an exercise for the reader to verify that the binary
operations ∧ and ∨ on ConA satisfy all the axioms of a lattice.

Since congruence relations are equivalence relations, each congruence on an


algebra A induces a partition on the set A. What makes congruence relations
significant is that the partition set induced by a congruence can also be made
into the universe of an algebra. For each i ∈ I, we define an ni -ary operation fiA/θ on the quotient set A/θ,

fiA/θ : (A/θ)ni → A/θ,

defined by

([a1 ]θ , . . . , [ani ]θ ) 7→ fiA/θ ([a1 ]θ , . . . , [ani ]θ ) := [fiA (a1 , . . . , ani )]θ .

As always when we define operations on equivalence classes, we must verify


that our operations are well-defined, that is, that they are independent of
the representatives chosen. But as the following proof shows, this is exactly
what the compatibility property of a congruence relation means.

Theorem 1.4.11 For every algebra A of type τ and every congruence rela-
tion θ ∈ ConA, the previous definition produces an algebra A/θ of type τ ,
which is called the quotient algebra or factor algebra of A by θ.
Proof: Let i ∈ I. We have to show that if [aj ]θ = [bj ]θ for all j = 1, . . . , ni ,
then [fiA (a1 , . . . , ani )]θ = [fiA (b1 , . . . , bni )]θ . But this is exactly what our
compatibility property of a congruence guarantees. If [aj ]θ = [bj ]θ , then
(aj , bj ) ∈ θ for all j = 1, . . . , ni , and since θ is a congruence on the algebra
A, it follows that

(fiA (a1 , . . . , ani ), fiA (b1 , . . . , bni )) ∈ θ

and thus

[fiA (a1 , . . . , ani )]θ = [fiA (b1 , . . . , bni )]θ .
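For a finite algebra this definition can be computed directly. The sketch below (with a hypothetical helper name `induced_operation`, not from the text) builds f^{A/θ} by choosing arbitrary representatives; the compatibility property is exactly what makes the choice irrelevant.

```python
def induced_operation(f, classes):
    """The operation f^{A/θ} on the set of θ-classes: apply f to arbitrary
    representatives and return the class of the result.  This is
    well-defined precisely because θ is compatible with f."""
    def cls(a):                        # [a]_θ, the block containing a
        return next(B for B in classes if a in B)
    def f_quotient(*blocks):
        reps = [next(iter(B)) for B in blocks]
        return cls(f(*reps))
    return f_quotient

# Made-up example: A = {0, 1, 2, 3}, f the successor mod 4, and the
# congruence with the classes {0, 2} and {1, 3} (compatible with f).
classes = [frozenset({0, 2}), frozenset({1, 3})]
f_quot = induced_operation(lambda x: (x + 1) % 4, classes)
# f_quot maps the class {0, 2} to the class {1, 3}, and vice versa
```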

Example 1.4.12 Let m ∈ N, and let Z/(m) be the set of all congruence
classes modulo m. We define two binary operations on these classes, by

[a]m + [b]m = [a + b]m and [a]m · [b]m = [a · b]m .

This makes (Z/(m); +, ·) an algebra of type (2, 2). In fact it is a ring, called
the ring of all congruence classes modulo m. If m is a prime number, this
ring is also a field.
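As a quick illustration, the two operations of Z/(m) can be modelled on canonical representatives; the class and method names below are our own, not the book's.

```python
class ZMod:
    """Sketch of the ring Z/(m): a class [a]_m is stored by its canonical
    representative a mod m, so equality of classes is equality of
    representatives."""
    def __init__(self, a, m):
        self.m = m
        self.a = a % m
    def __add__(self, other):          # [a]_m + [b]_m = [a + b]_m
        return ZMod(self.a + other.a, self.m)
    def __mul__(self, other):          # [a]_m · [b]_m = [a · b]_m
        return ZMod(self.a * other.a, self.m)
    def __eq__(self, other):
        return (self.m, self.a) == (other.m, other.a)

# Well-definedness in action: [7]_5 + [9]_5 and [2]_5 + [4]_5 are the
# same class [1]_5, regardless of the representatives chosen.
```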

1.5 Exercises
1.5.1. Determine all subalgebras of the algebra

A = ({0, 1, 2, 3}; max(x, y), min(x, y), x + y(4), 3 − x(4)),

where the four binary operations are defined as follows: max(x, y) and
min(x, y) are the maximum and the minimum, respectively, with respect
to the order relation defined by 0 ≤ 1 ≤ 2 ≤ 3; x + y(4) is addition modulo
4; and 3 − x(4) denotes subtraction modulo 4.

1.5.2. Prove the properties of Theorem 1.3.7.

1.5.3. Is the algebra (N; +) of type τ = (2) finitely generated? If so, give a
finite generating system.

1.5.4. A generating system X ⊆ A of an algebra A is called a basis of A,
if for all a ∈ X the inequality ⟨X \ {a}⟩A ≠ A is satisfied; that is, if no
proper subset of X generates A. Prove that every finitely generated algebra
has a finite basis.

1.5.5. Let (N; ′) be an algebra of type (1), with a unary operation ′ defined
by

0′ := 0, n′ := n − 1 for n > 0.

Prove that this algebra has no basis.

1.5.6. Complete the proofs of Theorems 1.3.10 and 1.4.10.

1.5.7. Let A = ({a, b, c, d}; f ) be an algebra of type τ = (1), with

x a b c d
f (x) c b a d.

Find all congruence relations θ on A, and for each one determine the quo-
tient algebra A/θ.

1.5.8. Let (A; t) be an algebra of type τ = (3), where t is defined by

t(x, y, z) = z if x = y, and t(x, y, z) = x if x ≠ y.

Prove that (A; t) is simple. (Such a function t is called a ternary discriminator
function.)

1.5.9. Determine all subalgebras and all congruence relations of the algebra
A = ({(0), (x), (2)}; ×) of type τ = (2), with × defined by

× (0) (x) (2)


(0) (0) (0) (2)
(x) (0) (x) (x)
(2) (2) (x) (2).

1.5.10. Draw the corresponding directed graph for the finite deterministic
automaton ({0, 1}, {0, 1}, {0, 1}; δ, λ), given by

δ 0 1          λ 0 1
0 0 1    and   0 0 0
1 1 0          1 1 1

1.5.11. Let G = (V, E) be a graph, with set V of vertices and set E ⊆ V × V
of edges. Fix a symbol 0 ∉ V , and let V ′ = V ∪ {0}. We can define a type (2)
algebra called a graph-algebra on the set V ′, with a binary multiplication
given by xy = x if (x, y) ∈ E and xy = 0 otherwise. Construct the graph
algebra for the graph G = ({1, 2}, {(1, 2), (2, 1), (2, 2)}).

1.5.12. a) Prove that any distributive lattice is modular.


b) Prove that the lattice L1 shown below is not modular.
c) Prove that the lattice L2 shown below is modular but not distributive.

[Hasse diagrams of the two five-element lattices L1 (the pentagon) and L2
(the diamond).]

1.5.13. Congruences on an algebra are connected to the concept of a normal


subgroup of a group. Let G be a group, considered as an algebra of type
(2, 1, 0), with the binary multiplication written as juxtaposition. For any
normal subgroup N of G, we define a relation θN on G by (a, b) ∈ θN iff
ab⁻¹ ∈ N iff ∃n ∈ N (a = nb).

a) Prove that the relation θN is a congruence relation on G, for which the


set N is exactly the equivalence class of the identity element of the group.
b) Prove conversely that if θ is a congruence on a group G, the equivalence
class under θ of the identity element of the group forms a normal subgroup
of the group.
Chapter 2

Galois Connections and Closures

A Galois-connection is a connection, with certain properties, between two


sets of objects, usually of different kinds. Such a connection can provide a
useful tool for studying properties of one kind of object, based on the
properties of the other (usually better-known) kind of objects. In the classical
Galois theory, for instance, properties of permutation groups are used to
study field extensions. Such connections provide a useful tool for studying
algebraic structures, and will be a main focus of study in this book. In partic-
ular, Galois-connections will be used in later chapters to produce two main
examples of complete lattices of closed sets, the lattice of all varieties of a
given type in Chapter 6 and the lattice of all clones on a fixed base set in
Chapter 10.

Galois-connections are also closely related to closure operators and closure


systems. We have already seen two examples of operators with closure prop-
erties: the formation of the subalgebra generated by a set, and the forma-
tion of the congruence generated by a binary relation. We begin this chapter
with a study of closure operators in general. In the second section we define
Galois-connections, and relate them to closure operators and systems. The
last section in this chapter describes an application of Galois-connections to
concept analysis.


2.1 Closure Operators


In the previous chapter we have seen two examples of operators with closure
properties. First, when we generate subalgebras of a given algebra A from
subsets X of the carrier set A of A, we have a mapping which takes any
X ⊆ A to a unique subset ⟨X⟩A of A. This gives a unary operation, or an
operator, ⟨ ⟩A : P(A) → P(A) on the power set of A, which we showed
has the three closure properties of Theorem 1.3.7. Later we saw that the
operator which maps any binary relation θ on A to the congruence ⟨θ⟩ConA
generated by θ also has these same properties. In fact operators with these
properties occur in many areas of Mathematics, and are frequently used as
a tool to study other structures.

Definition 2.1.1 Let A be a set. A mapping C : P(A) → P(A) is called


a closure operator on A, if for all subsets X, Y ⊆ A the following properties
are satisfied:

(i) X ⊆ C(X) (extensivity),

(ii) X ⊆ Y ⇒ C(X) ⊆ C(Y ) (monotonicity),

(iii) C(X) = C(C(X)) (idempotency).

Subsets of A of the form C(X) are called closed (with respect to the operator
C) and C(X) is said to be the closed set generated by X.

Example 2.1.2 1. For every algebra A, we have an operator ⟨ ⟩A which
maps every subset X ⊆ A of the carrier set of A to the carrier set ⟨X⟩A of
the least subalgebra ⟨X⟩A of A which contains X. As we remarked above,
Theorem 1.3.7 shows that this is a closure operator.

2. Let F be a field, and let (V ; +, ◦) be an F -vector space. Here we regard


◦ : F × V → V as the multiplication of the elements or vectors of V on the
left by elements, or scalars, from F . The linear hull of a subset X ⊆ V is
the F -subspace ⟨X⟩V of the F -vector space V generated by X. Again this
is an example of a closure operator, with the closure of a set X equal to the
set of all linear combinations which can be produced from the vectors of X.

3. Let A be a set. To every binary relation θ ⊆ A², we assign its transitive
hull ⟨θ⟩T , as defined in Remark 1.4.9:

⟨θ⟩T := ∩ {θ′ ⊆ A² | θ′ ⊇ θ and θ′ is transitive}.

This gives an operator ⟨ ⟩T : P(A²) → P(A²), which can be shown to be a
closure operator on A².
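The transitive hull of Example 3 is computable for finite relations, and the three closure properties can then be spot-checked directly; the sketch below uses our own function names.

```python
def transitive_hull(alpha):
    """⟨alpha⟩T: the least transitive relation containing alpha
    (Example 2.1.2, part 3)."""
    closed = set(alpha)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

alpha = {(1, 2), (2, 3), (3, 4)}
hull = transitive_hull(alpha)           # adds (1, 3), (2, 4) and (1, 4)

# Spot-checks of the three properties of Definition 2.1.1:
assert alpha <= hull                                # extensivity
assert transitive_hull(hull) == hull                # idempotency
assert hull <= transitive_hull(alpha | {(4, 5)})    # monotonicity
```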
To characterize sets which are closed with respect to a closure operator, we
will use the concept of a closure system.

Definition 2.1.3 Let A be a set. A subset H of P(A) is called a closure


system if it satisfies the following two conditions:

(i) A ∈ H, and
(ii) ∩ B ∈ H for every non-empty subset B ⊆ H.
The elements of a closure system H are called closures.

Example 2.1.4 1. By Corollary 1.3.5, for every algebra A the set Sub(A)
of all subalgebras of A is a closure system on A.

2. By Theorem 1.4.6 and Remark 1.4.7, for every algebra A the set ConA
of all congruences on A is a closure system on A2 .

3. For every set A, the set EqA of all equivalence relations defined on A is
a closure system on A2 .

4. For every set A, the set P(A) is a closure system on A.


Our goal now is to show that the family of sets which are closed with respect
to a given closure operator satisfies the conditions of Definition 2.1.3, and
conversely that any closure system induces a closure operator. We start with
the following definition.

Definition 2.1.5 Given a closure system H on a set A, we define an oper-


ator
CH : P(A) → P(A)
on A, by
X ↦ CH (X) := ∩ {H ∈ H | H ⊇ X}, for all X ⊆ A.

Conversely, for any closure operator C on A, we set

HC : = {X ⊆ A | C(X) = X}.
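For a finite base set, both directions of this correspondence can be computed directly. The following is a minimal sketch (the function names are ad hoc) on a small made-up closure system.

```python
from itertools import combinations

def C_H(H, X):
    """Closure operator induced by a closure system H (Definition 2.1.5).
    The intersection is over a non-empty family, since A itself is in H."""
    return frozenset.intersection(*[S for S in H if X <= S])

def H_C(C, A):
    """Closure system of a closure operator C on a finite set A:
    all subsets X of A with C(X) = X."""
    subsets = [frozenset(c) for r in range(len(A) + 1)
               for c in combinations(sorted(A), r)]
    return {X for X in subsets if C(X) == X}

# A small made-up closure system on A = {1, 2, 3}:
A = frozenset({1, 2, 3})
H = {A, frozenset({1, 2}), frozenset({1})}
# C_H({2}) = {1, 2}, the least member of H containing 2, and the closed
# sets of C_H are exactly the members of H (Theorem 2.1.6).
```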

Theorem 2.1.6 Let H be a closure system and C be a closure operator on


the set A. Then CH and HC , as defined above, satisfy the following proper-
ties:

(i) CH is a closure operator on A and HC is a closure system on A.

(ii) The closed sets with respect to CH are exactly the closures of H.

(iii) The closures of HC are exactly the closed sets of C.

(iv) HCH = H and CHC = C.

Proof: (i) We show first that CH is a closure operator. Since X is contained


in every “component” of the intersection CH (X), we have X ⊆ CH (X), and
CH is extensive. CH is also monotone, because for X ⊆ Y , we have

CH (Y ) = ∩ {H ∈ H | H ⊇ Y } ⊇ CH (X) = ∩ {H ∈ H | H ⊇ X}.

This follows because all sets H which contain Y also contain X, and therefore
the second intersection contains more components than the first one, making
it “smaller.” For the idempotency of CH , we observe that CH (CH (X)) is the
intersection of all closures from H which contain CH (X). But CH (X) itself
is such a closure: it is the intersection of all elements of H which contain X,
and since H is a closure system this intersection is also in H. This means
that CH (CH (X)) = CH (X). Altogether, CH is a closure operator.

Conversely, let C be a closure operator on A. Obviously, A = C(A); we


always have C(A) ⊆ A, and A ⊆ C(A) holds because of the extensivity of
the closure operator C. But this means that A itself is in our family HC , as
required for the first condition of a closure system. For the second condition,
let B ⊆ HC , and consider the intersection ∩B. For every H ∈ B we have
C(H) = H and ∩B ⊆ H. The monotonicity of the closure operator C
gives:
C(∩B) ⊆ C(H) = H for every H ∈ B, and therefore C(∩B) ⊆ ∩ {H | H ∈ B} = ∩ B.

Combining this with ∩B ⊆ C(∩B) gives C(∩B) = ∩ B, and thus ∩B ∈ HC .


This shows that the system HC is a closure system.

(ii) For any set X ⊆ A, we have

CH (X) = ∩ {H ∈ H | H ⊇ X} = X ⇔ X ∈ H.

Thus the sets closed under CH are exactly the closures of H.

(iii) It follows immediately from the definition of HC that the closures of HC


are exactly the sets which are closed with respect to C.

(iv) Since
X ∈ HCH ⇔ CH (X) = X ⇔ X ∈ H,
we have HCH = H.

To show that CHC = C, we start with

CHC (X) = ∩ {H ∈ HC | H ⊇ X} = ∩ {H | C(H) = H and H ⊇ X}.

For every set H in the latter set, X ⊆ H gives

C(X) ⊆ C(H) = H.

Since C(X) is contained in every component of the intersection, it is con-


tained in the intersection itself, giving C(X) ⊆ CHC (X).

Since X ⊆ C(X) = C(C(X)), it follows that C(X) is a component of the


intersection. Therefore we also have CHC (X) ⊆ C(X). The two inclusions
then give CHC (X) = C(X).

The following definitions describe some additional properties closure systems


may have.

Definition 2.1.7 A non-empty system G of sets is called upward directed,


if for every pair X, Y ∈ G there exists a set Z ∈ G with X ∪ Y ⊆ Z.

A system M of sets is called inductive, if for every upward directed subsys-


tem G, the union ∪G is also in M.

A closure operator C defined on a set A is called inductive, if for all X ⊆ A,

C(X) = ∪ {C(E) | E ⊆ X and E is finite}.


If the system of sets being considered is a closure system, the concepts of
this definition agree, and we have the following result, which we shall not
prove here. (A proof is given by Th. Ihringer in [58]).

Theorem 2.1.8 A closure system H is inductive if and only if the corre-


sponding closure operator CH is inductive.
We shall need the following important example of an inductive closure system
and operator.

Corollary 2.1.9 For every algebra A = (A; (fiA )i∈I ), the system Sub(A)
is an inductive closure system and ⟨ ⟩A is an inductive closure operator.
Proof: Let G ⊆ Sub(A) be upward directed; we have to prove that ∪G
∈ Sub(A). Let fiA be an ni -ary operation and assume that b1 , . . . , bni ∈ ∪G.
Then there exist sets G1 , . . ., Gni ∈ G with b1 ∈ G1 , . . ., bni ∈ Gni . Since
G is upward directed, there is a set G0 ∈ G with b1 , . . . , bni ∈ G0 . But
G0 ∈ Sub(A) means that

fiA (b1 , . . . , bni ) ∈ G0 ⊆ ∪G.

This makes ∪G ∈ Sub(A); so Sub(A) is inductive. The claim for ⟨ ⟩A follows
from Theorem 2.1.6 and the fact that ⟨ ⟩A = CSub(A) .

Just as we did for Sub(A), we can define a meet and join operation on any
closure system. If H ⊆ P(A) is a closure system, then for arbitrary sets
B ⊆ H, we use
⋀B := ∩ B and ⋁B := ∩ {H ∈ H | H ⊇ ∪B}.

In particular, for two-element sets B, this gives us binary meet and join op-
erations on H. Then every closure system is a lattice, under these operations,
and in fact a complete lattice (see Definition 1.2.11).

Example 2.1.10 Let A be any algebra. Since Sub(A) and ConA are closure
systems, the lattices (Sub(A); ⊆) and (ConA; ⊆) are complete lattices.

This shows that all subalgebra lattices are complete lattices. We can then ask
whether any complete lattice occurs as the subalgebra lattice of some algebra,
and if not whether there are some classes of complete lattices which do occur
in this way. To answer these questions, we need the following concept.

Definition 2.1.11 Let (L; ≤) be a complete lattice. An element a of L is
called compact, if for every set B ⊆ L with a ≤ ⋁B there exists a finite
subset B0 ⊆ B with a ≤ ⋁B0 . A lattice is called algebraic if it is a complete
lattice in which every element is the supremum of compact elements.

Theorem 2.1.12 For every inductive closure system H, the partially or-
dered set (H; ⊆) is an algebraic lattice.
Proof: In an inductive closure system, the finitely generated closures CH (E),
for finite sets E, are exactly the compact elements of (H; ⊆), and every
closure CH (X) is the supremum of the closures CH (E) for the finite subsets
E ⊆ X. Therefore inductive closure systems lead to algebraic lattices.

We mention here without proof the following important result; a proof may
be found for instance in G. Grätzer, [51] or P. Cohn, [10].

Corollary 2.1.13 A lattice L is isomorphic to the subalgebra lattice of some


algebra iff L is algebraic.

2.2 Galois Connections


A Galois-connection between two sets of objects is a pair of mappings, with
certain properties, between the power sets of the two sets. These mappings
allow us to move back and forth between the two kinds of objects, often
using information about one kind to learn more about the other. Such con-
nections will be a main focus of study in this book. Before we introduce the
general definitions, we look in detail at an example of such a connection.
This example builds on the idea of an operation preserving a relation, which
was the basis of our definition of a congruence relation. Our connection will
be between operations and relations on a given set.

We start with a fixed base set A, and as one of our sets of objects the set
O(A) of all operations on A. In Definition 1.4.1, we considered an intercon-
nection between operations from O(A) and binary relations θ ⊆ A2 , namely
that an operation could be compatible with, or preserve, a binary relation.

This concept can be generalized to include h-ary relations on A for any
h ≥ 1, as follows. We denote by Rh (A) the set of all h-ary relations defined
on A, and by R(A) = ⋃_{h≥1} Rh (A) the set of all finitary relations defined on
A.

We say f ∈ On (A) preserves the h-ary relation ρ ∈ R(A), if whenever

(a11 , . . . , a1h ) ∈ ρ, . . . , (an1 , . . . , anh ) ∈ ρ,

it follows that also

(f (a11 , . . . , an1 ), . . . , f (a1h , . . . , anh )) ∈ ρ.

This connection between operations and relations determines a mapping


which associates to any relation a set of operations. For any relation ρ ∈
R(A), we can consider the set of all operations from O(A) which preserve ρ.
This set will be denoted by P olA ρ, i.e.,

P olA ρ = {f | f ∈ O(A) and f preserves ρ}.

(We remark that in fact P olA ρ is a clone on A, in the sense defined in


Example 1.2.14: it is a set of operations on A which contains the projections
and which is closed under composition of operations.)

Example 2.2.1 Let A be the set {0, 1}.

1. Let ρ be the unary relation {0}. Notice that unary relations on a set are
simply subsets of this set. Then P olA {0} is the set of all Boolean functions
which preserve {0}, so

P olA {0} = {f ∈ O(A) | f (0, . . . , 0) = 0}.

2. Let
α = {(a, b, c, d) ∈ A4 | a + b = c + d},
where + is the addition modulo 2. An n-ary operation f on A is called linear,
if there are elements a1 , . . . , an , c ∈ {0, 1} such that

f (x1 , . . . , xn ) = a1 x1 + · · · + an xn + c

for all x1 , . . . , xn ∈ {0, 1}, where again + is the addition modulo 2. It can be
shown that a Boolean function f is linear iff it preserves α. Thus P olA α is
the set of all linear Boolean functions.
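On a finite base set the preservation condition can be checked by brute force. The sketch below (our own helper names, not the book's) recovers Example 2.2.1(1): exactly half of the sixteen binary Boolean operations preserve the relation {0}.

```python
from itertools import product

def preserves(f, n, rho, h):
    """Does the n-ary operation f preserve the h-ary relation rho?"""
    for rows in product(rho, repeat=n):            # n tuples, each in rho
        image = tuple(f(*(rows[k][j] for k in range(n)))
                      for j in range(h))
        if image not in rho:
            return False
    return True

# Example 2.2.1(1) on A = {0, 1}: count the binary operations that
# preserve the unary relation {0} (encoded as a set of 1-tuples).
rho = {(0,)}
good = 0
for table in product((0, 1), repeat=4):            # the 16 binary operations
    f = lambda x, y, t=table: t[2 * x + y]
    if preserves(f, 2, rho, 1):
        good += 1
# good == 8: exactly the operations with f(0, 0) = 0
```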

These examples show how, for any given relation ρ on A, we can look for
the set of all operations which preserve ρ. In the other direction, for a given
operation f ∈ O(A) one can look for the set of all relations from R(A) which
are preserved by f . Such relations are called invariants of f , and the set of
all such invariants is denoted by InvA f , that is,

InvA f = {ρ ∈ R(A) | f preserves ρ}.

We can also extend the maps P olA and InvA to sets of relations or functions,
respectively. If F ⊆ O(A) is a set of operations on A, we define InvA F to
be the set of all relations which are invariant for all f ∈ F , and similarly
for a set Q ⊆ R(A) of relations on A, we denote by P olA Q the set of all
operations which preserve every relation ρ ∈ Q.

These maps P olA and InvA , between sets of relations and sets of operations
on A, show the basic idea of a Galois-connection, between two sets of objects.
The following theorem of Pöschel and Kalužnin shows that in this example,
we have the two important properties that will be used shortly to define a
Galois-connection. These properties are that our maps between the two sets
of objects map larger sets to smaller images, and that mapping a set twice
returns us to a set which contains the starting set.

Theorem 2.2.2 ([96]) The following interconnections hold between sets of


the form P olA Q and InvA F , for Q ⊆ R(A) and F ⊆ O(A):
(i) F1 ⊆ F2 ⊆ O(A) ⇒ InvA F1 ⊇ InvA F2 ,
Q1 ⊆ Q2 ⊆ R(A) ⇒ P olA Q1 ⊇ P olA Q2 ;
(ii) F ⊆ P olA InvA F,
Q ⊆ InvA P olA Q.
Proof: These propositions follow immediately from the definitions of P olA
and InvA .

The concept of an invariant, or a set of all objects which are invariant under
certain changes, also plays an important role in other branches of Mathemat-
ics and Science. From Analytic Geometry we know for example that by Felix
Klein’s “Erlanger Programm” ([64]), different geometries may be regarded
as the theories of invariants of different transformation groups. Instead of
sets of the form P olA R, in this setting we have sets of transformations, more

exactly the carrier sets of transformation groups. The corresponding invari-


ants can be parallelism, the affine ratio, the cross ratio, the distance, or the
angle; again, larger sets of mappings determine smaller sets of invariants.
The following table gives a survey of the invariants of three transformation
groups:

invariants under     invariants under        invariants under
affine mappings      similarity mappings     motions

parallelism          parallelism             parallelism
affine ratio         affine ratio            affine ratio
                     cross ratio             angle
                     angle                   distance

These examples of mappings between two sets, with the two basic properties
from Theorem 2.2.2, form a model for our definition of a Galois-connection.

Definition 2.2.3 A Galois-connection between the sets A and B is a pair


(σ, τ ) of mappings between the power sets P(A) and P(B),

σ : P(A) → P(B) and τ : P(B) → P(A),

such that for all X, X 0 ⊆ A and all Y, Y 0 ⊆ B the following conditions are
satisfied:

(i) X ⊆ X 0 ⇒ σ(X) ⊇ σ(X 0 ), and Y ⊆ Y 0 ⇒ τ (Y ) ⊇ τ (Y 0 );


(ii) X ⊆ τ σ(X), and Y ⊆ στ (Y ).
Galois-connections are also related to closure operators, as the following
proposition shows.

Theorem 2.2.4 Let the pair (σ, τ ) with

σ : P(A) → P(B) and τ : P(B) → P(A)

be a Galois-connection between the sets A and B. Then:

(i) στ σ = σ and τ στ = τ ;

(ii) τ σ and στ are closure operators on A and B respectively;



(iii) The sets closed under τ σ are precisely the sets of the form τ (Y ), for
some Y ⊆ B; the sets closed under στ are precisely the sets of the form
σ(X), for some X ⊆ A.

Proof: (i) Let X ⊆ A. By the second Galois-connection property, we have


X ⊆ τ σ(X). By the first property, applying σ to this gives σ(X) ⊇ στ σ(X).
But we also have σ(X) ⊆ στ (σ(X)), by the second Galois-connection prop-
erty applied to the set σ(X). This gives us στ σ(X) = σ(X). The second
claim is proved similarly.

(ii) The extensivity of τ σ and στ follows from the second Galois-connection


property. From the first property we see that

X ⊆ X 0 ⇒ σ(X) ⊇ σ(X 0 ) ⇒ τ σ(X) ⊆ τ σ(X 0 ),

since σ(X) and σ(X 0 ) are subsets of B; and in the analogous way we get
from Y ⊆ Y 0 the inclusion στ (Y ) ⊆ στ (Y 0 ). Applying σ to the equation τ στ
= τ from part (i) gives us the idempotency of στ , and similarly for τ σ.

(iii) This is straightforward to verify.

A relation between the sets A and B is simply a subset of A×B. Any relation
R between A and B induces a Galois-connection, as follows. We can define
the mappings
σ : P(A) → P(B), τ : P(B) → P(A),

by
σ(X) : = {y ∈ B | ∀x ∈ X ((x, y) ∈ R)},
τ (Y ) : = {x ∈ A | ∀y ∈ Y ((x, y) ∈ R)}.

Then it is easy to verify that the pair (σ, τ ) forms a Galois-connection be-
tween A and B, called the Galois-connection induced by R. In our example
with P ol and Inv, consider the preservation relation R between O(A) and
R(A), defined by

R = {(f, θ) ∈ O(A) × R(A) | f preserves θ}.

The reader can verify that the Galois-connection induced by this R is in fact
the one we described.
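For finite A and B, the induced pair (σ, τ) and the two Galois-connection properties can be checked mechanically. The following is a minimal sketch with a made-up relation R (names are ours).

```python
def sigma(R, X, B):
    """σ(X) = { y ∈ B | (x, y) ∈ R for all x ∈ X }."""
    return frozenset(y for y in B if all((x, y) in R for x in X))

def tau(R, Y, A):
    """τ(Y) = { x ∈ A | (x, y) ∈ R for all y ∈ Y }."""
    return frozenset(x for x in A if all((x, y) in R for y in Y))

# A small made-up relation between A = {1, 2} and B = {'a', 'b'}:
A, B = {1, 2}, {'a', 'b'}
R = {(1, 'a'), (2, 'a'), (2, 'b')}

# Property (i): larger sets have smaller images.
assert sigma(R, {1, 2}, B) <= sigma(R, {1}, B)
# Property (ii): X ⊆ τσ(X).
assert {1} <= tau(R, sigma(R, {1}, B), A)
```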

2.3 Concept Analysis


Concept Analysis is a very useful application of Galois-connections. It is
based on the assumptions that all human knowledge involves conceptual
thinking, and that human reasoning involves manipulation of concepts. For-
mal concept analysis, developed since 1979 by R. Wille and his collaborators
(see [120]), goes back to Aristotle’s basic view of a concept as a unit of
thought constituted by its extension and its intension.

The extension of a concept is the collection of all objects belonging to the


concept, and the intension is the set of all attributes common to those ob-
jects. Subconcepts satisfy larger sets of attributes, while subsets of sets of
attributes determine superconcepts. This means precisely that we have a
Galois-connection between sets of objects and sets of attributes.

Since human thinking and communication always occurs in a context, the


first step is the formalization of contexts.

Definition 2.3.1 A (formal) context is a triple (G, M, I), in which G and


M are sets and I ⊆ G × M is a relation between G and M . The set G is
called the set of objects, and M is the set of attributes (Gegenstände und
Merkmale, respectively, in German). The relation I is defined by

I := {(g, m) ∈ G × M | object g has attribute m}.

This relation induces a Galois-connection (σ, τ ) between the sets G and M ,


as described in the previous section.

For a given formal context (G, M, I), our philosophical understanding of a


concept can be formalized by the following definition.

Definition 2.3.2 A pair (X, Y ) is said to be a formal concept of the formal


context (G, M, I) if X ⊆ G, Y ⊆ M , σ(X) = Y and τ (Y ) = X. The sets
X and Y are called the extent and the intent of the formal concept (X, Y ),
respectively.

It is clear from this definition that such formal concepts consist of pairs of
closed sets, under the closure operators τ σ and στ .

Let us denote by B(G, M, I) the set of all concepts of the context (G, M, I).
We can define a partial order on this set, by

(X1 , Y1 ) ≤ (X2 , Y2 ) :⇔ X1 ⊆ X2 (⇔ Y1 ⊇ Y2 ).

When (X1 , Y1 ) ≤ (X2 , Y2 ), the pair (X1 , Y1 ) is called a subconcept of the pair
(X2 , Y2 ), and conversely (X2 , Y2 ) is called a superconcept of (X1 , Y1 ). It is
easy to verify that (B(G, M, I), ≤) is a complete lattice.

Example 2.3.3 We consider the following context, using as our set G of


objects the set of planets in our solar system and a set of seven attributes
concerning size, distance from the sun and whether the planet has moons.
The relation I is given by the following table, where an x indicates that the
object in that row has the attribute of the given column. For convenience we
will abbreviate each planet name by one or two letters, and each attribute
as shown in the table below.

To illustrate how the concepts of this context may be identified, let us choose
an object, say the planet Jupiter. We find the set of all properties or at-
tributes this object has: large, far from the sun and has moons. Now we look
for the set of all objects which have exactly these properties: Jupiter and
Saturn. This gives us the concept ({J, S}, {l, f, y}). We can also start with a set
of objects instead of with a single object, and dually we can work from a set
of attributes. For instance, we obtain the concept ({J, S, U, N, P }, {f, y}),
which is a superconcept of the first one.

      small  medium  large  close  far  moons  no moons
      k      m       l      c      f    y      nm

Me    x                     x                  x
V     x                     x                  x
E     x                     x           x
Ma    x                     x           x
J                    x             x    x
S                    x             x    x
U            x                     x    x
N            x                     x    x
P     x                            x    x

This method leads to the following list of concepts:

({M e, V }, {k, c, nm}), ({E, M a}, {k, c, y}), ({J, S}, {l, f, y}),
({U, N }, {m, f, y}), ({P }, {k, f, y}), ({M e, V, E, M a}, {k, c}),
({J, S, U, N, P }, {f, y}), ({E, M a, J, S, U, N, P }, {y}).

Extending this list by taking intersections, we reach the Hasse diagram shown
below for the lattice (B(G, M, I); ≤).

The main goal is to find, for a given context, the hierarchy of concepts. The
super- and sub-concept relations also give implications between concepts,
allowing this method to be used to search for new implications. For more
information and examples about the theory of concept analysis we refer the
reader to the book [47] by B. Ganter and R. Wille.

[Hasse diagram of the concept lattice: the top node has extent
{M e, V, E, M a, J, S, U, N, P }; below it lie the nodes with extents
{M e, V, E, M a, P }, {E, M a, J, S, U, N, P }, {M e, V, E, M a},
{J, S, U, N, P }, {E, M a, P }, {M e, V }, {E, M a}, {P }, {J, S} and
{U, N }; the bottom node has empty extent.]

2.4 Exercises
2.4.1. Prove that for any relation R ⊆ A × B, the maps σ and τ defined by

σ(X) : = {y ∈ B | ∀x ∈ X ((x, y) ∈ R)}


τ (Y ) : = {x ∈ A | ∀y ∈ Y ((x, y) ∈ R)}

define a Galois-connection between A and B.

2.4.2. Let R ⊆ A × B be a relation between the sets A and B and let (µ, ι)
be the Galois-connection between A and B induced by R. Prove that for any
families {Ti ⊆ A | i ∈ I} and {Si ⊆ B | i ∈ I}, the following equalities hold:
a) µ( ⋃_{i∈I} Ti ) = ⋂_{i∈I} µ(Ti ).

b) ι( ⋃_{i∈I} Si ) = ⋂_{i∈I} ι(Si ).

(These properties will be used in Chapter 13.)

2.4.3. Prove Theorem 2.2.2.

2.4.4. A kernel system on A is defined as a subset K ⊆ P(A) with the
property that for all B ⊆ K, the set ⋃B is in K.

A kernel operator is a mapping D : P(A) → P(A) with the properties

(i) ∀M ⊆ A (D(M ) ⊆ M ) (intensivity)

(ii) ∀M, N ⊆ A (M ⊆ N ⇒ D(M ) ⊆ D(N )) (monotonicity)

(iii) ∀M ⊆ A (D(D(M )) = D(M )) (idempotency).

Formulate and prove a theorem for kernels analogous to 2.1.6.

2.4.5. We saw in Chapter 1 that given an algebra A, the operation which


takes any non-empty subset X of A to the subalgebra of A generated by X is
a closure operator. There is another operator involving subalgebras. For any
class K of algebras of a fixed type τ , let S(K) be the class of all algebras of
type τ which are subalgebras of some algebra in K. This defines an operator
S on the class Alg(τ ) of all algebras of type τ . Prove that this operator is a
closure operator on Alg(τ ). (This operator will be studied in Chapter 6.)

2.4.6. Choose a set G of objects consisting of four-sided polygons (square,


rectangle, parallelogram, etc.) and a set M of attributes of such objects. Pre-
pare a table for the corresponding relation I and draw the concept lattice.
Chapter 3

Homomorphisms and Isomorphisms

Consider the function which assigns to every real number a ∈ R its absolute
value | a | in the set R+ of non-negative real numbers. This function h : R →
R+ has the property that it is compatible with the multiplicative structure of
R, because | a · b | = | a | · | b |. We get the same result whether we multiply
the numbers first and then use our mapping, or permute these actions, since
calculation with the images proceeds in the same way as calculations with
the originals. A corresponding observation can be made about assigning to
a square matrix A its determinant | A |, or to a permutation s on the set
{1, . . . , n} its sign. In these cases the compatibility of the mapping with
the operation can be described by the equations

| A · B | = | A | · | B | and sgn(s1 ◦ s2 ) = sgn(s1 ) · sgn(s2 ).

In each of these examples, we have a mapping between the carrier sets of


two algebras of the same type, which is compatible with the operations of
the algebras. If the mapping between the carrier sets of the two algebras is
also a bijection, then the difference between the two algebras amounts only
to a relabelling of the elements.
This can be seen for instance in the following example. We consider the
algebras

A3 = ({(1), (123), (132)}; ◦) and Z3 = ({[0]3 , [1]3 , [2]3 }; +),

the group of all even permutations (the alternating group) of order three
and the cyclic group of order three, respectively. Both algebras have type


τ = (2), and it is easy to verify that both are groups. From their Cayley
tables

◦ (1) (123) (132) + [0]3 [1]3 [2]3


(1) (1) (123) (132) [0]3 [0]3 [1]3 [2]3
(123) (123) (132) (1) [1]3 [1]3 [2]3 [0]3
(132) (132) (1) (123) [2]3 [2]3 [0]3 [1]3

it appears that the bijection

h : {(1), (123), (132)} → {[0]3 , [1]3 , [2]3 },

defined by the mapping (1) ↦ [0]3 , (123) ↦ [1]3 and (132) ↦ [2]3 , is
compatible with the structure. For instance, we have

h((123) ◦ (132)) = h((1)) = [0]3 = [1]3 + [2]3 = h((123)) + h((132)),

and similarly for all other pairs of elements.
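This compatibility check on all pairs is easy to automate. Below is a sketch representing the permutations of A3 as tuples and the classes of Z3 by the representatives 0, 1, 2 (the variable names are ours).

```python
# The permutations of A3 as tuples p with p[i] the image of i
# (0-based, so the cycle (123) sends 0 ↦ 1, 1 ↦ 2, 2 ↦ 0).
e    = (0, 1, 2)     # the identity (1)
r123 = (1, 2, 0)     # the cycle (123)
r132 = (2, 0, 1)     # the cycle (132)

def compose(p, q):   # (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

# The bijection h from the text, with the class [a]_3 represented by
# a ∈ {0, 1, 2} and addition taken mod 3:
h = {e: 0, r123: 1, r132: 2}

# Compatibility, checked on all nine pairs of elements:
assert all(h[compose(p, q)] == (h[p] + h[q]) % 3 for p in h for q in h)
```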

Functions or bijections between algebras which have this compatibility prop-


erty are called homomorphisms or isomorphisms.

The following example shows that this compatibility property occurs in very
“practical” cases. An industrial automaton is designed to construct more
complex parts of a machine from certain simpler parts. There are working
phases in which two parts are combined into a new part, other phases in
which three parts are combined into a new one, and so on. These correspond
to binary operations, ternary operations, etc. If the parts are combined in a
different order, represented by a mapping h, then the automaton will con-
tinue to work in the correct way when h has our compatibility property. Thus
the concepts of homomorphism and isomorphism may be said to model math-
ematically the “artificial intelligence” we expect from such an automaton.

In this chapter, we define homomorphisms and isomorphisms, and present


several important theorems outlining their properties and connecting them
with quotient algebras and congruences.



(Figure: the automaton assembling parts, with the mapping h describing a different order of combination.)


3.1 The Homomorphism Theorem


We begin with the definitions of homomorphism and isomorphism.

Definition 3.1.1 Let A = (A; (fiA )i∈I ) and B = (B; (fiB )i∈I ) be algebras
of the same type τ . Then a function h : A → B is called a homomorphism
h : A → B of A into B if for all i ∈ I we have
h(fiA (a1 , . . . , ani )) = fiB (h(a1 ), . . . , h(ani )),
for all a1 , . . . , ani ∈ A. In the special case that ni = 0, this equation means
that h(fiA (∅)) = fiB (∅). That is, the element designated by the nullary
operation fiA in A must be mapped to the corresponding element fiB in B.

If the function h is bijective, that is both one-to-one (injective) and “onto”


(surjective), then the homomorphism h : A → B is called an isomorphism
from A onto B. An injective homomorphism from A into B is also called an
embedding of A into B.

A homomorphism h : A → A of an algebra A into itself is called an endomorphism of A, and an isomorphism h : A → A from A onto A is called an automorphism of A.

Example 3.1.2 1. It is easy to prove that for every algebra A, the identical
mapping idA : A → A, defined by idA (x) = x for all x ∈ A, is an automor-
phism of A.

2. Let A, B and C be algebras of the same type, and let h1 : A → B and


h2 : B → C be homomorphisms. The composition function h2 ◦ h1 : A → C is
defined by (h2 ◦ h1 )(x) = h2 (h1 (x)) for all x ∈ A. The reader should verify
that this composition is also a homomorphism, and that when both h1 and
h2 are surjective, injective or bijective, then the composition has the same
property. (See Exercise 3.3.4.)

3. If θ is a congruence relation on A and if A/θ is the corresponding quotient


algebra (see 1.4.11) then h : A → A/θ defined by a ↦ [a]θ is a surjective
homomorphism. The definition of operations on A/θ from 1.4.11 gives us

h(fiA (a1 , . . . , ani )) = [fiA (a1 , . . . , ani )]θ = fiA/θ ([a1 ]θ , . . . , [ani ]θ ) = fiA/θ (h(a1 ), . . . , h(ani ))

for all i ∈ I. This homomorphism is called the natural homomorphism in-


duced by θ on A, and is usually denoted by nat θ.
It is useful to consider the behaviour of subalgebras under homomorphic
mappings.

Theorem 3.1.3 Let h : A → B be a homomorphism of the algebra A of


type τ into the algebra B of type τ . Then we have:
(i) The image B1 = h(A1 ) of a subalgebra A1 of A under the homomorphism h is a subalgebra of B.

(ii) The preimage h−1 (B′ ) = A′ of a subalgebra B′ of h(A) ⊆ B is a subalgebra of A.

(iii) For any subset X ⊆ A, we have ⟨h(X)⟩B = h(⟨X⟩A ).


Proof: (i) By definition,

h(A1 ) = {b ∈ B | ∃ a ∈ A1 (h(a) = b)} ⊆ B.

Let fiB be an ni -ary operation on B, for i ∈ I, and let (b1 , . . . , bni ) ∈ h(A1 )ni .
Then for each 1 ≤ j ≤ ni , we have bj = h(aj ) for some aj in A1 . Then

fiB (b1 , . . . , bni ) = fiB (h(a1 ), . . . , h(ani )) = h(fiA (a1 , . . . , ani )),

and the latter is in h(A1 ) since fiA (a1 , . . . , ani ) ∈ A1 . Application of the
subalgebra criterion thus proves that the image B1 is a subalgebra of B.

(ii) Let (a1 , . . . , ani ) ∈ (h−1 (B′ ))ni and let fiA be ni -ary. We have

h−1 (B′ ) = {a ∈ A | ∃ b ∈ B′ (h(a) = b)};

so for each aj in h−1 (B′ ), the element bj := h(aj ) lies in B′ . Then

h(fiA (a1 , . . . , ani )) = fiB (h(a1 ), . . . , h(ani )) = fiB (b1 , . . . , bni ),

and the latter is in B′ , because all of b1 , . . . , bni are in B′ and B′ is a subalgebra of B. From this we get

fiA (a1 , . . . , ani ) ∈ h−1 (B′ ).

(iii) Let E be the operator used in Theorem 1.3.8, so

E(X) : = X ∪ {fiA (a1 , . . . , ani ) | i ∈ I, a1 , . . . , ani ∈ X}.

We show first that E(h(X)) = h(E(X)) for all X ⊆ A. The set E(h(X))
consists of all elements h(y) with y ∈ X, plus elements of the form

fiB (h(y1 ), . . . , h(yni )), for i ∈ I, y1 , . . . , yni ∈ X.

The set h(E(X)) also consists of the elements h(y) with y ∈ X plus the
elements h(fiA (y1 , . . . , yni )), which agree with fiB (h(y1 ), . . . , h(yni )).

By induction on k we can prove that E k (h(X)) = h(E k (X)) for all k ∈ N.


Then we have:

⟨h(X)⟩B = ⋃k≥0 E k (h(X)) = ⋃k≥0 h(E k (X)) = h( ⋃k≥0 E k (X)) = h(⟨X⟩A ).
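For a finite algebra the operator E and the generated subalgebra ⟨X⟩ can be computed directly, which makes part (iii) machine-checkable. A Python sketch follows; the helper names E and generate, and the example with A = (Z12 ; +), B = (Z6 ; +) and h(a) = a mod 6, are choices made here for illustration:

```python
from itertools import product

def E(X, ops):
    """One step of the operator E: adjoin all f(a1, ..., an) with each ai in X."""
    new = set(X)
    for f, arity in ops:
        for args in product(X, repeat=arity):
            new.add(f(*args))
    return new

def generate(X, ops):
    """<X> as the union of the E^k(X): iterate E until nothing new appears."""
    cur = set(X)
    while True:
        nxt = E(cur, ops)
        if nxt == cur:
            return cur
        cur = nxt

# Example (chosen for illustration): A = (Z12; +), B = (Z6; +), h(a) = a mod 6
ops_A = [(lambda x, y: (x + y) % 12, 2)]
ops_B = [(lambda x, y: (x + y) % 6, 2)]
h = lambda a: a % 6

X = {8}
lhs = generate({h(x) for x in X}, ops_B)   # <h(X)>_B
rhs = {h(a) for a in generate(X, ops_A)}   # h(<X>_A)
print(lhs == rhs)  # True
```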

Before we continue with general properties of homomorphisms, we will examine more closely the automorphisms on an algebra. The set of all automor-
phisms of the algebra A is denoted by AutA, and the set of all its endomor-
phisms is denoted by EndA. Since the composition ◦ of two automorphisms
(endomorphisms) of A is again an automorphism (endomorphism) of A and
since ◦ is associative, (AutA; ◦, idA ) and (EndA; ◦, idA ) are monoids, with
the former a submonoid of the latter. It is also true that if h : A → A is an
automorphism of A, the mapping h−1 : A → A is again an automorphism
of A. This can be seen as follows:

h−1 (fiA (b1 , . . . , bni )) = h−1 (fiA (h(a1 ), . . . , h(ani ))) =


h−1 (h(fiA (a1 , . . . , ani ))) = fiA (a1 , . . . , ani )
= fiA (h−1 (b1 ), . . . , h−1 (bni )).

This gives the following result.

Lemma 3.1.4 The set of all automorphisms of an algebra A forms a


group, AutA = (AutA; ◦, −1 , idA ), called the automorphism group of A.
Let A = (A; (fiA )i∈I ) be an algebra of type τ and let h : A → A be an
automorphism of A. An element a ∈ A is called a fixed point of h, if h(a) =
a. Every a ∈ A is of course a fixed point of the identical automorphism idA
on A.

Lemma 3.1.5 The set of all fixed points of an automorphism h of A is a


subalgebra of A.
Proof: Let h be an automorphism on A. We consider the set Fh of all fixed
points of h:
Fh : = {a | a ∈ A and h(a) = a}.
Let fi , for i ∈ I, be an ni -ary operation on A, and assume that a1 , . . . , ani ∈
Fh . Then

fiA (a1 , . . . , ani ) = fiA (h(a1 ), . . . , h(ani )) = h(fiA (a1 , . . . , ani )),

since h is an automorphism, and thus fiA (a1 , . . . , ani ) ∈ Fh . By Criterion


1.3.3, Fh is the carrier set of a subalgebra of A.
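As a concrete check of Lemma 3.1.5, consider the group (Z6 ; +) and its automorphism x ↦ 5x mod 6 (that is, x ↦ −x); the example is ours, not the book's. A short Python sketch computes Fh and verifies closure:

```python
h = lambda x: (5 * x) % 6          # the automorphism x -> 5x = -x of (Z6; +)
A = range(6)

Fh = {a for a in A if h(a) == a}   # the set of fixed points
print(Fh)  # {0, 3}

# Fh is closed under the operation, hence a subalgebra of (Z6; +)
closed = all((a + b) % 6 in Fh for a in Fh for b in Fh)
print(closed)  # True
```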

We have just considered, for a given automorphism h on an algebra A, the


set of all elements of A which are fixed points of h. In the opposite direction,

we can choose a subset B of A and look for the set of all automorphisms on A
whose set of fixed points contains B. When B is exactly the set of fixed points
of this set of automorphisms, Lemma 3.1.5 tells us that B is a subalgebra of
A. To explore this connection further, we will need the following definitions.

Definition 3.1.6 Let A and A′ be algebras of the same type τ and let
B be a subalgebra of both A and A′ . An isomorphism h : A → A′ with
h(b) = b for all b ∈ B is called a relative isomorphism between A and
A′ with respect to B. When h is an automorphism on A, it is then called a
relative automorphism of A with respect to the subalgebra B.
We can consider such a relative isomorphism, h : A → A′ with respect
to a common subalgebra B, in two ways. From “above,” we say that the
restriction of h to B is the identity isomorphism on B, while from “below,”
we call h an extension of the identity isomorphism on B.

More generally, let B and B′ be subalgebras of the algebras A and A′ , respectively. Let g : B → B′ and h : A → A′ be isomorphisms. Then we say
that the isomorphism h is an extension of g, if for all b ∈ B the equation
g(b) = h(b) is satisfied. The following result is easy to verify.

Lemma 3.1.7 The set AutrelB A of all relative automorphisms of A with


respect to a subalgebra B ⊆ A forms a subgroup of the automorphism group
of A.
We have seen that any automorphism h on A determines a subalgebra of A,
consisting of the fixed points of h. If we take h to be a relative automorphism
with respect to some subalgebra B, this fixed point subalgebra contains B.
Now let G be a subgroup of the group of all relative automorphisms of A
with respect to B. We form the set

B′ := {b ∈ A | s(b) = b for all s ∈ G}

of the elements of A which are fixed points of all the automorphisms in G.


Lemma 3.1.5 tells us that B′ is a subalgebra of A, which again has B as a
subalgebra. This gives a way of associating to every subgroup G ⊆ AutrelB A
an algebra B′ between B and A, so B ⊆ B′ ⊆ A. We can ask whether the
converse holds: does every algebra B′ with B ⊆ B′ ⊆ A determine a subgroup
of AutrelB A which consists exactly of the automorphisms of A fixing the
elements from B′ ?

The careful reader will have noticed that these interconnections amount to
a Galois-connection between the sets A and AutrelB A. We have a basic re-
lation

R := {(a, s) | a ∈ A and s ∈ AutrelB A and s(a) = a},


which induces the maps

σ(X) := {s ∈ AutrelB A | ∀a ∈ X(s(a) = a)}


and τ (Y ) := {a ∈ A | ∀s ∈ Y (s(a) = a)},

for all X ⊆ A and Y ⊆ AutrelB A.
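A toy instance of this Galois-connection can be computed directly. For the group (Z4 ; +) the only automorphisms are the identity and x ↦ 3x = −x, and both fix the subalgebra B = {0, 2}, so here AutrelB A is the whole automorphism group. The function names sigma and tau below mirror the maps σ and τ just defined (the example itself is chosen for illustration):

```python
A = list(range(4))
id_ = lambda x: x
neg = lambda x: (3 * x) % 4        # x -> 3x = -x, the other automorphism of (Z4; +)
G = [id_, neg]                     # Autrel_B A for B = {0, 2}: both maps fix B

def sigma(X):
    """All automorphisms in G fixing every element of X."""
    return [s for s in G if all(s(a) == a for a in X)]

def tau(Y):
    """All elements of A fixed by every automorphism in Y."""
    return {a for a in A if all(s(a) == a for s in Y)}

print(tau(G))           # {0, 2}: the common fixed points form a subalgebra
print(len(sigma({1})))  # 1: only the identity fixes 1
```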

We now return to the general theory of homomorphisms, with some theorems that relate homomorphisms to congruences. We saw in Example 3.1.2
part 3 that every congruence relation θ on an algebra A determines a ho-
momorphism, namely the natural homomorphism nat θ:A → A/θ onto the
quotient algebra. Now we will show that conversely every homomorphism of
an algebra A also determines a congruence relation on A.

Let h : A → B be a homomorphism. Since the function h : A → B is in


general not injective, we may have different elements with the same image.
We investigate the equivalence relation corresponding to the partition of the
set A into classes consisting of elements having the same image, that is, the
kernel of h.

Definition 3.1.8 Let A and B be algebras of the same type τ and let h :
A → B be a homomorphism. The following binary relation is called the
kernel of the homomorphism h :

ker h : = {(a, b) ∈ A2 | h(a) = h(b)}.

We may alternatively express this as ker h = h−1 ◦ h, where h−1 is the inverse
relation of h.

Lemma 3.1.9 The kernel of any homomorphism h : A → B is a congruence


relation on A.

Proof: By definition, ker h is an equivalence relation on A. To verify the


compatibility property needed for a congruence, let fiA , for some i ∈ I, be

an ni -ary fundamental operation of A and let (a1 , b1 ), . . ., (ani , bni ) be in


ker h. This means that h(a1 ) = h(b1 ), . . ., h(ani ) = h(bni ). Now applying
the operation fiB gives the equation:

fiB (h(a1 ), . . . , h(ani )) = fiB (h(b1 ), . . . , h(bni )).

Since h : A → B is a homomorphism of A into B, this gives us

h(fiA (a1 , . . . , ani )) = h(fiA (b1 , . . . , bni )),

and thus

(fiA (a1 , . . . , ani ), fiA (b1 , . . . , bni )) ∈ ker h.
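Lemma 3.1.9 can be illustrated on the homomorphism h : Z6 → Z3 , a ↦ a mod 3 (an example chosen here). The sketch below builds ker h as a set of pairs and verifies the compatibility condition exhaustively:

```python
from itertools import product

h = lambda a: a % 3        # homomorphism h : Z6 -> Z3
A = range(6)

ker_h = {(a, b) for a in A for b in A if h(a) == h(b)}   # kernel as a set of pairs

# compatibility: (a1, b1), (a2, b2) in ker h  =>  (a1 + a2, b1 + b2) in ker h
compatible = all(((a1 + a2) % 6, (b1 + b2) % 6) in ker_h
                 for (a1, b1), (a2, b2) in product(ker_h, repeat=2))
print(compatible)  # True
```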

Suppose we have a homomorphism h : A → B. We have seen that ker h


is a congruence on A, so we can form the quotient algebra A/ker h, along
with the natural homomorphism nat(ker h) : A → A/ker h which maps the
algebra A onto this quotient algebra. Now we have two homomorphic images
of A: the original h(A) and the new quotient A/ker h. What connection is
there between these two homomorphic images? The answer to this question
is a special case of the following General Homomorphism Theorem.

Theorem 3.1.10 (General Homomorphism Theorem) Let h : A → B and


g : A → C be homomorphisms, and let g be surjective. Then there exists a
homomorphism f : C → B which satisfies f ◦ g = h iff ker g ⊆ ker h.
(Diagram: h : A → B across the top, g : A → C downward, and f : C → B upward, so that f ◦ g = h.)

Here f ◦ g = h means “commutativity” of the diagram. When this f exists,


it has the following properties:

(i) the homomorphism f is uniquely defined by f = h ◦ g −1 ;

(ii) f is injective iff ker g = ker h;

(iii) f is surjective iff h is surjective.

Proof: We first assume that there exists a homomorphism f : C → B which


satisfies f ◦ g = h. To see that ker g ⊆ ker h, let (a, b) ∈ ker g. Then
g(a) = g(b), and so f (g(a)) = f (g(b)), that is, (f ◦ g)(a) = (f ◦ g)(b).
Since f ◦ g = h, it follows that h(a) = h(b) and (a, b) ∈ ker h, as required.

Conversely, let ker g ⊆ ker h. We define f : = h ◦ g −1 , and show that f


has the required properties. The domain of f is C because of the surjectivity
of g. To see that f is uniquely determined, let c be any element of C, and
suppose that both a1 and a2 are in g −1 (c). Then we have g(a1 ) = g(a2 ) =
c, and so (a1 , a2 ) ∈ ker g. But under our assumption this puts (a1 , a2 ) ∈
ker h. Therefore h(a1 ) = h(a2 ), and f (c) = h ◦ g −1 (c) gives the same result
whether a1 or a2 is used for g −1 (c). Thus f is a well-defined function of C
into B, and clearly f satisfies f ◦ g = h.

Finally, to see that f is a homomorphism, assume that fiC is ni -ary, for i ∈ I.
Let c1 , . . . , cni ∈ C, and choose a1 , . . . , ani ∈ A with g(aj ) = cj for 1 ≤ j ≤ ni .
Then we have

f (fiC (c1 , . . . , cni )) = f (fiC (g(a1 ), . . . , g(ani )))
= f (g(fiA (a1 , . . . , ani ))) = (f ◦ g)(fiA (a1 , . . . , ani ))
= fiB ((f ◦ g)(a1 ), . . . , (f ◦ g)(ani ))
= fiB (f (g(a1 )), . . . , f (g(ani )))
= fiB (f (c1 ), . . . , f (cni )),

since h = f ◦ g is a homomorphism.

(i) We show that any homomorphism f ′ : C → B which satisfies f ′ ◦ g =
h agrees with f = h ◦ g −1 . If f ◦ g = h and f ′ ◦ g = h, then we have
f ◦ g = f ′ ◦ g, and hence for all a ∈ A, f (g(a)) = f ′ (g(a)). Thus, because
of the surjectivity of g the equation f (c) = f ′ (c) is satisfied for all c ∈ C,
and we have f = f ′ .

(ii) We are assuming that ker g ⊆ ker h, so that f exists. Now we assume
that f is injective, and take (a1 , a2 ) ∈ ker h. So h(a1 ) = h(a2 ), and using
f ◦ g = h gives (f ◦ g)(a1 ) = (f ◦ g)(a2 ) and f (g(a1 )) = f (g(a2 )). Now the
injectivity of f gives g(a1 ) = g(a2 ) and thus (a1 , a2 ) ∈ ker g. This shows
that ker h ⊆ ker g, and hence the two kernels are equal.

Conversely, suppose that ker g = ker h, and let us show that f is injective.
Let f (c1 ) = f (c2 ) for c1 , c2 ∈ C. Since g is surjective, we can represent cj
in the form g(aj ), for some aj ∈ A, for j = 1, 2. Now we have f (g(a1 )) =
f (g(a2 )), and hence h(a1 ) = h(a2 ). This puts (a1 , a2 ) in ker h, and by our
assumption also in ker g. But this means c1 = g(a1 ) = g(a2 ) = c2 , as re-
quired.

(iii) When f is surjective, the fact that g is surjective by assumption makes
the composition h = f ◦ g also surjective. Conversely, suppose that h is
surjective. Then for every b ∈ B there exists an element a ∈ A with
(f ◦ g)(a) = f (g(a)) = b. Thus for every b ∈ B there is an element of C,
namely g(a), which is mapped by f to b, and f is surjective.
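The construction in this proof is effective for finite algebras. A Python sketch with the surjective homomorphism g : Z12 → Z6 and the homomorphism h : Z12 → Z3 given by reduction (an example of ours, which does satisfy ker g ⊆ ker h):

```python
A = range(12)
g = lambda a: a % 6    # surjective homomorphism g : Z12 -> Z6
h = lambda a: a % 3    # homomorphism h : Z12 -> Z3

# ker g is contained in ker h: whenever 6 divides a - b, so does 3
assert all(h(a) == h(b) for a in A for b in A if g(a) == g(b))

# f := h o g^{-1}: for each c in Z6 pick any preimage under g and apply h;
# ker g <= ker h guarantees the choice does not matter
f = {c: h(next(a for a in A if g(a) == c)) for c in range(6)}

print(all(f[g(a)] == h(a) for a in A))  # True: f o g = h
```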

As mentioned above, as a specific case of the General Homomorphism Theorem we can consider the surjective natural homomorphism associated to any
congruence. We have the following result, which tells us that any homomor-
phic image of an algebra is isomorphic to a quotient algebra of that algebra
by the kernel of the homomorphism.

Theorem 3.1.11 (Homomorphic Image Theorem) Let h : A → B be a


surjective homomorphism. Then there exists a unique isomorphism f from
A/ker h onto B with f ◦ nat(ker h) = h.
Proof: We have ker(nat(ker h)) = ker h, since

(a1 , a2 ) ∈ ker(nat(ker h)) ⇔ (nat(ker h))(a1 ) = [a1 ]ker h

= [a2 ]ker h = (nat(ker h))(a2 ) ⇔ (a1 , a2 ) ∈ ker h.


Thus Theorem 3.1.10 gives the existence of a uniquely defined homomor-
phism f with f ◦ nat(ker h) = h. Moreover, parts (ii) and (iii) of Theorem
3.1.10 show that f is an isomorphism.
(Diagram: h : A → B across the top, nat(ker h) : A → A/ker h downward, and the isomorphism f : A/ker h → B upward, so that f ◦ nat(ker h) = h.)

The basic idea of the Homomorphic Image Theorem is that whenever we


have a surjective homomorphism h from A onto B, the image B is actually
isomorphic to the quotient algebra A/ker h of A. This means that any
homomorphic image of an algebra, which is usually “outside” of the algebra,
can in fact be characterized up to isomorphism “inside” the algebra, as a
quotient algebra determined by the kernel of the homomorphism. So to know
all homomorphic images of a given algebra, it is enough to find all congruence
relations and quotients of this algebra.

3.2 The Isomorphism Theorems


In this section we apply our General Homomorphism Theorem to two other
situations, to produce two theorems which are usually called the First and
Second Isomorphism Theorems. For the first one, we consider subalgebras of
an algebra.

Theorem 3.2.1 (First Isomorphism Theorem) Let A and B be algebras of


the same type, and let h : A → B be a homomorphism. Let A1 be a subalgebra
of A and h(A1 ) ⊆ B its image. We also assume that A∗1 is the preimage of
h(A1 ) and that h1 = h|A1 is the restriction of h to A1 , and we take h∗1 =
h|A∗1 to be the restriction of h to A∗1 . Then

ϕ : A1 /ker h1 → A∗1 /ker h∗1 , defined by [a]ker h1 ↦ [a]ker h∗1 ,

is an isomorphism from A1 /ker h1 onto A∗1 /ker h∗1 .


Proof: By definition of the functions h1 and h∗1 , we start with the commu-
tative diagram shown below. Since h1 and h∗1 are surjective, we can apply

the Homomorphism Theorem twice on them and the corresponding natural


homomorphisms, to get the existence of two isomorphisms f1 and f1∗ , as
shown in the second diagram.

(Two diagrams: first, the restrictions h1 : A1 → h(A1 ) and h∗1 : A∗1 → h(A1 ) of h, drawn inside the square for h : A → B; second, the isomorphisms f1 : A1 /ker h1 → h(A1 ) and f1∗ : A∗1 /ker h∗1 → h(A1 ) supplied by the Homomorphism Theorem, tracking an element x to [x]ker h1 and [x]ker h∗1 .)

Therefore ϕ := (f1∗ )−1 ◦ f1 is an isomorphism.

For the second Isomorphism Theorem, we consider two congruences θ1 and


θ2 on an algebra A, with θ1 ⊆ θ2 . We can define a new relation on A/θ1 , by

θ2 /θ1 : = {([a]θ1 , [b]θ1 ) | (a, b) ∈ θ2 }.

Theorem 3.2.2 (Second Isomorphism Theorem) Let θ1 and θ2 be congru-


ences on an algebra A, with θ1 ⊆ θ2 . Then the relation θ2 /θ1 is a congruence
relation on A/θ1 , and the function

ϕ : (A/θ1 )/(θ2 /θ1 ) → A/θ2 , defined by [[a]θ1 ]θ2 /θ1 ↦ [a]θ2 ,

is an isomorphism.

Proof: It is clear from the definition that θ2 /θ1 is a relation, and in fact
an equivalence relation, on A/θ1 . Let fiA/θ1 , for i ∈ I, be a fundamental
operation of the algebra A/θ1 and let

([a1 ]θ1 , [b1 ]θ1 ) ∈ θ2 /θ1 , . . . , ([ani ]θ1 , [bni ]θ1 ) ∈ θ2 /θ1 .

Then it follows that

fiA/θ1 ([a1 ]θ1 , . . . , [ani ]θ1 ) = [fiA (a1 , . . . , ani )]θ1 and
fiA/θ1 ([b1 ]θ1 , . . . , [bni ]θ1 ) = [fiA (b1 , . . . , bni )]θ1 .

From the definition of θ2 /θ1 we also know that

(a1 , b1 ) ∈ θ2 , . . . , (ani , bni ) ∈ θ2 ,

and thus
(fiA (a1 , . . . , ani ), fiA (b1 , . . . , bni )) ∈ θ2 .
This makes

([fiA (a1 , . . . , ani )]θ1 , [fiA (b1 , . . . , bni )]θ1 ) ∈ θ2 /θ1 ,

and hence

(fiA/θ1 ([a1 ]θ1 , . . . , [ani ]θ1 ), fiA/θ1 ([b1 ]θ1 , . . . , [bni ]θ1 )) ∈ θ2 /θ1 .

Using the General Homomorphism Theorem on the two surjective homomorphisms

natθ2 : A → A/θ2 and natθ1 : A → A/θ1 ,
we deduce the existence of a surjective homomorphism

f : A/θ1 → A/θ2 , which is defined by [a]θ1 ↦ [a]θ2 .

Since θ2 /θ1 is a congruence relation on A/θ1 , we also have the corresponding


surjective natural homomorphism

nat(θ2 /θ1 ) : A/θ1 → (A/θ1 )/(θ2 /θ1 ).

Again by the General Homomorphism Theorem, there then exists a surjective


homomorphism
ϕ : (A/θ1 )/(θ2 /θ1 ) → A/θ2 .
Furthermore we have:

([a1 ]θ1 , [a2 ]θ1 ) ∈ ker f ⇔ f ([a1 ]θ1 ) = f ([a2 ]θ1 ) ⇔
[a1 ]θ2 = [a2 ]θ2 ⇔ (a1 , a2 ) ∈ θ2 ⇔
([a1 ]θ1 , [a2 ]θ1 ) ∈ θ2 /θ1 ⇔ ([a1 ]θ1 , [a2 ]θ1 ) ∈ ker(nat(θ2 /θ1 )),

because ker(nat(θ2 /θ1 )) = θ2 /θ1 . Therefore part (ii) of the General Homomorphism Theorem tells us that our surjective homomorphism ϕ is also
injective, and so ϕ is an isomorphism.
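A small numerical check of this theorem, with A = Z12 , θ1 the congruence modulo 6 and θ2 the congruence modulo 3, so that θ1 ⊆ θ2 (an example chosen here); the sketch confirms that (A/θ1 )/(θ2 /θ1 ) has exactly the classes of A/θ2 , so that ϕ is the evident bijection:

```python
A = range(12)
cls1 = lambda a: a % 6     # the class of a under theta1 (congruence mod 6)
cls2 = lambda a: a % 3     # the class of a under theta2 (congruence mod 3)

quot1 = sorted({cls1(a) for a in A})     # labels of the classes of A/theta1

# theta2/theta1 relates [a]theta1 and [b]theta1 exactly when (a, b) in theta2
theta2_over_1 = {(x, y) for x in quot1 for y in quot1 if x % 3 == y % 3}

# the classes of (A/theta1)/(theta2/theta1) ...
double = {frozenset(y for y in quot1 if (x, y) in theta2_over_1) for x in quot1}

# ... correspond one-to-one to the classes of A/theta2
print(len(double), len({cls2(a) for a in A}))  # 3 3
```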

3.3 Exercises
3.3.1. Let A be a set, let Θ be an equivalence relation on A and let f : A → A
be a function. Prove that f is compatible with Θ iff there is a mapping
g : A → A with

ker g ⊆ Θ ⊆ ker(f ◦ g).

Hint: Choose g in such a way that ker g = Θ.

3.3.2. Let A and B be algebras of type τ , and let h : A → B be a function.


Prove that h is a homomorphism iff {(a, h(a)) | a ∈ A} is a subalgebra of
A × B.

3.3.3. Let G = ({0, 1, 2, 3}; +, 0), where + is the operation of addition mod-
ulo 4, and let A = ({e, a}; ·, e) with a = e · a = a · e, e · e = a · a = e, both
algebras of type (2, 0). Let h : G → A be the mapping defined by 0 ↦ e,
1 ↦ e, 2 ↦ a, and 3 ↦ a. Is h a homomorphism?

3.3.4. Prove that the composition of two (surjective, injective or bijective)


homomorphisms is again a (surjective, injective or bijective) homomorphism.

3.3.5. Prove that the inverse of an isomorphism is also an isomorphism.

3.3.6. We can define an operator H on the class Alg(τ ) of all algebras of


type τ , as follows. For any K ⊆ Alg(τ ), let H(K) be the class of all algebras
of type τ which are homomorphic images of some algebra in K. Prove that
this operator H is a closure operator on Alg(τ ). (This operator will be used
again in Chapter 6.)
Chapter 4

Direct and Subdirect


Products

In the previous chapters, we have seen three ways to construct new alge-
bras from given algebras: by formation of subalgebras, quotient algebras,
and homomorphic images. In this chapter we examine another important
construction, the formation of product algebras. One useful feature of this
new construction involves the cardinalities of the algebras obtained. The for-
mation of subalgebras or of homomorphic images of a given algebra leads
to algebras with cardinality no larger than the cardinality of the given alge-
bra. The formation of products, however, can lead to algebras with bigger
cardinalities than those we started with. There are several ways to define a
product of given algebras; we shall examine two products, called the direct
product and the subdirect product.

4.1 Direct Products


Definition 4.1.1 Let (Aj )j∈J be a family of algebras of type τ . The direct
product ∏j∈J Aj of the Aj is defined as an algebra with the carrier set

P := ∏j∈J Aj := {(xj )j∈J | ∀j ∈ J (xj ∈ Aj )}

and the operations

(fiP (a1 , . . . , ani ))(j) = fiAj (a1 (j), . . . , ani (j)),

for a1 , . . ., ani in P ; that is,

fiP ((a1j )j∈J , . . . , (ani j )j∈J ) = (fiAj (a1j , . . . , ani j ))j∈J .

If for all j ∈ J, Aj = A, then we usually write AJ instead of ∏j∈J Aj . If
J = ∅, then A∅ is defined to be the one-element (trivial) algebra of type τ .
If J = {1, . . . , n}, then the direct product can be written as A1 × · · · × An .

The projections of the direct product ∏j∈J Aj are the mappings

pk : ∏j∈J Aj → Ak defined by (aj )j∈J ↦ ak .

It is easy to check that the projections of the direct product are in fact sur-
jective homomorphisms.
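For two finite factors the definition can be set up directly in code. A sketch for Z2 × Z3 with componentwise addition (the encoding as integer pairs is ours), verifying that both projections are surjective homomorphisms:

```python
from itertools import product

Z2, Z3 = range(2), range(3)
P = list(product(Z2, Z3))            # carrier set of the direct product

def plus(x, y):                      # componentwise operation
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

p1, p2 = (lambda x: x[0]), (lambda x: x[1])

hom = all(p1(plus(x, y)) == (p1(x) + p1(y)) % 2 and
          p2(plus(x, y)) == (p2(x) + p2(y)) % 3
          for x in P for y in P)
surj = {p1(x) for x in P} == set(Z2) and {p2(x) for x in P} == set(Z3)
print(hom and surj)  # True
```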

Remark 4.1.2 Let A be an algebra of type τ , let (Aj )j∈J be a family of
algebras of type τ and let (fj : A → Aj )j∈J be a family of homomorphisms.
Then there exists a unique homomorphism f : A → ∏j∈J Aj such that
pj ◦ f = fj for all j ∈ J, namely f = (fj )j∈J . This homomorphism f
makes the following diagram commute.

(Diagram: fj : A → Aj across the top, f : A → ∏j∈J Aj downward, and pj upward, so that pj ◦ f = fj ; an element a is mapped to (fj (a))j∈J and then projected to fj (a).)

Example 4.1.3 Let us consider the direct product of the two permutation
groups S2 and A3 . Here

S2 = ({τ0 , τ1 }; ◦, −1 , (1)) and A3 = ({τ0 , α1 , α2 }; ◦, −1 , (1)),

with

τ0 : = (1), τ1 : = (12), α1 : = (123), α2 : = (132).

For the Cartesian product set, we let

γ00 : = ((1), (1)), γ01 : = ((1), (123)), γ02 : = ((1), (132)),

γ10 : = ((12), (1)), γ11 : = ((12), (123)), γ12 : = ((12), (132)).


Then we have

S2 × A3 = {γ00 , γ01 , γ02 , γ10 , γ11 , γ12 }.


The binary operation ◦ of the direct product is defined by the following
Cayley table:
◦ γ00 γ01 γ02 γ10 γ11 γ12

γ00 γ00 γ01 γ02 γ10 γ11 γ12


γ01 γ01 γ02 γ00 γ11 γ12 γ10
γ02 γ02 γ00 γ01 γ12 γ10 γ11
γ10 γ10 γ11 γ12 γ00 γ01 γ02
γ11 γ11 γ12 γ10 γ01 γ02 γ00
γ12 γ12 γ10 γ11 γ02 γ00 γ01

We now consider a direct product of two factors. In this case we have two
projection mappings, p1 and p2 , each of which has a kernel which is a congru-
ence relation on the product. We will show that these two kernels have some
special properties. We recall first the definition of the product (composition)
θ1 ◦ θ2 of two binary relations θ1 , θ2 on any set A:

θ1 ◦ θ2 := {(a, b) | ∃c ∈ A ((a, c) ∈ θ2 ∧ (c, b) ∈ θ1 )}.

Two binary relations θ1 , θ2 on A are called permutable, if θ1 ◦ θ2 = θ2 ◦ θ1 .

Lemma 4.1.4 Let A1 , A2 be two algebras of type τ and let A1 × A2 be their


direct product. Then:
(i) ker p1 ∧ ker p2 = ∆A1 ×A2 ;

(ii) ker p1 ◦ ker p2 = ker p2 ◦ ker p1 ;

(iii) ker p1 ∨ ker p2 = (A1 × A2 )2 .



Proof: (i) Since ker p1 and ker p2 are equivalence relations on A1 × A2 , the
relation ker p1 ∧ ker p2 is also an equivalence relation on A1 × A2 , with
∆A1 ×A2 ⊆ ker p1 ∧ ker p2 . Conversely, let (x, y) ∈ ker p1 ∧ ker p2 , with x =
(a1 , b1 ) and y = (a2 , b2 ). From (x, y) ∈ ker p1 we have

a1 = p1 ((a1 , b1 )) = p1 ((a2 , b2 )) = a2 .

From (x, y) ∈ ker p2 , we have

b1 = p2 ((a1 , b1 )) = p2 ((a2 , b2 )) = b2 .

Thus x = y and (x, y) ∈ ∆A1 ×A2 . This shows that

ker p1 ∧ ker p2 ⊆ ∆A1 ×A2 ,

and altogether we have

ker p1 ∧ ker p2 = ∆A1 ×A2 .

(ii) Assume that (x, y) ∈ (A1 × A2 )2 , with x = (a1 , b1 ) and y = (a2 , b2 )


for some a1 , a2 ∈ A1 and b1 , b2 ∈ A2 . Since

((a1 , b1 ), (a1 , b2 )) ∈ ker p1 and ((a1 , b2 ), (a2 , b2 )) ∈ ker p2 ,

we always have
((a1 , b1 ), (a2 , b2 )) ∈ ker p2 ◦ ker p1 ,
giving
(A1 × A2 )2 ⊆ ker p2 ◦ ker p1 .
The converse inclusion is true by definition, so we have

ker p2 ◦ ker p1 = (A1 × A2 )2 .

Similarly we can show that

ker p1 ◦ ker p2 = (A1 × A2 )2 ,

making ker p1 ◦ ker p2 = ker p2 ◦ ker p1 . This proves (ii).

(iii) Now we use the equality

ker p1 ◦ ker p2 = ker p2 ◦ ker p1 = (A1 × A2 )2



from (ii) to show that ker p1 ∨ ker p2 = (A1 × A2 )2 . For this we need the
well-known fact that for any two equivalence relations θ1 and θ2 , the equation

θ 1 ∨ θ2 = θ1 ◦ θ2

is satisfied iff θ1 and θ2 are permutable. This is because we always have
ker p1 ∨ ker p2 equal to

ker p1 ∪ (ker p2 ◦ ker p1 ) ∪ (ker p2 ◦ ker p1 ◦ ker p2 ) ∪ · · · ,

(see for instance Th. Ihringer, [58]), so when ker p1 ◦ker p2 = ker p2 ◦ker p1
we obtain

ker p1 ◦ ker p2 = (A1 × A2 )2 ⊆ ker p1 ∨ ker p2 .

Together with ker p1 ∨ ker p2 ⊆ (A1 × A2 )2 , this gives the desired


equality.

Thus any direct product of two factors produces two congruences with the
three special properties of Lemma 4.1.4. Conversely, the next theorem shows
that if we have two congruences on an algebra with these properties, we can
use them to write the algebra as a direct product of two factors.

Theorem 4.1.5 Let A be an algebra, and let θ1 , θ2 ∈ ConA be a pair of


congruence relations with the following properties:

(i) θ1 ∧ θ2 = ∆A ;

(ii) θ1 ∨ θ2 = A2 ;

(iii) θ1 ◦ θ2 = θ2 ◦ θ1 .

Then A is isomorphic to the direct product A/θ1 × A/θ2 , by an isomorphism

ϕ : A → A/θ1 × A/θ2

given by:
ϕ(a) = ([a]θ1 , [a]θ2 ), a ∈ A.
Proof: The given mapping ϕ is defined using the two natural homomor-
phisms, and is the unique map determined by them, as in Remark 4.1.2.
This makes ϕ a homomorphism, and we will show that it is also a bijection.
First, ϕ is injective: if ϕ(a) = ϕ(b), then [a]θ1 = [b]θ1 and [a]θ2 = [b]θ2 , so it

follows that (a, b) ∈ θ1 ∧ θ2 and a = b by (i). To see that the map ϕ is also
surjective, let (a, b) be any pair in A2 . Conditions (ii) and (iii) mean that
there exists an element c ∈ A with (a, c) ∈ θ1 and (c, b) ∈ θ2 , and therefore

([a]θ1 , [b]θ2 ) = ([c]θ1 , [c]θ2 ) = ϕ(c).
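For A = (Z6 ; +) the congruences θ1 (mod 2) and θ2 (mod 3) satisfy conditions (i) - (iii), and the isomorphism ϕ is the familiar Chinese-remainder decomposition Z6 ≅ Z2 × Z3 . A sketch (the example is chosen here; the composition follows the convention θ1 ◦ θ2 used in the text):

```python
A = range(6)
theta1 = {(a, b) for a in A for b in A if a % 2 == b % 2}   # congruence mod 2
theta2 = {(a, b) for a in A for b in A if a % 3 == b % 3}   # congruence mod 3
delta = {(a, a) for a in A}

# (i) theta1 meet theta2 = the diagonal
print(theta1 & theta2 == delta)                 # True

# (ii)/(iii): theta1 o theta2 (first theta2, then theta1) is already all of A^2
comp = {(a, b) for (a, c) in theta2 for (d, b) in theta1 if c == d}
print(comp == {(a, b) for a in A for b in A})   # True

# phi(a) = ([a]theta1, [a]theta2), represented as (a mod 2, a mod 3), is a bijection
phi = {a: (a % 2, a % 3) for a in A}
print(len(set(phi.values())) == 6)              # True: Z6 is isomorphic to Z2 x Z3
```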

This theorem shows that we may use certain congruences on an algebra A


to express A as a direct product of possibly smaller algebras. An algebra
A is called directly irreducible if it cannot be expressed in this way without
using A itself as one of the factors. We use |A| to denote the cardinality of
a set A.

Definition 4.1.6 An algebra A is called directly irreducible, if whenever
A ≅ B1 × B2 , either |B1 | = 1 or |B2 | = 1.

Corollary 4.1.7 An algebra A is directly irreducible iff (∆A , A × A) is the


only pair of congruence relations on A which satisfies the conditions (i) -
(iii) of Theorem 4.1.5.

Proof: Let A be directly irreducible and assume that θ1 , θ2 ∈ ConA satisfy
the three conditions of Theorem 4.1.5. Then A ≅ A/θ1 × A/θ2 , and the irreducibility means that one of the factors has cardinality one. Without loss of
generality, suppose that |A/θ1 | = 1. Then we must have θ1 = A × A, and
θ2 must equal ∆A by condition (i).

Assume now that conversely (∆A , A × A) is the only pair with the properties
(i) - (iii), and let A ≅ A1 × A2 . Then (∆A1 ×A2 , (A1 × A2 )2 ) is also the only
pair of congruence relations on A1 × A2 to satisfy conditions (i) - (iii). But
by Lemma 4.1.4, the kernels of the projection mappings p1 and p2 do satisfy
the three conditions. Therefore one of ker p1 or ker p2 must equal ∆A1 ×A2 ,
and thus one of A1 or A2 must have cardinality one.

4.2 Subdirect Products


There is another way to define a product of algebras, which is different from
the direct product.

Definition 4.2.1 Let (Aj )j∈J be a family of algebras of type τ . A subalgebra B ⊆ ∏j∈J Aj of the direct product of the algebras Aj is called
a subdirect product of the algebras Aj , if for every projection mapping
pk : ∏j∈J Aj → Ak we have

pk (B) = Ak .

Example 4.2.2 1. Every direct product is also a subdirect product.

2. For every algebra A the diagonal ∆A = {(a, a) | a ∈ A} is easily shown to


be the carrier set of a subalgebra ∆A of A × A. Moreover, p1 (∆A ) = p2 (∆A )
= A, so ∆A is a subdirect product of A × A.

3. Consider the two lattices C2 and C3 , chains on two and three elements,
respectively, shown below:

(Hasse diagrams: C2 is the two-element chain a < b, and C3 is the three-element chain 1 < 2 < 3.)

Their direct product is the lattice described by the diagram

•(b, 3)

(b, 2)• •(a, 3)

(b, 1)• •(a, 2)

•(a, 1)
.

The sublattice L ⊆ C2 × C3 which is described by the diagram



• (b, 3)

(b, 2) • • (a, 3)

• (a, 2)

• (a, 1)

is obviously a subdirect product of C2 and C3 .
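Subdirectness of this sublattice can be checked mechanically. Encoding a, b as 0 < 1 and using componentwise min and max for meet and join (an encoding chosen here), a sketch:

```python
from itertools import product

# C2 = {0 < 1}, C3 = {1 < 2 < 3}; meet and join are componentwise min and max
L = {(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)}   # the sublattice from the diagram

meet = lambda x, y: (min(x[0], y[0]), min(x[1], y[1]))
join = lambda x, y: (max(x[0], y[0]), max(x[1], y[1]))

# closure under meet and join: L is a subalgebra of C2 x C3
closed = all(meet(x, y) in L and join(x, y) in L
             for x, y in product(L, repeat=2))
# both projections are onto, so L is a subdirect product
onto = ({x[0] for x in L} == {0, 1}) and ({x[1] for x in L} == {1, 2, 3})
print(closed and onto)  # True
```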

Theorem 4.2.3 Let B be a subdirect product of the family (Aj )j∈J of algebras of type τ . Then the projection mappings pk : ∏j∈J Aj → Ak satisfy the
equation ⋂j∈J ker(pj |B) = ∆B .

Proof: Let (a, b) ∈ ⋂j∈J ker(pj |B). This implies that pk (a) = pk (b) for all
k ∈ J, so that every component of a agrees with the corresponding component of b. This means that a = b and thus (a, b) ∈ ∆B . Conversely, it is
clear that ∆B ⊆ ker(pk |B) for all pk .

As was the case for direct products, it turns out that this property of the
kernels of the projection mappings can be used to characterize subdirect
products, in the sense that any set of congruences on an algebra with these
properties can be used to express the algebra as a subdirect product.

Theorem 4.2.4 Let A be an algebra. Let {θj | j ∈ J} be a family of
congruence relations on A, which satisfy the equation ⋂j∈J θj = ∆A . Then
A is isomorphic to a subdirect product of the algebras A/θj , for j ∈ J.
In particular, the mapping ϕ(a) := ([a]θj | j ∈ J) defines an embedding
ϕ : A → ∏j∈J (A/θj ), whose image ϕ(A) is a subdirect product of the algebras
A/θj .
Proof: The map ϕ is the unique homomorphism determined by the natu-
ral homomorphism mappings, as in Remark 4.1.2. Also, ϕ is injective, since
ϕ(a) = ϕ(b) implies [a]θj = [b]θj and thus (a, b) ∈ θj for all j ∈ J. There-
fore (a, b) ∈ ∩j∈J θj = ∆A and so a = b. This proves the isomorphism of A
and ϕ(A). Moreover, if pk : ∏j∈J (A/θj ) → A/θk denotes the k-th projection
mapping, then by the definition of ϕ we have pk (ϕ(A)) = A/θk for all
k ∈ J. Therefore ϕ(A) is a subdirect product of the algebras A/θj .
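The embedding of Theorem 4.2.4 is easy to check mechanically on a small example. The following Python sketch is our own illustration (the encoding of the two congruences by block labels is ours, and the four-element set carries no operations): since the two relations intersect in the diagonal, the map a ↦ ([a]θ1 , [a]θ2 ) is injective, and each projection of its image is onto.

```python
# Illustration of Theorem 4.2.4 on a plain 4-element set (no operations).
# Each equivalence relation is encoded by the block label of every element.
A = {0, 1, 2, 3}
theta1 = {0: "A", 1: "A", 2: "B", 3: "B"}   # classes {0,1}, {2,3}
theta2 = {0: "C", 1: "D", 2: "C", 3: "D"}   # classes {0,2}, {1,3}

# The embedding a |-> ([a]theta1, [a]theta2):
phi = {a: (theta1[a], theta2[a]) for a in A}

# theta1 and theta2 intersect in the diagonal, so phi is injective ...
print(len(set(phi.values())) == len(A))                        # True
# ... and each projection of the image is onto the quotient set:
print({p[0] for p in phi.values()} == set(theta1.values()))    # True
print({p[1] for p in phi.values()} == set(theta2.values()))    # True
```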

We remark that the converse of this theorem is also true. If A is isomorphic
to a subdirect product of a family (Aj )j∈J of algebras, then there exists a
family of congruence relations on A whose intersection is the relation ∆A .
We leave the proof as an exercise for the reader.

In analogy to the definition of irreducible algebras in the direct product
case, we want to consider algebras which cannot be expressed as a subdirect
product of other smaller algebras, except in trivial ways.

Definition 4.2.5 An algebra A of type τ is called subdirectly irreducible, if
every family {θj | j ∈ J} of congruences on A, none of which is equal to ∆A ,
has an intersection which is different from ∆A . In this case, the conditions
of Theorem 4.2.4 are not satisfied, and no representation of A as a subdirect
product is possible.

Remark 4.2.6 It is easy to see that an algebra A is subdirectly irreducible
if and only if ∆A has exactly one upper neighbour or cover in the lattice
ConA of all congruence relations on A. Then the congruence lattice has the
form shown in the diagram below.

Example 4.2.7 1. Every simple algebra is subdirectly irreducible. Since an
arbitrary two-element algebra is simple, such algebras are always subdirectly
irreducible.

2. A three-element algebra which has no more than one congruence other
than A2 and ∆A is subdirectly irreducible.

• A2
  ⋮
• ∩(ConA \ {∆A })
|
• ∆A

We started this chapter looking at products as a way to produce larger and
more complicated algebras out of given algebras. But we have also looked
at this process in the other direction: we can try to express any algebra as a
product of certain “simpler” algebras, such as irreducible algebras. However,
direct products are not the best concept to use here, since not every algebra
is isomorphic to a direct product of directly irreducible algebras. Subdirect
products, on the other hand, do have the right property, as shown by the fol-
lowing theorem of G. Birkhoff ([8]). We present this important result without
proof.

Theorem 4.2.8 Every algebra is isomorphic to a subdirect product of sub-
directly irreducible algebras.

4.3 Exercises
4.3.1. Prove that if A is a finite algebra, then A is isomorphic to a direct
product of directly irreducible algebras.

4.3.2. Prove that every semilattice is isomorphic to a subdirect product of
copies of the two-element semilattice ({0, 1}; ∧). This means that (up to
isomorphism) the semilattice ({0, 1}; ∧) is the only subdirectly irreducible
semilattice.

4.3.3. Let A and B be algebras of the same type. Show that if A has a
one-element subalgebra, the direct product A × B has a subalgebra which is

isomorphic to B.

4.3.4. Prove that for any algebra A, the diagonal relation ∆A on the set A
is the universe of a subalgebra of the algebra A × A.

4.3.5. We can define an operator P on the class Alg(τ ) of all algebras of type
τ , as follows. For any class K ⊆ Alg(τ ), let P(K) be the class of all algebras
of type τ which are products of one or more algebras in K. Prove that this
operator P is extensive and monotone, but not idempotent, and hence is not
a closure operator on Alg(τ ). (This operator will be used again in Chapter
6.)

4.3.6. Prove that a finite abelian group is subdirectly irreducible iff it is a
cyclic group of prime power order.

4.3.7. Prove the claim made in the remark following Theorem 4.2.4.
Chapter 5

Terms, Trees, and Polynomials

In the previous chapters, we have studied four algebraic constructions on
algebras: formation of subalgebras, homomorphic images, quotient algebras
and product algebras. Now we begin another approach to the study of al-
gebras, the equational approach. We start by looking at terms and polyno-
mials in this chapter. In the following chapter we will use these concepts
to define equations and identities, and connect the algebraic and equational
approaches to algebras.

Terms and polynomials on an algebra A define special kinds of operations on
the base set A. We have been studying the properties of the fundamental
operations on A, for example that the fundamental operations are compatible
with congruences on A and preserve all subalgebras of A. But for a given
algebra A, there are other operations besides the fundamental operations
which have these nice properties. Any operation obtained by arbitrary com-
positions of the fundamental operations will have these properties, and it is
these operations which are called term operations. If we also allow the use
of constants in our arbitrary compositions, we obtain operations called poly-
nomial operations. Term operations and polynomial operations can also be
obtained starting from abstract terms and polynomials, respectively. Terms
and polynomials are important as a way to define identities satisfied by an
algebra.


5.1 Terms and Trees


In Section 1.2 we defined semigroups as algebras (G; ·) of type τ = (2) where
the associative law
x · (y · z) ≈ (x · y) · z

is satisfied. Satisfaction of this law means that for all elements x, y, z ∈ G,


the equation x · (y · z) = (x · y) · z holds. To write this equation, we need
the symbols x, y and z. But these symbols are not themselves elements of
G; they are only symbols for which elements from G may be substituted.
Such symbols are called variables. To write identities or laws of an algebra
we need a language, which must include such variables as well as symbols to
represent the operations. In the associative law above we did not distinguish
between the operation and the symbol used to denote it, using · for both, but
in our new formal language we shall usually make such a distinction. That
is, we will have formal operation symbols distinct from concrete operations
on a set.

Now we proceed to define this formal language in the general setting. Let
n ≥ 1 be a natural number. Let Xn = {x1 , . . . , xn } be an n-element set. The
set Xn is called an alphabet and its elements are called variables. We also
need a set {fi |i ∈ I} of operation symbols, indexed by the set I. The sets Xn
and {fi |i ∈ I} have to be disjoint. To every operation symbol fi we assign
a natural number ni ≥ 1, called the arity of fi . As in the definition of an
algebra, the sequence τ = (ni )i∈I of all the arities is called the type of the
language. With this notation for operation symbols and variables, we can
define the terms of our type τ language.

Definition 5.1.1 Let n ≥ 1. The n-ary terms of type τ are defined in the
following inductive way:

(i) Every variable xi ∈ Xn is an n-ary term.

(ii) If t1 , . . . , tni are n-ary terms and fi is an ni -ary operation symbol, then
fi (t1 , . . . , tni ) is an n-ary term.

(iii) The set Wτ (Xn ) = Wτ (x1 , . . . , xn ) of all n-ary terms is the smallest
set which contains x1 , . . ., xn and is closed under finite application of
(ii).

Remark 5.1.2 1. It follows immediately from the definition that every n-ary
term is also k-ary, for k > n.

2. Our definition does not allow nullary terms. This could be changed by
adding a fourth condition to the inductive definition, stipulating that every
nullary operation symbol of our type is an n-ary term. We could also extend
our language to include a third set of symbols, to be used as constants or
nullary terms; we shall explore this approach later, in Section 5.3.

Example 5.1.3 Let τ = (2), with one binary operation symbol f . Let X2 =
{x1 , x2 }. Then f (f (x1 , x2 ), x2 ), f (x2 , x1 ), x1 , x2 and f (f (f (x1 , x2 ), x1 ), x2 )
are binary terms. The expression f (f (x3 , f (x1 , x2 )), x4 ) is a quaternary or
4-ary term, but f (f (x1 , x2 ), x3 is not a term (one bracket is missing).

Example 5.1.4 Let τ = (1), with one unary operation symbol f . Let X1 =
{x1 }. Then the unary terms of this type are x1 , f (x1 ), f (f (x1 )), f (f (f (x1 ))),
and so on. Note that W(1) (X1 ) is infinite. In a specific application such as
group theory, we might denote our unary operation by −1 instead of f ,
writing our terms as x1 , x1−1 , (x1−1 )−1 , etc. In the group theory case we
might want to consider the terms x1 and (x1−1 )−1 as equal. But such an
equality depends on a specific application, and does not hold in the most
general sense that we are defining here. Thus our terms are often called
the “absolutely free” terms, in the sense that we make no restrictions or
assumptions about the properties of our operation symbols, beyond their
arity as specified in the type.

An important feature of our definition of terms, Definition 5.1.1, is that it
is an inductive definition, based on the number of occurrences of operation
symbols in a term. This number is sometimes called the complexity of a
term, and many of our proofs in this chapter will proceed by induction on
the complexity of terms. This means that in order to prove that the set of all
terms has a certain property, it will suffice to prove that the variable terms
have the property and that if the terms t1 , . . ., tni have the property, then
so does the compound term fi (t1 , . . . , tni ).

There are various methods used to measure the complexity of a term, be-
sides the number of operation symbols which occur in it. Another common
measure is what is called the depth of the term, defined by the following
steps:

(i) depth(t) = 0 if t = xi is a variable,

(ii) depth(t) = max{depth(t1 ), . . . , depth(tni )} + 1 if t = f (t1 , . . . , tni ).

In Computer Science it is very common to illustrate terms by tree diagrams,


called semantic trees. The semantic tree of the term t is defined as follows:

(i) If t = xi , then the semantic tree of t consists only of one vertex which
is labelled with xi , and this vertex is called the root of the tree.

(ii) If t = fi (t1 , . . . , tni ) then the semantic tree of t has as its root a vertex
labelled with fi , and has ni edges which are incident with the vertex
fi ; each of these edges is incident with the root of the corresponding
term t1 , . . . , tni (ordered by 1 ≤ 2 ≤ · · · ≤ ni , starting from the left).

Consider for example the type τ = (2, 1) with a binary operation symbol f2
and a unary operation symbol f1 , and variable set X3 = {x1 , x2 , x3 }. Then
the term t = f2 (f1 (f2 (f2 (x1 , x2 ), f1 (x3 ))), f1 (f2 (f1 (x1 ), f1 (f1 (x2 ))))) corre-
sponds to the semantic tree shown below.

f2
├─ f1
│  └─ f2
│     ├─ f2
│     │  ├─ x1
│     │  └─ x2
│     └─ f1
│        └─ x3
└─ f1
   └─ f2
      ├─ f1
      │  └─ x1
      └─ f1
         └─ f1
            └─ x2

(Here the root f2 is drawn at the top, with subtrees ordered from left to right.)

Notice that semantic trees are ordered in such a way that we start with
the first variable on the left hand side which occurs in a term t. We can
also describe, for any node or vertex in a tree, the path from the root of
the tree to that node or vertex. That is, to any node or vertex of a term
t = fi (t1 , . . . , tni ), we can assign a sequence (or word) over the set N+ of
positive integers, as follows. Suppose the outermost term is ni -ary. The root

of the tree is labelled by the symbol e, which we call the empty sequence.
The vertices on the second level up are labelled by 1, . . . , ni , from the left
to the right. Continuing in this way, we assign to each branch of the tree a
sequence on N+ . For instance, the tree from our example above is labelled
as shown below.

e: f2
├─ 1: f1
│  └─ 11: f2
│     ├─ 111: f2
│     │  ├─ 1111: x1
│     │  └─ 1112: x2
│     └─ 112: f1
│        └─ 1121: x3
└─ 2: f1
   └─ 21: f2
      ├─ 211: f1
      │  └─ 2111: x1
      └─ 212: f1
         └─ 2121: f1
            └─ 21211: x2

We can think of the words we assign to each branch as determining an “ad-


dress” for every vertex or node of a tree. This notation is convenient for
encoding various operations we can perform on trees. For instance, if t is a
tree and u is such an address, then we denote by t/u the subtree of t which
starts with the vertex labelled by the sequence u. If s is another tree, then
we denote by t[u/s] the tree obtained from t by replacing the subtree t/u by
the tree s.

We shall make use of semantic trees in Chapter 7, when we consider term


rewriting systems. For now, we return to the main development of the theory
of terms. Let τ be a fixed type. Let X be the union of all the sets Xn of
variables, so X = {x1 , x2 , . . .}.
We denote by Wτ (X) the set of all terms of type τ over the countably infinite
alphabet X:

Wτ (X) = ∪n≥1 Wτ (Xn ).

Now we want to use this set Wτ (X) as the universe of some algebra, of the
same type τ . What operations can we perform on these terms? In fact, for

every i ∈ I we can define an ni -ary operation f¯i on Wτ (X), with

f¯i : Wτ (X)ni → Wτ (X) defined by (t1 , . . . , tni ) 7→ fi (t1 , . . . , tni ).

Note the distinction here between the concrete operation f¯i being defined
on the set of all terms, and the formal operation symbol fi , used in the for-
mation of terms. The second step of Definition 5.1.1 shows that the element
fi (t1 , . . . , tni ) in our definition belongs to Wτ (X), and so our operation f¯i
is well defined. In this way we make Wτ (X) into the universe of an algebra
of type τ = (ni )i∈I , since for every operation symbol fi we have a concrete
operation f¯i on Wτ (X).

Definition 5.1.5 The algebra Fτ (X) := (Wτ (X); (f¯i )i∈I ) is called the term
algebra, or the absolutely free algebra, of type τ over the set X.

The following result is an easy consequence of our inductive definition of
terms.

Lemma 5.1.6 For any type τ , the term algebra Fτ (X) is generated by the
set X.

Proof: Definition 5.1.1 (i) shows that X ⊆ Wτ (X), and 5.1.1 (ii) gives
hXiFτ (X) = Fτ (X).

Instead of Fτ (X), we could also consider the algebra Fτ (Xn ) :=
(Wτ (Xn ); (f¯i )i∈I ), where now the f¯i are the restrictions of the operations
defined on Wτ (X) to the subset Wτ (Xn ). By Definition 5.1.1 these restrictions
are also operations defined on Wτ (Xn ). This algebra is called the absolutely
free algebra or the term algebra of type τ over the set Xn of n generators.
As in Lemma 5.1.6, the algebra Fτ (Xn ) is generated by the set Xn .

As we mentioned in Example 5.1.4, the term algebra defined here is the
“absolutely free” one, in the sense that we make no assumptions about the
operation symbols other than their arities. The phrase “absolutely free” is
also used in a more technical sense, as described in the following theorem.

Theorem 5.1.7 For every algebra A ∈ Alg(τ ) and every mapping f : X →
A, there exists a unique homomorphism fˆ : Fτ (X) → A which extends the
mapping f and such that fˆ ◦ ϕ = f , where ϕ : X → Fτ (X) is the embedding
of X in Fτ (X). This homomorphism makes the following diagram commute.

             f
       X ────────→ A
        \         ↗
       ϕ \       / fˆ
          ↘     /
           Fτ (X)

Proof: We define fˆ in the following way:

    fˆ(t) := f (t)                            if t = x ∈ X is a variable,
    fˆ(t) := fiA (fˆ(t1 ), . . . , fˆ(tni ))    if t = fi (t1 , . . . , tni ).

Here fiA is the fundamental operation of the algebra A which corresponds to
the operation symbol fi . This definition is inductive, in that we are assuming
in the second condition that each fˆ(tj ), for 1 ≤ j ≤ ni , is already defined.
Clearly, fˆ(f¯i (t1 , . . . , tni )) = fˆ(fi (t1 , . . . , tni )) = fiA (fˆ(t1 ), . . . , fˆ(tni )) shows
that fˆ is a homomorphism, and it follows directly from the definition that
fˆ extends f .

We have so far defined term algebras on the finite sets Xn and the countably
infinite set X, using variable symbols x1 , x2 , x3 , . . . . However, it should
be clear that we could start with any non-empty set Y of symbols, of any
cardinality, and carry out the same process. We define terms on the set
Y inductively just as in Definition 5.1.1, with all the variables in Y being
terms and then any result of applying the operation symbols fi to terms
giving terms. Then we can form the set Wτ (Y ) of all terms of type τ over Y ,
and make it into an algebra Fτ (Y ) generated by Y which has the analogous
freeness property of Theorem 5.1.7. Thus we have a free algebra over any set
of symbols of any cardinality, although we shall see in Chapter 6 that in some
sense the sets of finite or countably infinite cardinality are sufficient for our
purposes. Moreover, the next theorem shows that for a fixed cardinality, one
may use any choice of variable symbols. For example, the reader may have
noticed that in our formal language we have variables xi for i ≥ 1, while in
our example with the associative law at the beginning of this section we used
variables x, y and z. That it is justified to make such a change of variable
symbols, where convenient, follows from the following theorem:

Theorem 5.1.8 Let Y and Z be alphabets with the same cardinality. Then
the term algebras Fτ (Y ) and Fτ (Z) are isomorphic.
Proof: When |Y | = |Z| there exists a bijection ϕ : Y → Z. Since
Fτ (Z) ∈ Alg(τ ), by Theorem 5.1.7 we can extend ϕ to a homomor-
phism ϕ̂ : Fτ (Y ) → Fτ (Z). Now using Theorem 5.1.7 again on the
mapping ϕ−1 : Z → Y gives a homomorphism (ϕ−1 )ˆ : Fτ (Z) → Fτ (Y ).
We will show by induction on the complexity of the term t
that (ϕ−1 )ˆ ◦ ϕ̂ = idWτ (Y ) and ϕ̂ ◦ (ϕ−1 )ˆ = idWτ (Z) . This will prove
that ϕ̂ is an isomorphism, with (ϕ−1 )ˆ as its inverse. The claim is clear
for the base case that t = x is a variable. Now assume that t =
fi (t1 , . . . , tni ), and that the claim is true for the terms t1 , . . ., tni . Then we
have ((ϕ−1 )ˆ ◦ ϕ̂)(t) = (ϕ−1 )ˆ(ϕ̂(t)) = (ϕ−1 )ˆ(fi^Fτ (Z) (ϕ̂(t1 ), . . . , ϕ̂(tni )))
= fi^Fτ (Y ) (((ϕ−1 )ˆ ◦ ϕ̂)(t1 ), . . . , ((ϕ−1 )ˆ ◦ ϕ̂)(tni )) = fi^Fτ (Y ) (t1 , . . . , tni )
= fi (t1 , . . . , tni ) = t. The proof for ϕ̂ ◦ (ϕ−1 )ˆ is similar.

5.2 Term Operations


Terms are formal expressions on our formal language of type τ . In order to
formulate statements using terms which are true or false in a given algebra
A, we have to evaluate the variables in the terms by elements of the concrete
set A, and we have to interpret the operation symbols by concrete operations
on this set. It is this process which produces term operations from terms.
We shall continue to denote by X the countably infinite set {x1 , x2 , x3 , . . .}
of variables.

Definition 5.2.1 Let A be an algebra of type τ and let t be an n-ary term
of type τ over X. Then t induces an n-ary operation tA on A, called the term
operation induced by the term t on the algebra A, via the following steps:

(i) If t = xj ∈ Xn , then tA = xjA = ej^n,A ; here ej^n,A is the n-ary
    projection on A defined by ej^n,A (a1 , . . . , an ) = aj for all a1 , . . . , an ∈ A.

(ii) If t = fi (t1 , . . . , tni ) is an n-ary term of type τ , and t1A , . . . , tniA
     are the term operations which are induced by t1 , . . . , tni , then
     tA = fiA (t1A , . . . , tniA ).

In part (ii) of this definition, the right hand side of the equation refers to
the composition or superposition of operations, so that

tA (a1 , . . . , an ) := fiA (t1A (a1 , . . . , an ), . . . , tniA (a1 , . . . , an )),

for all a1 , . . . , an ∈ A.

Since any concrete operation on a set has an arity attached to it, if we


want to induce such an operation from a term, we must also have an arity
attached to the term. It is for this reason that our definition of terms begins
in Definition 5.1.1 with terms of each fixed arity n.
Roughly speaking, if t is an n-ary term and A is an algebra of type τ , then
to obtain the operation tA we substitute elements from the set A for the
variables occurring in t, and interpret the operation symbols fi for i ∈ I as
the corresponding fundamental operations fiA . More precisely, we map the
variables xj occurring in t to elements a1 , . . . , an of A by a function f : X →
A, and then our term operation tA is the unique extension fˆ : Fτ (X) → A
from Theorem 5.1.7. We will denote by Wτ (Xn )A the set of all n-ary term
operations of the algebra A, and by Wτ (X)A the set of all (finitary) term
operations on A.

There is another way to obtain the set Wτ (X)A of all term operations on A,
using clone operations. Using A as our base set, we consider the set O(A) of
all finitary operations on A. Recall from Example 1.2.14 that the set O(A)
is closed under a composition operation

On (A) × (Om (A))n → Om (A),

and contains all the projection operations on A. This makes O(A) a clone
on the set A, called the full clone on A; any subset of O(A) which contains
the projections and is also closed under composition is called a subclone of
O(A), or a clone on A.

Definition 5.2.2 Let C ⊆ O(A) be a set of operations on a set A. Then the
clone generated by C, denoted by hCi, is the smallest subset of O(A) which
contains C, is closed under composition, and contains all the projections
ei^n,A : An → A for arbitrary n ≥ 1 and 1 ≤ i ≤ n.
Then we have the following connection to our term algebras.

Theorem 5.2.3 Let A = (A; (fiA )i∈I ) be an algebra of type τ , and let
Wτ (X) be the set of all terms of type τ over X. Then Wτ (X)A is a clone
on A, called the term clone of A. Moreover, the clone Wτ (X)A is generated
by the set of all fundamental operations of the algebra A. That is, Wτ (X)A
= h{fiA | i ∈ I}i.

Proof: We prove first that Wτ (X)A is indeed a clone. Since xi ∈ Xn , we
have xiA = ei^n,A ∈ Wτ (X)A for all n ≥ 1. Thus Wτ (X)A contains all the
projections. Now let f A , g1A , . . ., gnA be in Wτ (X)A , with f A n-ary and
g1A , . . ., gnA each m-ary. Then f (x1 , . . . , xn ) and the gi (x1 , . . . , xm ), for
1 ≤ i ≤ n, are terms which induce the term operations f A , g1A , . . ., gnA
respectively. But then f (g1 (x1 , . . . , xm ), . . . , gn (x1 , . . . , xm )) is also a term,
and the induced term operation is f A (g1A , . . . , gnA ) ∈ Wτ (X)A . Therefore,
Wτ (X)A is closed under composition of operations, and is a clone.

Clearly, {fiA | i ∈ I} ⊆ Wτ (X)A , and so h{fiA | i ∈ I}i ⊆ Wτ (X)A . We will
show the converse inclusion by induction on the complexity of a term t.
If t is a variable xi ∈ Xn , then the induced term operation is a projection
which belongs to the clone h{fiA | i ∈ I}i. If t = fi (t1 , . . . , tni ) and we
assume that t1A , . . ., tniA ∈ h{fiA | i ∈ I}i, then fiA (t1A , . . . , tniA ) = tA ∈
h{fiA | i ∈ I}i since h{fiA | i ∈ I}i is a clone. Therefore h{fiA | i ∈ I}i ⊇
Wτ (X)A , and altogether we have equality.

We point out that Theorem 5.2.3 is no longer true in the case of partial
algebras, that is algebras in which the fundamental operations fi are not
totally defined on A.

As we remarked in the introduction to this chapter, an important feature
of term operations of an algebra is that they have many of the same useful
properties the fundamental operations of the algebra do, with respect to
subalgebras, homomorphisms and congruence relations.

Theorem 5.2.4 Let A be an algebra of type τ and let tA be the n-ary term
operation on A induced by the n-ary term t ∈ Wτ (X).

(i) If B is a subalgebra of A, then tA (b1 , . . . , bn ) ∈ B, for all b1 , . . . , bn ∈ B.

(ii) If B is an algebra of type τ and ϕ : A → B is a homomorphism, then
for all a1 , . . ., an in A,

ϕ(tA (a1 , . . . , an )) = tB (ϕ(a1 ), . . . , ϕ(an )).



(iii) If θ is a congruence relation on A, then for all pairs (a1 , b1 ), . . .,
(an , bn ) in θ, we have (tA (a1 , . . . , an ), tA (b1 , . . . , bn )) ∈ θ.
Proof: For (i) and (ii) we give a proof by induction on the complexity of
the term t ∈ Wτ (Xn ). If t is a variable xi , for 1 ≤ i ≤ n, then

xiA (b1 , . . . , bn ) = ei^n,A (b1 , . . . , bn ) = bi ∈ B and
ϕ(ei^n,A (a1 , . . . , an )) = ϕ(ai ) = ei^n,B (ϕ(a1 ), . . . , ϕ(an )).

Inductively, let t = fi (t1 , . . . , tni ) and assume that (i) and (ii) are
satisfied for the term operations t1A , . . . , tniA . Then tA (b1 , . . . , bn ) =
fiA (t1A , . . . , tniA )(b1 , . . . , bn ) = fiA (t1A (b1 , . . . , bn ), . . . , tniA (b1 , . . . , bn )),
and this is in B since all the t1A (b1 , . . . , bn ), . . . , tniA (b1 , . . . , bn ) are in B
and B is a subalgebra of A.

For homomorphisms, we have

ϕ(tA (a1 , . . . , an ))
= ϕ(fiA (t1A , . . . , tniA )(a1 , . . . , an ))
= ϕ(fiA (t1A (a1 , . . . , an ), . . . , tniA (a1 , . . . , an )))
= fiB (ϕ(t1A (a1 , . . . , an )), . . . , ϕ(tniA (a1 , . . . , an )))
= fiB (t1B (ϕ(a1 ), . . . , ϕ(an )), . . . , tniB (ϕ(a1 ), . . . , ϕ(an )))
= fiB (t1B , . . . , tniB )(ϕ(a1 ), . . . , ϕ(an ))
= tB (ϕ(a1 ), . . . , ϕ(an )).

This shows that (i) and (ii) are satisfied.

(iii) Let θ be a congruence on A, and let (a1 , b1 ), . . ., (an , bn ) be in
θ. We know from Section 3.1 that θ is the kernel of the corresponding
natural homomorphism, natθ : A → A/θ. So (a1 , b1 ) ∈ θ, . . .,
ing natural homomorphism, nat θ : A → A/θ. So (a1 , b1 ) ∈ θ, . . .,
(an , bn ) ∈ θ means (a1 , b1 ) ∈ ker natθ, . . ., (an , bn ) ∈ ker natθ, and then
natθ(a1 ) = natθ(b1 ), . . ., natθ(an ) = natθ(bn ). Using (ii) on this homo-
morphism, we obtain natθ(tA (a1 , . . . , an )) = tA/θ (natθ(a1 ), . . . , natθ(an )) =
tA/θ (natθ(b1 ), . . . , natθ(bn )) = natθ(tA (b1 , . . . , bn )). From this we conclude
that (tA (a1 , . . . , an ), tA (b1 , . . . , bn )) ∈ θ.

5.3 Polynomials and Polynomial Operations


In this section we define polynomials, with corresponding polynomial oper-
ations on an algebra. Like terms, polynomials are expressions in a formal

language, composed inductively from variables and operation symbols; the
difference is that for polynomials we are also allowed to use constants. Thus
we introduce a third set of objects, the set of constants, to our language.

There is one subtlety here that the reader should notice. We are defining
terms and polynomials in this chapter in a formal or general way, based only
on a type, and not on any specific algebra. Thus we should have one set of
constants, to be used in the formation of all polynomials of the given type.
However, when we consider the induced polynomial operations on a given
algebra A, we usually want the constants in our polynomials to represent
specific elements of our base set A. There are several ways to deal with this
obstacle. We will proceed by fixing one set A of constant symbols to be used
for all polynomials. Another approach is to associate to every algebra A of
type τ a corresponding set A of constants, which has the same cardinality as
the universe set A. This approach was used by Denecke and Leeratanavalee
in [30] and [31].

Let A be our set of constant symbols, pairwise disjoint from both the set X of
variables and the set {fi | i ∈ I} of operation symbols. We define polynomials
of type τ over A (for short, polynomials) via the following inductive steps:

(i) If x ∈ X, then x is a polynomial.

(ii) If a ∈ A, then a is a polynomial.

(iii) If p1 , . . . , pni are polynomials and fi is an ni -ary operation symbol,


then fi (p1 , . . . , pni ) is a polynomial.

(iv) The set Pτ (X, A) of all polynomials of type τ over A is the smallest
set which contains X ∪ A and is closed under finite application of (iii).

Much of the work we did in Section 5.1 for terms can now be carried out
for polynomials. We can define the polynomial algebra Pτ (X, A) of type τ
over A, generated by X ∪ A, and prove results similar to 5.1.6 and 5.1.7. We
leave the verification as an exercise for the reader.
The next step is to make polynomial operations over an algebra A out of
our formal polynomials. We proceed as for terms, with the addition that
we interpret the constant symbols from A by elements selected from A as
nullary operations. In this case we assume that |A| ≥ |A|, and consider a
subset A1 ⊆ A with |A1 | = |A|. Then just as for terms we obtain for every

polynomial p of type τ over A an induced polynomial operation pA , induced
by the algebra A. Let Pτ (X, A)A be the set of all polynomial operations
produced in this way. We have the following analogue of Theorem 5.2.3:

Theorem 5.3.1 Pτ (X, A)A is a clone, and is generated by the set {fiA |i ∈
I} ∪ {ca |a ∈ A}, where ca is the nullary operation which selects a ∈ A. We
write Pτ (X, A)A = h{fiA |i ∈ I} ∪ {ca |a ∈ A}i.
We leave it as an exercise for the reader to prove that every polynomial
operation of an algebra A is compatible with any congruence relation on A.
In Theorem 1.4.5 we saw that an equivalence relation on an algebra A is a
congruence iff it is compatible with all translations on the algebra. Since such
translations are in fact just unary polynomial operations on the algebra, we
have the following useful result.

Lemma 5.3.2 An equivalence relation θ on an algebra A is a congruence
relation on A iff θ is compatible with all unary polynomial operations on A.
In Remark 1.4.9 we gave a characterization of the congruence on A generated
by a binary relation on the set A. This characterization can also be rephrased
in terms of unary polynomial operations. We shall make use of the following
result in Chapters 11 and 12.

Lemma 5.3.3 Let A be an algebra and let % be a binary relation on the
set A, so that % ⊆ A2 . Assume that % is reflexive and symmetric on A. Let
h%iConA be the congruence generated by %. Then (u, v) ∈ h%iConA if and only
if there are pairs (a1 , b1 ), . . ., (ak , bk ) ∈ % and unary polynomial operations
p1A , . . ., pkA of A such that

u = p1A (a1 ),
piA (bi ) = pi+1A (ai+1 ) for 1 ≤ i < k,
pkA (bk ) = v.

Proof: We define a set θ, by

θ := {(u, v) | u, v ∈ A and ∃k ∈ N and ∃(a1 , b1 ), . . . , (ak , bk ) ∈ % and
∃ unary polynomial operations p1A , . . . , pkA of A such that u = p1A (a1 ),
piA (bi ) = pi+1A (ai+1 ) for 1 ≤ i < k, and pkA (bk ) = v}.

Then θ is an equivalence relation with % ⊆ θ ⊆ h%iConA . If we can show that
θ is compatible with all unary polynomial operations of A, then by Lemma
5.3.2 the relation θ is a congruence, and therefore θ = h%iConA . Let pA be
a unary polynomial operation of A and let (u, v) ∈ θ. Then by definition of
θ there is a natural number k, there are elements (a1 , b1 ), . . . , (ak , bk ) ∈ %
and there are unary polynomial operations p1A , . . ., pkA of A such that
u = p1A (a1 ), piA (bi ) = pi+1A (ai+1 ) for 1 ≤ i < k and pkA (bk ) = v. Then
we have also pA (u) = pA (p1A (a1 )), pA (piA (bi )) = pA (pi+1A (ai+1 )) for
1 ≤ i < k and pA (pkA (bk )) = pA (v). Since the composition of two unary
polynomial operations of A is again a unary polynomial operation of A, we
have (pA (u), pA (v)) ∈ θ, and θ is a congruence relation on A.

5.4 Exercises
5.4.1. Let L = ({a, b, c, d}; ∧L , ∨L ) be a lattice, with operations ∧L and ∨L
given by the following Cayley tables:

∧L  a b c d        ∨L  a b c d
a   a a a a        a   a b c d
b   a b a b        b   b b d d
c   a a c c        c   c d c d
d   a b c d        d   d d d d

Let h be the binary operation on the set L given by the Cayley table:

h a b c d
a a b a b
b b b b b
c a b a b
d b b b b

Is h a term operation or a polynomial operation on L?

5.4.2. Let L be the lattice from Exercise 5.4.1. Let f be a function f :
{x, y, z} → {a, b, c, d}, with x 7→ b, y 7→ c, z 7→ c. Let Y be the set {x, y, z}.
By Theorem 5.1.7, there is a unique extension fˆ of f to the term algebra
Fτ (Y ). Calculate fˆ(t) for the terms t = x ∧L y and t = (x ∨L y) ∧L z.

5.4.3. Determine all the term operations and all the polynomial operations
of the algebra (N; ¬), where ¬x := x + 1.

5.4.4. Show that if Y and Z are non-empty sets with |Y | ≤ |Z|, then the
algebra Fτ (Y ) can be embedded in Fτ (Z) in a natural way. (One algebra
can be embedded in another if the second contains an isomorphic copy of
the first.)

5.4.5. Prove Theorem 5.3.1.

5.4.6. Prove that the polynomial algebra of type τ over A, from Theorem
5.3.1, satisfies properties similar to those of Lemma 5.1.6 and Theorem 5.1.7.

5.4.7. Prove that all polynomial operations on an algebra A are compatible
with all congruence relations on A. That is, prove that for θ a congruence
on A and pA a polynomial operation, if (a1 , b1 ), . . . , (an , bn ) ∈ θ, then
(pA (a1 , . . . , an ), pA (b1 , . . . , bn )) ∈ θ.
Chapter 6

Identities and Varieties

Our motivation for defining terms and polynomials was to use them to de-
fine equations and identities. An equation is a statement of the form t1 ≈ t2 ,
where t1 and t2 are terms. We will define what it means for such an equa-
tion to be satisfied, or to be an identity, in an algebra A. The relation of
satisfaction, of an equation by an algebra, will give us a Galois-connection
between sets of equations and classes of algebras, and allow us to consider
classes of algebras which are defined by sets of equations. Finally, we show
that such equational classes, or model classes, are precisely the same classes
of algebras as those we are interested in from the algebraic approach of the
first four chapters.

6.1 The Galois Connection (Id, Mod)


We begin by defining formally what is meant by satisfaction of an identity by
an algebra. Recall that we have a fixed set X = {x1 , x2 , x3 , . . .} of variable
symbols.

Definition 6.1.1 An equation of type τ is a pair of terms (s, t) from Wτ (X);
such pairs are more commonly written as s ≈ t. Such an equation s ≈ t is
said to be an identity in the algebra A of type τ if sA = tA , that is, if the
term operations induced by s and t on the algebra A are equal. In this case
we also say that the equation s ≈ t is satisfied or modelled by the algebra A,
and we write A |= s ≈ t.
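On a finite algebra, the condition A |= s ≈ t can be checked directly, by running over all assignments of values to the variables. The following Python sketch is our own illustration (not the book's notation): terms are nested tuples whose head is an operation symbol, and variables are strings.

```python
from itertools import product

def variables(term):
    """The set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))

def evaluate(term, assignment, ops):
    """The induced term operation: evaluate `term` under an assignment."""
    if isinstance(term, str):
        return assignment[term]
    f, *args = term
    return ops[f](*(evaluate(arg, assignment, ops) for arg in args))

def satisfies(universe, ops, s, t):
    """A |= s ≈ t : s and t induce the same term operation on A."""
    xs = sorted(variables(s) | variables(t))
    return all(evaluate(s, dict(zip(xs, vals)), ops)
               == evaluate(t, dict(zip(xs, vals)), ops)
               for vals in product(universe, repeat=len(xs)))

# the algebra (Z3; +) satisfies x + y ≈ y + x but not x + x ≈ x:
ops = {"+": lambda x, y: (x + y) % 3}
assert satisfies([0, 1, 2], ops, ("+", "x", "y"), ("+", "y", "x"))
assert not satisfies([0, 1, 2], ops, ("+", "x", "x"), "x")
```

The cost grows as |A| raised to the number of variables, so this is feasible only for small finite algebras; but it makes the quantifier "for every assignment" in the definition completely concrete.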

Remark 6.1.2 Let A be an algebra and s ≈ t an equation of type τ . Let
Xn = {x1 , . . . , xn } be the set of variables occurring in the equation s ≈ t.
Recall from Chapter 5 that every map f : Xn → A has a unique extension
fˆ : Fτ (Xn ) → A to the free term algebra on Xn . Then the equation sA
= tA means that for every mapping f : Xn → A, we have fˆ(s) = fˆ(t) or
(s, t) ∈ ker fˆ. That is, the pair (s, t) must belong to the intersection of the
kernels of all these mappings fˆ. Thus an identity s ≈ t holds in an algebra
A (or in a class K of algebras), iff (s, t) is in the intersection of the kernels
of fˆ, for every map f : X → A (for every algebra A in K). We shall make
frequent use in this chapter of this characterization of identities.

We now consider the class Alg(τ ) of all algebras of type τ , and the class
Wτ (X) × Wτ (X) of all equations of type τ . Satisfaction of an equation by an
algebra gives us a fundamental relation between these two sets. Formally, we
have the relation |= of all pairs (A, s ≈ t) for which A |= s ≈ t. As discussed
in Section 2.2, this relation induces a Galois-connection between Alg(τ ) and
Wτ (X) × Wτ (X). We will use the names Id and M od for the two associated
mappings. That is, for any subset Σ ⊆ Wτ (X) × Wτ (X) and any subclass
K ⊆ Alg(τ ) we define

M odΣ := {A ∈ Alg(τ ) | ∀s ≈ t ∈ Σ, (A |= s ≈ t)} and

IdK := {s ≈ t ∈ Wτ (X)2 | ∀A ∈ K, (A |= s ≈ t)}.


Then the pair (Id, M od) is the Galois-connection induced by the satisfaction
relation |=. This gives us the various properties, common to any Galois-
connection, which we proved in Theorem 2.2.4. Since these properties will
be needed for our further work in this chapter, we list them again here for
our new example of a Galois-connection.
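The shape of this Galois-connection can be seen on finite toy data: take any finite relation in place of |= and form its two polar maps. In the sketch below (ours; the "algebras" and "equations" are mere labels, an assumption made for size) Mod and Id are the polar maps, and closure properties such as Σ ⊆ IdModΣ can be checked directly.

```python
def polars(relation, left, right):
    """The two polar maps of the Galois connection induced by `relation`
    (standing in for Mod and Id induced by the satisfaction relation)."""
    def mod(sigma):   # all 'algebras' satisfying every equation in sigma
        return {a for a in left if all((a, e) in relation for e in sigma)}
    def ident(k):     # all 'equations' holding in every member of k
        return {e for e in right if all((a, e) in relation for a in k)}
    return mod, ident

# a toy satisfaction relation between three "algebras" and three "equations"
ALG = {"A1", "A2", "A3"}
EQ = {"e1", "e2", "e3"}
SAT = {("A1", "e1"), ("A1", "e2"), ("A2", "e1"), ("A3", "e3")}
Mod, Id = polars(SAT, ALG, EQ)

sigma = {"e1"}
assert Mod(sigma) == {"A1", "A2"}
assert sigma <= Id(Mod(sigma))            # Σ ⊆ Id Mod Σ
assert Mod(Id(Mod(sigma))) == Mod(sigma)  # Mod Id Mod = Mod
```

These are exactly the extensivity and closure properties collected in Theorem 6.1.3; nothing in the computation uses anything beyond the bare relation.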

Theorem 6.1.3 Let τ be a fixed type.

(i) For all subsets Σ and Σ′ of Wτ (X) × Wτ (X), and for all subclasses K
and K′ of Alg(τ ), we have
Σ ⊆ Σ′ ⇒ M odΣ ⊇ M odΣ′ and K ⊆ K′ ⇒ IdK ⊇ IdK′;

(ii) For all subsets Σ of Wτ (X) × Wτ (X) and all subclasses K of Alg(τ ),
we have Σ ⊆ IdM odΣ and K ⊆ M odIdK;

(iii) The maps IdM od and M odId are closure operators on Wτ (X)×Wτ (X)
and on Alg(τ ), respectively.

(iv) The sets closed under M odId are exactly the sets of the form M odΣ,
for some Σ ⊆ Wτ (X) × Wτ (X), and the sets closed under IdM od are
exactly the sets of the form IdK, for some K ⊆ Alg(τ ).

Proof: (i) and (ii) correspond to the definition of a Galois-connection, while
(iii) and (iv) correspond to Theorem 2.2.4, parts (ii) and (iii).

Definition 6.1.4 A class K ⊆ Alg(τ ) is called an equational class, or is
said to be equationally definable, if there is a set Σ of equations such that K
= M odΣ. A set Σ ⊆ Wτ (X) × Wτ (X) is called an equational theory if there
is a class K ⊆ Alg(τ ) such that Σ = IdK.

From Theorem 6.1.3, part (iv), we see that the equational classes are ex-
actly the closed sets, or fixed points, with respect to the closure operator
M odId, and, dually, the equational theories are exactly the closed sets, or
fixed points, with respect to the closure operator IdM od. As we mentioned
in Chapter 2, the collections of such closed sets form complete lattices.

Theorem 6.1.5 The collection of all equational classes of type τ forms a
complete lattice L(τ ), and the collection of all equational theories of type τ
forms a complete lattice E(τ ). These lattices are dually isomorphic: there
exists a bijection ϕ : L(τ ) → E(τ ), satisfying ϕ(K1 ∨ K2 ) = ϕ(K1 ) ∧ ϕ(K2 )
and ϕ(K1 ∧ K2 ) = ϕ(K1 ) ∨ ϕ(K2 ).

Proof: As remarked above, the fact that these collections are complete lat-
tices follows from the general results on Galois-connections and closure op-
erators in Chapter 2. For arbitrary subclasses K of L(τ ), the infimum of
K is the set-theoretical intersection, ∧K = ∩K, and the supremum ∨K is
the variety which is generated by the set-theoretical union, so ∨K =
∩{K′ ∈ L(τ ) | K′ ⊇ ∪K}. Again by Theorem 2.1.6 we get ∨K = M odId(∪K).
The meet and join for E(τ ) are obtained similarly.

Now we define a mapping ϕ : L(τ ) → E(τ ) by K ↦ IdK for every equational
class K of L(τ ). By Definition 6.1.4, the image IdK is an equational theory
of type τ ; so the mapping ϕ is well defined. By Definition 6.1.4 and Theorem
6.1.3 (iv), we know that ϕ is also surjective. Furthermore, if IdK1 = IdK2
then we have K1 = M odIdK1 = M odIdK2 = K2 , using Theorem 6.1.3 (iii)
and the fact that equational classes are exactly the fixed points with respect
to the closure operator M odId. Therefore, ϕ is a bijection on L(τ ).

Next we show that our map ϕ has the property claimed on meets. We have
ϕ(K1 ∧ K2 ) = Id(K1 ∩ K2 ), by definition. Since the operator Id reverses
inclusions, our set Id(K1 ∩ K2 ) contains both IdK1 and IdK2 . Since it is
an equational theory, it also contains their join, IdK1 ∨ IdK2 , which equals
ϕ(K1 ) ∨ ϕ(K2 ). This gives us the inclusion ϕ(K1 ∧ K2 ) ⊇ ϕ(K1 ) ∨ ϕ(K2 ).

For the opposite inclusion, we start with the fact that for each i = 1, 2,
we have IdKi ⊆ IdK1 ∨ IdK2 . Applying the operator M od to this, and
using the fact that Ki = M odIdKi for equational classes, we have Ki ⊇
M od(IdK1 ∨ IdK2 ). Thus we have

M od(IdK1 ∨ IdK2 ) ⊆ K1 ∩ K2 .

Applying Id to this gives

IdM od(IdK1 ∨ IdK2 ) ⊇ Id(K1 ∩ K2 ).

But IdK1 ∨ IdK2 is an equational theory, and therefore it is closed under
IdM od. This gives us

Id(K1 ∩ K2 ) ⊆ IdK1 ∨ IdK2 .

Thus we have our inclusion ϕ(K1 ∧ K2 ) ⊆ ϕ(K1 ) ∨ ϕ(K2 ), and hence the
equality we needed.

Finally, we verify the claim for joins. Since ϕ reverses inclusions, the inclu-
sion Ki ⊆ K1 ∨K2 , for i = 1, 2, implies ϕ(K1 ∨K2 ) ⊆ ϕ(Ki ), for i = 1, 2. This
gives us one direction, namely that ϕ(K1 ∨K2 ) ⊆ ϕ(K1 )∧ ϕ(K2 ). Conversely,
we know that IdK1 ∧ IdK2 = IdK1 ∩ IdK2 ⊆ IdKi , for i = 1, 2; so applying
M od gives M od(IdK1 ∧ IdK2 ) ⊇ M odIdKi , for i = 1, 2. Since K1 and K2
are closed under M odId, we get M od(IdK1 ∧ IdK2 ) ⊇ K1 ∨ K2 . Applying Id
once more gives IdM od(IdK1 ∧IdK2 ) ⊆ Id(K1 ∨K2 ). But IdK1 ∧IdK2 is an
equational theory and closed under IdM od; so finally we have IdK1 ∧ IdK2
⊆ Id(K1 ∨ K2 ). This amounts to ϕ(K1 ) ∧ ϕ(K2 ) ⊆ ϕ(K1 ∨ K2 ). This com-
pletes our proof of the equality ϕ(K1 ) ∧ ϕ(K2 ) = ϕ(K1 ∨ K2 ).

We point out that in fact the claims of Theorem 6.1.5 are true for the lat-
tices of closed sets of any Galois-connection. A careful reading of the previous
proof will show that we used only properties of the Galois connection, and
not any properties of the particular connection Id − M od.

If K is an arbitrary class of algebras, then its closure under the closure
operator M odId is the equational class M odIdK. This class is called the
equational class generated by K, and usually denoted by E(K) := M odIdK.

Remark 6.1.6 We have developed our Galois-connection (Id, M od) in this
section using the countably infinite alphabet X of variable symbols for terms.
Since any individual identity s ≈ t contains at most finitely many variables,
it is clear that we need at most a countably infinite alphabet to discuss
equational classes and equational theories. Moreover, since absolutely free
algebras on different sets of the same cardinality are isomorphic, by Theorem
5.1.8, it is enough to use the set X. However, we could of course carry out
the same process for any set Y of variables, with the same theorems holding.
This will be significant in Sections 6.4 and 6.5, where we will want to consider
free algebras and sets of identities on arbitrary sets Y .

6.2 Fully Invariant Congruence Relations


As we saw in the previous section, for a given type τ the collection of all
equational theories of type τ forms a complete lattice, obtained as the lattice
of closed sets from our Galois-connection. In this section we characterize
such equational theories, using the concept of a fully invariant congruence
relation.

Definition 6.2.1 A congruence relation θ on an algebra A of type τ is said
to be fully invariant if whenever (x, y) ∈ θ, we also have (ϕ(x), ϕ(y)) ∈ θ,
for every endomorphism ϕ of A; that is, if θ is compatible with all endomor-
phisms ϕ of A.
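On a finite algebra this condition can be tested exhaustively, since the endomorphisms can themselves be enumerated. The following Python sketch is our own illustration; both examples (the group Z4 and a bare three-element set) are assumptions chosen for their small size, not taken from the text.

```python
from itertools import product

def endomorphisms(universe, ops):
    """All maps A -> A commuting with every operation (brute force)."""
    endos = []
    for images in product(universe, repeat=len(universe)):
        phi = dict(zip(universe, images))
        if all(phi[f(*args)] == f(*(phi[a] for a in args))
               for f, arity in ops
               for args in product(universe, repeat=arity)):
            endos.append(phi)
    return endos

def fully_invariant(theta, universe, ops):
    """theta is fully invariant iff every endomorphism preserves it."""
    return all((phi[a], phi[b]) in theta
               for phi in endomorphisms(universe, ops)
               for (a, b) in theta)

# In (Z4; +) the congruence induced by the subgroup {0, 2} is fully
# invariant, since every endomorphism is x -> kx:
U = [0, 1, 2, 3]
plus = lambda x, y: (x + y) % 4
theta = {(a, b) for a in U for b in U if (a - b) % 2 == 0}
assert fully_invariant(theta, U, [(plus, 2)])

# On a bare set (no operations) every map is an endomorphism, so a
# non-trivial equivalence such as {{0, 1}, {2}} is not fully invariant:
theta2 = {(0, 0), (1, 1), (2, 2), (0, 1), (1, 0)}
assert not fully_invariant(theta2, [0, 1, 2], [])
```

The second example shows why full invariance is a genuine restriction: it depends on how many endomorphisms the algebra has, and an algebra with no operations has all of them.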

Theorem 6.2.2 Let Σ ⊆ Wτ (X) × Wτ (X) be a set of equations of type τ .
Then Σ is an equational theory if and only if it is a fully invariant congruence
relation on the term algebra Fτ (X).

Proof: If Σ is an equational theory of type τ , then there exists a class
K ⊆ Alg(τ ) of algebras of type τ such that Σ = IdK. The set IdK is a binary
relation on the set Wτ (X). For every term t ∈ Wτ (X) we have t ≈ t ∈ IdK,
since of course tA = tA for any algebra A in the class K. The symmetry and
transitivity of the binary relation IdK are similarly easy to verify, so that Σ
= IdK is at least an equivalence relation on Wτ (X). For the congruence prop-
erty, suppose that s1 ≈ t1 , . . . , sni ≈ tni are identities satisfied in K. This
means that s1^A = t1^A , . . . , sni^A = tni^A for every algebra A from K. Let fi be an
ni -ary operation symbol. Then our assumption means that fi^A (s1^A , . . . , sni^A ) =
fi^A (t1^A , . . . , tni^A ). By the inductive step of the definition of a term operation in-
duced by a term, this means that [fi (s1 , . . . , sni )]^A = [fi (t1 , . . . , tni )]^A , and
the definition of satisfaction gives fi (s1 , . . . , sni ) ≈ fi (t1 , . . . , tni ) ∈ IdA.
Thus fi (s1 , . . . , sni ) ≈ fi (t1 , . . . , tni ) ∈ IdK, as required for a congruence.
For the fully invariant property, we have to show that IdK is preserved by
an arbitrary endomorphism ϕ of Fτ (X). So we take (s, t) ∈ IdK = Σ, and
show that (ϕ(s), ϕ(t)) is also in Σ. For this we use the property from
Remark 6.1.2, that Σ = IdK is equal to the intersection of the kernels of the
homomorphisms fˆ, for all maps f : X → A and all algebras A in K. But
for any such A and map f , the map fˆ ◦ ϕ is also a homomorphism from
Fτ (X) to A, and is the extension of some map g from X to A. Thus our pair
(s, t) from Σ must also be in the kernel of this new homomorphism fˆ ◦ ϕ.
This means precisely that the pair (ϕ(s), ϕ(t)) must be in ker fˆ. Since this
is true for all algebras A and maps f : X → A, we have (ϕ(s), ϕ(t)) in Σ.
This shows that Σ is a fully invariant congruence.

Conversely, assume that θ is a fully invariant congruence relation on Fτ (X).
We show that θ is the equational theory IdK, for K the class consisting of
the quotient algebra Fτ (X)/θ.

First, let s ≈ t ∈ Id(Fτ (X)/θ). Consider the mapping f : X → Fτ (X)/θ
defined by f (x) = [x]θ for all x ∈ X. Since Fτ (X)/θ is an algebra of type τ , this
mapping f has a unique homomorphic extension fˆ : Fτ (X) → Fτ (X)/θ. But
the natural homomorphism natθ : Fτ (X) → Fτ (X)/θ has the same prop-
erty, so the uniqueness of fˆ gives fˆ = natθ. Now our assumption that s ≈ t
is an identity in Fτ (X)/θ means, by Remark 6.1.2, that (s, t) is in kerfˆ.
This gives natθ(s) = natθ(t), so (s, t) ∈ θ. Thus we have Id(Fτ (X)/θ) ⊆ θ.

For the opposite inclusion, we use Remark 6.1.2 again; it will suffice to show
that for any map f : X → Fτ (X)/θ, we have θ ⊆ ker fˆ, where as usual fˆ is
the unique extension of f . To show this, we first define a map g : X → Fτ (X)
by x ↦ s, where s is a representative of the congruence class f (x), so that f (x) = [s]θ .
Then we get the commutative diagram shown below, with ϕ the natural
embedding. Combining ĝ ◦ ϕ = g and natθ ◦ g = fˆ ◦ ϕ gives us natθ ◦ ĝ ◦ ϕ
= fˆ ◦ ϕ, and since ϕ is an embedding we have natθ ◦ ĝ = fˆ. Now for any
(s, t) ∈ θ, the full invariance of θ means that (ĝ(s), ĝ(t)) ∈ θ. Therefore
fˆ(s) = [ĝ(s)]θ = [ĝ(t)]θ = fˆ(t), and (s, t) ∈ kerfˆ.

[Commutative diagram: the maps g : X → Fτ (X), the embedding ϕ : X → Fτ (X),
the extension ĝ : Fτ (X) → Fτ (X), natθ : Fτ (X) → Fτ (X)/θ, and
fˆ : Fτ (X) → Fτ (X)/θ, satisfying ĝ ◦ ϕ = g and natθ ◦ ĝ = fˆ.]

As a consequence of this theorem, we see that the set of all fully invari-
ant congruences on the free algebra Fτ (X) forms a complete lattice, called
Con_fi Fτ (X), and that this lattice is a sublattice of the congruence lattice
ConFτ (X).

6.3 The Algebraic Consequence Relation


In this section we define a consequence relation on sets of equations. There
are two ways we can do this, which turn out to be equivalent. The first
approach is a logical one, based on deductions of some equations from others
according to certain rules of deduction. We write Σ ` s ≈ t, read as “Σ yields
s ≈ t,” if there is a formal deduction of s ≈ t starting with identities in Σ,
using the following five rules of consequence or derivation or deduction rules:

(1) ∅ ` s ≈ s,
(2) {s ≈ t} ` t ≈ s,
(3) {t1 ≈ t2 , t2 ≈ t3 } ` t1 ≈ t3 ,
(4) {tj ≈ tj′ : 1 ≤ j ≤ ni } ` fi (t1 , . . . , tni ) ≈ fi (t1′ , . . . , tni′ ), for every
operation symbol fi (i ∈ I) (the replacement rule),
(5) Let s, t, r ∈ Wτ (X) and let s̃, t̃ be the terms obtained from s, t by
replacing every occurrence of a given variable x ∈ X by r. Then s ≈
t ` s̃ ≈ t̃. (This is called the substitution rule.)

It should be clear that these rules (1) - (5) reflect the properties of a fully
invariant congruence relation: the first three are the properties of an equiva-
lence relation, the fourth describes the congruence property and the fifth the
fully invariant property. Thus by Theorem 6.2.2 equational theories Σ are
precisely sets of equations which are closed with respect to finite application
of the rules (1) - (5). This equational approach will be used in Chapter 7,
when we study term-rewriting systems.
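Rule (5) is the only rule that manipulates the inside of terms; on a tuple representation of terms it is a three-line recursion. A short Python sketch of ours (f and g are arbitrary operation symbols chosen for illustration):

```python
def substitute(term, var, r):
    """Rule (5): replace every occurrence of the variable `var` in `term`
    by the term `r` (terms are nested tuples, variables are strings)."""
    if isinstance(term, str):
        return r if term == var else term
    f, *args = term
    return (f, *(substitute(arg, var, r) for arg in args))

# from s ≈ t derive s̃ ≈ t̃ by substituting r = g(y) for x:
s = ("f", "x", "y")
t = ("f", "y", "x")
r = ("g", "y")
assert substitute(s, "x", r) == ("f", ("g", "y"), "y")
assert substitute(t, "x", r) == ("f", "y", ("g", "y"))
```

Applying the same substitution to both sides of an equation is exactly what makes the deductive closure correspond to the fully invariant property of Theorem 6.2.2.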

The second approach to consequences is a more algebraic one. Here we say
that s ≈ t follows from a set Σ of identities if s ≈ t is satisfied as an identity
in every algebra A of type τ in which all equations from Σ are satisfied as
identities. In this case we write Σ |= s ≈ t.
The connection between these approaches is given by the Completeness and
Consistency Theorem of equational logic: For Σ ⊆ Wτ (X) × Wτ (X) and
s ≈ t ∈ Wτ (X) × Wτ (X), we have
Σ |= s ≈ t ⇔ Σ ` s ≈ t.
The “⇒”-direction of this theorem is usually called completeness, since it
means that any equation which is “true” is derivable; the “⇐”-direction is
called consistency, in the sense that every derivable equation is “true.” The
equivalence here follows directly from Theorem 6.2.2.

6.4 Relatively Free Algebras


Let Y be any non-empty set of variable symbols, and let Fτ (Y ) be the
absolutely free algebra of type τ generated by Y . Let K ⊆ Alg(τ ) be a
class of algebras, and let IdK be the set of all identities on the alphabet Y
satisfied in K (that is, satisfied in every algebra A of K). Since IdK is a (fully
invariant) congruence relation on the free algebra Fτ (Y ), we can form the
quotient algebra Fτ (Y )/IdK, as we did in the proof of Theorem 6.2.2. This
quotient of the absolutely free algebra also has some “freeness” properties,
and is called the relatively free algebra, with respect to the class K, over the
set Y . Notice that since the absolutely free algebra Fτ (Y ) is generated by
Y , the quotient algebra is generated by the image Ȳ of Y under the natural
homomorphism natIdK.

Definition 6.4.1 Let Y be a non-empty set of variables, and K be a class
of algebras of type τ . The algebra FK (Y ) := Fτ (Y )/IdK is called the K-free
algebra over Y or the free algebra relative to K generated by Ȳ = Y /IdK.

Remark 6.4.2 1. Since FK (Y ) is generated by the set Ȳ = Y /IdK =
{[y]IdK | y ∈ Y }, instead of FK (Y ) we should actually write FK (Ȳ ); but
for notational convenience this is not usually done.

2. FK (Y ) exists iff Fτ (Y ) exists iff Y ≠ ∅. Thus we have a K-free algebra of
type τ over any non-empty set Y .

3. If |Y | = |Z| ≠ 0, then FK (Y ) ≅ FK (Z) under an isomorphism mapping
Ȳ to Z̄. The proof of this fact is similar to the proof of Theorem 5.1.8. This
means that (up to isomorphism) only the cardinality of the generating set
Y is important, and not the particular choice of variable symbols.

4. In the case that Y is the set Xn = {x1 , . . . , xn }, we will write FK (n)
instead of FK (Xn ). In this case, our algebra is called the K-free algebra on
n generators.

The algebra FK (Y ) also satisfies a relative “freeness” property corresponding
to the absolutely free property of Theorem 5.1.7:

Theorem 6.4.3 Let Y be any non-empty set of variables. For every algebra
A ∈ K ⊆ Alg(τ ) and every mapping f : Y → A, there exists a unique
homomorphism fˆ : FK (Y ) → A which extends f .

Proof: Let A be in K, with a map f : Y → A. Let ϕ be the inclusion
mapping from Y to the absolutely free algebra Fτ (Y ). By the freeness of this
latter algebra, Theorem 5.1.7, there is a unique homomorphism f̄ : Fτ (Y ) → A
which extends f , so that f̄ ◦ ϕ = f . Now consider the homomorphism natIdK
from Fτ (Y ) to FK (Y ). Its kernel is precisely IdK, and by Remark 6.1.2 this
is contained in ker f̄ . Since we have two homomorphisms defined on Fτ (Y )
with the kernel of one contained in the kernel of the other, we can use the
General Homomorphism Theorem to conclude that there is a homomorphism
fˆ from FK (Y ) to A, with fˆ ◦ natIdK = f̄ . (See the diagram below.) More-
over fˆ ◦ natIdK|Y = fˆ ◦ natIdK ◦ ϕ = f̄ ◦ ϕ = f , so fˆ extends f . By the
homomorphism theorem fˆ is uniquely determined by K and f .

[Commutative diagram: f : Y → A, the embedding ϕ : Y → Fτ (Y ), the
extension f̄ : Fτ (Y ) → A, natIdK : Fτ (Y ) → FK (Y ), and fˆ : FK (Y ) → A,
with f̄ ◦ ϕ = f and fˆ ◦ natIdK = f̄ .]

The free algebra with respect to a class K is in fact uniquely determined
(up to isomorphism) by the relative freeness property expressed in Theorem
6.4.3.

Theorem 6.4.4 Let K be a class of algebras of type τ . Let F be an algebra
in K with the properties that F is generated by a subset Y ⊆ F and that
for any algebra A ∈ K and for any mapping f : Y → A there exists a
homomorphism fˆ : F → A extending f . Then F is isomorphic to FK (Y ).

Proof: Let ϕ be the inclusion mapping of Y into F . Since F is an algebra
of type τ , by Theorem 5.1.7 the mapping ϕ : Y → F can be extended to a
unique homomorphism ϕ̄ : Fτ (Y ) → F . This homomorphism is surjective,
since its image contains ϕ(Y ), which generates F. Thus by the Homomorphic
Image Theorem, we see that F is isomorphic to the quotient Fτ (Y )/ ker ϕ̄.
Since our relatively free algebra is just Fτ (Y )/IdK, it will suffice to prove
that ker ϕ̄ = IdK.

First, since F ∈ K we have IdK ⊆ ker ϕ̄, by Remark 6.1.2. Conversely, to
show that ker ϕ̄ ⊆ IdK, we use Remark 6.1.2 again, and take any algebra
A in K and any map f : Y → A. Using the inclusion map ϕ : Y → F and
the freeness assumption on F, we see that there is a unique homomorphism
fˆ : F → A, extending f . But then fˆ ◦ ϕ̄ is a uniquely determined homomor-
phism from Fτ (Y ) to A, and we have ker ϕ̄ ⊆ ker(fˆ ◦ ϕ̄). This shows that
ker ϕ̄ is contained in IdK, which completes our proof.

Theorem 6.4.4 means that free algebras relative to K ⊆ Alg(τ ) can be char-
acterized and therefore defined by the freeness property from Theorem 6.4.3.

We conclude this section with an example of a relatively free algebra which
is very useful in automata theory and other areas of theoretical Computer
Science, and which we shall use in Chapter 8.

Example 6.4.5 Let τ be the type (2), so that we have one binary operation
symbol which we will denote by f . Let K be the class of all semigroups, that
is, of type (2) algebras which satisfy the associative identity, and let Y be any
non-empty set of variables. We consider the K-free algebra FK (Y ) over Y . It
is customary in this case to indicate the binary operation f by juxtaposition,
writing xy for the term f (x, y). Moreover, since x(yz) ≈ (xy)z is an identity
of K, we see that any two terms in Wτ (Y ) which have the same variables
occurring in the same order are equivalent under the congruence IdK. This
means that we can write any term in a normal form in which we omit the
brackets. For example, the term f (f (f (x, y), f (y, x)), f (z, y)) can be written
as xyyxzy. We refer to terms in this normal form as words on the alphabet
Y . It is easy to verify that the set of all such words forms a semigroup under
the operation of concatenation of words, called the free semigroup on Y , and
usually denoted by Y + . By adjoining an empty word e to Y + to act as an
identity element, we can also form the free monoid Y ∗ on the alphabet Y .
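Both the passage from a type-(2) term to its word, and the freeness of Y+, can be made concrete. The Python sketch below is our own illustration; the interpretation of the letters in (N; +) is an arbitrary choice of target semigroup.

```python
def word(term):
    """Normal form in the free semigroup: flatten a type-(2) term into
    the sequence of its variables, read from left to right."""
    if isinstance(term, str):
        return term
    _, left, right = term
    return word(left) + word(right)

# the term f(f(f(x, y), f(y, x)), f(z, y)) from the example above:
t = ("f", ("f", ("f", "x", "y"), ("f", "y", "x")), ("f", "z", "y"))
assert word(t) == "xyyxzy"

# freeness of Y+: any map of the letters into a semigroup extends
# uniquely to words; here the target is (N; +) with x, y, z -> 1, 2, 3:
values = {"x": 1, "y": 2, "z": 3}
extend = lambda w: sum(values[letter] for letter in w)
assert extend(word(t)) == 1 + 2 + 2 + 1 + 3 + 2
```

Concatenation of words corresponds to the semigroup operation, and the extension map is a homomorphism precisely because addition in the target is associative.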

6.5 Varieties
In this section we link together our two approaches to classes of algebras, the
equational approach from the preceding sections and the algebraic approach
from Chapters 1, 3 and 4. We introduce operators H, S and P on classes
of algebras, corresponding to the algebraic constructions of homomorphic
images, subalgebras and product algebras studied earlier. A class of algebras
which is closed under these operators is called a variety. Our main theorem
in this section will show that in fact varieties are equivalent to equational
classes.

Definition 6.5.1 We define the following operators on the set Alg(τ ) of all
algebras of a fixed type τ . For any class K ⊆ Alg(τ ),

S(K) is the class of all subalgebras of algebras from K,


H(K) is the class of all homomorphic images of algebras from K,
P(K) is the class of all direct products of families of algebras from K,

I(K) is the class of all algebras which are isomorphic to algebras from K,
PS (K) is the class of all subdirect products of families of algebras from K.

We can also combine these operators to produce new ones; we write IP for
instance for the composition of I and P. Recall from Chapter 2 that an
operator is a closure operator if it is extensive, monotone and idempotent.
We first verify that some of our operators are in fact closure operators.

Lemma 6.5.2 The operators H, S and IP are closure operators on the set
Alg(τ ).

Proof: We will give a proof only for H; the others are quite similar. It is clear
from the definition that for any subclasses K and L of Alg(τ ), the inclusion
K ⊆ L implies H(K) ⊆ H(L), and since any algebra is a homomorphic im-
age of itself under the identity homomorphism, we always have K ⊆ H(K).
For the idempotency of H, we note that by the extensivity and monotonicity
we have H(K) ⊆ H(H(K)). Conversely, let A be in H(H(K)). Then there
exists an algebra B ∈ H(K) and a surjective homomorphism ϕ : B → A. For
B ∈ H(K) there exists an algebra C ∈ K and a surjective homomorphism
ψ : C → B. Then the composition ϕ ◦ ψ : C → A is also a surjective homo-
morphism, and thus A ∈ H(K).

Note however that the operator P is not a closure operator, since it is not
idempotent: A1 × (A2 × A3 ) is not equal to (A1 × A2 ) × A3 , although they
are isomorphic.

Definition 6.5.3 A class K ⊆ Alg(τ ) is called a variety if K is closed under
the operators H, S and P; that is, if H(K) ⊆ K, S(K) ⊆ K and P(K) ⊆ K.

We investigate now the properties of these operators, especially how they
combine with each other.

Lemma 6.5.4 Let K be a class of algebras of type τ . Then

(i) SH(K) ⊆ HS(K),

(ii) PS(K) ⊆ SP(K),

(iii) PH(K) ⊆ HP(K).



Proof: (i) Let A be an element of SH(K). Then A is a subalgebra of an
algebra B, which in turn is a homomorphic image of an algebra C in K,
under a surjective homomorphism ϕ : C → B. Since A ⊆ B, by Theo-
rem 3.1.3 the preimage ϕ−1 (A) is a subalgebra of C. This preimage satisfies
ϕ(ϕ−1 (A)) = A, making A a homomorphic image of the subalgebra ϕ−1 (A)
of C. Therefore A ∈ HS(K).

(ii) If A ∈ PS(K) then A = Π_{j∈J} Bj for some algebras Bj in S(K), and for
each j ∈ J there is an algebra Cj in K with Bj ⊆ Cj . Since Π_{j∈J} Bj is then
a subalgebra of Π_{j∈J} Cj , we have A ∈ SP(K).

(iii) If A ∈ PH(K) then A = Π_{j∈J} Bj for some algebras Bj ∈ H(K), and
for each j ∈ J there is an algebra Cj in K and a surjective homomorphism
ϕj : Cj → Bj . For the projection mapping pℓ : Π_{j∈J} Cj → Cℓ , the
composition ϕℓ ◦ pℓ : Π_{j∈J} Cj → Bℓ is a surjective homomorphism. By
Remark 4.1.2 there exists a homomorphism Π_{j∈J} Cj → Π_{j∈J} Bj which is
surjective. This makes A = Π_{j∈J} Bj ∈ HP(K).

Using this Lemma we can obtain a characterization of the smallest variety
to contain a class K of algebras.

Theorem 6.5.5 For any class K of algebras of type τ , the class HSP(K)
is the least (with respect to set inclusion) variety which contains K.

Proof: We show first that HSP(K) is indeed a variety, that is, that it is
closed under application of H, S and P. We have H(HSP(K)) = HSP(K)
by the idempotence of H, and S(HSP(K)) ⊆ H(SSP(K)) = HSP(K) by
Lemma 6.5.4 (i) and the idempotence of S. For P, we have P(HSP(K)) ⊆
HPSP(K) ⊆ HSPP(K) ⊆ HSIPIP(K) = HSIP(K) ⊆ HSHP(K) ⊆
HHSP(K) = HSP(K), using properties from 6.5.2 and 6.5.4.

Thus HSP(K) is a variety. Now let K′ be any variety which contains K.
Then HSP(K) ⊆ HSP(K′) ⊆ K′, since as a variety K′ is closed under all
three operators.

For any class K of algebras of the same type, the variety HSP(K) from
Theorem 6.5.5 is called the variety generated by K. It is often denoted by
V (K). When K consists of a single algebra A, we usually write V (A) for the
variety generated by K.

By Theorem 6.5.5, we have K ⊆ HSP(K) for any class K. When K is a
variety, then closure of K under each of H, S and P gives us HSP(K) ⊆
K as well, and hence HSP(K) = K for K a variety. Conversely, whenever
HSP(K) = K, we must have K a variety, as shown in the first part of the
proof of Theorem 6.5.5. This gives us the following Corollary.

Corollary 6.5.6 A class K of algebras of type τ is a variety if and only if
HSP(K) = K.

In Chapter 4 (Theorem 4.2.8) we have already remarked that every algebra
is isomorphic to a subdirect product of subdirectly irreducible algebras. Now
we can prove a stronger result, that every algebra of a variety K is isomorphic
to a subdirect product of subdirectly irreducible algebras from K.

Theorem 6.5.7 Every algebra of a variety K is isomorphic to a subdirect
product of subdirectly irreducible algebras from K.

Proof: By Theorem 4.2.8 every algebra A in K is isomorphic to a subdirect
product of some subdirectly irreducible algebras Aj . By the remark follow-
ing Theorem 4.2.4, each Aj is in fact isomorphic to a quotient algebra of A,
and is thus a homomorphic image of A. Since A ∈ K and K is a variety, we
see that Aj ∈ H(K) ⊆ K too.

Our goal is to prove that equational classes and varieties are the same thing.
The next Lemma gives one direction of this equivalence, that any equational
class of algebras is a variety.

Lemma 6.5.8 Let K be an equationally definable class of algebras of type
τ . Then K is a variety.

Proof: Let K be an equationally definable class of algebras of type τ . This
means that there exists a set Σ of equations of type τ , over the alphabet X,
such that K = M odΣ. To see that M odΣ is closed under H, S and P, let
s ≈ t ∈ Σ be any identity in Σ. Then we have sA = tA for any algebra A ∈ K.

If B is a subalgebra of an algebra A in K, then by Theorem 5.2.4 sB = sA |B
and tB = tA |B , and therefore sB = tB ; thus s ≈ t holds in B and B ∈ K.
Therefore S(K) ⊆ K. In a similar way, using Theorem 5.2.4 (ii), we can
show that H(K) ⊆ K.
Lastly, suppose Aj ∈ K and that Aj satisfies s ≈ t for each j ∈ J. Then
for a1 , . . . , an ∈ C = Π_{j∈J} Aj we have sAj (a1 (j), . . . , an (j)) =
tAj (a1 (j), . . . , an (j)), hence (sC (a1 , . . . , an ))(j) = (tC (a1 , . . . , an ))(j) for all
j ∈ J, and thus sC = tC . Therefore the product C of algebras in K is in K,
which shows that P(K) ⊆ K.
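The P-part of this argument is transparent for two finite factors: operations of the product act coordinatewise, so an identity holding in both factors holds in the product. A Python sketch of ours (commutativity stands in for an arbitrary identity s ≈ t; the factors Z2 and Z3 are an arbitrary choice):

```python
from itertools import product

def pointwise(f, g):
    """The operation of a direct product A1 x A2 acts coordinatewise."""
    return lambda p, q: (f(p[0], q[0]), g(p[1], q[1]))

def satisfies_comm(universe, op):
    """Check the identity x + y ≈ y + x over the whole universe."""
    return all(op(a, b) == op(b, a) for a, b in product(universe, repeat=2))

Z2, Z3 = [0, 1], [0, 1, 2]
add2 = lambda x, y: (x + y) % 2
add3 = lambda x, y: (x + y) % 3

# both factors satisfy x + y ≈ y + x, hence so does the product:
P = [(a, b) for a in Z2 for b in Z3]
addP = pointwise(add2, add3)
assert satisfies_comm(Z2, add2) and satisfies_comm(Z3, add3)
assert satisfies_comm(P, addP)
```

The check on the product is literally the displayed computation in the proof: evaluating s and t at the j-th coordinate of each argument.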

Before we can show the other direction of our equivalence, we need one more
fact. We show that for any class K of algebras of type τ , and for any set Y
of variables, the free algebra FK (Y ) with respect to K belongs to ISP(K).

Theorem 6.5.9 For every class K ⊆ Alg(τ ) and every non-empty set Y of
variables, the relatively free algebra FK (Y ) is in ISP(K).
Proof: We want to use Theorem 4.2.4 to write our algebra FK (Y ) as a sub-
direct product. To do this, we first need to verify that the intersection of all
the congruences on FK (Y ) is the identity relation. First note that FK (Y ) is
the quotient Fτ (Y )/IdK. Any congruence on this algebra is the kernel of a
homomorphism onto some algebra A in K, and any such homomorphism is
the extension fˆ of some mapping f : Y → A. Thus it is enough to show that
the intersection of the kernels of all such fˆ on Fτ (Y ) is the identity relation
on Fτ (Y )/IdK. But this holds, by Remark 6.1.2 and the definition of IdK.

Now by Theorem 4.2.4 the algebra Fτ (Y )/IdK is isomorphic to a subdirect
product of algebras (Fτ (Y )/IdK)/((ker fˆ)/IdK). For each of these alge-
bras, using the Homomorphic Image Theorem and the Second Isomorphism
Theorem, and the definition of fˆ, we obtain (Fτ (Y )/IdK)/((ker fˆ)/IdK)
= Fτ (Y )/ ker fˆ ≅ fˆ(Fτ (Y )). This last algebra is a subalgebra of the algebra
A, which is in K, putting it in S(K). Thus the relatively free algebra is iso-
morphic to a subdirect product of algebras which in turn are isomorphic to
subalgebras of K. Altogether we have Fτ (Y )/IdK = FK (Y ) ∈ ISP(IS(K)).
Using the properties of Lemma 6.5.4 (which hold for I as well as H) and
Lemma 6.5.2, we get FK (Y ) ∈ ISP(S(K)) ⊆ ISP(K).

Since varieties are also closed under the operators I, S and P, it follows from
this theorem that every variety K contains all the relatively free algebras
FK (Y ), for each non-empty set Y .

Next we show that the identities satisfied in a class K of algebras of type
τ are exactly the identities satisfied in the free algebra FK (X), where X as
usual is our countably infinite set of variables.

Lemma 6.5.10 Let K be a variety of algebras of type τ , and let s and t be
terms in Wτ (X). Then

K |= s ≈ t ⇔ FK (X) |= s ≈ t.

Proof: Since K is a variety, we know by the remark just above that the
relatively free algebra FK (X) is in K. This means that any identity of K
must in particular hold in FK (X), giving us one direction of the claim.

If conversely FK (X) satisfies s ≈ t, then sFK (X) = tFK (X) . Since FK (X) is
the quotient of Fτ (X) by IdK, this forces [s]IdK = [t]IdK ; and from this, we
have (s, t) ∈ ker natIdK = IdK and so K satisfies s ≈ t.

With these results, we are ready to prove our main theorem, sometimes
referred to as Birkhoff’s Theorem.

Theorem 6.5.11 (Main Theorem of Equational Theory) A class K of al-


gebras of type τ is equationally definable if and only if it is a variety.

Proof: We have already proved one direction of this theorem, in Lemma
6.5.8. We will now prove the converse, that any variety K is also an equa-
tional class; specifically, we will show that K equals the equational class
K′ := ModIdK. Again by Lemma 6.5.8, this class is a variety, and we clearly
have K ⊆ ModIdK = K′. Applying the operator Id to K′ = ModIdK gives
us IdK′ = IdModIdK = IdK, since IdMod is a closure operator. But then
we have FK(Y) = Fτ(Y)/IdK = Fτ(Y)/IdK′ = FK′(Y), for every non-
empty set Y of variables. Now let A be any algebra in K′, and choose a set
Y of variables such that |Y| = |A|. There is a surjective mapping f : Y → A,
and we know by Theorem 6.4.3 that this mapping has a unique extension
to a homomorphism from FK′(Y) to A, which is also surjective.
This makes A a homomorphic image of FK′(Y), which in turn is equal to
FK(Y). But this relatively free algebra is in the variety K, and K is closed
under homomorphic images, so we have A in K. This shows that K′ ⊆ K,
and finishes our proof that K = K′ = ModIdK.

Note that in the proof of Theorem 6.5.11, we needed the existence of rel-
atively free algebras over non-empty variable sets of arbitrary cardinality.
However, as we commented in Chapter 5, it is in some sense sufficient to have
absolutely or relatively free algebras over variable sets of finite or countably
infinite cardinality. The following Lemma explains this more precisely.

Lemma 6.5.12 Let K be a variety of type τ and let Y be any non-empty


variable set. The relatively free algebra FK (Y ) with respect to K over Y is
isomorphic to a subdirect product of the algebras FK (E), for E ⊆ Y non-
empty and finite.
Proof: For each variable y in Y we define ȳ := [y]_{IdK}. For every subset
E ⊆ Y, we set Ē := {ȳ | y ∈ E}. Let U(E) be the subalgebra of FK(Y)
generated by Ē. It is easy to see that U(E) and FK(E) are isomorphic.
Therefore, it is enough to show that FK(Y) is isomorphic to a subdirect
product of the algebras U(E).

For each such E consider a mapping ϕE : Ȳ → U(E) with ϕE defined to be
the identity mapping on Ē. This mapping ϕE has a unique homomorphic
extension ϕ̂E, which is surjective and is the identity mapping on both Ē
and the subalgebra U(E) it generates. Every term depends on only finitely
many variables. Therefore, for every pair s, t in FK(Y) there exists a finite
subset E ⊆ Y with s, t ∈ U(E). If s ≠ t then (s, t) ∉ ker ϕ̂E, since ϕ̂E(s) = s
and ϕ̂E(t) = t. This shows that the intersection of the kernels of all such ho-
momorphisms ϕ̂E is the identity relation on FK(Y). This means that we can
use Theorem 4.2.4 to express our algebra FK(Y) as isomorphic to a subdirect
product of the algebras FK(Y)/ker(ϕ̂E). Finally, since FK(Y)/ker(ϕ̂E) ≅
U(E), our claim is proved.

As a consequence of this lemma, we have the following useful result about


generating sets for a variety K. Any variety is generated by its relatively
free algebra on a countably infinite set of generators, or by the collection of
relatively free algebras on n generators, for each natural number n ≥ 1.

Theorem 6.5.13 For every variety K,

K = HSP({FK (n) | n ∈ N, n ≥ 1}) = HSP({FK (X)}).

Proof: As we saw following Theorem 6.5.9, the fact that K is a variety


means that it contains the relatively free algebras with respect to K on any

finite or countably infinite set of variables. Thus the two generating sets are
contained in K. For the converse, let A be any algebra in K. If we choose
a set Y whose cardinality is greater than the cardinality of A, we can make
a surjective homomorphism from FK (Y ) onto A. Then using Lemma 6.5.12
we can express FK (Y ) as a subdirect product of relatively free algebras with
respect to K on sets E of finite cardinality. Thus A is a homomorphic image
of a subdirect product of the algebras in our generating set. This shows that
K is contained in HSP({FK (n) | n ∈ N}), and hence we have equality.

Finally, we know that for every n ∈ N the algebra FK (n) is isomorphic to


a subalgebra of FK (X). Thus HSP({FK (X)}) contains all the elements of
the first generating set, and so contains all of K as well.

A useful consequence of this theorem is that two varieties K and K′ are
equal if the free algebras FK(n) and FK′(n) are equal for every natural
number n.
We conclude this section with another application of free algebras.

Definition 6.5.14 An algebra A of type τ is called locally finite if every


finitely generated subalgebra of A is finite. A class K of algebras of the
same type is called locally finite if every member of K is locally finite.

Theorem 6.5.15 A variety K is locally finite iff the relatively free algebra
FK (Y ) is finite for every non-empty finite set Y .

Proof: Since K is a variety, it contains the K-free algebra on any non-empty


set Y . Moreover, this algebra is generated by the set Y . So if K is locally
finite, then by definition any finite set Y determines a finite algebra FK (Y ).

For the converse, let A be a finitely generated algebra from K, with a finite
set B ⊆ A of generators. Now we choose an alphabet Y in such a way that
there exists a bijection α : Y → B. This bijection can be extended to a
homomorphism α̂ : FK (Y ) → A. The image α̂(FK (Y )) is then a subalgebra
of A containing B, and hence must be equal to A. Therefore α̂ is surjective,
and as FK (Y ) is finite so is A.

Theorem 6.5.16 Let K ⊆ Alg(τ ) be a finite set of finite algebras. Then the
variety V (K) generated by K is a locally finite variety.

Proof: We verify first that the class P(K) is locally finite. We define an
equivalence relation ∼ on Wτ({x1, . . . , xn}) by p ∼ q iff the term operations
corresponding to p and q are the same on each member of K. Since K is a
finite set of finite algebras, ∼ has only finitely many equivalence classes. But
a subalgebra of a product of members of K which is generated by n elements
has at most as many elements as there are ∼-classes of terms over
{x1, . . . , xn}, so finitely generated members of P(K) are finite. Since every
finitely generated member of V(K) = HSP(K) is a homomorphic image of a
finitely generated member of SP(K), we see that V(K) is locally finite.

6.6 The Lattice of All Varieties


In Section 6.1 we proved that the collection of all equational classes of alge-
bras of a fixed type τ (over a countably infinite alphabet) forms a complete
lattice L(τ ), which is dually isomorphic to the lattice of all equational the-
ories of type τ . One can prove that these lattices are in fact algebraic (see
Section 2.1). The greatest element of the lattice L(τ ) of all varieties of type
τ is the variety consisting of all algebras of type τ ; this variety is denoted
by Alg(τ). Clearly, Alg(τ) = Mod{x ≈ x}. The least element in the lattice
L(τ) is the trivial variety T consisting exactly of all one-element algebras of
type τ; we have T = Mod{x ≈ y}. Dually, the greatest element in the lattice
E(τ ) of all equational theories of type τ is the equational theory generated
by {x ≈ y} (using the five derivation rules), and the least element in E(τ ) is
the equational theory generated by {x ≈ x}. Clearly, the former consists of
all equations of type τ .

A subclass W of a variety V which is also a variety is called a subvariety of


V . The variety V is a minimal, or equationally complete, variety if V is not
trivial but the only subvariety of V not equal to V is the trivial variety.
We show now that every non-trivial V ∈ L(τ ) contains a minimal subvariety.

Theorem 6.6.1 Let V be a non-trivial variety. Then V contains a minimal


subvariety.

Proof: Since V = ModIdV, the set of all identities of V defines V, and
by Theorem 6.2.2 the set IdV is a fully invariant congruence relation on
Fτ(X). Since V is non-trivial, this fully invariant congruence relation is not
all of Fτ(X) × Fτ(X). Since Fτ(X) × Fτ(X) is the fully invariant congruence
generated by any pair (x, y) with x ≠ y, it follows that as a fully invariant
congruence Fτ(X) × Fτ(X) is finitely generated. Using Zorn's Lemma we can
therefore extend IdV to a fully invariant congruence which is maximal among
those different from Fτ(X) × Fτ(X). By Theorem 6.2.2, the fact that the set
Con_fi Fτ(X) forms a sublattice of ConFτ(X), and the properties of the
Galois-correspondence (Id, Mod), this maximal fully invariant congruence
corresponds to a minimal variety which is included in V.

It is known that the variety of all Boolean algebras and the variety of all dis-
tributive lattices are minimal. A variety of groups is minimal if and only if it
is abelian of some prime exponent (that is, consists of all abelian groups sat-
isfying the identity x^p ≈ e, where p is a fixed prime number). For semigroup
varieties of type (2), using the convention of replacing the binary operation
symbol by juxtaposition, we have the following minimal varieties:

SL = Mod{x(yz) ≈ (xy)z, xy ≈ yx, x^2 ≈ x}, the variety of semilattices,
LZ = Mod{xy ≈ x}, the variety of left-zero semigroups,
RZ = Mod{xy ≈ y}, the variety of right-zero semigroups,
Z = Mod{xy ≈ zt}, the variety of zero semigroups, and
Ap = Mod{x(yz) ≈ (xy)z, xy ≈ yx, x^p y ≈ y}, the variety of abelian groups
of prime exponent p.
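These defining identities can be checked by brute force on small concrete semigroups. The following Python sketch (our own illustration; the encoding of identities as pairs of term functions is an assumption of the example, not notation from the text) verifies that the two-element semilattice lies in SL and that the two-element left-zero semigroup lies in LZ:

```python
from itertools import product

def satisfies(op, elems, identity):
    """Brute-force check of one identity; `identity` is a pair of term
    functions together with the number of variables involved."""
    lhs, rhs, nvars = identity
    return all(lhs(op, *vals) == rhs(op, *vals)
               for vals in product(elems, repeat=nvars))

# The two-element semilattice ({0, 1}, min) satisfies the basis of SL.
meet = lambda a, b: min(a, b)
sl_basis = [
    (lambda f, x, y, z: f(x, f(y, z)), lambda f, x, y, z: f(f(x, y), z), 3),
    (lambda f, x, y: f(x, y),          lambda f, x, y: f(y, x),          2),
    (lambda f, x: f(x, x),             lambda f, x: x,                   1),
]
assert all(satisfies(meet, [0, 1], ident) for ident in sl_basis)

# The two-element left-zero semigroup satisfies xy ≈ x.
left = lambda a, b: a
assert satisfies(left, [0, 1], (lambda f, x, y: f(x, y), lambda f, x, y: x, 2))
print("ok")
```

Since the checks range over all assignments of elements to variables, passing them shows genuine satisfaction of the identities in these two finite algebras.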

If V is a given variety of type τ , then the collection of all subvarieties of V


forms a complete lattice L(V ) := {W |W ∈ L(τ ) and W ⊆ V }. This lattice
is called the subvariety lattice of V .

6.7 Finite Axiomatizability


An old question in universal algebra is whether or not the identities of a
finite algebra can be derived (using the deduction rules of Section 6.3) from
a finite set of identities of the algebra. When this can be done, we say that
the algebra, or the variety it generates, is finitely axiomatizable or finitely
based, and we refer to the finite set of identities as a basis for the identities of
the algebra or variety. R. C. Lyndon proved in [72] that every two-element
algebra is finitely axiomatizable. We know that finite groups (see S. Oates
and M. B. Powell, [84]), finite rings (see R. L. Kruse, [68]) and finite algebras
generating a variety in which all congruence lattices are distributive (see K. A.
Baker, [5]) are all finitely axiomatizable. It was quite surprising when
in 1954 R. C. Lyndon constructed in [73] a seven-element algebra with one
binary and one nullary operation whose identities are not finitely based. The
smallest such example is a non-finitely axiomatizable three-element algebra

of type (2) found by V. L. Murskij in [83]; this is the algebra with base set
A = {0, 1, 2} and binary operation · given by the following table:

·   0  1  2
0   0  0  0
1   0  0  1
2   0  2  2
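Murskij's operation can be encoded and experimented with directly. The following Python sketch (our own illustration) checks by brute force that the operation is neither commutative nor associative; of course no finite computation of this kind can establish the non-finite-axiomatizability itself:

```python
from itertools import product

# Murskij's three-element groupoid, encoded from the table above.
M = {
    (0, 0): 0, (0, 1): 0, (0, 2): 0,
    (1, 0): 0, (1, 1): 0, (1, 2): 1,
    (2, 0): 0, (2, 1): 2, (2, 2): 2,
}
op = lambda a, b: M[(a, b)]
A = [0, 1, 2]

# Brute-force checks of two simple identities.
commutative = all(op(a, b) == op(b, a) for a, b in product(A, repeat=2))
associative = all(op(op(a, b), c) == op(a, op(b, c))
                  for a, b, c in product(A, repeat=3))
print(commutative, associative)  # prints "False False"
```

For instance 1 · 2 = 1 but 2 · 1 = 2, so commutativity already fails, and (2 · 1) · 1 = 2 while 2 · (1 · 1) = 0.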

P. Perkins also constructed a six-element semigroup whose identities are not


finitely based in [86]. An example of a finite non-associative ring whose iden-
tities are not finitely based was constructed by S. V. Polin in [93].

Although the set of all identities of an algebra A may not be finitely axiom-
atizable, we can show that the set of all identities which use only a finite
number of variables is finitely based.

Theorem 6.7.1 Let A be a finite algebra of type τ, and let Xm be a finite
set of variables. Then Wτ(Xm)^2 ∩ IdA is finitely based.

Proof: We set θ := Wτ(Xm)^2 ∩ IdA. Then θ is a congruence of the absolutely
free algebra Fτ(Xm) which defines the relatively free algebra F_{V(A)}(Xm).
Since A is finite, and since a pair of terms (p, q) is in θ iff the induced term
operations satisfy p^A = q^A, there are only finitely many equivalence classes
of θ. We choose one representative from each equivalence class, and form the
set Q = {q_1, . . . , q_n} of representatives. Using this set we define the following
finite set Σ of identities:

Σ = {x ≈ y | x, y ∈ Xm and (x, y) ∈ θ}
  ∪ {q_i ≈ x | x ∈ Xm, q_i ∈ Q and (x, q_i) ∈ θ}
  ∪ {f_i(q_{i_1}, . . . , q_{i_{n_i}}) ≈ q_{i_{n_i+1}} | q_j ∈ Q and (f_i(q_{i_1}, . . . , q_{i_{n_i}}), q_{i_{n_i+1}}) ∈ θ}.

Clearly Σ is finite and is contained in IdA, so it suffices to prove that Σ
is a basis for IdA. By induction on the number of operation symbols occur-
ring in a term p, it can be shown that

if (p, q_i) ∈ θ, then p ≈ q_i is derivable from Σ.

It follows from this that for arbitrary terms p and q,

if (p, q) ∈ θ, then p ≈ q is derivable from Σ.
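For a concrete finite algebra the finitely many θ-classes can be enumerated directly, since two terms lie in the same class exactly when they induce the same term operation on A. The following Python sketch (our own illustration) computes all m-ary term operations of a finite algebra by closing the projections under the basic operations; for the two-element semilattice and m = 2 there are exactly three classes, represented by x, y and xy:

```python
from itertools import product

def term_operations(universe, ops, arity):
    """All `arity`-ary term operations of a finite algebra, represented by
    their value tuples on every input, found by closing the projections
    under the basic operations `ops` (pairs of function and arity)."""
    inputs = list(product(universe, repeat=arity))
    # Start from the projection operations x_1, ..., x_arity.
    found = {tuple(args[i] for args in inputs) for i in range(arity)}
    changed = True
    while changed:
        changed = False
        for f, k in ops:
            for choice in product(list(found), repeat=k):
                table = tuple(f(*(c[j] for c in choice)) for j in range(len(inputs)))
                if table not in found:
                    found.add(table)
                    changed = True
    return found

# The two-element semilattice ({0, 1}, min) has exactly three binary term
# operations: the two projections and the meet itself.
ops = [(lambda a, b: min(a, b), 2)]
print(len(term_operations([0, 1], ops, 2)))  # prints 3
```

The loop terminates because there are only finitely many value tuples over a finite universe, which is precisely the finiteness used in the proof above.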

We conclude this section by stating without proof some important theorems


in the area of finite axiomatizability. These theorems involve some proper-
ties of a variety, usually properties of the subdirectly irreducible elements
and of the congruence lattices of the algebras in the variety. Properties of
the congruence lattices of a variety are often definable by what are called
Mal’cev-type conditions, and will be studied in more detail in Chapter 9.
We define here only the two properties we need to state our theorems.

Definition 6.7.2 A variety V is called congruence distributive if for every


algebra A in V , the congruence lattice ConA satisfies the distributive law.
V is called congruence meet-semidistributive if for every algebra A in V , the
congruence lattice ConA satisfies the following meet-semidistributive law:

θ ∧ ψ = θ ∧ ϕ =⇒ θ ∧ ψ = θ ∧ (ψ ∨ ϕ).

Note that congruence distributive varieties are also congruence meet-


semidistributive.

We also need the concept of the residual bound of a variety or an algebra.

Definition 6.7.3 Let V be a variety. We denote by κ(V ) the least cardinal


number λ such that every subdirectly irreducible algebra in V has cardinality
less than λ, if there is such a cardinal number; in this case we say that V is
residually small. If no such cardinal number exists, we let κ(V ) = ∞, and V
is said to be residually large. The cardinal number κ(V ) is called the residual
bound of the variety V . The residual bound of an algebra A is defined to be
the residual bound of the variety V (A) generated by A. A variety V is called
residually finite if all its subdirectly irreducible algebras are finite.

An important result in this area is Baker’s Theorem.

Theorem 6.7.4 (Baker’s Theorem)([5]) A congruence distributive variety


of finite type which is residually finite is finitely based.

R. McKenzie proved in [77] that a locally finite variety V having only finitely
many subdirectly irreducible elements and having an additional property
called definable principal congruences is finitely axiomatizable. McKenzie
also proved in [81] that there are only countably many values possible for

the residual bound of a finite algebra. This residual bound must be either
∞ or one of the following cardinals:

0, 3, 4, . . . , ω, ω_1, (2^ω)^+,

where ω is the cardinal number of the set of natural numbers (that is, the
first infinite cardinal), ω_1 = ω^+ is the next largest cardinal number after
ω, and (2^ω)^+ is the successor cardinal of the cardinal of the continuum.
Recently R. Willard proved the following important theorem about finite
axiomatizability, which generalizes Baker's Theorem.

Theorem 6.7.5 ([119]) If a variety is both congruence meet-semidistributive
and residually finite, then it is finitely axiomatizable.

6.8 Exercises
6.8.1. Verify that the pair (Id, Mod) forms a Galois-connection between the
sets Alg(τ) and Wτ(X)^2.

6.8.2. Prove that the set of all fully invariant congruence relations Con_fi A
of an algebra A forms a sublattice of the lattice ConA of all congruence
relations on A.

6.8.3. Determine all elements of FRB({x, y}), where RB = Mod{x(yz) ≈
(xy)z, xyz ≈ xz, x^2 ≈ x} is the type (2) variety of all rectangular bands.

6.8.4. Let L be the variety of all lattices. Determine all elements of FL({x})
and of FL({x, y}).

6.8.5. Prove that IdHSP(A) = IdA.

6.8.6. Show that ISP(K) is the smallest class containing K and closed under
I, S and P.

6.8.7. Let V be a variety and let Y and Z be non-empty sets with |Y | ≤ |Z|.
Show that FV (Y ) can be embedded in FV (Z) in a natural way.

6.8.8. Prove that in the variety of all algebras of type τ = (3) defined by the
identities

f(x, x, z) ≈ f(x, z, x) ≈ f(z, x, x) ≈ z,

f(f(x1, y1, z1), f(x2, y2, z2), f(x3, y3, z3)) ≈
f(f(x1, x2, x3), f(y1, y2, y3), f(z1, z2, z3)),

all algebras are free.

6.8.9. Using the normal form for semigroup terms from Example 6.4.5, de-
scribe the free semilattice on the set Xn of n generators.
Chapter 7

Term Rewriting Systems

When a relation θ is a congruence relation on an algebra A of type τ , we can


form the quotient algebra A/θ. The Homomorphic Image Theorem tells
us that any homomorphic image of an algebra A is isomorphic to such a
quotient algebra of A. Another important use of the quotient algebra was
seen in Section 6.4: taking A to be the absolutely free algebra Fτ (X) of type
τ and θ to be the set IdK of identities of a class K of algebras of type τ , we
formed the quotient algebra Fτ (X)/IdK, called the relatively free algebra
with respect to K over the set X.

Quotient algebras and relatively free algebras are examples of a more gen-
eral construction. Given any set A and any equivalence relation on A, we
can form the quotient set A/θ of all the equivalence classes with respect
to θ. Elements of A/θ are classes or sets of equivalent elements of A, but
calculations on such classes are always done by choosing a representative ele-
ment from each class, and calculating with these representatives. This means
that it is important to be able to check whether two elements belong to the
same equivalence class. Given any two elements a and b of A, we must check
whether the pair (a, b) is in our original equivalence relation.

In this chapter we consider this basic problem from a constructive point of


view. In general, a problem is said to be effectively solvable if there is an
algorithm that provides the answer in a finite number of steps, no matter
what the particular inputs are. We want the maximum number of steps the
algorithm will take to be predictable in advance. An effective solution to a
problem that has a “yes” or “no” answer is called a decision procedure, and


a problem which has a decision procedure is said to be decidable.

Thus for an equivalence relation θ on a set A we may ask whether it is decid-


able for any two elements a, b ∈ A if (a, b) ∈ θ is true or not. In the particular
case that A = Wτ (X) and θ = IdK for a variety K of algebras of type τ , our
question becomes: given two terms s and t of type τ , is it decidable whether
s ≈ t is an identity in K or not? This problem is also called the word problem
for the variety K or for the fully invariant congruence IdK. Only in cases
where the word problem is decidable can computations be carried out using
representatives. This shows the fundamental significance of the word prob-
lem. In the more general setting, for an arbitrary equivalence relation on a
set A, the equivalence problem has been shown to be undecidable (see M.
Davis, [15]). In this chapter we will examine methods for deciding the word
problem of a variety.

Since every equivalence or congruence is a binary relation, we begin with a


description in Section 7.1 of the equivalence relation generated by an arbi-
trary binary relation. Then we shift from the algebraic model of a set ρ of
ordered pairs on a set A to a more machine-oriented model: we think of a
pair (a, b) in ρ as a rule a → b which says that we can transform or reduce
a to b. Writing → for the relation ρ, we consider reduction systems (A; →).
Sections 7.1 and 7.2 study the properties of termination and confluence for
such systems. Are infinite sequences of reductions based on → possible? Is
it possible to reduce an element a by more than one sequence of reductions,
and if so do such reductions lead to equivalent results? Section 7.3 examines
the special case that our base set A is the set Wτ (X) of all terms of a fixed
type, in which case a reduction system is called a term rewriting system.
The important problem of testing for termination of a reduction system is
considered in Section 7.4.

7.1 Confluence
We shall be interested in testing for equivalence of elements with respect to
an equivalence relation on a set A. Any equivalence relation on A is a binary
relation on A, and any binary relation generates an equivalence relation. To
describe the equivalence relation generated by a relation ρ on A, we need
the following notation:

ρ^R := ρ ∪ ∆A is the reflexive closure of ρ,

ρ^S := ρ ∪ ρ^{-1} is the symmetric closure of ρ,

ρ^(0) := ∆A, ρ^(i) := ρ ◦ ρ^(i-1),
  where ◦ is the relational product defined by
  ρ1 ◦ ρ2 := {(x, y) | ∃z ∈ A ((x, z) ∈ ρ2 and (z, y) ∈ ρ1)},

ρ^T := ∪_{i≥1} ρ^(i) is the transitive closure of ρ,

ρ^RT := ∪_{i≥0} ρ^(i) is the reflexive and transitive closure of ρ,

ρ^SRT := ρ^RT ∪ (ρ^RT)^{-1} is the symmetric, reflexive, transitive
closure of ρ.

We shall be particularly interested in the reflexive, transitive closure ρ^RT of
a relation ρ. Note that the symmetric, reflexive, transitive closure of ρ is just
the reflexive, transitive closure of ρ^S; we leave it as an exercise (see Exercise
7.5.1) to verify that

ρ^SRT = (ρ ∪ ρ^{-1} ∪ ∆A)^T.
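On a finite base set these closures can be computed directly from the definitions. The following Python sketch (our own illustration) builds ρ^RT by iterating the relational product until nothing new is added:

```python
def compose(r1, r2):
    """Relational product: r1 ◦ r2 = {(x, y) | (x, z) ∈ r2 and (z, y) ∈ r1}."""
    return {(x, y) for (x, z) in r2 for (w, y) in r1 if w == z}

def rt_closure(rho, A):
    """Reflexive-transitive closure ρ^RT on a finite set A, accumulating the
    powers ρ^(i) until the union stabilises."""
    closure = {(a, a) for a in A}  # ρ^(0) is the diagonal ∆A
    while True:
        bigger = closure | compose(rho, closure)
        if bigger == closure:
            return closure
        closure = bigger

diagonal = {(a, a) for a in {1, 2, 3, 4}}
print(sorted(rt_closure({(1, 2), (2, 3)}, {1, 2, 3, 4}) - diagonal))
# prints [(1, 2), (1, 3), (2, 3)]
```

The loop must terminate because the closure grows inside the finite set A × A.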

Lemma 7.1.1 For any two binary relations ρ1 and ρ2 on a set A, we have

ρ2 = ∆A ∪ ρ1 ◦ ρ2 ⇒ ρ1^RT ⊆ ρ2.

Proof: Let ρ2 = ∆A ∪ ρ1 ◦ ρ2. Then by definition both ρ1^(0) = ∆A ⊆ ρ2 and
ρ1 ◦ ρ2 ⊆ ρ2. Inductively, if ρ1^(n-1) ⊆ ρ2 then ρ1^(n) = ρ1 ◦ ρ1^(n-1) ⊆ ρ1 ◦ ρ2 ⊆ ρ2.
Thus ρ1^(n) ⊆ ρ2 for all natural numbers n, and we have

ρ1^RT = ∪_{i≥0} ρ1^(i) ⊆ ρ2.

Lemma 7.1.2 For any two binary relations ρ1 and ρ2 on a set A,

(ρ1 ∪ ρ2)^RT = ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT.

Proof: We show first the inclusion (ρ1 ∪ ρ2)^RT ⊆ ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT. By
Lemma 7.1.1, it suffices to show that ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT = ∆A ∪ (ρ1 ∪ ρ2) ◦
(ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT). The following calculation shows that this equation is
satisfied:

ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT
= (ρ2 ◦ ρ1^RT)^RT ∪ ρ1 ◦ ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT
= ∆A ∪ ρ2 ◦ ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT ∪ ρ1 ◦ ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT
= ∆A ∪ (ρ1 ∪ ρ2) ◦ (ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT).

To show the opposite inclusion, ρ1^RT ◦ (ρ2 ◦ ρ1^RT)^RT ⊆ (ρ1 ∪ ρ2)^RT, we have to
show that ρ1^(m) ⊆ (ρ1 ∪ ρ2)^RT and ρ1^(m) ◦ (ρ2 ◦ ρ1^(l1)) ◦ · · · ◦ (ρ2 ◦ ρ1^(ln)) ⊆ (ρ1 ∪ ρ2)^RT
for all m, l1, . . . , ln ∈ N and n ≥ 1. But this follows from the definition of
(ρ1 ∪ ρ2)^RT.

From Lemma 7.1.2 we have the inclusion ρ2^RT ◦ ρ1^RT ⊆ (ρ1 ∪ ρ2)^RT for any
two binary relations ρ1 and ρ2. We look now for conditions under which we
get equality here. First, we observe that the map RT taking any relation ρ
to its reflexive transitive closure ρ^RT is a closure operator. In particular, we
have

ρ1 ⊆ ρ2 ⇒ ρ1^RT ⊆ ρ2^RT and (ρ1^RT)^RT = ρ1^RT.

Also, transitivity of ρ1^RT means that ρ1^RT ◦ ρ1^RT = ρ1^RT. This shows us that

ρ1 ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT ⇒ (ρ1 ◦ ρ2^RT)^(n) ⊆ ρ2^RT ◦ ρ1^RT,

for all n ∈ N (see Exercise 7.5.2). We have now proved the following result.

Lemma 7.1.3 For any two binary relations ρ1 and ρ2 on a set A,

ρ1 ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT ⇒ (ρ1 ◦ ρ2^RT)^RT ⊆ ρ2^RT ◦ ρ1^RT.

Using this we prove the following equivalences.

Proposition 7.1.4 For any two binary relations ρ1 and ρ2 on a set A the
following conditions are equivalent:

(i) ρ1 ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT;

(ii) ρ1^RT ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT; and

(iii) (ρ1 ∪ ρ2)^RT ⊆ ρ2^RT ◦ ρ1^RT.

Proof: (i) ⇒ (ii): Clearly, ρ1^(0) ◦ ρ2^RT = ∆A ◦ ρ2^RT = ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT. In-
ductively, if for some n ≥ 0 we have ρ1^(n-1) ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT, then also
ρ1^(n) ◦ ρ2^RT = ρ1 ◦ (ρ1^(n-1) ◦ ρ2^RT) ⊆ (ρ1 ◦ ρ2^RT) ◦ ρ1^RT ⊆ ρ2^RT ◦ (ρ1^RT ◦ ρ1^RT) =
ρ2^RT ◦ ρ1^RT. This means that for all n ≥ 0, we have ρ1^(n) ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT.
From this we have ρ1^RT ◦ ρ2^RT = ∪_{i≥0} ρ1^(i) ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT.

(ii) ⇒ (i): By definition of ρ1^RT we have ρ1 ◦ ρ2^RT ⊆ ρ1^RT ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT.

(i) ⇒ (iii): If (i) is satisfied then (ρ1 ◦ ρ2^RT)^RT ⊆ ρ2^RT ◦ ρ1^RT by Lemma 7.1.3.
Now by Lemma 7.1.2, (ρ2 ∪ ρ1)^RT = ρ2^RT ◦ (ρ1 ◦ ρ2^RT)^RT ⊆ ρ2^RT ◦ (ρ2^RT ◦ ρ1^RT)
= ρ2^RT ◦ ρ1^RT. The fact that ρ1 ∪ ρ2 = ρ2 ∪ ρ1 then gives the claim.

(iii) ⇒ (i): Using ρ1 ◦ ρ2^RT ⊆ ρ2^RT ◦ (ρ1 ◦ ρ2^RT)^RT we get ρ1 ◦ ρ2^RT ⊆ (ρ1 ∪ ρ2)^RT
by Lemma 7.1.2. This implies ρ1 ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT, by (iii).

Our aim is to obtain a decision procedure for the equivalence relation ρ^SRT
generated by a binary relation ρ on a base set A. We shall see that this will
require some finiteness restrictions. We introduce now a change of notation:
instead of the algebraic notation ρ for a binary relation on a set A, we will
use the notation → more commonly used in Computer Science. The state-
ment (a, b) ∈ ρ becomes a → b. For the equivalence relation generated by
→, we first form the inverse relation ← := →^{-1} and the symmetric closure
←→, defined as → ∪ ←; then we use the reflexive transitive closure ←→^RT of
this.

We think of a statement a → b, corresponding to a pair (a, b) in our relation,
as a rule which allows us to transform a into b. Usually in practice these rules
have the property that the element b is somehow simpler (by some measure
of complexity) than a, and we think of the rule as a reduction rule. The next
definition sets out some basic properties of such reduction relations →.

Definition 7.1.5 Let → be a binary relation on A. If there is an element y
such that x → y, then x is said to be reducible and y is called a reduct of x,
with respect to →. If no such y exists, then x is said to be irreducible or in
normal form with respect to →.

An element a ∈ A has a terminating reduction if there are elements
x0, . . . , xm of A such that a = x0, xi → xi+1 for i = 0, . . . , m − 1, and
xm is irreducible. In this case we write

a = x0 → x1 → · · · → xm ↓.

We say that an element a of A has a non-terminating reduction if there is
an infinite sequence (xn : n ∈ N) in A such that a = x0 and xn → xn+1 for
all n ∈ N. We write such a reduction as

a = x0 → x1 → · · · → xs → · · · .

The relation → is called noetherian or terminating if there are no infinite
sequences of the form x1 → x2 → x3 → · · ·.

It is easy to see that if the relation → contains any pairs (a, a), or any pairs
(a, b) and (b, a), for elements a and b of A, non-terminating reductions will
be possible, and so → will not be terminating. As we will see in Section
7.3, we usually start with relations → which are both irreflexive and anti-
symmetric, to avoid this problem, and later form the reflexive, symmetric,
transitive closure of the relation.
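For a finite terminating relation the set of normal forms reachable from a given element can be found by exhaustive search. The following Python sketch (our own illustration; it assumes the relation is terminating, since otherwise the recursion need not halt) computes this set:

```python
def reducts(rel, x):
    """One-step reducts of x under the relation rel (a finite set of pairs)."""
    return [b for (a, b) in rel if a == x]

def normal_forms(rel, x):
    """All normal forms reachable from x.  Assumes rel is terminating;
    on a non-terminating relation this recursion need not halt."""
    rs = reducts(rel, x)
    if not rs:
        return {x}  # x is irreducible, i.e. in normal form
    out = set()
    for y in rs:
        out |= normal_forms(rel, y)
    return out

# a → b, a → c, b → d, c → d: every reduction starting from a ends in d.
rel = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
print(normal_forms(rel, "a"))  # prints {'d'}
```

When this set has more than one element, different reduction sequences from the same starting point end in different irreducible results.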

Now suppose that we have a noetherian relation →, with the equivalence
relation ←→^RT it generates. We would like to be able to decide whether two
elements a and b from A are equivalent under ←→^RT. One possibility might be
to reduce a and b as much as possible, to normal forms c and d, respectively,
and then to test whether these normal forms are equal. If they are equal,
then we have a −→^RT c = d ←−^RT b and we can conclude that a ←→^RT b. But if
c ≠ d, no conclusion as to whether a ←→^RT b can be reached, since in general
it is impossible to infer c = d from a ←→^RT b and a −→^RT c, b −→^RT d.

Example 7.1.6 Let A = {a, b, c, d} and let → be the binary relation on A
given by a → b, b → a, a → c and b → d, pictured below:

a ←→ b
↓    ↓
c    d

Here c and d are in normal form, and we have a ←→^RT b, a −→^RT c and b −→^RT d,
but c ≠ d.

In general then we want to start with an element a in our base set, and
reduce it to some normal form c. However, it is possible, in an arbitrary
relation →, that there may be many different reductions starting from a,
which may or may not converge to the same result. In particular, an element
a may have no normal form, if the reduction relation → is not terminating,
or it may have more than one normal form reached by different reductions
from a. Some kinds of restrictions used to ensure a unique normal form for
every element are described in the next definition.

Definition 7.1.7 A relation → on a set A has the Church-Rosser property
if for all x, y ∈ A, if x ←→^RT y then there is an element z ∈ A such that
x −→^RT z ←−^RT y. We write x ↓RT y to indicate this relationship between x
and y. Thus the Church-Rosser property may be expressed as the fact that
←→^RT ⊆ ↓RT. Note also that ↓RT = −→^RT ◦ ←−^RT. We will also write x ↑RT y to
mean that there is an element u ∈ A such that x ←−^RT u −→^RT y.

A relation → on A is called confluent if for all x, y ∈ A,

if x ↑RT y then x ↓RT y.

The property of confluence means that if we can go from an element u to
two different elements x and y by −→^RT, then we can also go from x and y to
some common point z. This can be illustrated by the following picture:

          u
      RT ↙  ↘ RT
       x      y
      RT ↘  ↙ RT
          z

Lemma 7.1.8 Let → be a relation on a set A. Then → has the Church-
Rosser property iff for all a, b ∈ A we have a ←→^RT b iff there is an element z
such that a −→^RT z ←−^RT b; that is, iff ←→^RT = ↓RT.

Proof: As we remarked in Definition 7.1.7, the Church-Rosser property
means precisely that ←→^RT ⊆ ↓RT. Thus equality of these two relations guar-
antees that → is Church-Rosser, and it is enough to show that any Church-
Rosser relation satisfies ↓RT ⊆ ←→^RT. By Lemma 7.1.2 we have ←→^RT = −→^RT
◦ (← ◦ −→^RT)^RT. Since ←−^RT ⊆ (← ◦ −→^RT)^RT, it follows that −→^RT ◦ ←−^RT ⊆
←→^RT, and so ↓RT ⊆ ←→^RT.

The central role of the Church-Rosser property is shown by the following
theorem.

Theorem 7.1.9 Let → be a relation on A which has the Church-Rosser
property. Let a, b, c and d be elements of A such that a −→^RT c and b −→^RT d,
where c and d are in normal form. Then a ←→^RT b if and only if c = d.

Proof: If c = d then by Lemma 7.1.8 we have a ←→^RT b. Conversely, suppose
that a ←→^RT b. Since a −→^RT c and b −→^RT d, and ←→^RT is the equivalence relation
generated by −→, we also have c ←→^RT d. Using the Church-Rosser property
we have an element e ∈ A such that c −→^RT e ←−^RT d. But c and d are in normal
form, and thus c = e = d.

Theorem 7.1.9 can be used to decide whether any pair is in the equivalence
relation ←→^RT. But in order to use this test we have to know that → has
the Church-Rosser property, and in general it is a hard problem to show
that a relation has this property. In the next sections some lemmas are
derived which allow us to reduce the "global" problem of proving the Church-
Rosser property to a "local" problem. But first we prove that having the
Church-Rosser property is equivalent to being confluent, a result known as
the "Church-Rosser Theorem."
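When a reduction system is known to be terminating and Church-Rosser, the test described above becomes a simple algorithm: reduce both elements to their (unique) normal forms and compare. A Python sketch of this decision procedure (our own illustration, with the rules given as a finite list of pairs):

```python
def normal_form(rules, x):
    """Reduce x by repeatedly applying the first applicable rule.  Assumes the
    system is terminating and confluent, so the result does not depend on the
    choice of rule."""
    while True:
        nxt = [b for (a, b) in rules if a == x]
        if not nxt:
            return x
        x = nxt[0]

def equivalent(rules, a, b):
    """The test of Theorem 7.1.9: under the Church-Rosser property, a and b
    are related by ←→^RT iff their normal forms coincide."""
    return normal_form(rules, a) == normal_form(rules, b)

# A confluent and terminating system: 2 → 1, 3 → 1, 4 → 3.
rules = [(2, 1), (3, 1), (4, 3)]
print(equivalent(rules, 2, 4), equivalent(rules, 1, 5))  # prints "True False"
```

Without confluence the comparison of normal forms can only confirm equivalence, never refute it, as Example 7.1.6 shows.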

Theorem 7.1.10 (Church-Rosser Theorem) A relation → defined on a set


A has the Church-Rosser property if and only if it is confluent.
Proof: First, suppose that the relation → has the Church-Rosser property.
Consider elements x, y and z in A such that x ←−^RT z −→^RT y. This implies
that x ←→^RT y; so by the Church-Rosser property we have x ↓RT y. Thus →
is confluent.

Conversely, if → is a confluent relation, then we have to show that x ↓RT y
for arbitrary elements x, y ∈ A with x ←→^RT y. By Lemma 7.1.2 we have ←→^RT
= −→^RT ◦ ∪{(← ◦ −→^RT)^n | n ∈ N}. We will show by induction on n that
(← ◦ −→^RT)^n ⊆ −→^RT ◦ ←−^RT for all n ∈ N. This is clearly satisfied for n = 0. If
(← ◦ −→^RT)^(n-1) ⊆ −→^RT ◦ ←−^RT, then

(← ◦ −→^RT)^n = ← ◦ −→^RT ◦ (← ◦ −→^RT)^(n-1)
 ⊆ ← ◦ −→^RT ◦ −→^RT ◦ ←−^RT   (by induction hypothesis)
 = ← ◦ −→^RT ◦ ←−^RT   (since −→^RT ◦ −→^RT = −→^RT)
 ⊆ −→^RT ◦ ←−^RT ◦ ←−^RT   (by confluence)
 = −→^RT ◦ ←−^RT   (since ←−^RT ◦ ←−^RT = ←−^RT).

Therefore, we get (← ◦ −→^RT)^n ⊆ −→^RT ◦ ←−^RT for all n ∈ N, and so
∪{(← ◦ −→^RT)^n | n ∈ N} ⊆ −→^RT ◦ ←−^RT. Consequently ←→^RT ⊆ −→^RT ◦ ←−^RT,
which means that → has the Church-Rosser property.

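For a finite reduction system, both properties in the Church-Rosser Theorem can be checked mechanically by computing closures of the relation. The following sketch (in Python, with helper names of our own, not from the text) tests confluence and the Church-Rosser property on a small diamond-shaped example and confirms that the two coincide.

```python
from itertools import product

def rt_closure(rel, elems):
    """Reflexive-transitive closure of a finite relation (set of pairs)."""
    closure = {(a, a) for a in elems} | set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def equiv_closure(rel, elems):
    """Smallest equivalence relation containing rel (closure of rel ∪ rel⁻¹)."""
    sym = set(rel) | {(b, a) for (a, b) in rel}
    return rt_closure(sym, elems)

def joinable(a, b, rt):
    # a ↓ b: some z with a →^RT z and b →^RT z
    return any((a, z) in rt and (b, z) in rt for (_, z) in rt)

def confluent(rel, elems):
    rt = rt_closure(rel, elems)
    return all(joinable(x, y, rt)
               for z in elems for x in elems for y in elems
               if (z, x) in rt and (z, y) in rt)

def church_rosser(rel, elems):
    rt = rt_closure(rel, elems)
    eq = equiv_closure(rel, elems)
    return all(joinable(x, y, rt) for (x, y) in eq)

elems = {1, 2, 3, 4}
rel = {(1, 2), (1, 3), (2, 4), (3, 4)}   # a diamond: 1 → 2, 3 and 2, 3 → 4
assert confluent(rel, elems) == church_rosser(rel, elems) == True
```

For a confluent relation, as here, the two checks always agree; replacing `rel` by a non-confluent relation makes both return False.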
7.2 Reduction Systems
A pair (A; →) consisting of a set A and a binary relation → on A is called
a reduction system. In Definition 7.1.5 we introduced the concept of a ter-
minating reduction. The most useful reduction systems are those which are
both confluent and terminating, and in this section we characterize such
special systems.

Definition 7.2.1 A reduction system which is both confluent and terminating
is called complete.
If (A; →) is a confluent reduction system, each element of A has at most one
normal form. But when (A; →) is terminating, each element of A has at least
one normal form. Therefore, in a complete reduction system every element
has exactly one normal form, which we will call the canonical normal form
of the element. It is this property that makes complete reduction systems so
useful. In order to characterize such systems, we will need the property of
local confluence.

Definition 7.2.2 A reduction system (A; →) is called locally confluent if
for any two elements x, y ∈ A, whenever there exists an element u of A such
that u → x and u → y, then we have x ↓RT y.
Local confluence thus means that whenever there is an element u which we
can reduce in one step to both x and y, then we can go from each of x and
y in one or more steps to a common element z. This situation is illustrated
by the following picture.

          u
        ↙   ↘
       x     y
    RT ↘   ↙ RT
          z

Clearly, every confluent reduction system is locally confluent. The converse
is not true in general, and the following picture shows a relation which is
locally confluent but not confluent.

    a ←− b ⇄ c −→ d
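This counterexample can be verified mechanically. The sketch below (Python; the helper names are ours, not the book's) checks that the four-element relation a ← b ⇄ c → d is locally confluent, yet b reduces to the two distinct normal forms a and d, so the system is not confluent.

```python
from itertools import product

REL = {("b", "a"), ("b", "c"), ("c", "b"), ("c", "d")}
ELEMS = {"a", "b", "c", "d"}

def rt_closure(rel, elems):
    # reflexive-transitive closure by repeated composition
    closure = {(x, x) for x in elems} | set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

RT = rt_closure(REL, ELEMS)
joinable = lambda x, y: any((x, z) in RT and (y, z) in RT for z in ELEMS)

# locally confluent: every one-step fork u → x, u → y is joinable
assert all(joinable(x, y)
           for (u, x) in REL for (v, y) in REL if u == v)

# not confluent: b reduces to the distinct normal forms a and d
assert ("b", "a") in RT and ("b", "d") in RT and not joinable("a", "d")
```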


For the proof of our next result we need the following Principle of Noetherian
Induction.

Theorem 7.2.3 Let → be a noetherian relation on A. Let P be a property,
and let P(a) be the statement that the property P holds for the element a.
Assume that → satisfies the following condition: if P(x) holds for all x with
a → x, then P(a) holds. Then P holds for all a in A.
Proof: Let us denote by A_P the set of all elements in A for which P holds.
By definition A_P ⊆ A. Suppose that A_P ⊊ A. Then there are elements in A
for which P is not satisfied. Consider B = A \ A_P. Since → is noetherian,
there are elements m in B such that no element x of A with m → x belongs
to B, for otherwise we would have infinite chains in B. Therefore every
element x with m → x belongs to A_P and satisfies P. But then by assumption
we have P(m), a contradiction. □

Theorem 7.2.4 A terminating reduction system is confluent if and only if
it is locally confluent.
Proof: We know that any confluent reduction system is locally confluent.
For the opposite direction, suppose that (A; →) is a locally confluent and
terminating reduction system. We consider the following property P:

P(a) iff ∀x, y ∈ A (a −→^RT x and a −→^RT y ⇒ x ↓RT y).

We verify that

∀a ∈ A(∀x ∈ A(a → x ⇒ P(x)) ⇒ P(a))

is satisfied. Consider an arbitrary element a of A. We can assume that a is
reducible, since otherwise P(a) is trivially satisfied. To show that P holds
for a, we prove that for all x, y ∈ A, if a −→^RT x and a −→^RT y then we have
x ↓RT y. The reducibility of a means that there are elements b, c ∈ A such
that a → b −→^RT x and a → c −→^RT y. Then by the local confluence there is
an element d ∈ A such that b −→^RT d and c −→^RT d.

By our induction hypothesis, P holds for b and c. Applying P(b) to x and d,
there is an element z such that x −→^RT z and d −→^RT z. Now we have
c −→^RT z and c −→^RT y. By the induction hypothesis applied to c, for the
pair y and z, we also get an element z′ such that y −→^RT z′ and z −→^RT z′.
But then for x and y there is a common element, namely z′ ∈ A, such that
x −→^RT z′ and y −→^RT z′. This means that P(a) is satisfied. □

Since a complete reduction system is one which is both confluent and termi-
nating, we can rephrase Theorem 7.2.4 as follows.

Corollary 7.2.5 A reduction system is complete if and only if it is locally
confluent and terminating.

Reduction systems will be used to help us solve the word problem. Given
an equivalence relation ∼ on A, we have to find a complete reduction system
which generates the relation ∼. Of course, different reduction systems can
generate the same equivalence relation; such reduction systems are called
equivalent. Thus, among all the equivalent reduction systems which generate
our equivalence relation, we want to find a complete one.

Definition 7.2.6 Two reduction systems (A; →) and (A; ⇒) are called
equivalent if ←→^RT = ⇐⇒^RT. A complete reduction system which is equivalent
to (A; →) is called a completion of (A; →).

As a first step in finding a completion of a given reduction system, we must
show that every reduction system does indeed have a completion. To see this,
consider an arbitrary reduction system (A; →). Let ←→^RT be the equivalence
relation generated by →, and let A/←→^RT be the quotient set with respect
to ←→^RT. A choice function Φ : A/←→^RT → A is a function which selects
exactly one element from each equivalence class with respect to ←→^RT. (By
the axiom of choice such a mapping always exists.) The choice function Φ
induces a mapping S : A → A for which S(x) ←→^RT x, and S(x) = S(x′) for all
x, x′ ∈ A whenever x ←→^RT x′. (Essentially, S is defined by assigning the
value chosen by Φ to every element of each equivalence class, so that the
classes are viewed not so much as new objects but as subsets of A, all of
whose elements take the same value.)

Then we define a new relation ⇒ on A, by

x ⇒ y iff x ≠ y and y = S(x),

for all x, y ∈ A. Clearly ⇐⇒^RT = ←→^RT, so that the reduction systems
(A; →) and (A; ⇒) are equivalent. Moreover, if x ⇐⇒^RT y then there is an
element z with x =⇒^RT z ⇐=^RT y, namely z = S(x) if x ≠ y and x = z = y
otherwise. Therefore ⇒ has the Church-Rosser property and is confluent by
Theorem 7.1.10. The relation ⇒ is also terminating, and therefore complete.
This shows that any reduction system does have a completion.
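For a finite base set this construction can be carried out directly. The sketch below (Python; the choice of representative and all names are ours) computes the equivalence classes of ←→^RT, lets the choice function Φ pick one element of each class, and checks that the induced system ⇒ is complete: every element reaches its unique normal form S(x) in at most one step.

```python
def equiv_classes(rel, elems):
    # classes of the equivalence relation generated by rel,
    # found via undirected reachability
    adj = {x: set() for x in elems}
    for a, b in rel:
        adj[a].add(b)
        adj[b].add(a)
    classes, seen = [], set()
    for x in elems:
        if x in seen:
            continue
        comp, stack = set(), [x]
        while stack:
            y = stack.pop()
            if y not in comp:
                comp.add(y)
                stack.extend(adj[y])
        classes.append(comp)
        seen |= comp
    return classes

elems = {1, 2, 3, 4, 5}
rel = {(1, 2), (2, 3), (4, 5)}            # classes {1, 2, 3} and {4, 5}

S = {}                                    # the induced map S : A → A
for cls in equiv_classes(rel, elems):
    rep = min(cls)                        # Φ picks, say, the least element
    for x in cls:
        S[x] = rep

arrow2 = {(x, S[x]) for x in elems if x != S[x]}   # the relation ⇒

# ⇒ is terminating and confluent: each x ≠ S(x) has exactly one
# successor S(x), and the representatives are themselves in normal form
assert all(S[S[x]] == S[x] for x in elems)
assert {x for (x, _) in arrow2}.isdisjoint({y for (_, y) in arrow2})
```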

For a terminating reduction system (A; →), we know by Corollary 7.2.5 that
local confluence is enough to guarantee completeness. To find a completion,
therefore, we can try the following construction, based on the strategy of
locating all cases in which local confluence is violated. We consider the set
of all so-called "critical pairs," that is, pairs (x, y) ∈ A × A for which there
is an element z ∈ A such that z → x and z → y, but no element z′ ∈ A
satisfying x −→^RT z′ and y −→^RT z′. If there are no such pairs, then the
relation → is locally confluent and hence confluent and complete. Otherwise,
if there exists at least one critical pair (x, y) we can try to fix the local
confluence violation by adding either (x, y) or (y, x) to the reduction
relation →. That is, we add either x → y or y → x. However, it is possible
that with this addition the new enlarged relation is no longer terminating.
If both possible additions destroy the termination property, our procedure
stops with "failure."

Otherwise, we are successful in adjoining a critical pair to →, and we get
a new larger relation which still terminates. Then we repeat the process on
this new relation, finding its critical pairs and enlarging again if necessary.
If this procedure terminates, at some relation ⇒, then no critical pairs are
left, and (A; ⇒) is complete. By construction the system (A; ⇒) is then a
completion of (A; →).

The problem with this procedure is that in general there can be infinitely
many critical pairs. So this procedure does not give an algorithm for con-
struction of the completion of a terminating reduction system, and in general
there is no such algorithm.

We conclude this section with an example to illustrate our procedure. As we
remarked earlier, we want to use this procedure in the special case where our
equivalence relation is the fully invariant congruence consisting of the set of
identities of some variety. Our example starts with a term clone as our base
set A.

Example 7.2.7 Taking S to be the type (2) variety of all semigroups, we
consider as our set A the base set of the free semigroup F_S(X₂) over the
two-element alphabet X₂ = {x, y}. As we saw in Example 6.4.5, we usually
omit both the binary operation symbol and brackets from our terms, writing
them as words on the alphabet X₂. Thus our set A consists of all such words
composed from the two letters x and y. We define a relation → on A as
follows. For any two words w and w′, we set w → w′ iff there exist words
w₁, w₂ ∈ W(X₂) such that w = w₁yyxyw₂ and w′ = w₁w₂. Notice that the
relation → is compatible with the semigroup operation.

The reduction system (A; →) is terminating, since any reduction step
decreases the length of the words. But it is not confluent, since we have
the reductions

         yyxyyxy
        ↙       ↘
      yxy        yyx

but it is not possible to reduce yxy and yyx to a common element. Applying
our procedure to the critical pair (yxy, yyx), we check whether we should
add yxy → yyx or yyx → yxy. Since both words have equal length, we
use the lexicographical order induced by x > y to make our decision: since
yxy > yyx we add the rule yxy → yyx. Now we consider the relation →₁
generated by the rules r₀ = yyxy →₁ e and r₁ = yxy →₁ yyx, where e is the
empty word (having the property we ≈ ew ≈ w for all words w). Our new
relation →₁ is defined by w →₁ w′ if there are words w₁, w₂ ∈ A such that
either

(1) w = w₁yyxyw₂ and w′ = w₁w₂, or

(2) w = w₁yxyw₂ and w′ = w₁yyxw₂.

Since our relation is generated as a compatible relation from these rules, we
only have to check all possible "overlappings" of the left hand sides yyxy
and yxy. By construction the reduction system (A; →₁) is terminating, and
we check now that it is equivalent to (A; →). First, since → is a subset of
→₁ we get ←→^RT ⊆ ←→₁^RT. Conversely, if (w, w′) is a critical pair with
respect to →₁ then w and w′ are equivalent modulo ←→^RT. Thus →₁ ⊆ ←→^RT,
and from this it follows that ←→₁^RT ⊆ ←→^RT. Altogether we have equality.

The reduction system (A; →₁) is still not confluent. The overlappings of the
left hand sides of the two rules give new rules. The overlapping of the left
hand side of r₀ with itself was already examined. Consider now r₀ and r₁:

        yyxy              yxyyxy               yyxyxy
       ↙    ↘            ↙      ↘             ↙      ↘
      e      yyyx    yyxyxy      yx         xy        yyxyyx

If we consider overlappings of the left hand side of r₁ with itself we obtain
only

        yxyxy
       ↙     ↘
    yyxxy     yxyyx

We add the following new rules:

r₂ = yyyx →₂ e,      r₃ = yyxyxy →₂ yx,
r₄ = yyxyyx →₂ xy,   r₅ = yxyyx →₂ yyxxy.

Let →₂ be the relation generated by the rules r₀, r₁, r₂, r₃, r₄ and r₅. Then
(A; →₂) is terminating and equivalent to (A; →), but is still not confluent,
since for instance the overlapping of the left hand side of r₀ with the left
hand side of r₃ gives

        yyxyxy
       ↙      ↘
      xy        yx

But if we now add the rule r₆ = xy →₃ yx, it is easy to show that no other
overlappings produce critical pairs. We denote by →₃ the relation generated
by r₀, r₁, r₂, r₃, r₄, r₅ and r₆. Then (A; →₃) is a completion of (A; →). In
fact we can omit the rules r₀, r₁, r₃, r₄ and r₅, since (A; →₃) is equivalent
to the complete reduction system (A; ⇒), where ⇒ is generated by yyyx → e
and xy → yx. Thus (A; ⇒) is also a completion of (A; →).
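The completed system lends itself to a direct implementation. The sketch below (Python; the function names are ours) computes normal forms with respect to ⇒ and confirms that the words of the example all reach the expected canonical forms.

```python
RULES = [("yyyx", ""), ("xy", "yx")]   # the completion ⇒ of Example 7.2.7

def normal_form(word):
    """Rewrite with the leftmost applicable rule until no rule applies.
    This terminates: each step either shortens the word or moves a y
    to the left past an x."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in RULES:
            i = word.find(lhs)
            if i >= 0:
                word = word[:i] + rhs + word[i + len(lhs):]
                changed = True
                break
    return word

# yyxy was the left hand side of the original rule; it reduces to e
assert normal_form("yyxy") == ""
# the divergent reductions of the example now meet in one normal form
assert normal_form("yxy") == normal_form("yyx") == "yyx"
assert normal_form("yyxyyxy") == "yyx"
```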

7.3 Term Rewriting
In the previous section we studied reduction systems (A; →) in general. Now
we turn to the specific case we are interested in, when A is actually the
universe set of the free algebra, or term clone, of a fixed type τ. That is, we
take A to be the set Wτ(X) of all terms of type τ over an alphabet X. We will
need to impose some conditions on the relation → to guarantee termination.
When the relation → meets these conditions, the resulting reduction system
(Wτ(X); →) is called a term reduction system. We usually write pairs (t, t′)
in the relation → as rules t → t′, and refer to them as reduction rules. A set
of such reduction rules is called a term rewriting system for Wτ(X).
As an autonomous research field, term rewriting dates back to the introduction
of the so-called Knuth-Bendix completion procedure ([66]) in 1970. The
aim of the Knuth-Bendix algorithm is to transform a given input, a set of
equations, into a convergent rewriting system: a system for rewriting terms
which is always terminating and which produces a unique result for any term,
called its normal form. In this system, two terms are equal in the input
theory if and only if they have the same normal form.

The motivation for this algorithm lies in the deductive method for equational
theories in Universal Algebra. That is, we want to form our deduction rules
for terms from some identities of an equational theory. There is an important
difference, however, in that identities t ≈ t′ are symmetric, while our
deduction rules t → t′ are one-way only. We usually start with an irreflexive
and anti-symmetric set of rules as our set →, and later form the reflexive,
symmetric and transitive closure of this, as in the previous sections.

As we saw in Section 6.3, there are five rules of deduction for equational
theories, which allow us to deduce new equations from given ones. Of these
rules, the first three correspond to the three basic properties of an
equivalence relation. As before, these are taken care of by taking the closure
←→^RT of →. We will focus now on the other two rules, the substitution rule
and the so-called "replacement" rule which allows us to apply fundamental
operation symbols or (from Theorem 5.2.4) arbitrary terms to equations to
produce new equations. We will call a relation invariant if it is closed under
the replacement rule, rule (4), and fully invariant if it is closed under both
the replacement rule and the substitution rule. The word problem for fully
invariant congruences can thus be transformed into a reduction problem for
fully invariant relations, and the first condition we impose on our relation
→ is that it should be fully invariant.

To express the algebraic properties of fully invariant relations in our new
language of reduction rules, and to describe another necessary condition for
→, we recall some notation from Chapters 5 and 6 which we can use to
describe the effects of the substitution and replacement rules. A substitution
(of type τ) is any map s : X → Wτ(X), and any such substitution has a
unique endomorphic extension ŝ : Wτ(X) → Wτ(X). In Section 5.1 we
assigned to any vertex or node of a term t, regarded as a tree, a sequence of
positive integers called its address. We use the notation t/u for the subtree
of t which starts with the vertex labelled with address u, and t[u/s] for the
tree obtained from t by replacing the subtree t/u by the tree s.

Then a relation → on the free algebra Fτ(X) is fully invariant iff for all
terms t, t′ and t″, for all substitutions s : X → Wτ(X), and all addresses u
in t″,

(1) t → t′ implies t″[u/t] → t″[u/t′], and

(2) t → t′ implies ŝ[t] → ŝ[t′].

It is clear that these two conditions express the deduction rules (4) and (5),
the replacement rule and the substitution rule, respectively. The following
proposition gives a necessary condition for a relation on Wτ (X) to be ter-
minating. We use the notation var(t) for the set of all variables which occur
in a term t.

Proposition 7.3.1 If → is a terminating reduction on Wτ(X), then for all
terms t, t′ ∈ Wτ(X), if t → t′ then var(t′) ⊆ var(t).

Proof: Let → be a terminating reduction on Wτ(X). Suppose that there are
terms t, t′ ∈ Wτ(X) with t → t′ but var(t′) ⊈ var(t). Then there is a variable
z in t′ which does not occur in t. Consider the substitution s : X → Wτ(X)
defined by

s(x) = x if x ≠ z, and s(x) = t if x = z.

Since z does not occur in t it is clear that ŝ[t] = t for the extension ŝ of s.
We set t₀ := ŝ[t] and t₁ := ŝ[t′]. Then from t → t′ and condition (2) for
term reductions we get t₀ → t₁.

Since z ∈ var(t′) there is an address u in t′ with t′/u = z. Therefore for the
subtree t₁/u of t₁ starting with the vertex addressed by u we have

t₁/u = ŝ[t′]/u = ŝ[t′/u] = ŝ[z] = t,

and then t₁ = t₁[u/t]. Now we set t₂ := t₁[u/t₁]. We have t = t₀ → t₁, and
applying condition (1) to this gives t₁ = t₁[u/t] → t₁[u/t₁] = t₂. The term
t₁ is a subterm of t₂ since t₂/u = t₁. Continuing in this way, we produce a
non-terminating reduction

t₀ → t₁ → t₂ → ⋯ .
This contradicts the assumption that → is terminating. □

This necessary condition for a terminating reduction is built into our
definition of a term reduction system, along with the requirement of full
invariance.

Definition 7.3.2 Let τ be a fixed type, and let → be a set of pairs on
Wτ(X). The pair (Wτ(X); →) is called a term reduction system if → is
fully invariant and for all terms t, t′ ∈ Wτ(X), if t → t′ then var(t′) ⊆
var(t). A pair (t, t′) ∈ Wτ(X)², or a rule t → t′, is called a reduction rule if
var(t′) ⊆ var(t). A set of reduction rules is called a term rewriting system.
Notice that the set of rules in a term rewriting system need not be fully
invariant, although the rules satisfy the variable property. This is not a
serious problem, since for any set R of such reduction rules we can always
form the fully invariant closure, which we denote by →R. Example 7.2.7 gave
a term rewriting system for type τ = (2). For arbitrary type τ we can use
some of the ideas developed in that example if we use the concept of an
address u in a term t as a special term of type τ = (2). We define a partial
order relation, called the prefix order, on the set of addresses by

u ≤ v ⇔ v = uw for some address w.

Then we have t →R t′ iff there exists a substitution s : X → Wτ(X), a term
t″ ∈ Wτ(X) and an address u of t″ such that (1) and (2) are satisfied.

Proposition 7.3.3 Let R ⊆ Wτ(X) × Wτ(X) be a term rewriting system,
and let t, t′ ∈ Wτ(X). Then t →R t′ iff there exists a substitution s : X →
Wτ(X), an address u of t and a reduction rule t₁ → t₂ ∈ R such that
t/u = ŝ[t₁] and t′ = t[u/ŝ[t₂]].
Proof: Let t, t′ ∈ Wτ(X) be terms which satisfy the condition. If →R is the
term reduction system generated by R, then t₁ → t₂ ∈ R implies ŝ[t₁] →R
ŝ[t₂] by rule (2). If t is a term with t/u = ŝ[t₁] then t[u/ŝ[t₁]] = t, and by
rule (1) we have t = t[u/ŝ[t₁]] →R t[u/ŝ[t₂]] = t′, and so t →R t′.

Assume conversely that t →R t′. Then there are terms t′₁, t′₂ with t′₁ →R t′₂,
an address u in t and a substitution s : X → Wτ(X), such that t/u = ŝ[t′₁]
and t′/u = ŝ[t′₂]. Using (2) and (1) we have t[u/ŝ[t′₂]] = t′. Continuing in
this way we come finally to a reduction rule t₁ → t₂ ∈ R with this property,
and our condition is satisfied. □

A special case occurs when t can be reduced to t′ in one step, that is, if
there is a substitution s : X → Wτ(X) such that t′ = ŝ[t]. In this case the
substitution s is called a match of t to t′.
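The one-step relation of Proposition 7.3.3 can be implemented directly once terms are given a concrete representation. In the sketch below (Python; the tuple encoding and the helper names are our own, not the book's), a term is either a variable (a string) or a tuple (f, t₁, …, tₙ), and an address is a sequence of argument positions. The demo applies the rule x⁻¹ · x → e from Example 7.3.7 inside a larger term.

```python
def subterm(t, u):
    """t/u: the subterm of t at address u (a tuple of argument positions)."""
    for i in u:
        t = t[i]                      # position i is the i-th argument (1-based)
    return t

def replace(t, u, s):
    """t[u/s]: replace the subterm of t at address u by s."""
    if not u:
        return s
    i, rest = u[0], u[1:]
    return t[:i] + (replace(t[i], rest, s),) + t[i + 1:]

def match(pattern, t, subst=None):
    """Find s with ŝ[pattern] = t, extending subst; None if impossible."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                       # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == t else None
        subst[pattern] = t
        return subst
    if isinstance(t, str) or pattern[0] != t[0] or len(pattern) != len(t):
        return None
    for p_arg, t_arg in zip(pattern[1:], t[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

def apply_subst(t, subst):
    if isinstance(t, str):
        return subst.get(t, t)
    return (t[0],) + tuple(apply_subst(a, subst) for a in t[1:])

def rewrite_at(t, u, lhs, rhs):
    """One →R step at address u with rule lhs → rhs, or None."""
    s = match(lhs, subterm(t, u))
    return None if s is None else replace(t, u, apply_subst(rhs, s))

# rule r1 of Example 7.3.7:  x⁻¹ · x → e
lhs = ("*", ("inv", "x"), "x")
rhs = ("e",)

# rewrite (y⁻¹ · y) · z at address (1,): the first argument of the top "*"
t = ("*", ("*", ("inv", "y"), "y"), "z")
assert rewrite_at(t, (1,), lhs, rhs) == ("*", ("e",), "z")
```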

All the properties of reduction systems described in Section 7.2 can be
applied to term rewriting systems. Given a term rewriting system R, we have
to find an equivalent complete rewriting system. In this section we assume
that our system is terminating; the termination of term rewriting systems
will be considered in the next section. As we did in the example given in
Section 7.2, we proceed by transforming critical pairs into new reduction
rules.

To check the local confluence of our term rewriting system R, any two
diverging one-step reductions t → t′ and t → t″ must be inspected for a
common reduct t̄ to which t′ and t″ converge. The easiest way is to find a
substitution s under which t′ and t″ become equal, making t̄ = ŝ[t′] = ŝ[t″].

          t
        ↙   ↘
      t′     t″
      ŝ ↘   ↙ ŝ
    ŝ[t′] = ŝ[t″] = t̄

Two terms t and t′ of Wτ(X) are called unifiable if there is a substitution
s : X → Wτ(X) such that ŝ[t] = ŝ[t′]. More generally, a set M ⊆ Wτ(X) of
terms is called unifiable if there is a substitution s such that ŝ[t] = ŝ[t′] for
all t, t′ ∈ M; such an s is called a unifier for M. A unifier s is called a most
general unifier for M if for all unifiers s₁ for M there is a substitution s₂
such that s₁ = s₂ ◦ s. In this case we also write s₁ ≤ s. Clearly, the relation
≤ is reflexive and transitive, so it is a quasiorder on the set of all
substitutions. We can then define an equivalence relation on the set of all
substitutions, by

s₁ ∼ s₂ :⇔ s₁ ≤ s₂ and s₂ ≤ s₁.

Let us denote by Substτ the set of all substitutions s : X → Wτ(X). Then
the relation ≤ induces a partial order on the quotient set Substτ/∼. One
can show that the relation < defined by s₁ ≤ s₂ and s₁ ≠ s₂ is noetherian on
Substτ (see E. Eder, [42]). As a consequence we get the following result.

Proposition 7.3.4 Let M ⊆ Wτ(X) be a finite unifiable set of terms. Then
there exists a most general unifier for M.

Proof: We consider the set U_M of all unifiers for M and its quotient set
U_M/∼ when we factorise by the relation ∼. Clearly a most general unifier
for M is a maximal element with respect to the partial order ≤ on this
quotient set. Take any element s of U_M/∼. If s is maximal, we are finished.
Otherwise there is an element s₁ in U_M/∼ such that s < s₁. If s₁ is maximal,
we are finished. Otherwise, by iteration of this procedure (using the axiom
of choice) we get a sequence s₁, s₂, s₃, . . . such that

s₁ < s₂ < s₃ < ⋯ .

Since < is noetherian this sequence must terminate, at some s_m for which
there is no element s_{m+1} with s_m < s_{m+1}. Therefore s_m is a most
general unifier. □

Remark 7.3.5 1. J. A. Robinson showed in [101] that for every nonempty
finite set M ⊆ Wτ(X) there is an algorithm which decides whether M is
unifiable or not. If M is unifiable, then the algorithm generates a most
general unifier.

2. Another kind of unification problem can be defined for equational theories
Σ. Two terms t and t′ are called Σ-unifiable if there exists a substitution
s : X → Wτ(X) such that ŝ[t] ≈ ŝ[t′] ∈ Σ. (See Section 15.1.)

Example 7.3.6 1. Consider the type τ = (2, 1, 1) with operation symbols
f, g, h, and the terms t = f(x₁, g(x₂)) and t′ = f(h(x₃), x₄). Let
s₁ : X → Wτ(X) be a substitution with x₁ ↦ h(x₃), x₂ ↦ x₂, x₃ ↦ x₃ and
x₄ ↦ g(x₂). Then s₁ is a unifier for t and t′, since ŝ₁[t] = ŝ₁[f(x₁, g(x₂))] =
f(h(x₃), g(x₂)) = ŝ₁[f(h(x₃), x₄)] = ŝ₁[t′].

2. Let τ = (2) and let Σ be the equational theory of the variety of all
commutative semigroups, that is, the equational theory generated by

E = {f(x₁, x₂) ≈ f(x₂, x₁), f(x₁, f(x₂, x₃)) ≈ f(f(x₁, x₂), x₃)}.

Consider the terms t = f(a, x₁) and t′ = f(b, x₂), where a and b are variables
different from x₁ and x₂. Then the substitution s₁ defined by s₁(x₁) = b,
s₁(x₂) = a and s₁(x_j) = x_j for all j ≥ 3 is a Σ-unifier for t and t′, since
ŝ₁[t] = f(a, b) ≈ f(b, a) = ŝ₁[t′] is an identity in Σ.
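Robinson's procedure from Remark 7.3.5 can be sketched in a few lines for the term representation used earlier: variables are strings, compound terms are tuples. The names below are our own; the test reproduces Example 7.3.6.1, where f(x1, g(x2)) and f(h(x3), x4) are unified by x1 ↦ h(x3), x4 ↦ g(x2).

```python
def is_var(t):
    return isinstance(t, str)

def apply_subst(t, s):
    # apply a substitution exhaustively (bindings may mention other variables)
    if is_var(t):
        return apply_subst(s[t], s) if t in s else t
    return (t[0],) + tuple(apply_subst(a, s) for a in t[1:])

def occurs(v, t, s):
    t = apply_subst(t, s)
    if is_var(t):
        return v == t
    return any(occurs(v, a, s) for a in t[1:])

def unify(t1, t2):
    """Return a most general unifier of t1 and t2, or None."""
    s, stack = {}, [(t1, t2)]
    while stack:
        a, b = stack.pop()
        a, b = apply_subst(a, s), apply_subst(b, s)
        if a == b:
            continue
        if is_var(a):
            if occurs(a, b, s):        # occurs check: no cyclic bindings
                return None
            s[a] = b
        elif is_var(b):
            stack.append((b, a))
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None                # symbol clash: not unifiable
    return s

t  = ("f", "x1", ("g", "x2"))          # f(x1, g(x2))
t2 = ("f", ("h", "x3"), "x4")          # f(h(x3), x4)
s = unify(t, t2)
assert apply_subst(t, s) == apply_subst(t2, s) == ("f", ("h", "x3"), ("g", "x2"))
```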

The next example illustrates the process of replacing critical pairs by new
reduction rules, in our procedure for finding the completion of a given term
rewriting system.

Example 7.3.7 Consider the type τ = (2, 1, 0) with the binary operation
symbol ·, the unary operation symbol ⁻¹ and the nullary operation symbol
e. We start with three rules r₀, r₁ and r₂:

r₀: (x₁ · x₂) · x₃ → x₁ · (x₂ · x₃),
r₁: x⁻¹ · x → e, and
r₂: e · x → x.

These rules clearly arise from the axioms of a group, with the imposition
of a definite (one-way) orientation. We will show later, in Example 7.4.5,
that the term rewriting system R defined by these three rules is terminating.
But it is not confluent, since we can find some critical pairs. For instance,
we have (using r₁) that (x₁⁻¹ · x₁) · x₃ → e · x₃, and (using r₀) that
(x₁⁻¹ · x₁) · x₃ → x₁⁻¹ · (x₁ · x₃). But the term x₁⁻¹ · (x₁ · x₃) cannot be
further reduced with our three rules; so the pair (e · x₃, x₁⁻¹ · (x₁ · x₃)) is
a critical pair. We can reduce the term e · x₃ to x₃, using r₂, and we see
from this that x₁⁻¹ · (x₁ · x₃) ≈ x₃ is an identity in the variety of all
groups. But the corresponding rule x₁⁻¹ · (x₁ · x₃) → x₃ cannot be derived
from the rules r₀, r₁ and r₂. Therefore, we add a new reduction rule:

r₃: x₁⁻¹ · (x₁ · x₃) → x₃.

For notational convenience, we shall not distinguish here between rules from
the original R and new rules in the successive enlargements of R. Now from
r₃ and r₁ we can get (x₂⁻¹)⁻¹ · (x₂⁻¹ · x₂) → x₂ and (x₂⁻¹)⁻¹ · (x₂⁻¹ · x₂) →
(x₂⁻¹)⁻¹ · e. This gives a new critical pair (x₂, (x₂⁻¹)⁻¹ · e) which cannot
be further reduced. We add another rule, r₄:

r₄: (x₂⁻¹)⁻¹ · e → x₂.

From rules r₃ and r₂ we obtain e⁻¹ · (e · x₃) → x₃ and e⁻¹ · (e · x₃) →
e⁻¹ · x₃. This gives the critical pair (e⁻¹ · x₃, x₃), and the new rule r₅:

r₅: e⁻¹ · x₃ → x₃.

The process continues in this way, as shown in Table 1 below, where we
outline which derivations we get and which critical pairs and reduction rules
we have to add. At the point where we have fourteen rules there are no
more critical pairs, and the new term rewriting system is complete. We note
that at the last step we could basically have added, instead of rule r₁₄, the
rule x₂⁻¹ · x₁⁻¹ → (x₁ · x₂)⁻¹. But as Knuth and Bendix showed in [66],
this would have led to some strong complications.

Now we want to use our observations in this example to find a general
method to construct "critical pairs" in a systematic way. First we need a
more rigorous definition of a critical pair.

Lemma 7.3.8 Let t₁ and t₂ be terms with var(t₁) ∩ var(t₂) = ∅. Let v be an
address of t₁ such that t₁/v is a subterm which is not a variable. If s₁ and
s₂ are substitutions with ŝ₁[t₁/v] = ŝ₂[t₂], then there is a substitution s with
the property that ŝ[t₁/v] = ŝ[t₂].

Proof: Since var(t₁) ∩ var(t₂) = ∅ we can define a substitution s by setting
s(x) = s₁(x) if x ∈ var(t₁), s(x) = s₂(x) if x ∈ var(t₂), and s(x) = x
otherwise. Then we have ŝ[t₁/v] = ŝ₁[t₁/v] = ŝ₂[t₂] = ŝ[t₂]. □

We now use the concept of a most general unifier for a pair of terms to make
our definition of a critical pair.

Rules used — Derivations — "Critical Pair" — New Reduction Rule

r₀, r₄:   ((x₂⁻¹)⁻¹ · e) · x₁ → (x₂⁻¹)⁻¹ · (e · x₁)  and  ((x₂⁻¹)⁻¹ · e) · x₁ → x₂ · x₁;
          critical pair ((x₂⁻¹)⁻¹ · (e · x₁), x₂ · x₁);   r₆ = (x₂⁻¹)⁻¹ · x₁ → x₂ · x₁

r₄, r₆:   (x₂⁻¹)⁻¹ · e → x₂  and  (x₂⁻¹)⁻¹ · e → x₂ · e;
          critical pair (x₂ · e, x₂);   r₇ = x₂ · e → x₂

r₄, r₇:   (x₂⁻¹)⁻¹ · e → x₂  and  (x₂⁻¹)⁻¹ · e → (x₂⁻¹)⁻¹;
          critical pair ((x₂⁻¹)⁻¹, x₂);   r₈ = (x₂⁻¹)⁻¹ → x₂

r₅, r₇:   e⁻¹ · e → e  and  e⁻¹ · e → e⁻¹;
          critical pair (e⁻¹, e);   r₉ = e⁻¹ → e

r₁, r₈:   (x₂⁻¹)⁻¹ · x₂⁻¹ → e  and  (x₂⁻¹)⁻¹ · x₂⁻¹ → x₂ · x₂⁻¹;
          critical pair (x₂ · x₂⁻¹, e);   r₁₀ = x₂ · x₂⁻¹ → e

r₀, r₁₀:  (x₂ · x₂⁻¹) · x₁ → x₂ · (x₂⁻¹ · x₁)  and  (x₂ · x₂⁻¹) · x₁ → e · x₁;
          critical pair (x₂ · (x₂⁻¹ · x₁), e · x₁);   r₁₁ = x₂ · (x₂⁻¹ · x₁) → x₁

r₀, r₁₀:  (x₁ · x₂) · (x₁ · x₂)⁻¹ → x₁ · (x₂ · (x₁ · x₂)⁻¹)  and  (x₁ · x₂) · (x₁ · x₂)⁻¹ → e;
          critical pair (x₁ · (x₂ · (x₁ · x₂)⁻¹), e);   r₁₂ = x₁ · (x₂ · (x₁ · x₂)⁻¹) → e

r₁₂, r₃:  x₁⁻¹ · (x₁ · (x₂ · (x₁ · x₂)⁻¹)) → x₁⁻¹ · e  and
          x₁⁻¹ · (x₁ · (x₂ · (x₁ · x₂)⁻¹)) → x₂ · (x₁ · x₂)⁻¹;
          critical pair (x₂ · (x₁ · x₂)⁻¹, x₁⁻¹ · e);   r₁₃ = x₂ · (x₁ · x₂)⁻¹ → x₁⁻¹

r₃, r₁₃:  x₂⁻¹ · (x₂ · (x₁ · x₂)⁻¹) → (x₁ · x₂)⁻¹  and  x₂⁻¹ · (x₂ · (x₁ · x₂)⁻¹) → x₂⁻¹ · x₁⁻¹;
          critical pair ((x₁ · x₂)⁻¹, x₂⁻¹ · x₁⁻¹);   r₁₄ = (x₁ · x₂)⁻¹ → x₂⁻¹ · x₁⁻¹

Table 1

Definition 7.3.9 Let R be a term rewriting system and let t₁ → t′₁ and
t₂ → t′₂ be two reduction rules of R. We may assume that var(t₁) ∩ var(t₂) =
∅, since otherwise the variables can be renamed in an appropriate way. Let
v be an address of t₁ such that t₁/v is a subterm which is not a variable. If
there is a most general unifier s of t₁/v and t₂ (so that ŝ[t₁/v] = ŝ[t₂]), then
the pair (ŝ[t′₁], ŝ[t₁][v/ŝ[t′₂]]) is called a critical pair in R.

Remark 7.3.10 If we reduce ŝ[t₁] using the rule t₁ → t′₁ we obtain ŝ[t′₁],
while reducing it at address v using the rule t₂ → t′₂ we obtain ŝ[t₁][v/ŝ[t′₂]].

Example 7.3.11 Consider the reduction rules r₃ and r₁₂ from Example
7.3.7:

x₁⁻¹ · (x₁ · x₃) → x₃ and x₁ · (x₂ · (x₁ · x₂)⁻¹) → e.

These rules have a common variable x₁; so we replace x₁ in the first rule by
x′₁, and consider instead the rules r′₃ = (x′₁)⁻¹ · (x′₁ · x₃) → x₃ and r₁₂. We
want to use our definition of critical pairs, with t₁ = (x′₁)⁻¹ · (x′₁ · x₃), t′₁ =
x₃, t₂ = x₁ · (x₂ · (x₁ · x₂)⁻¹) and t′₂ = e. The address v = 2 gives the
subterm x′₁ · x₃ of t₁, not a variable, and the terms t₁/v and t₂ are unifiable
with a most general unifier s mapping x′₁ to x₁ and x₃ to x₂ · (x₁ · x₂)⁻¹. We
have ŝ[t′₁] = ŝ[x₃] = x₂ · (x₁ · x₂)⁻¹ and ŝ[t₁][v/ŝ[t′₂]] =
(x₁⁻¹ · (x₁ · (x₂ · (x₁ · x₂)⁻¹)))[v/e] = x₁⁻¹ · e. Therefore
(x₂ · (x₁ · x₂)⁻¹, x₁⁻¹ · e) is a critical pair.
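Definition 7.3.9 is effective: given two rules and a non-variable address, a critical pair is computed by one unification and one replacement. The sketch below (Python; the term encoding and helper names are our own) reruns Example 7.3.11 mechanically and recovers the critical pair (x2 · (x1 · x2)⁻¹, x1⁻¹ · e). The unifier here is a minimal Robinson-style routine without an occurs check, which suffices for this example.

```python
def is_var(t): return isinstance(t, str)

def apply_subst(t, s):
    if is_var(t):
        return apply_subst(s[t], s) if t in s else t
    return (t[0],) + tuple(apply_subst(a, s) for a in t[1:])

def unify(a, b):
    s, stack = {}, [(a, b)]
    while stack:
        a, b = stack.pop()
        a, b = apply_subst(a, s), apply_subst(b, s)
        if a == b: continue
        if is_var(a): s[a] = b
        elif is_var(b): s[b] = a
        elif a[0] == b[0] and len(a) == len(b): stack.extend(zip(a[1:], b[1:]))
        else: return None
    return s

def subterm(t, u):
    for i in u: t = t[i]
    return t

def replace(t, u, r):
    if not u: return r
    i = u[0]
    return t[:i] + (replace(t[i], u[1:], r),) + t[i + 1:]

def critical_pair(rule1, rule2, v):
    (t1, t1r), (t2, t2r) = rule1, rule2   # rules t1 → t1r and t2 → t2r
    s = unify(subterm(t1, v), t2)
    if s is None: return None
    return (apply_subst(t1r, s),
            apply_subst(replace(t1, v, t2r), s))

# r3 with x1 renamed to x1p:  (x1p)⁻¹ · (x1p · x3) → x3
r3  = (("*", ("inv", "x1p"), ("*", "x1p", "x3")), "x3")
# r12:  x1 · (x2 · (x1 · x2)⁻¹) → e
r12 = (("*", "x1", ("*", "x2", ("inv", ("*", "x1", "x2")))), ("e",))

pair = critical_pair(r3, r12, (2,))       # v = 2: the second argument of t1
assert pair == (("*", "x2", ("inv", ("*", "x1", "x2"))),   # x2 · (x1 · x2)⁻¹
                ("*", ("inv", "x1"), ("e",)))              # x1⁻¹ · e
```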

The next lemma shows that all other “critical situations” of the kind we
have been considering are pairs which can be obtained by substitutions from
a critical pair.

Lemma 7.3.12 Let R be a term rewriting system and let t1 → t01 and t2 →
t02 be two reduction rules of R. Let v be an address of t1 such that t1 /v is a
subterm which is not a variable. If there exist substitutions s1 and s2 such
that ŝ1 [t1 /v] = ŝ2 [t2 ], then there exist a critical pair (t0 , t00 ) and a substitution
s such that
ŝ[t0 ] = ŝ1 [t01 ] and ŝ[t00 ] = ŝ1 [t1 ][v/ŝ2 [t02 ]].

Proof: We can always find a bijection r : X → X which renames the
variables in t₂ if necessary, so that we may assume that var(t₁) ∩ var(r̂[t₂]) =
∅. Then by Lemma 7.3.8 there is a substitution s̄ such that ˆs̄[t₁/v] = ˆs̄[r̂[t₂]];
and from the proof of Lemma 7.3.8 this substitution is given by

s̄(x) = s₁(x) if x ∈ var(t₁), and s̄(x) = s₂(r⁻¹(x)) if x ∈ var(r̂[t₂]).

By assumption we have var(t′₁) ⊆ var(t₁) and var(t′₂) ⊆ var(t₂). By
Proposition 7.3.4 there exists a most general unifier m of t₁/v and r̂[t₂].
Since s̄ is also a unifier of these terms, we have s̄ ≤ m, and there is a
substitution s with s̄ = s ◦ m. Then ˆs̄[t] = ŝ[m̂[t]] for all terms t.

Now we define a critical pair (t′, t″) by t′ = m̂[t′₁] and t″ = m̂[t₁[v/r̂[t′₂]]].
Then we have ŝ₁[t′₁] = ˆs̄[t′₁] = ŝ[m̂[t′₁]] = ŝ[t′], and ŝ₁[t₁][v/ŝ₂[t′₂]] =
ˆs̄[t₁][v/ˆs̄[r̂[t′₂]]] = ˆs̄[t₁[v/r̂[t′₂]]] = ŝ[m̂[t₁[v/r̂[t′₂]]]] = ŝ[t″]. □

Using the previous results we obtain the following.

Proposition 7.3.13 A term rewriting system is locally confluent iff all its
critical pairs are convergent.

As we saw after Definition 7.2.1, in a terminating term rewriting system
every term t can be reduced to a normal form, while confluence ensures that
each term has a unique normal form. We get the following result.

Theorem 7.3.14 (Knuth-Bendix Theorem) A terminating term rewriting
system R is confluent if and only if for all critical pairs (t′, t″), the terms t′
and t″ have a common (unique) normal form.

Proof: Let R ⊆ Wτ(X) × Wτ(X) be a terminating term rewriting system.
If (t′, t″) is a critical pair in R then there is a term t with t →R t′ and
t →R t″. If R is confluent then by Theorem 7.1.10 it has the Church-Rosser
property, and as we saw in the remark after Definition 7.2.1, every term t
has a unique normal form NF(t). Then by Theorem 7.1.9 we have NF(t′) =
NF(t″).

If conversely t′ and t″ have a common normal form, for all critical pairs
(t′, t″), then R is locally confluent by Proposition 7.3.13. Since R is
terminating, it is confluent by Theorem 7.2.4. Moreover, in this case the
normal form of any term is unique. □

Theorem 7.3.14 may be used to develop an algorithm to complete a given
term rewriting system. This algorithm is called the Knuth-Bendix completion
procedure. This procedure produces a sequence (Rₙ | n ∈ ℕ) of finite
sets of reduction rules satisfying the following conditions:

- each Rₙ is terminating and equivalent to R;

- critical pairs in Rₙ are convergent in Rₙ₊₁, for each n.

The main problem here is that termination has to be preserved. For this the
following lemma is needed.

Lemma 7.3.15 Let R be a term rewriting system. If > is a terminating
fully invariant relation on Wτ (X), then R is terminating whenever t > t′
for each rule t → t′ of R.
Proof: Let R ⊆ Wτ (X) × Wτ (X) be a term rewriting system, and let →R be the
fully invariant relation on Wτ (X) generated by R. If we have R contained in >,
with > fully invariant, it follows that →R is also contained in >. But as a
subrelation of a terminating relation, →R is also terminating.

To use this result in our algorithm, it must be decidable for each pair (t, t′ )
of terms whether t > t′ holds or not. If a critical pair (t, t′ ) is produced
during the completion process for which the two components t and t′ are
incomparable with respect to >, then the Knuth-Bendix algorithm gives an
error message.

The inputs of the Knuth-Bendix algorithm are a finite term rewriting system
R = {(t1 , t′1 ), . . . , (tm , t′m )} and a terminating fully invariant relation >. The
output is a complete term rewriting system equivalent to R, or an error
message.
The sequence (Rn | n ∈ IN) of finite term rewriting systems is produced, if
possible, as follows:

begin R−1 := ∅; n := 0;
  for i = 1, . . . , m do
    if ti , t′i are incomparable then stop with error message
    else if ti > t′i then add ti → t′i to R0
    else add t′i → ti to R0 ;
  while Rn ≠ Rn−1 do
    begin Rn+1 := Rn ; compute NF(t) and NF(t′ ) for any
        critical pair (t, t′ ) in Rn ;
      if NF(t), NF(t′ ) are incomparable
        then stop with error message
      else if NF(t) > NF(t′ ) then add NF(t) → NF(t′ ) to Rn+1
      else add NF(t′ ) → NF(t) to Rn+1 ;
      n := n + 1
    end
end.

If this algorithm calculates a sequence (Rn | n ∈ IN) of finite term rewriting


systems, without ever generating an error message, then the union of all
these Rn is a completion of R.
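The orientation step of this procedure can be isolated as a small routine. The following Python sketch is illustrative and not from the book; the function `orient`, and the toy length-then-lexicographic order on strings standing in for a reduction order on terms, are assumptions.

```python
def orient(equations, greater):
    """Turn each equation (l, r) into a reduction rule, directing it by the
    supplied terminating order; stop with an error on incomparable pairs."""
    rules = []
    for l, r in equations:
        if greater(l, r):
            rules.append((l, r))
        elif greater(r, l):
            rules.append((r, l))
        else:
            raise ValueError(f"incomparable pair: {l} = {r}")
    return rules

def toy_greater(s, t):
    # longer strings are greater; ties are broken lexicographically
    return (len(s), s) > (len(t), t)
```

For example, orient([("aa", "a"), ("b", "ab")], toy_greater) directs the second equation from right to left, returning [("aa", "a"), ("ab", "b")]; on a pair of identical sides it stops with an error message, just as the completion procedure does.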

7.4 Termination of Term Rewriting Systems


As we have seen, termination is an important property of term rewriting sys-
tems. In general termination is an undecidable property. Since a relation →
on a set A is terminating if and only if its transitive closure →T is terminat-
ing, we may assume without loss of generality that the relation is transitive.
It is also necessary that a terminating relation be irreflexive. Relations which
are both irreflexive and transitive are called proper order relations. To verify
termination using the Knuth-Bendix completion algorithm, we need methods
which are based on proper fully invariant orders.

Definition 7.4.1 A terminating proper order on Wτ (X) is called a reduc-


tion order if it is fully invariant.
Then Lemma 7.3.15 can be reformulated as follows.

Lemma 7.4.2 Let R be a term rewriting system. Then R is terminating if
there is a reduction order > on Wτ (X) such that t > t′ for all reduction
rules t → t′ in R.
There are several ways to define a reduction order for a term rewriting sys-
tem. We must choose the best one, meaning that we want a reduction order
which allows a proof of termination and for which membership can be effec-
tively tested. The latter point is important since the Knuth-Bendix comple-
tion method requires comparisons between terms at every step of the process.

Here we will describe the Knuth-Bendix ordering ([66]). We assume that


there is a total order ≤ defined on the set {fi | i ∈ I} of operation symbols
of our type, and that we have a function ϕ : {fi | i ∈ I} → IN called a weight
function, which has the following properties:

(i) If f0 is a nullary operation symbol then ϕ(f0 ) > 0;

(ii) If f1 is a unary operation symbol and ϕ(f1 ) = 0, then f1 is the greatest


element with respect to ≤.

This weight function can then be extended inductively to a mapping ϕ̂ :


Wτ (X) → IN on the set of all terms, as follows:

(i) ϕ̂(x) = min{ϕ(f0 ) | f0 is nullary}, for all x ∈ X,

(ii) ϕ̂(fi (t1 , . . . , tni )) = ϕ(fi ) + ϕ̂(t1 ) + · · · + ϕ̂(tni ), for any term
fi (t1 , . . . , tni ) where fi is an ni -ary operation symbol and t1 , . . . , tni
are terms in Wτ (X).
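The inductive extension ϕ̂ is a direct recursion on the term tree. In the following sketch (an illustration, not the book's notation) terms are nested tuples headed by an operation symbol, variables are plain strings, and the weight table and the value min_nullary are supplied by the caller.

```python
def weight(t, phi, min_nullary):
    """Compute phi-hat(t): a variable gets the minimum weight of a nullary
    symbol, and a compound term adds its head symbol's weight to the
    weights of its subterms."""
    if isinstance(t, str):                  # a variable
        return min_nullary
    return phi[t[0]] + sum(weight(s, phi, min_nullary) for s in t[1:])

# the group weights used later in Example 7.4.5: phi(.) = phi(e) = 1, phi(inv) = 0
PHI = {"mul": 1, "inv": 0, "e": 1}
```

With these weights, weight(("mul", ("mul", "x", "y"), "z"), PHI, 1) and weight(("mul", "x", ("mul", "y", "z")), PHI, 1) both return 5, matching the computations in Example 7.4.5.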

We will denote by occx (t) the number of occurrences of the variable x in the
term t.

Definition 7.4.3 Let ≤ be a total order on the set {fi | i ∈ I} of operation
symbols of type τ , and let ϕ : {fi | i ∈ I} → IN be a weight function. The
induced Knuth-Bendix order >KB on Wτ (X) is defined as follows. We set
t >KB t′ iff either

(i) ϕ̂(t) > ϕ̂(t′ ) and occx (t) ≥ occx (t′ ) for all x ∈ X, or

(ii) ϕ̂(t) = ϕ̂(t′ ) and occx (t) = occx (t′ ) for all x ∈ X and either

(ii)(a) t = f (· · · f (t′ ) · · ·) and f is the greatest element in
{fi | i ∈ I}, or
(ii)(b) t = fi (t1 , . . . , tni ), t′ = fj (t′1 , . . . , t′nj ) and either fi > fj , or fi = fj
(so that ni = nj ) and there is some k with t1 = t′1 , . . . , tk−1 = t′k−1
and tk >KB t′k .

Then we can verify that this induced order is indeed a reduction order.

Proposition 7.4.4 Every Knuth-Bendix order is a reduction order.



Proof: Let τ be a fixed type, and let ≤ be a total order on the set {fi | i ∈ I}
of operation symbols and ϕ : {fi | i ∈ I} → IN be a weight function. We
consider the induced Knuth-Bendix order >KB on Wτ (X). We have to show
that >KB is fully invariant and terminating.
Assume that t1 and t2 are terms from Wτ (X) with t1 >KB t2 , and let
t ∈ Wτ (X) be an arbitrary term. We have to prove that t[u/t1 ] >KB t[u/t2 ]
for every address u of t. By the transitivity of >KB it suffices to verify this
for all addresses u of t with u ∈ IN \ {0}.

Since t1 >KB t2 , either case (i) or case (ii) of Definition 7.4.3 holds. If case
(i) holds, so that ϕ̂(t1 ) > ϕ̂(t2 ) and occx (t1 ) ≥ occx (t2 ) for all x ∈ X, then also
ϕ̂(t[u/t1 ]) > ϕ̂(t[u/t2 ]) and occx (t[u/t1 ]) ≥ occx (t[u/t2 ]) for all x ∈ X. Then
again by case (i) of Definition 7.4.3 we have t[u/t1 ] >KB t[u/t2 ]. If instead
case (ii) holds, so that ϕ̂(t1 ) = ϕ̂(t2 ) and occx (t1 ) = occx (t2 ) for all
x ∈ X, then ϕ̂(t[u/t1 ]) = ϕ̂(t[u/t2 ]) and occx (t[u/t1 ]) = occx (t[u/t2 ]) for all
x ∈ X, and by the same case we have t[u/t1 ] >KB t[u/t2 ]. This shows that
>KB is closed under the replacement rule. In a similar way it can be shown
that >KB is closed under all substitutions, so that >KB is fully invariant.

Next we show by contradiction that >KB is terminating. Suppose that there


is an infinite chain t0 >KB t1 >KB t2 >KB · · · >KB · · · of terms. By
the definition of the Knuth-Bendix order, such a sequence is only possi-
ble if our type contains a nullary operation symbol. Also by definition we
have var(tn ) ⊆ var(t0 ), for all n ≥ 0. Therefore there is a substitution
s : X → Wτ (X) such that all the variables occurring in any ti are mapped
to nullary terms. Then ŝ(t0 ) >KB ŝ(t1 ) >KB · · · >KB · · · is an infinite chain
of nullary terms. For any nullary term t, let us denote by kn the number of
operation symbols of arity n occurring in t. Then we can prove by induction that
k0 = 1 + k2 + 2k3 + 3k4 + · · ·.
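This counting identity can be checked mechanically on examples. In the sketch below (illustrative; the tuple encoding of ground terms is an assumption) each term is a nested tuple headed by its operation symbol:

```python
from collections import Counter

def arity_counts(t):
    """Return a Counter sending each arity n to the number k_n of operation
    symbols of that arity occurring in the ground term t."""
    counts = Counter({len(t) - 1: 1})       # the head symbol of t
    for sub in t[1:]:
        counts += arity_counts(sub)
    return counts

# f(g(c, c, c), c) with f binary, g ternary and c nullary
example = ("f", ("g", ("c",), ("c",), ("c",)), ("c",))
counts = arity_counts(example)
```

Here k0 = 4, k2 = 1, k3 = 1, and indeed k0 = 1 + k2 + 2k3.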

Since each nullary operation symbol has positive weight, the weight w of
any nullary term is greater than or equal to k0 . Therefore for a fixed weight
w there is only a finite number of choices for k0 , k2 , k3 , · · ·. If each unary
operation symbol has a positive weight we have w ≥ k1 and therefore there
are only finitely many nullary terms of weight w. That means an infinite
chain
t0 >KB t1 >KB t2 >KB · · · >KB · · ·

is impossible unless there is a unary operation symbol of weight zero.


Thus we now assume that there is a unary operation symbol f0 of weight

zero. We define a mapping h from the set of all nullary terms into itself,
where h(t) is the term obtained from t by deleting all occurrences of f0 .
Clearly the mapping h preserves the weight, and by the previous remark there
are only finitely many terms of weight w in the range of h.
Now we show that there is no infinite chain t0 >KB t1 >KB t2 >KB · · · of
nullary terms such that h(t0 ) = h(t1 ) = h(t2 ) = · · ·. Each nullary term
t can be regarded as a word over the set of all operation symbols; that means
t can be written in the form t = f0^r1 α1 f0^r2 α2 · · · f0^rn αn ,
where r1 , . . . , rn are natural numbers and α1 , . . . , αn are operation symbols
different from f0 . Let r(t) = (r1 , . . . , rn ) be the n-tuple consisting of the
exponents of the occurrences of f0 .

It can be shown that if h(t) = h(t′ ) then t >KB t′ iff r(t) >lex r(t′ ), where
>lex is the lexicographic order on n-tuples. It can easily be shown that >lex
is terminating (on tuples of equal length). This completes the proof of Proposition 7.4.4.

Example 7.4.5 As an example of the Knuth-Bendix order we consider the


type τ = (2, 1, 0), with operation symbols ·, −1 and e (corresponding to
groups). On this set of operation symbols we define the following order:

e ≤ · ≤ −1 .

We also use the weight function ϕ : {·,−1 , e} → N with ϕ(·) = 1, ϕ(e) = 1


and ϕ(−1 ) = 0. For the extension ϕ̂ to terms, this gives for instance
ϕ̂(x) = 1 for all variables x ∈ X,
ϕ̂((x · y) · z) = ϕ(·) + (ϕ(·) + ϕ̂(x) + ϕ̂(y)) + ϕ̂(z) = 5,
ϕ̂(x · (y · z)) = ϕ(·) + ϕ̂(x) + (ϕ(·) + ϕ̂(y) + ϕ̂(z)) = 5.

Then by Definition 7.4.3 (ii)(b), second case, we have

(x · y) · z >KB x · (y · z).

Also, (x · y)−1 >KB y −1 · x−1 by Definition 7.4.3 (ii)(b), first case, and
(y −1 )−1 >KB y by Definition 7.4.3 (ii)(a). This order can be used to prove
the termination of the term rewriting system for groups from Example 7.3.7
(see [66]).
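These comparisons can be reproduced by a direct implementation of Definition 7.4.3. The sketch below is an illustration under stated assumptions (terms as nested tuples, variables as strings, and the tables PHI and PREC encoding the weight function and the precedence e < · < −1); it is not code from the book.

```python
from collections import Counter

PHI  = {"e": 1, "mul": 1, "inv": 0}     # the weight function of Example 7.4.5
PREC = {"e": 0, "mul": 1, "inv": 2}     # the precedence e < mul < inv

def occ(t):
    """Multiset of variable occurrences in t."""
    if isinstance(t, str):
        return Counter([t])
    c = Counter()
    for s in t[1:]:
        c += occ(s)
    return c

def weight(t):
    if isinstance(t, str):
        return 1                         # minimum weight of a nullary symbol
    return PHI[t[0]] + sum(weight(s) for s in t[1:])

def kbo(t, u):
    ot, ou = occ(t), occ(u)
    if any(ot[x] < n for x, n in ou.items()):
        return False                     # occurrence condition fails
    wt, wu = weight(t), weight(u)
    if wt != wu:
        return wt > wu                   # case (i)
    if ot != ou:
        return False
    # equal weights and equal occurrences: cases (ii)(a) and (ii)(b)
    if isinstance(t, str):
        return False                     # a variable is never greater
    if isinstance(u, str):               # case (ii)(a): a tower of the
        s = t                            # greatest (weight-zero) unary symbol
        while isinstance(s, tuple) and len(s) == 2 and PREC[s[0]] == max(PREC.values()):
            s = s[1]
            if s == u:
                return True
        return False
    if PREC[t[0]] != PREC[u[0]]:
        return PREC[t[0]] > PREC[u[0]]   # case (ii)(b), first alternative
    for a, b in zip(t[1:], u[1:]):       # case (ii)(b), second alternative
        if a != b:
            return kbo(a, b)
    return False
```

Here kbo certifies the three comparisons of this example, and it rejects x >KB (y · y)−1 because the occurrence condition fails (y occurs in (y · y)−1 but not in x), foreshadowing the limitation discussed below Proposition 7.4.6.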
The following proposition generalizes Lemma 7.3.15.

Proposition 7.4.6 A term rewriting system R is terminating if there is a
terminating and invariant (under substitution) proper order > on Wτ (X)
such that ŝ(t) > ŝ(t′ ) for all reduction rules t → t′ in R and all substitutions
s : X → Wτ (X).
Proof: For a given term rewriting system R, we will denote by →R the fully
invariant relation generated by R. Then →R is the invariant closure of the set

S = {(ŝ(t), ŝ(t′ )) | t → t′ ∈ R and s : X → Wτ (X) is a substitution}.

Assume that > is a terminating invariant proper order on Wτ (X), such that
ŝ(t) > ŝ(t′ ) for all t → t′ ∈ R and all substitutions s : X → Wτ (X). Then S
is a subset of >, and since > is invariant, →R is a subset of > too. Thus
→R is terminating.

Unfortunately, there are many cases for which the Knuth-Bendix ordering is
not suitable. Suppose we want to show the termination of the rule x · (y −1 · y) →
(y · y)−1 · x, using the same order on the fundamental operation symbols and
the same weight function as in Example 7.4.5. The only possible case is
Definition 7.4.3 (ii)(b), second case, but then we must have x >KB (y · y)−1 ,
and this is impossible, since the occurrence condition fails: y occurs in (y · y)−1
but not in x.
Several other orderings have therefore been developed; see for instance the
Handbook of Formal Languages, Vol. 3 ([54]).

7.5 Exercises
7.5.1. Let ρ be a relation on a set A. Prove that

ρSRT = (ρ ∪ ρ−1 ∪ ∆A )T .

7.5.2. Let ρ1 and ρ2 be relations on a set A. Prove that for all n ∈ IN,

ρ1 ◦ ρ2^RT ⊆ ρ2^RT ◦ ρ1^RT ⇒ (ρ1 ◦ ρ2 )^(n) ⊆ ρ2^RT ◦ ρ1^RT .

7.5.3. Let Substτ be the set of all substitutions of type τ . Let ≤ be the
relation defined on Substτ by s1 ≤ s2 iff there is a substitution s such that
s1 = s ◦ s2 . Prove that ≤ is reflexive and transitive on Substτ , and
that the relation ∼ defined by s1 ∼ s2 iff s1 ≤ s2 and s2 ≤ s1 is an equivalence relation on Substτ .

7.5.4. Prove that the relation < defined on Substτ by s1 < s2 :⇔ s1 ≤ s2
and not s1 ∼ s2 is noetherian.

7.5.5. Let H be the commutative semigroup generated by the four elements


a, b, c and s, and by the defining equations

(E) as = c²s and bs = cs.

Let → be the smallest relation on H such that cs → bs, b²s → as, and if
x → y then ux → uy for all x, y and u in H. Prove that → is noetherian
and has the Church-Rosser property.
Chapter 8

Algebraic Machines

An important topic in theoretical Computer Science is the study of which


problems can be solved by machines such as automata or Turing Machines
which arise as abstractions of “real-life” computers. As we saw in Example
1.2.16, automata are in fact multi-based algebras, and so we can think of
them as “algebraic machines.” The most important automata are those in
which the set of states is finite, and we consider finite automata or finite
state machines; these model the realistic situation of a computer with a
finite amount of memory and other resources.

For any class of automata of a fixed type we obtain a family of languages,


the languages recognized by the automata in the class. The relation of recog-
nition between automata and languages thus defines a Galois-connection
between the class of all finite automata of a given type and the family of all
languages. The class of all regular languages is described in Section 8.1, while
Section 8.2 introduces finite automata and the languages they recognize, and
presents Kleene’s Theorem, which shows that the languages recognizable by
finite automata are precisely the regular languages.

Finite automata are machines which recognize languages made up of


“words,” that is, of terms on the free monoid on a finite alphabet. In this
way they are in fact a special case of more general machines which recog-
nize terms of any fixed type. Referring to terms as trees, we study machines
called tree-recognizers. Sections 8.4 to 8.8 present the theory, including a
Kleene-type theorem, of such machines. Finally in Section 8.9 we study Tur-
ing machines, which we use in the next section to describe the solution to


various decidability problems in Universal Algebra.

8.1 Regular Languages


In order to discuss the languages accepted by finite automata, we must first
recall the definition of words from Example 6.4.5. Let Xn = {x1 , . . . , xn }
be a finite alphabet of size n ≥ 0. We denote by Xn∗ the universe set of
the free monoid generated by Xn . When we denote the binary operation of
the monoid by juxtaposition, and use the associativity of this operation in a
monoid to justify omitting all brackets, we see that any monoid term from
Xn∗ may be expressed as a word composed of the letters from Xn . Note that
the letters may be repeated; for instance, each of x1 x2 , x2 x1 x1 and x1 x2 x3 x3
is a word on the alphabet X3 . Formally, any word w may be expressed as
w = xi1 xi2 · · · xim , for some m ≥ 0 and xi1 , . . . , xim ∈ X. The number m is
called the length of the word w, and is denoted by |w|. The special case that
m = 0 corresponds to the empty word, denoted by e. The set Xn+ is defined
to be the set of all non-empty words on the alphabet Xn . This set is the
universe of the free semigroup generated by the set Xn . Of course, Xn+ and
Xn∗ are (the universes of) a semigroup and a monoid respectively, with the
binary operation of juxtaposition or concatenation of words.

A language, or more precisely a language over the alphabet Xn , is simply any


subset of Xn∗ . The family of all possible languages on Xn is then the power set
of Xn∗ . We can introduce an algebraic structure on this power set, by defining
three operations on sets of languages. The first operation is the binary one
of union (where the union of two languages is just their set-theoretic union).
The second binary operation we need is called the product, and is usually
denoted by juxtaposition: for any two languages U and V , we set

U V := {uv | u ∈ U, v ∈ V }.

Note that this binary product is associative, meaning that U (V W ) =


(U V )W for all languages U , V and W on Xn , and satisfies in addition
the properties that U ∅ = ∅U = ∅ and U {e} = {e}U = U for every language
U.

For any language U , we inductively define powers U m for all m ≥ 0, by



(i) U 0 = {e}, and

(ii) U m = U m−1 U , for m ≥ 1.

Then we define U ∗ = ⋃m∈N U m and U + = ⋃m≥1 U m .

A word w ∈ Xn∗ belongs to U ∗ if and only if it is the empty word or it can


be expressed in the form u1 u2 · · · um for some m ≥ 1 and u1 , . . . , um ∈ U .
Clearly Xnm is the set of all words of length m on the alphabet Xn , and
Xn∗ = ⋃m∈N Xnm .

The unary operation taking a language U to the language U ∗ is called itera-


tion. The three operations just discussed, the union, product and iteration,
are called the regular language operations. A regular language is any lan-
guage which can be built out of languages consisting of a single length-one
word, using these three operations only.

Definition 8.1.1 The set RegX of all regular languages over an alphabet
X is the smallest set R such that

(i) ∅ ∈ R and {x} ∈ R for each x ∈ X, and


(ii) for any U and V in R, all of U ∪ V , U V and U ∗ are in R.

It is easy to see from this definition that all finite languages are regular. The
set RegX is the smallest set of languages over X which contains all the finite
languages and is closed under the three regular language operations.
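Union, product and (truncated) iteration are easy to realize on finite sets of words. The sketch below is illustrative and not from the book; since U∗ is infinite as soon as U contains a non-empty word, the iteration is cut off at a length bound.

```python
def product(U, V):
    """The product language UV = {uv | u in U, v in V}."""
    return {u + v for u in U for v in V}

def star(U, max_len):
    """All words of U* of length at most max_len (truncated iteration)."""
    result = {""}                       # U^0 = {e}
    frontier = {""}
    while frontier:
        frontier = {w + u for w in frontier for u in U
                    if u and len(w + u) <= max_len}
        frontier -= result
        result |= frontier
    return result
```

For instance, product({"x1"}, {"x2"}) is {"x1x2"}, and star({"001"}, 7) is {"", "001", "001001"}.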

Algebraically, we have defined three operations on the base set P(Xn∗ ). If


we define two binary operation symbols + and · to represent the union
and product operations, and a unary operation symbol ∗ for the iteration
operation, then using variables and nullary operation symbols for ∅ and e
we can define terms in this new language. These terms are called regular
expressions, and they correspond precisely to regular languages. As usual,
we omit the binary operation symbol and just use juxtaposition for the
product operation.

Example 8.1.2 Let X2 = {x1 , x2 }. Some elements of RegX2 are {x1 }, {x2 },
{x1 x2 }, {x1 x2 }∪ {x2 x2 } = {x1 x2 , x2 x2 }.

8.2 Finite Automata


In Section 1.2 we introduced a finite automaton as a multi-based algebra of
the form H = (Z, X; δ, z0 ), where

(1) Z is a finite non-empty set of states;


(2) X is the (finite) input alphabet;
(3) δ : Z × X → Z is the state transition function; and
(4) z0 ∈ Z is an initial state.

We sometimes also want to have some states designated as final states. In
this case we have a special subset Z′ of Z, called the set of final states, and
we write our automaton as a quintuple H = (Z, X; δ, z0 , Z′ ). In this chapter
we shall consider only the case that the input alphabet is a finite set Xn ,
but automata can also be defined on an infinite alphabet.

The “action” of a finite state automaton is given by the state transition


function δ. We think of the equation δ(z, x) = y as telling us that when the
machine receives input x when it is in state z, it moves or changes to state
y. Any state transition function δ can be uniquely extended to a mapping
δ̂ : Z × X ∗ → Z,
where X ∗ is the set of all words over X, in the following inductive way:

(i) δ̂(z, e) = z, for each state z ∈ Z, where e is the empty word, and
(ii) δ̂(z, wx) = δ(δ̂(z, w), x), for each state z ∈ Z, each letter x ∈ X and each
word w ∈ X ∗ .

The inductive step (ii) of this definition says that to compute which state
the machine is in, when it has started in state z with input word wx, we
first find what state it is in after input w, then move from that state with
input x. That is, words are read in one letter at a time, with a change of
state each time.
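Clause (ii) says that δ̂ is simply a left fold over the letters of the input word. A minimal Python sketch (the dictionary encoding of δ and the names are assumptions, not the book's notation):

```python
def delta_hat(delta, z, word):
    """Extend the transition function to words: read one letter at a time,
    changing state at each step; the empty word leaves the state unchanged."""
    for x in word:
        z = delta[(z, x)]
    return z

delta = {("z0", "a"): "z1", ("z1", "a"): "z0"}   # a toy two-state machine
```

Here delta_hat(delta, "z0", "") returns "z0", matching clause (i), and delta_hat(delta, "z0", "aa") returns "z0" again after two transitions.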

Now that we know how an automaton reads in words, we can describe how
an automaton acts as a language recognizer, over the alphabet X. We input
a word w to the machine, in the initial state z0 , and compute the resulting
state δ̂(z0 , w). If this state is a final state from the set Z′ , the input word w is
recognized or accepted by the automaton; otherwise, it is said to be rejected

by the automaton. The language recognized by an automaton H is the set


L(H) of all the words from X ∗ which are recognized by the automaton. Thus

L(H) = {w ∈ X ∗ | δ̂(z0 , w) ∈ Z′ }.

A language L over an alphabet X is said to be recognizable if there exists


an automaton H with alphabet X such that L = L(H). We will denote by
RecX the set of all recognizable languages over X, and by Rec the family
of all recognizable languages.

Notice that the finiteness of the state set is important here; otherwise, every
language over X would be recognizable. We now have two classes of languages
over an alphabet: the regular languages of the previous section, and the new
recognizable languages. The first important result of finite automata theory
is Kleene’s Theorem (see S. C. Kleene, [63]), which says that these two classes
are in fact the same. We can easily prove one direction of this theorem now,
that any recognizable language is regular, but the other direction will take
us more work.

Theorem 8.2.1 Let X be a finite alphabet. Any recognizable language over


X is regular, so that RecX ⊆ RegX.

Proof: Let L = L(H) be a language recognized by an automaton H. We
will show that L is a regular language. We may assume that the states of
H are enumerated as {z0 , z1 , . . . , zn }, where z0 is an initial state. For each
0 ≤ k ≤ n + 1 and each 0 ≤ i, j ≤ n, let L^k_ij be the set of words w such that
δ̂(zi , w) = zj and such that in the computation of δ̂(zi , w) no intermediate
state zm with m ≥ k is used. It is clear that L(H) is the union of the sets
L^{n+1}_0j over those indices j for which zj is a final state. This means that it
will suffice to prove, by induction on k, that the sets L^k_ij are regular, that is,
that they can be represented by regular expressions.

If k = 0, then no intermediate steps are allowed in our computations, and
the set L^0_ij is either {x ∈ X | δ(zi , x) = zj }, if i ≠ j, or the union of this set
with the set {e}, if i = j. In either case the set L^0_ij is regular.

Now assume that all the sets L^k_ab are regular. Then L^{k+1}_ij is the union of L^k_ij
with the set of all words w such that δ̂(zi , w) = zj and the state zk is used at
least once in the computation. We show that this latter set is also regular.
Words in this set use state zk at least once, and we can describe them
in terms of the occurrences of zk . Up to the first occurrence of zk , no state
zm with m ≥ k is used, and the same is true between any two consecutive
uses of zk and after the last use of zk . Thus we can express any such word
w in the form uw1 · · · wl v, where u ∈ L^k_ik , v ∈ L^k_kj and wr ∈ L^k_kk for each r.
This means that we can express L^{k+1}_ij = L^k_ij ∪ L^k_ik (L^k_kk )∗ L^k_kj . By the induction
hypothesis the sets used in this expression are each regular, and thus
so is L^{k+1}_ij .
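The L^k_ij recursion can be executed directly if every language is truncated to words of bounded length, so that all the sets stay finite. The following Python sketch is an illustration, not the book's construction; MAX, the dictionary encoding of δ, and the helper names are assumptions. It is run here on the five-state automaton of Example 8.2.2 below.

```python
from itertools import product

MAX = 7                                    # keep only words of length <= MAX

def cat(U, V):
    """Truncated product language UV."""
    return {u + v for u in U for v in V if len(u + v) <= MAX}

def star(U):
    """Truncated iteration U*."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = cat(frontier, U - {""}) - result
        result |= frontier
    return result

def kleene_sets(delta, n, alphabet):
    """The sets L^k_ij of the proof, truncated to length <= MAX; the
    recursion step is L^{k+1}_ij = L^k_ij | L^k_ik (L^k_kk)* L^k_kj."""
    L = [[{x for x in alphabet if delta[(i, x)] == j} | ({""} if i == j else set())
          for j in range(n)] for i in range(n)]
    for k in range(n):
        L = [[L[i][j] | cat(L[i][k], cat(star(L[k][k]), L[k][j]))
              for j in range(n)] for i in range(n)]
    return L

# The five-state automaton of Example 8.2.2 below (final state z3, z4 a sink):
delta = {(0, "0"): 1, (0, "1"): 3, (1, "0"): 2, (1, "1"): 3, (2, "0"): 3,
         (2, "1"): 0, (3, "0"): 4, (3, "1"): 4, (4, "0"): 4, (4, "1"): 4}
L = kleene_sets(delta, 5, "01")

def accepts(w):                            # direct simulation, for comparison
    z = 0
    for x in w:
        z = delta[(z, x)]
    return z == 3

short_words = ["".join(p) for m in range(MAX + 1) for p in product("01", repeat=m)]
```

Comparing L[0][3] with direct simulation over all words of length at most MAX confirms the construction on this machine.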

We present the following example of the language recognized by a finite


automaton.

Example 8.2.2 Let H = (Z, X; δ, z0 , Z′ ), where
Z = {z0 , z1 , z2 , z3 , z4 }, X = {0, 1}, Z′ = {z3 } and the state transition function δ is given by

δ 0 1
z0 z1 z3
z1 z2 z3
z2 z3 z0
z3 z4 z4
z4 z4 z4
In the transition graph of this automaton, an arrow from state zi to state zj
is labelled with a letter x when δ(zi , x) = zj .

Consider the word w = 0001101. The machine reads this word in one letter
at a time, from left to right, starting with input 0 in the initial state z0 . This

gives us δ(z0 , 0) = z1 , δ(z1 , 0) = z2 , δ(z2 , 0) = z3 , δ(z3 , 1) = z4 , δ(z4 , 1) = z4 ,


δ(z4 , 0) = z4 and finally δ(z4 , 1) = z4 . When the word is completely read in,
the machine ends in state z4 , which is not one of the final states from Z′ .
Therefore the word 0001101 is rejected. Inputting the word 00101 gives us
δ(z0 , 0) = z1 , δ(z1 , 0) = z2 , δ(z2 , 1) = z0 , δ(z0 , 0) = z1 and then δ(z1 , 1) = z3 .
This word results in a final state; so it is accepted by this machine.

We want to show now that the language accepted by this automaton is reg-
ular, by finding a regular expression for it. We notice that the final state z3
can be reached only after a final step from states z0 , z1 or z2 , while if state z4
is reached there is no way to leave it for another state. Thus we may travel
through the states z0 , z1 and z2 as many times as we wish, but must then
finish with a transition to z3 . For instance, if the last transition used goes
from z0 to z3 , the word accepted has the form (001)n 1, for some n ≥ 1. This
gives a set of words represented by the regular expression (001)∗ 1. If the last
transition used goes from z1 or z2 to z3 , we get words of the form (001)n 01
or (001)n 000, respectively.

Thus we guess that the language L(H) is represented by the regular expres-
sion (001)∗ (1 + 01 + 000). Now we must prove that this is indeed the case,
by induction.

As a first step, we show that for all n ≥ 0 we have δ̂(z0 , (001)n ) =


z0 . For the base case of n = 0 our word (001)0 = e, the empty word,
and here δ̂(z0 , e) = z0 . Inductively, suppose that the claim is true for a
value n. Then δ̂(z0 , (001)n+1 ) = δ̂(z0 , (001)n 001) = δ(δ̂(z0 , (001)n 00), 1) =
δ(δ(δ̂(z0 , (001)n 0), 0), 1) = δ(δ(δ(δ̂(z0 , (001)n ), 0), 0), 1) = δ(δ(δ(z0 , 0), 0), 1)
= δ(δ(z1 , 0), 1) = δ(z2 , 1) = z0 .

Next we prove the converse of the first claim, that if δ̂(z0 , w) = z0 , then
w = (001)n for some n ≥ 0. This is obvious if w is the empty word e; so
we will assume that w = w1 r. Then since δ(δ̂(z0 , w1 ), r) = δ̂(z0 , w) = z0 ,
we see that r = 1 and δ̂(z0 , w1 ) = z2 . In a similar way we get w = w2 001,
where δ̂(z0 , w2 ) = z0 . Since w2 is a shorter word than w, we conclude by the
induction hypothesis that w2 = (001)n for some n, so that w = (001)n+1 .

Now we let w be any word belonging to the language represented by the


expression (001)∗ (1 + 01 + 000). Then w has the form (001)n v for some
n ≥ 0, where v is one of the words 1, 01 or 000. If v = 1, then δ̂(z0 , w) =

δ̂(z0 , (001)n 1) = δ(δ̂(z0 , (001)n ), 1) = δ(z0 , 1) = z3 , and so the word w is in


L(H). The other two cases for v are very similar, and we omit the details.

Conversely, we must show that any word w in L(H) has a representation


of the required form. We must have δ̂(z0 , w) = z3 , and so w 6= e. Let w =
w1 r where r is either 0 or 1. For r = 0 we have δ(δ̂(z0 , w1 ), 0) = z3 , which
forces δ̂(z0 , w1 ) = z2 . Similarly, if we write w1 = w2 0 we have δ̂(z0 , w2 )
= z1 , and again w2 = w3 0 with δ̂(z0 , w3 ) = z0 . Using the previous result,
we obtain w3 = (001)n for some n, so that w = (001)n 000 and belongs to
the language represented by our regular expression. The case that r = 1 is
handled similarly.
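The hand computations above, and the induction just carried out, can also be checked by brute force. In this illustrative sketch (the dictionary encoding and names are assumptions) the automaton is simulated directly and compared with the regular expression (001)∗ (1 + 01 + 000), written in Python's re syntax, on every word of length at most 8.

```python
import re
from itertools import product

# The transition table of the example; z3 is the only final state.
delta = {(0, "0"): 1, (0, "1"): 3, (1, "0"): 2, (1, "1"): 3, (2, "0"): 3,
         (2, "1"): 0, (3, "0"): 4, (3, "1"): 4, (4, "0"): 4, (4, "1"): 4}

def accepts(w):
    """Run the automaton on w and test whether it stops in a final state."""
    z = 0
    for x in w:
        z = delta[(z, x)]
    return z == 3

GUESS = re.compile(r"(001)*(1|01|000)")    # the regular expression of the text

# compare automaton and expression on every word of length <= 8
agree = all(accepts(w) == bool(GUESS.fullmatch(w))
            for m in range(9)
            for w in ("".join(p) for p in product("01", repeat=m)))
```

On the two sample inputs, accepts("0001101") is False and accepts("00101") is True, and `agree` confirms that automaton and expression coincide on all 511 short words.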

Our next aim is to prove the other direction of Kleene’s Theorem, by show-
ing that for any given regular expression there is a finite automaton which
recognizes the corresponding language. To do this we need the concept of a
derived language. Let L ⊂ X ∗ be a language over an alphabet X. For any
letter a ∈ X we define the derived language La of L with respect to a by

La := {w ∈ X ∗ | aw ∈ L}.

We want to find a finite automaton to recognize a given language L. Sup-


pose that for each a ∈ X we have found an automaton Ha which recognizes
the derived language La . We form a new automaton H as follows. We may
assume (by relabelling if necessary) that the automata Ha have no states in
common. Our new set of states consists of the union of the state sets of the
Ha , along with a new initial state labelled z0 . We define a transition function
δ so that for each letter a the state δ(z0 , a) is equal to the initial state of
Ha , and otherwise δ copies the state transitions of the Ha . The final states
of H are all states which are final in any of the Ha . It is easy to see that H
recognizes the language L.

However, we cannot assume inductively that the Ha have been constructed,


since unfortunately it is possible for the derived language La to equal L.
This happens for instance when L = {a, b}∗ . This problem can be overcome
by the use of loops. More serious is the possibility that there might be in-
finitely many different languages derived from the initial one, in which case
the automaton resulting from our construction would not be finite. The next
Lemma deals with this problem.

We extend our definition of derived languages, to allow words as well as


single letters. That is, for any language L and any word w ∈ X ∗ , we set

Lw := {v ∈ X ∗ | wv ∈ L}.

Notice that such languages Lw can be obtained by repeatedly forming derived


languages. For example, Laab = (((La )a )b ).

Lemma 8.2.3 If L is a language represented by a regular expression on a


finite alphabet X, then there are only finitely many distinct languages of the
form Lw for w ∈ X ∗ .
Proof: Let E be another language represented by a regular expression, and
let LE = ⋃{Lw | w ∈ E}. The language Lw = L{w} is represented by a
regular expression, since any single word has this property. We show by induction on the construction of the regular expression for L that there are
only finitely many distinct languages of the form LE , and from this we get
the required result.

We consider first the three base cases, that L = ∅, L = {e} or L = {a} for
some letter a ∈ X. If L = ∅ then LE can only be the empty set too. If L =
{e}, then LE is either {e} if e ∈ E or the empty set again if not. If L = {a}
for a letter a, then LE is one of the following:

{e, a}, if a, e ∈ E;
{e}, if a ∈ E and e ∉ E;
{a}, if e ∈ E and a ∉ E;
∅, if a, e ∉ E.

This gives only finitely many possibilities for LE .

For the inductive step, we first consider languages L1 ∪ L2 formed by taking


unions. It is evident that (L1 ∪ L2 )w = (L1 )w ∪ (L2 )w , and if we extend to
sets E by taking unions we have (L1 ∪ L2 )E = (L1 )E ∪ (L2 )E . Thus if two
sets E and E′ produce distinct languages (L1 ∪ L2 )E ≠ (L1 ∪ L2 )E′ , then
either (L1 )E ≠ (L1 )E′ or (L2 )E ≠ (L2 )E′ . If by induction there are only
finitely many, say n, distinct languages of the form (L1 )E and finitely many,
say m, distinct languages of the form (L2 )E , then there are no more than
mn different languages of the form (L1 ∪ L2 )E .

Next we consider products L1 L2 . Here we will prove that

(L1 L2 )E = (L1 )E L2 ∪ (L2 )EL1 , (*)

where the subscript EL1 denotes the derived language of E with respect to
L1 , that is, EL1 = {v | wv ∈ E for some w ∈ L1 }.

From this it will follow, as in the case for unions, that there are no more
than mn distinct languages of the form (L1 L2 )E (with m and n as before).

For the first direction of (*) let v ∈ (L1 L2 )E . Then wv = w1 w2 for some
w1 ∈ L1 , w2 ∈ L2 and w ∈ E. There are two cases possible, depending on
the length of w:

Case 1: If the length of w is less than or equal to the length of w1 , then w1


= wv1 and v = v1 w2 for some v1 , and v is in (L1 )E L2 .
Case 2: If the length of w is greater than that of w1 , then w = w1 v1 and w2
= v1 v, which implies that v1 ∈ EL1 , and thus v ∈ (L2 )EL1 .

This establishes one inclusion for (*). Conversely, if v ∈ (L1 )E L2 , then v
= v1 v2 for some v1 in (L1 )E and v2 ∈ L2 . Thus for some w ∈ E we have
wv1 ∈ L1 and wv1 v2 ∈ L1 L2 , which shows that v = v1 v2 is in (L1 L2 )E . If
v ∈ (L2 )EL1 , then w2 v ∈ L2 for some w2 ∈ EL1 , and w1 w2 ∈ E for some
w1 ∈ L1 , and therefore w1 w2 v ∈ L1 L2 . This shows that v ∈ (L1 L2 )E , and
completes the proof of (*).

For our final step we consider languages L∗ , assuming that there are only
finitely many different languages of the form LE . We show that (L∗ )E is
either LEL∗ L∗ , if the empty word e is not in E, or the union of this set with
{e}, if e ∈ E. Here EL∗ denotes the derived language of E with respect to
L∗ (that is, EL∗ = {v | wv ∈ E for some w ∈ L∗ }), and LEL∗ is the derived
language of L with respect to the set EL∗ .

Suppose that v ∈ (L∗ )E , and v ≠ e. Then wv ∈ L∗ for some w ∈ E,
and we can write wv = w1 w2 · · · wn for some words wi ∈ L. Since v ≠ e
we also have n > 0, and there is a value r such that w = w1 · · · wr−1 w′r
and v = w″r wr+1 · · · wn , where wr = w′r w″r . Since w1 w2 · · · wr−1 ∈ L∗ and
w1 · · · wr−1 w′r ∈ E, we get w′r ∈ EL∗ . It follows that w″r ∈ LEL∗ . Therefore v
= w″r wr+1 · · · wn ∈ LEL∗ L∗ .

Conversely, if v ∈ LEL∗ L∗ , then v = uw1 w2 · · · wn for some u ∈ LEL∗ and wi ∈ L. Then u′u ∈ L for some u′ ∈ EL∗ , and u1 u2 · · · um u′ ∈ E for some ui ∈ L. Therefore u1 u2 · · · um u′uw1 w2 · · · wn ∈ L∗ , and since u1 u2 · · · um u′ ∈ E we have v = uw1 w2 · · · wn ∈ (L∗ )E . Finally, we note that if e ∈ E then also e ∈ (L∗ )E .

Now we note that there are at most twice as many different languages of the
form (L∗ )E as there are of the form LE , and hence only finitely many.

This result can now be used to finish our proof of Kleene’s Theorem.

Theorem 8.2.4 (Kleene’s Theorem) Let X be a finite alphabet. Then any language L over X is recognizable iff it is regular, so RecX = RegX.

Proof: Since one direction of this was proved in Theorem 8.2.1, we now
have to prove that any regular language L is recognizable. By Lemma 8.2.3,
there are only finitely many different languages of the form Lw ; we will label
these as L1 , . . ., Lk . We construct our new automaton H with states indexed
by these Li . We have a transition δ(Li , a) = Lj if Lj = (Li )a . This gives a
complete (deterministic) definition of our transition function δ. Our initial
state is that indexed by L itself, and the final states are those Li for which
the language Li contains the empty word e. We now claim that the language
accepted by this automaton H is precisely our given language L. This is
because for any word w, Lw is the language which labels the state δ̂(z0 , w).
Therefore

w ∈ L(H) iff δ̂(z0 , w) is final


iff e belongs to the language which labels δ̂(z0 , w)
iff w ∈ L.
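For a finite language L the derivatives Lw can be computed outright, so the automaton built in this proof can be constructed explicitly. The Python sketch below is our own illustration under that finiteness assumption (languages as sets of strings is our encoding, not the authors'); states are the distinct derivatives, the transition on a letter a sends a state Li to (Li )a , and a state is final iff it contains the empty word.

```python
# States of the automaton are derivative languages, represented as frozensets.

def deriv_letter(L, a):
    """Derivative of L by the single letter a."""
    return frozenset(w[1:] for w in L if w.startswith(a))

def derivative_automaton(L, alphabet):
    start = frozenset(L)
    states, todo, delta = {start}, [start], {}
    while todo:                       # explore all reachable derivatives
        q = todo.pop()
        for a in alphabet:
            r = deriv_letter(q, a)
            delta[(q, a)] = r
            if r not in states:
                states.add(r)
                todo.append(r)
    finals = {q for q in states if "" in q}   # final iff e belongs to L_w
    return start, delta, finals

def accepts(L, alphabet, word):
    q, delta, finals = derivative_automaton(L, alphabet)
    for a in word:
        q = delta[(q, a)]
    return q in finals

L = {"ab", "aab", "b"}
assert accepts(L, "ab", "aab") and not accepts(L, "ab", "ba")
```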

The automata we have considered thus far, of the form H = (Z, X; δ, z0 , Z′),


are called initial deterministic automata. Initial refers to the fact that we
have a unique initial state z0 . It is also possible to have a set Z0 ⊆ Z of
initial states, with |Z0 | ≥ 2; in this case we have a weak initial automaton.
The deterministic property refers to the fact that we have required that
our transition information be given by a function δ from Z × X to Z; this
ensures that any input combination of a state and letter leads to exactly one
resulting state. A non-deterministic automaton is one in which some com-
binations of state-letter inputs may lead to no state, or to more than one
choice of resulting state. That is, the transition may instead be given by a mapping δ from Z × X to the power set P(Z), with δ(z, x) possibly empty or containing several states. It can be proven that a language is recognizable by a non-deterministic automaton if and only if it is recognizable by a deterministic one.

The remaining variant of our basic definition that we need to consider is a


finite automaton with output, of the form H = (Z, X, B; δ, λ). Here we have
an additional output alphabet B and an output function λ : Z × X → B,
which maps each state-letter pair (z, a) to an output element in B. Just as
was the case for δ, this output function λ can be uniquely extended to a map
λ̂ from Z × X ∗ to B ∗ . This is done inductively, by the following rules:

(i) λ̂(z, e) = e,
(ii) λ̂(z, x) = λ(z, x) for any letter x, and
(iii) λ̂(z, xw) = λ(z, x)λ̂(δ̂(z, x), w).

Just as in the non-output case, the definition of an automaton with output


can be extended to an initial automaton with output, when one initial state
is selected, or to a weak initial automaton with output, when a larger set of
initial states is chosen.

Example 8.2.5 We consider the following finite deterministic automaton


with output: H = (Z, X, B; δ, λ), where X = {0, 1}, B = {0, 1, 2, 3, 4}, Z =
{z0 , z1 , z2 , z3 , z4 }, and the state transition function δ and the output function
λ are given by the following tables.

     δ | 0    1              λ | 0   1
    z0 | z0   z1            z0 | 0   1
    z1 | z2   z3            z1 | 2   3
    z2 | z4   z0            z2 | 4   0
    z3 | z1   z2            z3 | 1   2
    z4 | z3   z4            z4 | 3   4

The reader should draw the graph of this automaton. We will illustrate the
behaviour of the extended output function λ̂ by calculating the output of
this automaton when it is started in state z0 with input word w = 110. We
have

λ̂(z0 , 110) = λ(z0 , 1)λ̂(δ̂(z0 , 1), 10)


= 1 λ̂(z1 , 10)
= 1 λ(z1 , 1)λ̂(δ(z1 , 1), 0)
= 13 λ̂(z3 , 0) = 131.
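The tables above translate directly into code. The sketch below is an illustrative transcription (states and letters encoded as strings, our convention); it implements the inductive extension of λ to words, with the empty word producing the empty output word, and reproduces the computation λ̂(z0 , 110) = 131.

```python
# Transition and output tables of Example 8.2.5.
delta = {("z0","0"): "z0", ("z0","1"): "z1",
         ("z1","0"): "z2", ("z1","1"): "z3",
         ("z2","0"): "z4", ("z2","1"): "z0",
         ("z3","0"): "z1", ("z3","1"): "z2",
         ("z4","0"): "z3", ("z4","1"): "z4"}
lam   = {("z0","0"): "0", ("z0","1"): "1",
         ("z1","0"): "2", ("z1","1"): "3",
         ("z2","0"): "4", ("z2","1"): "0",
         ("z3","0"): "1", ("z3","1"): "2",
         ("z4","0"): "3", ("z4","1"): "4"}

def lam_hat(z, w):
    """Extended output function: empty input word yields the empty output word."""
    if w == "":
        return ""
    x, rest = w[0], w[1:]
    return lam[(z, x)] + lam_hat(delta[(z, x)], rest)

assert lam_hat("z0", "110") == "131"
```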

8.3 Algebraic Operations on Finite Automata


As we have remarked earlier, finite automata may be regarded as heteroge-
neous or multi-based algebras, and all the theory developed in Chapters
1, 3 and 4 regarding subalgebras, quotients and homomorphic images may
be extended to this setting. In this section we develop this theory for the
particular case of finite automata, beginning with the concept of a subau-
tomaton.

Definition 8.3.1 Let H1 = (Z1 , X1 , B1 ; δ1 , λ1 ) and H2 =


(Z2 , X2 , B2 ; δ2 , λ2 ) be two deterministic automata with output. Then H1
is called a subautomaton of H2 if the following conditions are satisfied:

(i) Z1 ⊆ Z2 , X1 ⊆ X2 and B1 ⊆ B2 , and


(ii) For all x ∈ X1 and z ∈ Z1 , the functions δi and λi satisfy
δ2 (z, x) = δ1 (z, x) and λ2 (z, x) = λ1 (z, x).

Other concepts regarding subalgebras from Section 1.3 can similarly be gen-
eralized to the multi-based setting. Let H = (Z, X, B; δ, λ) be a finite deter-
ministic automaton with output, and let {Hj = (Zj , Xj , Bj ; δj , λj ) | j ∈ J}
be an indexed family of subautomata of H. Assume that the intersections ⋂_{j∈J} Zj , ⋂_{j∈J} Xj and ⋂_{j∈J} Bj are non-empty. Then we define a new subautomaton called the intersection ⋂_{j∈J} Hj , defined by

⋂_{j∈J} Hj := ( ⋂_{j∈J} Zj , ⋂_{j∈J} Xj , ⋂_{j∈J} Bj ; δ′, λ′),

where δ′ and λ′ are the restrictions of δ and λ respectively to the corresponding intersections. That this is indeed a subautomaton of each of the Hj can be proved in the same way that we proved Corollary 1.3.5.

Let H be an automaton. When Z0 ⊆ Z, X0 ⊆ X and B0 ⊆ B, we can consider the intersection

⋂ {H′ = (Z′, X′, B′; δ′, λ′) | H′ a subautomaton of H with Z0 ⊆ Z′, X0 ⊆ X′ and B0 ⊆ B′},

and this gives a new subautomaton of H which is called the subautomaton generated by the triple (Z0 , X0 , B0 ). We will denote this by ⟨(Z0 , X0 , B0 )⟩.

As in the homogeneous algebra case, the union of an indexed set {Hj | j ∈ J} of subautomata of a given automaton H is not usually a subautomaton. But we can take the subautomaton generated by the unions, ⟨( ⋃_{j∈J} Zj , ⋃_{j∈J} Xj , ⋃_{j∈J} Bj )⟩; this will be the least subautomaton of H to contain all the Hj , and we denote it by ⋁_{j∈J} Hj . In this way we construct a meet operation (intersection) and a join operation on the set of all subautomata of H. As before, these operations give us a complete lattice of substructures.

Proposition 8.3.2 Let H be a deterministic finite automaton. With meet


operation of intersection and join operation as defined above, the set of all
subautomata of H forms a complete lattice Sub(H).

To define homomorphisms between automata, we need triples of mappings


which preserve the structure.

Definition 8.3.3 Let H1 = (Z1 , X1 , B1 ; δ1 , λ1 ) and H2 = (Z2 , X2 , B2 ; δ2 ,


λ2 ) be two deterministic automata with output. Let f = (fI , fS , fO ) be a
triple of mappings, with fI : X1 → X2 , fS : Z1 → Z2 and fO : B1 → B2 .
(The subscripts I, S and O refer to inputs, states and outputs, respectively.)
Then f is called a homomorphism of H1 into H2 when for every z ∈ Z1 and
every x ∈ X1 , we have

fS (δ1 (z, x)) = δ2 (fS (z), fI (x)) and fO (λ1 (z, x)) = λ2 (fS (z), fI (x)).

If all three functions in the triple f = (fI , fS , fO ) are injective (surjective,


bijective) then the homomorphism f is said to be injective (surjective, bijec-
tive). A bijective homomorphism is called an isomorphism.

Notice that the fact that H1 is a subautomaton of H2 can also be expressed


by an injective homomorphism f : H1 → H2 . We define the composition
of two homomorphisms f = (fI , fS , fO ) and g = (gI , gS , gO ) by f ◦ g =
(fI ◦ gI , fS ◦ gS , fO ◦ gO ). It is easy to show that the composition of two
homomorphisms is again a homomorphism. (See Exercise 8.11.2.)

Our goal now is to formulate a version of the Homomorphism Theorem


for automata. To do this we must first define the appropriate analogues of
congruences and kernels.

Definition 8.3.4 Let H = (Z, X, B; δ, λ) be a finite deterministic automa-


ton with output. The triple R = (RI , RS , RO ) is called a congruence on
H if RI , RS and RO are equivalence relations on the sets X, Z and B re-
spectively, and each is compatible with respect to the operations of δ and
λ. This means that for all (z1 , z2 ) ∈ RS and all (x1 , x2 ) ∈ RI we have
(δ(z1 , x1 ), δ(z2 , x2 )) ∈ RS and (λ(z1 , x1 ), λ(z2 , x2 )) ∈ RO .

Let H = (Z, X, B; δ, λ) be an automaton with output, and let R =


(RI , RS , RO ) be a congruence on H. The quotient automaton of H with
respect to the congruence R is defined as the automaton

H/R := (Z/RS , X/RI , B/RO ; δ′, λ′),

where δ′ and λ′ are defined by

δ′([z]RS , [x]RI ) := [δ(z, x)]RS , and λ′([z]RS , [x]RI ) := [λ(z, x)]RO .

The reader should check that our congruence property means precisely that
these two mappings are well defined. As before, congruences occur as kernels
of homomorphisms.

Definition 8.3.5 Let f : H1 → H2 be a homomorphism of H1 into H2 .


The triple kerf := (kerfI , kerfS , kerfO ) is called the kernel of the homo-
morphism f .

It is a straightforward exercise (see Exercise 8.11.3) to show that kernels


of homomorphisms have the nice properties from Chapter 3: the kernel of
any homomorphism f : H1 → H2 is a congruence relation; and for any
congruence R on an automaton H the natural mapping f : H → H/R is a
homomorphism, whose kernel is the congruence R.

Theorem 8.3.6 (The Homomorphism Theorem for Automata) Let f : H → H′ be a surjective homomorphism of automata. Then there is a unique isomorphism h : H/kerf → H′ such that h ◦ natkerf = f , where natkerf : H → H/kerf is the natural homomorphism; that is, the diagram formed by f , natkerf and h commutes.

All of these results for automata with output can be extended to the variations of initial and weak initial automata. In the initial case, we additionally require that the homomorphism f : H1 → H2 be such that fS maps the initial state of H1 to the initial state of H2 , and maps final states of H1 to final states of H2 .

The definition of algebraic constructions such as subautomata and homo-


morphic images gives us an algebraic way to compare different automata.
We can also compare automata by comparing how they behave with respect
to the same input words. We will call two automata equivalent if they be-
have the same way when given the same inputs. We will define this more
formally by means of another type of equivalence, that of equivalence of two
states in the same automaton.

Definition 8.3.7 Let H = (Z, X, B; δ, λ) be an automaton with output.


Two states z1 and z2 in Z are said to be equivalent if for all words w ∈ X ∗
we have λ̂(z1 , w) = λ̂(z2 , w). In this case we write z1 ∼ z2 .

Now we can define equivalence of automata.

Definition 8.3.8 Let H1 = (Z1 , X1 , B1 ; Z01 , δ1 , λ1 ) and H2 = (Z2 , X2 , B2 ;


Z02 , δ2 , λ2 ) be two weak initial finite deterministic automata with output.
Then H1 and H2 are called equivalent, written as H1 ∼ H2 , if there are
functions ϕ1 : Z01 → Z02 and ϕ2 : Z02 → Z01 , such that for all z1 ∈ Z01 and
all z2 ∈ Z02 we have z1 ∼ ϕ1 (z1 ) and z2 ∼ ϕ2 (z2 ).

In the case that H1 and H2 are initial, so Z0i = {z0i }, this condition reduces
to H1 ∼ H2 iff z01 ∼ z02 . We also leave it as an exercise for the reader to show
that ∼ defines an equivalence relation on the class of all finite automata.

The intention here is to use equivalence of states in a machine to reduce the


machine, in the sense of producing a smaller (having fewer states) machine
that accepts the same language. If two states are equivalent, they act the
same on all inputs, and hence we can keep one of them and omit the other;
that is, we form a quotient in which all equivalent states are identified,
without affecting the language accepted. An automaton in which no such
reductions are possible is called reduced.

Definition 8.3.9 A finite deterministic weak initial automaton with output


H = (Z, X, B; δ, λ) is called reduced if for all states z1 and z2 in Z, whenever
z1 ∼ z2 we have z1 = z2 .

Theorem 8.3.10 For each weakly initial deterministic finite automaton,


there exists an equivalent reduced automaton.
Proof: Let H = (Z, X, B; δ, λ) be a weakly initial deterministic finite automaton with output. We define a new automaton H′ = (Z′, X′, B′; δ′, λ′) as follows. We use the same alphabet and output sets, so X′ = X and B′ = B. The states in Z′ are the equivalence classes of states from Z with respect to the equivalence ∼ of states, so

Z′ = {[z]∼ | z ∈ Z}.

We take Z′0 = {[z0 ]∼ | z0 ∈ Z0 }. We define the transition and output functions δ′ : Z′ × X → Z′ and λ′ : Z′ × X → B by

δ′([z]∼ , x) := [δ(z, x)]∼ and λ′([z]∼ , x) := λ(z, x).

We show first that these functions are well defined, then check that H′ does recognize the same language as H. Suppose that ([z1 ]∼ , x1 ) = ([z2 ]∼ , x2 ), so that x1 = x2 and z1 ∼ z2 . By the definition of equivalent states this means that for any input w ∈ X ∗ we have

λ̂(z1 , w) = λ̂(z2 , w). (∗)

In particular, we have λ(z1 , x1 ) = λ(z2 , x2 ) when x1 = x2 . It follows that λ′([z1 ]∼ , x1 ) = λ(z1 , x1 ) = λ(z2 , x2 ) = λ′([z2 ]∼ , x2 ), showing that λ′ is well defined.

For δ′, we have to show (using x = x1 = x2 ) that when [z1 ]∼ = [z2 ]∼ , then [δ(z1 , x)]∼ = [δ(z2 , x)]∼ . This is equivalent to showing that when (∗) holds, then also the two states δ(z1 , x) and δ(z2 , x) always produce the same output, that is, that λ̂(δ(z1 , x), p) = λ̂(δ(z2 , x), p) for any input p. We apply the equation (∗) with input w = xp. Then we have

λ̂(z1 , xp) = λ̂(z2 , xp).

From the inductive definition of λ̂, this gives

λ(z1 , x)λ̂(δ̂(z1 , x), p) = λ(z2 , x)λ̂(δ̂(z2 , x), p).

By (∗) with w = x, the first part in each of these concatenations is the same;
so the remaining parts must also be equal:

λ̂(δ̂(z1 , x), p) = λ̂(δ̂(z2 , x), p).

This is enough to show that δ(z1 , x) is equivalent to δ(z2 , x), as required to show that δ′ is well defined.

To see that H′ is equivalent to H, we define mappings ϕ1 : Z0 → Z′0 and ϕ2 : Z′0 → Z0 . We let ϕ1 (z0 ) = [z0 ]∼ for all z0 ∈ Z0 . For ϕ2 we need to first choose a system of representatives with respect to ∼, one for each equivalence class; then we let ϕ2 ([z0 ]∼ ) be this representative. Then using the definition of λ′ we can show by induction that λ̂(z0 , p) = λ̂′([z0 ]∼ , p). This gives the equivalence of the two automata.

It can also be shown that the equivalent reduced automaton H′ constructed in this proof is uniquely determined by H, up to isomorphism. (See Exercise 8.11.4.)
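Since no two states of the machine in Example 8.2.5 share even their one-letter output row, that machine is already reduced. To illustrate the reduction, the following sketch (our own; the three-state machine in it is made up, not from the text) computes the equivalence ∼ of Definition 8.3.7 by partition refinement: start from the partition by one-letter outputs, then repeatedly split classes whose successors fall into different classes. States carrying the same final class label are equivalent and may be merged.

```python
# Moore-style partition refinement for a Mealy machine (delta, lam).

def equiv_classes(states, alphabet, delta, lam):
    # initial partition: by the row of one-letter outputs
    part = {z: tuple(lam[(z, a)] for a in alphabet) for z in states}
    while True:
        # refine: split by (output, class of successor) per letter
        new = {z: tuple((lam[(z, a)], part[delta[(z, a)]]) for a in alphabet)
               for z in states}
        if len(set(new.values())) == len(set(part.values())):
            return new          # stable partition reached
        part = new

states = ["A", "B", "C"]
alphabet = ["0", "1"]
delta = {("A","0"): "B", ("A","1"): "C",
         ("B","0"): "A", ("B","1"): "B",
         ("C","0"): "A", ("C","1"): "C"}
lam   = {("A","0"): "0", ("A","1"): "1",
         ("B","0"): "1", ("B","1"): "0",
         ("C","0"): "1", ("C","1"): "0"}

classes = equiv_classes(states, alphabet, delta, lam)
assert classes["B"] == classes["C"]      # B ~ C, so they can be merged
```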

The equivalence of two automata was defined only in terms of their behaviour
on the same input words. But such equivalence also turns out to have an
algebraic interpretation.

Theorem 8.3.11 Let H = (Z, X, B; δ, λ) be a weakly initial deterministic finite automaton with output, and let H′ be the corresponding equivalent reduced automaton, as constructed in the previous proof. Then H′ is a homomorphic image of H.

Proof: We use the triple of mappings ϕ = (ϕI , ϕS , ϕO ) given by ϕI = idX , ϕO = idB and ϕS : Z → Z′ defined by ϕS (z) = [z]∼ for all z ∈ Z. Then clearly ϕ is surjective, and it is a homomorphism since ϕS (δ(z, x)) = [δ(z, x)]∼ = δ′([z]∼ , x) = δ′(ϕS (z), x) = δ′(ϕS (z), ϕI (x)) and ϕO (λ(z, x)) = λ′([z]∼ , x) = λ′(ϕS (z), ϕI (x)).

8.4 Tree Recognizers


In Section 8.2 we defined language recognizers which accept or reject words
over a given finite alphabet. As we saw in Example 6.4.5, such words are
a convenient way of representing the terms in the relatively free algebra in
the variety of semigroups. In this section we want to define automata which
are able to accept or reject terms of any type, that is, elements of the free
algebra Fτ (X) of type τ . Since terms can always be regarded as trees (see
Section 5.1), we think of accepting or rejecting trees, and our automata will
be called tree-recognizers.
We will require that our alphabet X be finite, and also that our type τ
contain only finitely many operation symbols. We group the elements of
the set {fi | i ∈ I} of operation symbols of type τ into subsets of different
arities, to consider a so-called ranked alphabet Σ = Σ0 ∪· · ·∪Σn of operation
symbols; here Σj contains all the j-ary operation symbols of type τ . We will
now write our algebras (A; (fiA )i∈I ) as (A; ΣA ), where ΣA denotes a set of
operations defined on the set A and induced by the operation symbols from
Σ. We refer to Σ as a ranked alphabet of symbols, and to X as an input
alphabet or variable set. We will also use the name WΣ (X) for the set of all
terms on the alphabet X using the operation symbols from Σ.

Definition 8.4.1 A Σ − X-tree-recognizer is a sequence

A = (A, Σ, X, α, A0 ),
where
A := (A; ΣA ) is a finite algebra,
Σ is a ranked alphabet of operation symbols,
X is a set of individual variables,
α : X ∪ Σ0 −→ A ∪ ΣA0 is a mapping, called the evaluation mapping, and
A0 is a subset of the (finite) set A.

The evaluation mapping α maps each variable from X to an element of A,


and each nullary operation symbol from Σ (if any) to the corresponding
nullary operation, that is, element of A. Let us denote by FΣ (X) the ab-
solutely free algebra Fτ (X) of the type τ defined by the operation symbols
from Σ. Then any evaluation map α can be uniquely extended to a homo-
morphism α̂ : FΣ (X) −→ A (see Chapter 5).

A language (over X and Σ) is any subset T of terms from WΣ (X). We want


of course to use tree-recognizers to recognize or accept a language. To think
of a tree-recognizer as an automaton, we call the elements of the finite set A
states, while states in the special subset A0 ⊆ A are called final states. The
extension evaluation α̂ maps any term to an element of A, and those terms
which are mapped to a final state are said to be accepted or recognized by
the automaton A. Thus we call the set

T (A) := {t | t ∈ WΣ (X) and α̂(t) ∈ A0 }

the language recognized by A. By definition, T (A) is the preimage of A0 un-


der α̂, that is, T (A) = α̂−1 (A0 ).

A language T ⊆ WΣ (X) is called recognizable if there is a Σ − X-tree-


recognizer A such that T = T (A).

Tree-recognizers defined in this way work in a deterministic top-down fashion


on terms regarded as trees. Our evaluation map α specifies the action of the
tree-recognizer at the top of a tree: variables or nullary operation symbols
which label the leaves of a given tree are replaced by elements of A and
by the corresponding nullary operations of the algebra A, respectively. Then
inductively the m-ary operation symbols which label the vertices of the given
tree are replaced by the corresponding m-ary fundamental operations of
A. Thus step-by-step the value α̂(t) will be calculated from the top down,
starting from the leaves and ending in the root of the tree. We illustrate this
with an example.

Example 8.4.2 We take Σ = Σ1 ∪Σ2 , where Σ1 = {h} and Σ2 = {f, g}, and
alphabet X = {x1 , x2 }. We consider the finite algebra A = ({0, 1}; ∧, ∨, ¬),
where hA = ¬, f A = ∧ and g A = ∨. We define an evaluation α by
α(x1 ) = 1 and α(x2 ) = 0. We also designate A0 = {1}. Consider the term
t = f (h(f (x2 , x1 )), g(h(x2 ), x1 )). The tree corresponding to this term is accepted by this tree-recognizer, since α̂(t) = (¬(0 ∧ 1)) ∧ (¬(0) ∨ 1) = 1 ∈ A0 .
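For terms over a finite algebra the value α̂(t) can be computed by a direct recursion on the tree. The sketch below encodes the algebra, evaluation and term of this example; the nested-tuple representation of terms is our own convention, not from the text.

```python
# The two-element algebra A = ({0,1}; ∧, ∨, ¬) of Example 8.4.2,
# with hA = ¬, fA = ∧, gA = ∨ and evaluation α(x1) = 1, α(x2) = 0.
ops = {"h": lambda a: 1 - a,        # negation
       "f": lambda a, b: a & b,     # conjunction
       "g": lambda a, b: a | b}     # disjunction
alpha = {"x1": 1, "x2": 0}

def evaluate(t):
    """Compute the homomorphic extension of alpha on a term tree."""
    if isinstance(t, str):          # a variable leaf
        return alpha[t]
    op, *args = t
    return ops[op](*(evaluate(s) for s in args))

t = ("f", ("h", ("f", "x2", "x1")), ("g", ("h", "x2"), "x1"))
assert evaluate(t) == 1             # 1 lies in A0 = {1}, so t is accepted
```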

We remark that tree-recognizers can also be defined in a “bottom-up” style,


to work from the root of a tree to its leaves.

The following example shows that not every language is recognizable.

Example 8.4.3 Let Σ = Σ2 = {f } and let X be an arbitrary nonempty


finite alphabet. We will show that the language T = {f (t, t) | t ∈ WΣ (X)} is
not recognizable by any Σ − X-tree-recognizer. For suppose that there was
a Σ − X-tree-recognizer A with T = T (A). Since A is finite, there must be
two different Σ − X-trees t and t′ such that α̂(t) = α̂(t′). But then

α̂(f (t, t′)) = f A (α̂(t), α̂(t′)) = f A (α̂(t), α̂(t)) = α̂(f (t, t)) ∈ A0 .

This gives the contradiction f (t, t′) ∈ T .


Our Σ − X-tree-recognizers are designed to recognize terms from the abso-
lutely free algebra FΣ (X). One can also define recognizers for the relatively
free algebra FV (X) with respect to any variety V of type Σ. The reader
should verify that the usual finite automata from Section 8.2, recognizing
words (free semigroup terms) on a finite alphabet, are a special case of this
more general definition.

Another variation in the basic definition of a Σ − X-tree-recognizer involves


the property of determinism. Our definition so far gives a deterministic ma-
chine, but in some applications and in the theoretical development we shall
also need a non-deterministic version. To define this we must first define the
concepts of a non-deterministic operation and a non-deterministic algebra.

Let P(A) be the power set of the set A. For n ≥ 1, an n-ary mapping

f A : An −→ P(A)

is called an n-ary non-deterministic operation on A. Such a mapping assigns to each n-tuple of elements of A a set (possibly empty) of elements of A; so instead of a function with exactly one output for each input we have a situation where one input may be assigned no output or several different outputs.
Nullary non-deterministic operations are defined as mappings

f A : {∅} −→ P(A).

A non-deterministic algebra A = (A; ΣA ) is a pair consisting of a set A and


a set ΣA of non-deterministic operations on A.

Of course any deterministic algebra may be viewed as a non-deterministic


one, with the convention that elements of the universe are identified with the
corresponding singletons. Conversely, given any non-deterministic algebra
A = (A; ΣA ) we can form a related ordinary algebra. This is the power
set algebra P(A) = (P(A); ΣP(A) ), where the fundamental operations are
defined by

f P(A) (A1 , . . . , An ) := ⋃ {f A (a1 , . . . , an ) | a1 ∈ A1 , . . . , an ∈ An },

for f A ∈ ΣA and A1 , . . . , An ∈ P(A).

Definition 8.4.4 A non-deterministic Σ − X-tree-recognizer is a sequence

A = (A, Σ, X, α, A0 ),
where
A:= (A; ΣA ) is a finite non-deterministic algebra,
Σ is a ranked alphabet of operation symbols,
X is a set of individual variables,
α : X ∪ Σ0 −→ P(A) ∪ ΣA0 is a mapping (called the evaluation mapping),
and
A0 is a subset of the (finite) set A.
The set
T (A) := {t | t ∈ WΣ (X) and α̂(t) ∩ A0 ≠ ∅}
is called the language recognized by A.
It turns out that deterministic and non-deterministic tree-recognizers are
equivalent, in the sense that the families of languages which they recognize
are equal. This can be useful, since non-deterministic tree-recognizers are
sometimes easier to work with than deterministic ones.

Proposition 8.4.5 A language is recognized by a deterministic Σ-X-tree-


recognizer iff it is recognized by a non-deterministic one.
Proof: Since any deterministic Σ − X-tree-recognizer can be viewed as a
non-deterministic one, one direction of our proof is immediate. For the op-
posite direction, let A = (A, Σ, X, α, A0 ) be a non-deterministic Σ − X-
tree-recognizer, based on the non-deterministic algebra A = (A; ΣA ). We construct a deterministic tree-recognizer which recognizes the same language. For our algebra, we use the deterministic power set algebra P(A) = (P(A); ΣP(A) ) discussed above, and for our set of final states we use A′0 := {A1 ∈ P(A) | A1 ∩ A0 ≠ ∅}. For our tree-recognizer then we have P(A) = (P(A), Σ, X, α, A′0 ). Then for any term t,

t ∈ T (P(A)) ⇔ α̂(t) ∈ A′0 ⇔ α̂(t) ∩ A0 ≠ ∅ ⇔ t ∈ T (A).
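The passage from f A to f P(A) can be written generically. In the sketch below (our illustration; non-deterministic operations are modelled as Python functions returning sets of states, an assumed encoding) an operation is lifted to the power set algebra exactly as in the displayed union formula.

```python
from itertools import product

def power_op(f_nd):
    """Lift a non-deterministic operation to the power set algebra:
    f_pow(A1, ..., An) is the union of f_nd(a1, ..., an) over ai in Ai."""
    def f_pow(*sets):
        out = set()
        for args in product(*sets):
            out |= f_nd(*args)
        return frozenset(out)
    return f_pow

# a made-up binary non-deterministic operation on A = {0, 1}
f_nd = lambda a, b: {a, b}
f_pow = power_op(f_nd)
assert f_pow({0}, {1}) == frozenset({0, 1})
assert f_pow({0, 1}, set()) == frozenset()   # an empty argument set gives the empty set
```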

8.5 Regular Tree Grammars


In Section 8.2 we proved that any language which is recognizable by a finite
automaton without output is a regular language, and conversely that every
regular language is recognizable by some finite automaton. To prove a similar
result for tree-languages, we introduce the concept of a regular tree grammar.

Definition 8.5.1 A regular Σ − X-tree grammar is a sequence

G = (N, Σ, X, P, a0 ),

where
N is a finite non-empty set, called the set of non-terminal symbols,
Σ is a ranked alphabet of operation symbols,
X is an alphabet of individual variables,
P is a finite set of productions (or rules of derivation) which have the form
a → r for some a ∈ N and r ∈ WΣ (N ∪ X), and a0 ∈ N is called an initial
symbol.
We also specify that N ∩ (Σ ∪ X) = ∅.
Let G be a regular Σ − X-tree grammar. We think of a production a → r
from G as a rule allowing us to replace the non-terminal symbol a by the
term (tree) r. Such a replacement is done within terms, as follows. Let s be
a term from WΣ (X ∪ N ) in which the non-terminal symbol a occurs as a
subterm. If we have a production a → r in G, we can change s to a new
term t by replacing the subterm a by the term r. In this case we write

s ⇒G t,

to indicate that s can be transformed into t by one application of a produc-


tion rule from G.

We write
s ⇒∗G t
if either s = t or there is a sequence (t0 , . . . , tn ) of terms from WΣ (X ∪ N )
with t0 = s, tn = t and

t0 ⇒G t1 ⇒G · · · ⇒G tn−1 ⇒G tn .

Such a sequence is called a derivation of t from s. It is common to omit the


G-subscript, if the grammar is clear from the context. Note that ⇒∗G is the
reflexive and transitive closure of the relation ⇒G , if we consider ⇒G as a
relation on WΣ (X ∪ N ).

Definition 8.5.2 Let G = (N, Σ, X, P, a0 ) be a regular Σ−X-tree grammar.


Then the set
T (G) = {t | t ∈ WΣ (X) and a0 ⇒∗G t}
is called the language generated by the grammar G. Two regular Σ − X-tree
grammars G1 and G2 are said to be equivalent if T (G1 ) = T (G2 ).

Example 8.5.3 Consider the regular tree grammar

G = ({a, b}, Σ, X, P, a),

where Σ = Σ0 ∪ Σ2 , Σ0 = {h}, Σ2 = {f }, X = {x}, and P contains the


three productions

a → f (x, f (x, b)), a → f (h, a) and b → f (x, x).

Then the tree f (h, f (x, f (x, f (x, x)))) has the derivation

a ⇒G f (h, a) ⇒G f (h, f (x, f (x, b))) ⇒G f (h, f (x, f (x, f (x, x)))),

and the tree f (x, f (x, f (x, x))) has the derivation

a ⇒G f (x, f (x, b)) ⇒G f (x, f (x, f (x, x))).
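One-step derivation ⇒G can be implemented by trying each production at every position of a term. The sketch below uses the grammar of this example, with non-terminals, variables and the nullary symbol h all encoded as plain strings and terms as nested tuples (our own representation, not from the text); it retraces the first derivation above.

```python
# Productions of Example 8.5.3, as (non-terminal, replacement-term) pairs.
P = [("a", ("f", "x", ("f", "x", "b"))),
     ("a", ("f", "h", "a")),
     ("b", ("f", "x", "x"))]

def step(t):
    """All terms obtainable from t by one application of a production."""
    if isinstance(t, str):
        return [r for (n, r) in P if n == t]   # empty list for x, h, etc.
    out = []
    op, *args = t
    for i, s in enumerate(args):               # rewrite inside each subterm
        for s2 in step(s):
            out.append((op, *args[:i], s2, *args[i+1:]))
    return out

t1 = ("f", "h", "a")
t2 = ("f", "h", ("f", "x", ("f", "x", "b")))
t3 = ("f", "h", ("f", "x", ("f", "x", ("f", "x", "x"))))
assert t1 in step("a") and t2 in step(t1) and t3 in step(t2)
```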

As usual when we define an equivalence between objects, we want to be able


to replace a given grammar by a “simpler” one equivalent to it. To make
this more precise, we define the concept of a normal form of a tree grammar,
by selecting particular forms of productions. We will measure the depth of
a production a → r by the depth of the term r.

Definition 8.5.4 A regular tree grammar is said to be in normal form if


each of its productions a → r is of one of the following types:
(i) if r has depth 0, then r is either a variable x ∈ X or a nullary symbol
f ∈ Σ0 ;
(ii) otherwise r has the form fi (a1 , . . . , ani ) for some ni ≥ 1 and fi ∈ Σni
and some a1 , . . . , ani ∈ N .
Then we can prove the following.

Lemma 8.5.5 Every regular tree grammar is equivalent to a regular tree


grammar in normal form.
Proof: Let G be a regular Σ − X-tree grammar. We describe a procedure for
producing a normal form grammar equivalent to G. As a first step, we delete
from the production set P any production of the form a → b where both a
and b are non-terminal symbols in N , replacing it instead by all productions
of the form a → r for which we have a ⇒∗G b and b → r in P , for some
r ∈ WΣ (X ∪ N ) \ N .

Next let a → r be a production with depth greater than one. Then r has the
form fi (r1 , . . . , rni ), where ni ≥ 1, fi ∈ Σni and depth(rj ) < depth(r) for all
j = 1, . . . , ni . In this case we delete a → r, and replace it with productions
of the form

a → fi (a1 , . . . , ani ) (∗)

where a1 , . . . , ani are new non-terminal symbols, and

aj → rj , (∗∗)

for each 1 ≤ j ≤ ni .

Clearly any application of the rule a → r can be replaced by an application


of (*) followed by applications of productions of the form (**). Conversely,
any application of a production (**) occurs only after an application of (*),
and if (*) has been used then (**) has to follow. Thus these steps have
the same result as a single application of a → r. If one of the symbols rj
in a → fi (r1 , . . . , rni ) is a variable, a nullary operation symbol or a non-
terminal symbol, then we substitute a new non-terminal symbol for it and
introduce a rule d → rj of depth 0.

In this way, step by step, any production with a depth greater than 1 can be
replaced by productions of lower depth, until we reach a normal form. None
of these steps changes the language generated by the grammar; so we obtain
an equivalent normal-form grammar.

For technical reasons we introduce the following slight generalization of a


regular Σ − X grammar. Instead of the selected initial symbol a0 , we use
a set A0 ⊆ N of initial symbols. This gives us the concept of an extended
regular Σ − X grammar G = (N, Σ, X, P, A0 ). The language generated by
an extended regular grammar is defined as

T (G) := {t ∈ WΣ (X) | ∃a0 ∈ A0 (a0 ⇒∗G t)}.

Clearly, every language generated by an extended regular tree grammar can


be generated by an ordinary tree grammar; see Exercise 8.11.7.

Regular grammars and recognizable languages are connected by the following


Kleene-type theorem:

Theorem 8.5.6 A language is recognizable by a Σ − X-tree-recognizer iff it


can be generated by a regular tree grammar.
Proof: Let A = (A, Σ, X, α, A0 ) be a non-deterministic Σ − X-recognizer.
We use A to construct an extended Σ − X-grammar G as follows. For the set
N of non-terminal symbols of G we take the base set A of the algebra from
A. Our grammar G will have the form (A, Σ, X, P, A0 ), where P is a set of
normal form productions defined as follows. We put into P productions of
three kinds:

(i) For any x ∈ X and any a ∈ α(x), we have a production a → x;


(ii) For any nullary symbol f ∈ Σ0 and any a ∈ f A , we have a production
a → f;
(iii) For any symbol fi ∈ Σni with ni ≥ 1 and any a, a1 , . . . , ani ∈ A,
a ∈ fiA (a1 , . . . , ani ), we have a production a → fi (a1 , . . . , ani ).

These productions have the form required to ensure that the grammar G
is in normal form. Conversely, given any grammar G, we may construct an
equivalent normal form grammar and then use the productions to define a
non-deterministic Σ − X-recognizer. Now we claim that this grammar and


recognizer produce the same language, that is, that T (A) = T (G). To prove
this it will suffice to show that the following equivalence is satisfied:

a ∈ α̂(t) iff a ⇒∗G t for all a ∈ A and t ∈ WΣ (X). (∗)

We proceed by induction on the depth of the tree t. For the base case, if
t has depth zero it is either a variable x or a nullary operation symbol f .
If t = x ∈ X and a ∈ α̂(x) then a ⇒∗G x since a → x is a production. If
conversely a ⇒∗G x then to start this derivation we can only use a production
of the form a → x. But then a ∈ α(x). A similar argument holds in the case
that t = f for some f ∈ Σ0 , using the second kind of productions in P .

For the inductive step, let t = fi (t1 , . . . , tni ) and suppose that

a ∈ α̂(tj ) iff a ⇒∗G tj for all a ∈ A and all j ∈ {1, . . . , ni }.

If a ∈ α̂(t), then there exist elements a1 , . . . , ani ∈ A such that aj ∈ α̂(tj )


for 1 ≤ j ≤ ni and such that a ∈ fiA (a1 , . . . , ani ). Then the induction
hypothesis implies aj ⇒∗G tj , for all j ∈ {1, . . . , ni }. Using the production
a → fi (a1 , . . . , ani ) we obtain the derivation

a ⇒G fi (a1 , . . . , ani ) ⇒∗G fi (t1 , . . . , tni ) = t.

Conversely, if a ⇒∗G t, then there is a derivation of the form

a ⇒G fi (a1 , . . . , ani ) ⇒∗G fi (t1 , . . . , tni ),

where a1 , . . . , ani ∈ A and aj ⇒∗G tj for j = 1, . . . , ni . Since the first step


of this derivation applies a rule of the form a → fi (a1 , . . . , ani ), we get
a ∈ fiA (a1 , . . . , ani ), and the induction hypothesis implies aj ∈ α̂(tj ) for
j = 1, . . . , ni . But this means that a ∈ fiP(A) (α̂(t1 ), . . . , α̂(tni )) = α̂(t).

This proves that the equivalence (∗) holds. Now for every tree t we have
t ∈ T (A) ⇔ α̂(t) ∩ A0 ≠ ∅ ⇔ ∃a ∈ A0 (a ∈ α̂(t)) ⇔ ∃a ∈ A0 (a ⇒∗G t) ⇔
t ∈ T (G).

This finally gives T (A) = T (G).
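By the equivalence (∗), membership in T (A) can be tested by computing α̂ bottom-up, exactly as the grammar generates trees top-down. A minimal sketch follows; the signature {f /2, c/0}, the variable set {x}, the state set and the transition relation are all illustrative assumptions, not taken from the text:

```python
from itertools import product

ALPHA = {"x": {0}}                        # evaluation of variables: alpha(x) ⊆ A
NULLARY = {"c": {1}}                      # f^A ⊆ A for each nullary symbol f
TRANS = {"f": lambda a, b: {1} if a != b else {0}}   # f^A : A × A -> P(A)
FINAL = {1}                               # the set A' of final states

def alpha_hat(t):
    """The set of states the non-deterministic recognizer can reach on t."""
    if isinstance(t, str):                # variable or nullary symbol
        return ALPHA.get(t) or NULLARY.get(t) or set()
    op, *args = t
    result = set()
    for combo in product(*map(alpha_hat, args)):
        result |= TRANS[op](*combo)
    return result

def recognized(t):
    return bool(alpha_hat(t) & FINAL)     # t ∈ T(A) iff alpha_hat(t) ∩ A' ≠ ∅

print(recognized(("f", "x", "c")))   # True
print(recognized(("f", "x", "x")))   # False
```

Running this recognizer accepts f (x, c) and rejects f (x, x); by (∗), the same answers would be produced by derivations a ⇒∗G t in the associated grammar.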



8.6 Operations on Tree Languages


Let us denote by Rec(Σ, X) the set of all recognizable Σ − X-languages.
We will study some algebraic properties of this set, beginning with some
operations under which it is closed.

Proposition 8.6.1 If S and T are languages in Rec(Σ, X), then S ∩ T ,


S ∪ T and S \ T are in Rec(Σ, X).

Proof: Suppose that S and T are recognized by the Σ − X-tree-recognizers


A and B, respectively. For our new tree-recognizers, we use as underlying
Σ-algebra the direct product C := A × B. We use the evaluation mapping
γ : X → C := A × B, defined by x 7→ (α(x), β(x)), for all x ∈ X, where
α and β are the evaluation mappings from A and B, respectively. Clearly,
γ̂(t) = (α̂(t), β̂(t)) for all t ∈ FΣ (X). If we now take

C1 = (C, Σ, X, γ, A0 × B 0 ),
C2 = (C, Σ, X, γ, (A0 × B) ∪ (A × B 0 )), and
C3 = (C, Σ, X, γ, A0 × (B \ B 0 )),

we obtain Σ − X-recognizers for the languages S ∩ T , S ∪ T and S \ T respec-


tively. For the intersection, we have t ∈ T (C1 ) iff γ̂(t) = (α̂(t), β̂(t)) ∈ A0 ×B 0
iff t ∈ T (A) ∩ T (B). This shows T (C1 ) = S ∩ T . The verifications for the
union and set difference are similar, and we leave them as exercises (see Ex-
ercise 8.11.8).
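The product construction used in this proof can be sketched concretely for deterministic recognizers: run both automata in parallel on A × B and pick the final-state set according to the desired Boolean combination. The two component recognizers below (leaf-count parity and capped depth) are illustrative assumptions, not taken from the text:

```python
def evaluate(t, alpha, ops):
    """Deterministic evaluation map alpha^ on a tree t."""
    if isinstance(t, str):
        return alpha[t]
    op, *args = t
    return ops[op](*(evaluate(s, alpha, ops) for s in args))

# Recognizer A: state = parity of the number of leaves (final: odd)
A_alpha, A_ops, A_final = {"x": 1}, {"f": lambda a, b: (a + b) % 2}, {1}
# Recognizer B: state = depth capped at 2 (final: depth >= 1)
B_alpha, B_ops, B_final = {"x": 0}, {"f": lambda a, b: min(max(a, b) + 1, 2)}, {1, 2}

def product_member(t, mode):
    a = evaluate(t, A_alpha, A_ops)          # first coordinate of gamma^(t)
    b = evaluate(t, B_alpha, B_ops)          # second coordinate of gamma^(t)
    if mode == "inter":
        return a in A_final and b in B_final   # final states A' × B'
    if mode == "union":
        return a in A_final or b in B_final    # (A' × B) ∪ (A × B')
    if mode == "diff":
        return a in A_final and b not in B_final  # A' × (B \ B')
```

The three `mode` branches correspond to the final-state sets chosen for C1, C2 and C3 in the proof.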

Now we consider mappings which transform trees from one language into
trees from another one. Let Σ := {fi | i ∈ I} be a set of operation symbols
of type τ1 = (ni )i∈I , where fi is ni -ary, ni ∈ IN and let Ω = {gj | j ∈ J}
be a set of operation symbols of type τ2 = (nj )j∈J where gj is nj -ary. We
denote by WΣ (X) and by WΩ (Y ) the sets of all terms of type τ1 and τ2 ,
respectively, where X and Y are alphabets of variables.

Definition 8.6.2 Let χni := {ξ1 , . . . , ξni } be an auxiliary alphabet. Let


hX : X → WΩ (Y ) and for each ni ∈ IN let hni : Σni → WΩ (Y ∪ χni ) be
mappings. Then the tree homomorphism determined by these mappings is
the mapping
h : WΣ (X) → WΩ (Y )
defined inductively by
(i) h(x) := hX (x) for every x ∈ X.

(ii) h(fi (t1 , . . . , tni )) = hni (fi )(ξ1 ← h(t1 ), . . . , ξni ← h(tni )), where ξj ←
h(tj ) means that h(tj ) is substituted for ξj .

The tree homomorphism h is said to be linear if for any ni ≥ 0 and fi ∈ Σni
no variable ξj occurs more than once in the term hni (fi ) used in (ii).

Example 8.6.3 Consider the type τ = (2, 2) with operation symbols Σ :=


{f, g} and the alphabet X = Y = {x, y}. Then the mappings hX (x) := x
for every x ∈ X and h2 : {f, g} → WΣ (X) defined by f 7→ f (x, g(x, y)) and
g 7→ f (y, x) define a tree homomorphism h. In fact this is a special kind
of tree homomorphism called a hypersubstitution, which will be studied in
Chapter 14.
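The inductive clauses of Definition 8.6.2 translate directly into a short program. The sketch below instantiates them with the hypersubstitution of Example 8.6.3, identifying the variables x, y in the images with the auxiliary variables ξ1, ξ2 (written xi1, xi2); the tuple encoding of terms is an illustrative choice:

```python
H_X = {"x": "x", "y": "y"}                     # h_X : X -> W_Omega(Y)
H_OPS = {                                      # h_2 on the binary symbols
    "f": ("f", "xi1", ("g", "xi1", "xi2")),    # f |-> f(xi1, g(xi1, xi2))
    "g": ("f", "xi2", "xi1"),                  # g |-> f(xi2, xi1)
}

def substitute(pattern, env):
    """Replace auxiliary variables in a pattern by the trees in env."""
    if isinstance(pattern, str):
        return env.get(pattern, pattern)
    op, *args = pattern
    return (op, *(substitute(a, env) for a in args))

def h(t):
    if isinstance(t, str):                     # clause (i)
        return H_X[t]
    op, *args = t                              # clause (ii)
    env = {f"xi{j + 1}": h(s) for j, s in enumerate(args)}
    return substitute(H_OPS[op], env)

print(h(("g", "x", "y")))        # ('f', 'y', 'x')
print(h(("f", "x", "y")))        # ('f', 'x', ('g', 'x', 'y'))
```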

We remark that tree homomorphisms do not always preserve recognizability,


as the following example shows:

Example 8.6.4 We put Σ = Σ1 = {f }, Ω = Ω2 = {g}, X = Y = {x}. We


define hX and h1 by

hX (x) = x, and h1 (f ) = g(ξ1 , ξ1 ).

The Σ − X-trees have the form tk = f (f (· · · f (x) · · ·)) = f k (x), for k ≥ 0.


Clearly, h(WΣ (X)) consists of the trees

s0 := x, s1 := g(x, x), . . . , sk+1 := g(sk , sk ), . . . .

Suppose there is an Ω − Y -recognizer A = (A, Ω, Y, α, A0 ) such that T (A) =
h(WΣ (X)). Since A is finite, there must exist two integers i, j ≥ 0 with i ≠ j
such that α̂(si ) = α̂(sj ). But then α̂(g(si , sj )) = g A (α̂(si ), α̂(sj )) =
g A (α̂(si ), α̂(si )) = α̂(si+1 ) ∈ A0 . This means that g(si , sj ) ∈ T (A) =
h(WΣ (X)), although for i ≠ j the tree g(si , sj ) is not one of the trees sk .
This contradiction shows that h(WΣ (X)) cannot be recognizable.
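The duplication caused by the non-linear rule h1 (f ) = g(ξ1 , ξ1 ) is easy to observe by measuring term sizes; a minimal sketch:

```python
def h(t):
    """The tree homomorphism of Example 8.6.4 on trees f^k(x)."""
    if t == "x":
        return "x"
    _, sub = t                     # t = ("f", sub)
    image = h(sub)
    return ("g", image, image)     # xi1 is used twice: non-linear

def size(t):
    return 1 if isinstance(t, str) else 1 + sum(size(s) for s in t[1:])

t = "x"
for k in range(5):
    print(k, size(h(t)))           # sizes 1, 3, 7, 15, 31
    t = ("f", t)
```

The image of f^k (x) has 2^(k+1) − 1 nodes, so no finite recognizer can separate the image language from its complement, as the example shows.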

If we have the additional condition that the tree homomorphism h is linear,


then the image h(T ) of a recognizable language T is recognizable.

Theorem 8.6.5 If h : WΣ (X) → WΩ (Y ) is a linear tree homomorphism


and if T is a recognizable language, then h(T ) is a recognizable language.

Proof: Since T is recognizable, there is a regular tree grammar G =
(N, Σ, X, P, a0 ) which generates T . We may assume that G is in normal form
and that G has no non-terminal symbols from which no Σ − X-tree from
T can be generated. Adding all non-terminal symbols from N as nullary
operation symbols to Σ and to Ω we obtain the sets Σ′ and Ω′ of operation
symbols. Now we extend the given linear tree homomorphism to a tree
homomorphism h′ : WΣ′ (X) → WΩ′ (Y ), by extending the mapping h0 on
nullary symbols to a mapping h′0 : Σ0 ∪ N → WΩ′ (Y ) so that h′0 (a) = a for
all a ∈ N . Consider now the regular tree grammar G′ = (N, Ω′ , Y, P ′ , a0 )
with

P ′ := {a → h′ (t) | a → t ∈ P }.

The theorem will be proved if we show that T (G′ ) = h(T ). This in turn can
be proved by showing that for all a ∈ N and t ∈ WΩ (Y ), a ⇒∗G′ t iff there
exists s ∈ WΣ (X) with h(s) = t and a ⇒∗G s. We leave the remaining details
to the reader (see for instance F. Gécseg and M. Steinby, [48]).

8.7 Minimal Tree Recognizers


In this section we generalize our considerations of Section 8.3 to the case
of tree-recognizers. We compare tree-recognizers by comparing how they be-
have with respect to the same input words, and use this to define equivalence
of tree-recognizers. Then we look for tree-recognizers which are minimal,
within a class of equivalent tree-recognizers, with respect to the size of their
state sets. Finally, minimal tree-recognizers will be characterized in terms of
homomorphisms of tree-recognizers and quotient recognizers.

As usual, a homomorphism of tree-recognizers should be a mapping which


preserves the structure.

Definition 8.7.1 A homomorphism from a Σ − X-tree-recognizer A =


(A, Σ, X, α, A0 ) to a Σ − X-tree-recognizer B = (B, Σ, X, β, B 0 ) is a map-
ping ϕ : A → B such that

(i) ϕ is a homomorphism from the algebra A to the algebra B,

(ii) ϕ(α(x)) = β(x) for all x ∈ X, and

(iii) ϕ−1 (B 0 ) = A0 .

If ϕ is injective (surjective, bijective) then the homomorphism ϕ is called


injective (surjective, bijective). A bijective homomorphism is called an iso-
morphism.

Lemma 8.7.2 Let A and B be two Σ − X-tree-recognizers. If there exists a


homomorphism ϕ : A → B, then T (A) = T (B).

Proof: We will show by induction on the depth of the term t that


ϕ(α̂(t)) = β̂(t), for all t ∈ WΣ (X). The base case is given by the defini-
tion of a homomorphism. Now inductively assume that ϕ(α̂(tj )) = β̂(tj ),
for 1 ≤ j ≤ ni . Then ϕ(α̂(fi (t1 , . . . , tni ))) = ϕ(fiA (α̂(t1 ), . . . , α̂(tni ))) =
fiB (ϕ(α̂(t1 )), . . . , ϕ(α̂(tni )))
= fiB (β̂(t1 ), . . . , β̂(tni )) = β̂(fi (t1 , . . . , tni )).
Using this equation we have

t ∈ T (B) ⇔ β̂(t) ∈ B 0 ⇔ ϕ(α̂(t)) ∈ B 0 ⇔ α̂(t) ∈ ϕ−1 (B 0 ) = A0


⇔ t ∈ T (A).

An equivalence relation ϱ on a set A is said to saturate a subset A0 of A if
A0 is the union of ϱ-equivalence classes. In this case we write A0 ϱ = A0 .

It turns out that the following concept of congruence relation is suitable to
express the expected relationship between congruence relations and homo-
morphisms.

Definition 8.7.3 A congruence of a Σ − X-tree-recognizer A is a congruence
on the algebra A which saturates A0 , so that A0 ϱ = A0 . We will denote by
C(A) the set of all congruence relations defined on A.

Notice that by definition the set C(A) is a subset of the set ConA of all
congruences on the algebra A.

Proposition 8.7.4 C(A) is a principal ideal of the complete lattice ConA,


and therefore C(A) is also a complete lattice.

Proof: From the definition of a principal ideal in a lattice, we have to verify


the following facts:

(i) ∆A ∈ C(A) and therefore C(A) ≠ ∅;

(ii) θ ⊆ ϱ ∈ C(A) and θ ∈ Con(A) imply θ ∈ C(A);



(iii) ⋃{ϱ | ϱ ∈ C(A)} ∈ C(A).

The details of this proof are left to the reader; see Exercise 8.11.5.

The join in (iii) is the greatest element of C(A), and is thus the generating
element of the principal ideal C(A). Later on we will give a more useful
description of the greatest element of C(A).

Quotient recognizers are defined in the following way:

Definition 8.7.5 Let ϱ ∈ C(A). Then the quotient recognizer of A with
respect to the congruence ϱ is defined by

A/ϱ = (A/ϱ, Σ, X, αϱ , A0 /ϱ),

where αϱ (x) = [α(x)]ϱ for each x ∈ X.
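A quotient recognizer can be sketched concretely. The recognizer below (states 0, 1, 2 with a capped-sum operation) and its congruence classes are illustrative assumptions, not taken from the text; the run confirms that the recognizer and its quotient accept the same tree:

```python
alpha = {"x": 1}                       # evaluation of the single variable
final = {1, 2}                         # A' ; the states are 0, 1, 2
classes = [frozenset({0}), frozenset({1, 2})]   # congruence classes saturating A'

def f_op(a, b):                        # the binary operation of A
    return min(a + b, 2)

def cls(a):                            # the class [a] of a state
    return next(c for c in classes if a in c)

def q_op(c1, c2):                      # induced operation on the quotient;
    return cls(f_op(min(c1), min(c2))) # well defined since the classes form a congruence

def eval_tree(t, leaf, op):            # deterministic evaluation map
    if isinstance(t, str):
        return leaf[t]
    _, l, r = t
    return op(eval_tree(l, leaf, op), eval_tree(r, leaf, op))

t = ("f", "x", "x")
a = eval_tree(t, alpha, f_op)                    # run A
c = eval_tree(t, {"x": cls(alpha["x"])}, q_op)   # run the quotient
print(a in final, c <= frozenset(final))         # True True
```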

It is easy to check that the kernel kerϕ := ϕ−1 ◦ ϕ of a homomorphism


ϕ : A → B is a congruence relation on A. Conversely, if ϱ ∈ C(A), then the
natural mapping

natϱ : A → A/ϱ, given by a ↦ [a]ϱ for a ∈ A,

is a surjective homomorphism. With these facts, we have the analogue of the


Homomorphic Image Theorem.

Theorem 8.7.6 (Homomorphic Image Theorem for Tree-Recognizers) Let
ϕ : A → B be a surjective homomorphism of tree-recognizers. Then there
exists a unique isomorphism f from A/ker ϕ onto B with f ◦ nat(ker ϕ) = ϕ.

Using the natural homomorphism and Lemma 8.7.2 we have the following
conclusion.

Corollary 8.7.7 If ϱ ∈ C(A), then T (A/ϱ) = T (A).

Now we can define what it means for two tree-recognizers to be equivalent.



Definition 8.7.8 Two states a and b of a Σ − X-tree-recognizer A are said


to be equivalent, and we write a ∼A b (or simply a ∼ b if the context is
clear), if

f (a) ∈ A0 ⇔ f (b) ∈ A0 for all unary polynomial operations f of A.

A tree-recognizer is called reduced if no two distinct states are equivalent.


We will also define the concept of a minimal tree-recognizer.

Definition 8.7.9 The Σ − X- tree-recognizer A is called

(i) reduced if ∼A = ∆A ;

(ii) connected if every state is reachable, in the sense that for every a ∈ A
there exists a tree t ∈ WΣ (X) such that α̂(t) = a;

(iii) minimal if it is both connected and reduced.

In a tree-recognizer which is connected, the set α(X) generates the algebra A.


Non-reachable states can be deleted without changing the language which is
recognized. In the finite case, the greatest congruence on A gives the smallest
quotient recognizer.
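Deleting non-reachable states amounts to a fixed-point computation over the operations; a minimal sketch, in which the four-state recognizer and its single operation are illustrative assumptions:

```python
from itertools import product

states = {0, 1, 2, 3}
alpha_img = {0}                                  # alpha(X): states named by variables
ops = {"f": (2, lambda a, b: (a + b + 1) % 3)}   # (arity, operation); 3 never arises

def reachable_states():
    """Close alpha(X) under all operations: the connected part of A."""
    reached = set(alpha_img)
    changed = True
    while changed:
        changed = False
        for arity, fn in ops.values():
            for combo in product(reached, repeat=arity):
                s = fn(*combo)
                if s not in reached:
                    reached.add(s)
                    changed = True
    return reached

print(reachable_states())            # {0, 1, 2}
print(states - reachable_states())   # {3}: this state can be deleted
```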

Theorem 8.7.10 For any Σ − X-tree-recognizer A the relation ∼A is the


greatest congruence of A, and A/ ∼A is a reduced Σ − X-tree-recognizer
which is equivalent to A.
Proof: To prove that ∼A is a congruence, it suffices to prove the compati-
bility property for all unary polynomial operations of A. Suppose a ∼A b
and let f be a unary polynomial operation of A. For every unary polynomial
operation g of A the composition g ◦ f is again a unary polynomial oper-
ation of A, so g(f (a)) ∈ A0 iff g(f (b)) ∈ A0 ; since g was arbitrary, this
implies f (a) ∼A f (b). If a ∼A b and a ∈ A0 , then b = idA (b) ∈ A0 . Thus
A0 ∼A = A0 and therefore ∼A is a congruence relation on A.

Now let ϱ be a congruence of A. If (a, b) ∈ ϱ and if f is a unary polynomial
operation of A, then (f (a), f (b)) ∈ ϱ. From A0 ϱ = A0 we obtain f (a) ∈ A0 iff
f (b) ∈ A0 and thus a ∼A b. This shows that ∼A is the greatest congruence
on A. By Corollary 8.7.7, T (A) = T (A/ ∼A ). Now we have to show that
A/ ∼A is reduced. It follows from ([a]∼A , [b]∼A ) ∈ ∼A/∼A that for all unary
polynomial operations f of A/ ∼A ,

f ([a]∼A ) ∈ A0 / ∼A iff f ([b]∼A ) ∈ A0 / ∼A
⇔ f (a) ∈ A0 iff f (b) ∈ A0
⇔ a ∼A b ⇔ [a]∼A = [b]∼A .

The other inclusion is clear since ∼A is an equivalence relation.

The quotient recognizer A/∼A is called the reduced form of A. Tree recog-
nizers which have isomorphic reduced forms are equivalent. For connected
tree-recognizers the converse is also true.

Theorem 8.7.11 Let A and B be two minimal tree-recognizers. If they are


equivalent then they are also isomorphic.

Proof: We show that the mapping ϕ : A → B defined by

ϕ(α̂(t)) = β̂(t) for all t ∈ WΣ (X)

is an isomorphism from A onto B. We prove this in several steps:

1. Since A is connected, every state of A has the form α̂(t), so ϕ is defined
on every state of A. Since B is connected, ϕ is surjective.

2. We show that ϕ is well defined. Assume that α̂(s) = α̂(t) and that β̂(s) ≠
β̂(t). Then β̂(s) and β̂(t) are non-equivalent since B is reduced.
Then there exists a unary polynomial operation f of the algebra B such
that f (β̂(s)) ∈ B 0 and f (β̂(t)) ∉ B 0 , or vice versa. Furthermore, there is a
tree p ∈ WΣ (B ∪ {ξ}), where ξ is an auxiliary variable and ξ ∉ B ∪ X, such
that for all b ∈ B we have f (b) = pB (βb ) where βb : B ∪ {ξ} → B is defined
by βb |B = 1B and βb (ξ) = b. Since B is connected, for each b ∈ B there
exists a Σ − X-tree pb such that β̂(pb ) = b. Now let

q = p(b ← pb | b ∈ B) (∈ WΣ (X ∪ {ξ})).

Consider the Σ − X-trees qs and qt which arise from q by substitution of the


trees s and t, respectively, for ξ. Then

β̂(qs ) = pB (ββ̂(s) ) = f (β̂(s)) ∈ B 0

and
β̂(qt ) = pB (ββ̂(t) ) = f (β̂(t)) ∉ B 0 .

Now we assign to every letter x ∈ X in q the value α(x), to get a
unary polynomial operation g of A such that g(a) = q A (αa ) for each
a ∈ A, where αa is defined by αa |X = α and αa (ξ) = a. Then g(α̂(s)) =
q A (αα̂(s) ) = α̂(qs ) and g(α̂(t)) = q A (αα̂(t) ) = α̂(qt ). From β̂(qs ) ∈ B 0 and
β̂(qt ) ∉ B 0 we get qs ∈ T (B) but qt ∉ T (B). On the other hand, α̂(s) = α̂(t)
implies α̂(qs ) = g(α̂(s)) = g(α̂(t)) = α̂(qt ), so that qs ∈ T (A) iff qt ∈ T (A),
and this is a contradiction of the assumption that T (A) = T (B).

3. In a similar way, reversing the roles of A and B, it can be shown that


β̂(s) = β̂(t) implies α̂(s) = α̂(t) for all Σ − X-trees s and t. This means that
ϕ is injective.

4. We show that ϕ is compatible with the operations of A. Since A is con-


nected, for any a1 , . . . , ani there exist terms t1 , . . . , tni ∈ WΣ (X) such that
α̂(t1 ) = a1 , . . . , α̂(tni ) = ani . If fi ∈ Σni for ni ≥ 0, then

ϕ(fiA (a1 , . . . , ani )) = ϕ(fiA (α̂(t1 ), . . . , α̂(tni )))
= ϕ(α̂(fi (t1 , . . . , tni ))) = β̂(fi (t1 , . . . , tni ))
= fiB (β̂(t1 ), . . . , β̂(tni )) = fiB (ϕ(α̂(t1 )), . . . , ϕ(α̂(tni )))
= fiB (ϕ(a1 ), . . . , ϕ(ani )).

5. For each x ∈ X we have ϕ(α(x)) = β̂(x) = β(x), and thus ϕ ◦ α = β.

6. If α̂(t) ∈ A0 and t ∈ WΣ (X), then ϕ(α̂(t)) = β̂(t) ∈ B 0 since t ∈ T (A) =


T (B). Similarly, ϕ(α̂(t)) ∈ B 0 implies α̂(t) ∈ A0 . Hence ϕ−1 (B 0 ) = A0 .

As a corollary we have the following result.

Corollary 8.7.12 If A and B are connected Σ − X-tree recognizers such


that T (A) = T (B), then A/ ∼A is isomorphic to B/ ∼B .

For every Σ − X-language T there is at least the (infinite) Σ − X-recognizer
FT = (FΣ (X), Σ, X, idX , T ), where FΣ (X) is the absolutely free algebra.
Clearly, the evaluation map is the identity, so for each term t ∈ WΣ (X) we
have

t ∈ T (FT ) iff t ∈ T.



The tree-recognizer FT is connected. To show that FT / ∼FT is also con-


nected, we show more generally that the homomorphic image of a connected
recognizer is connected. Let ϕ : A → B be a surjective homomorphism of
Σ − X-recognizers. Let b be an arbitrary state of B. Then there exists an
a ∈ A such that ϕ(a) = b. Since A is connected, there is a tree t such that
α̂(t) = a. Then we have

β̂(t) = ϕ(α̂(t)) = ϕ(a) = b.

Altogether we have proved the following theorem.

Theorem 8.7.13 For every language T there is a minimal (possibly infi-


nite) tree-recognizer, and it is unique up to isomorphism. If A is a con-
nected recognizer of the language T , then the minimal recognizer of T is a
homomorphic image of A (under a surjective homomorphism). The quotient
recognizer A/ ∼A is minimal.

We remark that the previous results can be used to prove that an arbitrary
Σ − X-language T is recognizable iff there exist a finite Σ-algebra A, a
homomorphism ϕ : FΣ (X) → A and a subset A0 ⊆ A such that ϕ−1 (A0 ) =
T . This proposition allows us to give a new definition of recognizability for
subsets of the universes of arbitrary algebras, not just free algebras. Let A
be any algebra. Then a subset T ⊆ A is called recognizable if there exist a
finite algebra B of the same type, a homomorphism ϕ : A → B and a subset
B 0 ⊆ B such that ϕ−1 (B 0 ) = T . This definition gives us the recognizability of
Σ − X-languages when A is the free algebra of its type, and the recognizable
languages of Section 8.1 when A is the free monoid X ∗ generated by X.

8.8 Tree Transducers


In this section we consider sets of terms or trees of two different types, Σ and
Ω. A tree transducer is a system which transforms trees of one type into trees
of the other (just as automata transform strings into strings). Such systems
also give us tree transformations, which are subsets of WΣ (X) × WΩ (X).
The concept of a tree transducer used here is due to J. W. Thatcher ([113]).

More precisely, let Σ := {fi | i ∈ I} be a set of operation symbols of type


τ1 = (ni )i∈I , where fi is ni -ary for ni ∈ IN, and let Ω = {gj | j ∈ J} be a set
of operation symbols of type τ2 = (nj )j∈J where gj is nj -ary. As in Section
8.6 we denote by WΣ (X) and WΩ (X) the sets of all terms of type τ1 and τ2 ,
respectively. Then we define a tree transformation as a binary relation

Tτ1 ,τ2 ⊆ WΣ (X) × WΩ (X).

The most important tree transformations are those which can be given in
an effective way.

Definition 8.8.1 Let τ1 and τ2 be two types. A (τ1 − τ2 )-tree transducer is


a sequence
A = (Σ, X, A, Ω, P, A0 ),
where
Σ = {fi | i ∈ I} is a set of operation symbols of type τ1 ,
Ω = {gj | j ∈ J} is a finite set of operation symbols of type τ2 ,
A = {a1 , . . . , am } is a finite set of unary operation symbols,
A0 ⊆ A, and
P is a finite set of productions or rules of derivation of the forms

(i) x → a(t), for x ∈ X, a ∈ A and t ∈ WΩ (X),

(ii) fi (a1 (ξ1 ), . . . , ani (ξni )) → a t(ξ1 , . . . , ξni ), for fi ∈ Σni , a1 , . . . , ani ∈ A,

a ∈ A and ξ1 , . . . , ξni ∈ χm , where χm = {ξ1 , . . . , ξm } is an auxiliary
alphabet and t(ξ1 , . . . , ξni ) ∈ WΩ (X ∪ χm ).
For two trees s and t, we will say that s directly derives t in A, if t can be
obtained from s by the following steps:
(1) replacement of an occurrence of a variable x ∈ X in s by the right
hand side of a production from (i) or

(2) replacement of an occurrence of a subtree


fi (a1 (q1 ), . . . , ani (qni )) in s, for a1 , . . . , ani ∈ A and q1 , . . . , qni
∈ WΩ (X ∪ χm ), by a t(q1 , . . . , qni ), if fi (a1 (ξ1 ), . . . , ani (ξni )) →
a t(ξ1 , . . . , ξni ) is a production.
If s directly derives t in A, we write s →A t. Forming the reflexive and tran-
sitive closure of this relation →A , we say that s derives t in A if there is a
sequence s →A s1 →A s2 →A · · · →A sn = t of direct derivations of t from s
or if s = t. In this case we write s ⇒∗A t.

Every tree transducer induces a tree transformation, in a natural way.



Definition 8.8.2 If A is a (τ1 − τ2 )-tree transducer, then the tree transfor-


mation induced by A is the set

TA := {(s, t) | s ∈ Wτ1 (X) and s ⇒∗A a0 t for some a0 ∈ A0 }.

This means that tree transformations of the form TA , induced from a tree
transducer, can be described in an effective (algorithmic) way. Now we con-
sider the following example of a tree transducer.

Example 8.8.3 Let A = (Σ, {x}, {a0 , a1 }, Ω, P, {a1 }) be the tree trans-
ducer with Σ = Σ2 = {f }, Ω = Ω2 = {g}, and where P consists of the
productions x → a0 x, f (a0 , a0 ) → a1 g(ξ1 , ξ2 ), f (a0 , a1 ) → a0 g(ξ1 , ξ2 ),
f (a1 , a0 ) → a1 g(ξ1 , ξ2 ) and f (a1 , a1 ) → a1 g(ξ1 , ξ2 ).
Then the term t = f (f (x, x), x) has the following derivation:

f (f (x, x), x) ⇒∗A f (f (a0 x, a0 x), a0 x) ⇒∗A f (a1 g(x, x), a0 x) ⇒∗A
a1 g(g(x, x), x).

Therefore (f (f (x, x), x), g(g(x, x), x)) ∈ TA .
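Since each rule of this transducer is determined by the pair of states at an f-node, the derivation of Example 8.8.3 can be computed bottom-up; the encoding below is an illustrative sketch, not part of the formal definition:

```python
RULES = {                 # f(a_i(.), a_j(.)) -> a_k(g(., .))
    ("a0", "a0"): "a1",
    ("a0", "a1"): "a0",
    ("a1", "a0"): "a1",
    ("a1", "a1"): "a1",
}

def run(t):
    """Return (state, output) with t =>*_A state(output)."""
    if t == "x":
        return "a0", "x"                       # production x -> a0(x)
    _, l, r = t
    (sl, ol), (sr, outr) = run(l), run(r)      # derive both subtrees first
    return RULES[(sl, sr)], ("g", ol, outr)    # relabel the f-node to g

state, out = run(("f", ("f", "x", "x"), "x"))
print(state, out)    # a1 ('g', ('g', 'x', 'x'), 'x')
```

The resulting state a1 lies in the set {a1} of final symbols, so the pair (f(f(x, x), x), g(g(x, x), x)) belongs to TA, matching the example.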


The tree homomorphisms h : WΣ (X) → WΩ (X) defined in Section 8.6 also
induce tree transformations Th := {(t, h(t)) | t ∈ WΣ (X)}. This raises the
question of whether for each tree homomorphism h : WΣ (X) → WΩ (X)
there is a tree transducer A such that Th = TA .

Definition 8.8.4 A tree transducer is called an H-transducer if it has the


form A = (Σ, X, {a}, Ω, P, {a}), with

P = {x → ahX (x) | x ∈ X} ∪
{fi (a(ξ1 ), . . . , a(ξni )) → ahni (fi )(ξ1 , . . . , ξni ) | fi ∈ Σni },

for some tree homomorphism h.

Theorem 8.8.5 Let h : WΣ (X) → WΩ (X) be a tree homomorphism and let


Th be the tree transformation defined by h. Then Th can be induced by an H-
transducer. Conversely, tree transformations induced by H-transducers are
defined by tree homomorphisms.
Proof: We show that any pair (t, h(t)) ∈ Th is also in TA , proceeding by
induction on the complexity of the tree t. If t ∈ X, then t = x derives
ahX (x) = ah(x), and therefore (x, h(x)) ∈ TA . Inductively, let t =
fi (t1 , . . . , tni ) and suppose that (tj , h(tj )) ∈ Th implies tj ⇒∗A ah(tj ) for all
j = 1, . . . , ni . Then t has the following derivation in A:

t = fi (t1 , . . . , tni ) ⇒∗A fi (ah(t1 ), . . . , ah(tni )) ⇒A ahni (fi )(h(t1 ), . . . , h(tni )) = ah(t);

and this means (t, h(t)) ∈ TA . This shows Th ⊆ TA . If conversely t ⇒∗A as,
and t = x is a variable, then the only possible derivation uses the rule
x → ahX (x) = ah(x), so that s = h(x) and (t, s) ∈ Th . Now we assume that
t = fi (t1 , . . . , tni ) and that tj ⇒∗A asj implies (tj , sj ) ∈ Th , so that sj = h(tj )
for j = 1, . . . , ni . A derivation of t has to end with a rule of the second kind,
so it has the form

t = fi (t1 , . . . , tni ) ⇒∗A fi (as1 , . . . , asni ) ⇒A ahni (fi )(s1 , . . . , sni ) =
ahni (fi )(h(t1 ), . . . , h(tni )) = ah(t).

Therefore s = h(t) and (t, s) ∈ Th . The second proposition can be proved
in a similar way.

8.9 Turing Machines


The concept of a Turing machine was introduced by A. Turing to define
the class of “computable” functions. The Turing machine is a way to make
precise the concept of an algorithm, and we will use Turing machines in the
next section to show that some algebraic properties are undecidable. The
definition of a Turing machine is similar to that of a finite automaton, in
that we have an input alphabet, an initial state, final states and a transition
function. The main difference is that a Turing machine also has an infinite
storage capacity. This is provided by a doubly infinite tape, marked off into
squares, and a read-write head which when positioned on a square can read
the letter currently on the square or write a letter in the square.

To define Turing machines more precisely, we use the following notation.


We use X = {0, 1} for the input alphabet, and also refer to 0 and 1 as tape
symbols. The set Γ = {L, R} is the set of motion symbols, indicating whether
the read-write head will move one square to the left or to the right. We have
a doubly infinite tape, on which each square is printed with exactly one of
the tape symbols. Thus the tape can be regarded as a function t : Z → {0, 1},
with t(n) equal to the symbol printed on the n-th square of the tape. The
blank tape corresponds to the constant function t0 with value 0. We also have
an infinite set Z of states, indexed by the non-negative integers:
Z = {µn | n ≥ 0}.

The state µ1 is called the initial state, and we set Z1 = {µ1 }, while the state
µ0 is called the final state.

An instruction of the machine is a quintuple µi rsT γ, where r, s ∈ {0, 1}, T is


one of L or R, µi is a state symbol other than µ0 , and γ is any state symbol
(possibly µ0 ). Such a quintuple is interpreted as follows: if the machine is
in state µi and reads the symbol r on the current square, it performs the
following steps:

1. The machine replaces r by s on the current square.


2. If T = L, the read-write head moves one square to the left;
if T = R, the read-write head moves one square to the right.
3. The state changes to state γ.

It is clear that the work of a Turing machine could be described by a partial


transition function δ : Z × X → Z × Γ × X. (Notice that δ is not defined on
pairs using state µ0 , so it is only a partial function.) An input is accepted
if the state µ0 is reached; otherwise it is rejected. The language accepted by
the Turing machine T is the set

L(T ) := {w ∈ X ∗ | w is accepted by T }.

This version of a Turing machine is modelled on the language recognizers of


the previous sections. We can also view Turing machines as a set of machine
instructions.

Definition 8.9.1 A Turing machine is a finite set T of machine instruc-


tions, for which there is some natural number k such that

(i) For each 1 ≤ i ≤ k and for each r ∈ {0, 1}, T contains exactly one
instruction µi rsT γ.
(ii) No state symbol µj for which j > k occurs in the instructions in T .
A configuration of a Turing machine T is a triple Q = (t, n, γ), where t is a
tape, n is an integer and γ is a state symbol. This encodes the information
that the machine is in state γ, reading the n-th square on tape t. If there is
an instruction γt(n)sT γ 0 , then the machine will move into state γ 0 , and the
read-write head moves either right or left according to T . In addition, the
tape t is converted to a tape t0 , for which t0 (n) = s and t0 (k) = t(k) for all
k 6= n. The result is described by a new configuration, either (t0 , n − 1, γ 0 )
when T = L or (t0 , n + 1, γ 0 ) when T = R. We write Q0 = T (Q) to indicate


the transition from the configuration Q to the resulting one. The initial con-
figuration Q0 = (t0 , 0, µ1 ) starts the machine in its initial state, reading at
the 0 position on a blank tape t0 .

In this way, starting with Q0 and applying the transition to new configura-
tions, we obtain a sequence Q0 , Q1 , Q2 , . . . of configurations. This process
will only stop if we reach a configuration Qm which uses the final state µ0 ,
since there is no instruction for this state. In this case the sequence of con-
figurations stops at some finite stage m; otherwise it will continue infinitely.
We say that the Turing machine T halts iff its sequence of configurations is
finite. In the next section we will use the well-known fact that the problem of
deciding whether a Turing machine will halt, the so-called Halting Problem,
is recursively undecidable. More details may be found in S. C. Kleene [63].
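The configuration dynamics just described can be sketched as a small simulator; the instruction set below is an illustrative assumption (it writes a single 1 and halts), not a machine from the text:

```python
from collections import defaultdict

# (mu_i, r) -> (s, T, next state); mu0 is the final state
INSTR = {
    (1, 0): (1, "R", 2),   # mu1 0 1 R mu2
    (1, 1): (1, "L", 1),
    (2, 0): (0, "L", 0),   # mu2 0 0 L mu0 : halt
    (2, 1): (1, "R", 2),
}

def run(max_steps=100):
    """Iterate configurations Q = (tape, n, state) from Q0 = (t0, 0, mu1)."""
    tape, n, state = defaultdict(int), 0, 1    # blank tape, square 0, state mu1
    for _ in range(max_steps):
        if state == 0:                          # final state: no instruction
            return tape, n
        s, move, state = INSTR[(state, tape[n])]
        tape[n] = s                             # write the new symbol
        n += 1 if move == "R" else -1           # move the read-write head
    return None                                 # did not halt within the bound

result = run()
print(result is not None)    # True: this machine halts
```

For the bound to matter the machine must be run with a step limit, since by the Halting Problem no simulator can decide halting in general.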

8.10 Undecidable Problems


In 1993 R. McKenzie resolved several longstanding and challenging prob-
lems concerning varieties generated by finite algebras. One of these was the
problem known as Tarski’s Finite Basis Problem, which asked whether it
is algorithmically decidable whether any finite algebra is finitely axiomatiz-
able. Since many properties of varieties generated by finite algebras do turn
out to be algorithmically decidable, the following results of McKenzie were
surprising:

1. There is no algorithm to decide whether any given finite algebra is finitely


axiomatizable.

2. There is no algorithm to decide whether any given finite algebra generates


a residually finite variety.

The main idea of McKenzie’s proofs involves the construction of a finite al-
gebra A(T ) which encodes the computation of a Turing machine T . We will
outline here the construction of this algebra, and its use in the proof, but
the full proofs are beyond the scope of this book. Complete details may be
found in [79], [80] and [81].

We begin by describing the algebra A(T ) constructed from a given Turing



machine T . The universe of this algebra is rather complicated, and will be


described in pieces. We let µ0 , . . . , µk be the states of the Turing machine.
We define the following sets:

U = {1, 2, H},
W = {C, D, C̄, D̄},
A = {0} ∪ U ∪ W ,
V = V0 ∪ V1 ∪ · · · ∪ Vk , where Vi = Vi0 ∪ Vi1 and Vir = Vir0 ∪ Vir1 ,
with Virs = {Cir^s , Dir^s , Mi^r , C̄ir^s , D̄ir^s , M̄i^r }, for 0 ≤ i ≤ k and {r, s} ⊆ {0, 1}.
The unbarred symbols are used to encode configurations of the Turing ma-
chine, while the barred versions control the finite subdirectly irreducible
algebras. The universe of A(T ) is defined to be the set A ∪ V . We point
out that the cardinality of this set is 20k + 28, where k is the number of
non-halting states of the Turing machine.

On this base set, we define the following operations. The first is a semilattice
operation ∧, defined by

x ∧ y = x if x = y, and x ∧ y = 0 otherwise.

We need a multiplication · defined by

2 · D = H · C = D, 1 · C = C,
2 · D̄ = H · C̄ = D̄, 1 · C̄ = C̄,
x · y = 0 for all other pairs (x, y).
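Both operations are small enough to verify exhaustively. The sketch below checks the semilattice laws for ∧ and tabulates the non-zero products, using "Cb" and "Db" as stand-ins for the barred symbols; the string encoding is an illustrative choice:

```python
BASE = ["0", "1", "2", "H", "C", "D", "Cb", "Db"]   # "b" marks a barred symbol

def meet(x, y):                      # x ∧ y = x if x = y, else 0
    return x if x == y else "0"

assert all(meet(x, x) == x for x in BASE)                         # idempotent
assert all(meet(x, y) == meet(y, x) for x in BASE for y in BASE)  # commutative
assert all(meet(meet(x, y), z) == meet(x, meet(y, z))
           for x in BASE for y in BASE for z in BASE)             # associative

PROD = {("2", "D"): "D", ("H", "C"): "D", ("1", "C"): "C",
        ("2", "Db"): "Db", ("H", "Cb"): "Db", ("1", "Cb"): "Cb"}

def mult(x, y):                      # x · y = 0 for all other pairs
    return PROD.get((x, y), "0")

print(mult("2", "D"), mult("H", "C"), mult("C", "2"))   # D D 0
```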

We also need the following operations:


J(x, y, z) =
 x, if x = y ≠ 0,
 x ∧ z, if x = y ∈ V ∪ W,
 0, otherwise.

J 0 (x, y, z) =
 x ∧ z, if x = y ≠ 0,
 x, if x = y ∈ V ∪ W,
 0, otherwise.

S0 (u, v, x, y, z) =
 (x ∧ y) ∨ (x ∧ z), if u ∈ V0 ,
 0, otherwise.

S1 (u, v, x, y, z) =
 (x ∧ y) ∨ (x ∧ z), if u ∈ {1, 2},
 0, otherwise.

S2 (u, v, x, y, z) =
 (x ∧ y) ∨ (x ∧ z), if u = v ∈ V ∪ W,
 0, otherwise.


T (x, y, z, u) =
 x · y, if x · y = z · u ≠ 0 and x = z and y = u,
 x · y, if x · y = z · u ≠ 0 and x ≠ z or y ≠ u,
 0, otherwise.
 0, otherwise.

The following unary operation I serves to set up the initial configuration:

I(x) =
 C10^0 , if x = 1,
 M1^0 , if x = H,
 D10^0 , if x = 2,
 0, otherwise.

For each instruction µi rsLµj of the Turing machine T and each t ∈ {0, 1},
we have an operation Lirt which will describe the operation of the machine
when it is reading symbol r in state µi , and the square just to the left of the
current one has symbol t on it.
Lirt (x, y, u) =
 Cjt^s′ , if x = y = 1 and u = Cir^s′ for some s′ ,
 Mj^t , if x = H, y = 1 and u = Cir^t ,
 Djt^s , if x = 2, y = H and u = Mi^r ,
 Djt^s′ , if x = y = 2 and u = Dir^s′ for some s′ ,
 v̄ , if u is a barred symbol w̄ and Lirt (x, y, w) = v ∈ V according to the previous lines,
 0, otherwise.

For each instruction of the form µi rsRµj in T and for each t ∈ {0, 1}, we
have an operation Rirt , given by
Rirt (x, y, u) =
 Cjt^s′ , if x = y = 1 and u = Cir^s′ for some s′ ,
 Cjt^s , if x = H, y = 1 and u = Mi^t ,
 Mj^t , if x = 2, y = H and u = Dit^r ,
 Djt^s′ , if x = y = 2 and u = Dir^s′ for some s′ ,
 v̄ , if u is a barred symbol w̄ and Rirt (x, y, w) = v ∈ V according to the previous lines,
 0, otherwise.

Let L be the collection of all these operations Lirt , and dually for R. We will
assume that F1 , . . . , Fc is a complete list of all these operations. We define
a binary relation ≽ on U by

≽ := {(2, 2), (2, H), (H, 1), (1, 1)}.

Now we use this to define operations Ui1 and Ui2 , for each 1 ≤ i ≤ c, by

 Fi (x, y, u),
 if x  z, y 6= z, Fi (x, y, u) 6= 0,
Ui1 (x, y, z, u) = Fi (x, y, u) if x  z, y = z, Fi (x, y, u) 6= 0,

 0, otherwise.


 Fi (y, z, u),
 if x  z, x 6= y, Fi (y, z, u) 6= 0,
Ui2 (x, y, z, u) = Fi (y, z, u) if x  z, x = y, Fi (y, z, u) 6= 0,

 0, otherwise.

The configuration algebra of a Turing machine consists of all its configu-


rations, together with a partial function which maps any configuration Q
whose state is not the final or halting state to the configuration Q0 = T (Q).
Consider the direct product B = A(T )X for a non-empty set X. The oper-
ations in A(T ) allow us to encode in B certain subsets of the configuration
algebra, along with the production function restricted to a subset. If X is
finite, any connected subset of the configuration algebra can be encoded in
B. For the details, see [80].

Now the two possibilities for T , that it halts or not, are considered. McKen-
zie showed that if T does not halt, the variety generated by A(T ) contains a
denumerably infinite subdirectly irreducible algebra, and the residual bound
of A(T ) satisfies κ(A(T )) ≥ ω1 . However, if T halts, then A(T ) is residu-
ally finite, with a finite cardinal m such that κ(A(T )) ≤ m. Since it is not
decidable whether a Turing machine halts or not, there can be no algorithm


to decide whether any given finite algebra is residually finite or not.

Now we can apply Willard’s theorem, Theorem 6.7.5. The variety V (A(T ))
can be shown to be congruence meet-semidistributive, and if T halts, the
variety is also residually finite. Therefore, if T halts, the algebra A(T ) is
finitely axiomatizable. But it is algorithmically undecidable if T halts; so it
is also undecidable whether A(T ) is finitely axiomatizable or not.

8.11 Exercises
8.11.1. Draw a directed graph illustrating the action of the automaton from
Example 8.2.5.

8.11.2. Prove that the composition of two homomorphisms f = (fI , fS , fO )


and g = (gI , gS , gO ), as defined in Definition 8.3.3, is again a homomorphism.

8.11.3. a) Prove that the kernel of a homomorphism f : H1 → H2 between


automata is a congruence relation.
b) Let R be a congruence on an automaton H. Prove that the natural map-
ping f : H → H/R is a homomorphism, whose kernel is the congruence R.

8.11.4. Prove that for any weakly initial deterministic finite automaton H,
the reduced automaton H0 constructed in the proof of Theorem 8.3.10 is
unique up to isomorphism.

8.11.5. Prove that the set of all congruences of a Σ − X-tree-recognizer C(A)


forms a principal ideal of the complete lattice ConA, and therefore that C(A)
is a complete lattice itself.

8.11.6. Prove the Homomorphic Image Theorem for Σ − X-tree-recognizers.

8.11.7. Prove that every language generated by an extended regular tree


grammar can be generated by an ordinary tree grammar.

8.11.8. Let A = (A, Σ, X, α, A0 ) and B = (B, Σ, X, β, B 0 ) be two Σ − X-tree-recognizers, with T (A) = S and T (B) = T . Let C2 = (A × B, Σ, X, α × β, A0 × B ∪ A × B 0 ) and C3 = (A × B, Σ, X, α × β, A0 × (B \ B 0 )), where α × β : Wτ (X) → Wτ (X) × Wτ (X) is defined by (α × β)(t) := (α(t), β(t)), for all t ∈ Wτ (X). Prove that T (C2 ) = S ∪ T and T (C3 ) = S \ T .

8.11.9. Let Xn be a fixed finite alphabet. Let R be the relation between the
set of all languages on the alphabet Xn and the set of all finite automata on
Xn , defined by (L, H) ∈ R iff H recognizes L. Describe the Galois-connection
induced by this relation R.
Chapter 9

Mal’cev-Type Conditions

As we saw in Chapter 1, any algebra A has associated with it a lattice, the


lattice Con(A) of all congruence relations on A. We can often use properties
of the algebra A itself to deduce properties of the associated congruence
lattice, and we can also sometimes use properties of Con(A) to tell us about
the algebra A as well. Thus we want to relate properties of lattices, such as
permutability, distributivity, modularity, and so on, to properties of algebras
and varieties. The first result in this direction was given by A. I. Mal’cev in
1954 ([74], [75]): he showed that all the congruence relations of any algebra
in a variety are permutable with respect to the relational product iff the
variety satisfies a certain identity (equality of terms). The special term used
in this identity is called a Mal’cev term, and theorems like this one which
relate properties of the congruence lattices of all the algebras in a variety to
identities of the variety are usually called Mal’cev-type conditions. In this
chapter we investigate a number of properties of congruence lattices, and the
corresponding Mal’cev-type conditions. We also investigate these properties
in detail for a particular example, the case of varieties generated by algebras
of size two; our analysis here will be used in Chapter 10 to describe the
lattice of all clones on a two-element set.

9.1 Congruence Permutability


The first Mal’cev-type condition is the original one given by Mal’cev in 1954
([74]), characterizing congruence permutable varieties of algebras. We recall
that two congruence relations θ and ψ on an algebra A are called permutable,
or are said to permute, if θ ◦ ψ = ψ ◦ θ. An algebra A is called congruence


permutable if any two congruences of A are permutable. A class K of algebras of type τ is called congruence permutable if each algebra from K is congruence permutable.

It is easy to show that for any two equivalence relations θ and ψ defined on
a set A, the union θ ∪ ψ is again an equivalence relation defined on A iff
θ and ψ are permutable. In this case θ ◦ ψ is the least equivalence relation
containing θ and ψ, and we have θ∪ψ = θ◦ψ. When θ and ψ are congruences
on an algebra A, this makes the join θ ∨ ψ in the congruence lattice equal to
θ ◦ ψ. (We have used this argument already in the proof of Theorem 4.1.4.)

Mal’cev gave the following characterization of congruence permutable varieties.

Theorem 9.1.1 A variety V of algebras of type τ is congruence permutable


iff there is a ternary term p ∈ Wτ (X)/Id V , called a Mal’cev term for V ,
such that
p(x, x, y) ≈ y, p(x, y, y) ≈ x ∈ Id V.

Proof: Assume that V is congruence permutable. To produce the term p,


and verify the identities claimed, we will work with the V -free algebra on
three generators. That is, we use X3 , or equivalently the three element gen-
erator set Y = {x, y, z}. From the absolutely free algebra Fτ (Y ) we form
the quotient FV (Y ) = Fτ (Y )/Id V , which we know is generated by the
three equivalence classes x := [x]Id V , y := [y]Id V and z := [z]Id V and
is in the variety V . Now recall that the congruence relation θ(a, b) gen-
erated by a pair (a, b) of elements of an algebra is the intersection of all
congruences containing this pair. We consider the congruences θ(x, y) and
θ(y, z) on the V -free algebra. We have (x, z) ∈ θ(y, z) ◦ θ(x, y). But con-
gruence permutability of V implies (x, z) ∈ θ(x, y) ◦ θ(y, z), and therefore
there is a term p(x, y, z) ∈ Wτ ({x, y, z})/Id V with (x, p(x, y, z)) ∈ θ(y, z)
and (p(x, y, z), z) ∈ θ(x, y). This gives us our term p, whose properties we
now need to verify. For this we consider a function from Y = {x, y, z} to
FV ({x, y}) which maps x to x and both of y and z to y. Since FV ({x, y}) is
an algebra in V , and FV (Y ) has the freeness property of the relatively free
algebra, this function extends to a unique homomorphism ϕ from FV (Y ) to
FV ({x, y}). Since y and z have the same image, (y, z) ∈ ker ϕ and we must
have θ(y, z) ⊆ ker ϕ. Since (x, p(x, y, z)) ∈ θ(y, z) ⊆ ker ϕ, we also have ϕ(x) = ϕ(p(x, y, z)). Then, using the fact that ϕ is a homomorphism, we get

x = ϕ(x) = ϕ(p(x, y, z)) = p(ϕ(x), ϕ(y), ϕ(z)) = p(x, y, y).

This means that in FV ({x, y}) we have x = p(x, y, y), and hence that
p(x, y, y) ≈ x ∈ Id V . A similar argument shows that p(x, x, y) ≈ y is also
in Id V .

For the converse, assume now that there is a term p such that p(x, x, y) ≈
y, p(x, y, y) ≈ x ∈ Id V . To show that V is congruence permutable, we
let A be any algebra in V , let θ and ψ be any congruences on A, and let
(a, b) ∈ ψ ◦ θ. By definition there is an element c ∈ A such that (a, c) ∈ θ
and (c, b) ∈ ψ. Using (a, a), (b, c), (b, b) ∈ ψ and (a, a), (c, a), (b, b) ∈ θ,
and the compatibility property of terms from Theorem 5.2.4, we have

(p(a, b, b), p(a, c, b)) ∈ ψ and (p(a, c, b), p(a, a, b)) ∈ θ.

Thus (a, b) = (p(a, b, b), p(a, a, b)) ∈ θ ◦ ψ. This shows that ψ ◦ θ ⊆ θ ◦ ψ,


and in the same way we show ψ ◦ θ ⊇ θ ◦ ψ, to get the desired equality.

Example 9.1.2 Consider the class of all groups, viewed as algebras of type
(2,1,0). This is a variety, defined equationally by the identities

(xy)z ≈ x(yz), ex ≈ xe ≈ x, xx−1 ≈ x−1 x ≈ e.

(As usual, we omit the binary multiplication symbol.) This variety has a
Mal’cev term p, given by p(x, y, z) := xy −1 z. Clearly for this term we have
p(x, x, y) = xx−1 y ≈ ey ≈ y and p(x, y, y) = xy −1 y ≈ xe ≈ x, in any group.
Theorem 9.1.1 then tells us that the variety of all groups is congruence
permutable. Similarly, the class of all rings forms a variety with a Mal’cev
term given by p(x, y, z) = x − y + z, since in any ring p(x, x, y) = x − x + y
≈ y and p(x, y, y) = x − y + y ≈ x. So the variety of rings is also congruence
permutable.
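Because the Mal'cev identities are finitely many equations between specific elements, they can be machine-checked directly in any finite algebra. The following sketch (our illustration, not part of the text) verifies the two identities for the term p(x, y, z) = x − y + z in the rings Zn of integers modulo n:

```python
# Check the Mal'cev identities p(x, x, y) ≈ y and p(x, y, y) ≈ x
# for p(x, y, z) = x - y + z in the ring Z_n (an illustrative check,
# not from the text).

def p(x, y, z, n):
    return (x - y + z) % n

for n in range(2, 8):
    for x in range(n):
        for y in range(n):
            assert p(x, x, y, n) == y  # p(x, x, y) ≈ y
            assert p(x, y, y, n) == x  # p(x, y, y) ≈ x
```

Of course such a search only confirms the identities in finitely many rings; the two-line calculation in the example proves them in every ring.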

9.2 Congruence Distributivity


Another important property of lattices, and in particular of the congruence
lattice of an algebra, is distributivity.

Definition 9.2.1 An algebra A is called congruence distributive if its congruence lattice Con A is distributive; that is, if for all congruences θ, ψ and Φ ∈ Con A, the following equations are satisfied:
θ ∧ (ψ ∨ Φ) = (θ ∧ ψ) ∨ (θ ∧ Φ),
θ ∨ (ψ ∧ Φ) = (θ ∨ ψ) ∧ (θ ∨ Φ).
(Note that these two equations are equivalent to each other, so that it is
enough to check one of them.) A class K ⊆ Alg(τ ) of algebras of type τ is
called congruence distributive if each algebra of K is congruence distributive.
In order to show congruence distributivity of a class, we will need to consider
joins of congruences. We recall from Chapter 1 that the join ψ ∨ Φ of two
congruences in a congruence lattice is the congruence relation generated by
their union ψ ∪Φ. We will need the following characterization of the elements
in such a join.

Lemma 9.2.2 Let ψ and Φ be two congruences on an algebra A, and let


a and b be elements of A. Then (a, b) ∈ Φ ∨ ψ iff there exist finitely many
elements a1 , . . . , an in A such that (a, a1 ) ∈ ψ, (a1 , a2 ) ∈ Φ, . . ., (an−1 , an ) ∈
ψ and (an , b) ∈ Φ.
Proof: Let (a, b) be in Φ ∨ ψ. From Remark 1.4.9 and the fact that the
union of two congruences is reflexive and symmetric, we see that the join is
the transitive closure of the relation Φ ∪ ψ. This means that we can produce
(a, b) in a finite number of instances of transitivity using pairs in the union.
Thus we have elements with the desired property.

Conversely, suppose there are such elements a1 , . . . , an . Then from (a, a1 ) ∈


ψ and (a1 , a2 ) ∈ Φ we get (a, a2 ) ∈ Φ ∨ ψ since Φ ∨ ψ is transitive. Similarly
we get (a2 , a4 ) ∈ Φ ∨ ψ; and now combining these two facts we have (a, a4 )
∈ Φ ∨ ψ. Continuing in this way we finally get (a, b) ∈ Φ ∨ ψ.
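On a finite set, Lemma 9.2.2 gives a direct algorithm for the join: form the union of the two relations and close it under transitivity. A small sketch of this idea (our illustration; the function name is ours):

```python
def join(rel1, rel2):
    """Join of two equivalence relations, given as sets of pairs:
    the transitive closure of their union."""
    rel = set(rel1) | set(rel2)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(rel):
            for (c, d) in list(rel):
                if b == c and (a, d) not in rel:
                    rel.add((a, d))
                    changed = True
    return rel

# Two equivalence relations on {0, 1, 2}: phi glues 0 and 1, psi glues 1 and 2.
diag = {(x, x) for x in range(3)}
phi = diag | {(0, 1), (1, 0)}
psi = diag | {(1, 2), (2, 1)}
# One instance of transitivity produces the pair (0, 2), as in the lemma.
assert (0, 2) in join(phi, psi)
```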

The following theorem gives a Mal’cev-type condition for congruence distributivity. The ternary term m used here is usually called a (two-thirds) majority term.

Theorem 9.2.3 Let V be a variety with a ternary term m such that


m(x, x, y) ≈ m(x, y, x) ≈ m(y, x, x) ≈ x ∈ Id V.
Then V is congruence distributive.

Proof: Consider any algebra A ∈ V , and any three congruences θ, Φ and


ψ on A, and assume that (a, b) ∈ θ ∧ (Φ ∨ ψ). Then (a, b) is in both θ and
Φ ∨ ψ. By Lemma 9.2.2, there exist some elements a1 , . . . , an in A such that
(a, a1 ) ∈ ψ, (a1 , a2 ) ∈ Φ, . . ., (an−1 , an ) ∈ ψ and (an , b) ∈ Φ. We will use
these elements, along with the converse direction of Lemma 9.2.2, to prove
that (a, b) is in (θ ∧ Φ) ∨ (θ ∧ ψ).
Taking a0 = a and an+1 = b, we have (aj , aj+1 ) in either ψ or Φ, for each 0 ≤ j ≤ n. Applying the majority term m, and the fact that (a, a) and
(b, b) are in any congruence, gives us (m(a, aj , b), m(a, aj+1 , b)) also in the
same congruence ψ or Φ. But we also have, for any aj , that m(a, aj , b) is
θ-related to m(a, aj , a) since (b, a) is in θ; m(a, aj , a) in turn is equal to a by
the properties of the majority term m, and for the same reason also equal
to m(a, aj+1 , a), which is then related by θ to m(a, aj+1 , b). Thus we have
each pair (m(a, aj , b), m(a, aj+1 , b)) in both θ and one of ψ or Φ. There-
fore, (m(a, a0 , b), m(a, an+1 , b)) ∈ (θ ∧ Φ) ∨ (θ ∧ ψ). But this pair is just
(m(a, a, b), m(a, b, b)) = (a, b). Thus we have θ ∧ (Φ ∨ ψ) ⊆ (θ ∧ Φ) ∨ (θ ∧ ψ).
The opposite inclusion is valid in all lattices, giving the equality needed for
distributivity.

Example 9.2.4 We will use Theorem 9.2.3 to show that the variety of all
lattices is congruence distributive. Let m be the ternary lattice term

m(x, y, z) = (x ∨ y) ∧ (x ∨ z) ∧ (y ∨ z).

Then in any lattice, using the idempotency and absorption laws, we have
m(x, x, y) = (x ∨ x) ∧ (x ∨ y) ∧ (x ∨ y) ≈ x ∧ (x ∨ y) ≈ x,
m(x, y, x) = (x ∨ y) ∧ (x ∨ x) ∧ (y ∨ x) ≈ (x ∨ y) ∧ x ≈ x, and
m(y, x, x) = (y ∨ x) ∧ (y ∨ x) ∧ (x ∨ x) ≈ (y ∨ x) ∧ x ≈ x.

Thus m is a two-thirds majority term, and the variety of all lattices is con-
gruence distributive.
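For a concrete finite check of these identities (ours, not the book's), one can evaluate m in the lattice of all subsets of {0, 1, 2}, where join is union and meet is intersection:

```python
from itertools import combinations

def m(x, y, z):
    # (x ∨ y) ∧ (x ∨ z) ∧ (y ∨ z) in the lattice of sets
    return (x | y) & (x | z) & (y | z)

# All 8 subsets of {0, 1, 2}, as frozensets.
universe = [frozenset(s) for r in range(4) for s in combinations(range(3), r)]
for x in universe:
    for y in universe:
        assert m(x, x, y) == x
        assert m(x, y, x) == x
        assert m(y, x, x) == x
```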

Theorem 9.2.3 gives only a sufficient condition for congruence distributivity.


A condition that is both necessary and sufficient was given by B. Jonsson in
1967 ([61]), in what is known as Jonsson’s Condition.

Theorem 9.2.5 (Jonsson’s Condition) A variety V is congruence distributive iff there exist a natural number n ≥ 1 and ternary terms t0 , . . . , tn ∈ Wτ (X)/Id V such that the following identities are satisfied in V :

t0 (x, y, z) ≈ x,
tj (x, y, x) ≈ x for all 0 ≤ j ≤ n,
tj (x, x, y) ≈ tj+1 (x, x, y) for all 0 ≤ j < n, j even,      (∆n )
tj (x, y, y) ≈ tj+1 (x, y, y) for all 0 ≤ j < n, j odd,
tn (x, y, z) ≈ z.

We note that the previous characterization of congruence distributivity, in Theorem 9.2.3, is actually a special case of this theorem: when V has a majority term m, we may take n = 2, and terms t0 = x1 , t1 = m and t2 = x3 , to obtain the identities (∆2 ) of Jonsson’s Condition.

When the conditions (∆n ) from this theorem are satisfied for a variety V ,
the variety is said to be n-distributive. It is possible to show that an n1 -
distributive variety is also n2 -distributive for any n2 ≥ n1 . But the converse
direction is false, since n2 -distributivity does not imply n1 -distributivity (see
K. Fichtner, [44]).

Proof: If V is congruence-distributive, then in particular the relatively free


algebra FV ({x, y, z}) is congruence-distributive. We know from Chapter 6
that this algebra is generated by the equivalence classes [x]IdV , [y]IdV and
[z]IdV . For notational convenience we shall write these generators simply as
x, y and z respectively. To produce the necessary terms, we consider the
congruences θ(x, y), θ(x, z) and θ(y, z) of FV ({x, y, z}) generated by the
pairs (x, y), (x, z) and (y, z), respectively. These congruences must satisfy
the distributive property, so we have

(θ(x, y) ∧ θ(x, z)) ∨ (θ(y, z) ∧ θ(x, z)) = (θ(x, y) ∨ θ(y, z)) ∧ θ(x, z). (D)

Since (x, y) and (y, z) are both in θ(x, y) ∨ θ(y, z), and this relation is a
congruence, we have (x, z) ∈ θ(x, y) ∨ θ(y, z). Hence the pair (x, z) is in
the relation on the right hand side of the equation (D) above, and because
of the equality must be in the relation on the left hand side too. By the
join criterion from Lemma 9.2.2, there are a natural number n and finitely
many elements d0 , . . . , dn of FV ({x, y, z}) such that d0 = x, dn = z, and
(d0 , d1 ) ∈ θ(y, z), (d1 , d2 ) ∈ θ(x, y), and so on, with the last pair (dn−1 , dn )
in either θ(x, y) or θ(y, z), depending on whether the number n is even or
odd.

The elements d0 , . . . , dn belong to Wτ ({x, y, z})/Id V , which means that


they are (equivalence classes of) terms in V : for each 0 ≤ j ≤ n we have
dj := tj (x, y, z) for some term tj . When j is even, we have:
(tj (x, y, z), tj+1 (x, y, z)) ∈ θ(y, z)
⇒ (tj (x, x, z), tj+1 (x, x, z)) ∈ θ(y, z).
If we restrict θ(y, z) to the subalgebra of FV ({x, y, z}) generated by x and z,
we obtain tj (x, x, z) ≈ tj+1 (x, x, z). The other identities needed are obtained
similarly.

Assume now that all the identities (∆n ) are satisfied for some number n
and some terms t0 , . . ., tn . Let A ∈ V and let θ, ψ and Φ be congruences
on A. Since we always have (θ ∧ Φ) ∨ (θ ∧ ψ) contained in θ ∧ (Φ ∨ ψ),
it will suffice to prove the converse inclusion. We let (x, y) ∈ θ ∧ (Φ ∨ ψ).
Then (x, y) ∈ θ, and from Lemma 9.2.2 there are a number m and elements
d0 , . . . , dm ∈ A with x = d0 , (d0 , d1 ) ∈ ψ, (d1 , d2 ) ∈ Φ, . . ., (dm−1 , dm ) ∈ Φ
and dm = y. For each j = 1, . . . , n − 1, applying the term tj to these
pairs and the pairs (x, x) and (y, y) gives us (tj (x, x, y), tj (x, d1 , y)) ∈ ψ,
(tj (x, d1 , y), tj (x, d2 , y)) ∈ Φ, . . ., (tj (x, dm−1 , y), tj (x, y, y)) ∈ Φ. We
also have each pair (tj (x, dk , y), tj (x, dk+1 , y)) ∈ θ, by the same argument as in the proof of Theorem 9.2.3. Combining these facts we have
(tj (x, x, y), tj (x, d1 , y)) ∈ θ ∧ ψ, (tj (x, d1 , y), tj (x, d2 , y)) ∈ θ ∧ Φ, . . .,
(tj (x, dm−1 , y), tj (x, y, y)) ∈ θ ∧ Φ. Hence by Lemma 9.2.2 we have
(tj (x, x, y), tj (x, y, y)) ∈ (θ ∧ Φ) ∨ (θ ∧ ψ). Altogether we have

x = t0 (x, x, y) = t1 (x, x, y),


(t1 (x, x, y), t1 (x, y, y)) ∈ (θ ∧ Φ) ∨ (θ ∧ ψ),
(t2 (x, y, y), t2 (x, x, y)) ∈ (θ ∧ Φ) ∨ (θ ∧ ψ),
...
tn−1 (x, d, y) = tn (x, d, y) = y,

with d = x or d = y. Thus, (x, y) ∈ (θ ∧ Φ) ∨ (θ ∧ ψ), and we have congruence


distributivity.

The next famous theorem of Baker and Pixley ([6]) shows that the presence
of a majority term operation has some far-reaching consequences.

Theorem 9.2.6 (Baker-Pixley Theorem) Let A be a finite algebra with a majority function m among its term operations. Then for any n ∈ IN, an n-ary operation f : An → A on A is a term operation of A iff f preserves each subalgebra of A2 .

Proof: It is straightforward to show that any term operation on A preserves


each subalgebra of A2 , and we leave this as an exercise for the reader. For the
converse, we assume that f : An → A is an operation on A which preserves
every subalgebra of A2 . We shall consider sets D ⊆ An such that there exists
a term operation p of A which agrees with f on D; so that

(x1 , . . . , xn ) ∈ D ⇒ f (x1 , . . . , xn ) = p(x1 , . . . , xn ).

We will give an inductive proof that for all k ≥ 2, any set D ⊆ An of size k
has this property. Since A is finite, this will suffice.
When k = 2 we can write D = {(x1 , . . . , xn ), (y1 , . . . , yn )}.
Since by assumption f preserves every subalgebra of A2 , the pair
(f (x1 , . . . , xn ), f (y1 , . . . , yn )) belongs to the subalgebra of A2 which is gen-
erated by {(x1 , y1 ), . . . , (xn , yn )}. Then by Theorem 1.3.8, there is a term
operation p of A for which f (x1 , . . . , xn ) = p(x1 , . . . , xn ) and f (y1 , . . . , yn ) =
p(y1 , . . . , yn ). Let J be the collection of all sets of n-tuples such that
f (x1 , . . . , xn ) = p(x1 , . . . , xn ). Thus D belongs to J.

Assume now that D has size k+1 ≥ 3 and that (x1 , . . . , xn ), (y1 , . . . , yn ), (z1 ,
. . . , zn ) are pairwise different elements of D. By induction f agrees on
D1 := D \ {(x1 , . . . , xn )} with some term operation p1 of A, f agrees on
D2 := D \ {(y1 , . . . , yn )} with some term operation p2 of A, and f agrees on
D3 := D \ {(z1 , . . . , zn )} with some term operation p3 of A.

Now we claim that f agrees with the term m(p1 , p2 , p3 ) on D. For any (u1 , . . . , un ) ∈ D, there are numbers i ≠ j ∈ {1, 2, 3} such
that (u1 , . . . , un ) ∈ Di and (u1 , . . . , un ) ∈ Dj . Then pi (u1 , . . . , un ) =
f (u1 , . . . , un ) and pj (u1 , . . . , un ) = f (u1 , . . . , un ), and we have
m(p1 (u1 , . . . , un ), p2 (u1 , . . . , un ), p3 (u1 , . . . , un )) = f (u1 , . . . , un ), as re-
quired.

Remark 9.2.7 Theorem 9.2.6 can be generalized in the following way. If


the finite algebra A has a (d + 1)-ary term operation u(x1 , . . . , xd+1 ) with
u(x, . . . , x, y, x, . . . , x) ≈ x for any position of y, then f : An → A is a term
operation of A iff f preserves any subalgebra of Ad .

The following theorem shows how congruence distributivity influences the


structural properties of a variety. For a congruence distributive variety
V (A) there is a simpler structural characterization than the usual V (A)
= HSP(A), given by the famous Jonsson’s Lemma ([61]). In the case that
A is a finite algebra the Lemma takes the following form:

Theorem 9.2.8 If A is a finite non-trivial algebra such that the variety


V (A) is congruence distributive, then every algebra from V (A) is isomor-
phic to a subdirect product of homomorphic images of subalgebras of A. That
is, V (A) = IPs HS(A).

If A is simple, and either has no proper subalgebras or has at most one-element subalgebras, then V (A) = IPs (A). This means that A is (up to isomorphism) the only subdirectly irreducible algebra in V (A).

9.3 Arithmetical Varieties


An algebra which is both congruence permutable and congruence distributive is called arithmetical.
Since we have Mal’cev-type conditions for both permutability and distribu-
tivity of congruence lattices, we should expect to find such conditions for
arithmeticity of congruence lattices as well.

Definition 9.3.1 A variety K of algebras of type τ is called arithmetical


if each algebra from K is both congruence permutable and congruence dis-
tributive.

Theorem 9.3.2 For a variety V the following are equivalent:


(i) V is arithmetical;

(ii) there are terms p and m with p(x, x, y) ≈ y ≈ p(y, x, x) ∈ Id V and


m(x, x, y) ≈ m(x, y, x) ≈ m(y, x, x) ≈ x ∈ Id V ;

(iii) there is a ternary term q such that q(x, y, y) ≈ q(x, y, x) ≈ q(y, y, x) ≈


x ∈ Id V .

Proof: (i) ⇒ (ii): Let V be arithmetical. Then V is congruence permutable, and by Theorem 9.1.1 there is a Mal’cev term p which fulfills the desired identities for p. Moreover, V is congruence distributive; so in FV ({x, y, z}) we have
θ(x, z) ∧ (θ(y, z) ∨ θ(x, y)) = (θ(x, z) ∧ θ(y, z)) ∨ (θ(x, z) ∧ θ(x, y)) .
Since (x, z) is in the relation on the left side of this equation, it is also in
the relation on the right side. By congruence permutability this relation is
equal to (θ(x, z) ∧ θ(y, z)) ◦ (θ(x, z) ∧ θ(x, y)). But then there is a term m
with (x, m(x, y, z)) ∈ θ(x, z) ∧ θ(x, y) and (m(x, y, z), z) ∈ θ(x, z) ∧ θ(y, z),
and this term m satisfies the identities needed to make it a majority term.
(ii) ⇒ (i) is clear.
(ii) ⇒ (iii): Given terms p and m as in (ii), we take q(x, y, z) to be the term
p(x, m(x, y, z), z). Then q has the desired properties, since we have

q(x, y, y) ≈ p(x, m(x, y, y), y) ≈ p(x, y, y) ≈ x,


q(x, y, x) ≈ p(x, m(x, y, x), x) ≈ p(x, x, x) ≈ x, and
q(y, y, x) ≈ p(y, m(y, y, x), x) ≈ p(y, y, x) ≈ x.

(iii) ⇒ (ii): Given a term q as in (iii), we define terms p(x, y, z) := q(x, y, z)


and m(x, y, z) := q(x, q(x, y, z), z). Then p is a Mal’cev term, making V con-
gruence permutable by Theorem 9.1.1; and m is a majority term, making V
congruence distributive by Theorem 9.2.3, since we have

m(x, x, y) ≈ q(x, q(x, x, y), y) ≈ q(x, y, y) ≈ x,


m(x, y, x) ≈ q(x, q(x, y, x), x) ≈ q(x, x, x) ≈ x, and
m(y, x, x) ≈ q(y, q(y, x, x), x) ≈ q(y, y, x) ≈ x.

Example 9.3.3 In this example we will show that the variety of all
Boolean algebras is arithmetical. It can be shown that the variety of
all Boolean algebras is generated by the two-element Boolean algebra
2B = ({0, 1}; ∧, ∨, ¬, ⇒, 0, 1), where the operation symbols here denote the
usual Boolean operations on the set {0, 1}. This means that every iden-
tity which is satisfied in the two-element Boolean algebra 2B is satisfied in
any Boolean algebra. To prove arithmeticity by Theorem 9.3.2 (iii), we look
for a ternary term q(x, y, z) on the set {0, 1} which satisfies the identities
q(x, y, y) ≈ q(x, y, x) ≈ q(y, y, x) ≈ x in 2B . Considering the truth table of
such an operation, as shown below, we see that there is exactly one such
ternary operation q. This operation q is indeed a term operation of 2B , since
it can be expressed as q(x, y, z) = (¬x)(¬y)z ∨ x(¬y)(¬z) ∨ x(¬y)z ∨ xyz.
Thus the variety is arithmetical.

x 0 0 0 0 1 1 1 1
y 0 0 1 1 0 0 1 1
z 0 1 0 1 0 1 0 1
q(x, y, z) 0 1 0 0 1 1 0 1
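The claim that exactly one ternary operation fits this table can be confirmed by brute force over all 2^8 = 256 ternary Boolean operations (an illustrative check of Example 9.3.3, not part of the text):

```python
from itertools import product

# Search all ternary Boolean operations for those satisfying
# q(x, y, y) ≈ q(x, y, x) ≈ q(y, y, x) ≈ x.
triples = list(product((0, 1), repeat=3))

def satisfies(table):
    q = dict(zip(triples, table))
    return all(q[(x, y, y)] == x and q[(x, y, x)] == x and q[(y, y, x)] == x
               for x in (0, 1) for y in (0, 1))

solutions = [t for t in product((0, 1), repeat=8) if satisfies(t)]
assert len(solutions) == 1
# The unique solution is exactly the column of the truth table above.
assert solutions[0] == (0, 1, 0, 0, 1, 1, 0, 1)
```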

9.4 n-Modularity and n-Permutability


In this section we examine the property of modularity for congruence lattices,
as well as variations of modularity and permutability.

Definition 9.4.1 An algebra A is called congruence modular if Con A is a modular lattice, so that any three congruences θ1 , θ2 and θ3 on A satisfy the modular law:

θ1 ⊇ θ3 ⇒ θ1 ∧ (θ2 ∨ θ3 ) ⊆ (θ1 ∧ θ2 ) ∨ θ3 .

There are several Mal’cev-type conditions for congruence modularity, which


we will state here without proof. The first result, due to A. Day, uses quater-
nary terms, while the second one, due to H. P. Gumm, uses ternary terms.
When the identities (Mn ) or (Gn ) of these Theorems are satisfied by terms
of a variety V , the variety is said to be n-modular.

Theorem 9.4.2 ([16]) A variety V of algebras is congruence modular iff


there are a natural number n ≥ 2 and quaternary terms q0 , . . . , qn ∈
Wτ (X)/Id V such that the following list (Mn ) of identities is satisfied in
V:
q0 (x, y, z, u) ≈ x,
qi (x, y, y, x) ≈ x for all 0 ≤ i ≤ n ,
qi (x, y, y, u) ≈ qi+1 (x, y, y, u) for all 0 < i < n, i odd,
qi (x, x, z, z) ≈ qi+1 (x, x, z, z) for all 0 ≤ i < n , i even,
qn (x, y, z, u) ≈ u.

Theorem 9.4.3 ([52]) A variety V of algebras is congruence modular iff


there are a natural number n and ternary terms g0 , . . . , gn , d ∈ Wτ (X)/Id V
such that the following list (Gn ) of identities is satisfied in V :

g0 (x, y, z) ≈ x
gi (x, y, x) ≈ x for all 0 ≤ i ≤ n,
gi (x, x, y) ≈ gi+1 (x, x, y) if i is even,
gi (x, y, y) ≈ gi+1 (x, y, y) if i is odd,
gn (x, y, y) ≈ d(x, y, y)
d(x, x, y) ≈ y.

Lemma 9.4.4 If an algebra A is congruence-permutable, then it is


congruence-modular.

Proof: Let θ, ψ and Φ ∈ Con A with θ ⊇ Φ. To show that θ ∧ (ψ ∨ Φ) ⊆


(θ ∧ ψ) ∨ Φ, let (a, b) ∈ θ ∧ (ψ ∨ Φ). Then (a, b) ∈ θ, and by congruence
permutability there is an element c ∈ A with (a, c) ∈ Φ and (c, b) ∈ ψ. Since
(a, c) ∈ Φ and Φ ⊆ θ, we have (a, c) ∈ θ. Now by symmetry and transitiv-
ity we also have (c, b) ∈ θ. This gives (c, b) ∈ θ ∧ ψ. Combining this with
(a, c) ∈ Φ gives us (a, b) ∈ (θ ∧ ψ) ∨ Φ.

We noted in Section 9.1 that if two congruences θ and ψ are permutable, then
θ ∪ ψ = θ ◦ ψ. In this case we say that the congruences are 2-permutable. We
can generalize this, to say that two congruences θ and ψ are n-permutable,
for n ≥ 2, if
θ ∨ ψ = θ1 ◦ θ2 ◦ · · · ◦ θn , where θi = θ if i is odd and θi = ψ if i is even.

If any two congruences of Con A have this property we call A n-permutable.


A variety V is said to be n-permutable if each algebra from V has this
property. It can be shown that a variety which is n-permutable is also k-
permutable for any k > n. A Mal’cev-type condition for n-permutability
was given by J. A. Hagemann and A. Mitschke.

Theorem 9.4.5 ([53]) Let n ≥ 2. A variety V of algebras is n-permutable iff there are ternary terms p0 , . . . , pn ∈ Wτ (X)/Id V such that in V the following identities are satisfied:

p0 (x, y, z) ≈ x,
pi (x, x, y) ≈ pi+1 (x, y, y) for all 0 ≤ i < n,      (Pn )
pn (x, y, z) ≈ z.

9.5 Congruence Regular Varieties


In some algebras, any congruence is uniquely determined by one of its congru-
ence classes. For instance, each congruence of a group is uniquely determined
by a normal subgroup, which is the congruence class of the identity element
of the group. Similarly, each congruence of a ring is uniquely determined by
an ideal. This motivates the following definition:

Definition 9.5.1 A congruence θ of an algebra A is called regular if it is


completely determined by one of its congruence classes. An algebra A is
called congruence regular if every congruence on A is regular, and a variety
V is called congruence regular if every A ∈ V is congruence regular.
Congruence regularity of a variety can be characterized by a Mal’cev-type-
condition which was given by R. Wille in [121].

Theorem 9.5.2 ([121]) A variety V is congruence-regular iff there are natural numbers n and m, and ternary terms p0 , . . ., pn and quaternary terms q1 , . . ., qm in V , such that for all 1 ≤ k ≤ m and suitable numbers 0 ≤ ik , jk ≤ n, all of
p0 (x, y, z) ≈ z, pi (x, x, z) ≈ z for all 1 ≤ i ≤ n,
q1 (pi1 (x, y, z), x, y, z) ≈ x,
qk−1 (pjk−1 (x, y, z), x, y, z) ≈ qk (pik (x, y, z), x, y, z) (2 ≤ k ≤ m),
qm (pim (x, y, z), x, y, z) ≈ y
are identities in V .

Note that congruence regularity of a variety V implies n-permutability for


some number n (see [53]). A simple consequence of Theorem 9.5.2 is the
following.

Corollary 9.5.3 Let V be a variety. If there exists a ternary term l in V with

l(x, y, z) = z if x = y,  l(x, y, z) = y if x = z,  and l(x, y, z) = x otherwise,

then V is congruence regular.

Proof: Given such a term l, we set n = m = 1, p0 (x, y, z) = z, p1 (x, y, z) = l(x, y, z), and q1 (u, x, y, z) = l(u, y, z). Then we have q1 (p1 (x, y, z), x, y, z) = q1 (l(x, y, z), x, y, z) = l(l(x, y, z), y, z) = x and p1 (x, x, z) = l(x, x, z) = z.
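Since l is defined pointwise, the two equations used in this proof can be checked mechanically on any finite set. The following sketch (our illustration, not from the text) does so on a four-element set:

```python
# The term operation l from Corollary 9.5.3, realized as a Python function.
def l(x, y, z):
    if x == y:
        return z
    if x == z:
        return y
    return x

# Exhaustively verify the two equations used in the proof.
for x in range(4):
    for y in range(4):
        for z in range(4):
            assert l(l(x, y, z), y, z) == x
            assert l(x, x, z) == z
```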

In [21] K. Denecke proved the following result for the special case of varieties
generated by two-element algebras.

Corollary 9.5.4 A variety generated by a two-element algebra is congruence regular if and only if it is congruence permutable (2-permutable).

9.6 Two-Element Algebras


Some important varieties are generated by two element algebras: the vari-
eties of distributive lattices, of semilattices, of implicative algebras and of
Boolean algebras are all generated by a two element algebra. In the next
Chapter, we shall give a complete survey and classification of all two ele-
ment algebras. We set the foundations for this later work in this section,
where we survey some of the properties of congruences, from the previous
sections, for the special case of two element algebras A. Our presentation
is based on the work of M. Reschke, O. Lüders and K. Denecke in [100].
The analysis is made easier by the fact that properties such as congruence
distributivity and congruence modularity can be characterized by the pres-
ence of certain ternary terms. When A has cardinality two, there are a finite
number of ternary operations defined on A, and a complete search for terms
with desired properties is possible and in fact, as we shall see, not very dif-
ficult.

When an algebra A has a base set A of cardinality two, we usually denote


A by the set {0, 1}, and we refer to operations on A as Boolean operations.
Throughout this section, we assume A = ({0, 1}; F ), where F is some set of
Boolean operations. A complete system of representatives of all two-element algebras was given by E. L. Post ([95]) in 1941. We shall study Post’s lattice of representatives in the next chapter, making use of the congruence properties we determine here.

It should be emphasized that two-element algebras are studied here only


up to equivalence, that is, equality of the sets of term operations, and up to
isomorphism. We will use the following standard notation for Boolean opera-
tions: ¬ for negation, ⇒ for implication, ∧ for conjunction, ∨ for disjunction,
+ for addition modulo 2, and ⇔ for equivalence. We sometimes also denote
conjunction by juxtaposition.

We begin our study of the properties of congruence lattices of two-element


algebras with the property of congruence distributivity, starting with 2-
distributivity. Suppose that the two-element algebra A is 2-distributive.
Then by Theorem 9.2.5 we must have terms t0 , t1 and t2 which satisfy
the following identities:

t0 (x, y, z) ≈ x, t1 (x, y, x) ≈ x, x ≈ t0 (x, x, y) ≈ t1 (x, x, y),
t1 (x, y, y) ≈ t2 (x, y, y) ≈ y, t2 (x, y, z) ≈ z.

In particular, this tells us that t1 satisfies the identities t1 (x, y, x) ≈ t1 (x, x, y) ≈ t1 (y, x, x) ≈ x, and hence acts as a majority term m. Thus
for 2-distributivity we need a ternary majority term. When we look at a
table of values to see what ternary operations are possible, we see that the
majority term properties completely restrict the values t1 (x, y, z) on all eight
triples (x, y, z) in A3 .

x 0 0 0 0 1 1 1 1
y 0 0 1 1 0 0 1 1
z 0 1 0 1 0 1 0 1
t1 (x, y, z) 0 0 0 1 0 1 1 1

Exactly one Boolean operation is determined, namely

t1 (x, y, z) = (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z) = m(x, y, z).

This proves the following result for 2-distributivity.



Lemma 9.6.1 A two-element algebra A is 2-distributive iff the ternary operation t1 (x, y, z) = (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z) is one of its term operations.
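The uniqueness claim behind this lemma, that the truth-table analysis above admits exactly one Boolean operation, namely (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), can be re-derived by exhaustive search (our illustrative check, not part of the text):

```python
from itertools import product

# Search all ternary Boolean operations for majority operations.
triples = list(product((0, 1), repeat=3))

def is_majority(table):
    t = dict(zip(triples, table))
    return all(t[(x, x, y)] == x and t[(x, y, x)] == x and t[(y, x, x)] == x
               for x in (0, 1) for y in (0, 1))

majorities = [t for t in product((0, 1), repeat=8) if is_majority(t)]
assert len(majorities) == 1
# The unique majority operation is (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z).
maj = dict(zip(triples, majorities[0]))
assert all(maj[(x, y, z)] == (x & y) | (x & z) | (y & z) for x, y, z in triples)
```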
Next we look for two-element algebras which are 3-distributive. Any such
algebra must have terms t0 , . . ., t3 which satisfy the (∆3 ) identities from
Theorem 9.2.5:

t0 (x, y, z) ≈ x, t1 (x, x, y) ≈ t0 (x, x, y) ≈ x, t1 (x, y, x) ≈ x,
t1 (x, y, y) ≈ t2 (x, y, y), t2 (x, x, y) ≈ t3 (x, x, y) ≈ y,
t2 (x, y, x) ≈ x, t3 (x, y, x) ≈ x, t3 (x, y, z) ≈ z.

We can take t0 to be the term x1 , and t3 to be x3 ; so we are looking for


Boolean operations t1 and t2 satisfying

t1 (x, x, y) ≈ t1 (x, y, x) ≈ x
t2 (x, x, y) ≈ y, t2 (x, y, x) ≈ x.

Again we turn to tables of values to see what choices are possible for t1 and
t2 . Note that the identities above determine the value of each of t1 and t2
on six of the eight triples in A3 , leaving us to consider possible values on the
two remaining triples (0, 1, 1) and (1, 0, 0).

x 0 0 0 0 1 1 1 1
y 0 0 1 1 0 0 1 1
z 0 1 0 1 0 1 0 1
t1 (x, y, z) 0 0 0 ? ? 1 1 1
t2 (x, y, z) 0 1 0 ? ? 1 0 1

Moreover, our identity t1 (x, y, y) ≈ t2 (x, y, y) forces t1 and t2 to have the


same value on these last two triples, leaving us with 4 cases to consider:

Case 1: t1 (0, 1, 1) = 0, t1 (1, 0, 0) = 0 (and the same values for t2 ).


Here we have t1 (x, y, z) = x ∧ (y ∨ z) and t2 (x, y, z) = (x ∨ ¬y) ∧ z. Thus if
both of these operations are term operations of A, then A is 3-distributive.

Case 2: t1 (0, 1, 1) = 0, t1 (1, 0, 0) = 1 (and the same values for t2 ).


In this case we have t1 (x, y, z) = x and t2 (x, y, z) = (z ∧ (xy ∨ (¬x)(¬y))) ∨ (x ∧ ¬y). But using t2 we can define a term m(x, y, z) = t2 (x, t2 (x, y, z), z),
which we can easily see acts as a majority term:

m(x, x, y) = t2 (x, t2 (x, x, y), y) ≈ t2 (x, y, y) ≈ x,


m(x, y, x) = t2 (x, t2 (x, y, x), x) ≈ t2 (x, x, x) ≈ x, and
m(y, x, x) = t2 (y, t2 (y, x, x), x) ≈ t2 (y, y, x) ≈ x.

Thus in this case our algebra is in fact 2-distributive, by the previous Lemma,
and so 3-distributive as well.
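That this composite really behaves as a majority operation can also be confirmed by brute force over all eight triples (a Python sketch, not part of the text):

```python
from itertools import product

def t2(x, y, z):
    # the Case 2 term (z ∧ (xy ∨ (¬x)(¬y))) ∨ (x ∧ ¬y)
    return (z & ((x & y) | ((1 - x) & (1 - y)))) | (x & (1 - y))

def m(x, y, z):
    # the derived term m(x, y, z) = t2(x, t2(x, y, z), z)
    return t2(x, t2(x, y, z), z)

majority = lambda x, y, z: (x & y) | (x & z) | (y & z)
assert all(m(x, y, z) == majority(x, y, z) for x, y, z in product([0, 1], repeat=3))
```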

Case 3: t1 (0, 1, 1) = 1, t1 (1, 0, 0) = 0 (and the same values for t2 ).


In this case, t1 is the operation encountered in the 2-distributivity case above,
with the properties of a ternary majority term m(x, y, z), and t2 (x, y, z) = z.
Thus when this operation t1 is a term, we know that the algebra is 2-
distributive and hence also 3-distributive.

Case 4: t1 (0, 1, 1) = 1, t1 (1, 0, 0) = 1 (and the same values for t2 ).


Here t1 (x, y, z) = x ∨ (y ∧ z), and t2 (x, y, z) = (x ∧ ¬y) ∨ z. This is the dual
of Case 1, with a similar result.

Thus for 3-distributivity, we have two pairs of terms which give us 3-distributivity directly, along with two cases which reduce to 2-distributivity. In fact,
a stronger result may be shown for the Cases 1 and 4 above. We consider
Case 1 only, with Case 4 being dual. We have shown that if both t1 and t2 of
this case are terms, our algebra is 3-distributive. We will show below that if
(only) t1 = x ∧ (y ∨ z) is a term, we can deduce 4-distributivity. Conversely,
if t2 = (x ∨ ¬y) ∧ z is a term, we can in fact produce the operation t1 as a
term, giving us 3-distributivity: when (x ∨ ¬y) ∧ z is a term, so is x ∧ (y ∨ ¬z),
and then t1 can be expressed as the term x ∧ (y ∨ ¬((x ∧ (y ∨ ¬z)))).
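Both term manipulations claimed in this paragraph are finite checks over the eight triples; a small Python sketch (not from the text) confirms them:

```python
from itertools import product

t2 = lambda x, y, z: (x | (1 - y)) & z          # (x ∨ ¬y) ∧ z
g  = lambda x, y, z: t2(y, z, x)                # permuting variables: x ∧ (y ∨ ¬z)
t1 = lambda x, y, z: x & (y | (1 - g(x, y, z))) # the claimed term for x ∧ (y ∨ z)

triples = list(product([0, 1], repeat=3))
assert all(g(x, y, z) == x & (y | (1 - z)) for x, y, z in triples)
assert all(t1(x, y, z) == x & (y | z) for x, y, z in triples)
```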

For 4-distributivity, we can start with the same argument for t1 as in the 3-
distributive case. If a two-element algebra has terms t0 , . . ., t4 which satisfy
conditions (∆4 ), then in particular the term t1 has its value constrained on
six of the eight triples, and we have 4 possibilities for t1 , as we did in the
3-distributive case.

Case 1: t1 = x ∧ (y ∨ z). If this operation is a term operation, then so are


x ∧ (z ∨ z) and z ∧ (y ∨ x). Taking these for t2 and t3 respectively, and taking
t0 = x and t4 = z, we have five terms t0 , . . ., t4 which satisfy (∆4 ). Thus in this case our algebra is 4-distributive.
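Under our reading of the (∆4) scheme (join identities at (x, x, y) for even indices and at (x, y, y) for odd ones, matching the patterns of (∆2) and (∆3) above), the five terms of Case 1 can be verified mechanically:

```python
from itertools import product

t = [lambda x, y, z: x,                 # t0
     lambda x, y, z: x & (y | z),       # t1 = x ∧ (y ∨ z)
     lambda x, y, z: x & z,             # t2 = t1(x, z, z)
     lambda x, y, z: z & (y | x),       # t3 = t1(z, y, x)
     lambda x, y, z: z]                 # t4

pairs = list(product([0, 1], repeat=2))
assert all(t[i](x, y, x) == x for i in range(5) for x, y in pairs)
assert all(t[i](x, x, y) == t[i + 1](x, x, y) for i in (0, 2) for x, y in pairs)
assert all(t[i](x, y, y) == t[i + 1](x, y, y) for i in (1, 3) for x, y in pairs)
```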

Case 2: t1 = x. Here we must consider the values possible for the remaining
terms t2 and t3 (with t4 = z). Making a table of values possible for these
two, and using the 3-distributive identities forcing values of t3 , shows that
their values are also determined on six triples, and must be equal on the
remaining two triples. This leads to four subcases:

(i) t2 = x ∧ (¬y ∨ z) and t3 = (x ∨ y) ∧ z. Thus the presence of these two


terms gives 4-distributivity. But it is easy to see that the presence of these
two as terms is equivalent to Case 4 of the 3-distributive case (with variables
x and z interchanged).
(ii) t2 = x but t3 = m. In this case we have 2-distributivity.
(iii) t2 = (z ∧ (xy ∨ (¬x)(¬y))) ∨ (x ∧ ¬y). Then as in Case 2 for
3-distributivity, we can use t2 to define a majority term and obtain 2-
distributivity again.
(iv) t2 = x ∨ (¬y ∧ z) and t3 = (x ∧ y) ∨ z. This is dual to Case 2(i), and
again gives 3-distributivity.

Case 3: t1 = m, the majority term. In this case our algebra is 2-distributive.

Case 4: t1 = x ∨ (y ∧ z). This case is dual to Case 1.

All remaining instances of n-distributivity, for n ≥ 5, are dealt with by the


following Lemma.

Lemma 9.6.2 If a two-element algebra A is n-distributive for n ≥ 2, then


it is also 4-distributive.
Proof: Let A be n-distributive for n ≥ 2. By Theorem 9.2.5 there is a term
operation t1 of A satisfying t1 (x, x, z) ≈ x and t1 (x, y, x) ≈ x. As in the
previous proofs, the value of t1 is defined on all triples except (0, 1, 1) and
(1, 0, 0); so there are only four possibilities for the operation t1 . Three of these possibilities lead to k-distributivity for some k ∈ {2, 3, 4}, exactly as in Cases 1, 3 and 4 for 3-distributivity above. The only remaining case is the one in which t1 = x. Using tables of values as before, we see that there are four possibilities for t2 in this case. Again three of these lead to either 2- or 3-distributivity. The remaining case is when both t1 and t2 equal x. But in this case the n-distributive condition reduces to the (n − 2)-distributive one, since t3 now acts like t1 , t4 like t2 , and so on.

Combining our analysis of all the cases for n-distributivity, we have the
following conclusion.

Theorem 9.6.3 A two-element algebra A is congruence-distributive iff one


of the following Boolean operations is a term operation of A:
m(x, y, z) = (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z),
x ∧ (y ∨ z), (x ∨ ¬y) ∧ z,
x ∨ (y ∧ z), or (x ∧ ¬y) ∨ z.

We now turn to the question of n-permutability of two-element algebras,


beginning with 2-permutability. We recall that in a two-element algebra we
use + to denote addition modulo two, and abbreviate x ∧ y by xy.

Lemma 9.6.4 A two-element algebra A is 2-permutable iff the operation p


with p(x, y, z) = x + y + z is a term operation of A.

Proof: The operation p fits the conditions from Mal’cev’s characterization in


Theorem 9.1.1 of congruence permutability; so if p is a term operation then
A is congruence permutable and 2-permutable. Conversely, suppose that A
is 2-permutable. Then the identities (P2 ) from Theorem 9.4.5 are satisfied for
some ternary terms p0 , p1 and p2 . The conditions for the term p1 constrain
its value on all triples in A3 other than (0,1,0) and (1,0,1), which means that
there are exactly four possible term operations p1 which fit the conditions
(P2 ). One of these, the operation with p1 (0, 1, 0) = 1 and p1 (1, 0, 1) = 0, is
the Mal’cev term p, and so guarantees 2-permutability.

We will show that in the three remaining cases for p1 , we can use the exis-
tence of the term p1 to produce a Mal’cev term p.

Case 1: If p1 (0, 1, 0) = 0 = p1 (1, 0, 1), then we can write p1 in the form


p1 (x, y, z) = x + z + xy + yz + xyz. From such a term we can produce
a Mal’cev term p, by p(x, y, z) = p1 (x, p1 (x, y, x), p1 (y, p1 (x, y, x), z)), and
hence p is a term operation of A.

Case 2: If p1 (0, 1, 0) = 0 and p1 (1, 0, 1) = 1, then p1 can be written in the


form p1 (x, y, z) = x + z + xy + xz + yz. From this we again can obtain p as a term, by p(x, y, z) = p1 (p1 (x, z, y), x, p1 (x, y, z)).

Case 3: If p1 (0, 1, 0) = 1 = p1 (1, 0, 1), then p1 can be written as


p1 (x, y, z) = x + y + z + xz + xyz, and then we obtain p by p(x, y, z)
= p1 (p1 (y, x, z), y, p1 (x, z, y)).
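Each of the three superpositions above is a finite computation over {0, 1}, so the claims can be verified mechanically (a Python sketch, not part of the text):

```python
from itertools import product

def is_malcev(p):
    """Check the Mal'cev identities p(x, y, y) = x and p(x, x, y) = y."""
    return all(p(x, y, y) == x and p(x, x, y) == y
               for x, y in product([0, 1], repeat=2))

# Case 1: p1 = x + z + xy + yz + xyz (mod 2)
p1a = lambda x, y, z: (x + z + x*y + y*z + x*y*z) % 2
pa  = lambda x, y, z: p1a(x, p1a(x, y, x), p1a(y, p1a(x, y, x), z))

# Case 2: p1 = x + z + xy + xz + yz
p1b = lambda x, y, z: (x + z + x*y + x*z + y*z) % 2
pb  = lambda x, y, z: p1b(p1b(x, z, y), x, p1b(x, y, z))

# Case 3: p1 = x + y + z + xz + xyz
p1c = lambda x, y, z: (x + y + z + x*z + x*y*z) % 2
pc  = lambda x, y, z: p1c(p1c(y, x, z), y, p1c(x, z, y))

assert is_malcev(pa) and is_malcev(pb) and is_malcev(pc)
# in Case 1 the construction even returns the Mal'cev term x + y + z itself
assert all(pa(x, y, z) == (x + y + z) % 2 for x, y, z in product([0, 1], repeat=3))
```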

Our next lemma reduces the investigation of n-permutability to the cases


n = 2 and n = 3.

Lemma 9.6.5 If a two-element algebra A is n-permutable for n > 2, then


it is either 2-permutable or 3-permutable.

Proof: Let A be n-permutable, for some n ≥ 3. Then there is a term operation p1 of A which satisfies the identity p1 (x, z, z) ≈ x. This identity
uniquely determines the value of p1 on four of the eight triples in A3 , leav-
ing the value undefined on (0,0,1), (0,1,0), (1,0,1), and (1,1,0). This gives us
16 possible choices for the values of p1 . But we can eliminate a number of
these cases immediately. First, p1 cannot equal the first projection function
x1 , since in that case n-permutability just reduces to (n − 1)-permutability.
In one case p1 equals the Mal’cev term p, and the algebra is 2-permutable
by the previous Lemma. Moreover, a ternary operation p1 (x, y, z) is a term
operation of the algebra A iff every operation arising from p1 by permuting
y and z is a term operation of A too; and identities satisfied by p1 are also
satisfied by the operation dual to p1 . With this information we are left with
exactly five possibilities to consider for p1 .
Case 1: p1 (0, 0, 1) = p1 (0, 1, 0) = p1 (1, 1, 0) = 0 and p1 (1, 0, 1) = 1.
Then p1 (x, y, z) = x + xy + xyz, and by superposition we obtain a term p2
by p2 (x, y, z) = z + zy + xyz. Now taking p0 = x, p1 = x + xy + xyz, p2 =
z + zy + xyz and p3 = z, we have terms which satisfy condition (P3 ) from
Theorem 9.4.5, and our algebra is 3-permutable.

Case 2: p1 (0, 0, 1) = p1 (0, 1, 0) = p1 (1, 0, 1) = p1 (1, 1, 0) = 0.


Then p1 (x, y, z) = x + xy + xz. By superposition we can produce a term q
with q(x, y, z) = p1 (x, p1 (p1 (x, z, y), y, x), y). But then q is exactly the term
obtained for p1 in Case 1, and as before the algebra is 3-permutable.
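With p0 = x and p3 = z, the (P3) chain pi(x, x, y) ≈ pi+1(x, y, y) (our reading of Theorem 9.4.5) and the Case 2 reduction can both be confirmed by brute force:

```python
from itertools import product

p1 = lambda x, y, z: (x + x*y + x*y*z) % 2   # Case 1: x + xy + xyz
p2 = lambda x, y, z: (z + z*y + x*y*z) % 2   # Case 1: z + zy + xyz

pairs = list(product([0, 1], repeat=2))
# (P3) with p0 = x, p3 = z, chained as pi(x, x, y) = p(i+1)(x, y, y):
assert all(p1(x, y, y) == x for x, y in pairs)            # p0(x, x, y) = p1(x, y, y)
assert all(p1(x, x, y) == p2(x, y, y) for x, y in pairs)
assert all(p2(x, x, y) == y for x, y in pairs)            # p2(x, x, y) = p3(x, y, y)

# Case 2: q(x, y, z) = r(x, r(r(x, z, y), y, x), y) with r = x + xy + xz
r = lambda x, y, z: (x + x*y + x*z) % 2
q = lambda x, y, z: r(x, r(r(x, z, y), y, x), y)
assert all(q(x, y, z) == p1(x, y, z) for x, y, z in product([0, 1], repeat=3))
```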

Case 3: p1 (0, 0, 1) = p1 (1, 1, 0) = 0, p1 (0, 1, 0) = p1 (1, 0, 1) = 1.


Here we have p1 (x, y, z) = x + y + yz. Again we can make the term used in Case 1, by p1 (x, y, p1 (y, x, z)), so that 3-permutability is satisfied.

Case 4: p1 (0, 0, 1) = p1 (1, 0, 1) = 0, p1 (0, 1, 0) = p1 (1, 1, 0) = 1.


Here we can write p1 (x, y, z) = x + y + xy + yz + xz. As in the proof of the
previous Lemma, we can obtain by superposition a Mal’cev term p, and A
is 2-permutable.

Case 5: p1 (0, 0, 1) = p1 (1, 0, 1) = p1 (1, 1, 0) = 0 and p1 (0, 1, 0) = 1.


As in Case 4, we can use p1 (x, y, z) = x + y + xz + yz + xyz to produce a
Mal’cev term p, again giving us 2-permutability.

Since 2-permutability also implies n-permutability for any n > 2, we have


shown that any n-permutable two-element algebra is 3-permutable. In fact
we have exactly two possibilities:

(i) A is 2-permutable, or

(ii) A is 3-permutable but not 2-permutable.

9.7 Exercises
9.7.1. Prove Corollary 9.5.4.

9.7.2. Prove the assertion following the statement of Theorem 9.2.4, that
when V has a majority term m, we may take n = 2, and terms t0 = x1 , t1
= m and t2 = x3 , to obtain the identities (∆2 ) of Jónsson's Condition.

9.7.3. Prove that for any two-element algebra A, the variety V (A) generated
by A is congruence modular iff it is congruence distributive or congruence
permutable.

9.7.4. Prove that a two-element algebra is 3-distributive but not 2-


distributive iff (x ∨ ¬y) ∧ z or (x ∧ ¬y) ∨ z is a term operation, but m
is not a term operation.

9.7.5. Prove that a two-element algebra is 4-distributive but not 3-


distributive if either

(i) x ∨ (y ∧ z), but not (x ∧ ¬y) ∨ z, or



(ii) x ∧ (y ∨ z), but not (x ∨ ¬y) ∧ z

are term operations, but m is not a term operation.

9.7.6. Prove that a two-element algebra A is 3-permutable but not 2-


permutable iff either r or r0 , with

r(x, y, z) = x + xz + xyz, r0 (x, y, z) = x + y + xy + yz + xyz,

are term operations of A, but p is not a term operation of A.

9.7.7. Prove that in a congruence permutable variety V there is a term p


such that p(x, x, y) ≈ y is an identity of V .
Chapter 10

Clones and Completeness

In Example 1.2.14, and again in Chapters 2 and 5, we studied the concept of


a clone as a set of operations defined on a base set A which is closed under
composition and contains all the projection operations. In Section 10.1 we
describe several ways in which we can regard a clone as an algebraic struc-
ture. In Section 10.2 we describe clones using relations, since clones occur
as sets of operations which preserve relations on a set A. This description is
based on a Galois-connection, with the clones corresponding to closed sets,
and thus the set of all clones of operations defined on a fixed finite set forms
a lattice. In Section 10.3 we shall use our results from Chapter 9 on the
congruence lattice properties of two-element algebras, to give a complete de-
scription of the lattice of all clones of operations defined on a two-element
set. An important question is to decide when a set of operations defined on
a fixed set generates the set of all operations defined on this set, the so-
called functional completeness problem. We will also describe the algebraic
properties of certain classes of finite algebras connected with the functional
completeness problem.
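As a small illustration of the functional completeness problem, the binary part of the clone generated by {∧, ¬} can be computed by closing the two projections under pointwise superposition; since {∧, ¬} is functionally complete, the closure is all 16 binary Boolean operations (an illustrative sketch, not from the text):

```python
from itertools import product

inputs = list(product([0, 1], repeat=2))   # argument tuples for binary operations

e1 = tuple(x for x, y in inputs)           # first projection, as a value table
e2 = tuple(y for x, y in inputs)           # second projection

ops = {e1, e2}
changed = True
while changed:                             # close under pointwise ¬ and ∧
    changed = False
    for f in list(ops):
        for g in list(ops):
            for h in (tuple(1 - a for a in f),
                      tuple(a & b for a, b in zip(f, g))):
                if h not in ops:
                    ops.add(h)
                    changed = True

# every binary Boolean operation is a term operation of ({0, 1}; ∧, ¬)
assert len(ops) == 16
```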

10.1 Clones as Algebraic Structures


There are several ways to regard a clone of operations as an algebraic struc-
ture, that is, as a pair consisting of one (or more) sets and of sets of operations
defined on these sets. The problem is how to handle the different arities of
the operations. A. I. Mal’cev ([74]) defined clones using what is called the full
iterative algebra O(A) = (O(A); ∗, ζ, τ, ∆, e21 ), as in Example 1.2.14. Here we
have a (homogeneous) algebra with one set of objects, but with five different


operations on our set, one binary, three unary, and one nullary. Subalgebras
of this algebra are then called iterative algebras. The binary operation ∗ is
clearly associative. This makes any iterative algebra a semigroup, with three
additional unary operations and a nullary operation. It is not hard to prove
that universes of subalgebras of O(A) are clones in our sense: they are closed
under composition and contain all the projections. Conversely, any clone of
operations can be shown to be closed under the iterative algebra operations
∗, ζ, τ , ∆ and e21 . Thus any clone occurs as the base set of a subalgebra of
a full iterative algebra. Universes of clones are also called closed classes or superposition-closed classes, since composition of operations is also referred to as superposition.

Another approach to defining clones algebraically is to use heterogeneous


algebras, also known as many-sorted or multi-based algebras (see Higgins,
[55] and G. Birkhoff and J. D. Lipson, [9]). In such algebras, we have more
than one base set of objects, and operations between different sets of objects.
This approach is useful for clones of operations, where we can separate the
operations of different arities into different base sets. Thus we have a base
set On (A) for each n ∈ IN+ , with composition operations Snm for each pair n, m ∈ IN+ and nullary operations eni for each n ∈ IN+ and 1 ≤ i ≤ n, picking out the projection operations. We write

O(A) = ((On (A))n∈IN+ ; (Snm )m,n∈IN+ , (eni )n∈IN+ , 1≤i≤n ),

for the (heterogeneous) full clone of all operations defined on the set A. A
clone on A is then any subalgebra of this algebra. Each such clone belongs
to a variety K0 of heterogeneous algebras defined by the following identities
(C1), (C2), (C3):
(C1) Spm (z, Snm (y1 , x1 , . . . , xn ), . . . , Snm (yp , x1 , . . . , xn )) ≈ Snm (Spn (z, y1 , . . . , yp ), x1 , . . . , xn ), (m, n, p ∈ IN+ ),

(C2) Snm (eni , x1 , . . . , xn ) ≈ xi , (m ∈ IN+ , 1 ≤ i ≤ n),

(C3) Snn (y, en1 , . . . , enn ) ≈ y, (n ∈ IN+ ).
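The identities (C1)–(C3) can be spot-checked for concrete Boolean operations; here S(f, gs) implements the substitution operation S^n_m as we read it from the definition above (a sketch with randomly chosen operations):

```python
import random
from itertools import product

def S(f, gs):
    """Substitution: compose the n-ary f with n operations gs of a common arity."""
    return lambda *args: f(*(g(*args) for g in gs))

def e(n, i):
    """The projection e^n_i (1-indexed)."""
    return lambda *args: args[i - 1]

def equal(f, g, arity):
    return all(f(*a) == g(*a) for a in product([0, 1], repeat=arity))

def rand_op(n, rng):
    table = {a: rng.randint(0, 1) for a in product([0, 1], repeat=n)}
    return lambda *a: table[a]

rng = random.Random(0)
m, n, p = 2, 3, 2
z  = rand_op(p, rng)                      # a p-ary operation
ys = [rand_op(n, rng) for _ in range(p)]  # p operations of arity n
fs = [rand_op(m, rng) for _ in range(n)]  # n operations of arity m

# (C1) superassociativity
assert equal(S(z, [S(y, fs) for y in ys]), S(S(z, ys), fs), m)
# (C2): substituting into a projection returns the i-th substituted operation
assert all(equal(S(e(n, i), fs), fs[i - 1], m) for i in range(1, n + 1))
# (C3): substituting the projections changes nothing
assert equal(S(z, [e(p, i) for i in range(1, p + 1)]), z, p)
```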

Up to isomorphism, the elements of the variety K0 are exactly the clones


regarded as heterogeneous algebras (see for instance W. Taylor, [112]). Our
heterogeneous clones correspond to algebraic theories, or particular cate-
gories, in the sense of F. W. Lawvere ([71]).

10.2 Operations and Relations


In this section we recall and expand on the Galois-connection introduced in
Section 2.2 between sets of operations and sets of relations on a base set
A. For any positive integer h, an h-ary relation on the set A is a subset ρ
of Ah (a set of h-tuples consisting of elements of A). We say that an n-ary
operation f ∈ On (A) preserves ρ if for every h × n matrix Y whose column vectors y1 , . . . , yn all belong to ρ, the h-tuple f (y1 , . . . , yn ) obtained by applying f to the rows of Y also belongs to ρ. In this case we also say that ρ is invariant with respect to f , or f is a
polymorphism of ρ, or f is compatible with ρ.
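For a finite base set the preservation condition is a finite check, and can be implemented directly (an illustrative Python sketch, not part of the text):

```python
from itertools import product

def preserves(f, n, rho):
    """True iff the n-ary operation f preserves the relation rho:
    whenever all n columns of a matrix lie in rho, applying f to the
    rows gives a tuple that again lies in rho."""
    rho = set(rho)
    return all(tuple(f(*row) for row in zip(*cols)) in rho
               for cols in product(rho, repeat=n))

leq = {(0, 0), (0, 1), (1, 1)}                  # the natural order on {0, 1}
assert preserves(lambda x, y: x & y, 2, leq)    # ∧ is monotone
assert not preserves(lambda x: 1 - x, 1, leq)   # negation is not
assert preserves(lambda x, y: x & y, 2, {(0,)}) # ∧ preserves the subset {0}
```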

The set of all operations defined on A preserving a given relation ρ ⊆ Ah


is denoted by PolA ρ. It is very easy to show that sets of operations of the form PolA ρ are clones. It is also well known that all clones occur in this way, using the following notation. For a set A let Rh (A) be the set of all h-ary relations on A, and let R(A) = ⋃h≥1 Rh (A) be the set of all finitary relations on A.

For any set F ⊆ O(A) of operations and any set Q ⊆ R(A) of relations we
consider

PolA Q := {f | f ∈ O(A) and ∀ ρ ∈ Q (f preserves ρ)} and

InvA F := {ρ | ρ ∈ R(A) and ∀ f ∈ F (f preserves ρ)}.

(For convenience we usually write PolA ρ and InvA f for PolA {ρ} and
InvA {f }, and if the base set is clear from the context we usually omit the
subscript A.)

In Section 2.2 we considered the operators

Inv : O(A) → R(A) : F ↦ InvA F and

Pol : R(A) → O(A) : Q ↦ PolA Q

as an example of a Galois-connection. This connection is induced by the


relation of preservation, that is,

R := {(f, ρ) | f ∈ O(A), ρ ∈ R(A) and f preserves ρ}.



Clones can then be characterized as Galois-closed sets of operations, that is,


sets F having the property that Pol Inv F = F (see R. Pöschel and L. A. Kalužnin, [96]). Dually, closed sets R of relations satisfying Inv Pol R = R
are called relational clones. There is also an algebraic characterization of
relational clones using certain operations defined on sets of relations ([96]).
As we saw in Section 6.1, the two classes of closed sets of a Galois-connection
form complete lattices which are dually isomorphic to each other. In the spe-
cific case of clones of operations, we see that the set of all clones of operations
defined on a base set A forms a complete lattice LA ; moreover this lattice is
dually isomorphic to the lattice of all relational clones on A.

In the next section we will give a complete description of this lattice LA , in


the special case that A is a two-element set. First, we give some examples of
relations and the corresponding clones in this special case. We fix A = {0, 1},
and refer to operations on A as Boolean operations. For the first example,
we take ρ to be the binary relation {(0, 1), (1, 0)}. We will denote subtraction
modulo 2 by −, and, as before, negation by ¬. Then a 2 × n matrix Y = (yij )
has all columns in ρ iff y2j = 1 − y1j = ¬y1j holds for all j = 1, . . . , n. Thus
the condition is that
(f (y11 , . . . , y1n ), f (¬y11 , . . . , ¬y1n )) ∈ ρ
holds for all y11 , . . . , y1n ∈ {0, 1}. This means that
¬f (y11 , . . . , y1n ) = f (¬y11 , . . . , ¬y1n )
or
f (y11 , . . . , y1n ) = ¬(f (¬y11 , . . . , ¬y1n ))
for all y11 , . . . , y1n ∈ {0, 1}. A Boolean function f ∗ is called dual to the
Boolean function f if f ∗ (y11 , . . . , y1n ) = ¬(f (¬y11 , . . . , ¬y1n )); Boolean func-
tions with f ∗ = f are called self-dual. We can express this more algebraically
by saying that f is self-dual iff the permutation which interchanges 0 and 1
is an automorphism of the algebra ({0, 1}; f ). Using Post’s original notation
([95]), we will denote by D3 the set of all self-dual Boolean functions. Thus
we see that for the given relation ρ on A = {0, 1}, we have PolA ρ = D3 .
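The equivalence between preserving this relation and being self-dual is easy to spot-check (a self-contained sketch that re-implements the preservation test locally):

```python
from itertools import product

rho = {(0, 1), (1, 0)}   # the graph of negation on {0, 1}

def preserves(f, n, rel):
    return all(tuple(f(*row) for row in zip(*cols)) in rel
               for cols in product(rel, repeat=n))

def self_dual(f, n):
    return all(f(*a) == 1 - f(*(1 - x for x in a))
               for a in product([0, 1], repeat=n))

maj = lambda x, y, z: (x & y) | (x & z) | (y & z)
tests = [(maj, 3), (lambda x, y: x & y, 2),
         (lambda x: 1 - x, 1), (lambda x, y: x ^ y, 2)]
for f, n in tests:
    assert preserves(f, n, rho) == self_dual(f, n)
```

The majority operation and negation pass both tests; ∧ and + fail both, as expected.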

Next we consider the relation ρ := {(0, 0), (0, 1), (1, 1)}, which can be defined
by (x, y) ∈ ρ iff x ≤ y. A 2×n matrix has all columns in ρ whenever y1j ≤ y2j
for all j = 1, . . . , n and thus f preserves ρ if
f (y11 , . . . , y1n ) ≤ f (y21 , . . . , y2n ) provided y11 ≤ y21 , . . . , y1n ≤ y2n .

This is the standard definition of a monotone Boolean function. The set of


all monotone Boolean functions is denoted by A1 , and we have PolA ρ = A1
in this example.
A unary relation on A is just a subset of A. For ρ equal to the singleton
set {0}, there is a single 1 × n matrix with all columns in {0}, namely
Y = (0, . . . , 0). The Boolean function f preserves {0} iff f (0, . . . , 0) = 0.
Similarly, f preserves {1} iff f (1, . . . , 1) = 1. We use Post’s names C2 for the
set of all 1-preserving Boolean functions and C3 for the set of all 0-preserving
Boolean functions.

As a last example consider

ρ := {(y1 , y2 , y3 , y4 ) ∈ {0, 1}4 | y1 + y2 = y3 + y4 },

where + as usual is addition modulo 2. A Boolean function is linear if there


are elements c0 , c1 , . . . , cn ∈ {0, 1} such that f (x1 , . . . , xn ) = c0 + c1 x1 + · · · +
cn xn . It can be shown that a Boolean function is linear iff it preserves the
relation ρ. The clone of all linear Boolean functions is denoted by L1 .
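The stated equivalence between linearity and preserving ρ can be verified exhaustively for all 16 binary Boolean functions (an illustrative sketch, not part of the text):

```python
from itertools import product

rho = {q for q in product([0, 1], repeat=4)
       if (q[0] + q[1]) % 2 == (q[2] + q[3]) % 2}

def preserves(f, n, rel):
    return all(tuple(f(*row) for row in zip(*cols)) in rel
               for cols in product(rel, repeat=n))

def is_linear(f, n):
    # f is linear iff f(x1, ..., xn) = c0 + c1 x1 + ... + cn xn (mod 2)
    for c in product([0, 1], repeat=n + 1):
        if all(f(*a) == (c[0] + sum(ci * xi for ci, xi in zip(c[1:], a))) % 2
               for a in product([0, 1], repeat=n)):
            return True
    return False

# check the equivalence for all 16 binary Boolean functions
for values in product([0, 1], repeat=4):
    table = dict(zip(product([0, 1], repeat=2), values))
    f = lambda x, y, t=table: t[(x, y)]
    assert preserves(f, 2, rho) == is_linear(f, 2)
```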

10.3 The Lattice of All Boolean Clones


As we saw in the previous section, the class of all clones on a fixed set A
forms a complete lattice LA . In the case that A is the two-element set {0, 1},
this lattice is called the lattice of Boolean clones. It was first described by E.
L. Post in 1941, and is sometimes also called Post’s lattice. The lattice is
countably infinite, complete, algebraic, atomic and dually atomic. It is also
known that every clone in the lattice is finitely generated. Post’s original
proof of the structure of the lattice requires several combinatorial consid-
erations, and simpler proofs have been given since then (see for instance J.
Berman, [7] or D. Lau, [69]).

In this section we will give a proof using our results from Section 9.6 on the
properties of the congruence lattices of two-element algebras. Clearly, if all
two-element algebras are known, then all closed classes of Boolean operations
are known. For the set A = {0, 1}, we will give a complete classification
of the varieties V (A) generated by algebras with base set A. Our analysis
will be broken into a number of cases, beginning with whether V (A) is
congruence modular or not. In the case that V (A) is congruence modular,
we will consider two further cases, depending on whether V (A) is congruence distributive or not. The case that V (A) is congruence distributive will again
be broken into cases, based on whether V (A) is 2-distributive, 3-distributive
but not 2-distributive or finally 4-distributive but not 3-distributive. Note
that as in Section 9.6 we do not distinguish between isomorphic or equivalent
(having the same clone of term operations) algebras. We will use the notation
cloneA for the term clone, or clone of all term operations, of A. We also use
the standard notation for Boolean operations, as introduced in Section 9.6.
In addition, we will denote by c20 and c21 the binary constant operations with
values 0 and 1, respectively.

Lemma 10.3.1 Let A be a two-element algebra, with term clone cloneA.


If V (A) is not congruence modular, then cloneA is a subset of the clone
generated by one of the following sets of Boolean operations:
M1 = {∧, c20 , c21 , e21 , e22 },
M2 = {∨, c20 , c21 , e21 , e22 },
M3 = {c20 , c21 , e21 , e22 , ¬ e21 , ¬ e22 }.
Proof: We note first that none of the Boolean operations g1 to g7 in the
following list can be contained in cloneA:

g1 (x, y) = ¬ x ∧ y, g2 (x, y) = x ∧ ¬y, g3 (x, y) = ¬x ∧ ¬y,


g4 (x, y) = x + y + 1, g5 (x, y) = x ∨ ¬y, g6 (x, y) = ¬x ∨ y,
g7 (x, y) = ¬x ∨ ¬y.

This is because otherwise we could construct by superposition one of the


operations x + y + z, (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x ∧ (y ∨ z), x ∨ (y ∧ z),
(x ∨ ¬ y) ∧ z, (x ∧ ¬y) ∨ z, any of which we know from Theorem 9.6.3 guaran-
tees congruence modularity. Similarly, x ∧ y and x ∨ y cannot both be term
operations of A, since from them we could produce as a term the operation
(x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z). Also, if either of ∧ or ∨ is an element of cloneA
then ¬ cannot be a term operation, since otherwise we could produce one of
the operations g1 to g7 above.
For any set F of Boolean operations, we denote by F d the set of all op-
erations dual to those from F . It is easy to prove that the algebras A =
({0, 1}; (fiA )i∈I ) and Ad = ({0, 1}; ((fiA )d )i∈I ) are always isomorphic. Obvi-
ously M2d = M1 , so we can restrict ourselves to M1 and M3 . We will show
that the set of all binary term operations of A is contained in either M1 or
M3 .

Next we claim that any essentially at least ternary operation g from cloneA must be monotone. Monotonicity means that if we define (a1 , . . . , an ) ≼ (b1 , . . . , bn ) iff ai ≤ bi for all i ∈ {1, . . . , n}, for ai , bi ∈ {0, 1}, then for arbitrary n-tuples, if (a1 , . . . , an ) ≼ (b1 , . . . , bn ) then g(a1 , . . . , an ) ≤ g(b1 , . . . , bn ). Let g be an essentially at
least ternary operation. If g were not monotone, there would be a pair of
n-tuples
a = (a1 , a2 , . . . , ai−1 , 0, ai+1 , . . . , an )
b = (a1 , a2 , . . . , ai−1 , 1, ai+1 , . . . , an )
for which g(a) = 1 > 0 = g(b). We can assume that g is not constant, and
therefore that g(0, . . . , 0) = 0 and g(1, . . . , 1) = 1. But then by identifica-
tion of variables we can make a ternary operation f with f (0, 0, 0) = 0,
f (1, 0, 1) = 0, f (0, 0, 1) = 1 and f (1, 1, 1) = 1. Consideration of all possible
values for f on the remaining four triples in A3 shows that we would get
one of the operations x + y + z, (x ∨ ¬ y) ∧ z or (x ∧ ¬ y) ∨ z. But these are
impossible, as we saw above. Hence g must be monotone.

Now we will show that the n-ary term operations of cloneA are n-ary pro-
jections, negations of n-ary projections, n-ary constant operations, or can
be written in the form f (x1 , . . . , xn ) = xi1 ∧ . . . ∧ xin with {i1 , . . . , in } ⊆
{1, . . . , n}. For the binary term operations this is clear. For n > 2, we have
seen that any essentially n-ary term operation h is monotone. Suppose that
h is not constant and not an n-ary projection. Then h(0) = 0 and h(1) = 1.
In addition, there must exist an n-tuple a = (a1 , . . . , ai , . . . , aj , . . . , an ) with
a ≠ 0 and a ≠ 1, but h(a) = 1; that is, there is at least one i and a number j, with 1 ≤ i, j ≤ n, such that either ai = 0 and aj = 1 or aj = 0
and ai = 1. Since h is not an n-ary projection, there is an n-tuple b =
(b1 , . . . , bi , . . . , bj , . . . , bn ) with h(b) ≠ bj . By identification and permutation of variables we can make a binary term operation h′ with h′ (0, 0) = 0, h′ (0, 1) = 1, h′ (bi , bj ) ≠ bj and h′ (1, 1) = 1. Since h′ is monotone we must have bj = 0 and thus bi = 1, and so h′ (x, y) = x ∨ y.

But this means we can produce the join operation ∨ as a term operation of
A. If the binary term operations of A are contained in M1 , we have both ∨
and ∧, and from these we obtain (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z). Similarly, if the set
of all binary term operations of A is contained in the set M3 , we have both
¬ and ∨ as term operations, and again can produce (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z).
But this makes A congruence modular.

This shows that cloneA is contained in the clone generated by M1 . If the


binary term operations of A are contained in M3 then the n-ary term opera-
tions of A are n-ary constants, n-ary projections, or negations of n-ary pro-
jections. Operations of the form f (x1 , . . . , xn ) = xi1 ∧. . .∧xin , {i1 , . . . , in } ⊆
{1, . . . , n}, cannot be term operations since otherwise ∧ is a term operation.
Thus cloneA is included in the clone generated by M3 in this case.

Analysis of the sets M1 and M3 yields the following description of all two-
element algebras which generate varieties which are not congruence modular.
As usual, we use Post’s names for the algebras.

Theorem 10.3.2 Let A be a two-element algebra generating a variety which


is not congruence modular. Then (up to isomorphism and equivalence) A is
one of the following algebras:

P6 = ({0, 1}; ∧, c20 , c21 ), P3 = ({0, 1}; ∧, c20 ), P5 = ({0, 1}; ∧, c21 ),
P1 = ({0, 1}; ∧), O9 = ({0, 1}; ¬, c20 ),
O8 = ({0, 1}; e21 , c20 , c21 ), O6 = ({0, 1}; e21 , c20 ),
O4 = ({0, 1}; ¬), O1 = ({0, 1}; e21 ).

Having found all two-element algebras which generate varieties which are
not congruence modular, we turn now to the other case, of two-element
algebras which generate congruence modular varieties. This case will also
be treated in two subcases, depending on whether the variety generated
is congruence distributive or not. First, suppose that V (A) is congruence
modular but not congruence distributive. In this case, we know that V (A) is
congruence permutable. By Theorem 9.6.3 and Lemma 9.6.4, the operation x + y + z
is a term operation of A, but none of (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x ∧ (y ∨ z),
x ∨ (y ∧ z), (x ∨ ¬ y) ∧ z and (x ∧ ¬ y) ∨ z are term operations of A. Clearly,
the clone ⟨{x + y + z}⟩ is the least clone of Boolean operations to contain
the operation x + y + z. From a well-known result of R. McKenzie in [78] it
follows that the greatest clone of Boolean operations to have this property,
that is, to contain x + y + z but none of (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x ∧ (y ∨ z),
x ∨ (y ∧ z), (x ∨ ¬ y) ∧ z and (x ∧ ¬ y) ∨ z, is the clone consisting of all
linear Boolean operations. Linear Boolean operations are those of the form
f (x1 , . . . , xn ) = c0 + c1 x1 + . . . + cn xn , for ci ∈ {0, 1}. It is easy to show that the interval [⟨{x + y + z}⟩, ⟨{+, c21 }⟩] contains exactly the following clones of
Boolean operations:

⟨{+, c21 }⟩, ⟨{x + y + 1}⟩, ⟨{¬, x + y + z}⟩, ⟨{+}⟩, ⟨{x + y + z}⟩.

This tells us all the two-element algebras which generate a congruence mod-
ular but not congruence distributive variety, namely

L1 = ({0, 1}; +, c21 ), L2 = ({0, 1}; x + y + 1), L3 = ({0, 1}; +),


L5 = ({0, 1}; x + y + z, ¬), L4 = ({0, 1}; x + y + z).

At this point in our analysis, any remaining two-element algebras generate


congruence distributive varieties. As we know from Section 9.6, there are
exactly the following three cases for congruence distributivity: V (A) may be
2-distributive, or 3-distributive but not 2-distributive, or 4-distributive but
not 3-distributive.

Consider first the case that V (A) is 2-distributive. Then by Lemma 9.6.1,
the majority operation (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z) is a term operation of A.
Using the Baker-Pixley Theorem, Theorem 9.2.6, we can describe cloneA
as the clone consisting of exactly all operations defined on A which pre-
serve all subalgebras of the direct square A2 . By examining all sublattices
of the lattice of all subsets of A2 which can be subalgebra lattices of A2
we can determine all two-element algebras with a majority term operation
(x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z). To describe them all, we need the following five
relations on A:

≤ := {(00), (01), (11)}, s2 := {(01), (10)},

R2 := {(00), (01), (10)}, R2′ := {(11), (01), (10)},
∆ := {(00), (11)}.

Then we get exactly the following clones:

Pol ∆, Pol{0}, Pol{1}, Pol({0}, {1}), Pol ≤, Pol R2 , Pol R2′ ,

Pol s2 , Pol({0}, ≤), Pol({0}, {1}, ≤), Pol({1}, ≤⁻¹ ), Pol({0}, {1}, ≤⁻¹ ),
Pol({1}, R2 ), Pol({0}, R2′ ), Pol({0}, {1}, s2 ), Pol(≤, R2 , R2′ ).

It is not hard to determine a generating system for each clone in this list,
and from this we produce our list of all two-element algebras which generate a 2-distributive variety:

A1 = ({0, 1}; ∧, ∨, c20 , c21 ),


A2 = ({0, 1}; ∧, ∨, c21 ),
A3 = ({0, 1}; ∧, ∨, c20 ),
A4 = ({0, 1}; ∧, ∨),
C1 = ({0, 1}; ∧, ¬),
C3 = ({0, 1}; ∨, g2 ), where g2 = x ∧ ¬ y,
C2 = ({0, 1}; ∨, x + y + 1),
C4 = ({0, 1}; ∨, t), with t(x, y, z) = x ∧ (y + z + 1),
D2 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z)),
D1 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x + y + z),
D3 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x + y + z, ¬),

F52 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), (x ∧ ¬ y) ∨ z),


F62 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x ∧ (y ∨ z)),
F72 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), c20 ),
F92 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), g2 ),
F12 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), (x ∨ ¬ y) ∧ z),
F22 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x ∨ (y ∧ z)),
F32 = ({0, 1}; (x ∧ y) ∨ (y ∧ z) ∨ (x ∧ z), c21 ),
F42 = ({0, 1}; (x ∧ y) ∨ (y ∧ z) ∨ (x ∧ z), x ∨ ¬ y).

For the next case, we assume now that V (A) is congruence distributive, but
not 2-distributive. Then x ∧ (y ∨ z) or x ∨ (y ∧ z) is a term operation,
but (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z) is not a term operation. We consider two
further subcases here, depending on whether the term clone cloneA can be
represented as Pol ρ for some µ-ary relation ρ on {0, 1}. The next lemma,
due to M. Reschke and K. Denecke ([99]), handles the case where cloneA =
Pol ρ for some at least ternary relation. It uses a function hµ with properties
corresponding to the generalized Baker-Pixley Theorem (see Remark 9.2.7).
Such functions are called near-unanimity functions. The generalized Baker-
Pixley Theorem shows that a function f is a term function of A if and only
if f preserves all subalgebras of Aµ.

Lemma 10.3.3 ([99]) Let cloneA = Pol ρ for a µ-ary relation ρ ⊆ {0, 1}µ,
where µ ≥ 3. Then V (A) is congruence distributive but not 2-distributive if
and only if the following operation hµ or its dual h0µ is a term operation in
cloneA:

hµ(x1, . . . , xµ+1) = ∨_{i=1}^{µ+1} (x1 ∧ . . . ∧ xi−1 ∧ xi+1 ∧ . . . ∧ xµ+1).

Proof: We will prove that x ∧ (y ∨ z) ∈ Pol ρ if and only if hµ ∈ Pol ρ.
The dual case, that x ∨ (y ∧ z) ∈ Pol ρ if and only if h0µ ∈ cloneA, is
similar. In one direction, if hµ ∈ Pol ρ for some natural number µ ≥ 3, then
hµ(x1, x2, x3, x1, . . . , x1) = x1 ∧ (x2 ∨ x3) ∈ Pol ρ. Conversely, assume that
x1 ∧ (x2 ∨ x3) ∈ Pol ρ and that

a1 = (a11, . . . , a1µ) ∈ ρ, . . . , aµ+1 = (aµ+1 1, . . . , aµ+1 µ) ∈ ρ.

Since cloneA = Pol ρ, it is enough to show that

hµ(a1, . . . , aµ+1) = (b1, . . . , bµ) = b ∈ ρ.

Let i1, i2, . . ., il be the indices k from {1, . . . , µ} for which bk = 1. Since hµ is
a join of meets, this means that there must be an index j, with 1 ≤ j ≤ µ + 1,
for which the corresponding entries aji1, . . ., ajil are also equal to 1. (If b
is the zero vector, we can pick any index j.) If aj agrees with b in all other
entries as well, then b = aj ∈ ρ. Otherwise, let α be the least index from
1, . . . , µ such that bα ≠ ajα. This means that bα = 0, while ajα = 1. From
the structure of hµ again, it is clear that there are two elements ar and as,
with r ≠ s, r ≠ j and s ≠ j and r, s ∈ {1, . . . , µ + 1}, such that arα = asα = 0. Now since
the term x1 ∧ (x2 ∨ x3) is in Pol ρ, we must have aj ∧ (ar ∨ as) = b1 ∈ ρ.
This b1 is a µ-tuple which agrees with b at the positions i1, i2, . . . , il and
α. If b = b1 ∈ ρ, we are done. Otherwise, we repeat this process: we let
β ∈ {1, . . . , µ} be the least index for which bβ ≠ b1β, so that bβ = 0 and
b1β = 1. Continuing in this way, after finitely many steps, we get a µ-tuple
bk ∈ ρ which agrees with b in every entry, so that b = bk ∈ ρ.
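The near-unanimity operation hµ and the substitution used at the start of the proof are easy to check by machine. A hedged sketch (the function names are ours):

```python
from itertools import product

def h(mu, xs):
    """The near-unanimity operation h_mu on {0,1}: the join, over i, of the
    meet of all mu+1 arguments except the i-th."""
    assert len(xs) == mu + 1
    return max(min(xs[:i] + xs[i + 1:]) for i in range(mu + 1))

# The identification used in the proof of Lemma 10.3.3:
# h_mu(x1, x2, x3, x1, ..., x1) = x1 /\ (x2 \/ x3).
def check(mu):
    return all(
        h(mu, [x1, x2, x3] + [x1] * (mu - 2)) == min(x1, max(x2, x3))
        for x1, x2, x3 in product((0, 1), repeat=3)
    )
```

The near-unanimity property itself is also visible: whenever all but one argument equal 1, the join contains the meet of the remaining all-1 entries.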

The least clone of the form cloneA = Pol ρ such that V (A) is 4-distributive
but not 2-distributive is the clone generated by hµ. It can be shown that
this clone is Pol{Rµ, ≤, {1}}, where Rµ is the relation {0, 1}µ \ {(1, . . . , 1)}
for µ ≥ 3. Dually, we have ⟨{h0µ}⟩ = Pol{Rµ0, ≤, {0}} for the relation Rµ0 =
{0, 1}µ \ {(0, . . . , 0)}. There are four clones of this kind, for µ ≥ 3:

1. cloneA = Pol{Rµ, ≤} (dually Pol{Rµ0, ≤}),
2. cloneA = Pol{Rµ, ≤, {1}} (dually Pol{Rµ0, ≤, {0}}),
3. cloneA = Pol{Rµ, {1}} (dually Pol{Rµ0, {0}}),
4. cloneA = Pol Rµ (dually Pol Rµ0).

In cases 3 and 4 the terms (x ∧ ¬y) ∨ z or dually (x ∨ ¬y) ∧ z are elements
of the clone, and thus V (A) is 3-distributive, while in cases 1 and 2 V (A)
is 4-distributive but not 3-distributive. These are exactly all cases for which
V (A) is congruence distributive but not 2-distributive and cloneA = Pol ρ
for some finitary ρ.

We turn now to our last case, where we have V (A) congruence distributive
but not 2-distributive, but there is no finitary relation ρ on {0, 1} such
that cloneA = Pol ρ. By Lemma 10.3.3 we know that hµ ∉ cloneA. The
least clone cloneA such that V (A) is 4-distributive and not 3-distributive is
the clone generated by x ∧ (y ∨ z) or x ∨ (y ∧ z), and the least clone such
that V (A) is 3-distributive and not 2-distributive is the clone generated by
(x ∧ ¬y) ∨ z or (x ∨ ¬y) ∧ z. It is easy to show moreover that the only other
clones with the first property are ⟨{x ∧ (y ∨ z), c10}⟩ and ⟨{x ∨ (y ∧ z), c11}⟩,
and the only other clones of the second type are ⟨{(x ∧ ¬y) ∨ z, c10}⟩ and
⟨{(x ∨ ¬y) ∧ z, c11}⟩.

Theorem 10.3.4 (i) Let A be a two-element algebra generating a 4-
distributive but not 3-distributive variety V (A), and assume that
cloneA = Pol ρ for some µ-ary relation ρ (with µ ≥ 3 minimal). Then
A is (up to equivalence and dual isomorphism) one of the algebras F6µ
= ({0, 1}; hµ) or F7µ = ({0, 1}; hµ, c20).

(ii) Let A be a two-element algebra generating a 4-distributive but not 3-
distributive variety V (A), and assume that there is no finitary relation
ρ for which cloneA = Pol ρ. Then A is (up to equivalence and dual
isomorphism) one of the algebras F6∞ = ({0, 1}; x ∧ (y ∨ z)) or F7∞ =
({0, 1}; x ∧ (y ∨ z), c20).

[Figure: the lattice of all Boolean clones (Post's lattice), ordered by inclusion, with the clones O1–O9, P1–P6, S1–S6, L1–L5, D1–D3, the families F1–F8 (with superscripts 2, 3, . . . , ∞), A1–A4 and C1–C4.]

(iii) Let A be a two-element algebra generating a 3-distributive but not 2-
distributive variety, and assume that cloneA = Pol ρ for some µ-ary
relation ρ (with µ ≥ 3 minimal). Then A is (up to equivalence and
dual isomorphism) one of the algebras F5µ = ({0, 1}; (x ∧ ¬y) ∨ z, hµ)
or F8µ = ({0, 1}; (x ∧ ¬y) ∨ z, hµ, c20) = ({0, 1}; hµ, g2).

(iv) Let A be a two-element algebra generating a 3-distributive but not 2-
distributive variety, and assume that there is no finitary relation ρ for
which cloneA = Pol ρ. Then A is (up to equivalence and dual iso-
morphism) one of the algebras F5∞ = ({0, 1}; (x ∧ ¬y) ∨ z) or F8∞ =
({0, 1}; (x ∧ ¬y) ∨ z, c20) = ({0, 1}; g2).

This completes our survey of all two-element algebras, since all our cases
have now been considered. The diagram shows the lattice of all clones of
Boolean operations.

10.4 The Functional Completeness Problem


The functional completeness problem is that of deciding, for a given subset
C of the set O(A) of all operations on a base set A, whether C generates
all of O(A). That is, we want to know when the clone generated by a subset
C is the whole clone O(A). Such a set C is said to be functionally com-
plete. To solve this problem, we apply the General Completeness Criterion
for finitely generated algebras, Theorem 1.3.16. Throughout this section, we
assume that the set A is finite, with | A |≥ 2.

In the case that A is a two-element set, we consider the clone of all Boolean
functions. We know already that in this case O(A) is generated by {∨, ¬}
and thus is finitely generated. Post’s results show that there are exactly 5
maximal subclones of O(A): these are C2 , C3 , A1 , D3 and L1 . Hence a set C
of Boolean functions is functionally complete iff for each of these five maxi-
mal subclones there exists a function f ∈ C which is not an element of this
maximal subclone.

To prove a similar functional completeness criterion for O(A) where A is a


finite set with |A| > 2, we have to prove that O(A) is finitely generated, and
then to determine all maximal subclones of O(A). For the first step we have
the following theorem, first proved by E. L. Post ([95]), which shows that on
a finite set A, the clone O(A) is indeed finitely generated.

Theorem 10.4.1 Let A be a finite set of cardinality ≥ 2, and let 0 and 1


be two different elements in A. Let + and · be two binary operations on A,
with the properties that 0 + x ≈ x ≈ x + 0 and x · 1 ≈ x, x · 0 ≈ 0. For each

a ∈ A, let χa : A → A be a unary operation with χa(x) = 1 if x = a, and χa(x) = 0 otherwise.

Then ⟨{+, ·, (χa)a∈A}⟩O(A) = O(A), and any operation f : An → A may
be expressed as

f(x1, . . . , xn) = Σ_{(a1,...,an)∈An} f(a1, . . . , an) · Π_{i≤n} χai(xi),

where Σ means iteration of + in a canonical way and Π_{i≤n} yi is defined
similarly by Π_{i≤1} yi := y1 and Π_{i≤n+1} yi := (Π_{i≤n} yi) · yn+1.

Proof: Fix an n-ary operation f : An → A and a tuple (x1, . . . , xn) ∈ An.
For each (a1, . . . , an) ∈ An, the product Π_{i≤n} χai(xi) equals 1 if
(a1, . . . , an) = (x1, . . . , xn), and otherwise contains a factor 0, so that by
x · 0 ≈ 0 it equals 0. Using x · 1 ≈ x and x · 0 ≈ 0, the summand
f(a1, . . . , an) · Π_{i≤n} χai(xi) is therefore f(x1, . . . , xn) for the one tuple
(a1, . . . , an) = (x1, . . . , xn) and 0 for all others, and by 0 + x ≈ x ≈ x + 0
the whole sum collapses to f(x1, . . . , xn). This shows that any f can be
expressed as claimed. Moreover, this representation requires only the
operations + and · and the functions χa for each a in A. Since A is finite,
we now have a finite generating system for O(A).

Note that in the Boolean case A = {0, 1}, we have χ0 (x) = ¬x and χ1 (x) =
x. If we take + to represent disjunction and · to represent conjunction, the
representation given in Theorem 10.4.1 is just the usual disjunctive normal
form.
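The representation of Theorem 10.4.1 can be exercised directly. In the sketch below, `plus` and `times` are our own illustrative choices satisfying the hypotheses 0 + x = x = x + 0, x · 1 = x and x · 0 = 0 on a three-element set; any such pair of operations would do:

```python
from itertools import product

K = 3
A = list(range(K))          # base set {0, 1, 2}

def plus(x, y):             # satisfies 0 + x = x = x + 0
    return y if x == 0 else x

def times(x, y):            # satisfies x * 1 = x and x * 0 = 0
    return x if y == 1 else (0 if y == 0 else y)

def chi(a, x):              # the indicator operation chi_a
    return 1 if x == a else 0

def represent(f, n):
    """Build the sum-of-products representation of Theorem 10.4.1 and
    return it as a new n-ary Python function."""
    def g(*xs):
        acc = 0
        for tup in product(A, repeat=n):
            term = f(*tup)
            for i in range(n):
                term = times(term, chi(tup[i], xs[i]))
            acc = plus(acc, term)
        return acc
    return g

f = lambda x, y: (x * y + 2) % K   # an arbitrary binary operation on A
g = represent(f, 2)
```

Since the product of indicators is 1 exactly once, the reconstructed `g` agrees with `f` on every input.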

Now that we know that O(A) is finitely generated when A is finite, we must
look for the maximal subclones of O(A). These are in fact known for every
finite A. The deep theorem which describes explicitly all maximal clones on
O(A) is due to I. G. Rosenberg ([103], [102]). The special cases where |A| is
2, 3 or 4 were solved earlier by E. L. Post ([95]), S.V. Jablonskij ([59]), and
A.I. Mal’cev (unpublished). New proofs of Rosenberg’s Theorem were also
given by R. W. Quackenbush ([97]) and D. Lau ([70]).

This description of the maximal subclones of O(A) gives us the following


completeness criterion.

Corollary 10.4.2 Let A be a finite set and let F be a subset of O(A). Then
⟨F⟩O(A) = O(A) if and only if F is not contained in one of the maximal
subclones of O(A).

Rosenberg’s Theorem classifying all maximal clones is too complex to be


proven here. Instead we merely describe his classification. Every maximal
clone on a finite set A is of the form P olρ for some h-ary relation ρ. So
the description of the maximal clones amounts to the determination of the
corresponding h-ary relations ρ ⊆ Ah . We have the following six classes of
relations.

(i) Let SA be the full symmetric group of all permutations on A. Let s ∈ SA
be a fixed-point-free permutation with r = |A|/p cycles, all of the same prime
length p. We set ρs = {(a, b) ∈ A2 | s(a) = b}. Then Pol ρs is a maximal clone.

(ii) Let ρ ⊆ A2 be a partial order with least and greatest elements 0 and 1,
respectively. Then Pol ρ is a maximal clone.

(iii) Let G = (A; +, −, 0) be an abelian group. A function f ∈ Ok(A) is
called quasilinear with respect to G if for all x1, . . . , xk and y1, . . . , yk ∈ A
we have f(x1, . . . , xk) + f(y1, . . . , yk) = f(x1 + y1, . . . , xk + yk) + f(0, . . . , 0).
The set of all quasilinear functions is a clone which can be characterized
as Pol χG for the relation χG := {(x, y, z, u) ∈ A4 : x + y = z + u}. This
clone is maximal iff G is p-elementary (meaning that px = 0 for all x ∈ A)
or, equivalently, if G is the additive group of an m-dimensional vector space
over the p-element field GF(p). Thus |A| = pm for some prime p and m ∈ IN.
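Quasilinearity is a finite condition and can be tested exhaustively. A hedged sketch over the 2-element group (Z2, +), with our own function names:

```python
from itertools import product

def quasilinear(f, k):
    """Test f(x) + f(y) = f(x + y) + f(0, ..., 0) over (Z_2, +) for all
    pairs of k-tuples x, y, working modulo 2 throughout."""
    zero = f(*([0] * k))
    return all(
        (f(*xs) + f(*ys)) % 2
        == (f(*[(x + y) % 2 for x, y in zip(xs, ys)]) + zero) % 2
        for xs in product((0, 1), repeat=k)
        for ys in product((0, 1), repeat=k)
    )

xor = lambda x, y: (x + y) % 2       # linear, hence quasilinear
nxor = lambda x, y: (x + y + 1) % 2  # an affine shift of xor, still quasilinear
meet = lambda x, y: x & y            # not quasilinear
```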

(iv) Let |A| ≥ 3. Let ϑ be a non-trivial equivalence relation on A, distinct
from A2 and the diagonal relation ∆A. Then PolA ϑ is a maximal clone. For
B a proper subset of A with at least two elements, let ϑB denote the (non-
trivial) equivalence relation having B as unique non-singleton block (that is,
the blocks of ϑB are B and all {c} with c ∈ A \ B). Later we will distinguish
the two cases ϑ = ϑB and ϑ ≠ ϑB.

(v) A relation % ⊆ Ah is called totally reflexive if % contains each h-tuple
(a1, . . . , ah) ∈ Ah with a repetition of coordinates (that is, with ai = aj
for some 1 ≤ i < j ≤ h). A relation % is called totally symmetric if
(a1, . . . , ah) ∈ % ⇔ (aπ(1), . . . , aπ(h)) ∈ % for every permutation π on the
set {1, 2, . . . , h}. The center C(%) of % is the set of all elements c ∈ A with
the property that (c, a2, . . . , ah) ∈ % for all a2, . . . , ah ∈ A. A totally reflexive
and totally symmetric relation % with a non-trivial (not all of A) center is
called central. PolA % is a maximal clone for every central relation %. Later
we distinguish two cases depending on whether % is unary or not.

(vi) Let |A| = n. For each 3 ≤ t ≤ n, let

Et := {1, 2, . . . , t},
ιt := {(c1, . . . , ct) ∈ Et^t | ci = cj for some 1 ≤ i < j ≤ t}, and
ιt⊗m := ιt ⊗ . . . ⊗ ιt = {((c11, . . . , c1m), . . . , (ct1, . . . , ctm)) ∈ (Et^m)^t | (c1i, . . . , cti) ∈ ιt for i = 1, . . . , m}.

For t ≥ 3, a t-ary relation % ⊂ At is called t-universal if there are an m ≥ 1
and a surjective mapping µ : A −→ Et^m such that

% = %(µ) := {(a1, . . . , at) ∈ At | (µ(a1), . . . , µ(at)) ∈ ιt⊗m}.

There is only one non-trivial n-universal relation (t = |A| = n, m = 1),


namely

ιn (A) = {(a1 , . . . , an ) ∈ An | ai = aj for some 1 ≤ i < j ≤ n}.

The well-known Slupecki criterion says that f ∈ Pol ιn iff f is not surjective
or f depends essentially on at most one variable ([109]). Pol % is a maximal
clone for every t-universal relation %.
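Both sides of the Slupecki criterion can be computed by brute force on a three-element set. In the sketch below (names are ours), `preserves_iota` tests membership in Pol ι3 directly, while `slupecki` evaluates the right-hand side of the criterion:

```python
from itertools import product

A = (0, 1, 2)
# iota_3: triples over A with a repeated coordinate
IOTA = [t for t in product(A, repeat=3) if len(set(t)) < 3]

def preserves_iota(f, k):
    """Direct check that f is in Pol iota_3: applied coordinatewise to k
    columns from iota_3 it must again return a triple in iota_3."""
    for cols in product(IOTA, repeat=k):
        out = tuple(f(*(c[r] for c in cols)) for r in range(3))
        if len(set(out)) == 3:
            return False
    return True

def slupecki(f, k):
    """Right-hand side of the Slupecki criterion: f is not surjective, or
    depends essentially on at most one variable."""
    tuples = list(product(A, repeat=k))
    surj = {f(*xs) for xs in tuples} == set(A)
    essential = [
        i for i in range(k)
        if any(f(*xs) != f(*(xs[:i] + (v,) + xs[i + 1:]))
               for xs in tuples for v in A)
    ]
    return (not surj) or len(essential) <= 1

maxi = lambda x, y: max(x, y)   # surjective with two essential variables
proj = lambda x, y: x           # essentially unary
```

The two tests agree on these examples, as the criterion asserts they must in general.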

10.5 Primal Algebras


In this section we describe algebraic properties and characterizations of pri-
mal and functionally complete algebras. We recall from Section 10.3 the
notation cloneA for the clone of term operations of an algebra A. From
Theorem 5.2.3, this clone is generated by the set {fiA | i ∈ I} of fundamen-
tal operations of the algebra A. Throughout this chapter we will denote this
set by F A , so that cloneA = ⟨F A ⟩. We also have the clone of all polynomial
operations on A, denoted by P (A) or P cloneA and generated by the set of
fundamental operations of A plus the constant functions ca for each a ∈ A.

Definition 10.5.1 A finite algebra A is called primal if every operation


f ∈ O(A) is a term operation of A, so that cloneA = O(A). A finite algebra
is called functionally complete if P cloneA = O(A).

Primality of A thus means that for every n ≥ 1 and every operation f :


An → A, there exists a term operation tA of A such that f = tA ; so for all
n-tuples (a1 , . . . , an ) ∈ An the equation
f (a1 , . . . , an ) = tA (a1 , . . . , an )
is satisfied. Functional completeness of an algebra A means that for every
f ∈ O(A) there exists a polynomial operation pA on A with f = pA . This
property is a generalization of the interpolation property in ring theory,
where an arbitrary unary operation is interpolable by a polynomial over a
ring.

It follows from the definition that any primal algebra is functionally com-
plete. Another connection between these two properties is based on the fol-
lowing construction. For any finite algebra A, we can form a new algebra A+
from A by adding as fundamental operations every constant operation ca , for
a ∈ A. (The finiteness of A means that we still have only finitely many fun-
damental operations in this algebra.) That is, A+ = (A; F A ∪ {ca | a ∈ A}).
Then the following result is easily verified, and its proof is left as an exercise
for the reader.

Lemma 10.5.2 A finite algebra A is functionally complete if and only if


A+ is primal.

The functional completeness criterion from Section 10.4 can be used to obtain
examples of primal algebras. It is very easy to see in this way that the two-
element Boolean algebra is primal. We list here some more examples of
primal and functionally complete algebras; we leave the verification, using
Theorem 10.4.1 and Corollary 10.4.2, to the reader.

Example 10.5.3 1. For k ≥ 2, let Ak = ({0, . . . , k − 1}; min, g) be the type


(2, 1) algebra with min the minimum with respect to the usual order of the
natural numbers 0, 1, . . . , k − 1, and g defined by g(x) = x + 1 if x ≠ k − 1, and g(x) = 0 if x = k − 1.

E. L. Post showed in [94] that Ak is primal. (Note that for k = 2, this is the
algebra ({0, 1}; ∧, ¬).)
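For k = 2 the operations min and g are exactly ∧ and ¬, and the primality of A2 can be glimpsed by brute force. The sketch below (an illustration only, not the full primality proof, which also needs higher arities) closes the two binary projections under pointwise min and g and checks that all 16 binary Boolean operations arise as term operations:

```python
from itertools import product

# For k = 2, Post's algebra A_2 = ({0,1}; min, g) is ({0,1}; AND, NOT).
DOM = list(product((0, 1), repeat=2))      # the four inputs of a binary op

ops = {tuple(x for x, y in DOM),           # projection on the first variable
       tuple(y for x, y in DOM)}           # projection on the second variable
changed = True
while changed:                             # close under the two operations
    changed = False
    new = set()
    for t in ops:
        new.add(tuple(1 - v for v in t))                     # apply g = NOT
    for s in ops:
        for t in ops:
            new.add(tuple(min(a, b) for a, b in zip(s, t)))  # apply min = AND
    if not new <= ops:
        ops |= new
        changed = True
```

The closure stabilizes at the 16 truth tables of all binary Boolean operations.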

2. For k ≥ 2, let Bk = ({0, . . . , k − 1}; /) be the type (2) algebra with


x/y = 0 if x = y = k − 1, and x/y = min(x, y) + 1 otherwise.

The algebras Bk were shown to be primal by D. L. Webb in [117].

3. Let A = (A; ·, g) be an algebra of type (2, 1) in which there is an element


0 ∈ A such that

(i) (A \ {0}; ·) is a group,

(ii) a · 0 = 0 · a = 0 for all a ∈ A, and

(iii) g is a cyclic permutation on A.

This algebra was shown to be primal by A. L. Foster in [45].


Using the General Completeness Criterion of Theorem 1.3.16, Corollary
10.4.2 and the list from Section 10.4 of all relations determining maximal
clones, we get the following result.

Theorem 10.5.4 A finite non-trivial algebra A = (A; F A ) is primal if and


only if F A is not a subset of Pol % for any of the relations % from the list of
all relations determining maximal clones.

Another important example of primal algebras is the k-element Post algebra


of order k. These are defined, for k ≥ 2, by

Ak = ({0, . . . , k − 1}; ∪, ∩, C, D1 , . . . , Dk−1 , 0, . . . , k − 1),

with i ∪ j := max(i, j) and i ∩ j := min(i, j) for i, j ∈ {0, . . . , k − 1}, and


Di(j) := k − 1 if i ≤ j, and Di(j) := 0 if i > j;
C(i) := k − 1 if i = 0, and C(i) := 0 if i > 0.
These algebras play the same role for k-valued propositional calculi with k >
2 as the two-element Boolean algebra plays for the two-element propositional
calculus: the term operations of the Post algebras of order k are precisely the

truth-value functions of the k-valued propositional calculus. Traczyk ([116])


investigated the variety generated by the Post algebra of order k, and gave
a description of this variety by axioms.

Another useful example of primal algebras is given in the next theorem.

Theorem 10.5.5 For every prime number p, the prime field modulo p is
primal.
Proof: The prime field modulo p is up to isomorphism the algebra Zp =
(Zp ; +, −, ·, 0, e) of type τ = (2, 1, 2, 0, 0) where Zp is the set of the residue
classes modulo p. We can use the fundamental operations of Zp to construct,
for each k ∈ Zp , the unary operation χk defined by χk (x) := e − (x − k)p−1 .
These operations satisfy the following property:
χk(x) = e if x = k, and χk(x) = 0 otherwise.

This is because χk(k) = e − (k − k)p−1 = e, while for all x ≠ k, we have
(x − k)p−1 = e (the multiplicative group of Zp has order p − 1), and so
χk(x) = e − e = 0 for x ≠ k. Now by Theorem 10.4.1 the fundamental
operations together with these operations χk for each k ∈ Zp generate all of
O(Zp), making Zp primal.
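The indicator property of the operations χk is easy to verify numerically; a small sketch for p = 5, where Fermat's little theorem does the work:

```python
p, e = 5, 1   # a prime modulus and the multiplicative identity of Z_5

def chi(k, x):
    """chi_k(x) = e - (x - k)^(p-1) mod p, as in the proof of Theorem 10.5.5."""
    return (e - pow(x - k, p - 1, p)) % p
```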

Note that the Baker-Pixley Theorem, Theorem 9.2.6, also gives a character-
ization of primal algebras. It can be formulated as follows:

Theorem 10.5.6 A finite algebra A is primal iff there is a majority term


which induces a term operation on A and A2 has only itself and the diagonal
∆A as subalgebras.

Primal and functionally complete algebras have the following properties:

Proposition 10.5.7
(i) Let A be a primal algebra. Then
1. A has no proper subalgebras,
2. A has no non-identical automorphisms,
3. A is simple, and
4. A generates an arithmetical variety.
(ii) Every functionally complete algebra is simple.

Proof: (i) Since every operation defined on the set A is a term operation of
the algebra A, for each proper subset B ⊂ A, each non-identical permutation
ϕ on A and each non-trivial equivalence relation θ ⊆ A2 there exist term
operations tA1, tA2, tA3 of A such that

tA1(b1, . . . , bn) ∉ B for some elements b1, . . . , bn ∈ B,
ϕ(tA2(a1, . . . , an)) ≠ tA2(ϕ(a1), . . . , ϕ(an)) for some a1, . . . , an ∈ A, and
(tA3(a1, . . . , an), tA3(b1, . . . , bn)) ∉ θ for some pairs (ai, bi) ∈ θ, i = 1, . . . , n.
The variety V (A) is then arithmetical since there is a term q satisfying the
identities of Theorem 9.3.2 (iii) which induces a term operation on A.

(ii) Let A be functionally complete. Then by Lemma 10.5.2 A+ is primal and


therefore (by part (i)) simple. But then A is also simple, since the constant
operations ca , a ∈ A preserve all equivalence relations on A.

It turns out that the four properties of primal algebras given in Proposition
10.5.7 are sufficient to characterize primal algebras. An additional character-
ization was given by H. Werner in [118] (see also A. F. Pixley, [87]), using
the so-called ternary discriminator term:
t(x, y, z) = z if x = y, and t(x, y, z) = x otherwise.
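The discriminator, and the Pixley-style identities it satisfies that are used in the proof below, can be checked over any finite set; a minimal sketch (names ours):

```python
from itertools import product

def t(x, y, z):
    """The ternary discriminator."""
    return z if x == y else x

A = range(4)   # any finite base set will do

# The Pixley identities satisfied by the discriminator:
# t(x, y, y) = t(x, y, x) = t(y, y, x) = x.
ok = all(t(x, y, y) == t(x, y, x) == t(y, y, x) == x
         for x, y in product(A, repeat=2))
```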

Theorem 10.5.8 For a finite algebra A the following propositions are equiv-
alent:

(i) A is primal.

(ii) A generates an arithmetical variety, has no non-identical automor-


phisms, has no proper subalgebras and is simple.

(iii) There is a ternary discriminator term which induces a term operation


on A, and the algebra A has no proper subalgebras and no non-identical
automorphisms.

Proof: (i) ⇒ (iii): This is clear since when A is primal every operation, in-
cluding the ternary discriminator, induces a term operation on A. The other
two properties were proved in the previous theorem.

(iii) ⇒ (ii). The ternary discriminator term satisfies the identities

t(x, y, y) ≈ t(x, y, x) ≈ t(y, y, x) ≈ x,



making the variety V (A) arithmetical by Theorem 9.3.2. To show that A


is simple, let θ ≠ ∆A be a congruence relation of A. Then there are
elements a, b ∈ A, a ≠ b, with (a, b) ∈ θ. But for every c ∈ A we have
(t(a, b, c), t(a, a, c)) = (a, c) ∈ θ, and therefore θ = A2 .

(ii) ⇒ (i): By Theorem 9.3.2 there is a majority term operation in A; so


by Theorem 10.5.6 it will suffice to show that A2 has only itself and ∆A as
subalgebras. Suppose that B is a proper subalgebra of A2 ; we will show in
several steps that B must equal ∆A .

Step 1. We will show that there are subalgebras B1 and B2 of A, congruence


relations θ1 ∈ ConB1 and θ2 ∈ ConB2 and an isomorphism ϕ : B1 /θ1 →
B2 /θ2 such that the universe of B can be written as
B = ∪{X × ϕ(X) | X ∈ B1/θ1}.

To show this, we also need several steps.

Step 1a). We show first that B has the following property, which is called
rectangularity:

If (a, d), (b, d), (b, c) ∈ B then (a, c) ∈ B.

Let pj : B → A, for j = 1, 2 be the projection homomorphisms. Then we


have
((a, d), (b, d)) ∈ kerp2 and ((b, d), (b, c)) ∈ kerp1 ,
and thus ((a, d), (b, c)) ∈ kerp1 ◦ kerp2 . Since V (A) is an arithmetical
and therefore congruence permutable variety, this composition is equal
to kerp2 ◦ kerp1 . But this means there exists a pair (y, z) ∈ B with
((a, d), (y, z)) ∈ kerp1 and ((y, z), (b, c)) ∈ kerp2. From this we have a = y
and z = c, giving (a, c) ∈ B.

Step 1b). Using the projection homomorphisms we define our algebras and
congruences:

B1 := p1 (B) and B2 := p2 (B),


θ1 := {(a, b) ∈ A2 | ∃d ∈ A((a, d), (b, d)) ∈ B},
θ2 := {(d, e) ∈ A2 | ∃a ∈ A((a, d), (a, e)) ∈ B}.

Now we must show that θ1 and θ2 are congruences on B1 and B2 respec-


tively. By definition θ1 ⊆ B12 and θ1 is both symmetric and reflexive. For
transitivity, suppose that (a, b), (b, c) ∈ θ1 . Then there are elements d and
e in A with (a, d), (b, d), (b, e) and (c, e) all in B. From the rectangularity
of Step 1a) it follows that (a, e) ∈ B, and from this we get (a, c) ∈ θ1 . This
shows that θ1 is an equivalence relation on B1 . For the congruence property,
let f be an n-ary operation symbol of the language of A and let (a1 , b1 ), . . .,
(an , bn ) be in θ1 . Then there exist elements d1 , . . . , dn ∈ A such that (a1 , d1 ),
(b1 , d1 ), . . ., (an , dn ) and (bn , dn ) are all in B. Since B is a subalgebra of A2 ,
we have
fA2((a1, d1), . . . , (an, dn)) = (fB1(a1, . . . , an), fB2(d1, . . . , dn)) ∈ B and
fA2((b1, d1), . . . , (bn, dn)) = (fB1(b1, . . . , bn), fB2(d1, . . . , dn)) ∈ B,
and therefore (f B1 (a1 , . . . , an ), f B2 (b1 , . . . , bn )) ∈ θ1 . Therefore θ1 is a con-
gruence relation on B1 . In the same way, θ2 is a congruence relation on B2 .

Step 1c). Now we define a mapping ϕ : B1 /θ1 → B2 /θ2 , by [a]θ1 7→ [d]θ2 for
all (a, d) ∈ B. We will show in this step that ϕ is an isomorphism.

First we check that ϕ is well defined. If [a]θ1 = [a0 ]θ1 , and (a, d) is in B, then
(a, a0 ) ∈ θ1 and so (a, e) and (a0 , e) ∈ B for some e ∈ A. Then by definition
of θ2 , having both (a, d) and (a, e) in B means that (d, e) ∈ θ2 . From this we
have ϕ([a]θ1 ) = [d]θ2 = [e]θ2 = ϕ([a0 ]θ1 ).
To see that the mapping ϕ is injective, suppose that [d]θ2 = ϕ([a]θ1 ) =
ϕ([a0 ]θ1 ) = [e]θ2 . Then (a, d) and (a0 , e) are in B, and (d, e) ∈ θ2 means that
(c, d) and (c, e) are in B for some c ∈ A. Then (a, c) and (a0 , c) are both in θ1 .
By symmetry and transitivity of θ1 we get (a, a0 ) ∈ θ1 , and so [a]θ1 = [a0 ]θ1 .

The mapping ϕ is also surjective, since for every d ∈ B2 there exists an a ∈ A


with (a, d) ∈ B. Finally, we show that ϕ is a homomorphism. Let f be an
n-ary operation symbol and assume that ϕ([ai ]θ1 ) = [di ]θ2 , so (ai , di ) ∈ B,
for i = 1, . . . , n. Since B is a subalgebra of A2 ,

ϕ([f B1 (a1 , . . . , an )]θ1 ) = [f B2 (d1 , . . . , dn )]θ2

and
f B2 (ϕ([a1 ]θ1 ), . . . , ϕ([an ]θ1 )) = f B2 ([d1 ]θ2 , . . . , [dn ]θ2 ).
Step 1d). Now we show that B = ∪{X × ϕ(X) | X ∈ B1/θ1}. First,
let (a, d) ∈ B, so that ϕ([a]θ1) = [d]θ2. Taking X := [a]θ1, we have
(a, d) ∈ X × ϕ(X). This shows B ⊆ ∪{X × ϕ(X) | X ∈ B1/θ1}. For the
opposite inclusion, assume now that (a, d) ∈ X × ϕ(X) for some X ∈ B1/θ1, say
X = [b]θ1 and ϕ(X) = [e]θ2 with (b, e) ∈ B. Then (a, d) ∈ [b]θ1 × [e]θ2,
so that (a, b) ∈ θ1 and (d, e) ∈ θ2. The definitions of θ1 and of θ2 re-
spectively give us elements d0 and a0 ∈ A for which (a, d0), (b, d0), (a0, d)
and (a0, e) are in B, and by the rectangularity property from Step 1a)
we have (b, d) ∈ B and then (a, d) ∈ B. This shows
∪{X × ϕ(X) | X ∈ B1/θ1} ⊆ B. Altogether we have equality.

Step 2. Next we show that every non-empty proper subalgebra B of A2 has
a universe of the form {(a, ϕ(a)) | a ∈ A}, where ϕ is an automorphism of
A.

Combining the result of Step 1 with our assumption that A has no proper
subalgebras, we see that B1 = B2 = A. This means that θ1 and θ2 are ac-
tually congruences on A, and since A is simple we see that θ1 and θ2 must
each be one of ∆A or A2. If θ1 = A2, then the isomorphism ϕ : A/θ1 → A/θ2
shows that θ2 = A2 also, and in this case we would have B = A2, contradicting
the assumption that B is a proper subalgebra. So θ1 = ∆A, and then the
isomorphism ϕ shows that θ2 must also equal ∆A. In this case ϕ is an
automorphism of A, and B has the form ∪{{a} × ϕ({a}) | a ∈ A},
so B = {(a, ϕ(a)) | a ∈ A}.

Step 3. Now we apply the fact that A has no non-identical automorphism
to the result of Step 2, to conclude that ∆A is the only proper non-empty
subalgebra of A2. As we remarked above, this is enough to complete our proof.

We point out that for the proof of the claim of Step 1 we used only con-
gruence permutability. We will make use of this fact later, in the proof of
Theorem 10.6.5.

For functionally complete algebras we have the following simpler character-


ization, based on the equivalent conditions of Theorem 10.5.8.

Corollary 10.5.9 A finite algebra A is functionally complete iff the ternary


discriminator t, with
t(x, y, z) = z if x = y, and t(x, y, z) = x otherwise,

induces a polynomial operation on A.



Proof: First, we note that if A is functionally complete then every operation


defined on the set A, including the ternary discriminator, is a polynomial
operation of A. For the converse we use Lemma 10.5.2, which tells us that
it suffices to prove that the related algebra A+ is primal. We show the
primality of this algebra by verifying that the conditions of Theorem 10.5.8
hold for it. The algebra A+ has no proper subalgebras, since all elements
of A are nullary term operations and must be included in any subalgebra.
For the same reason A+ admits only the identical automorphism. Since the
ternary discriminator term t is a polynomial operation of A it is a term
operation of A+ . The algebra A is then simple: if θ is a congruence on A
which contains a pair (a, b) with a ≠ b, using (a, b), (a, a) and (c, c) gives
(c, b) = (tA (a, a, c), tA (b, a, c)) ∈ θ for any c ∈ A, so that θ = A2 . But then
the algebra A+ is also simple. By Theorem 10.5.8, A+ is primal and so A is
functionally complete.

Remark 10.5.10 We leave it as an exercise to show that another proof of


Corollary 10.5.9 may be obtained by using Theorem 10.4.1, with the follow-
ing polynomials:

x + y := t(x, 0, y),
x · y := t(y, 1, x),
χ0(x) := t(0, x, 1),
χa(x) := t(0, t(a, x, 0), 1), for a ∈ A \ {0}.
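Discriminator-built operations of this kind can be checked directly against the requirements of Theorem 10.4.1. In the sketch below the χ definitions are one workable choice built from t (our reconstruction for this exercise, not necessarily the book's intended formulas):

```python
A = range(4)            # a finite set containing the two elements 0 and 1

def t(x, y, z):         # the ternary discriminator
    return z if x == y else x

plus = lambda x, y: t(x, 0, y)     # satisfies 0 + x = x = x + 0
times = lambda x, y: t(y, 1, x)    # satisfies x * 1 = x and x * 0 = 0
# chi_a(x): 1 if x == a, else 0, built only from t and the constants 0, 1
chi = lambda a, x: t(0, x, 1) if a == 0 else t(0, t(a, x, 0), 1)
```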

R. McKenzie proved in [78] a criterion for functional completeness of an


algebra A such that V (A) is congruence permutable, using the following
concept:

Definition 10.5.11 An algebra A = (A; F A ) is said to be affine with respect


to an abelian group if there is an abelian group (A; +, 0) such that for every
n ≥ 1 and every n-ary term operation tA of A, and for every pair of n-tuples
((a1 , . . . , an ), (b1 , . . . , bn )) of elements from A, we have

tA (a1 , . . . , an ) + tA (b1 , . . . , bn ) − tA (0, . . . , 0) = tA (a1 + b1 , . . . , an + bn ).

Theorem 10.5.12 Let A be a finite non-trivial algebra which generates a


congruence permutable variety. Then A is functionally complete iff A is sim-
ple but is not affine with respect to any elementary abelian p-group.

This theorem can be proved very easily using Rosenberg’s description of all
maximal classes of operations defined on a finite set by relations, and we
leave it as an exercise for the reader.

There is an especially simple characterization of primal algebras with only


one at least binary fundamental operation given by G. Rousseau in [104].

Theorem 10.5.13 ([104]) A finite non-trivial algebra A = (A; f A ) with


one single fundamental operation f A which is at least binary is primal if
and only if A is simple, has no proper subalgebra and has no non-identical
automorphisms.

We can look for properties of a variety V (A) generated by a primal algebra


A. In particular, we want to know what the subdirectly irreducible algebras
in V (A) are, and how V (A) is located in the lattice of all varieties of the type
of A. Using our results on primal algebras, we have the following answers.

Corollary 10.5.14 If A is primal, then V (A) has no non-trivial subvari-


eties and A is the only subdirectly irreducible algebra in V (A).

10.6 Different Generalizations of Primality


A finite algebra A is primal when every operation which is definable on its
universe set A is a term operation of the algebra A. In this section we consider
several variations and generalizations of primality. All have the same basic
definition: instead of requiring that all operations are term operations, we
require only that certain operations, having some common property, are all
term operations. The concept of functional completeness can be similarly
weakened.

Definition 10.6.1 A finite non-trivial algebra A = (A; F A ) is called


semiprimal if every operation on A which preserves all subalgebras of A
is a term operation of A.

We recall that an n-ary operation f : An → A preserves a subalgebra B ⊆ A


of A if f (b1 , . . . , bn ) ∈ B for all b1 , . . . , bn ∈ B. If A is semiprimal and has
precisely one proper subalgebra it is called subprimal. A further distinction
is made based on the cardinality of the subalgebra of A: the algebra A is

called regular subprimal if the cardinality of this subalgebra is greater than


1 and singular subprimal otherwise.
Operations on an algebra A can also preserve, or be compatible with, ho-
momorphisms and congruence relations of A. We say that f : An → A
is compatible with an endomorphism (or isomorphism) ϕ : A → B
if ϕ(f (a1 , . . . , an )) = f (ϕ(a1 ), . . . , ϕ(an )) for all a1 , . . . , an ∈ A. As a
consequence of the General Homomorphism Theorem, we know that op-
erations f which preserve all homomorphisms of A have the property
that for all θ ∈ ConA and for all (a1 , b1 ), . . . , (an , bn ) ∈ θ, the pair
(f (a1 , . . . , an ), f (b1 , . . . , bn )) is in θ.
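Both preservation notions can be tested by exhaustive search on small algebras. The sketch below (function names and the three-element example are ours) illustrates this for the binary operation min on the chain 0 ≤ 1 ≤ 2.

```python
from itertools import product

# Exhaustive checks of the two preservation notions defined above,
# tried out on min over the chain 0 <= 1 <= 2.

def preserves_subalgebra(f, n, B):
    # f(b1,...,bn) must lie in B whenever all bi lie in B
    return all(f(*args) in B for args in product(B, repeat=n))

def compatible_with_congruence(f, n, theta):
    # theta is a set of pairs; applying f coordinatewise to
    # theta-related tuples must again yield a theta-related pair
    for pairs in product(theta, repeat=n):
        a = tuple(p[0] for p in pairs)
        b = tuple(p[1] for p in pairs)
        if (f(*a), f(*b)) not in theta:
            return False
    return True

delta = {(x, x) for x in (0, 1, 2)}
theta1 = delta | {(1, 2), (2, 1)}   # blocks {0}, {1,2}
theta2 = delta | {(0, 2), (2, 0)}   # blocks {0,2}, {1}

print(preserves_subalgebra(min, 2, {0, 1}))        # True
print(compatible_with_congruence(min, 2, theta1))  # True
print(compatible_with_congruence(min, 2, theta2))  # False
```

The last call fails because min(0, 1) = 0 and min(2, 1) = 1 land in different blocks of the partition {0, 2}, {1}, even though 0 and 2 are related.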

Definition 10.6.2 A finite non-trivial algebra A = (A; F A ) is called


demiprimal if A has no proper subalgebras and every operation from O(A)
which preserves all automorphisms of A is a term operation of A.

Definition 10.6.3 A finite non-trivial algebra A = (A; F A ) is called


quasiprimal if every operation on A which preserves all subalgebras and
all isomorphisms between non-trivial subalgebras of A is a term operation
of A.

Definition 10.6.4 A finite non-trivial algebra A = (A; F A ) is called


hemiprimal if every operation on A which preserves all congruence relations
on A is a term operation of A. The algebra A is called affine complete if
every such operation is a polynomial operation of A.

Clearly, demiprimal and semiprimal algebras are examples of quasiprimal


algebras. All of these types of algebras can be determined using relations, in
the following sense. In each case there is a set R of relations on A such that
cloneA is precisely the clone PolR.

There are two other kinds of algebras which are defined in a different way. A
finite non-trivial algebra is called cryptoprimal if A is simple, has no proper
subalgebra and generates a congruence distributive variety. Also, A is called
paraprimal if each subalgebra of A is simple and A generates a congruence
permutable variety.

We now present some characterizations of these various kinds of algebras,


beginning with quasiprimal algebras.

Theorem 10.6.5 For a finite algebra A = (A; F A ) the following conditions


are equivalent:

(i) A is quasiprimal.

(ii) The ternary discriminator operation induces a term operation of A.

(iii) Each subalgebra of A is simple and there is a term q in the language


of A such that q(y, y, x) ≈ q(x, y, y) ≈ q(x, y, x) ≈ x are identities in
A.

Proof. (i) ⇒ (ii): Since the ternary discriminator t of A is an operation


which preserves all subalgebras of A and all isomorphisms between non-
trivial subalgebras of A, it is a term operation of A when A is quasiprimal.

(ii) ⇒ (iii): Let θ ≠ ∆B be a non-trivial congruence relation of a non-trivial
subalgebra B of A. Then there is a pair (a, b) in θ with a ≠ b. For any element
c ∈ B, the ternary discriminator properties give (t(a, b, c), t(a, a, c)) ∈ θ.
Hence (a, c) ∈ θ for every c in B, making θ = B 2 . The ternary discriminator
term also satisfies the identities t(x, x, z) ≈ t(z, x, x) ≈ t(z, x, z) ≈ z.

(iii) ⇒ (i): By Theorem 9.3.2 the variety V (A) is arithmetical and there is a
majority term operation in A. Therefore we can apply Theorem 9.2.6. The
variety V (A) is congruence permutable since it is arithmetical, and the claim
from Step 1 in the proof of Theorem 10.5.8 is satisfied. Thus for every
subalgebra B of A × A there are subalgebras B1 and B2 of A, congruence
relations θ1 ∈ ConB1 and θ2 ∈ ConB2 and an isomorphism ϕ : B1 /θ1 → B2 /θ2
such that the universe of B can be written as B = ⋃{X × ϕ(X) | X ∈ B1 /θ1 }.
Since by assumption each subalgebra of A is simple, we have θi ∈ {∆Bi , Bi2 }
for i = 1, 2. But this means we have either isomorphisms between the
subalgebras Bi = Bi /∆Bi , or isomorphisms between trivial (one-element)
algebras Bi /Bi2 . Theorem 9.2.6 then shows that every operation on A which
preserves all subalgebras and all isomorphisms between non-trivial subalgebras
of A is a term operation of A, and thus A is quasiprimal.
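The discriminator identities used in the proof are easy to confirm by brute force once the ternary discriminator is written down; a minimal sketch (names ours):

```python
# The ternary discriminator t(x,y,z) = z if x = y, else x, together with
# a brute-force check of the identities
# t(x, x, z) = t(z, x, x) = t(z, x, z) = z used in the proof above.
def t(x, y, z):
    return z if x == y else x

A = range(4)   # any finite set will do
assert all(t(x, x, z) == t(z, x, x) == t(z, x, z) == z
           for x in A for z in A)
print("discriminator identities verified on a 4-element set")
```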

Semiprimal algebras can be characterized as follows:

Theorem 10.6.6 A finite non-trivial algebra A = (A; F A ) is semiprimal iff


the following conditions are all satisfied:

(i) V (A) is arithmetical;



(ii) Every non-trivial subalgebra of A is simple;

(iii) Every subalgebra of A has no non-identical automorphisms; and

(iv) No two distinct subalgebras with more than one element are isomor-
phic.

Proof. First suppose that A is semiprimal. Then A is also quasiprimal and


Theorem 10.6.5 can be applied. Consequently, by Theorem 9.3.2 the vari-
ety V (A) is arithmetical and every non-trivial subalgebra of A is simple.
From the definition of semiprimal algebras it follows that a non-trivial sub-
algebra of a semiprimal algebra is semiprimal as well. Suppose that ϕ is a
non-identical automorphism of A, so that there are distinct elements a and
b of A with ϕ(a) = b ≠ a. Since ϕ is injective, this also means that ϕ(b) =
b′ ≠ b. Consider the following operation f defined on A: we set f (x, y) = x
if x ≠ a or y ≠ b, and f (a, b) = b. This operation preserves all subalgebras
of A and therefore it is a term operation of A. Then f (ϕ(a), ϕ(b)) = f (b, b′ )
= b while ϕ(f (a, b)) = ϕ(b) = b′ ; since b ≠ b′ , this contradicts the
compatibility of the term operation f with the automorphism ϕ. This shows
that A cannot have non-identical automorphisms. In the same way we can
show that every non-trivial subalgebra of A has no non-identical
automorphisms.

To show (iv), suppose that S1 and S2 are two distinct isomorphic subalgebras of A, each with more than one element.


Since A is finite, no proper subalgebra of S2 can be isomorphic to S2 , and
we may suppose that the set S1 \ S2 is non-empty. Let a ∈ S1 \ S2 , and let
b ∈ S2 be the image of a under the isomorphism ϕ. Furthermore, let c ≠ a
be any other element of S1 and let ϕ(c) = d ∈ S2 . By injectivity, d ≠ b.
Obviously, the operation f defined by f (x, y) = x if x = a and f (x, y) = y
otherwise is a term operation of A; but we have f (ϕ(a), ϕ(c)) = f (b, d) = d
yet ϕ(f (a, c)) = ϕ(a) = b, which is a contradiction. This completes the first
direction of the proof.

Now we will prove that from conditions (i) - (iv) the semiprimality of A
can be deduced. By Theorem 10.6.5 the algebra A is quasiprimal. There are
no isomorphisms between non-trivial subalgebras of A, and therefore each
operation preserving all subalgebras of A is a term operation of A. Conse-
quently, A is indeed semiprimal.
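Conditions such as (iii) can be checked mechanically for small algebras by enumerating all permutations of the universe. The sketch below (example and names ours) confirms that the three-element lattice ({0, 1, 2}; max, min) admits only the identity automorphism.

```python
from itertools import permutations

# Enumerate the automorphisms of the three-element lattice
# ({0,1,2}; max, min): only the identity permutation survives,
# since any other bijection of the chain reverses some order.
A = [0, 1, 2]
ops = [max, min]

def is_automorphism(s):
    return all(s[f(x, y)] == f(s[x], s[y])
               for f in ops for x in A for y in A)

autos = [p for p in permutations(A) if is_automorphism(dict(zip(A, p)))]
print(autos)   # [(0, 1, 2)] : only the identity
```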

Clearly, a semiprimal algebra with no proper subalgebra is primal. A semipri-


mal algebra A which is not primal has at least one minimal subalgebra, that
is, a subalgebra which itself has no proper subalgebras. It can also be shown

that each minimal subalgebra of a semiprimal algebra is either primal or is


a one-element subalgebra. If A has more than one minimal subalgebra, any
two of them are disjoint.

The next theorem characterizes demiprimal algebras.

Theorem 10.6.7 A finite non-trivial algebra is demiprimal iff the following


conditions are satisfied:

(i) V (A) is arithmetical;

(ii) A has no proper subalgebra;

(iii) A is simple; and

(iv) Every non-identical automorphism of A has no fixed points.

Proof. If A is demiprimal then it is also quasiprimal. Therefore, by Theo-


rem 10.6.5 A is simple and V (A) is arithmetical. Clearly, the set of all fixed
points of an automorphism forms a subalgebra of A. But a demiprimal al-
gebra has no proper subalgebra, so we have (iv).

Conversely, from conditions (i) - (iv) and Theorem 10.6.5 we see that A is
quasiprimal. Since A has no proper subalgebras, each isomorphism between
non-trivial subalgebras of A is an automorphism of A , and A is demipri-
mal.

The following characterization of hemiprimal algebras is due to A. F. Pixley,


from [88].

Theorem 10.6.8 Let A = (A; F A ) be a finite non-trivial algebra, for which


ConA is arithmetical. Then A is hemiprimal iff the following conditions are
satisfied:

(i) V (A) is arithmetical;

(ii) A has no proper subalgebra; and

(iii) If θ1 , θ2 are congruences of A then any isomorphism ϕ : A/θ1 → A/θ2


is the identity mapping, and in particular θ1 = θ2 .

For more results on affine complete and functionally complete algebras we


refer the reader to the work of A. F. Pixley in [89]. Using the properties
of semi-, demi-, and hemiprimal algebras and Theorem 6.5.7 we obtain the
following structure theorem:

Theorem 10.6.9 Let A be a finite non-trivial algebra and let V (A) be the
variety generated by A. Then:

(i) If A is semiprimal then V (A) = IPs S(A). (If A is regular subprimal


then V (A) = IPs {A, B} where B is the only non-trivial subalgebra of
A, and if A is singular subprimal then V (A) = IPs S(A).)

(ii) If A is hemiprimal then V (A) = IPs (A).

(iii) If A is demiprimal then V (A) = IPs (A). When A is singular sub-


primal or demiprimal, the algebra A is the only subdirectly irreducible
algebra in V (A) and V (A) has no non-trivial subvariety (so V (A) is
a minimal variety).

10.7 Preprimal Algebras


Definition 10.7.1 A finite non-trivial algebra A is called preprimal if
cloneA is one of the maximal clones described in Section 10.4.

By Corollary 1.3.14, the clone M is a maximal subclone of O(A) if for ev-


ery f ∈ O(A) \ M , the clone hM ∪ {f }iO(A) generated by M ∪ {f } equals
the whole clone O(A). Let A be a preprimal algebra. If not all the nullary
operations defined on the set A are term operations of A, then A is func-
tionally complete. If however all the nullary operations are term operations
of the algebra, then A admits no non-identical automorphisms and has no
proper subalgebras. Functionally complete preprimal algebras are simple. By
Post’s classification of all two-element algebras ([95]), there exist exactly five
preprimal two-element algebras. These are :

C3 = ({0, 1}; ∨, g2 ), for g2 := x ∧ ¬y,


D3 = ({0, 1}; (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x + y + z, ¬),
C2 = ({0, 1}; ∨, x + y + 1),
A1 = ({0, 1}; ∧, ∨, c20 , c21 ),

L1 = ({0, 1}; +, c21 ).

Here C3 and C2 are dually isomorphic, and C3 , C2 and D3 are functionally


complete.

We will use the following concept defined by A. L. Foster ([46]). An element


a ∈ A is called a centroid element of A if there exists a unary term operation
ca of A with ca (x) = a for all x ∈ A. The set C(A) of all centroid elements of
A is called the centroid of A. An algebra is called constantive if C(A) = A.
Then we have

Theorem 10.7.2 A preprimal algebra A is either quasiprimal or constan-


tive.

Proof. Let A be preprimal. Clearly, every constant operation preserves the


diagonal relation ∆A . By definition cloneA is a maximal subclone of O(A),
and is determined by one of the relations described in Rosenberg’s list from
Section 10.4. Moreover, it is known that constantive preprimal algebras cor-
respond to relations which are reflexive or totally reflexive. Then bounded
partial order relations (type (ii) in the list from Section 10.4), non-trivial
equivalence relations (type (iv)) and central relations for h = 2 (type (v))
are reflexive; and central relations for h ≥ 3 (type (v)), t-universal relations
(type (vi)) and the relations of type (iii) are totally reflexive. Preprimal
algebras corresponding to relations of type (i) are demiprimal; preprimal al-
gebras corresponding to unary central relations are semiprimal (subprimal).
Therefore, these algebras are quasiprimal. There are no other kinds of prepri-
mal algebras, showing that every preprimal algebra is either constantive or
quasiprimal.

As a consequence, we see that preprimal algebras are functionally complete


iff they are quasiprimal.

Preprimal algebras corresponding to type (iv) are hemiprimal with exactly


one non-trivial congruence relation. It was shown by K. Denecke in [20] that
such algebras generate arithmetical varieties. It is also easy to see that every
preprimal algebra which does not correspond to a relation of type (iv) is
simple. Every preprimal algebra except those corresponding to type (v) for
h = 1 has no non-trivial subalgebra, and every preprimal algebra which does
not correspond to type (i) has no non-trivial automorphisms. For algebras

generating arithmetical varieties we have the following result.

Theorem 10.7.3 A non-trivial algebra A which generates an arithmetical


variety is preprimal if and only if one of the following mutually exclusive
cases is satisfied:

Case 1. (i′) A has exactly one proper subalgebra B, which is either
             primal or trivial,
        (ii) A has no non-identical automorphisms, and
        (iii) A is simple.

Case 2. (i) A has no proper subalgebras,
        (ii′) The automorphism group of A is cyclic and of prime
             order, and
        (iii) A is simple.

Case 3. (i) A has no proper subalgebras,
        (ii) A has no non-trivial automorphisms, and
        (iii′) A has one non-trivial homomorphic image, and this
             homomorphic image is a primal algebra.

Proof. “⇒”: Assume that A generates an arithmetical variety and the three
conditions of Case 1 are satisfied. Since A is simple and the only other sub-
algebra B of A is also simple, since it is either primal or trivial, we conclude
that all subalgebras of A are simple. The algebra A has no non-identical
automorphisms and the subalgebra B, as a trivial or primal algebra, also has
no non-identical automorphisms; hence all subalgebras of A have no proper
automorphisms. This means that all conditions of Theorem 10.6.6 are satis-
fied, and A is semiprimal. Since A has only one proper subalgebra, by the
definition of semiprimality every operation defined on A which preserves the
subset B ⊂ A is a term operation of A, and cloneA is a clone of operations
preserving a central relation with h = 1 (class (v)), making A preprimal.

Next assume that A generates an arithmetical variety and that the three
conditions of Case 2 are satisfied. Let the automorphism group of A be the
cyclic group of prime order p. We want to apply Theorem 10.6.7. Clearly,
AutA is generated by an automorphism s of prime order p ≠ 1. The auto-
morphism s has no fixed points: the fixed points of an automorphism form
a subalgebra, and since A has no proper subalgebras, a fixed point of s would
force the fixed-point set to be all of A, making s the identity and contradicting
p ≠ 1. Therefore A is demiprimal.

Now assume that A generates an arithmetical variety and that the conditions
of Case 3 are satisfied. Then all conditions of Theorem 10.6.8 are fulfilled;
so A is hemiprimal and cloneA = Polθ for a non-trivial congruence relation
θ of A. Then A is preprimal of type (iv).

“⇐” By Theorem 10.7.2 a preprimal algebra is either quasiprimal or constan-


tive. Quasiprimal algebras generate arithmetical varieties, and correspond to
class (i) or to class (v) with h = 1. In the first case Case 2 is satisfied, while
in the second case Case 1 is satisfied. All preprimal algebras other than
those corresponding to class (iv) are simple, since if A has a non-trivial con-
gruence relation, then cloneA ⊂ Polθ and A is not preprimal. Preprimal
algebras A with cloneA = Polθ, for θ non-trivial, generate arithmetical va-
rieties by Theorem 10.6.8. These algebras fit Case 3. No other preprimal
algebras generate arithmetical varieties since constantive algebras have no
non-identical automorphisms and no proper subalgebras, and since an alge-
bra having no proper subalgebras, no non-identical automorphisms and no
non-trivial congruence relations which generates an arithmetical variety is
primal by Theorem 10.5.8.

We remark that it was shown by K. Denecke in [20] that preprimal al-


gebras corresponding to class (iii) generate varieties which are congruence
permutable but not congruence distributive. Preprimal algebras A for which
cloneA is in class (v) generate 3-distributive but not 2-distributive vari-
eties if h > 2, and 2-distributive varieties if h = 2. Preprimal algebras for
which cloneA is in class (ii) and the relation ρ is a lattice order generate
2-distributive varieties. We note also that in the last case there are bounded
partial orders ρ such that Polρ is not finitely generated (see G. Tardos,
[110]). In this case we do not get a finite algebra in the usual sense of having
both a finite universe and finitely many fundamental operations.

Preprimal algebras A for which cloneA = Polθ for some non-trivial equiva-
lence relation θ are the only non-simple preprimal algebras. But such algebras
have only one non-trivial congruence relation; so they are subdirectly irre-
ducible. This shows that all preprimal algebras are subdirectly irreducible.
The question of describing all other subdirectly irreducible algebras in the
variety V (A) generated by a preprimal algebra A was solved for most classes
of preprimal algebras, independently by K. Denecke in [20] and A. Knoebel
in [65]. This is closely connected to the problem of describing the subvariety
lattice of varieties V (A) generated by preprimal algebras A. The results are

summarized in the following table; the proofs may be found in [20] and [65].

Class   Relation                        Number of        Number of
                                        subvarieties     subdirectly
                                        of V (A)         irreducibles
(i)     fixed point free permutation    2                1
(ii)    bounded partial order           2                ?
(iii)   elementary abelian p-group      3                2
(iv)    equivalence relation            3                2
(v)     h = 1, subset |B| = 1           2                1
(v)     h = 1, subset |B| > 1           3                2
(v)     h > 2                           2                1
(vi)                                    5 or 6           finite

10.8 Exercises
10.8.1. Prove directly that the pair (Pol, Inv) of operators introduced in
Section 10.2 forms a Galois connection.

10.8.2. Prove Lemma 10.5.2.

10.8.3. Verify that the algebra 3 in Example 10.5.3, is primal.

10.8.4. Prove Remark 10.5.10.

10.8.5. Prove Theorem 10.5.12.

10.8.6. Determine all two-element preprimal algebras which generate an


arithmetical variety.

10.8.7. Prove that for preprimal algebras A corresponding to class (i), the
variety V (A) has no non-trivial subvarieties and A is the only subdirectly
irreducible algebra in V (A).

10.8.8. Prove that for preprimal algebras A corresponding to class (iv), the
variety V (A) has only one non-trivial subvariety and two subdirectly
irreducible algebras.

10.8.9. Prove that for preprimal algebras A corresponding to class (v), when
|B| = 1 the variety V (A) has no non-trivial subvarieties.

10.8.10. Let A be the two-element set {0, 1}. Prove that an operation f :
An → A is a linear Boolean function iff it is in Polρ, where

ρ = {(y1 , y2 , y3 , y4 ) ∈ A4 | y1 + y2 = y3 + y4 }.
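For binary operations the claim of this exercise can at least be confirmed by brute force; the sketch below (names ours) compares membership in Polρ with linearity, i.e. having the form a + bx + cy (mod 2), over all 16 binary Boolean functions.

```python
from itertools import product

# Brute-force confirmation of the n = 2 case of the exercise:
# a binary Boolean function preserves rho iff it is linear mod 2.
rho = {q for q in product((0, 1), repeat=4)
       if (q[0] + q[1]) % 2 == (q[2] + q[3]) % 2}

def preserves_rho(table):           # table: dict (x,y) -> f(x,y)
    for u, v in product(rho, repeat=2):
        w = tuple(table[(u[i], v[i])] for i in range(4))
        if w not in rho:
            return False
    return True

def is_linear(table):               # f(x,y) = a + b*x + c*y mod 2?
    return any(all(table[(x, y)] == (a + b * x + c * y) % 2
                   for x in (0, 1) for y in (0, 1))
               for a in (0, 1) for b in (0, 1) for c in (0, 1))

funcs = [dict(zip([(0, 0), (0, 1), (1, 0), (1, 1)], vals))
         for vals in product((0, 1), repeat=4)]
print(sum(preserves_rho(fn) for fn in funcs),   # 8 functions preserve rho
      sum(is_linear(fn) for fn in funcs))       # 8 functions are linear
assert all(preserves_rho(fn) == is_linear(fn) for fn in funcs)
```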
Chapter 11

Tame Congruence Theory

An algebra A is called finite when its universe A is finite and every funda-
mental operation is finitary. Finite algebras are important in many areas
where finiteness plays a crucial role, for instance in Computer Science. A
major area of research activity has been to try to classify all finite algebras
of a given type. For instance, the classification of all finite groups has been
a longstanding mathematical problem.

In the early 1980’s, R. McKenzie and D. Hobby developed a new theory


called “Tame Congruence Theory,” which offers a structure theory for finite
algebras ([57]). In this chapter we present the main ideas and results of this
important theory. A central concept in tame congruence theory is the notion
of a minimal algebra. We will give a classification of all minimal algebras,
using the properties of the congruence lattices of the algebras in a variety
generated by a minimal algebra.

11.1 Minimal Algebras


Let A and B be sets, let f : A → B be a mapping and let U be a subset
of A. Then the restriction of f to the set U is the mapping f |U from U to
B defined by f |U (x) := f (x) for all x ∈ U . We will be particularly inter-
ested in the restrictions of polynomial operations on an algebra A to certain
special subsets of the base set A. We will denote by P n (A) and P (A) the
sets of all n-ary polynomial operations and of all finitary polynomial opera-
tions, respectively, of the algebra A. We also use Eq A for the lattice of all
equivalence relations on a set A.


Definition 11.1.1 Let A be a set with U a subset of A, and let A be an


algebra with A as its universe. We define the following restrictions to set U :

(i) For any equivalence relation θ in Eq A, we set θ|U := θ ∩ U 2 .

(ii) For any n-ary operation g on A, we set g|U := g|U n .



(iii) We set P (A)|U := ⋃ P n (A)|U , the union taken over all n ≥ 0, where
      P n (A)|U = {g|U | g ∈ P n (A) and g(U n ) ⊆ U }.

(iv) We define A|U to be the algebra (U ; P (A)|U ).

It is easy to check that when θ is an equivalence relation on A, the relation


θ|U is also an equivalence relation on the set U , called the restriction of θ
to U . An operation g|U is also called the restriction of g to U . The algebra
A|U is called the algebra induced on U by A, or the restriction of the algebra
A to the set U .

To obtain our minimal algebras, we want to consider restrictions of algebras


to certain special subsets, those which occur as images of mappings, partic-
ularly of idempotent polynomial mappings on the algebra.

A mapping e : A → A is called idempotent if e2 := e◦e = e. For A an algebra,


we shall denote by E(A) the set of all idempotent polynomial operations of
A.

Our first theorem of this chapter shows that under certain conditions, the
restriction of a congruence on A to a subset U of A is also a congruence
relation, on the induced algebra A|U .

Theorem 11.1.2 Let A be an algebra and let e ∈ E(A). Let U be the image
set U := e(A). Then the mapping ϕU : Con A → Con(A|U ), defined by
θ ↦ θ|U , is a surjective lattice homomorphism.
Proof: We must show first that if θ ∈ Con A, then θ|U is in Con(A|U ),
so that our mapping ϕU is well defined. Let (a1 , b1 ), . . . , (an , bn ) be in θ|U ,
so that (a1 , b1 ), . . . , (an , bn ) are all in θ ∩ U 2 . The fundamental operations
of the induced algebra A|U have the form f A |U , where f A is a polynomial
operation on A. Then we have
11.1. MINIMAL ALGEBRAS 253

(f A|U (a1 , . . . , an ), f A|U (b1 , . . . , bn ))
= (f A |U (a1 , . . . , an ), f A |U (b1 , . . . , bn ))
= (f A (a1 , . . . , an ), f A (b1 , . . . , bn )),

which we claim is in θ ∩ U 2 . This is because P (A) is the clone generated by
the set cloneA ∪ {ca | a ∈ A}, and our claim is true for every f in cloneA by
Theorem 5.2.4 and obviously true for each constant mapping ca .

Next we must show that the mapping ϕU is compatible with the lattice op-
erations ∧ (which is just ∩) and ∨ on the congruence lattices. For ∧, we let
θ and ψ be in ConA, and let a, b ∈ A. Then (a, b) ∈ θ|U ∩ ψ|U ⇔ (a, b) ∈
(θ ∩ U 2 ) ∩ (ψ ∩ U 2 ) ⇔ (a, b) ∈ θ ∩ ψ ∩ U 2 ⇔ (a, b) ∈ (θ ∩ ψ)|U . It follows that
ϕU (θ) ∩ ϕU (ψ) = ϕU (θ ∩ ψ).

For ∨, we again take θ and ψ to be in ConA. We have θ|U = θ ∩ U 2 ⊆


(θ ∨ ψ) ∩ U 2 and ψ|U = ψ ∩ U 2 ⊆ (θ ∨ ψ) ∩ U 2 , and thus we have the contain-
ment θ|U ∨ψ|U ⊆ (θ∨ψ)|U . For the opposite inclusion, let (a, b) ∈ (θ∨ψ)∩U 2 .
Then by our characterization of joins of congruences in Lemma 9.2.2 we
know that there are elements a0 , a1 , . . . , an ∈ A such that a = a0 , b = an ,
and (a2i , a2i+1 ) ∈ θ and (a2i+1 , a2i+2 ) ∈ ψ, for i ≥ 0. Since e is a unary
polynomial on A, it preserves the congruences θ and ψ (see Exercise 5.4.7),
so we have (e(a2i ), e(a2i+1 )) ∈ θ and (e(a2i+1 ), e(a2i+2 )) ∈ ψ, for i ≥ 0.

Now we use the idempotence of our polynomial mapping e, which up till


now we have not made use of. This idempotence, and our choice of U as
e(A), means precisely that e|U is the identity mapping on U : for any x ∈ U ,
we have x = e(y) for some y in A, and then e(x) = e(e(y)) = e(y) = x.
The elements e(aj ) are all in U , and moreover a and b are in U so e(a) = a
and e(b) = b. From this, applying our join criterion in the opposite direc-
tion now, we conclude that (a, b) is in θ|U ∨ ψ|U . This gives us the equality
ϕU (θ) ∨ ϕU (ψ) = ϕU (θ ∨ ψ) we need for ∨.

Finally we show that the mapping ϕU is surjective. Let Φ be any congruence
in Con(A|U ). We define the relation

Φ̄ := {(x, y) ∈ A2 | (e(f (x)), e(f (y))) ∈ Φ for all f ∈ P 1 (A)},

which we shall show is a congruence on A whose image under ϕU is Φ. The
relation Φ̄ is clearly an equivalence relation on A. For the congruence prop-
erty, we show that Φ̄ is compatible with any unary polynomial operation
g of A, which is sufficient to guarantee a congruence by Theorem 1.4.5. If
(x, y) ∈ Φ̄, then for any unary polynomial operation f on A, the composition
f ◦ g is also a unary polynomial, and so by definition of Φ̄ we have
(e((f ◦ g)(x)), e((f ◦ g)(y))) ∈ Φ. But then (e(f (g(x))), e(f (g(y)))) ∈ Φ for
all f ∈ P 1 (A), and thus (g(x), g(y)) ∈ Φ̄.

To finish our proof of surjectivity, we show that Φ̄|U = Φ. Let (x, y) be in
Φ̄|U , so that (x, y) ∈ Φ̄ ∩ U 2 . Taking f to be the unary identity polynomial
on A we get (e(x), e(y)) ∈ Φ, and the fact that e is the identity mapping on
U shows that (x, y) is then in Φ. Conversely, suppose that (x, y) ∈ Φ. For all
f ∈ P 1 (A) we have (e ◦ f )(U ) ⊆ U , so that (e ◦ f )|U is a fundamental
operation of the algebra A|U . But then (e(f (x)), e(f (y))) ∈ Φ, and by
definition of Φ̄ the pair (x, y) belongs to Φ̄. This gives the containment and
hence the equality we need.

Thus we have proved that if e is an idempotent unary polynomial operation


on an algebra A, and our subset U is the image of A under e, then every
congruence on the induced algebra A|U is the restriction of a congruence on
A, with a lattice homomorphism between the two congruence lattices. Our
next example, while it illustrates the restriction of congruences, also shows
that it is not necessary to use the image of an idempotent polynomial.

Example 11.1.3 Consider the algebra A = ({a, b, c, d}; f ), where f is de-


fined by the table

x a b c d
f (x) b c c d

The mapping f is not idempotent, but nevertheless we can form the set
U := f (A) = {b, c, d}, and compare the lattices of congruences on the orig-
inal algebra A and the induced algebra A|U . It is straightforward to work
out that the lattice Con A has the Hasse diagram shown below, with θ0 and θ1
denoting the trivial congruences ∆A and A × A, respectively.

Here θ2 , θ3 , θ4 and θ5 denote the congruences whose only non-trivial blocks
are {b, c, d}, {c, d}, {b, c} and {a, b, c}, respectively.

          θ1
         /  \
       θ2    θ5
      /  \  /
    θ3    θ4
      \   /
       θ0

Our induced algebra A|U has universe U , and fundamental operation set
P (A)|U . There is only one unary term operation on A which preserves the
subset U = {b, c, d}, namely the term g := f |U = (f ◦ f )|U , given by the
table below.
x b c d
g(x) c c d

Thus P (A)|U is generated by g and the three constant mappings with values
in U . Using this, we can work out that this algebra A|U has four congru-
ences:

θ0′ = ∆U and θ1′ = U × U ,
θ2′ = {(b, b), (c, c), (d, d), (b, c), (c, b)}, and
θ3′ = {(b, b), (c, c), (d, d), (c, d), (d, c)}.

Thus Con A|U has the Hasse diagram:

       θ1′
      /   \
    θ2′    θ3′
      \   /
       θ0′

As in Theorem 11.1.2, we can consider the mapping ϕU : Con A → Con(A|U ),


taking each θ to θ|U . Then we have

θ1 ↦ θ1 |U = θ1′ ,   θ0 ↦ θ0 |U = θ0′ ,
θ2 ↦ θ2 |U = θ1′ ,   θ3 ↦ θ3 |U = θ3′ ,
θ4 ↦ θ4 |U = θ2′ ,   θ5 ↦ θ5 |U = θ2′ .

It is easy to check that this is a lattice homomorphism, in spite of the fact


that our subset U was the image of a non-idempotent polynomial operation.
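Both congruence lattices in this example can be confirmed by exhaustive search: for a unary algebra, a partition is a congruence exactly when merged elements have merged images. A sketch (helper names ours), using the operation f of the example and the induced operation g on U:

```python
from itertools import product

# Count the congruences of the unary algebras from Example 11.1.3:
# A = ({a,b,c,d}; f) has 6, and the algebra induced on U = {b,c,d},
# whose only non-constant unary operation is g, has 4.  Constant
# operations are compatible with every equivalence relation, so they
# can be ignored here.
f = {'a': 'b', 'b': 'c', 'c': 'c', 'd': 'd'}
g = {'b': 'c', 'c': 'c', 'd': 'd'}

def congruences(universe, ops):
    elems = sorted(universe)
    found = set()
    # every map elems -> block labels determines a partition
    for labels in product(range(len(elems)), repeat=len(elems)):
        lab = dict(zip(elems, labels))
        if all(lab[op[x]] == lab[op[y]]
               for op in ops
               for x in elems for y in elems if lab[x] == lab[y]):
            found.add(frozenset(frozenset(e for e in elems if lab[e] == l)
                                for l in set(labels)))
    return found

print(len(congruences('abcd', [f])), len(congruences('bcd', [g])))  # 6 4
```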
We are now ready to define minimal sets of an algebra A. These will be
sets which are images of A of the form f (A), where f is a unary polynomial
operation. As we have seen, f need not be idempotent, but we do want the
image set f (A) to have certain useful properties.

Definition 11.1.4 Let A be an algebra and let f ∈ P 1 (A) be a unary


polynomial operation of A. The image set f (A) is called a minimal set of A
if the following conditions are satisfied:

(i) |f (A)| > 1, and

(ii) for all g ∈ P 1 (A) with g(A) ⊆ f (A) and |g(A)| > 1, we have g(A) =
f (A).

The collection of all minimal sets of an algebra A will be denoted by M in(A).


If A is a finite set of cardinality at least two, then A has some minimal sets,
so that M in(A) is non-empty. An induced algebra A|U induced by a minimal
set U ∈ M in(A) is called a minimal algebra of A.

Example 11.1.5 Consider the algebra

A = ({0, 1, 2}; max(x, y), min(x, y), 0, 1, 2, m1 , m2 ),

where m1 and m2 are unary operations given by the tables

x m1 (x) m2 (x)
0 0 0
1 2 0
2 2 2

Then cloneA = P (A) is the set of all operations which are monotone with
respect to the usual order 0 ≤ 1 ≤ 2. For each of the following monotone
operations f1 , . . ., f6 ,

x f1 f2 f3 f4 f5 f6
0 0 0 0 0 1 1
1 0 1 0 2 1 2
2 1 1 2 2 2 2

the corresponding image set fi ({0, 1, 2}) is a minimal set. Notice that these
are all idempotent operations.
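Since the unary members of cloneA are exactly the monotone self-maps of the chain, the minimal sets of this example can be computed directly from Definition 11.1.4; a sketch (names ours):

```python
from itertools import product

# The unary polynomial operations of Example 11.1.5 are the monotone
# self-maps of the chain 0 <= 1 <= 2.  Applying Definition 11.1.4 to
# their image sets recovers the minimal sets {0,1}, {0,2} and {1,2}.
A = (0, 1, 2)
unary_polys = [dict(zip(A, vals)) for vals in product(A, repeat=3)
               if all(vals[i] <= vals[i + 1] for i in range(2))]
images = {frozenset(p.values()) for p in unary_polys}

# a minimal set is an image of size > 1 containing no strictly
# smaller image of size > 1
minimal = {U for U in images
           if len(U) > 1 and
           not any(len(V) > 1 and V < U for V in images)}
print(sorted(sorted(U) for U in minimal))   # [[0, 1], [0, 2], [1, 2]]
```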

As usual, a bijective unary operation f : A → A is called a permutation of


the set A. Let SA = (SA ; ◦) be the full permutation group on the set A. Any
subgroup G of this group is called a permutation group on A.

Theorem 11.1.6 Let A be a finite algebra and let U ∈ M in(A). Then every
unary polynomial operation of the induced algebra A|U is either a permuta-
tion or a constant operation. Moreover, the non-constant unary polynomial
operations of A|U form a permutation group on U .

Proof: By definition of A|U , all its polynomial operations are fundamental


operations. To show this, suppose that f ∈ P (A|U ), so that f ∈ hP (A)|U
∪ {ca | a ∈ U }i = hP (A)|U i. But the set P (A)|U = {g|U | n ∈ IN and
g ∈ P n (A), g(U n ) ⊆ U } is a clone. Therefore, hP (A)|U i = P (A)|U and thus
P (A|U ) ⊆ P (A)|U . The converse inclusion is also satisfied since the elements
of P (A)|U are the fundamental operations of A|U .

This means that the unary polynomial operations of A|U are exactly the
mappings of the form g|U for some g ∈ P 1 (A) such that g(U ) ⊆ U . Now
suppose that such a mapping is not a permutation of the set U . Since
U is minimal, it has the form U = f (A) for some f ∈ P 1 (A). Then
g(f (A)) = g(U ) ⊂ U = f (A). But again since U is minimal we must have
|g(U )| = |g(f (A))| = 1. Therefore g|U is constant on U .

Algebras with this property, that every non-constant unary polynomial op-
eration is a permutation, are called permutation algebras. Our theorem thus
says that any minimal algebra of a finite algebra A is a permutation algebra.
Later we will show that there are exactly five types of permutation algebras,
and thus five types of minimal algebras.

Minimal algebras can also be defined using congruence relations.



Definition 11.1.7 Let A be an algebra and let β ∈ Con A. A set of the


form f (A) with f ∈ P 1 (A) is called a β-minimal set of A, or a minimal set
with respect to β, if the following conditions are satisfied:
(i) f (β) ⊈ ∆A : there is a pair (x, y) ∈ β with f (x) ≠ f (y).

(ii) for any g ∈ P 1 (A) with g(A) ⊆ f (A) and g(β) ⊈ ∆A , we have g(A) =
f (A).
The set of all β-minimal sets of an algebra A is denoted by M inA (β), or
when the algebra is clear from the context, just M in(β). The corresponding
induced algebras A|U , for U ∈ M in(β), are called β-minimal algebras of A,
or minimal algebras with respect to β.

Example 11.1.8 We consider again the algebra A = ({a, b, c, d}; f ) from


Example 11.1.3, with f defined by the table
x a b c d
f (x) b c c d
and take as our congruence

β = {(a, a), (b, b), (c, c), (d, d), (b, c), (c, b), (b, d), (d, b), (c, d), (d, c)}.
For the operation g given by the table
x a b c d
g(x) c c c d
we have an image set U = g(A) = {c, d}. The image g(β) =
{(c, c), (d, d), (c, d), (d, c)} contains a non-diagonal pair, and moreover there
is no other unary polynomial with this property which gives a smaller image.
Thus our set g(A) meets both conditions regarding our congruence, and is a
β-minimal set of A.
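This, too, can be verified mechanically: the unary polynomial operations of A are just the identity, f, f ∘ f (which is the g of the example, since f ∘ f ∘ f = f ∘ f) and the four constant maps, so both conditions of Definition 11.1.7 can be checked over this list. A sketch (helper names ours):

```python
# Mechanical check of Example 11.1.8: U = g(A) = {c, d} is beta-minimal.
f = {'a': 'b', 'b': 'c', 'c': 'c', 'd': 'd'}
ff = {x: f[f[x]] for x in f}          # this is the g of the example
ident = {x: x for x in f}
# all unary polynomial operations of A (iterates of f and constants)
unary_polys = [ident, f, ff] + [{x: c for x in f} for c in 'abcd']

# beta merges {b, c, d} and fixes a
beta = {(x, y) for x in 'abcd' for y in 'abcd'
        if x == y or (x in 'bcd' and y in 'bcd')}

def moves_beta(h):   # condition (i): h(beta) is not inside the diagonal
    return any(h[x] != h[y] for (x, y) in beta)

U = frozenset(ff.values())            # U = {c, d}
assert moves_beta(ff)                 # condition (i) holds for g
assert all(frozenset(h.values()) == U # condition (ii) holds
           for h in unary_polys
           if frozenset(h.values()) <= U and moves_beta(h))
print("g(A) =", sorted(U), "is a beta-minimal set")
```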
Definition 11.1.7 is in fact a generalization of Definition 11.1.4. A minimal
set for an algebra A is always a β-minimal set for β equal to the largest
congruence A×A on A. The first condition for β-minimal sets, in this special
case, merely says that the cardinality of the minimal set is greater than
one, while the second condition translates directly. We thus have M in(A) =
M inA (A × A).
We now generalize the characterization from Theorem 11.1.6 of minimal
algebras of a finite algebra.

Theorem 11.1.9 Let A be a finite algebra, and let β ∈ ConA. Let U ∈


M inA (β) be a β-minimal set such that U = e(A) for some idempotent e ∈
E(A). Then every unary polynomial operation g of the induced minimal
algebra A|U is either a permutation or satisfies g(β|U ) ⊆ ∆U . For every β|U -
congruence class N of A|U , the algebra (A|U )|N induced by A|U on N is a
permutation algebra, that is, all its non-constant unary polynomial operations
are permutations.
Proof: By definition, every unary polynomial operation g of A|U has the
form g = h|U for some unary polynomial h on A such that h(e(A)) = h(U ) =
g(U ) ⊆ U . If g is not a permutation of U , then by the finiteness of A we
have h(e(A)) = h(U ) = g(U ) ⊂ U ; and since U is minimal, this means that
h(e(β)) ⊆ ∆A . Since e is idempotent, we have e(β) = β|U and then g(β|U )
= h(β|U ) ⊆ ∆U .

Now let N be a β|U -congruence class of A|U . Every unary polynomial oper-
ation of (A|U )|N has the form h|N for some h ∈ P 1 (A) with h(U ) ⊆ U and
h(N ) ⊆ N . Then h|U is either a permutation on U , or satisfies h(β|U ) ⊆ ∆U .
In the first case h|N is a permutation on N , while in the second case h|N is
constant.

Definition 11.1.10 Let A be an algebra, let β ∈ Con A and let U ∈
M inA (β). Any β|U -congruence class N of A|U with at least two elements is
called a β-trace of U . Note that by Definition 11.1.7 every β-minimal set U
contains at least one such β-trace. For such a trace N , the induced algebra
(A|U )|N is called a trace algebra of A with respect to β. The set U is then
divided into two subsets called the body and tail of U , as follows:

body of U := ⋃ {N | N is a β-trace of U },
tail of U := U \ (body of U ).
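For a concrete instance of these notions, the β-minimal set U = {c, d} of Example 11.1.8 can be decomposed in a few lines of Python (our own illustration; the blocks of β are taken from that example):

```python
# The beta-minimal set U = {c, d} of Example 11.1.8, where beta is the
# congruence whose blocks are {a} and {b, c, d}.
U = {'c', 'd'}
blocks = [{'a'}, {'b', 'c', 'd'}]

# The classes of beta restricted to U are the non-empty intersections
# of the blocks of beta with U.
classes = [B & U for B in blocks if B & U]

traces = [N for N in classes if len(N) >= 2]    # the beta-traces of U
body = set().union(*traces) if traces else set()
tail = U - body
```

Here the single β-trace {c, d} is the whole of U, so the body is all of U and the tail is empty.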

Example 11.1.11 As an example we consider the group (Z6 ; +, −, 0) of
equivalence classes modulo 6. Each congruence of a commutative group
corresponds to a subgroup of the group. In this example we have two proper
subgroups of our group, {0, 3} and {0, 2, 4}, and hence two congruences other
than ∆ and Z6 × Z6 :

α = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (0, 3), (3, 0), (1, 4), (4, 1),
(2, 5), (5, 2)},
β = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (0, 2), (2, 0), (0, 4), (4, 0),
(2, 4), (4, 2), (1, 3), (3, 1), (1, 5), (5, 1), (3, 5), (5, 3)}.

The unary polynomial operations of our algebra have the form f (x) = ax+b,
for some a, b ∈ Z6 . From this information we can calculate all the minimal
sets of this algebra:

M in(Z6 × Z6 ) = {{0, 3}, {1, 4}, {2, 5}, {0, 2, 4}, {1, 3, 5}},
M in(α) = {{0, 3}, {1, 4}, {2, 5}},
M in(β) = {{0, 2, 4}, {1, 3, 5}}.
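These three lists can be checked by brute force. The sketch below (plain Python, added for illustration; it takes as given, from the remark above, that the unary polynomials of Z6 are exactly the maps x ↦ ax + b) recomputes them:

```python
from itertools import product

Z6 = range(6)
# The unary polynomials of (Z6; +, -, 0) are the maps x -> a*x + b.
polys = [{x: (a * x + b) % 6 for x in Z6} for a, b in product(Z6, Z6)]

def congruence(mod):
    # the congruence identifying x and y when x is congruent to y mod `mod`
    return {(x, y) for x, y in product(Z6, Z6) if (x - y) % mod == 0}

alpha, beta, nabla = congruence(3), congruence(2), congruence(1)

def minimal_sets(theta):
    # images of unary polynomials p with p(theta) not inside the diagonal,
    # then the inclusion-minimal sets among those images
    images = {frozenset(p.values()) for p in polys
              if any(p[x] != p[y] for (x, y) in theta)}
    return {U for U in images if not any(V < U for V in images)}

Min_nabla, Min_alpha, Min_beta = map(minimal_sets, (nabla, alpha, beta))
```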

Our characterization of minimal algebras obtained from minimal sets U ,


in Theorem 11.1.6, showed that for any unary polynomial f for which the
set U = f (A) is minimal, the induced minimal algebra is a permutation
algebra. The generalized version of this theorem for congruences, Theorem
11.1.9, was more limited, in the sense that Theorem 11.1.9 applies only to
β-minimal sets U of the form e(A) for an idempotent unary polynomial e.
This raises the question of whether all the β-minimal sets of an algebra can
be obtained from idempotent unary polynomial operations only. The answer
to this question depends on the congruence β and a special property of the
congruence lattice of the original algebra. To describe the property involved,
we need the following definition.

Definition 11.1.12 Let L = (L, ∧, ∨) be any lattice.

(i) A mapping µ : L → L is called a meet-endomorphism of L if it satisfies


µ(x ∧ y) = µ(x) ∧ µ(y) for all elements x and y in L.

(ii) A mapping µ : L → L is called a join-endomorphism of L if it satisfies


µ(x ∨ y) = µ(x) ∨ µ(y) for all x and y in L.

(iii) Recall from Chapter 1 that a mapping ϕ : L → L is extensive if
x ≤ ϕ(x) for all x ∈ L; we now call a mapping ϕ strongly extensive if
x ≤ ϕ(x) for all x ∈ L, but ϕ(x) ≠ x for all x ≠ 1 (where 1 is the
greatest element of L, if there is one).

A lattice endomorphism is a mapping of a lattice to itself which preserves
both the meet and join operations of the lattice. It is possible for a mapping
on a lattice to preserve meets but not joins, or joins but not meets, and
hence each of the properties (i) and (ii) from Definition 11.1.12 is weaker
than the lattice endomorphism property.
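A concrete witness for this independence is easy to check by machine. In the lattice M3 (least element 0, greatest element 1 and three pairwise incomparable atoms a, b, c; our own illustrative choice, not from the text) the map µ(x) = x ∧ a preserves meets but not joins:

```python
from itertools import product

# M3: least element '0', greatest element '1', and atoms 'a', 'b', 'c'
# which are pairwise incomparable.
L = ['0', 'a', 'b', 'c', '1']

def meet(x, y):
    if x == y: return x
    if '0' in (x, y): return '0'
    if x == '1': return y
    if y == '1': return x
    return '0'                 # two distinct atoms meet in 0

def join(x, y):
    if x == y: return x
    if '1' in (x, y): return '1'
    if x == '0': return y
    if y == '0': return x
    return '1'                 # two distinct atoms join to 1

mu = {x: meet(x, 'a') for x in L}   # the map mu(x) = x meet a

preserves_meets = all(mu[meet(x, y)] == meet(mu[x], mu[y])
                      for x, y in product(L, L))
preserves_joins = all(mu[join(x, y)] == join(mu[x], mu[y])
                      for x, y in product(L, L))
```

Indeed µ(b ∨ c) = µ(1) = a while µ(b) ∨ µ(c) = 0; in a distributive lattice, by contrast, the map x ↦ x ∧ a would preserve joins as well.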

Theorem 11.1.13 Let A be a finite algebra and let β ∈ Con A. Assume
that the interval [∆A , β] ⊆ Con A has no non-constant strongly extensive
meet-endomorphisms. Then every β-minimal set U of A has the form U =
e(A) for some idempotent unary polynomial e ∈ E(A).
Proof: Let U be any β-minimal set of A. We define a set K :=
{f ∈ P 1 (A)|f (A) ⊆ U }. For every congruence relation θ ∈ [∆A , β] we define
a relation
µ(θ) := {(x, y) ∈ β | ∀f ∈ K, (f (x), f (y)) ∈ θ} .
By definition µ(θ) ⊆ β, and we will show that µ(θ) is also a congruence on
A. To do this, it suffices by Lemma 5.3.2 to show that µ(θ) is compatible
with all unary polynomials on A. Suppose that (x, y) ∈ µ(θ) and g is a unary
polynomial operation of A. By definition of µ(θ) we have (x, y) in β; and for
every f ∈ K the composition f ◦ g is again in K, so the pair (f (g(x)), f (g(y)))
is in θ. Since β is a congruence and compatible with g, we also have
(g(x), g(y)) in β. Now from (g(x), g(y)) ∈ β and (f (g(x)), f (g(y))) ∈ θ for
all f ∈ K, we conclude that (g(x), g(y)) is in µ(θ), as required.

Next we show that the mapping µ is an extensive meet-endomorphism on the


interval [∆A , β]. It follows from the definition that for any θ in this interval,
the image µ(θ) is also in this interval, and that our mapping preserves meets
(intersections) of congruences. Let θ be a congruence in this interval and
let (x, y) be in θ. By the compatibility of θ with unary polynomials on
A we have (f (x), f (y)) ∈ θ for every unary polynomial f . We also have
(x, y) ∈ β since θ ⊆ β, so that (x, y) in µ(θ). This shows that θ ⊆ µ(θ),
and our mapping µ is extensive. Since U is β-minimal, it has the form U =
f (A) for some unary polynomial f ∈ K for which f (β) ⊈ ∆A . A pair
(x, y) ∈ β with f (x) ≠ f (y) then lies outside µ(∆A ), so µ(∆A ) ⊂ β, while
µ(β) = β by extensivity; hence µ is not a constant mapping. But our
assumption about the interval [∆A , β] means that µ cannot be strongly
extensive. This means there must exist a congruence θ0 in the interval, with
θ0 ≠ β, for which µ(θ0 ) = θ0 . Since µ preserves meets it is monotone, so
µ(µ(∆A )) ⊆ µ(µ(θ0 )) = θ0 ⊂ β. But µ(µ(∆A )) =
{(x, y) ∈ β | ∀ f, g ∈ K (f (g(x)) = f (g(y)))}; so for this to be a proper subset
of β there must exist polynomials f and g in K and a pair (x, y) ∈ β such
that f (g(x)) ≠ f (g(y)). Since f ◦ g ∈ K and (f ◦ g)(β) ⊈ ∆A , the β-minimality
of U gives f (g(A)) = U , and hence also f (A) = g(A) = U and f (U ) = U .
Since A and U are finite there is a natural number k for which (f |U )k is
the identity mapping on U . Now taking e := f k gives an idempotent (that
is, e ◦ e = e) for which e(A) = U , and hence our β-minimal set U is
obtainable from an idempotent unary polynomial.

Lemma 11.1.14 Let L be a finite lattice with least element 0 and greatest
element 1, such that the meet of all the coatoms of L (the elements directly
below 1) is equal to 0. Then L has no non-constant strongly extensive meet-
endomorphisms.

Proof: Assume that c1 , . . . , cn are the coatoms of L and that ϕ is a
strongly extensive meet-endomorphism. Since ϕ is extensive we have
1 ≤ ϕ(1), and hence ϕ(1) = 1. For i = 1, . . . , n, strong extensivity gives
ci < ϕ(ci ) ≤ 1, and the fact that ci is a coatom means that ϕ(ci ) = 1. Thus
ϕ(0) = ϕ(c1 ∧ · · · ∧ cn ) = ϕ(c1 ) ∧ · · · ∧ ϕ(cn ) = 1. Since ϕ preserves meets
it is order-preserving, so ϕ(x) = 1 for all x ∈ L, and ϕ is constant.

This Lemma can be applied to the lattices Mn (for n ≥ 2), consisting of a
least element 0, a greatest element 1, and n pairwise incomparable elements
between them: the coatoms of Mn are exactly the n middle elements, and
their meet is 0. Since Con Z6 ≅ M2 , all the minimal algebras of the algebra
Z6 of Example 11.1.11 have the form e(Z6 ).
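For small lattices the Lemma can also be confirmed by exhaustive search. The sketch below (our own illustration, not from the text) checks all 5^5 self-maps of M3 and finds that the only strongly extensive meet-endomorphism is the constant map with value 1:

```python
from itertools import product

# M3, as in the hypothesis of Lemma 11.1.14: three coatoms whose meet is 0.
L = ['0', 'a', 'b', 'c', '1']
order = {(x, y) for x in L for y in L
         if x == y or x == '0' or y == '1'}      # the relation x <= y in M3

def meet(x, y):
    if (x, y) in order: return x
    if (y, x) in order: return y
    return '0'                                   # two distinct coatoms

results = []
for values in product(L, repeat=len(L)):
    phi = dict(zip(L, values))
    is_meet_endo = all(phi[meet(x, y)] == meet(phi[x], phi[y])
                       for x in L for y in L)
    strongly_ext = all((x, phi[x]) in order for x in L) and \
                   all(phi[x] != x for x in L if x != '1')
    if is_meet_endo and strongly_ext:
        results.append(phi)

# `results` holds every strongly extensive meet-endomorphism of M3;
# by the Lemma, only the constant map to 1 should appear.
```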

11.2 Tame Congruence Relations


Definition 11.2.1 A finite algebra A is called tame if it has a minimal set
U ∈ M in(A) satisfying the following three conditions, for all θ ∈ ConA:

(i) There is an e ∈ E(A) with U = e(A),

(ii) θ ⊃ ∆A ⇒ θ|U ⊃ ∆U ,

(iii) θ ⊂ A × A ⇒ θ|U ⊂ U × U .

Theorem 11.2.2 Let A be a finite algebra. A minimal set U ∈ M in(A)


fulfills conditions (i), (ii) and (iii) from Definition 11.2.1 iff the following
two conditions are both satisfied:

(T1) For all x, y ∈ A with x ≠ y, there exists a unary polynomial f ∈ P 1 (A)
such that f (A) = U and f (x) ≠ f (y).

(T2) The congruence generated by U × U is equal to A × A; that is, for all
x, y ∈ A there are elements a1 , . . . , ak , b1 , . . . , bk ∈ U and f1 , . . . , fk ∈
P 1 (A) with x = f1 (a1 ), fi (bi ) = fi+1 (ai+1 ) for i = 1, . . . , k − 1, and
fk (bk ) = y.
As we saw in the previous section, minimal sets of an algebra A can be seen
as a special case of β-minimal sets of A; for congruences β, a minimal set is
in fact a β-minimal set for the largest congruence A × A on A. Similarly, our
definition of tame algebras by means of minimal sets can be generalized to
tame congruences, for any congruence on the algebra. In particular, Theorem
11.2.2 above is a particular case of a more general theorem for this more
general situation, and thus will be proved later.

Definition 11.2.3 Let A be a finite algebra. A congruence relation β ≠ ∆A
on A is called tame if there is a β-minimal set U ∈ M inA (β) such that the
following conditions are satisfied, for all θ ∈ [∆A , β]:
(i) There is an element e ∈ E(A) with U = e(A),
(ii) θ ⊃ ∆A ⇒ θ|U ⊃ ∆U ,
(iii) θ ⊂ β ⇒ θ|U ⊂ β|U .

Let us remark that the concept of a tame congruence can be generalized even
further, in the following way. Let A be finite and α and β be congruences
on A with α ⊂ β. Then the interval [α, β] ⊆ ConA is called tame if the con-
gruence relation β/α of the quotient algebra A/α is tame. The β/α-minimal
sets, and the corresponding induced algebras, of the quotient algebra A/α
are called [α, β]-minimal. (Note that by the Second Isomorphism Theorem,
Theorem 3.2.2, the intervals [α, β] ⊆ ConA and [∆A/α , β/α] ⊆ ConA/α are
isomorphic.)

We now present the generalization of Theorem 11.2.2, giving an equivalent


characterization of a tame congruence.

Theorem 11.2.4 Let A be a finite algebra and let β ∈ ConA with β ≠
∆A . A β-minimal set U ∈ M inA (β) satisfies conditions (i), (ii) and (iii)
from Definition 11.2.3 (so that β is a tame congruence) iff the following two
conditions are both satisfied:

(Z1) For all (x, y) ∈ β with x ≠ y, there is a unary polynomial f ∈ P 1 (A)
such that f (A) = U and f (x) ≠ f (y).

(Z2) θ(β|U ) = β; that is, for all (x, y) ∈ β there are pairs
(a1 , b1 ), . . . , (ak , bk ) ∈ β|U and polynomials f1 , . . . , fk ∈ P 1 (A) with
x = f1 (a1 ), fi (bi ) = fi+1 (ai+1 ) for i = 1, . . . , k − 1, and fk (bk ) = y.

Proof: (Z1) ⇒ (i): Since U is β-minimal there is a unary polynomial
g ∈ P 1 (A) for which g(A) = U , and a pair (a, b) ∈ β with g(a) ≠ g(b). Ap-
plying (Z1) to the pair (g(a), g(b)) gives us another polynomial f ∈ P 1 (A)
with f (A) = U and f (g(a)) ≠ f (g(b)). This means that f (g(A)) = f (U ) ⊆ U ,
and using the second condition from the definition of the β-minimality of U ,
we get f (U ) = f (g(A)) = U . Therefore f is a permutation on U and there is
an n ∈ ℕ with (f n )|U = idU . Then e := f n is an idempotent in E(A), and
we have e(A) = U , as required for condition (i). To complete our proof, we
shall show that (Z1) ⇔ (ii) and (Z2) ⇔ (iii). Moreover, we may now assume
that U = e(A) for an idempotent polynomial e ∈ E(A).

(ii) ⇒ (Z1): Assume that (x, y) ∈ β and x ≠ y. Then in A we have
θ(x, y) ⊃ ∆A , where θ(x, y) is the congruence generated by (x, y). Then
by condition (ii), we have θ(x, y)|U ⊃ ∆U , and there is a pair (u, v) ∈
θ(x, y) with u, v ∈ U and u ≠ v. Then by Lemma 5.3.3 there exist
polynomials g1 , . . . , gn ∈ P 1 (A) with u ∈ {g1 (x), g1 (y)}, {gi (x), gi (y)} ∩
{gi+1 (x), gi+1 (y)} ≠ ∅ for i = 1, . . . , n − 1 and v ∈ {gn (x), gn (y)}. From
u, v ∈ U and U = e(A) we get e(u) = u and e(v) = v. Then we have
u ∈ {e(g1 (x)), e(g1 (y))}, {e(gi (x)), e(gi (y))} ∩ {e(gi+1 (x)), e(gi+1 (y))} ≠ ∅
for i = 1, . . . , n − 1 and v ∈ {e(gn (x)), e(gn (y))}. Since u ≠ v we obtain
e(gi (x)) ≠ e(gi (y)) for at least one i. From this and from U = e(A) we get
(Z1).

(Z1) ⇒ (ii): Assume now that (Z1) is satisfied and that θ ∈ ConA with
∆A ⊂ θ ⊆ β. Then for any (x, y) ∈ θ with x ≠ y, by (Z1) there is a polyno-
mial f ∈ P 1 (A) with (f (x), f (y)) ∈ θ ∩ U 2 = θ|U and f (x) ≠ f (y). Thus (ii)
is satisfied.

(Z2) ⇔ (iii): The condition (iii), that if θ ⊂ β then θ|U ⊂ β|U , for all
θ ∈ [∆A , β], is by Theorem 11.1.2 equivalent to the fact that the congruence
relation on A generated by β ∩ U 2 is equal to β; and since θ(β|U ) always
satisfies β|U ⊆ θ(β|U ) ⊆ β, this is exactly the condition θ(β|U ) = β of (Z2).

Definition 11.2.5 Let A be an algebra and let B and C be subsets of


the universe set A of A. Then the sets B and C are called polynomially
isomorphic in A if there exist unary polynomials f and g in P 1 (A) such that
f (B) = C, g(C) = B, gf |B = idB , and f g|C = idC .
In this case the mapping f |B is called a polynomial isomorphism from B
onto C.
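As a small concrete illustration (our own example, not from the text): in the group Z6 of Example 11.1.11, the subsets B = {0, 3} and C = {1, 4} are polynomially isomorphic via the unary polynomials f (x) = x + 1 and g(x) = x + 5:

```python
# Checking Definition 11.2.5 for B = {0, 3} and C = {1, 4} in Z6,
# with f(x) = x + 1 and g(x) = x + 5 = x - 1 (both unary polynomials
# of the group (Z6; +, -, 0)).
B, C = {0, 3}, {1, 4}
f = lambda x: (x + 1) % 6
g = lambda x: (x + 5) % 6

f_B_equals_C = {f(x) for x in B} == C
g_C_equals_B = {g(x) for x in C} == B
gf_is_id_on_B = all(g(f(x)) == x for x in B)
fg_is_id_on_C = all(f(g(x)) == x for x in C)
```

All four conditions of the definition hold, so f |B is a polynomial isomorphism from B onto C.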

Lemma 11.2.6 Let B and C be polynomially isomorphic subsets in A. Then


A|B and A|C are isomorphic as non-indexed algebras, with f |B : B → C as
an isomorphism between them. Moreover, for all θ ∈ ConA we have f (θ|B )
= θ|C .
Proof: For convenience, let us denote A|B by B and A|C by C, and let H
:= {h | h ∈ P n (A), n ∈ ℕ, h(B n ) ⊆ B}. The fundamental operations of
B are exactly the operations of the form h|B with h ∈ H. We regard B as
an (unindexed) algebra of type H. For every h ∈ H, we define a mapping
h′ ∈ P n (A) by
h′ (x1 , . . . , xn ) := f (h(g(x1 ), . . . , g(xn ))).
Then h′ (C n ) ⊆ C for any such h′ , and the operation h′ |C is one of the
fundamental operations of C. We now show that any fundamental operation
of C can be represented in this way. For p ∈ P n (A) with p(C n ) ⊆ C, we
define a mapping h ∈ P n (A) by
h(x1 , . . . , xn ) := g(p(f (x1 ), . . . , f (xn ))).
Then h(B n ) ⊆ B, so that h ∈ H, and p|C = h′ |C . Thus C is an algebra of
type H. Also f |B is an isomorphism of the algebras B and C, since for all
h ∈ H ∩ P n (A) and for all b1 , . . . , bn ∈ B we have
f (h(b1 , . . . , bn )) = f (h(gf (b1 ), . . . , gf (bn ))) = h′ (f (b1 ), . . . , f (bn ))
using the fact that gf |B = idB .

For all θ ∈ ConA, it follows from f ∈ P 1 (A) that f (θ) ⊆ θ. The equation
f (B) = C gives f (θ|B ) = f (θ ∩ B 2 ) ⊆ θ ∩ C 2 = θ|C . Similarly, we obtain
g(θ|C ) ⊆ θ|B , and then θ|C = f (g(θ|C )) ⊆ f (θ|B ). Altogether we have equal-
ity.

Theorem 11.2.7 Let β be a tame congruence relation of the finite algebra


A. Then the β-minimal sets have the following properties:

(i) Any β-minimal sets U1 , U2 ∈ M inA (β) are polynomially isomorphic in


A.

(ii) For all U ∈ M inA (β), the conditions (Z1), (Z2) from Theorem 11.2.4
and (i),(ii),(iii) from Definition 11.2.3 are satisfied.

(iii) Let U ∈ M inA (β), and let f ∈ P 1 (A) be a unary polynomial with
f (β|U ) ⊈ ∆A . Then f (U ) is also in M inA (β), and f |U is a polynomial
isomorphism from U onto f (U ).

Proof: (i) Since β is tame there is a set W ∈ M inA (β) satisfying con-
ditions (i), (ii), (iii) from Definition 11.2.3 and (Z1), (Z2) from Theorem
11.2.4. Moreover, we may assume that W = e(A) for an idempotent unary
polynomial e. We show that for every U ∈ M inA (β), the sets U and W
are polynomially isomorphic. Since U is β-minimal, there is a polynomial
operation s ∈ P 1 (A) with s(A) = U and s(β) ⊈ ∆A . Then there is a pair
(x, y) ∈ β with s(x) ≠ s(y). Applying condition (Z2) for W to the pair
(x, y), we obtain the existence of a pair (a, b) ∈ β|W and of a polynomial
h ∈ P 1 (A) such that s(h(a)) ≠ s(h(b)) (since otherwise s(x) = s(y)). Now
we define s1 to be the composition mapping s ◦ h ◦ e; note that e(a) = a
and e(b) = b, since a, b ∈ W = e(A) and e is idempotent, so s1 (a) ≠ s1 (b).
We have s1 (A) = s(h(e(A))) = s(h(W )) ⊆ s(A) = U , and then s1 (A) = U
because of the β-minimality of U . Ap-
plying (Z1) to the pair (s1 (a), s1 (b)) gives us a polynomial t ∈ P 1 (A) with
t(A) = W and ts1 (a) ≠ ts1 (b). By the β-minimality of W again, we have
ts1 (W ) = W . Now we use s1 and t to construct the required polynomial iso-
morphism between U and W . The mapping s1 t|U satisfies s1 (t(U )) ⊆ U and
because of the minimality of U we have s1 (t(U )) = U . This shows that s1 t|U
is a permutation on the finite set U . Therefore there is a k ∈ ℕ with
((s1 t)|U )k = idU . Taking f := t and g := (s1 t)k−1 s1 , we have f (U ) = t(U ) =
W , g(W ) = (s1 t)k−1 s1 (W ) = (s1 t)k−1 (U ) = U and gf |U = (s1 t)k−1 s1 t|U =
idU .
Since every w ∈ W can be written as f (u) for some u ∈ U , and

f (g(w)) = f (g(f (u))) = f (gf (u)) = f (u) = w,

we see that f g|W = idW . This means that f |U is a polynomial isomorphism


of U onto W (and g|W is the isomorphism which is inverse to f |U ).

(ii) Let U be a β-minimal set. With the help of the polynomial isomorphisms
f |U and g|W from the proof of (i), the properties (Z1) and (Z2) are satisfied

for U . For (Z2) we use the additional fact that f (β|U ) = β|W , which follows
from Lemma 11.2.6. The conditions (i), (ii) and (iii) of Definition 11.2.3 then
follow by Theorem 11.2.4.

(iii) Since f (β|U ) ⊈ ∆A , and since by (i) of this proof all U ∈ M inA (β)
have the same cardinality, we get f (U ) ∈ M inA (β). For the polynomial iso-
morphism, we have to show the existence of a polynomial g ∈ P 1 (A) with
g(f (U )) = U , gf |U = idU , and f g|f (U ) = idf (U ) . Assume that (a, b) ∈ β|U
with f (a) ≠ f (b). Then by condition (Z1) there is an element h ∈ P 1 (A)
with h(A) = U and h(f (a)) ≠ h(f (b)). This gives h(f (U )) = U . Then as in
the proof of part (i) there is a k ∈ ℕ with ((hf )|U )k = idU and the mapping
g := (hf )k−1 h has the desired property.

Now we look for properties of the congruence lattice of an algebra A which


force its congruences to be tame. The lattice property involved is called
tightness.

Definition 11.2.8 Let L be a lattice with least element 0 and greatest
element 1. A homomorphism ϕ : L → L′ of L into a lattice L′ is called
0-separating if

ϕ−1 (ϕ(0)) = {0},

and is called 1-separating if

ϕ−1 (ϕ(1)) = {1}.

A homomorphism which is both 0- and 1-separating is called 0-1-separating.

A lattice L is called 0-1-simple if |L| > 1 and every non-constant homomor-
phism ϕ : L → L′ is 0-1-separating. A lattice L is called tight if L is finite
with |L| > 1, is 0-1-simple, and has no non-constant strongly extensive
meet-endomorphisms.
We know from Chapter 3 that any congruence relation of an algebra (and in
particular of a lattice) occurs as a kernel of a homomorphism. The condition
for a homomorphism ϕ on a lattice L with 0 and 1 to be 0-separating can
be expressed as the requirement that the congruence class of the element
0 under the congruence ker ϕ is a singleton set, {0}, and similarly for 1-
separating. This means that a lattice L with 0 and 1 is 0-1-simple iff all
congruence relations different from L × L have the singleton sets {0} and
{1} as congruence classes (blocks). In particular, it follows that all simple

lattices are 0-1-simple. By Lemma 11.1.14, all finite 0-1-simple lattices whose
coatoms have meets equal to zero are tight.

Example 11.2.9 1. The lattices Mn , for n ≥ 3, consisting of a least element
0, a greatest element 1, and n pairwise incomparable atoms, are tight.

2. If A is a finite set then the lattice Eq(A) of all equivalence relations on A


is tight.

3. The congruence lattice of any finite vector space is tight.
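The characterization of 0-1-simplicity through congruence blocks, noted after Definition 11.2.8, can be tested mechanically for small lattices. The following sketch (our own illustration) enumerates all lattice congruences of M3, which is isomorphic to Eq(A) for a three-element set A, and confirms that M3 is simple and hence 0-1-simple:

```python
from itertools import product

# M3 = {0, a, b, c, 1}, isomorphic to Eq(A) for a three-element set A.
L = ['0', 'a', 'b', 'c', '1']

def meet(x, y):
    if x == y: return x
    if '0' in (x, y): return '0'
    if x == '1': return y
    if y == '1': return x
    return '0'

def join(x, y):
    if x == y: return x
    if '1' in (x, y): return '1'
    if x == '0': return y
    if y == '0': return x
    return '1'

# All partitions of L, obtained from block labelings.
partitions = {frozenset(frozenset(x for x in L if lab[x] == v)
                        for v in set(lab.values()))
              for labels in product(range(5), repeat=5)
              for lab in [dict(zip(L, labels))]}

def is_congruence(part):
    cls = {x: B for B in part for x in B}
    return all(cls[meet(x, z)] == cls[meet(y, z)] and
               cls[join(x, z)] == cls[join(y, z)]
               for B in part for x in B for y in B for z in L)

congs = [p for p in partitions if is_congruence(p)]

# Every congruence other than L x L has {0} and {1} as singleton blocks.
zero_one_simple = all(frozenset({'0'}) in p and frozenset({'1'}) in p
                      for p in congs if len(p) > 1)
```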

Theorem 11.2.10 Let A be a finite algebra, and let β ∈ ConA with β ≠
∆A . If the lattice [∆A , β] is tight, then β is tame.
Proof: We consider the lattice L := [∆A , β], with ∆A as its 0-element and
β as its 1-element. By assumption L has no non-constant strongly extensive
meet-endomorphisms, and every non-constant homomorphism of L
is 0-1-separating. Then by Theorem 11.1.13, any β-minimal set U of A has
the form U = e(A) for an idempotent e ∈ E(A). Now we use the mapping
ϕU : ConA → Con(A|U ), defined by θ 7→ θ|U , from Theorem 11.1.2. The in-
clusions ∆A ⊂ θ and θ ⊂ β imply ∆U = ∆A |U ⊆ θ|U and θ|U ⊆ β|U , respec-
tively. Since U is β-minimal we have β|U ⊃ ∆U , so the restriction of ϕU to
[∆A , β] is a non-constant homomorphism and hence 0-1-separating: only ∆A
is mapped to ∆U and only β is mapped to β|U . Thus ∆U ≠ θ|U and
θ|U ≠ β|U , so β fits the conditions of Definition 11.2.3, and is tame.

Example 11.2.11 Consider again the algebra (Z6 ; +, −, 0) from Example
11.1.11. The two congruence relations

α = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (0, 3), (3, 0), (1, 4), (4, 1), (2, 5),
(5, 2)}

and

β = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (0, 2), (2, 0), (0, 4), (4, 0), (2, 4),
(4, 2), (1, 3), (3, 1), (1, 5), (5, 1), (3, 5), (5, 3)}

are tame by Theorem 11.2.10, since both intervals [∆Z6 , α] and [∆Z6 , β] are
two-element lattices and hence tight. But the algebra Z6 itself is not tame:
condition (ii) of Definition 11.2.1 fails for every minimal set. For instance,
for the minimal set U = {0, 3} we have β ⊃ ∆Z6 but β|U = ∆U , and for
U = {0, 2, 4} we have α ⊃ ∆Z6 but α|U = ∆U .
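The failure of tameness for Z6 can also be confirmed by brute force. The sketch below (plain Python; it again takes the unary polynomials of Z6 to be the maps x ↦ ax + b) checks that for every minimal set U, some congruence strictly above the diagonal restricts to the diagonal on U:

```python
from itertools import product

Z6 = range(6)
polys = [{x: (a * x + b) % 6 for x in Z6} for a, b in product(Z6, Z6)]

# Minimal sets of Z6: inclusion-minimal images of non-constant
# unary polynomials.
images = {frozenset(p.values()) for p in polys if len(set(p.values())) > 1}
Min = {U for U in images if not any(V < U for V in images)}

alpha = {(x, y) for x, y in product(Z6, Z6) if (x - y) % 3 == 0}
beta = {(x, y) for x, y in product(Z6, Z6) if (x - y) % 2 == 0}

def restricts_to_diagonal(theta, U):
    # Is theta restricted to U just the diagonal of U?
    return all(x == y for (x, y) in theta if x in U and y in U)

# For every minimal set U, one of alpha, beta lies strictly above the
# diagonal yet restricts to the diagonal on U, so condition (ii) of
# Definition 11.2.1 can never be met.
condition_ii_fails = all(restricts_to_diagonal(alpha, U) or
                         restricts_to_diagonal(beta, U)
                         for U in Min)
```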

All two-element lattices are tight. For any algebra A, we can look at what are
called prime intervals [α, β] ⊆ ConA, intervals which contain only the two
congruences α and β. Since a prime interval is a two-element lattice, it is
tight, and so for the minimal sets which correspond to prime intervals the
conditions (Z1) and (Z2) are satisfied. This means that when A is finite we
can apply this analysis to all the prime intervals of ConA. In this way tame
congruence theory gives a structure theory for all finite algebras.

11.3 Permutation Algebras


Definition 11.3.1 An algebra with the property that all its non-constant
unary polynomial operations are permutations on its universe set is called a
permutation algebra.
As we saw in Theorem 11.1.6, for any finite algebra A the minimal algebras
of A are permutation algebras. Our goal in this section is to classify all finite
permutation algebras, which will allow us in the next section to classify
all minimal algebras. For two-element permutation algebras, we are able
to use our classification results from Chapters 9 and 10. We begin now by
considering finite permutation algebras of cardinality at least three.

Definition 11.3.2 Let f : An → A be an n-ary operation on A, and
let 1 ≤ i ≤ n. Then f depends on the i-th variable if there are ele-
ments a1 , . . . , ai−1 , ai+1 , . . . , an such that the unary operation defined by
x 7→ f (a1 , . . . , ai−1 , x, ai+1 , . . . , an ) is not constant. A binary operation is
said to be essentially binary if it depends on both of its variables, and an
operation is essentially at least binary if it depends on at least two of its
variables.
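Dependence on a variable is directly testable for operations on a finite set. A small sketch (our own illustration; the operation on Z3 is chosen only as an example):

```python
from itertools import product

def depends_on(f, arity, domain, i):
    """Does the `arity`-ary operation f (a Python function on `domain`)
    depend on its i-th variable (1-based)?"""
    for args in product(domain, repeat=arity):
        fixed = list(args)
        values = set()
        for x in domain:
            fixed[i - 1] = x            # vary only the i-th argument
            values.add(f(*fixed))
        if len(values) > 1:             # a non-constant unary slice exists
            return True
    return False

# Example: on Z3, f(x, y, z) = x + y depends on its first two variables only.
Z3 = range(3)
f = lambda x, y, z: (x + y) % 3
deps = [depends_on(f, 3, Z3, i) for i in (1, 2, 3)]
```

Since f depends on two of its three variables, it is essentially at least binary.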

Theorem 11.3.3 Let A be a finite permutation algebra, with |A| ≥ 3. If


A has a polynomial operation which depends on at least two variables, then
A is polynomially equivalent to a vector space; that is, there is a field K
and a vector space structure (A; +, −, 0, K) on A such that the polynomial
operations of A are exactly the operations of the form
(x1 , . . . , xn ) 7→ a + k1 x1 + · · · + kn xn

for some n ∈ ℕ, a ∈ A, and k1 , . . . , kn ∈ K.

Before we begin the proof of this theorem we need the following fact:

Lemma 11.3.4 Let A be an algebra with a polynomial operation which de-


pends on at least two variables. Then A also has a polynomial operation
which is essentially binary.

Proof: If A has a binary polynomial operation which depends on both of
its variables, there is nothing to prove. So let f be a polynomial operation
of arity n ≥ 3 which depends on at least two variables; after renumbering
the variables we may assume that these include x1 and x2 . Then there are
elements a2 , . . . , an , and b1 , b3 , . . . , bn in
A such that f (x1 , a2 , . . . , an ) depends on x1 and f (b1 , x2 , b3 , . . . , bn ) depends
on x2 . If f (x1 , x2 , a3 , . . . , an ) depends on x2 , we can use it as our essentially
binary polynomial operation. Similarly, if f (x1 , a2 , x3 , . . . , an ) depends on x3
we can use it.

Otherwise we replace a3 by b3 , and we have f (x1 , a2 , b3 , a4 , . . . , an ) de-


pending on x1 . We continue in this way. If none of the operations
f (x1 , a2 , b3 , . . . , bi−1 , xi , ai+1 , . . . , an ) depends on xi , then f (x1 , a2 , b3 , . . . , bn )
depends on x1 and because of the choice of b3 , . . . , bn we can take
f (x1 , x2 , b3 , . . . , bn ) to be our essentially binary polynomial.

Proof of Theorem 11.3.3: Our construction of a vector space structure


on A will proceed via several steps.

Step 1. Every essentially binary polynomial operation of A is the operation


of a quasigroup.

Proof: We recall from Example 1.2.7 that a quasigroup (Q; +) is a groupoid


with the property that for any a ∈ Q, both left and right addition with a
form permutations on Q; that is, the maps x 7→ a + x and x 7→ x + a are
permutations on Q.

Quasigroups can also be defined as algebras (Q; +, /, \) of type (2, 2, 2) which


satisfy the following identities:

(Q1) x \ (x + y) ≈ y
(Q2) (x + y)/y ≈ x
(Q3) x + (x \ y) ≈ y
(Q4) (x/y) + y ≈ x.

We let x + y be any essentially binary polynomial operation of A. We will


show that for any a ∈ A, the left addition mapping La (x) taking any x to
a + x is a permutation. It can be shown similarly that the right addition
mapping Ra (x) := x + a is a permutation, and therefore (A; +) is a quasi-
group.

Suppose that there is an a ∈ A for which the mapping La is not a permuta-


tion. Since La is a unary polynomial operation and since by assumption A is
a permutation algebra, the operation La must be constant. Hence there is an
element s ∈ A such that La (x) = s for all x ∈ A. Since the operation x + y
depends on the second variable, there is an element b ∈ A such that Lb is a
permutation. In particular Lb is surjective, so there is an element c ∈ A with
Lb (c) = s. Now we have a + c = s = Lb (c) = b + c, making Rc (a) = Rc (b).
This means that the unary polynomial Rc cannot be a permutation; so by
our assumption on A it must be constant. Since one of its values is s, from
Rc (a) = a + c = s, all of its values equal s, and Rc (x) = s for all x ∈ A.

Now we claim that this forces all the polynomials La′ , for a′ ≠ a, to be
non-constant. For suppose that there was some a′ ≠ a for which La′ was
constant. Then La′ (x) = La′ (c) = a′ + c = s, so that La′ always gives the
value s. Thus La′ = La , and for every x ∈ A we have a′ + x = a + x. But
then for every x ∈ A we have Rx (a′ ) = a′ + x = a + x = Rx (a). Then Rx
is not injective, hence not a permutation, hence must be constant. Now we
have all right-additions Rx constant, contradicting the fact that our opera-
tion x + y is essentially binary. Therefore we see that every La′ , for a′ ≠ a,
is non-constant.

Now we choose an element t ∈ A \ {s} and define a mapping f on A by
f (x) := (Lx )m! (t), where m := |A| and (Lx )m! denotes the m!-fold compo-
sition Lx ◦ Lx ◦ · · · ◦ Lx ; thus f is a unary polynomial operation of A. For
every x ≠ a the map Lx is a non-constant unary polynomial, hence a per-
mutation, and its order divides the order m! of the permutation group on A.
This means that f (x) = (Lx )m! (t) = t for x ≠ a, while f (a) = (La )m! (t) = s.
Since |A| ≥ 3, the unary polynomial f is neither a permutation nor constant,
which is a contradiction.

Step 2. There is a loop operation in P (A).

Proof: We recall that a loop is an algebra (L; +, 0) of type (2, 0) such that
(L; +) is a quasigroup and 0 is a neutral element with respect to the operation

+. We know from Step 1 that the essentially binary polynomial operation


x + y of A is a quasigroup operation. From this we can define two new
binary operations, right subtraction and left subtraction: x/y := (Ry )−1 (x)
and x \ y := (Lx )−1 (y). Taking m = |A| we have (Ry )−1 = (Ry )m!−1 , and
thus x/y = (Ry )m!−1 (x), so that / is a polynomial operation of A. Similarly,
it can be shown that \ is also a polynomial operation of A.

Any quasigroup can be made into a loop: we select an element 0 ∈ A and
then define a new addition operation + in terms of the old one, by x + y :=
(x/(0 \ 0)) + (0 \ y), where the operations on the right-hand side are the
old ones. It is easy to verify that 0 + x = x + 0 = x, making 0 a neutral
element for our operation.
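This passage from a quasigroup to a loop can be replayed on a small example (our own; the quasigroup operation x ∗ y = 2x + 3y on Z5 is chosen purely for illustration):

```python
# On Z5, x * y = (2x + 3y) mod 5 is a quasigroup operation: both left
# and right additions are permutations, since 2 and 3 are invertible mod 5.
Q = range(5)
star = lambda x, y: (2 * x + 3 * y) % 5

# Right and left division, found by search: x/y is the unique z with
# z * y = x, and x \ y is the unique w with x * w = y.
def rdiv(x, y):
    return next(z for z in Q if star(z, y) == x)

def ldiv(x, y):
    return next(w for w in Q if star(x, w) == y)

# The new addition of Step 2, with 0 as the selected element:
plus = lambda x, y: star(rdiv(x, ldiv(0, 0)), ldiv(0, y))

zero_is_neutral = all(plus(0, x) == x and plus(x, 0) == x for x in Q)
```

For this particular quasigroup the construction even recovers ordinary addition mod 5, but only the neutrality of 0 is guaranteed in general.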
Step 3. Let G be the set of all non-constant unary polynomial operations
of A. Then two different polynomials f and g in G agree on at most one
element a ∈ A.

Proof: Suppose that for some f and g in G we have f (a) = g(a) and
f (b) = g(b) with a ≠ b. Let h be the unary polynomial on A defined by
h(x) := f (x)/g(x), where / is the right subtraction belonging to the loop
operation +, as in Step 2. Then h(a) = f (a)/g(a) = 0 = h(b) = f (b)/g(b),
so h is not a permutation and must be constant, with value 0. But this
means f (x) = g(x) for all x ∈ A, and f = g.
Step 4. (A; +) is an abelian group.

Proof: It suffices to show that + is commutative and associative. For


commutativity we start with Ra (0) = 0 + a = a = a + 0 = La (0) and
Ra (a) = a + a = La (a), for every a ∈ A. From Step 1 we know that each
La is a non-constant polynomial, as is each Ra . For a ≠ 0, the polynomials
Ra and La agree on the two distinct elements 0 and a, so Step 3 tells
us that Ra = La ; that is, a + x = x + a for all a ≠ 0. For a = 0 we have
0 + x = x + 0 from Step 2.

A similar argument can be used for associativity. For all a, b ∈ A we have


La (Lb (0)) = a+(b+0) = a+b = La+b (0). Because of the commutativity of +
we have also La (Lb (a)) = a+(b+a) = (b+a)+a = (a+b)+a = La+b (a). Thus
for a ≠ 0, the polynomials La ◦ Lb and La+b agree on at least two elements,
0 and a. By Step 3, they must be equal, so that a + (b + x) = (a + b) + x for
all a, b, x ∈ A. For a = 0 this equation holds trivially.
Now we have our abelian group (A; +, −, 0). To make a vector space struc-

ture, we use the fact that the set of endomorphisms of an abelian group
forms a ring. The multiplication operation in this ring is composition of
mappings, while addition of operations is defined pointwise, by (f1 + f2 )(a)
= f1 (a) + f2 (a) for all a ∈ A. Let K := {k ∈ G | k(0) = 0} ∪ {0}, where 0 is
the constant mapping with 0(x) = 0 for all x ∈ A.
Step 5. The set K is the universe of a subring of the ring of all endomor-
phisms of the abelian group (A; +, −, 0). Moreover, since K is finite and all
elements of K \ {0} are permutations, K = (K; +, −, 0, ◦, ⁻¹, idA ) is a field
and (A; +, −, 0, K) is a vector space over K.

Proof: We show first that every k ∈ K \ {0} is an endomorphism of


(A; +, −, 0). Consider the values of k(a − x) and k(a) − k(x) on the two
inputs x = 0 and x = a; we have k(a − 0) = k(a) − k(0) and k(a − a) =
k(0) = 0 = k(a) − k(a). Thus for any a ∈ A, the operations k(a − x) and
k(a) − k(x) agree on x = 0 and x = a. For any a ≠ 0 these are two different
inputs; so by Step 3, k(a − x) = k(a) − k(x). Thus the operations k(y − b)
and k(y) − k(b) agree for all y ≠ 0 and all b ∈ A. Since |A| ≥ 3 they agree
on at least two inputs, so by Step 3 they are equal also for y = 0. But this
means that k is an endomorphism of (A; +, −, 0). It
is clear from the definition of K that for any k1 , k2 ∈ K we also have k1 + k2 ,
−k1 and k1 ◦ k2 ∈ K, showing that (K; +, −, 0, ◦) is a subring of the endo-
morphism ring of (A; +, −, 0). Moreover since K is finite, for any k ∈ K \ {0}
the inverse k⁻¹ is also in K. Therefore K is a field, and (A; +, −, 0, K) is a
vector space over K.
Step 6. The elements of P (A) are exactly the operations of the form

(x1 , . . . , xn ) 7→ a + k1 x1 + . . . + kn xn ,

for some n ∈ ℕ, a ∈ A and k1 , . . . , kn ∈ K.

Proof: By definition every operation of this form belongs to P (A). For the
converse, let f be a polynomial on A of arity n ≥ 0. For n = 0 the claim is
clear, so we assume now that n ≥ 1 and proceed inductively.

Let g be the operation defined by

g(x1 , x2 , . . . , xn ) := f (x1 , x2 , . . . , xn ) − f (0, x2 , . . . , xn ).


Then for all x2 , . . . , xn ∈ A we have g(0, x2 , . . . , xn ) = 0. For the base case
n = 1 we take f (x1 ) = f (0) + g(x1 ), and the claim is proved. Now let n ≥ 2.
274 CHAPTER 11. TAME CONGRUENCE THEORY

For arbitrary elements b3 , . . . , bn we define x ∗ y := g(x, y, b3 , . . . , bn ). Since


0 ∗ y = g(0, y, b3 , . . . , bn ) = 0 for all y ∈ A, the operation ∗ is not the op-
eration of a quasigroup. So by Step 1 the operation ∗ cannot be essentially
binary, and must depend on at most one variable. Suppose that x∗y depends
only on the variable y. Then there are elements a, b1 and b2 such that a ∗ b1
≠ a ∗ b2 . This implies that for at least one of i = 1, 2, we have a ∗ bi ≠ 0 =
0 ∗ bi . But that means that x ∗ y also depends on x. This contradiction shows
that x ∗ y cannot depend on y. The elements b3 , . . . , bn ∈ A were arbitrary;
so we have shown that g does not depend on x2 . In the same way we show
that g does not depend on x3 , . . . , xn . Thus g depends at most on x1 . If
g does not even depend on x1 then we take k1 = 0; otherwise there is an
element k1 ∈ K such that g(x1 , . . . , xn ) = k1 x1 for all x1 , . . . , xn ∈ A. Then
f (x1 , x2 , . . . , xn ) = k1 x1 + f (0, x2 , . . . , xn ). By induction on n we get that f
has the form claimed. This finishes the proof of Theorem 11.3.3.
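The conclusion of Step 6 can be tested mechanically in the simplest one-dimensional situation. The sketch below is our own illustration (not from the text): it computes the closure of the identity map and the constant maps on Z5 under pointwise sum and negation, i.e. the unary polynomial operations of the group (Z5; +, −, 0), and confirms that exactly the 25 affine maps x ↦ a + kx arise.

```python
# Unary polynomial operations of Z5 = ({0,...,4}; +, -, 0): close the identity
# map and the constant maps under pointwise negation and pointwise sum.
p = 5
identity = tuple(range(p))
polys = {identity} | {tuple([c] * p) for c in range(p)}  # id and constants

changed = True
while changed:
    changed = False
    new = set()
    for f in polys:
        new.add(tuple((-v) % p for v in f))                      # pointwise -f
        for g in polys:
            new.add(tuple((f[i] + g[i]) % p for i in range(p)))  # pointwise f+g
    if not new <= polys:
        polys |= new
        changed = True

# the affine maps x -> a + k*x, as predicted by Step 6 in the unary case
affine = {tuple((a + k * x) % p for x in range(p))
          for a in range(p) for k in range(p)}
print(len(polys), polys == affine)  # 25 True
```

Here scalar multiplication by k ∈ GF(5) is itself a sum of k copies of the identity, which is why closing under + and − alone already produces every affine map.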

Theorem 11.3.3 shows that if A is a non-trivial minimal algebra with a


polynomial operation depending on at least two variables but A is not poly-
nomially equivalent to a vector space, then A is a two-element algebra. The
two-element case is easy, since any two-element algebra is clearly a permu-
tation algebra: any unary polynomial which is not constant is a permutation.

In Section 10.2 we determined all two-element algebras, and classified them


into the following four cases, depending on the congruence lattice behaviour
of the variety V (A) generated by A: V (A) can be neither congruence dis-
tributive nor congruence permutable, V (A) can be one of congruence dis-
tributive or congruence permutable but not the other, or V (A) can be both
congruence distributive and congruence permutable.

In addition, these cases were characterized in Section 9.6 by conditions on


the term operations. That is, for a two-element algebra A the variety V (A) is
congruence permutable iff the operation p with p(x, y, z) = x + y + z (where
+ denotes addition modulo 2) is a term operation of A, while V (A) is con-
gruence distributive iff one of the following operations is a term operation
of A: m(x, y, z) = (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), x ∧ (y ∨ z), x ∧ (y ∨ ¬ z) or the
dual operations of these last two. From this characterization we obtain the
following classification (up to isomorphism) of all two-element algebras (or
Boolean clones):

Class 1. O1 , O5 , O8 :
- all two-element algebras with unary term operations;
- polynomially equivalent to O8 = ({0, 1}; c10 , c11 ).

Class 2. L1 , L3 , L4 , L5 :
- all two-element algebras which generate a congruence permutable, but not
congruence distributive variety;
- polynomially equivalent to L1 = ({0, 1}; c10 , c11 , +, ¬).

Class 3. C1 , C3 , C4 , D3 , D1 , F5n and F8n for n ≥ 2, F5∞ , F8∞ :


- all two-element algebras which generate a congruence distributive and con-
gruence permutable, or a 3-distributive but not 2-distributive and not con-
gruence permutable variety;
- polynomially equivalent to C1 = ({0, 1}; c10 , c11 , ∧, ¬).

Class 4. A1 , A3 , A4 , D2 , F6n and F7n for n ≥ 2, F6∞ , F7∞ :


- all two-element algebras which generate a congruence distributive but not
congruence permutable variety;
- polynomially equivalent to A1 = ({0, 1}; c10 , c11 , ∧, ∨).

Class 5. P1 , P3 , P6 , P5 :
- polynomially equivalent to P6 = ({0, 1}; c10 , c11 , ∧).

Taking one representative from each class (up to polynomial equivalence),


we obtain the following sublattice of the lattice of all Boolean clones.

[Hasse diagram: the chain O8 < P6 < A1 < C1 , together with L1 , where
O8 < L1 < C1 .]

We now have, up to polynomial equivalence, five kinds of permutation alge-


bras. This motivates the following definition.

Definition 11.3.5 Let M be a minimal algebra. Then we define the type


of M as follows:

M has type 1 :⇔ M is polynomially equivalent to an algebra (M ; π)
for some π ⊆ SM (unary type);
M has type 2 :⇔ M is polynomially equivalent to a vector space
(vector space type);
M has type 3 :⇔ M is polynomially equivalent to C1 (Boolean type);
M has type 4 :⇔ M is polynomially equivalent to A1 (lattice type);
M has type 5 :⇔ M is polynomially equivalent to a two-element
semilattice (semilattice type).

Using Theorem 11.3.3 and our classification of all two-element algebras we


have the following result.

Theorem 11.3.6 A finite algebra A is minimal iff it is of one of the types


1–5.

11.4 The Types of Minimal Algebras


In the previous section we characterized all minimal algebras, and assigned
each one a type based on which of five representative minimal algebras it is
polynomially equivalent to. We now want to extend our classification of types
to the finite algebras A from which we obtain the minimal algebras A|U , for
U a minimal or β-minimal set. In Theorem 11.2.7 we showed that when β is
a tame congruence relation on a finite algebra A, any two minimal algebras
induced on A are polynomially isomorphic to each other. This means that
in this case all the minimal algebras induced on A have the same type, and
we can use that type to assign a type to the algebra A, as follows.

Definition 11.4.1 Let A be a finite tame algebra. If one (and hence all) of
the minimal algebras of A is of type i, for i ∈ {1, 2, 3, 4, 5}, then A is said to
be of type i, and we write type A = i.

In the remark following Definition 11.2.3 we noted that our definition of a


tame congruence could be extended to intervals in a congruence lattice. For
A a finite algebra and α and β in Con A with α ⊂ β, the interval [α, β] ⊆
Con A is called tame in A if the congruence relation β/α of the quotient
algebra A/α is tame. Then β/α-minimal sets are called [α, β]-minimal, and

the corresponding minimal algebras of the quotient algebra A/α are called
[α, β]-minimal.

Definition 11.4.2 Let C be [θ, Θ]-minimal. A [θ, Θ]-trace is a set N ⊆ C
such that N is a Θ-equivalence class [x]Θ for which [x]Θ ≠ [x]θ . Now let
[α, β] be tame in A. An [α, β]-trace is a set N ⊆ A such that there is some
[α, β]-minimal set U with N ⊆ U and N an [α|U , β|U ]-trace of A|U (so
N = [x]β ∩ U for some x ∈ U with [x]β ∩ U ⊈ [x]α ).

The following theorem was proved by Hobby and McKenzie in [57].

Theorem 11.4.3 Let [α, β] be a tame interval for the algebra A and let N
be an [α, β]-trace. Then (A|N )/(α|N ) is a minimal algebra. Moreover for all
[α, β]-traces N the minimal algebras (A|N )/(α|N ) have the same type 1–5.

This result allows us to extend our definition of type from minimal and tame
algebras to tame intervals in a congruence lattice.

Definition 11.4.4 Let [α, β] be a tame interval of the finite algebra A. The
type of the interval is the type of the minimal algebra (A|N )/(α|N ) for an
arbitrary [α, β]-trace N .

We saw earlier that all two–element lattices are tight. In a congruence lattice
Con A, a two-element interval is called a prime interval. It follows that on a
finite algebra A all prime intervals [α, β] ⊆ Con A are tame. In this case, for
any [α, β]-minimal set N , the induced algebra A|N is minimal with respect
to (α|N , β|N ).

Definition 11.4.5 The type of a prime interval [α, β] is the type of the
(α|N , β|N )-minimal algebra A|N , where N is an [α, β]-minimal set.

Finally, our definition of type can be extended to arbitrary intervals and


varieties. If [γ, λ] is an arbitrary interval in Con A, we define type {γ, λ} :=
{type [α, β] | [α, β] a prime interval with γ ≤ α ≺ β ≤ λ}, and type A =
type {∆A , A × A}. The type of a variety V of algebras is the set of all the
types of finite algebras in the variety:

type {V } = ⋃ {type A | A ∈ V and A is finite}.

We now present some examples. For the first one, we recall that a finite
algebra A = (A; F A ) is called primal if its term clone is all of OA .

Theorem 11.4.6 Every primal algebra has type 3 (Boolean type).


Proof: Let A be primal. We proved in Proposition 10.5.7 that any primal
algebra is simple. By the primality, for any two elements a, b ∈ A there is
a polynomial operation f with f (A) = {a, b}. Therefore every two–element
subset of A is a minimal set. It follows that if U is such a minimal set, the
algebra A|U is two–element. Let us denote the two elements by 0 and 1.
Then there are term operations ∧, ∨ and ¬ on A|U which satisfy

0 ∧ 0 = 0 ∧ 1 = 1 ∧ 0 = 0, 1 ∧ 1 = 1,
0 ∨ 1 = 1 ∨ 0 = 1 ∨ 1 = 1, 0 ∨ 0 = 0,
¬0 = 1, ¬ 1 = 0.
Then the algebra A|U is polynomially equivalent to the two-element Boolean
algebra, which has type 3.

Example 11.4.7 1. Let A be a finite algebra such that clone A = Pol ρ for
some partial order ρ on A which has a least element 0 and a greatest element
1. Let

g(x) = 0 if x = 0, and g(x) = 1 if x ≠ 0.

Then g is monotone, and the image set U = {0, 1} gives a two-element


minimal algebra A|U . Using

ϕ(x, y) = inf(g(x), y) = 0 if x = 0, and y if x ≠ 0,
ϕ′ (x, y) = sup(g(x), y) = 1 if x ≠ 0, and y if x = 0,

we see that A has type 4.

2. It can be shown that, like primal algebras, functionally complete algebras


have type 3.
We conclude this section with some results on the types possible for tame
algebras.

Theorem 11.4.8 Let A be a finite tame algebra.


(i) If Con A has more than two elements, then type A ∈ {1, 2}.

(ii) If there is no homomorphic image of the lattice Con A which is iso-


morphic to a congruence lattice of a vector space with more than one
element, then type A = {1}.
Proof: (i) Suppose that type A ∉ {1, 2}. Then any minimal set U on A
has |U | = 2, and the induced algebra A|U has only two elements and is
simple. We saw in Theorem 11.1.2 that the mapping ϕU (θ) := θ|U is a lattice
homomorphism from Con A onto Con(A|U ). By the simplicity of A|U , for
any congruence θ on A we have θ|U ∈ {∆U , U × U }. But now the definition
of a tame algebra, Definition 11.2.1, shows that Con A can only contain the
two congruences ∆A and A × A.
(ii) Since there are vector spaces with two-element congruence lattices, we
may assume that |Con A| = 2. But then by (i) only type A ∈ {1, 2} is
possible. But the type of A cannot be 2, since for every U ∈ Min(A) the
lattice Con(A|U ) is a homomorphic image of Con A by Theorem 11.1.2 again,
and A|U would be equivalent to a vector space with the congruence lattice
Con(A|U ).

Lemma 11.4.9 Let A be a finite algebra such that Con A is isomorphic to


the lattice Mn for some n ≥ 3. If n − 1 is not a prime power, then A is
tame and type A = 1.
Proof: By Example 11.2.9 we know that the algebra A is tame. Moreover,
for n ≥ 3 the lattices Mn are simple. To see this, let θ ≠ ∆A and (a, b) ∈ θ
with a ≠ b and a ≠ 0. If b ≠ 1 then (a ∧ a, a ∧ b) = (a, 0) ∈ θ, while if b = 1
then (a ∨ a, a ∨ 1) = (a, 1) ∈ θ; in either case it follows that θ = A × A.
Therefore, all homomorphic
images with more than one element are isomorphic to Mn . For every finite
field K, the cardinality |K| is a prime power, and the congruence lattice of
a two-dimensional vector space over K has the form M|K|+1 . By 11.4.8 (ii)
we have type A = 1.
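The simplicity of Mn used in this proof can be verified mechanically for n = 3. The sketch below is our own illustration: it enumerates all 52 partitions of the five-element lattice M3 (elements 0, a, b, c, 1 with pairwise incomparable atoms) and keeps those compatible with meet and join; only the two trivial congruences survive.

```python
from itertools import product

# The lattice M3: 0 < a, b, c < 1, with the atoms a, b, c pairwise incomparable.
E = ['0', 'a', 'b', 'c', '1']

def meet(x, y):
    if x == y: return x
    if '0' in (x, y): return '0'
    if x == '1': return y
    if y == '1': return x
    return '0'            # two distinct atoms meet in 0

def join(x, y):
    if x == y: return x
    if '1' in (x, y): return '1'
    if x == '0': return y
    if y == '0': return x
    return '1'            # two distinct atoms join to 1

def partitions(xs):       # all set partitions of the list xs
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p

def is_congruence(p):     # compatibility of the partition with meet and join
    cls = {x: i for i, blk in enumerate(p) for x in blk}
    return all(cls[op(x, u)] == cls[op(y, v)]
               for op in (meet, join)
               for x in E for y in E if cls[x] == cls[y]
               for u in E for v in E if cls[u] == cls[v])

congs = [p for p in partitions(E) if is_congruence(p)]
print(len(congs))  # 2: only the trivial congruences, so M3 is simple
```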

Now we consider properties of tame algebras of type 1 or type 2.

Definition 11.4.10 An algebra A satisfies the strong term condition if for


all n ∈ IN, all n-ary term operations tA of A and all b, c1 , . . ., cn , d1 , . . .,
dn ∈ A the following implication is satisfied:

tA (c1 , c2 , . . . , cn ) = tA (d1 , d2 , . . . , dn ) ⇒
tA (b, c2 , . . . , cn ) = tA (b, d2 , . . . , dn ).

An algebra satisfying the strong term condition is called strongly abelian.


An algebra A satisfies the term condition if for all n ∈ IN, for all n-ary
term operations tA of A and all a, b, c2 , . . ., cn , d2 , . . ., dn the following
implication holds:

tA (a, c2 , . . . , cn ) = tA (a, d2 , . . . , dn ) ⇒
tA (b, c2 , . . . , cn ) = tA (b, d2 , . . . , dn ).

An algebra which satisfies the term condition is said to be abelian.


It is clear that if an algebra A satisfies the strong term condition then it also
satisfies the term condition, so that strongly abelian algebras are abelian.
The concept of an abelian algebra generalizes the usual concept of an abelian
group. A group G = (G; ·, −1 , e) is called abelian in the group-theoretical
sense if it satisfies the identity x · y ≈ y · x. We show that a group is
abelian in this sense iff it is abelian as an algebra. If G is an abelian algebra
then tG (x, y, z) := yxz satisfies tG (e, e, a) = a = tG (e, a, e) ⇒ tG (b, e, a) =
tG (b, a, e) and thus ba = ab, making G an abelian group. Conversely when
G is an abelian group, the n-ary term operations of G for n ≥ 1 are exactly
those of the form tG (x1 , . . . , xn ) = x1^{k1} · · · xn^{kn} with k1 , . . . , kn ∈ Z, and it is
easy to show that these terms satisfy the term condition.

Example 11.4.11 Consider the binary operations f and g given by the


Cayley tables below.

f | 0 1 2 3          g | 0 1 2 3
0 | 0 0 1 1          0 | 0 0 1 1
1 | 0 0 1 1          1 | 0 0 1 1
2 | 0 0 1 1          2 | 0 0 1 1
3 | 2 2 3 3          3 | 2 2 0 0
It can be verified that f satisfies the implication of the strong term condition
for all evaluations, while g satisfies the implication of the term condition but
not that of the strong term condition.
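The verification for Example 11.4.11 can be automated. The brute-force sketch below (the table encoding is ours) checks the relevant implications for the single operations f and g themselves, over all evaluations:

```python
from itertools import product

# Cayley tables from Example 11.4.11, as value tables op[x][y]
f = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3]]
g = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 0, 0]]
A = range(4)

def satisfies_tc(op):
    # term condition for the binary operation:
    # op(a,c) == op(a,d)  implies  op(b,c) == op(b,d)
    return all(op[b][c] == op[b][d]
               for a, b, c, d in product(A, repeat=4)
               if op[a][c] == op[a][d])

def satisfies_strong_tc(op):
    # strong term condition for the binary operation:
    # op(c1,c2) == op(d1,d2)  implies  op(b,c2) == op(b,d2)
    return all(op[b][c2] == op[b][d2]
               for c1, c2, d1, d2, b in product(A, repeat=5)
               if op[c1][c2] == op[d1][d2])

print(satisfies_strong_tc(f))  # True
print(satisfies_tc(g))         # True
print(satisfies_strong_tc(g))  # False, e.g. g(3,2) = g(0,0) but g(0,2) != g(0,0)
```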
Abelian and strongly abelian algebras will be studied further in the next
chapter. For the moment, we characterize the types of such algebras.

Theorem 11.4.12 Let A be a finite tame algebra. Then



(i) A is strongly abelian iff type A = 1, and

(ii) A is abelian iff type A ∈ {1, 2}.

Corollary 11.4.13 Every finite algebra A whose congruence lattice Con A


is isomorphic to the lattice Mn for some n ≥ 3 is abelian. If in addition
n − 1 is not a prime power, then A is also strongly abelian.

Proof: We have shown that the lattices Mn with n ≥ 3 are tight, and
therefore every finite algebra with Con A ≅ Mn is tame. By Theorem 11.4.8
(i) we have type A ∈ {1, 2}, and then Theorem 11.4.12 (ii) shows that A is
abelian. If n − 1 is not a prime power then by Lemma 11.4.9 and Theorem
11.4.12 A is strongly abelian.

11.5 Mal’cev Conditions and Omitting Types


Hobby and McKenzie proved in [57] that there is a close connection between
the types of finite algebras A and Mal’cev-type conditions for the varieties
V (A) they generate. To present their results in this section, we will use our
Mal’cev condition results from Chapter 9 along with the following new prop-
erties of congruences.

We recall from Section 6.7 that an algebra A is called meet-semidistributive


iff whenever congruences θ1 , θ2 and θ3 in Con A satisfy θ1 ∧ θ2 = θ1 ∧ θ3 ,
they also satisfy θ1 ∧ θ2 = θ1 ∧ (θ2 ∨ θ3 ).
Dually, A is called join-semidistributive iff whenever congruences θ1 , θ2 and
θ3 in Con A satisfy θ1 ∨ θ2 = θ1 ∨ θ3 , they also satisfy θ1 ∨ θ2 = θ1 ∨ (θ2 ∧ θ3 ).

An algebra which is both meet- and join-semidistributive is called semidis-


tributive. A variety is called semidistributive if all the algebras in it are
semidistributive.

Hobby and McKenzie ([57]) proved the following “omitting type” theorem,
which characterizes the types which cannot occur for a variety by means of
the Mal’cev-condition properties of the variety.

Theorem 11.5.1 Let A be a finite algebra, with V (A) the variety generated
by A. Then:

(i) If V (A) is congruence distributive, then type {V (A)} ∩ {1, 2, 5} = ∅;

(ii) If V (A) is congruence permutable, then type {V (A)} ⊆ {2, 3};

(iii) V (A) is n-permutable for some n ≥ 2 iff type {V (A)} ⊆ {2, 3};

(iv) type {V (A)} = {3} iff in V (A) there exist terms


f0 (x, y, z, u), . . . , fn (x, y, z, u), for n ≥ 2, such that the following iden-
tities hold in V (A):
f0 (x, y, y, z) ≈ x,
fi (x, x, y, x) ≈ fi+1 (x, y, y, x), for all i < n,
fi (x, x, y, y) ≈ fi+1 (x, y, y, y), for all i < n,
fn (x, x, y, z) ≈ z;

(v) type {V (A)} ∩ {1, 2} = ∅ iff V (A) is meet-semidistributive iff the class
of all lattices isomorphic to a sublattice of Con B for some B ∈ V (A)
does not contain the lattice M3 .

As an easy corollary we obtain the following result.

Corollary 11.5.2 Let A be a finite algebra. If V (A) is both congruence dis-


tributive and n-permutable for some n ≥ 2 then type {V (A)} = 3.

To formulate the next theorem we need the concept of a 1-snag.

Definition 11.5.3 A 1-snag of an algebra A is a pair (a, b) of distinct el-


ements of A such that for some f ∈ P 2 (A), the equations f (a, b) = f (b, a)
and f (b, b) = b are satisfied. We denote the set of all 1-snags of A by Sn1 (A).
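As a concrete instance (our own illustration), the two-element semilattice ({0, 1}; ∧) does possess 1-snags: the meet operation itself witnesses them. The sketch below generates the binary polynomial operations of the semilattice as 2×2 value tables and then searches for 1-snags.

```python
from itertools import product

A = (0, 1)

# Binary polynomial operations of ({0,1}; ∧), encoded as 2x2 value tables
# f[x][y]: close the two projections and the two constant maps under
# pointwise meet.
fns = {((a, a), (a, a)) for a in A}            # constant operations
fns |= {((0, 0), (1, 1)),                      # first projection  (x, y) -> x
        ((0, 1), (0, 1))}                      # second projection (x, y) -> y
changed = True
while changed:
    changed = False
    for f, g in product(list(fns), repeat=2):
        h = tuple(tuple(f[x][y] & g[x][y] for y in A) for x in A)
        if h not in fns:
            fns.add(h)
            changed = True

# (a, b) is a 1-snag if a != b and some binary polynomial f satisfies
# f(a, b) = f(b, a) and f(b, b) = b; here x ∧ y witnesses both pairs.
snags = {(a, b) for a in A for b in A if a != b
         and any(f[a][b] == f[b][a] and f[b][b] == b for f in fns)}
print(snags)
```

That the semilattice has 1-snags is consistent with its type-5 behaviour, since only type-1 varieties exclude them.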

Hobby and McKenzie used 1-snags to characterize the locally finite varieties
(that is, varieties in which every finitely generated algebra is finite) of type
1.

Lemma 11.5.4 Let V (A) be a locally finite variety. Then type {V (A)} =
{1} iff for every algebra B in V (A) the set Sn1 (B) = ∅.

For algebras which generate congruence modular varieties we have the fol-
lowing useful lemma, due to E.W. Kiss and E. Pröhle ([62]).

Lemma 11.5.5 Let A be a finite algebra. If V (A) is congruence modular


then type {V (A)} = type {Sub(A)}.

We saw in Chapter 9 that for a two-element algebra A, the variety V (A)


is congruence modular iff V (A) is congruence distributive or congruence
permutable. Combining this with Lemma 11.5.5 gives a characterization of
the type of V (A) for any two-element algebra A.

Theorem 11.5.6 Let A be a two-element algebra of type i, for i ∈


{1, 2, 3, 4, 5}. Then type {V (A)} = {i}.

Proof: We consider two cases, depending on whether V (A) is congruence


modular or not.

Case 1: V (A) is congruence modular. In this case we apply Lemma 11.5.5,


and the fact that A has no non-trivial subalgebras and is simple, to conclude
that type {V (A)} = type{Sub(A)} = type{A} = {i}.

Case 2: V (A) is not congruence modular. In this case V (A) is neither con-
gruence distributive nor congruence permutable. From our classification of
two-element algebras in Section 10.3, we know that exactly the following
two-element algebras have this property:
P1 = ({0, 1}; ∧), P3 = ({0, 1}; ∧, c10 ), P5 = ({0, 1}; ∧, c11 ),
P6 = ({0, 1}; ∧, c10 , c11 ), O1 = ({0, 1}; c11 ), O4 = ({0, 1}; ¬),
O8 = ({0, 1}; c10 , c11 ), O9 = ({0, 1}; ¬, c10 )

(and the dually isomorphic algebras).

The algebras O1 , O8 , O9 have type 1. For each of these algebras A there is


no algebra B in V (A) with an essentially binary term operation, so we have
Sn1 (B) = ∅ for any B ∈ V (A). Lemma 11.5.4 then shows that type {V (A)}
= {1} for these algebras A.

The algebras P1 , P3 , P5 and P6 are all the two-element algebras of type


5. The varieties generated by these algebras are the varieties of semilat-
tices, semilattices with zero, semilattices with unit, and bounded semilat-
tices. It was proved by G. Czedli in [14] that these varieties are all con-
gruence meet-semidistributive; so by Theorem 11.5.1 (v) the type of V (A)

cannot contain 1 or 2. If 3 or 4 is in type{V (A)} then there must exist


a finite algebra B ∈ V (A) and a prime interval (δ, θ) ∈ Con B such that
(N ; q|N , p|N ) is a two-element lattice, where N is the uniquely determined
(δ, θ)-trace and q, p are binary polynomial operations of B which satisfy
p(x, 1) = p(1, x) = p(x, x) = x and q(x, 0) = q(0, x) = q(x, x) = x for all x ∈ B
and N = {0, 1} (see [57]). But this contradicts the fact that B is a finite
semilattice (with unit or bounded). Therefore, type {V (A)} = {5}.

For minimal algebras A with |A| ≥ 3 we have the following results.

Lemma 11.5.7 Let A be a minimal algebra with |A| ≥ 3. If type A = 1


then type {V (A)} = {1}.

Proof: If all polynomial operations of A are essentially unary, then any


algebra B ∈ V (A) has the same property. By Lemma 11.5.4 we have type
{V (A)} = {1}.

Theorem 11.5.8 Let M be a minimal algebra. Then we have:

M has type 3 ⇔ V (M) is congruence distributive and there is an


n ≥ 2 such that V (M) is n-permutable;
M has type 4 ⇔ V (M) is congruence distributive, and there is no
n ≥ 2 such that V (M) is n-permutable;
M has type 5 ⇔ V (M) is not congruence distributive and not
n-permutable for any n ≥ 2,
but V (M) is meet-semidistributive;
M has type 2 ⇔ V (M) is not congruence distributive and not
meet-semidistributive, and V (M+ ) is
n-permutable for some n;
M has type 1 ⇔ V (M) is not congruence distributive and not
meet-semidistributive, and V (M+ ) is not
n-permutable for any n ≥ 2.

Proof: If M has type 3, then M is polynomially equivalent to the two-


element Boolean algebra and therefore is one of the following two-element
algebras (up to isomorphism):
C1 , C3 , C4 , D3 , D1 , F5n , F8n , F5∞ , F8∞ (for n ≥ 2).

As we saw in Section 9.6, it was shown by M. Reschke, O. Lüders and K.


Denecke in [100] that these algebras generate congruence distributive and
n-permutable varieties, for n = 2 or n = 3.

If M has type 4, then M is polynomially equivalent to the two-element


lattice, and M is one of the following two-element algebras (up to isomor-
phism):
A1 , A3 , A4 , D2 , F6n , F7n , F6∞ , F7∞ (for n ≥ 2).
We saw in Section 9.6 that these algebras generate varieties which are con-
gruence distributive, but not n-permutable for n ≥ 2.

If M has type 5, then M is (up to isomorphism) one of the two-element


algebras P1 , P3 , P5 , P6 . In this case (see [100]) the variety V (M) is neither
congruence distributive nor congruence permutable, but was shown to be
meet-semidistributive by D. Papert in [85].

If M has type 1, then M has no essentially at least binary term opera-


tions. Since the Mal’cev-type conditions for congruence distributivity and
n-permutability require essentially at least ternary term operations, we see
that V (M) is not congruence distributive and not n-permutable for any
n ≥ 2. Also by Theorem 11.5.1 (v) we see that V (M) cannot be meet-
semidistributive.

If M has type 2, then by Theorem 11.5.1 parts (i) and (v) V (M) cannot
be congruence distributive or meet-semidistributive. Hobby and McKenzie
showed in [57] that M has a Mal’cev operation as a polynomial operation,
so that M+ has a Mal’cev operation as a term operation and thus V (M+ )
is congruence permutable (that is, 2-permutable).

Conversely, if V (M) is congruence distributive and n-permutable for some


n ≥ 2 then by Corollary 11.5.2 type {V (M)} = {3} and type M = 3. If
V (M) is congruence distributive but there is no n ≥ 2 such that V (M)
is n-permutable, then by Theorem 11.5.1 (i) type {V (M)} ∩ {1, 2, 5} = ∅,
so type {V (M)} ⊆ {3, 4}. By the proof of Theorem 11.5.6 and Lemma
11.5.7 type {V (M)} = {3, 4} is impossible for a minimal algebra M. If type
{V (M)} = {3}, then type M = 3 and V (M) would be n-permutable for
some n ≥ 2. Therefore type {V (M)} = {4}.

If V (M) is not congruence distributive, not n-permutable for any n ≥ 2,


but is meet–semidistributive, then type {V (M)} ∩ {1, 2} = ∅, so we have
type {V (M)} ⊆ {3, 4, 5}. Using Theorem 11.5.6 and Lemma 11.5.7 again,
this means that type {V (M)} must be one of {3}, {4} or {5}. The first two
of these are impossible, since otherwise V (M) would be congruence distribu-
tive or n-permutable for some n ≥ 2. Therefore type {V (M)} = {5}.

Next, if V (M) is not congruence distributive and not meet- semidistributive,


then 1 ∈ type{V (M)} or 2 ∈ type{V (M)}. Having V (M+ ) n-permutable
means that type {V (M+ )} ⊆ {2, 3}. Thus type M+ must be 2 or 3. But if
type M+ = 3, then type M = 3 and type {V (M)} = {3}, making V (M)
congruence distributive. We are left with type M+ = type M = 2.

Finally, if V (M) is not congruence distributive and not meet-


semidistributive, and V (M+ ) is not n-permutable for any n ≥ 2, then
type M ∉ {2, 3, 4, 5}, and therefore type M = 1.

11.6 Residually Small Varieties


To illustrate the applications of tame congruence theory, we present in this
section two results on residually small varieties. We recall from Section 6.7
that a variety V is called residually small if there is a cardinal number λ
such that every subdirectly irreducible algebra in V has at most λ elements.

Varieties which are not residually small are called residually large. A variety
is called locally finite if every finitely generated algebra in the variety is finite.

Example 11.6.1 The variety of distributive lattices is generated by the


two-element distributive lattice 2D = ({0, 1}; ∧, ∨). It is well known that every
distributive lattice is isomorphic to a subdirect power of this algebra 2D .
Therefore 2D is the only subdirectly irreducible algebra in V (2D ), and V (2D )
is residually small. Similarly it can be shown that the variety of Boolean
algebras and the variety generated by a primal algebra are residually small.

Hobby and McKenzie proved the following result in [57].

Proposition 11.6.2 Let A be a finite algebra such that Con A has a sublat-

tice of the form given by the Hasse diagram below (the pentagon) in which
β covers α (there are no congruences properly between them), and such that
type [α, β] = 2 and type [∆A , δ] ∈ {3, 4}.
[Hasse diagram: a pentagon with least element ∆A and a top element; one
side is the chain ∆A < α < β with the interval [α, β] labelled 2, the other
side is ∆A < δ with [∆A , δ] labelled 3 or 4.]

Then V (A) is residually large.

Theorem 11.6.3 Every locally finite variety which omits the types 1 and 5,
and is residually small, is congruence-modular.

11.7 Exercises
11.7.1. Let L be the lattice given by the following Hasse diagram:

[Hasse diagram of the six-element lattice L, with least element 0, greatest
element 1, and further elements a, b, c, d, omitted.]

Let µ : L → L be the mapping which maps 0 and a to c and all other ele-
ments to 1. Show that µ is strictly extensive and a meet-endomorphism.

11.7.2. Verify that the lattices from Example 11.2.9 are tight.

11.7.3. Verify that the two lattices shown below are not tight.

[Hasse diagrams of the two lattices omitted.]

11.7.4. Prove that any functionally complete algebra has type 3.

11.7.5. A lattice L is called order polynomially complete iff every monotone


mapping f : Ln → L, for any n ∈ N, is a polynomial operation of L. Prove
that a lattice L is order polynomially complete iff L is both tight and simple.

11.7.6. Let (S, ·) be a finite semigroup. Show that for each element a ∈ S
there is some integer k ≥ 1 such that e = a^k is an idempotent, that is,
a^{2k} = a^k . Moreover, there is an integer k such that a^{2k} = a^k holds for
every a ∈ S.
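Exercise 11.7.6 can be explored by brute force on a concrete finite semigroup; our choice of example is the monoid T3 of all maps on {0, 1, 2} under composition.

```python
from itertools import product

# The full transformation monoid T3: all maps on {0,1,2} under composition.
# We search for one exponent k with a^(2k) = a^k for every element a.
maps = list(product(range(3), repeat=3))       # f encoded as (f(0), f(1), f(2))

def compose(f, g):                             # (f ∘ g)(x) = f(g(x))
    return tuple(f[g[x]] for x in range(3))

def power(f, k):
    r = f
    for _ in range(k - 1):
        r = compose(r, f)
    return r

k = 1
while not all(power(f, 2 * k) == power(f, k) for f in maps):
    k += 1
print(k)  # 6: every f in T3 satisfies f^12 = f^6
```

The answer reflects the structure of T3: every element is eventually periodic with tail length at most 2 and period dividing 6, so k = 6 is the least uniform exponent.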

11.7.7. Prove that every finite quasigroup generates a congruence permutable


variety.
Chapter 12

Term Condition and Commutator

In this chapter we show how concepts such as an abelian group, the commu-
tator of a group and related concepts like solvable groups may be generalized
to arbitrary universal algebras. The commutator of two elements a and b in
a group G is the element [a, b] := a−1 b−1 ab. The commutator group of G is
the normal subgroup of G which is generated by the set {[a, b] | a, b ∈ G}
of all commutators of G. In a sense, the commutator subgroup of a group G
measures how far the group is from being commutative. More generally, if
M and N are two normal subgroups of G then the commutator group of M
and N , written as [M, N ], is the normal subgroup of G generated by the set
{[a, b] | a ∈ M, b ∈ N }. It is well known that [M, N ] is the least normal sub-
group of G with the property that gh = hg for all g ∈ M/[M,N ] , h ∈ N/[M,N ] .
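The group-theoretic commutator can be computed mechanically. The sketch below is our own illustration: it takes G = S3 with M = N = G, forms all commutators [a, b] = a^{-1} b^{-1} a b, and closes them under products and conjugation, recovering the alternating group A3 as the commutator subgroup.

```python
from itertools import permutations, product

# Commutator subgroup of G = S3, computed from the definition: the normal
# subgroup generated by all commutators [a, b] = a^(-1) b^(-1) a b.
G = list(permutations(range(3)))               # permutations as tuples

def comp(p, q):                                # (p q)(x) = p(q(x))
    return tuple(p[q[x]] for x in range(3))

def inv(p):                                    # inverse permutation
    q = [0, 0, 0]
    for x in range(3):
        q[p[x]] = x
    return tuple(q)

comms = {comp(comp(inv(a), inv(b)), comp(a, b)) for a, b in product(G, G)}

# close under products and conjugation to get the generated normal subgroup
H = set(comms)
changed = True
while changed:
    changed = False
    for h1, h2 in product(list(H), repeat=2):
        if comp(h1, h2) not in H:
            H.add(comp(h1, h2)); changed = True
    for g, h in product(G, list(H)):
        c = comp(comp(g, h), inv(g))
        if c not in H:
            H.add(c); changed = True

print(sorted(H))  # [(0, 1, 2), (1, 2, 0), (2, 0, 1)] -- the 3-cycles, i.e. A3
```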

12.1 The Term Condition


We begin by generalizing the concept of an abelian group to arbitrary al-
gebras. If G is an abelian group then the n-ary term operations of G are
exactly the operations induced by the terms t(x1 , . . . , xn ) = x1^{k1} · · · xn^{kn} , for
k1 , . . . , kn ∈ Z. These terms are normal forms for arbitrary n-ary terms over
the variety of all abelian groups, meaning that any n-ary term is equivalent to
one of these terms modulo IdV , where V is the variety of all abelian groups.
We saw in Section 11.4 that a group G is abelian iff its term operations
satisfy the so-called term condition (TC) of Definition 11.4.10: for any n-ary
term t (with induced term operation tG ) and any elements a, b, c2 , . . . , cn ,


d2 , . . . , dn ∈ G

(TC) tG (a, c2 , . . . , cn ) = tG (a, d2 , . . . , dn )


⇒ tG (b, c2 , . . . , cn ) = tG (b, d2 , . . . , dn ).

This gives us an equivalent characterization of the abelian property for


groups, by means of term operations, and suggests how to generalize the
concept of abelian to arbitrary algebras.

Definition 12.1.1 An algebra A satisfies the term condition (TC) if for


all n ∈ IN\{0}, for all n-ary term operations tA of A and for all elements
a, b, c2 , . . . , cn , d2 , . . . , dn ∈ A the following implication is satisfied:

(TC) tA (a, c2 , . . . , cn ) = tA (a, d2 , . . . , dn )


⇒ tA (b, c2 , . . . , cn ) = tA (b, d2 , . . . , dn ).

An algebra A is called abelian if it satisfies the term condition (TC).

Example 12.1.2 1. Every abelian group is an abelian algebra.

2. Zero-semigroups, right-zero semigroups and left-zero-semigroups are semi-


groups satisfying, respectively, the following identities: x1 x2 ≈ x3 x4 , x1 x2 ≈
x2 and x1 x2 ≈ x1 . (Here we follow the convention of writing the semigroup
operation as juxtaposition.) These are all abelian algebras. (See Exercise
12.3.1.)

A rectangular band is a semigroup which satisfies the identities x1 x2 x3 ≈


x1 x3 and x21 ≈ x1 . Since in a rectangular band any at least binary term can
be written as a product of the first and last variable used in it, it is easy to
show that any rectangular band is an abelian algebra: we have ac2 · · · cn =
ad2 · · · dn ⇒ acn = adn ⇒ bacn = badn ⇒ bcn = bdn ⇒ bc2 · · · cn = bd2 · · · dn .
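For the 2×2 rectangular band this computation can be checked exhaustively in the binary case. The sketch below (our encoding; note it tests only the binary term operations, not all arities) generates the binary term operations by closing the projections under the multiplication and verifies the term condition for each.

```python
from itertools import product

# The 2x2 rectangular band: elements are pairs (a1, a2), with multiplication
# (a1, a2) * (b1, b2) = (a1, b2).
B = list(product((0, 1), repeat=2))

def mult(x, y):
    return (x[0], y[1])

# binary term operations, encoded as value tables t[(x, y)]: close the two
# projections under pointwise multiplication
proj1 = {(x, y): x for x, y in product(B, B)}
proj2 = {(x, y): y for x, y in product(B, B)}
terms = [proj1, proj2]
changed = True
while changed:
    changed = False
    for f, g in product(list(terms), repeat=2):
        h = {(x, y): mult(f[(x, y)], g[(x, y)]) for x, y in product(B, B)}
        if h not in terms:
            terms.append(h)
            changed = True

# the term condition (TC), checked for every binary term operation
tc = all(f[(b, c)] == f[(b, d)]
         for f in terms
         for a, b, c, d in product(B, repeat=4)
         if f[(a, c)] == f[(a, d)])
print(len(terms), tc)  # 4 True: the terms x, y, x*y, y*x all satisfy TC
```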

3. Every algebra which has only unary fundamental operations is abelian,


since then every term operation is also unary, and the term condition TC is
always satisfied.

4. Every subalgebra of an abelian algebra is abelian and every direct product


of abelian algebras is abelian. (See Exercise 12.3.2.)

The next theorem shows that the property of being abelian can be charac-
terized by properties of the congruence lattice of the algebra A2 .

Theorem 12.1.3 An algebra A is abelian if and only if there is a congru-


ence relation θ on the algebra A2 such that the diagonal ∆A = {(a, a) | a ∈
A} is a congruence class of θ.

Proof: We show that ∆A is a class or block of the congruence ⟨∆A ⟩ConA2
which is generated by ∆A . We must show that for arbitrary elements a, b ∈ A
we have ((a, a), (b, b)) ∈ ⟨∆A ⟩ConA2 , while for any b ≠ c in A, with u = (b, c),
we have ((a, a), u) ∉ ⟨∆A ⟩ConA2 . We use the characterization from Lemma
5.3.3 of the congruence generated by ∆A on A2 . Thus we have to show that
for all a, b ∈ A and all unary polynomial operations pA×A of A × A, the
implication

pA×A ((a, a)) ∈ ∆A ⇒ pA×A ((b, b)) ∈ ∆A (*)

is satisfied. Every unary polynomial operation pA×A of A × A arises from
some n-ary term operation tA×A of A × A, by substitution of constants
(c2 , d2 ), . . . , (cn , dn ) from A × A for n − 1 of the variables. Therefore we have:

pA×A ((a, a)) ∈ ∆A
⇒ (tA (a, c2 , . . . , cn ), tA (a, d2 , . . . , dn )) ∈ ∆A
⇒ tA (a, c2 , . . . , cn ) = tA (a, d2 , . . . , dn )
⇒ tA (b, c2 , . . . , cn ) = tA (b, d2 , . . . , dn )
⇒ pA×A ((b, b)) = (tA (b, c2 , . . . , cn ), tA (b, d2 , . . . , dn )) ∈ ∆A ,

using the term condition (TC).

Since in this sense the condition (*) is equivalent to the term condition,
and (*) is also equivalent to the fact that ∆A is a block of the congruence
⟨∆A ⟩ConA2 , we have proved our proposition.
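Theorem 12.1.3 can be tested on small algebras. The sketch below is our own illustration, using a naive union-find fixpoint to generate congruences: for the abelian group Z2 the diagonal is a class of the congruence of A² it generates, while for the (non-abelian) two-element semilattice it is not.

```python
from itertools import product

def congruence_generated(universe, ops, pairs):
    """Smallest congruence on (universe; ops) containing the given pairs,
    computed by a naive union-find fixpoint; ops is a list of binary maps."""
    parent = {x: x for x in universe}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        parent[rx] = ry
        return True
    for a, b in pairs:
        union(a, b)
    changed = True
    while changed:                       # close under the operations
        changed = False
        for f in ops:
            for x, y, u, v in product(universe, repeat=4):
                if find(x) == find(y) and find(u) == find(v):
                    if union(f(x, u), f(y, v)):
                        changed = True
    return find

def diagonal_is_block(A, op):
    """Is the diagonal a class of the congruence of A^2 that it generates?"""
    sq = [(a, b) for a in A for b in A]
    op2 = lambda x, y: (op(x[0], y[0]), op(x[1], y[1]))  # A^2 componentwise
    diag = [(a, a) for a in A]
    find = congruence_generated(sq, [op2], [(diag[0], d) for d in diag])
    cls = {x for x in sq if find(x) == find(diag[0])}
    return cls == set(diag)

Z2_add = lambda x, y: (x + y) % 2
meet = lambda x, y: x & y
print(diagonal_is_block([0, 1], Z2_add))  # True:  Z2 is abelian
print(diagonal_is_block([0, 1], meet))    # False: the semilattice is not
```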

The basic term condition (TC) can be generalized to a term condition for
congruences, which we shall denote by (TCC).

Definition 12.1.4 Let A be an algebra and let θ1 and θ2 be congruences on


A. The algebra A satisfies the term condition (TCC) for congruences with
respect to the pair (θ1 , θ2 ) if for all n ∈ N\{0}, for all n-ary term operations

tA of A and for all (a, b) ∈ θ2 , (c2 , d2 ) ∈ θ1 , . . . , (cn , dn ) ∈ θ1 the implication

tA(a, c2, . . . , cn) = tA(a, d2, . . . , dn) ⇒
tA(b, c2, . . . , cn) = tA(b, d2, . . . , dn) (TCC)

is satisfied.
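The quantification over all term operations makes (TCC) impossible to check exhaustively in general. For a finite algebra one can at least search for counterexamples by enumerating term operations up to a bounded arity and composition depth. The following Python sketch does this for an algebra given by a single binary operation table; the tables z4 and meet below are illustrative choices of our own, not from the text, and a result of False only means that no counterexample was found within the chosen bounds.

```python
from itertools import product

def term_ops(universe, op, arity, depth):
    """All value tables (tuples indexed by argument tuples) of terms in
    `arity` variables built from the binary operation `op`, using at most
    `depth` rounds of pairwise composition."""
    args = list(product(universe, repeat=arity))
    ops = {tuple(a[i] for a in args) for i in range(arity)}  # the projections
    for _ in range(depth):
        ops |= {tuple(op[x][y] for x, y in zip(f, g))
                for f in ops for g in ops}
    return [dict(zip(args, f)) for f in ops]

def refutes_tcc(universe, op, theta1, theta2, max_arity=3, depth=3):
    """True if a counterexample to the (TCC) implication is found among the
    enumerated terms; False only means none was found within the bounds."""
    for n in range(2, max_arity + 1):
        for t in term_ops(universe, op, n, depth):
            for a, b in theta2:
                for pairs in product(theta1, repeat=n - 1):
                    cs = [c for c, _ in pairs]
                    ds = [d for _, d in pairs]
                    if (t[tuple([a] + cs)] == t[tuple([a] + ds)]
                            and t[tuple([b] + cs)] != t[tuple([b] + ds)]):
                        return True
    return False

z4 = [[(i + j) % 4 for j in range(4)] for i in range(4)]  # (Z4, +): abelian
meet = [[0, 0], [0, 1]]            # two-element semilattice: not abelian
nabla4 = [(i, j) for i in range(4) for j in range(4)]
nabla2 = [(i, j) for i in range(2) for j in range(2)]

assert refutes_tcc(range(4), z4, nabla4, nabla4) == False
assert refutes_tcc(range(2), meet, nabla2, nabla2, max_arity=2, depth=1) == True
```

For the semilattice, the basic operation x ∧ y itself refutes the term condition: 0 ∧ 0 = 0 ∧ 1 but 1 ∧ 0 ≠ 1 ∧ 1.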

This generalized term condition (TCC) also has an interpretation in group


theory. There is a well known one-to-one correspondence between normal
subgroups of a group G and congruence relations of the group: if N is a
normal subgroup of G, the relation θN defined by

(a, b) ∈ θN ⇔ ab⁻¹ ∈ N ⇔ ∃h ∈ N (a = hb)


is a congruence. Conversely, if θ is a congruence on a group G, the congruence
class of the identity element of G forms a normal subgroup of G. From this
correspondence we have the following equivalence.

Lemma 12.1.5 The following two statements are equivalent for normal sub-
groups M and N of a group G:

(i) ab = ba for all a ∈ M and b ∈ N ,


(ii) G satisfies the term condition (TCC) with respect to (θM , θN ).

Proof: (ii) ⇒ (i): Suppose that (ii) is satisfied, and let a ∈ M and b ∈ N .
Consider the term operation tG (x, y, z) := yxz over G. Since (e, b) ∈ θN and
(e, a), (a, e) ∈ θM , for e the identity element of the group, we have tG (e, e, a)
= a = tG (e, a, e); using the term condition (TCC) on this gives tG (b, e, a) =
tG (b, a, e), meaning ab = ba.

(i) ⇒ (ii): As we have seen, in a group G the n-ary term operations t,
for n ≥ 1, have the form t(x1, . . . , xn) = x_{i1}^{k1} · · · x_{im}^{km}, for some i1, . . . , im ∈
{1, . . . , n} and k1, . . . , km ∈ Z. For (a, b) ∈ θN, (c2, d2), . . . , (cn, dn) ∈ θM
and t(a, c2 , . . . , cn ) = t(a, d2 , . . . , dn ), we have to show that we can replace
every occurrence of a in t(a, c2 , . . . , cn ) and in t(a, d2 , . . . , dn ) by b, and still
have the equality preserved. This can be done step by step, for every oc-
currence of a, using the commutativity assumption and the properties of
normal subgroups. We illustrate with an example. Suppose that ac2 ac3 =
ad2 ad3 , with (c2 , d2 ) ∈ θM , (c3 , d3 ) ∈ θM and (a, b) ∈ θN . Then there exist

elements g1 and g2 ∈ M and h ∈ N such that c2 = g1d2, c3 = g2d3, and
a = hb. Put h′ := d2hd2⁻¹; then h′ ∈ N by normality, and h′ commutes with
g1 by assumption (i). This gives us ac2ac3 = hbg1d2hbg2d3 = hbh′g1d2bg2d3 =
hbh′c2bc3, and similarly ad2ad3 = hbd2hbd3 = hbh′d2bd3. Thus if ac2ac3 =
ad2ad3, we obtain hbh′c2bc3 = hbh′d2bd3, with hbh′ ∈ N. Left multiplication
by b(hbh′)⁻¹ then gives bc2bc3 = bd2bd3, as required.
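The correspondence between normal subgroups and congruences, and statement (i) of Lemma 12.1.5, can be checked by brute force on a small group. The following sketch is a toy computation of our own, using S3 with its alternating subgroup A3: it verifies that θN is compatible with multiplication, that the class of the identity is N, and that A3 commutes with itself elementwise while S3 does not commute with A3 elementwise.

```python
from itertools import permutations

def compose(p, q):
    """(p * q)(i) = p(q(i)); permutations of {0, 1, 2} as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = list(permutations(range(3)))          # the symmetric group S3
e = (0, 1, 2)
A3 = [p for p in G                        # even permutations
      if sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3)) % 2 == 0]

def theta(N):
    """The relation theta_N induced by a normal subgroup N."""
    return {(a, b) for a in G for b in G if compose(a, inverse(b)) in N}

t = theta(A3)
# theta_N is compatible with the multiplication, hence a congruence:
assert all((compose(a, c), compose(b, d)) in t
           for (a, b) in t for (c, d) in t)
# the congruence class of the identity element is exactly the subgroup:
assert {b for (a, b) in t if a == e} == set(A3)
# Lemma 12.1.5 (i) holds for M = N = A3, since A3 is cyclic of order 3:
assert all(compose(a, b) == compose(b, a) for a in A3 for b in A3)
# ...and fails for M = S3, N = A3:
assert not all(compose(a, b) == compose(b, a) for a in G for b in A3)
```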

The generalized term condition (TCC) can also be characterized using
the diagonal relation ∆A. Let θ1 and θ2 be in ConA. We define Dθ2 :=
{((a, a), (b, b)) | (a, b) ∈ θ2}, and denote by ⟨Dθ2⟩Conθ1 the congruence rela-
tion of the subalgebra θ1 ⊆ A × A which is generated by Dθ2. Then we have
the following equivalence.

Theorem 12.1.6 Let θ1 and θ2 be congruences on an algebra A. Then A
satisfies the term condition (TCC) with respect to (θ1, θ2) if and only if
the diagonal ∆A is a union of congruence classes of ⟨Dθ2⟩Conθ1, that is, iff
[∆A]⟨Dθ2⟩Conθ1 = ∆A.

Proof: By Lemma 5.3.3, it is enough to show that for all a, b ∈ A and every
unary polynomial operation pθ1 of the algebra θ1 we have

pθ1((a, a)) ∈ ∆A ⇒ pθ1((b, b)) ∈ ∆A. (*)

Every unary polynomial operation pθ1 arises from an n-ary term operation
tA, by substitution of constants (c2, d2), . . . , (cn, dn) ∈ θ1 for n − 1 variables.
Therefore we have pθ1((a, a)) = (tA(a, c2, . . . , cn), tA(a, d2, . . . , dn)), and this
pair lies in ∆A exactly when tA(a, c2, . . . , cn) = tA(a, d2, . . . , dn). Then the
implication (*) is equivalent to the implication
tA(a, c2, . . . , cn) = tA(a, d2, . . . , dn) ⇒ tA(b, c2, . . . , cn) = tA(b, d2, . . . , dn),
for (c2, d2), . . . , (cn, dn) ∈ θ1 and (a, b) ∈ θ2.

12.2 The Commutator


Now we define the following generalization of the concept of the term condi-
tion. Let A be an algebra and let α, β, δ be congruence relations on A. We
will consider the following condition (1):

(1) For all n ∈ N, for all n-ary term operations tA of A and all (a, b) ∈ β
and (c2 , d2 ), . . . , (cn , dn ) ∈ α,

(tA (a, c2 , . . . , cn ), tA (a, d2 , . . . , dn )) ∈ δ ⇒


(tA (b, c2 , . . . , cn ), tA (b, d2 , . . . , dn )) ∈ δ.

For a given α and β in ConA, we will consider the set of all δ ∈ ConA
satisfying (1). We note first that this set is non-empty, since for instance δ =
A × A satisfies (1). It is also true that this set is closed under intersection. If
the congruences δ1 and δ2 satisfy the implication of condition (1), for fixed
congruences α and β, then for any (a, b) ∈ β and (c2 , d2 ), . . . , (cn , dn ) ∈ α
we have

(tA(a, c2, . . . , cn), tA(a, d2, . . . , dn)) ∈ δ1 ∩ δ2
⇒ (tA(a, c2, . . . , cn), tA(a, d2, . . . , dn)) ∈ δ1 and
(tA(a, c2, . . . , cn), tA(a, d2, . . . , dn)) ∈ δ2
⇒ (tA(b, c2, . . . , cn), tA(b, d2, . . . , dn)) ∈ δ1 and
(tA(b, c2, . . . , cn), tA(b, d2, . . . , dn)) ∈ δ2
⇒ (tA(b, c2, . . . , cn), tA(b, d2, . . . , dn)) ∈ δ1 ∩ δ2.

These observations motivate the following definition:

Definition 12.2.1 Let α and β be congruences on an algebra A. The small-


est congruence relation δ ∈ ConA satisfying condition (1) is called the com-
mutator of α and β, and is denoted by [α, β].

Lemma 12.2.2 Let α and β be congruences on an algebra A. Then [α, β] ⊆


α ∩ β.
Proof: We shall show that condition (1) is satisfied for both δ = α and
δ = β; since [α, β] is the smallest congruence to satisfy condition (1) we then
must have [α, β] ⊆ α and [α, β] ⊆ β, and therefore [α, β] ⊆ α ∩ β.

Let (a, b) be a pair in β and let (ci, di) be in α for i = 2, . . . , n. Since α is
a congruence and (ci, di) ∈ α for each i, we always have
(tA(b, c2, . . . , cn), tA(b, d2, . . . , dn)) ∈ α,
so condition (1) holds for δ = α. Now suppose that
(tA(a, c2, . . . , cn), tA(a, d2, . . . , dn)) ∈ β. From (a, b) ∈ β we get
(tA(b, c2, . . . , cn), tA(a, c2, . . . , cn)) ∈ β
and
(tA(a, d2, . . . , dn), tA(b, d2, . . . , dn)) ∈ β,
and by transitivity
(tA(b, c2, . . . , cn), tA(b, d2, . . . , dn)) ∈ β,
so condition (1) also holds for δ = β.

The following theorem connects the commutator with the term condition for
congruences.

Theorem 12.2.3 For any congruences α and β on an algebra A, the com-


mutator [α, β] is the least congruence relation δ ⊆ α ∩ β for which A/δ
satisfies the term condition with respect to (α/δ, β/δ).

Proof: For δ ∈ ConA with δ ⊆ α ∩ β, satisfaction of condition (1) is equiv-
alent to having A/δ satisfy the term condition with respect to (α/δ, β/δ),
since a pair (tA(a, c2, . . . , cn), tA(a, d2, . . . , dn)) lies in δ exactly when the
corresponding elements of the quotient A/δ are equal. As [α, β] is the least
congruence satisfying condition (1), the claim follows.

Let G be a group. Using the correspondence between normal subgroups N


and congruences θN , and the equivalence from Lemma 12.1.5, it is straight-
forward to verify that the usual group-theoretic commutator of two normal
subgroups M and N of a group G agrees with the commutator congruence
[θM , θN ] as defined in 12.2.1.

In group theory the operation of forming the commutator of a group G can


also be iterated. That is, we form the i-th iterated commutator (also called
the i-th derivation) of G, inductively by

D0 G = G, D1 G = [G, G], . . . , Di+1 G = [Di G, Di G].


This gives a series

G = D0 G ⊇ D1 G ⊇ · · · ⊇ Di G ⊇ · · · (N),
in which each Di+1 G is a normal subgroup of Di G and each quotient (or
factor) group Di G/Di+1 G is abelian. If this chain stops at some finite stage
n with Dn G equal to the trivial subgroup E of G, then the series (N) is called
a normal series with abelian factors, and G is said to be solvable. Thus a
group G is solvable if and only if there is a natural number n such that Dn G
= E.
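The derived series of a small group can be computed directly. This sketch is a toy computation of our own with S3 (the helper functions are ours): it builds D0 ⊇ D1 ⊇ D2 and confirms that S3 is solvable of degree 2, with D1 equal to the alternating group A3.

```python
from itertools import permutations

def compose(p, q):
    """(p * q)(i) = p(q(i)); permutations as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = set(permutations(range(3)))            # S3
e = (0, 1, 2)

def commutator_subgroup(M, N):
    """Subgroup generated by all [a, b] = a b a^-1 b^-1 with a in M, b in N."""
    sub = {e} | {compose(compose(a, b), compose(inverse(a), inverse(b)))
                 for a in M for b in N}
    while True:                            # close under multiplication
        new = {compose(x, y) for x in sub for y in sub}
        if new <= sub:
            return sub
        sub |= new

D = [G]                                    # D0 = G, D_{i+1} = [D_i, D_i]
while D[-1] != {e}:
    D.append(commutator_subgroup(D[-1], D[-1]))

A3 = {p for p in G
      if sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3)) % 2 == 0}
assert len(D) - 1 == 2                     # S3 is solvable of degree 2
assert D[1] == A3
assert D[2] == {e}
```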

To generalize this process for arbitrary algebras, we make the following def-
inition.

Definition 12.2.4 For any algebra A, we define for ∇A = A × A and the
diagonal ∆A the following congruences:

∇A(0) := ∇A, ∇A(k+1) := [∇A(k), ∇A(k)].

The algebra A is said to be solvable of degree n if ∇A(n) = ∆A.
An algebra A is called nilpotent if it is solvable of degree 1, so that
[∇A, ∇A] = ∆A. In fact the following conditions are equivalent for an alge-
bra A:

(i) A is abelian;
(ii) A satisfies the term condition (1) from the beginning of Section 12.2;
(iii) The diagonal ∆A is a block of a congruence relation on A × A;
(iv) [∇A, ∇A] = ∆A;
(v) A is nilpotent of degree 1.

In Chapter 11 we introduced the concept of polynomially equivalent algebras,


as algebras having the same universe and the same polynomial operations.
The following theorem of H. P. Gumm ([52]) uses polynomial equivalence to
characterize abelian algebras in congruence permutable varieties.

Theorem 12.2.5 Let V be a congruence permutable variety. For every al-


gebra A ∈ V the following are equivalent:

(i) A is abelian,
(ii) A is polynomially equivalent to a module over a ring.
Proof: (ii) ⇒ (i): it is an easy exercise to show that the polynomial oper-
ations of a module satisfy the term condition (TC) from Definition 12.1.1.
Therefore modules are abelian. This is also true for algebras which are poly-
nomially equivalent to a module over a ring.

(i) ⇒ (ii): Let A be an abelian algebra in V , so that A satisfies the term


condition (TC). Since V is congruence permutable, there is a Mal’cev term
p over V satisfying the identities p(x, x, y) ≈ y and p(x, y, y) ≈ x in the
variety V . We now use the term operation pA to define a module structure
on the set A, and show that A is polynomially equivalent to this module.
The rather lengthy proof will be broken into six steps.

Step 1. We fix an element 0 ∈ A (since A is non-empty) and define a binary


operation + and a unary operation − on A, by

x + y := pA(x, 0, y),
−x := pA(0, x, 0).

Then (A; +, −) is an abelian group.

Proof: We note first that 0 is a neutral element for the operation +, since
for any a ∈ A we have a + 0 = p(a, 0, 0) = a = p(0, 0, a) = 0 + a.
Next we use the fact that the term condition is satisfied for all polynomial
operations, including all operations built up from +, − and 0. Then for
any a ∈ A, we have p(0, 0, −a) = p(0, 0, p(0, a, 0)) = p(0, a, 0), and by ap-
plying the term condition (TC) to the equation p(0, 0, −a) = p(0, a, 0) we
can replace the first 0 by any element from A. Replacing 0 by a we ob-
tain p(a, 0, −a) = p(a, a, 0), and therefore a + (−a) = 0. Similarly we have
p(−a, 0, 0) = p(p(0, a, 0), 0, 0) = p(0, a, 0), and using the term condition to
replace 0 by a gives p(−a, 0, a) = p(0, a, a), and (−a) + a = 0. Here we are
also using commutativity, which follows from b = p(a, a, b) = p(b, a, a) by
the term condition if we replace a by 0.

Finally, we have associativity: starting from the valid equation (a + 0) +
(b + 0) = (a + b) + (0 + 0), we use the term condition to replace by c the 0
in (b + 0) on the left and the second 0 in (0 + 0) on the right, obtaining
(a + 0) + (b + c) = (a + b) + (0 + c), that is, a + (b + c) = (a + b) + c.
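Step 1 can be illustrated concretely in a congruence permutable variety we know well: groups, where p(x, y, z) = xy⁻¹z is a Mal'cev term. The sketch below uses the cyclic group Z7 in additive notation (an illustrative choice of ours); it checks the Mal'cev identities and then verifies that x + y := p(x, 0, y) and −x := p(0, x, 0) define an abelian group.

```python
n = 7                                     # the cyclic group Z7, additively

def p(x, y, z):
    """Mal'cev term of the variety of groups: p(x, y, z) = x - y + z."""
    return (x - y + z) % n

# the Mal'cev identities p(x, x, y) = y and p(x, y, y) = x:
assert all(p(x, x, y) == y and p(x, y, y) == x
           for x in range(n) for y in range(n))

# Step 1: recover an abelian group structure from p and a chosen element 0:
zero = 0
add = lambda x, y: p(x, zero, y)
neg = lambda x: p(zero, x, zero)

assert all(add(x, zero) == x == add(zero, x) for x in range(n))
assert all(add(x, neg(x)) == zero == add(neg(x), x) for x in range(n))
assert all(add(x, y) == add(y, x) for x in range(n) for y in range(n))
assert all(add(add(x, y), z) == add(x, add(y, z))
           for x in range(n) for y in range(n) for z in range(n))
```

Of course, for Z7 the recovered operation is just the original addition; the point of the proof is that the same recipe works in any abelian algebra of a congruence permutable variety.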
Step 2. Every polynomial operation of A is affine with respect to
(A; +, −, 0); that is, for every n-ary polynomial operation f A of A and all
a1 , . . . , an , b1 , . . . , bn ∈ A we have

f A (a1 + b1 , . . . , an + bn ) = f A (a1 , . . . , an ) + f A (b1 , . . . , bn ) − f A (0, . . . , 0).

Proof: Using the term condition on the equation

f A(a1 + 0, . . . , an + 0) + f A(0, . . . , 0) =
f A(a1, . . . , an) + f A(0 + 0, . . . , 0 + 0)

to replace 0 by b1 in two places (the 0 of a1 + 0 on the left, and the second
0 of the first coordinate 0 + 0 on the right) we get

f A(a1 + b1, . . . , an + 0) + f A(0, . . . , 0) =
f A(a1, . . . , an) + f A(0 + b1, 0 + 0, . . . , 0 + 0),

and continuing in this way we finally get

f A (a1 + b1 , . . . , an + bn ) + f A (0, . . . , 0) =
f A (a1 , . . . , an ) + f A (b1 , . . . , bn ).

Step 3. Every unary polynomial operation r of A with r(0) = 0 is an endo-


morphism of (A; +, −, 0).

Proof: This is a direct consequence of Step 2.

Step 4. Let R be the set of all unary polynomial operations of A mapping


0 to 0. We define operations

(r + s)(a) := r(a) + s(a),


(−r)(a) := −r(a),
0(a) := 0, and
(r ◦ s)(a) := r(s(a))
on R. Then (R; +, −, 0, ◦) forms a ring.

Proof: Since the set of all endomorphisms of (A; +, −, 0) forms a ring, we


only have to show that (R; +, −, 0, ◦) is a subring of the full endomorphism
ring of (A; +, −, 0). This is clear since if r and s are unary polynomial oper-
ations of A which fix 0, then so are r + s, −r, 0 and r ◦ s.

Step 5. (A; +, −, 0, R) is a module over the ring (R; +, −, 0, ◦).

Proof: This follows from the fact that (R; +, −, 0, ◦) is a ring of endomor-
phisms of (A; +, −, 0).

Step 6. The polynomial operations of the algebra A are exactly the opera-
tions of the form f (x1 , . . . , xn ) = c + r1 x1 + · · · + rn xn , for n ∈ N, c ∈ A and
r1 , . . . , rn ∈ R.

Proof: It is clear that all operations of the given form are polynomial op-
erations of A. Assume that f A is a polynomial operation of A. From Step 2
we have

f A(a1, . . . , an) = f A(a1 + 0, 0 + a2, . . . , 0 + an)
= (f A(a1, 0, . . . , 0) − f A(0, . . . , 0)) + f A(0, a2, . . . , an)
...
= (f A(a1, 0, . . . , 0) − f A(0, . . . , 0)) + · · ·
+ (f A(0, . . . , 0, an) − f A(0, . . . , 0)) + f A(0, . . . , 0).
Now the polynomial operations induced by the polynomials

r1(x) := f (x, 0, . . . , 0) − f (0, . . . , 0),
...
rn(x) := f (0, . . . , 0, x) − f (0, . . . , 0)
are elements of R, so we have

f A (a1 , . . . , an ) = f A (0, . . . , 0) + r1A (a1 ) + · · · + rnA (an ).
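Step 6 says that every polynomial operation of the algebra is affine over the constructed module. The following sketch checks the recipe for a hypothetical affine operation f on the Z5-module Z5 (both f and the modulus are illustrative choices of ours): the constant is f(0, 0), and the maps ri(x) = f(0, . . . , x, . . . , 0) − f(0, . . . , 0) are endomorphisms fixing 0.

```python
n = 5  # work in the Z5-module Z5

def f(x1, x2):
    """A hypothetical binary polynomial operation, affine over Z5."""
    return (3 + 2 * x1 + 4 * x2) % n

c = f(0, 0)
r1 = lambda x: (f(x, 0) - c) % n
r2 = lambda x: (f(0, x) - c) % n

# Step 6: f decomposes as constant + r1(x1) + r2(x2):
assert all(f(a1, a2) == (c + r1(a1) + r2(a2)) % n
           for a1 in range(n) for a2 in range(n))
# and each ri is an endomorphism of (Z5, +) fixing 0 (Steps 3 and 4):
assert all(r1((a + b) % n) == (r1(a) + r1(b)) % n
           for a in range(n) for b in range(n))
assert r1(0) == 0 and r2(0) == 0
```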

12.3 Exercises
12.3.1. Verify that any left-zero, right-zero or zero-semigroup is abelian, as
claimed in Example 12.1.2 (2).

12.3.2. Prove that any subalgebra of an abelian algebra is abelian, and that
every direct product of abelian algebras is abelian.

12.3.3. Prove that the collection of all abelian algebras in a congruence per-
mutable variety forms a subvariety.

12.3.4. Does the algebra A = ({a, b, c, d}; f ), with

f a b c d
a b a c d
b a d b c
c d c a b
d c b d a

satisfy the term condition?



12.3.5. Complete the proof given in this chapter that a group is abelian iff
it is commutative. Prove that a ring is abelian iff it satisfies xy = 0.

12.3.6. Prove that every module over a ring is abelian.

12.3.7. Let A be a module, and let p(x, y, z) be a ternary polynomial satis-


fying the identities p(x, x, y) ≈ p(y, x, x) ≈ y in A. Show that p(x, y, z) can
be nothing other than x − y + z.
Chapter 13

Complete Sublattices

We have seen that the collection of all varieties of a given type forms a com-
plete lattice, as does the collection of all clones of operations defined on a
fixed set. These two lattices play an important role in universal algebra, but
their study is made difficult by the fact that the lattices are large (usually
uncountably infinite) and very complex. Thus we look for new approaches
or tools to use in their study. One such approach is to try to study some
smaller parts of the large lattice. Such smaller parts should have the same
algebraic structure, so we are interested in the study of complete sublattices
of a complete lattice.

In this chapter we describe some new methods for producing complete sub-
lattices of a given complete lattice. As we saw in Chapter 2, the two complete
lattices we are interested in both arise as the lattice of closed sets under a
closure operation, which can be obtained via a Galois-connection. This leads
us to the study of new closure operators and Galois-connections, which pro-
duce sublattices of the original lattice of closed sets. We develop the theory
of such sublattices in this chapter. In the next chapter we will apply this
general theory to our two specific examples of the lattices of varieties and
clones.

13.1 Conjugate Pairs of Closure Operators


Our basic concepts of closure operators and Galois-connections were devel-
oped in Chapter 2. We saw there that any closure operator γ defined on a
set A gives us a closure system, the set Hγ of all γ-closed subsets of A, and


that any such closure system forms a complete lattice. In this lattice, the
meet operation, also the greatest lower bound or infimum with respect to
the partial order of set inclusion, is the operation of intersection. The join
operation however is not usually just the union: we have

⋁B = ⋂{H ∈ Hγ | H ⊇ ⋃B}

for every B ⊆ Hγ . One situation when we do have the join operation equal
to union is the following.

Definition 13.1.1 A closure operator γ defined on a set A is said to be
additive if for all T ⊆ A, γ(T) = ⋃_{a∈T} γ(a). (Note that we write γ(a)
for γ({a}).)

We can show easily that when γ is an additive closure operator, the least
upper bound operation on the lattice Hγ agrees with union (see M. Reichel, [98]
or D. Dikranjan and E. Giuli, [39]). We always have ⋃B ⊆ γ(⋃B) because
of the extensivity of γ. Conversely, if a ∈ ⋃B then a ∈ B for some set
B ∈ B, and since B ∈ Hγ we have γ(a) ⊆ B, so that γ(⋃B) = ⋃_{a∈⋃B} γ(a)
⊆ ⋃B. This means that ⋃B is γ-closed and ⋁B = ⋃B. In other words,
when γ is an additive closure operator on A, the corresponding closure
system forms a complete sublattice of the lattice (P(A); ∧ = ⋂, ∨ = ⋃) of
all subsets of A.
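A concrete additive closure operator is the down-set closure in a finite ordered set. The sketch below uses the divisors of 12 under divisibility (an illustrative choice of ours); it verifies additivity and that, as claimed above, the γ-closed sets are then closed under plain union.

```python
from itertools import combinations

universe = [1, 2, 3, 4, 6, 12]            # divisors of 12, ordered by divisibility

def gamma(T):
    """Down-set closure: everything dividing some element of T."""
    return frozenset(a for a in universe if any(t % a == 0 for t in T))

every_T = [set(c) for r in range(len(universe) + 1)
           for c in combinations(universe, r)]

# additivity: gamma(T) is the union of the closures of the singletons:
assert all(gamma(T) == frozenset().union(*[gamma({t}) for t in T])
           for T in every_T)

# consequently the gamma-closed sets are closed under union
# (the join in H_gamma is plain set union):
closed = {gamma(T) for T in every_T}
assert all((X | Y) in closed for X in closed for Y in closed)
```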

Definition 13.1.2 Let γ1 be a closure operator defined on the set A and let
γ2 be a closure operator defined on the set B. Let R ⊆ A × B be a relation
between A and B. Then γ1 and γ2 are called conjugate with respect to R if
for all t ∈ A and all s ∈ B, γ1 (t) × {s} ⊆ R iff {t} × γ2 (s) ⊆ R.

This property of conjugacy of two closure operators is defined in terms of


individual elements. When the two operators are also additive, we can ex-
tend this to sets of elements. Thus when (γ1 , γ2 ) is a pair of additive closure
operators, γ1 on A and γ2 on B, and they are conjugate with respect to a re-
lation R ⊆ A × B, then for all X ⊆ A and all Y ⊆ B we have X × γ2 (Y ) ⊆ R
if and only if γ1 (X) × Y ⊆ R.
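Concrete examples of conjugate pairs are deferred to the next chapter, but Definition 13.1.2 can already be exercised on a deliberately tiny toy case of our own: take the operators that lump all of A, respectively all of B, into one block (extended additively these are closure operators), and take R to be the equality relation, which has no full row and no full column, so both sides of the defining equivalence are always false.

```python
from itertools import product

A = [0, 1]
B = [0, 1]
R = {(0, 0), (1, 1)}                      # the equality relation on {0, 1}

# gamma1 lumps all of A into one block, gamma2 lumps all of B; extended
# additively (gamma(T) = union of the gamma(t)) both are closure operators.
gamma1 = lambda t: set(A)
gamma2 = lambda s: set(B)

def conjugate(g1, g2, rel):
    """Brute-force test of Definition 13.1.2."""
    return all(({(x, s) for x in g1(t)} <= rel) == ({(t, y) for y in g2(s)} <= rel)
               for t, s in product(A, B))

assert conjugate(gamma1, gamma2, R)

# the relation R_gamma of Definition 13.1.3 is a proper subrelation here:
R_gamma = {(t, s) for t, s in product(A, B)
           if {(x, s) for x in gamma1(t)} <= R}
assert R_gamma == set() and R_gamma < R
```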

Examples of conjugate pairs of additive closure operators will be given in


the next chapter. In the rest of this section we develop the general theory of
such operators. We assume throughout that we have two sets A and B, and
that R is a relation from A to B. We know from Chapter 2 that this rela-
tion induces a Galois-connection (µ, ι) between A and B, for which the two
maps µι and ιµ are closure operators. Moreover, the pair (µι, ιµ) is always
conjugate with respect to the original relation R. But µι and ιµ need not be
additive in general.

Our goal is to construct, given a relation R and the induced Galois-


connection, a new relation which is connected to R, but gives a smaller
lattice of closed sets. One way to do this is by using conjugate pairs of
closure operators.

Definition 13.1.3 Let γ := (γ1 , γ2 ) be a conjugate pair of additive closure


operators, with respect to a relation R ⊆ A × B. Let Rγ be the following
relation between A and B:

Rγ := {(t, s) ∈ A × B | γ1 (t) × {s} ⊆ R}.

We now have two relations and Galois-connections between A and B. We


have the original relation R, with induced Galois-connection (µ, ι) between
A and B, and corresponding lattices of closed sets. We also have the new
relation Rγ and its induced Galois-connection, which we shall denote by
(µγ , ιγ ). The following theorem gives some properties relating the two Galois-
connections.

Theorem 13.1.4 Let γ = (γ1 , γ2 ) be a conjugate pair of additive closure


operators with respect to R ⊆ A × B. Then for all T ⊆ A and S ⊆ B, the
following properties hold:

(i) µγ (T ) = µ(γ1 (T )),

(ii) µγ (T ) ⊆ µ(T ),

(iii) γ2 (µγ (T )) = µγ (T ),

(iv) γ1 (ι(µγ (T ))) = ι(µγ (T )),

(v) µγ (ιγ (S)) = µ(ι(γ2 (S))); and dually,



(i′) ιγ(S) = ι(γ2(S)),

(ii′) ιγ(S) ⊆ ι(S),

(iii′) γ1(ιγ(S)) = ιγ(S),

(iv′) γ2(µ(ιγ(S))) = µ(ιγ(S)),

(v′) ιγ(µγ(T)) = ι(µ(γ1(T))).

Proof: We will prove only (i)-(v), the proofs of the other propositions being
dual.
(i) By definition,
µγ (T ) = {b ∈ B | ∀a ∈ T ((a, b) ∈ Rγ )}
= {b ∈ B | ∀a ∈ T (γ1 (a) × {b} ⊆ R)}
= {b ∈ B | ∀a ∈ γ1 (T ) ((a, b) ∈ R)} = µ(γ1 (T )).

(ii) Since γ1 is a closure operator, we have T ⊆ γ1 (T ); and thus, since µ


reverses inclusions, µ(T ) ⊇ µ(γ1 (T )). Using (i) we obtain µγ (T ) ⊆ µ(T ).

(iii) Extensivity of γ2 implies µγ(T) ⊆ γ2(µγ(T)). Now let S ⊆ µγ(T). Then
for all s ∈ S and for all t ∈ T we have (t, s) ∈ Rγ, and by the definition of Rγ
together with conjugacy we get {t} × γ2(s) ⊆ R. Idempotency of γ2 gives
{t} × γ2(γ2(s)) ⊆ R and thus γ2(s) ⊆ µγ(T) for all s ∈ S. By additivity of γ2
we get γ2(S) = ⋃_{s∈S} γ2(s) ⊆ µγ(T); and taking S = µγ(T) we obtain
γ2(µγ(T)) ⊆ µγ(T). Altogether we have the equality µγ(T) = γ2(µγ(T)).

(iv) γ1(ι(µγ(T))) = γ1(ι(γ2(µγ(T)))) = γ1(ιγ(µγ(T)))
= ιγ(µγ(T)) = ι(γ2(µγ(T))) = ι(µγ(T)), by parts (iii), (i′) and (iii′).

(v) µγ (ιγ (S)) = µ(γ1 (ιγ (S))) = µ(ιγ (S)) = µ(ι(γ2 (S))).

The next theorem is our “Main Theorem for Conjugate Pairs of Closure
Operators.” It shows that when we consider sets which are closed under the
original Galois-connection from R, there are four equivalent conditions for
such sets to also be closed under the new connection from Rγ .

Theorem 13.1.5 (Main Theorem for Conjugate Pairs of Additive Closure


Operators) Let R be a relation between sets A and B, with corresponding
Galois-connection (µ, ι). Let γ = (γ1 , γ2 ) be a conjugate pair of additive

closure operators with respect to the relation R. Then for all sets T ⊆ A
with ι(µ(T)) = T the following propositions (i) - (iv) are equivalent; and
dually, for all sets S ⊆ B with µ(ι(S)) = S, propositions (i′) - (iv′) are
equivalent:

(i) T = ιγ(µγ(T)),

(ii) γ1(T) = T,

(iii) µ(T) = µγ(T),

(iv) γ2(µ(T)) = µ(T); and dually,

(i′) S = µγ(ιγ(S)),

(ii′) γ2(S) = S,

(iii′) ι(S) = ιγ(S),

(iv′) γ1(ι(S)) = ι(S).

Proof: We prove the equivalence of (i), (ii), (iii) and (iv); the equivalence
of the four dual statements can be proved dually.

(i) ⇒ (ii): We always have T ⊆ γ1(T), since γ1 is a closure operator. Since
ιµ is a closure operator we also have γ1(T) ⊆ ιµ(γ1(T)) = ιγ(µγ(T)) = T,
by 13.1.4 (v′) and (i).

(ii) ⇒ (iii): We have µ(T ) = µ(γ1 (T )) = µγ (T ) by (ii) and 13.1.4 (i).

(iii) ⇒ (iv): We have γ2(µ(T)) = γ2(µγ(T)) = µγ(T) = µ(T), using (iii) and
13.1.4 (iii).

(iv) ⇒ (i): Since the ιγµγ-closed sets are exactly the sets of the form
ιγ(S), we have to find a set S ⊆ B with T = ιγ(S). But we have
ιγ(µ(T)) = ι(γ2(µ(T))) = ι(µ(T)) = T, by 13.1.4 (i′), condition (iv), and
our assumption that T is ιµ-closed.

Before we use this Main Theorem to produce our complete sublattices, we


need the following additional properties.

Theorem 13.1.6 Let R be a relation between sets A and B, with Galois-


connection (µ, ι). Let γ = (γ1 , γ2 ) be a conjugate pair of additive closure
operators with respect to R. Then for all sets T ⊆ A and S ⊆ B, the following
properties hold:

(i) γ1(T) ⊆ ι(µ(T)) ⇔ ι(µ(T)) = ιγ(µγ(T));

(ii) γ1(T) ⊆ ι(µ(T)) ⇔ γ1(ι(µ(T))) = ι(µ(T));

(i′) γ2(S) ⊆ µ(ι(S)) ⇔ µ(ι(S)) = µγ(ιγ(S));

(ii′) γ2(S) ⊆ µ(ι(S)) ⇔ γ2(µ(ι(S))) = µ(ι(S)).

Proof: We prove only (i′) and (ii′); the others are dual.

(i′) Suppose that γ2(S) ⊆ µ(ι(S)). Since µι is a closure operator we have
µ(ι(S)) = µ(ι(µ(ι(S)))) ⊇ µ(ι(γ2(S))) = µγ(ιγ(S)), by our assumption and
by 13.1.4 (v). Also S ⊆ γ2(S), and hence we have µ(ι(S)) ⊆ µ(ι(γ2(S))) =
µγ(ιγ(S)), again by 13.1.4 (v). For the converse we have γ2(S) ⊆ µ(ι(γ2(S)))
= µγ(ιγ(S)) = µ(ι(S)), using the extensivity of µι, 13.1.4 (v) and our as-
sumption.

(ii′) Let γ2(S) ⊆ µ(ι(S)). Then S ⊆ γ2(S) implies γ2(µ(ι(S))) ⊆
γ2(µ(ι(γ2(S)))). We also have γ2(µ(ι(γ2(S)))) = γ2(µ(ιγ(S))) by Theorem
13.1.4 (i′), and γ2(µ(ι(γ2(S)))) = µ(ιγ(S)) by 13.1.4 (iv′). In addition,
µ(ιγ(S)) = µ(ι(γ2(S))) ⊆ µ(ι(µ(ι(S)))) = µ(ι(S)). Altogether we obtain
γ2(µ(ι(S))) ⊆ µ(ι(S)). The opposite inclusion is always true, since γ2 is a
closure operator. Conversely, S ⊆ µ(ι(S)) implies γ2(S) ⊆ γ2(µ(ι(S))) =
µ(ι(S)), by the extensivity of µι, the monotonicity of γ2 and our assump-
tion.

Now we are ready to produce our complete sublattices. We know that from
the original relation R and Galois-connection (µ, ι) we have two (dually
isomorphic) complete lattices of closed sets, the lattices Hµι and Hιµ . We
also get two complete lattices of closed sets from the new Galois-connection
(µγ , ιγ ) induced by Rγ . Our result is that each new complete lattice is in
fact a complete sublattice of the corresponding original complete lattice.

Theorem 13.1.7 Let R be a relation from A to B, with induced Galois-


connection (µ, ι). Let γ = (γ1 , γ2 ) be a conjugate pair of additive closure

operators with respect to R. Then the lattice Hµγ ιγ of sets closed under µγ ιγ
is a complete sublattice of the lattice Hµι , and dually the lattice Hιγ µγ is a
complete sublattice of the lattice Hιµ .
Proof: As a closure system Hµγ ιγ is a complete lattice, and we have to
prove that it is a complete sublattice of the complete lattice Hµι . We begin
by showing that it is a subset. Let S ∈ Hµγ ιγ , so that µγ (ιγ (S)) = S. Then
µ(ι(S)) = µ(ι(µγ (ιγ (S)))) = µ(ι(µ(ι(γ2 (S))))) = µ(ι(γ2 (S))) = µγ (ιγ (S)) =
S by 13.1.4 (v), and thus S ∈ Hµι . This shows Hµγ ιγ ⊆ Hµι . Since every S
in Hµγ ιγ satisfies µ(ι(S)) = S, we can apply Theorem 13.1.5 (ii′), to get
S ∈ Hµγ ιγ ⇔ S = µγ(ιγ(S)) ⇔ S = γ2(S) ⇔ S ∈ Hγ2.
As we remarked after Definition 13.1.1, the fact that γ2 is an additive closure
operator means that the corresponding closure system is a complete sublat-
tice of the lattice (P(B); ∩, ∪) of all subsets of B; that is, on our lattice
Hµγ ιγ the meet operation agrees with ordinary set-intersection and the join
agrees with union. We already know that the meet operation in Hµι also
agrees with intersection, so we only need to show that Hµγ ιγ is closed under
the join operation of Hµι. Let (Sk)k∈J be an indexed family of sets in Hµγ ιγ .
Then

µγ(ιγ(⋁_{k∈J} Sk)) = γ2(⋁_{k∈J} Sk) = γ2(µ(ι(⋃_{k∈J} Sk)))
= γ2(µ(ιγ(⋃_{k∈J} Sk))) = µ(ιγ(⋃_{k∈J} Sk)) = µ(ι(⋃_{k∈J} Sk)) = ⋁_{k∈J} Sk,

by Theorem 13.1.4 (iv′); and then, since each Sk is µγιγ-closed, Theorem
13.1.5 (iii′) gives

ι(⋃_{k∈J} Sk) = ⋂_{k∈J} ι(Sk) = ⋂_{k∈J} ιγ(Sk) = ιγ(⋃_{k∈J} Sk).
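Theorem 13.1.7 can be verified by brute force on the tiny toy conjugate pair used earlier to illustrate Definition 13.1.2 (both operators lump everything into one block, R is the equality relation on a two-element set, and the induced relation Rγ then comes out empty — all of these are assumptions of this toy setup, not data from the text). On finite sets, closure under binary union and intersection suffices for the sublattice check.

```python
from itertools import combinations

A = [0, 1]
B = [0, 1]
R = {(0, 0), (1, 1)}      # the equality relation
R_gamma = set()           # R_gamma induced by the "lump everything" pair

def subsets(X):
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def galois(rel):
    mu = lambda T: frozenset(s for s in B if all((t, s) in rel for t in T))
    iota = lambda S: frozenset(t for t in A if all((t, s) in rel for s in S))
    return mu, iota

mu, iota = galois(R)
mu_g, iota_g = galois(R_gamma)

H_mu_iota = {S for S in subsets(B) if mu(iota(S)) == S}
H_mug_iotag = {S for S in subsets(B) if mu_g(iota_g(S)) == S}

assert H_mug_iotag == {frozenset(), frozenset(B)}
assert H_mug_iotag <= H_mu_iota           # subset of the old closed sets ...
assert all((X | Y) in H_mug_iotag and (X & Y) in H_mug_iotag
           for X in H_mug_iotag for Y in H_mug_iotag)   # ... and a sublattice
```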

Thus conjugate pairs of additive closure operators give us a way to construct


complete sublattices of a given closure lattice. We may also define an order
relation on the set of all conjugate pairs of additive closure operators: for α
= (γ1 , γ10 ) and β = (γ2 , γ20 ) we set
α ≤ β :⇔ (∀T ⊆ A)(∀S ⊆ B)[γ1 (T ) ⊆ γ2 (T ) and γ10 (S) ⊆ γ20 (S)].
When α ≤ β, it can be shown that the lattice Hµβ ιβ is a sublattice of Hµα ια ,
and dually that Hιβ µβ is a sublattice of Hια µα .

The following additional properties may also be verified:



(i) γ1(ι(µ(T))) = ι(µ(T)) ⇔ T = ι(µ(γ1(T))), and

(i′) γ2(µ(ι(S))) = µ(ι(S)) ⇔ S = µ(ι(γ2(S))).

S. Arworn in [1] has generalized the theory of conjugate pairs of additive


closure operators to the situation of conjugate pairs of extensive, additive
operators.

13.2 Galois Closed Subrelations


In the previous section we developed a method to produce complete sublat-
tices of a given complete lattice. We started with a relation R which induced
a Galois-connection (µ, ι) between two sets A and B, and used the two clo-
sure operators µι and ιµ to produce our (dually isomorphic) lattices of closed
sets. We then used two new closure operators on our sets, which are additive
and conjugate with respect to our relation R, to determine a new relation
Rγ , which in turn induces a Galois-connection and closure operators. We
showed that the sets closed under these new operators form complete sub-
lattices of the original lattices of closed sets.

In this section we examine in more detail the relations such as Rγ which


determine complete sublattices of our original lattices. As our starting point,
we assume as before that we have a relation R from A to B, which induces
a Galois-connection (µ, ι) and from which we obtain complete lattices Hιµ
and Hµι of closed subsets of A and of B respectively. Then we consider a
subrelation R′ of the initial relation R, from which we obtain a new Galois-
connection and two new complete lattices. We describe a property of the
subrelation R′ which is sufficient to guarantee that the new complete lattices
will be complete sublattices of the original lattices. This property is called
the Galois-closed subrelation property. Moreover, we show that any complete
sublattices of our original lattices arise in this way.

Definition 13.2.1 Let R and R′ be relations between sets A and B, and
let (µ, ι) and (µ′, ι′) be the Galois-connections between A and B induced by
R and R′, respectively. The relation R′ is called a Galois-closed subrelation
of R if:

1) R′ ⊆ R, and

2) ∀ T ⊆ A, ∀ S ⊆ B (µ′(T) = S and ι′(S) = T ⇒ µ(T) = S and
ι(S) = T).
Directly from this definition we can prove the following equivalent character-
izations of Galois-closed subrelations, as shown by B. Ganter and R. Wille
in [47] and by S. Arworn, K. Denecke and R. Pöschel in [4] and [22].

Proposition 13.2.2 Let R′ ⊆ R be relations between sets A and B. Then
the following are equivalent:

(i) R′ is a Galois-closed subrelation of R;

(ii) For any T ⊆ A, if ι′µ′(T) = T then µ(T) = µ′(T), and for any S ⊆ B,
if µ′ι′(S) = S then ι(S) = ι′(S);

(iii) For all T ⊆ A and for all S ⊆ B the equations ι′µ′(T) = ιµ′(T) and
µ′ι′(S) = µι′(S) are satisfied.
Proof: (i) ⇒ (ii): Let R′ be a Galois-closed subrelation of R, and let T ⊆ A
with ι′µ′(T) = T. Define S to be the set µ′(T). Then we have µ′(T) = S and
ι′(S) = T, and applying the second part of the definition of a Galois-closed
subrelation gives us µ(T) = S and ι(S) = T. In particular, µ(T) = S =
µ′(T). The claim for subsets S of B is proved similarly.

(ii) ⇒ (iii): Assume that T ⊆ A and S ⊆ B. Then µ′(T) is a closed subset
of B under the closure operator µ′ι′, and ι′(S) is a closed subset of A under
the closure operator ι′µ′. This means that

µ′ι′µ′(T) = µ′(T) and ι′µ′ι′(S) = ι′(S).

Then by condition (ii) we get

ιµ′(T) = ι′µ′(T) and µι′(S) = µ′ι′(S).

(iii) ⇒ (i): Assume now that T ⊆ A and S ⊆ B such that µ′(T) = S and
ι′(S) = T. It follows that

µ′ι′(S) = µ′(T), ι′µ′(T) = ι′(S), and
µι′(S) = µ(T), ιµ′(T) = ι(S).

Then by condition (iii) we get

µ(T) = µ′(T) = S and ι(S) = ι′(S) = T.

This shows that R′ is a Galois-closed subrelation of R.
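Definition 13.2.1 can be tested mechanically on small finite relations. The following sketch (toy relations of our own choosing) checks condition 2) over all pairs (T, S); it confirms that the empty relation is a Galois-closed subrelation of the equality relation, while the equality relation is not a Galois-closed subrelation of the full relation.

```python
from itertools import combinations

A = [0, 1]
B = [0, 1]

def subsets(X):
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def galois(rel):
    mu = lambda T: frozenset(s for s in B if all((t, s) in rel for t in T))
    iota = lambda S: frozenset(t for t in A if all((t, s) in rel for s in S))
    return mu, iota

def is_gcs(R1, R):
    """Brute-force test of Definition 13.2.1 (R1 Galois-closed in R)."""
    if not R1 <= R:
        return False
    mu, iota = galois(R)
    mu1, iota1 = galois(R1)
    return all(not (mu1(T) == S and iota1(S) == T)
               or (mu(T) == S and iota(S) == T)
               for T in subsets(A) for S in subsets(B))

eq = {(0, 0), (1, 1)}
full = {(a, b) for a in A for b in B}
assert is_gcs(frozenset(), eq)       # the empty relation is Galois-closed in eq
assert not is_gcs(eq, full)          # eq is not Galois-closed in the full relation
```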

We leave it as an exercise for the reader to verify, using the results of Section
13.1, that if γ := (γ1 , γ2 ) is a pair of additive closure operators which are
conjugate with respect to a relation R ⊆ A × B, then the relation Rγ of
Definition 13.1.3 is a Galois-closed subrelation of R.

Before we can prove our main theorem, we need the following well-known
result for Galois-connections. The proof is straightforward, and is left as an
exercise for the reader (see Exercise 2.4.2).

Lemma 13.2.3 Let R ⊆ A × B be a relation between the sets A, B and let


(µ, ι) be the Galois-connection between A and B induced by R. Then for any
families {Tj ⊆ A | j ∈ J} and {Sj ⊆ B | j ∈ J}, the following equalities
hold:

(i) µ(⋃_{j∈J} Tj) = ⋂_{j∈J} µ(Tj),

(ii) ι(⋃_{j∈J} Sj) = ⋂_{j∈J} ι(Sj).
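Lemma 13.2.3 (i) is easy to confirm computationally for a small relation (the relation R below is an illustrative choice of ours): the Galois map µ turns unions into intersections, with µ(∅) = B matching the convention that an intersection over the empty family is all of B.

```python
from itertools import combinations

A = [0, 1, 2]
B = ['x', 'y']
R = {(0, 'x'), (1, 'x'), (1, 'y'), (2, 'y')}

def mu(T):
    return frozenset(s for s in B if all((t, s) in R for t in T))

blocks = [set(), {0}, {1}, {0, 2}, {1, 2}]
for r in range(len(blocks) + 1):
    for fam in combinations(blocks, r):
        union = set().union(*fam)
        meet_of_images = (frozenset(B) if not fam
                          else frozenset.intersection(*[mu(T) for T in fam]))
        assert mu(union) == meet_of_images
```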

Ganter and Wille showed in [47] that there is a one-to-one correspondence


between Galois-closed subrelations of a relation R ⊆ A × B and complete
sublattices of the corresponding lattices Hι µ and Hµ ι of closed sets. (Note
however that the terminology in [47] is different from ours.) The remainder
of this section is devoted to the proof of this claim. We will show that
any Galois-closed subrelation R′ of the relation R yields a lattice of closed
subsets of A which is a complete sublattice of the corresponding lattice Hι µ
for R. Conversely, we also show that any complete sublattice of the lattice
Hι µ occurs as the lattice of closed sets induced from some Galois-closed
subrelation of R. Dual results of course hold for the set B.

Theorem 13.2.4 ([47], [22], and [4]) Let R ⊆ A × B be a relation between


sets A and B, with induced Galois-connection (µ, ι). Let Hι µ be the corre-
sponding lattice of closed subsets of A.

(i) If R0 ⊆ A × B is a Galois-closed subrelation of R, then the class


UR0 := Hι0 µ0 is a complete sublattice of Hι µ .
13.2. GALOIS CLOSED SUBRELATIONS 311

(ii) If U is a complete sublattice of Hι µ , then the relation


RU := ⋃ {T × µ(T ) | T ∈ U}

is a Galois-closed subrelation of R.

(iii) For any Galois-closed subrelation R0 of R and any complete sublattice


U of Hι µ , we have
URU = U and RUR0 = R0 .
Proof: (i) We begin by verifying that any subset of A which is closed under
the operator ι0 µ0 is also closed under ιµ, so that the lattice Hι0 µ0 is at least
a subset of Hι µ . Let T ∈ Hι0 µ0 , so that ι0 µ0 (T ) = T . By Proposition 13.2.2,
parts (ii) and (iii), we have

ι µ(T ) = ι µ0 (T ) = ι0 µ0 (T ) = T.

Therefore, Hι0 µ0 ⊆ Hι µ .
Next we have to show that this subset is in fact a sublattice. This means
showing that for any family {Tj | j ∈ J} of sets in Hι0 µ0 , both ⋀_{Hιµ} {Tj | j ∈ J} and ⋁_{Hιµ} {Tj | j ∈ J} are in Hι0 µ0 .
We start with the meet operation. We know from above that the collection
{Tj | j ∈ J} is also a family of sets in Hι µ . Since
⋀_{Hιµ} {Tj | j ∈ J} = ⋂_{j∈J} Tj = ιµ( ⋂_{j∈J} Tj ),

we have

⋂_{j∈J} Tj = ιµ( ⋂_{j∈J} ι0 µ0 (Tj )).

Applying Lemma 13.2.3 to this, and then using Proposition 13.2.2 (iii) twice,
we get

⋂_{j∈J} Tj = ιµι0 ( ⋃_{j∈J} µ0 (Tj )) = ιµ0 ι0 ( ⋃_{j∈J} µ0 (Tj )) = ι0 µ0 ι0 ( ⋃_{j∈J} µ0 (Tj )).

Using the closure operator properties, and Lemma 13.2.3 once more, we get

ι0 µ0 ( ⋂_{j∈J} Tj ) = ι0 µ0 ( ⋂_{j∈J} ι0 µ0 (Tj )) = ι0 µ0 ι0 ( ⋃_{j∈J} µ0 (Tj ))
= ι0 ( ⋃_{j∈J} µ0 (Tj )) ⊆ ι0 µ0 (Tj ) = Tj ,

for all j ∈ J.

Thus we have ι0 µ0 ( ⋂_{j∈J} Tj ) ⊆ ⋂_{j∈J} Tj . The reverse inclusion is always true for a closure operator, so altogether we have ι0 µ0 ( ⋂_{j∈J} Tj ) = ⋂_{j∈J} Tj . This shows that ⋂_{j∈J} Tj ∈ Hι0 µ0 .

Now we consider the join,


⋁_{Hιµ} {Tj | j ∈ J} = ιµ( ⋃_{j∈J} Tj ).

By repeated use of Lemma 13.2.3 and Proposition 13.2.2, we see that

ι0 µ0 ιµ( ⋃_{j∈J} Tj ) = ι0 µ0 ι( ⋂_{j∈J} µ(Tj ))    (13.2.3)
= ι0 µ0 ι( ⋂_{j∈J} µ0 (Tj ))    (13.2.2 (ii))
= ι0 µ0 ιµ0 ( ⋃_{j∈J} Tj )    (13.2.3)
= ι0 µ0 ι0 µ0 ( ⋃_{j∈J} Tj )    (13.2.2 (iii))
= ι0 µ0 ( ⋃_{j∈J} Tj )    (by closure properties)
= ιµ0 ( ⋃_{j∈J} Tj )    (13.2.2 (iii))
= ι( ⋂_{j∈J} µ0 (Tj ))    (13.2.3)
= ι( ⋂_{j∈J} µ(Tj ))    (13.2.2 (ii))
= ιµ( ⋃_{j∈J} Tj )    (13.2.3).

This shows that ιµ( ⋃_{j∈J} Tj ) is also a fixed point under ι0 µ0 , so that it too is an element of Hι0 µ0 .

(ii) Now let U be any complete sublattice of Hι µ . We define the relation


RU := ⋃ {T × µ(T ) | T ∈ U},
which we will prove is a Galois-closed subrelation of R. First, for each non-
empty T ∈ U we have µ(T ) = {s ∈ B | ∀ t ∈ T, (t, s) ∈ R}, so that
T × µ(T ) ⊆ R. Therefore RU ⊆ R. To show that the second condition of the
definition of a Galois-closed subrelation is met, we let (µ0 , ι0 ) be the Galois-
connection between sets A and B induced by RU , and assume that µ0 (T ) = S
and ι0 (S) = T for some T ⊆ A and S ⊆ B. Our goal is to prove that

µ(T ) = S and ι(S) = T. (∗)

The proof that (*) holds will be divided into a number of steps. We begin
with two facts we shall need.

Fact 1: For any set T ∈ U, we have µ0 (T ) = µ(T ).

Proof of Fact 1: Let T ∈ U. By definition we have

µ0 (T ) = {s ∈ B | ∀ t ∈ T, (t, s) ∈ RU }.

This means that µ0 (T ) is the greatest subset of B with T × µ0 (T ) ⊆ RU .


But from the definition of RU we have T × µ(T ) ⊆ RU . Therefore we have
µ(T ) ⊆ µ0 (T ). The opposite inclusion also holds since RU ⊆ R. Altogether
we have µ0 (T ) = µ(T ).
Fact 2: For any set T in U, if µ(T ) = S then ι(S) = ι0 (S).

Proof of Fact 2: Let T ∈ U and let µ(T ) = S. This means that T ×S ⊆ RU .


Since ι0 (S) = {t ∈ A | ∀ s ∈ S, (t, s) ∈ RU }, the set ι0 (S) is the greatest
subset of A with ι0 (S) × S ⊆ RU . This shows that T ⊆ ι0 (S). But we also
have T = ιµ(T ) = ι(S), so we now have ι(S) ⊆ ι0 (S). The opposite inclu-
sion also holds since, as we showed just above, RU ⊆ R. Altogether we get
ι0 (S) = ι(S).

Returning now to the proof of (*), we let T ⊆ A and S ⊆ B, with µ0 (T ) = S


and ι0 (S) = T . If T is the empty set, we use Facts 1 and 2 to conclude that
(*) holds; so we may now assume that T is non-empty. For each t ∈ T , we
define

Dt = ⋂ {T 0 ∈ U | t ∈ T 0 and S ⊆ µ(T 0 )}.

We will show the following facts:

(a) Dt ≠ ∅.

(b) µ0 ({t}) = µ0 (Dt ).

(c) ι0 µ0 ({t}) = Dt .

(d) T = ⋃_{t∈T} Dt .

(e) µ(T ) = µ0 (T ) = S and ι(S) = ι0 (S) = T , and (*) holds.


Proof of (a): Since t ∈ T and µ0 (T ) = S, we have (t, s) ∈ RU for all s ∈ S.
From the definition of RU we see that for each s in S there exists a set
Ts ∈ U such that (t, s) ∈ Ts × µ(Ts ). Therefore t ∈ ⋂_{s∈S} Ts , ⋂_{s∈S} Ts ∈ U, and S ⊆ µ( ⋂_{s∈S} Ts ), which shows that Dt ≠ ∅.

Proof of (b): If s ∈ µ0 ({t}), then (t, s) ∈ RU and there is a set T 0 ∈ U


containing t for which (t, s) ∈ T 0 × µ(T 0 ). Fact 1 tells us that µ0 (T 0 ) = µ(T 0 ).
By definition we have Dt ⊆ T 0 , and applying µ0 reverses this inclusion to
µ0 (Dt ) ⊇ µ0 (T 0 ). Since s is in µ0 (T 0 ) we now have s ∈ µ0 (Dt ). This shows that
µ0 ({t}) ⊆ µ0 (Dt ). Conversely, t ∈ Dt and so {t} ⊆ Dt , and then applying µ0
gives µ0 ({t}) ⊇ µ0 (Dt ). Altogether we have the equality µ0 ({t}) = µ0 (Dt ).
Proof of (c): From (b) we have µ0 ({t}) = µ0 (Dt ). Since Dt ∈ U, we have
ι0 µ0 ({t}) = ι0 µ0 (Dt ) = ι0 µ(Dt ) = ιµ(Dt ) = Dt , by Facts 1 and 2.
Proof of (d): It is clear from the definition of Dt that T ⊆ ⋃_{t∈T} Dt . For the
opposite inclusion we have

T = ι0 µ0 (T ) = ι0 µ0 ( ⋃_{t∈T} {t}) ⊇ ι0 µ0 ({t}) = Dt for all t ∈ T,

using the result of (c).


Proof of (e): We start with the fact that T = ⋃_{t∈T} Dt , from (d). Since each Dt ∈ U, we can apply Fact 1 to get

µ(T ) = µ( ⋃_{t∈T} Dt ) = ⋂_{t∈T} µ(Dt ) = ⋂_{t∈T} µ0 (Dt ) = µ0 ( ⋃_{t∈T} Dt ) = µ0 (T ) = S.

From Fact 2, we have ι(S) = ι0 (S) = ι0 µ0 (T ) = T . This shows that (*) holds,
completing the proof of (ii) that RU is a Galois-closed subrelation of R.

(iii) Now we must show that for any complete sublattice U of Hι µ , and any
Galois-closed subrelation R0 of R, we have URU = U and RUR0 = R0 .

We know that URU := Hι0 µ0 , the lattice of subsets of A closed under the
closure operator ι0 µ0 induced from the relation RU . This means that T ∈ URU
iff ι0 µ0 (T ) = T . First let T ∈ URU , and let S be the set µ0 (T ). Then we have
ι0 (S) = T , and since RU is a Galois-closed subrelation of R we conclude that

µ(T ) = S and ι(S) = T.


If T = ∅ then T ∈ U, and for T ≠ ∅ we use the same argument as before to show that T = ⋃_{t∈T} Dt (see (d) above). But now T = ιµ(T ) = ιµ( ⋃_{t∈T} Dt ) = sup{Dt | t ∈ T } ∈ U. This shows one direction, that URU ⊆ U.

For the opposite inclusion, let T ∈ U. Then using the fact that U is a
sublattice of Hι µ , along with Fact 2, we have

ι0 µ0 (T ) = ι0 µ(T ) = ι µ(T ) = T.
This shows that T ∈ Hι0 µ0 , which is equal to URU . We now have the required
equality URU = U.

Now let R0 be a Galois-closed subrelation of R, and set


UR0 := Hι0 µ0 = {T ⊆ A | ι0 µ0 (T ) = T }, and
RUR0 := ⋃ {T × µ(T ) | T ∈ UR0 }.
We will show that RUR0 = R0 .
First, if (t, s) ∈ R0 then s ∈ µ0 ({t}). Setting S := µ0 ({t}), we have s ∈ S
and ι0 (S) = ι0 µ0 ({t}). Now taking T := ι0 µ0 ({t}), we have ι0 µ0 (T ) = T , so
T ∈ UR0 and µ0 (T ) = S and ι0 (S) = T . Therefore µ(T ) = S and ι(S) = T .

Since t ∈ ι0 µ0 ({t}) = T and s ∈ S = µ(T ), we get (t, s) ∈ T × µ(T ) and


T ∈ UR0 . Hence (t, s) ∈ RUR0 , and we have shown that R0 ⊆ RUR0 .

To show the opposite inclusion, let T ∈ UR0 , and let S = µ(T ). Then from
Facts 1 and 2 we have
µ0 (T ) = µ(T ) = S and ι0 (S) = ι(S) = ι µ(T ) = T.
Therefore T × µ(T ) ⊆ R0 , and RUR0 ⊆ R0 . Altogether, we have RUR0 = R0 .
This completes the proof of part (iii), and of Theorem 13.2.4.
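On small finite sets, part (ii) of Theorem 13.2.4 can be verified exhaustively. The sketch below uses a hand-picked relation and a chain U inside the lattice of closed subsets of A (both chosen purely for illustration); it checks that the induced relation RU is a proper subrelation of R satisfying the Galois-closedness condition:

```python
from itertools import combinations

# Small hand-made example; the relation R and the chain U are
# illustrative assumptions, not data from the text.
A = {1, 2, 3}
B = {"x", "y", "z"}
R = {(1, "x"), (1, "y"), (1, "z"), (2, "y"), (3, "z")}

def galois(rel):
    # The Galois-connection (µ, ι) induced by an arbitrary relation rel.
    def mu(T):
        return frozenset(b for b in B if all((t, b) in rel for t in T))
    def iota(S):
        return frozenset(a for a in A if all((a, s) in rel for s in S))
    return mu, iota

def subsets(X):
    xs = list(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

mu, iota = galois(R)

# U: a chain of ιµ-closed sets, hence a complete sublattice of Hιµ.
U = [frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})]
assert all(iota(mu(T)) == T for T in U)  # each member really is closed

RU = {(t, s) for T in U for t in T for s in mu(T)}
mu2, iota2 = galois(RU)

assert RU < R  # RU is a proper subrelation of R
for T in subsets(A):
    for S in subsets(B):
        if mu2(T) == S and iota2(S) == T:       # mutually closed under RU ...
            assert mu(T) == S and iota(S) == T  # ... hence also under R
```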

13.3 Closure Operators on Complete Lattices


In this section we will describe one more method to produce complete sub-
lattices of a given complete lattice. We will do so by consideration of the
fixed points of a certain kind of closure operator defined on the complete
lattice.

As before, we start with a relation R between two sets A and B, with induced
Galois-connection (µ, ι) and corresponding closure operators ιµ on A and µι
on B. We denote by Hιµ and Hµι the corresponding complete lattices of
closed sets on A and B respectively. Now we assume that γ1 : P(A) → P(A)
and γ2 : P(B) → P(B) are additive closure operators which are conjugate
with respect to the relation R. As we saw in Section 13.1, this conjugate pair
determines a new relation Rγ from A to B, with its own induced Galois-
connection (ιγ , µγ ) and closure operators ιγ µγ and µγ ιγ . We now have three
lattices of closed sets on A, corresponding to closure under the operators
ιµ, ιγ µγ and γ1 , and dually three lattices on B. From Theorems 13.1.5 and
13.1.7 we have the following connection between these lattices:

Hιγ µγ = Hιµ ∩ Hγ1 and Hµγ ιγ = Hµι ∩ Hγ2 .

For notational convenience, we shall henceforth denote the lattice Hιγ µγ by


Sγ1 , and dually the lattice Hµγ ιγ by Sγ2 .

In this section we examine how the lattice Sγ1 is situated in the lattice Hιµ .
In particular, for any set T ∈ Hιµ we can look for the least γ1 -closed class T^{γ1} containing T and the greatest γ1 -closed class T_{γ1} contained in T . Thus we consider two operators Γ1 : T ↦ T^{γ1} and Γ2 : T ↦ T_{γ1} , whose properties
will be studied in more detail. We will present our definitions and results for
the lattices Hιµ and Sγ1 on A, but of course these results can be dualized
for the corresponding lattices on B as well.

Definition 13.3.1 Let T be an arbitrary subset of A. Then we define:

T^{γ1} := ⋂ {T 0 ∈ Sγ1 | T 0 ⊇ T },

T_{γ1} := ⋁_{Sγ1} {T 0 ∈ Sγ1 | T 0 ⊆ T } = ⋁_{Hιµ} {T 0 ∈ Sγ1 | T 0 ⊆ T }.

(Note that the last equality holds because Sγ1 is a complete sublattice of
Hιµ , as we proved in Theorem 13.1.7.)

We shall need the following preliminary lemma.

Lemma 13.3.2 For any sets T and T 0 in Hιµ and any set S in Hµι ,

(i) T 0 ⊆ T iff µ(T 0 ) ⊇ µ(T ).

(ii) S = µ(T ) ∈ Sγ2 iff T = ι(S) ∈ Sγ1 .


Proof: (i) This follows directly from the definition of a Galois-connection.

(ii) Let T ∈ Hιµ and S ∈ Hµι . If S = µ(T ) ∈ Sγ2 , then ι(S) = ιµ(T ) = T
and also ιγ2 (S) = ι(S). This gives T = ιγ2 (S). Applying µ to both sides and
using 13.1.4 (i0 ) and 13.2.2 (iii), we get

µ(T ) = µιγ2 (S) = µιγ (S) = µγ ιγ (S).

Now we apply ι to both sides of this result, to get ιµγ ιγ (S) = ι(S). Finally
we have

ιγ (S) = ιγ µγ ιγ (S) = ιµγ ιγ (S) = ι(S) = ιµ(T ) = T,

so T = ι(S) ∈ Hιγ µγ = Sγ1 . The other direction can be proved similarly.

Now we can prove our first properties of the sets T^{γ1} and T_{γ1} .

Proposition 13.3.3 Let T ⊆ A. Then:


(i) T^{γ1} = ιγ µγ (T ) = ιµγ (T ) = ιµ γ1 (T ); and in particular, T^{γ1} is the
ιγ µγ -closed set generated by T .

(ii) If T = ιµ(T ), then

(a) T^{γ1} = T iff T_{γ1} = T iff γ1 (T ) = T ,
(b) T_{γ1} is the greatest ιγ µγ -closed set contained in T , and
(c) T_{γ1} = ιγ µ(T ) = ι γ2 µ(T ).

Proof: (i) This follows from the definition of T^{γ1} , Proposition 13.2.2 (iii)
and Proposition 13.1.4 (i).

(ii) (a) The first equivalence follows from the definitions, while the second
one follows directly from Theorem 13.1.5 and the fact that Sγ1 = Hιγ µγ .

(ii) (b) By definition, T_{γ1} is a join of elements in the lattice Sγ1 , so it is in Sγ1 . Moreover, this lattice is equal to Hιγ µγ , which is a complete sublattice of Hιµ , so T_{γ1} is ιγ µγ -closed. It is also contained in T , since all the sets in the join are contained in T and T itself is assumed to be ιµ-closed. Moreover, every ιγ µγ -closed set T 0 contained in T is also in Sγ1 and is therefore contained in T_{γ1} by definition; so T_{γ1} is the largest such set.

(ii) (c) By Lemma 13.3.2 we have

T_{γ1} = ⋁_{Hιµ} {T 0 ∈ Sγ1 | T 0 ⊆ T } = ιµ( ⋃ {T 0 ∈ Sγ1 | T 0 ⊆ T })
= ι( ⋂ {µ(T 0 ) ∈ Sγ2 | T 0 ⊆ T })
= ι( ⋂ {µ(T 0 ) ∈ Sγ2 | µ(T 0 ) ⊇ µ(T )})
= ι( ⋂ {S 0 ∈ Sγ2 | S 0 ⊇ µ(T )}) = ιµγ ιγ µ(T )
= ιγ µγ ιγ µ(T ), by 13.2.2
= ιγ µ(T ).

For any set T in the lattice Hιµ , let us denote by [T_{γ1} , T^{γ1}] the interval between T_{γ1} and T^{γ1} in Hιµ . Such intervals will be called γ1 -intervals in Hιµ . It is possible that different sets T may produce the same γ1 -interval. This suggests that we define an equivalence relation ∼ on Hιµ , by T1 ∼ T2 :⇐⇒ [(T1)_{γ1} , (T1)^{γ1}] = [(T2)_{γ1} , (T2)^{γ1}].

A set T in Hιµ will be called ιγ µγ -collapsing if the γ1 -interval [T_{γ1} , T^{γ1}] = {T }, that is, if the interval “collapses” to the singleton set containing T . Thus collapsing sets are uniquely characterized by their ιγ µγ -closure.

Proposition 13.3.4 Let T , T1 and T2 be ιµ-closed subsets of A. Then the


following properties hold:

(i) T ∈ Sγ1 iff [T_{γ1} , T^{γ1}] = {T },

(ii) T1 ⊆ T2 implies (T1)^{γ1} ⊆ (T2)^{γ1} and (T1)_{γ1} ⊆ (T2)_{γ1} ,

(iii) (T1)_{γ1} ∧_{Hιµ} (T2)_{γ1} = (T1 ∧_{Hιµ} T2)_{γ1} ,

(iv) (T1)^{γ1} ∨_{Hιµ} (T2)^{γ1} = (T1 ∨_{Hιµ} T2)^{γ1} .

Proof: (i) and (ii) follow directly from Definition 13.3.1.



(iii) Since Sγ1 is a complete sublattice of Hιµ by Theorem 13.1.7, the set on the left hand side is an element of Sγ1 , and it is contained in both T1 and T2 . Therefore it is also contained in the greatest set from Sγ1 contained in T1 ∧_{Hιµ} T2 , which is the set on the right hand side of (iii). Conversely, the set on the right hand side of (iii) is by definition an element of Sγ1 , and by part (ii) it is contained in both (T1)_{γ1} and (T2)_{γ1} . Thus it is also contained in the set on the left hand side.

(iv) This is dual to (iii).

Proposition 13.3.5 (i) The mapping Γ1 : Hιµ → Hιµ defined by T ↦ T^{γ1} is a closure operator on Hιµ , and satisfies

( ⋁_{Hιµ} {Tj | j ∈ J})^{γ1} = ⋁_{Hιµ} {(Tj)^{γ1} | j ∈ J}.

(ii) The mapping Γ2 : Hιµ → Hιµ defined by T ↦ T_{γ1} is a kernel operator on Hιµ , and satisfies

( ⋂ {Tj | j ∈ J})_{γ1} = ⋂ {(Tj)_{γ1} | j ∈ J}.

Proof: By part (i) of Proposition 13.3.3, the new operator Γ1 coincides with the closure operator ιγ µγ ; so it is a closure operator. The operator Γ2 is isotone by 13.3.4 part (ii). It is intensive, that is, Γ2 (T ) ⊆ T for any T in Hιµ , by 13.3.3 part (ii)(b). It is also idempotent, since T_{γ1} ∈ Sγ1 is γ1 -closed and therefore (T_{γ1})_{γ1} = T_{γ1} by 13.3.3 part (ii)(a). Thus Γ2 is a kernel operator.

The two equalities are generalizations of parts (iv) and (iii) of Proposition 13.3.4, and may be proved in the same manner. Note that ⋂ equals ⋀_{Hιµ} in the lattice Hιµ .

Remark 13.3.6 From the definitions and Proposition 13.3.4 parts (iii) and
(iv), we conclude that the operators Γ1 and Γ2 have the following properties:

(i) The mapping Γ1 is a join-retraction from Hιµ onto Sγ1 ⊆ Hιµ ; that is, it
is an idempotent join-homomorphism which is the identity map on Sγ1 .

(ii) Analogously, the mapping Γ2 is a meet-retraction from Hιµ onto Sγ1 ;


that is, it is an idempotent meet-homomorphism which is the identity map
on Sγ1 .

(iii) Note that, in general, Γ1 does not preserve meets and Γ2 does not
preserve joins.

In Chapter 2 we showed that there is a 1-1 correspondence between closure


operators ϕ : L → L on a complete lattice L and closure systems S on L
(that is, subsets of L closed under arbitrary meets). This correspondence
occurs via the following maps. For any closure operator ϕ : L → L, we get
the closure system

S := F ix(ϕ) := Hϕ = {T ∈ L | ϕ(T ) = T }

of all fixed points of ϕ; and for any closure system S we have the closure
operator ϕ defined by
ϕ(T ) = ϕS (T ) := ⋀_L {T 0 ∈ S | T ≤ T 0 } for T ∈ L.

Moreover, for any closure system S and any closure operator ϕ, we have HϕS
= S and ϕHϕ = ϕ.
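This correspondence is easy to see on a small powerset lattice; the following sketch (with an arbitrarily chosen closure system S, an assumption for the demo) recovers S as the fixed points of ϕS and checks the closure-operator laws:

```python
from itertools import combinations

# L: the powerset of {1, 2} ordered by inclusion (a toy complete lattice).
base = {1, 2}
L = [frozenset(c) for r in range(3) for c in combinations(sorted(base), r)]

# A closure system S on L: closed under arbitrary meets (intersections),
# including the empty meet, i.e. it contains the top element {1, 2}.
S = [frozenset({1, 2}), frozenset({1}), frozenset()]

def phi(T):
    # ϕ_S(T) = meet (intersection) of all members of S above T
    above = [X for X in S if T <= X]
    result = frozenset(base)
    for X in above:
        result &= X
    return result

# Fix(ϕ_S) recovers S, and ϕ is extensive and idempotent.
assert {T for T in L if phi(T) == T} == set(S)
for T in L:
    assert T <= phi(T) and phi(phi(T)) == phi(T)
```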

There is also a dual 1-1 correspondence between kernel operators and ker-
nel systems on a complete lattice L. A kernel system on L is a family S of
subsets of L which is closed under arbitrary joins. Then for kernel operators
ψ : L → L on L and kernel systems S on L, we set

S := F ix(ψ) := {T ∈ L | ψ(T ) = T };
ψ(T ) := ψS (T ) := ⋁_L {T 0 ∈ S | T 0 ≤ T } for T ∈ L.
We have F ix(ψS ) = S and ψF ix(ψ) = ψ, for any kernel system S and any
kernel operator ψ on L.

A result of A. Tarski ([111]) shows that for any closure operator ϕ on a com-
plete lattice L, the closure system (fixed-point set) Hϕ is always a complete

lattice with respect to ≤. However, it is not necessarily a sublattice of L.


Thus we look for some additional condition under which a complete sublat-
tice is obtained. The answer (and its dual for kernel operators) is given in
the following theorem.

Theorem 13.3.7 Let L be a complete lattice.

(i) If ϕ is a closure operator on L which satisfies


ϕ( ⋁_L {Tj | j ∈ J}) = ⋁_L {ϕ(Tj ) | j ∈ J} (∗)

for every index set J, then the set of all fixed points under ϕ,

Hϕ = {T ∈ L | ϕ(T ) = T },

is a complete sublattice of L and ϕ(L) = Hϕ .

(ii) Conversely, if H is a complete sublattice of L, then the function ϕH


which is defined by
ϕH (T ) := ⋀_L {T 0 ∈ H | T ≤ T 0 }

is a closure operator on L with ϕH (L) = H, and ϕH satisfies the condition


(*). Moreover, HϕH = H and ϕHϕ = ϕ.

(iii) If ψ is a kernel operator on L which satisfies


ψ( ⋀_L {Tj | j ∈ J}) = ⋀_L {ψ(Tj ) | j ∈ J} (∗∗)

for every index set J, then the set of all fixed points under ψ,

Hψ = {T ∈ L | ψ(T ) = T },

is a complete sublattice of L and ψ(L) = Hψ .

(iv) Conversely, if H is a complete sublattice of L then the function ψH which


is defined by
ψH (T ) := ⋁_L {T 0 ∈ H | T 0 ≤ T }

is a kernel operator on L with ψH (L) = H, and ψH satisfies the condition (**). Moreover, HψH = H and ψHψ = ψ.

Proof: (i) Let ϕ be a closure operator on L which satisfies the condition


(*). We have to prove that the set of all fixed points under ϕ is a complete
sublattice of L, that is, that for any index set J, both
⋀_L {Tj ∈ Hϕ | j ∈ J} ∈ Hϕ and ⋁_L {Tj ∈ Hϕ | j ∈ J} ∈ Hϕ .

It is clear that ϕ( ⋀_L {Tj ∈ Hϕ | j ∈ J}) ≥ ⋀_L {Tj ∈ Hϕ | j ∈ J}. For each j ∈ J we have Tj = ϕ(Tj ) ≥ ϕ( ⋀_L {Tj ∈ Hϕ | j ∈ J}), and from this we obtain ⋀_L {Tj ∈ Hϕ | j ∈ J} ≥ ϕ( ⋀_L {Tj ∈ Hϕ | j ∈ J}). Altogether this gives equality, and ⋀_L {Tj ∈ Hϕ | j ∈ J} ∈ Hϕ . The fact that ϕ satisfies the join condition (*) gives ⋁_L {Tj ∈ Hϕ | j ∈ J} ∈ Hϕ . Thus we have a sublattice of L. It is clear that Hϕ ⊆ ϕ(L). Since ϕ is idempotent, ϕ(T ) is in Hϕ for all T ∈ L. This shows that ϕ(L) = Hϕ .

(ii) By Remark 13.3.6, we need only show that the closure operator defined
by
ϕH (T ) := ⋀_L {T 0 ∈ H | T ≤ T 0 }
satisfies condition (*) and that ϕH (L) = H. We prove the latter fact first.
Since H is a complete sublattice of L, we have ϕH (L) ⊆ H. For the opposite
inclusion, we see that for any T ∈ H

ϕH (T ) = ⋀_L {T 0 ∈ H | T ≤ T 0 } = T.
Thus H ⊆ ϕH (L), and altogether we have H = ϕH (L).
Since for each j ∈ J we have Tj ≤ ϕH (Tj ) and ϕH (Tj ) ∈ H, the set ⋁_L {ϕH (Tj ) | j ∈ J} is an upper bound of the set {Tj ∈ L | j ∈ J}. Therefore

⋁_L {Tj ∈ L | j ∈ J} ≤ ⋁_L {ϕH (Tj ) | j ∈ J}.

Since the set on the right hand side of this inequality is an element of H, applying ϕH on both sides gives

ϕH ( ⋁_L {Tj ∈ L | j ∈ J}) ≤ ⋁_L {ϕH (Tj ) | j ∈ J}.

Since ⋁_L {Tj ∈ L | j ∈ J} ≥ Tj , we have ϕH ( ⋁_L {Tj ∈ L | j ∈ J}) ≥ ϕH (Tj ) for all j ∈ J. Thus also ϕH ( ⋁_L {Tj ∈ L | j ∈ J}) ≥ ⋁_L {ϕH (Tj ) | j ∈ J}, giving the required equality.

(iii), (iv) These proofs are analogous to those of (i) and (ii).

The equations follow from Remark 13.3.6, by restricting the one-to-one map-
ping between closure operators and complete lattices to closure operators
satisfying condition (*) and to complete sublattices.
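The role of condition (*) can be illustrated on the powerset of a three-element set: a join-preserving closure operator whose fixed points form a complete sublattice, versus a closure operator violating (*) whose fixed points, while still a complete lattice under ≤, are not a sublattice. Both operators below are toy examples chosen only for illustration:

```python
from itertools import combinations

base = {1, 2, 3}
L = [frozenset(c) for r in range(4) for c in combinations(sorted(base), r)]

def is_complete_sublattice(H):
    # For a finite lattice it suffices to check binary joins and meets.
    return all(X | Y in H and X & Y in H for X in H for Y in H)

# ϕ1 satisfies (*): ϕ1(T1 ∪ T2) = ϕ1(T1) ∪ ϕ1(T2).
phi1 = lambda T: T | {1}
H1 = {T for T in L if phi1(T) == T}
assert is_complete_sublattice(H1)

# ϕ2 is a closure operator, but collapses every set with ≥ 2 elements;
# its fixed points do NOT form a sublattice of L.
phi2 = lambda T: T if len(T) <= 1 else frozenset(base)
H2 = {T for T in L if phi2(T) == T}
assert not is_complete_sublattice(H2)

# ... and indeed ϕ2 violates condition (*):
assert phi2(frozenset({1}) | frozenset({2})) != phi2(frozenset({1})) | phi2(frozenset({2}))
```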

We can apply this Theorem to the special case of conjugate pairs of closure
operators studied in Section 13.1. Using the notation from 13.1 and applying
13.3.5 part (i), we take L = Hιµ , and ϕ = Γ1 (= ιγ µγ , as in 13.3.3 (i)) and
ψ = Γ2 (= ιγ µ, as in 13.3.3 (ii)). This gives an additional proof of the fact
that Sγ1 = Hιγ µγ = F ix(ϕ) = F ix(ψ) is a complete sublattice of Hιµ .

Closure and kernel operators on complete lattices have been studied by K.


P. Shum and A. Yang in [107] and K. Denecke in [22]. The techniques of
this section have also been used by K. Denecke and S. L. Wismath in [38]
in a general construction to produce complete sublattices. The application
of this construction to the Galois-connection (Id, M od) encompasses several
well-known results on regular and normal identities, as well as some new
families of identities and varieties.

13.4 Exercises
13.4.1. Let R be a relation between sets A and B. Prove that the closure
operators µι and ιµ obtained from the Galois-connection (µ, ι) induced by
R are conjugate with respect to R.

13.4.2. Prove the additional properties for conjugate pairs of additive closure
operators listed at the end of Section 13.1.

13.4.3. Let R be a relation between sets A and B, with induced Galois-


connection (µ, ι). Let γ := (γ1 , γ2 ) be a pair of additive closure operators
which are conjugate with respect to R. Verify that the relation Rγ defined
in 13.1.3 is a Galois-closed subrelation of R.

13.4.4. Prove that for an additive closure operator γ, the least upper bound
operation on the lattice Hγ agrees with the union operation.

13.4.5. This exercise investigates partial closure operators. Let A be a non-


empty set. A partial mapping C : P (A) → P (A) is called a partial closure
operator on A if it satisfies the following conditions, for every X, Y ⊆ A:

(i) if C(X) is defined, then X ⊆ C(X),


(ii) if C(X) and C(Y ) are defined, then X ⊆ Y implies C(X) ⊆ C(Y ),
(iii) if C(X) is defined, then C(C(X)) = C(X), and
(iv) C({x}) is defined for every x ∈ A.

If X ⊆ A and C(X) = X, then X is said to be a closed set. A family F of


subsets of A is called a partial closure system on A if it satisfies the following
two conditions:

(i) ⋃ F = A, and

(ii) for every x ∈ A, ⋂ {X ∈ F | x ∈ X} ∈ F.

Prove that the family of closed sets of a partial closure operator on a set
A is a partial closure system on A, and conversely that for every partial
closure system F on A there is a partial closure operator on A whose family
of closed sets is exactly F. (See B. Šešelja and A. Tepavčević, [106].)

13.4.6. Prove that every partially ordered set (P ; ≤) is isomorphic to a par-


tial closure system on P , ordered by inclusion.
Chapter 14

G-Clones and M-Solid Varieties

In Chapter 13 we studied methods of producing complete sublattices of a


complete lattice. We now apply these methods to our two chief examples
of complete lattices, the lattice of all clones on a fixed set and the lattice
of all varieties of algebras of a given type. We have seen, in Chapters 2
and 6, that both of these lattices arise as the lattices of closed sets from a
Galois-connection.

14.1 G-Clones
In this section we apply our theory of Galois-closed subrelations to the lattice
of clones on a fixed set. We assume a fixed base set A, and denote by O(A)
the set of all finitary operations on A and by R(A) the set of all finitary
relations on set A. As our basic relation between these two sets we have the
relation R of preservation:

R = {(f, ρ) | f ∈ O(A), ρ ∈ R(A) and f preserves ρ}.

We saw in Chapter 2 that this relation induces a Galois-connection of the


form (P olA , InvA ), between sets of operations and sets of relations. From
this we obtain two lattices of closed sets, the lattice of all clones on the set
A and the lattice of all relational clones on A. Thus clones on A are sets C
of operations for which P olA InvA C = C, and dually relational clones are
sets Q of relations for which InvA P olA Q = Q. Now we want to produce a


Galois-closed subrelation of the relation R, with a corresponding complete


sublattice of the lattice of clones on A.
To form this subrelation, we focus on certain kinds of operations on A. We
let SA be the symmetric group of all permutations defined on the set A.
Then for every n-ary operation f ∈ O(A) and any permutation s ∈ SA we
can define a new operation f s , of the same arity as f , by

f s (a1 , . . . , an ) := s(f (s−1 (a1 ), s−1 (a2 ), . . . , s−1 (an ))),

for all a1 , . . . , an ∈ A.
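A quick sketch of this conjugation on the two-element set: taking s to be the transposition of 0 and 1 (the only non-identity permutation there), conjugating the Boolean ∧ produces ∨, by De Morgan's law. The choice of operations here is an assumption made just for this illustration:

```python
# Conjugating a Boolean operation by the permutation s = ¬ on {0, 1}.
s = {0: 1, 1: 0}                      # the transposition of 0 and 1
s_inv = {v: k for k, v in s.items()}  # its inverse (here, s itself)

def conj(f, s, s_inv):
    """Return f^s: (a1,...,an) -> s(f(s^-1(a1),...,s^-1(an)))."""
    return lambda *args: s[f(*(s_inv[a] for a in args))]

AND = lambda x, y: x & y
OR = lambda x, y: x | y

AND_s = conj(AND, s, s_inv)
# De Morgan: conjugating ∧ by negation yields ∨.
for x in (0, 1):
    for y in (0, 1):
        assert AND_s(x, y) == OR(x, y)
```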
We use this to define, for any fixed subgroup G ⊆ SA of permutations, a
mapping γG O on operations and sets of operations. For any operation f and

any set F ⊆ O(A), we set


γG O (f ) := {f s | s ∈ G} and γG O (F ) := ⋃ {γG O (f ) | f ∈ F }.

This gives a map γG O on the power set of O(A), which is our first candidate

for a closure operator.

Lemma 14.1.1 For every subgroup G ⊆ SA the operator


γG O : P(O(A)) → P(O(A))

defined by F ↦ γG O (F ) is an additive closure operator on O(A).
Proof: By definition our mapping γG O is additive and therefore monotone, so that

F1 ⊆ F2 ⇒ γG O (F1 ) ⊆ γG O (F2 ).

Since the subgroup G contains the identity permutation ϕid and f ϕid = f for every f ∈ F , we have F ⊆ γG O (F ), making the operator γG O extensive. From extensivity and monotonicity it follows that γG O (F ) ⊆ γG O (γG O (F )). For the other inclusion for idempotency, we see that for any two permutations s and s0 in G we have

(f s )s0 (a1 , . . . , an ) = s0 (f s (s0−1 (a1 ), . . . , s0−1 (an )))
= s0 (s(f (s−1 (s0−1 (a1 )), . . . , s−1 (s0−1 (an )))))
= (s0 ◦ s)(f ((s0 ◦ s)−1 (a1 ), . . . , (s0 ◦ s)−1 (an ))),

and since s0 ◦ s ∈ G we have γG O (γG O (F )) ⊆ γG O (F ).

Definition 14.1.2 Let G ⊆ SA be a permutation group on the set A. A


clone C on A is called a G-clone if γG O (C) = C; so C is closed with respect to the operator γG O .
G-clones have been studied by several authors: see for instance Gorlov and
Pöschel, [50], and N. van Hoa, [56]. The special case where f s = f for
each element f of the clone and a permutation s ∈ SA was considered by
Demetrovics and Hannák in [17], Demetrovics, Hannák and Marchenkov in
[18] and [19], by Marchenkov in [76] and by Csákány and Gavalcová in [13].

We can use the closure operator γG O to define another relation RG between operations and relations on A, by setting

RG := {(f, ρ) | f ∈ O(A), ρ ∈ R(A) and γG O (f ) × {ρ} ⊆ R}.

To show that RG is a Galois-closed subrelation of R, we look for another


additive closure operator, this time on the set of relations on A, in order to
make a pair of operators conjugate with respect to the original relation R
of preservation. For any h-ary relation ρ ∈ Rh (A) and any s ∈ G, we define
(as in Rosenberg, [103])

ρs := {(s(x1 ), . . . , s(xh )) | (x1 , . . . , xh ) ∈ ρ}.

As before, we use this to define an operator on individual relations and on sets Q of relations on A, with γG R (ρ) := {ρs | s ∈ G} and γG R (Q) := ⋃ {γG R (ρ) | ρ ∈ Q}. Then it is straightforward to verify, as in Lemma 14.1.1, that the mapping γG R is a closure operator on R(A).

Lemma 14.1.3 For every subgroup G ⊆ SA the operator


γG R : P(R(A)) → P(R(A))

defined by Q ↦ γG R (Q) is an additive closure operator on R(A).

Definition 14.1.4 A relational clone Q ⊆ R(A) is called a G-relational


clone if γG R (Q) = Q.
Now we have a pair (γG O , γG R ) of additive closure operators, between sets of
operations and sets of relations, and we can verify that these operators are
conjugate with respect to the relation R of preservation.

Lemma 14.1.5 For any f ∈ O(A), for any ρ ∈ R(A) and for any subgroup
G ⊆ SA , we have:
γG O (f ) preserves ρ iff f preserves γG R (ρ).

Proof: We prove first that for every s ∈ SA , f preserves ρ iff f s preserves ρs


(see I. G. Rosenberg, [103]). If (a11 , . . . , a1h ), . . . , (an1 , . . . , anh ) are h-tuples
from ρ then

(f s (s(a11 ), . . . , s(an1 )), . . . , f s (s(a1h ), . . . , s(anh )))


= (s(f (s−1 (s(a11 )), . . . , s−1 (s(an1 )))), . . . ,
s(f (s−1 (s(a1h )), . . . , s−1 (s(anh )))))
= (s(f (a11 , . . . , an1 )), . . . , s(f (a1h , . . . , anh ))) ∈ ρs ,

and thus (f (a11 , . . . , an1 ), . . . , f (a1h , . . . , anh )) ∈ ρ, and conversely.


Now if γG O (f ) preserves ρ then f s−1 preserves ρ for every s−1 ∈ G. Using the result just proved, we have that (f s−1 )s = f preserves ρs , for every s ∈ G.
This means that f preserves γG R (ρ). The converse can be shown in a similar

way.
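The first step of this proof — f preserves ρ iff f s preserves ρs — can be checked mechanically on the two-element set. In the sketch below, s is negation, ρ is the order relation on {0, 1}, and the sampled operations are arbitrary illustrative choices:

```python
from itertools import product

s = {0: 1, 1: 0}
s_inv = s  # negation is its own inverse

def conj(f):
    # f^s as defined above, specialised to this s
    return lambda *args: s[f(*(s_inv[a] for a in args))]

def preserves(f, rho, arity=2):
    # f (n-ary) preserves the h-ary relation rho if applying f
    # coordinatewise to any n tuples from rho lands back in rho.
    for rows in product(rho, repeat=arity):
        image = tuple(f(*(row[i] for row in rows)) for i in range(len(rows[0])))
        if image not in rho:
            return False
    return True

rho = {(0, 0), (0, 1), (1, 1)}                 # the order relation ≤ on {0, 1}
rho_s = {tuple(s[x] for x in t) for t in rho}  # ρ^s (which is ≥)

AND = lambda x, y: x & y
for f in (AND, lambda x, y: x | y, lambda x, y: x ^ y):
    assert preserves(f, rho) == preserves(conj(f), rho_s)
```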

Combining the three previous Lemmas gives the following conclusion.

Theorem 14.1.6 Let G be a subgroup of SA . Then the pair γG := (γG O , γG R ) is a conjugate pair of additive closure operators with respect to the relation

R = {(f, ρ) | f ∈ O(A), ρ ∈ R(A) and f preserves ρ}.

Our conjugate pair of closure operators induces the relation

RG = {(f, ρ) | f ∈ O(A), ρ ∈ R(A) and {f } × γG R (ρ) ⊆ R}

between O(A) and R(A). The Galois-connection induced by this relation


RG is denoted by (GInvA , GP olA ). We know from Section 13.2 that this
relation is a Galois-closed subrelation of the original relation R. We also
have a number of properties of our closure operators and closed sets, from
the theorems of Section 13.1.

Theorem 14.1.7 Let G ⊆ SA be a subgroup of the full permutation group


SA on a set A. Then the set of all G-clones of operations defined on A forms

a complete sublattice of the lattice LA of all clones of operations defined on


A. We denote this sublattice of LA by LG 0
0 0
A . If G is a subgroup of G , then
G G G
LA ⊆ LA , and LA is a complete sublattice of LA . G

Dually, the set of all G-relational clones forms a complete sublattice of the
lattice of all relational clones.
Proof: The fact that we get complete sublattices, of the lattices of clones and of relational clones respectively, comes from Theorem 13.1.7. If G ⊆ G0 ⊆ SA are subgroups, then clearly γG O (F ) ⊆ γG0 O (F ) for every F ⊆ O(A), and if γG0 O (F ) = F then also γG O (F ) = F , so LG0 A ⊆ LG A . Moreover RG0 is a Galois-closed subrelation of RG , so that LG0 A is a complete sublattice of LG A .

Theorem 13.1.4 gives us a number of interactions between the various closed


sets, which we restate here for the G-clone setting.

Proposition 14.1.8 For all F ⊆ O(A) and all Q ⊆ R(A), the following
properties hold:
(i) GP olA Q = P olA γG R (Q),
(ii) GP olA Q ⊆ P olA Q,
(iii) γG O (GP olA Q) = GP olA Q,
(iv) GP olA GInvA F = P olA InvA γG O (F ); and dually,

(i0 ) GInvA F = InvA γG O (F ),
(ii0 ) GInvA F ⊆ InvA F ,
(iii0 ) γG R (GInvA F ) = GInvA F ,
(iv0 ) GInvA GP olA Q = InvA P olA γG R (Q).
The Main Theorem for Conjugate Pairs of additive closure operators, The-
orem 13.1.5, gives us a characterization of G-clones and G-relational clones.

Theorem 14.1.9 Let G ⊆ SA be a subgroup of the full symmetric group on


set A. Let F ⊆ O(A) be a clone and let Q ⊆ R(A) be a relational clone. Then
the following conditions (i) - (v) are equivalent for F , and dually conditions
(i0 ) -(v0 ) are equivalent for Q:
(i) F = GP olA GInvA F , (i0 ) Q = GInvA GP olA Q,
(ii) γG O (F ) = F , (ii0 ) γG R (Q) = Q,
(iii) InvA F = GInvA F , (iii0 ) P olA Q = GP olA Q,
(iv) γG R (InvA F ) = InvA F , (iv0 ) γG O (P olA Q) = P olA Q,
(v) F = P olA GInvA F , (v0 ) Q = InvA GP olA Q.

Proof: These conditions all come from Theorem 13.1.5, except for (iv) and
(iv0 ), which are simply applications of (ii0 ) and (ii).

We also have the following conditions which can be derived from Theorem
13.1.6 (see also K. Denecke and M. Reichel, [35]).

Proposition 14.1.10 Let G ⊆ SA be a subgroup of the full permutation


group on set A. Then for any F ⊆ O(A) and Q ⊆ R(A) the following
properties hold:
(i) γG O (F ) ⊆ P olA InvA F ⇔ P olA InvA F = GP olA GInvA F ,
(ii) γG O (F ) ⊆ P olA InvA F ⇔ γG O (P olA InvA F ) = P olA InvA F ,
(i0 ) γG R (Q) ⊆ InvA P olA Q ⇔ InvA P olA Q = GInvA GP olA Q,
(ii0 ) γG R (Q) ⊆ InvA P olA Q ⇔ γG R (InvA P olA Q) = InvA P olA Q.
Condition (ii) is a useful tool in checking whether a clone is a G-clone. Suppose that we have a generating set or basis F for a clone, so that ⟨F⟩ := Pol_A Inv_A F. To test if this clone is a G-clone, by (ii) it is enough to check whether the set γ_G^O(F) is included in ⟨F⟩.

As an example we will apply this method to L_2, the lattice of all clones on the two-element set A = {0, 1}. As we saw in Section 10.3, this lattice was first completely described by E. L. Post, in [95]. We will use here the notation of Jablonskij, Gawrilow and Kudrjawzew in [60]. It is clear that the group S_2 of all permutations defined on the set {0, 1} contains only one non-trivial function, namely the negation ¬ : x ↦ ¬x. This means that S_2-clones are those clones F which are self-dual as sets, that is, sets with F^¬ = F. We list here the clones we shall need, with the notational convention that we denote a clone by the name used in Section 10.3 for the corresponding two-element algebra.

O1 = J_A, the clone of projections,
O4 = ⟨¬⟩, the clone of projections and their negations,
O8 = ⟨c_0^2, c_1^2⟩, the clone of constants (c_0^2, c_1^2 are the constants with value 0 and 1, respectively),
O9 = ⟨c_0^2, ¬⟩, the clone of essentially unary operations,
L4 = ⟨g⟩, g(x, y, z) = x + y + z, the clone of linear idempotent operations,
L5 = ⟨g, ¬⟩, the clone of linear self-dual operations,
L1 = ⟨c_1^2, +⟩, the clone of all linear operations,
D2 = ⟨h⟩, h(x, y, z) := (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), the clone of self-dual monotone operations,
D1 = ⟨g, h⟩, the clone of self-dual idempotent operations,
D3 = ⟨g, h, ¬⟩, the clone of self-dual operations,
A4 = ⟨∧, ∨⟩, the clone of monotone idempotent operations,
A1 = ⟨c_0^2, c_1^2, ∧, ∨⟩, the clone of monotone operations,
C4 = ⟨∨, t⟩, for t(x, y, z) = x ∧ (y + z + 1), the clone of idempotent operations,
C1 = O(A), the clone of all operations on {0, 1}.

By checking the generating systems, it can be verified that all of these clones
are S2 -clones. That these are all the S2 -clones was proved by Gorlov and
Pöschel in [50].
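For the two-element case the criterion γ_{S_2}^O(F) ⊆ ⟨F⟩ can be checked mechanically on generators. The following Python sketch (our own illustration, not code from the text; all function names are ours) computes the conjugate f^¬ of an operation pointwise on {0, 1}:

```python
from itertools import product

def conjugate_by_neg(f):
    """Return f^neg, where f^neg(a1,...,an) = neg(f(neg(a1),...,neg(an)))."""
    return lambda *args: 1 - f(*(1 - a for a in args))

def same_operation(f, g, arity):
    """Compare two operations on {0,1} pointwise on all argument tuples."""
    return all(f(*t) == g(*t) for t in product((0, 1), repeat=arity))

AND = lambda x, y: x & y
OR  = lambda x, y: x | y
g = lambda x, y, z: (x + y + z) % 2               # generator of L4
h = lambda x, y, z: (x & y) | (x & z) | (y & z)   # majority, generator of D2

print(same_operation(conjugate_by_neg(g), g, 3))     # True: g is self-dual
print(same_operation(conjugate_by_neg(h), h, 3))     # True: h is self-dual
print(same_operation(conjugate_by_neg(AND), OR, 2))  # True: AND^neg = OR
```

So γ_{S_2}^O({∧, ∨}) = {∧, ∨} ⊆ ⟨∧, ∨⟩, which is consistent with A4 = ⟨∧, ∨⟩ being an S_2-clone, while a clone such as ⟨∧⟩ alone fails the test, since its conjugate generator ∨ is not a term operation of ∧.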

Theorem 14.1.11 ([50]) There are exactly fourteen S_2-clones on the two-element set A = {0, 1}:

O1, O4, O8, O9, L1, L4, L5, D2, D1, D3, A4, A1, C4 and C1.

The complete sublattice L_2^{S_2} of L_2 is given by the Hasse diagram below.

Gorlov and Pöschel also proved (see [50]) that there are forty-eight S_3-clones of operations defined on the three-element set A = {0, 1, 2}.

14.2 H-clones
In [50], V. V. Gorlov and R. Pöschel described several generalizations of
G-clones to H-clones, where H is a transformation monoid. We will use a
new approach here, by applying a closure operator on the lattice of all clones
which is different from that given in Section 14.1.
Let TA = (O1 (A), ◦, ϕid ) be the monoid of all unary mappings or transfor-
mations on A, where ◦ is the composition of unary operations and ϕid is the
identity mapping on A.
[Hasse diagram of the complete lattice L_2^{S_2} of the fourteen S_2-clones, ordered by inclusion, with O1 at the bottom and C1 at the top.]

For every unary mapping ϕ ∈ O_1(A) and every n-ary mapping f ∈ O_n(A), we define a new mapping f^ϕ, by setting

f^ϕ(a_1, . . . , a_n) = ϕ(f(ϕ(a_1), . . . , ϕ(a_n))),

for all (a_1, . . . , a_n) ∈ A^n. We use this to define, for any set H of unary mappings, an operator on individual mappings and on sets F ⊆ O(A) of mappings, by

γ_H^O(f) := {f^ϕ | ϕ ∈ H}   and   γ_H^O(F) := ⋃_{f ∈ F} γ_H^O(f).
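As a concrete instance (an illustrative Python sketch of ours, with hypothetical helper names), conjugating a binary operation on {0, 1} by the constant transformation c_0 collapses it to the constant-0 operation, which is how the operator γ_{H}^O adds constants to a clone:

```python
from itertools import product

def conjugate(f, phi):
    """f^phi(a1,...,an) = phi(f(phi(a1),...,phi(an))), as defined above."""
    return lambda *args: phi(f(*(phi(a) for a in args)))

AND = lambda x, y: x & y
c0 = lambda x: 0                    # the constant unary transformation c_0

f_c0 = conjugate(AND, c0)
# Conjugating by c_0 collapses any binary operation to the constant-0 operation:
print([f_c0(a, b) for a, b in product((0, 1), repeat=2)])   # [0, 0, 0, 0]
```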

When H is the base set of a submonoid of T_A, we get the following result.

Lemma 14.2.1 For every submonoid H of T_A, the mapping γ_H^O is an additive closure operator on O(A).

We define the following subrelation of the preservation relation R:

R_H := {(f, ρ) | f ∈ O(A), ρ ∈ R(A) and γ_H^O(f) × {ρ} ⊆ R}.

This subrelation induces a Galois-connection (HPol_A, HInv_A) between sets of relations and sets of operations on A. The sets of operations which are closed under this Galois-connection, that is, the clones F such that γ_H^O(F) = F, are called H-clones. We will denote by L_A^H the lattice of all H-clones on A. Since we do not have a conjugate pair of additive closure operators here, it is not clear whether this lattice forms a complete sublattice of the lattice L_A. It is always at least a meet-subsemilattice of L_A.

We conclude this section by examining in more detail the case that A is the two-element set {0, 1}. In this case we have exactly four unary operations: the identity operation ϕ_id, the two constant operations c_0 and c_1, and the negation operation ¬. It is easy to check that the four-element monoid T_{0,1} then has the following proper submonoids:

H1 = {ϕ_id}, H2 = {ϕ_id, c_0}, H3 = {ϕ_id, c_1}, H4 = {ϕ_id, c_0, c_1}, H5 = {ϕ_id, ¬}.

The submonoid lattice of T_{0,1} has the form shown below.

[Hasse diagram: H1 at the bottom; H2, H3 and H5 in the middle layer; H4 above H2 and H3; T_{0,1} at the top.]

We obtain the following complete lattices of H-closed sets:

(i) γ_{H1}^O(F) = F holds for all clones F ⊆ O(A), so that L_A^{H1} = L_A.

(ii) γ_{H2}^O(F) = F ∪ F^{c_0} = F ∪ {c_0^n | n ∈ N} = F iff c_0 ∈ F. This means that L_A^{H2} is the set of all clones containing the constant c_0. (Here c_0^n is the n-ary constant 0 operation.)

(iii) L_A^{H3} is the set of all clones containing the constant c_1.

(iv) L_A^{H4} is the set of all clones containing both constants c_0 and c_1.

(v) L_A^{H5} = {O1, O4, O8, O9, L4, L5, L1, D2, D1, D3, A4, A1, C4, C1} is the set of all self-dual clones.

It turns out that all of these lattices are complete sublattices of LA , for
A = {0, 1}. The following picture shows the structure of the set of all these
lattices.

[Hasse diagram: L_A^{T_{0,1}} at the bottom; L_A^{H4} above it; L_A^{H3}, L_A^{H2} and L_A^{H5} above that; L_A at the top.]

14.3 M-Solid Varieties


In this section we apply our theory of conjugate pairs of closure operators
to the lattice of all varieties of a given type. We assume a fixed type τ of
operation symbols, and consider the sets A = Wτ(X)^2 of all identities of
type τ and B = Alg(τ ) of all algebras of type τ . Between these two sets we
have the basic relation R of satisfaction: that is, a pair (s ≈ t, A) ∈ R iff the
algebra A satisfies the identity s ≈ t. As we saw in Section 6.1, this relation
induces the Galois-connection (Id, M od) between sets of identities and sets
of algebras. On the algebra side, the closed sets are the equational classes
or varieties, and we have the complete lattice L(τ ) of all varieties of type τ .
Dually, the closed sets on the identity side are the equational theories, and
we have the complete lattice E(τ ) of all equational theories of type τ . These
two lattices L(τ ) and E(τ ) are dually isomorphic, and in general are very
large and complex. Thus it is important to find some means of studying at
least portions of these lattices, such as complete sublattices.

Our goal is to introduce two new closure operators on our sets A and B,
which we shall show form a conjugate pair of additive closure operators. The
results of Section 13.1 then give us complete sublattices of our two lattices.
The new operators we use are based on the concept of hypersatisfaction of
an identity by a variety. We begin with the definition of a hypersubstitution,
as introduced by Denecke, Lau, Pöschel and Schweigert in [32]. A complete
study of hypersubstitutions and hyperidentities may be found in [37].

A hypersubstitution of type τ is a mapping which associates to every operation symbol fi a term σ(fi) of type τ, of the same arity as fi. Any hypersubstitution σ can be uniquely extended to a map σ̂ on the set Wτ(X) of all terms of type τ, inductively as follows:

(i) if t = xj for some j ≥ 1, then σ̂[t] = xj;

(ii) if t = fi(t1, . . . , tni) for some ni-ary operation symbol fi and some terms t1, . . . , tni, then σ̂[t] = σ(fi)(σ̂[t1], . . . , σ̂[tni]).

Here the right side of (ii) means the composition of the term σ(fi) with the terms σ̂[t1], . . . , σ̂[tni].
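The inductive definition can be made concrete in a few lines of Python. The nested-tuple encoding of terms and the function names below are our own illustrative conventions, not notation from the text:

```python
# A term is a variable name such as 'x', or a tuple (op_symbol, t1, ..., tn).
# A hypersubstitution sigma is a dict sending each operation symbol to a term
# of the same arity, written in the variables 'x1', ..., 'xn'.

def plug(term, args):
    """Compose: substitute args[i-1] for the variable 'x{i}' in term."""
    if isinstance(term, str):
        return args[int(term[1:]) - 1]
    return (term[0],) + tuple(plug(s, args) for s in term[1:])

def extend(sigma):
    """Return the extension sigma-hat, defined inductively on terms."""
    def sigma_hat(t):
        if isinstance(t, str):                # clause (i): variables are fixed
            return t
        head, *subs = t                       # clause (ii): compose sigma(f_i)
        return plug(sigma[head], [sigma_hat(s) for s in subs])
    return sigma_hat

# Type (2) example: sigma sends the binary symbol f to the term f(x2, x1).
sigma = {'f': ('f', 'x2', 'x1')}
t = ('f', ('f', 'x', 'y'), 'z')               # the term f(f(x, y), z)
print(extend(sigma)(t))                       # ('f', 'z', ('f', 'y', 'x'))
```

Applying this σ̂ dualizes the term: f(f(x, y), z) becomes f(z, f(y, x)).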

We can define a binary operation ◦h on the set Hyp(τ ) of all hypersubstitu-


tions of type τ , by taking σ1 ◦h σ2 to be the hypersubstitution which maps
each fundamental operation symbol fi to the term σ̂1 [σ2 (fi )]. That is,

σ1 ◦h σ2 := σ̂1 ◦ σ2 ,
where ◦ denotes ordinary composition of functions. We will show that this
operation is associative, and that the set of all hypersubstitutions forms a
monoid. The identity element is the identity hypersubstitution σid , which
maps every fi to fi (x1 , . . . , xni ).

Proposition 14.3.1 Let τ be any fixed type.

(i) For any two hypersubstitutions σ and ρ of type τ, we have (σ ◦h ρ)ˆ = (σ̂ ◦ ρ)ˆ = σ̂ ◦ ρ̂.

(ii) The binary operation ◦h is associative.

(iii) (Hyp(τ); ◦h, σid) is a monoid.
Proof: (i) This can be proved by induction on the complexity of terms, using
the definition of the extension of a hypersubstitution; we leave the details as
an exercise for the reader. (See Exercise 14.5.2.)

(ii) Let σ1 , σ2 and σ3 be any three elements of Hyp(τ ). Then from (i) we
have σ1 ◦h (σ2 ◦h σ3 ) = σ̂1 ◦ (σ̂2 ◦ σ3 ) = (σ̂1 ◦ σ̂2 ) ◦ σ3 = (σ̂1 ◦ σ2 )ˆ◦ σ3 =
(σ1 ◦h σ2 ) ◦h σ3 .

(iii) This follows directly from (ii).
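Property (i) can also be checked mechanically on sample terms. The sketch below re-uses our nested-tuple encoding of terms (an illustrative convention of ours, not notation from the text):

```python
def plug(term, args):
    """Substitute args[i-1] for the variable 'x{i}' in term."""
    if isinstance(term, str):
        return args[int(term[1:]) - 1]
    return (term[0],) + tuple(plug(s, args) for s in term[1:])

def extend(sigma):
    """Extension sigma-hat of a hypersubstitution to all terms."""
    def sigma_hat(t):
        if isinstance(t, str):
            return t
        head, *subs = t
        return plug(sigma[head], [sigma_hat(s) for s in subs])
    return sigma_hat

def comp_h(s1, s2):
    """sigma1 o_h sigma2: apply sigma1-hat to each term sigma2(f)."""
    return {f: extend(s1)(term) for f, term in s2.items()}

s1 = {'f': ('f', 'x2', 'x1')}                 # f -> f(x2, x1)
s2 = {'f': ('f', 'x1', ('f', 'x1', 'x2'))}    # f -> f(x1, f(x1, x2))
t = ('f', ('f', 'x', 'y'), 'z')

# (s1 o_h s2)-hat agrees with s1-hat composed with s2-hat on t:
print(extend(comp_h(s1, s2))(t) == extend(s1)(extend(s2)(t)))   # True
```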

Definition 14.3.2 Let M be any submonoid of Hyp(τ ). An algebra A is


said to M -hypersatisfy an identity u ≈ v if for every hypersubstitution σ ∈
M , the identity σ̂[u] ≈ σ̂[v] holds in A. In this case we say that the identity
u ≈ v is an M -hyperidentity of A. An identity is called an M -hyperidentity
of a variety V if it holds as an M -hyperidentity in every algebra in V . A
variety V is called M -solid if every identity of V is an M -hyperidentity of
V . When M is the whole monoid Hyp(τ ), an M -hyperidentity is called a
hyperidentity, and an M -solid variety is called a solid variety.
Let M be any submonoid of Hyp(τ ). Since M contains the identity hyper-
substitution, any M -hyperidentity of a variety V is an identity of V . This
means that the relation of M -hypersatisfaction, defined between Alg(τ ) and
Wτ (X)2 , is a subrelation of the relation of satisfaction from which we induced
our Galois-connection (Id, M od). The new Galois-connection induced by the
relation of M -hypersatisfaction is (HM M od, HM Id), defined on classes K
and sets Σ as follows:

H_M Id K = {s ≈ t ∈ Wτ(X)^2 : s ≈ t is an M-hyperidentity of A for all A in K},

H_M Mod Σ = {A ∈ Alg(τ) : all identities in Σ are M-hyperidentities of A}.

The Galois-closed classes of algebras under this connection are precisely the
M -solid varieties of type τ , which then form a complete sublattice of the

lattice of all varieties of type τ . Thus studying M -solid and solid varieties
gives a way to study complete sublattices of the lattice of all varieties of a
given type.

We now introduce some closure operators on the two sets Alg(τ) and Wτ(X)^2. On the equational side, we can use the extensions of our M-hypersubstitutions to map any terms and identities to new ones. That is, we define an operator χ_M^E by

χ_M^E[u ≈ v] = {σ̂[u] ≈ σ̂[v] : σ ∈ M}.

This extends, additively, to sets of identities, so that for any set Σ of identities we set

χ_M^E[Σ] = ⋃{χ_M^E[u ≈ v] : u ≈ v ∈ Σ}.

Hypersubstitutions can also be applied to algebras, as follows. Given an algebra A = (A; (fi)i∈I) and a hypersubstitution σ, we define the algebra σ(A) := (A; (σ(fi)^A)i∈I). This algebra is called the derived algebra determined by A and σ. Notice that by definition it is of the same type as the algebra A. Now we define an operator χ_M^A on the set Alg(τ), first on individual algebras and then on classes K of algebras, by

χ_M^A[A] = {σ(A) : σ ∈ M}, and
χ_M^A[K] = ⋃{χ_M^A[A] : A ∈ K}.

Proposition 14.3.3 Let τ be a fixed type and let M be any submonoid of Hyp(τ). The two operators χ_M^E and χ_M^A are additive closure operators and are conjugate with respect to the relation R of satisfaction.

Proof: Since M is a submonoid of Hyp(τ), it contains the identity hypersubstitution σid, and the extension of this hypersubstitution maps any term t to itself. This shows that the operator χ_M^E is extensive. The property of monotonicity is clear from the definition. The idempotency follows from the fact that M is a monoid: for any σ and ρ in M, the composition σ ◦h ρ is also in M; so for any identity u ≈ v, the identity σ̂[ρ̂[u]] ≈ σ̂[ρ̂[v]] is again in χ_M^E[u ≈ v]. Thus χ_M^E is a closure operator. The proof for χ_M^A is similar. Both closure operators are additive by definition, so it remains only to show that they are conjugate with respect to satisfaction. For this we need to show that for any algebra A and any identity u ≈ v of type τ, we have

χ_M^A[A] satisfies u ≈ v iff A satisfies χ_M^E[u ≈ v].

For any σ ∈ M, the definition of satisfaction means that A satisfies σ̂[u] ≈ σ̂[v] iff the induced term operations satisfy σ̂[u]^A = σ̂[v]^A. Similarly, σ(A) satisfies u ≈ v iff u^{σ(A)} = v^{σ(A)}. But σ̂[u]^A = u^{σ(A)}, and similarly for v. This shows that σ(A) satisfies u ≈ v iff A satisfies σ̂[u] ≈ σ̂[v], and completes our proof.
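The key step of the proof, σ̂[u]^A = u^{σ(A)}, can be verified by brute force on a small algebra. In the Python sketch below, the term encoding, the chosen base operation, and all helper names are our own illustrative conventions:

```python
from itertools import product

def plug(term, args):
    if isinstance(term, str):
        return args[int(term[1:]) - 1]
    return (term[0],) + tuple(plug(s, args) for s in term[1:])

def extend(sigma):
    def sigma_hat(t):
        if isinstance(t, str):
            return t
        head, *subs = t
        return plug(sigma[head], [sigma_hat(s) for s in subs])
    return sigma_hat

def variables(t):
    return {t} if isinstance(t, str) else set().union(*map(variables, t[1:]))

def evaluate(t, op, env):
    """Evaluate a term in the algebra ({0,1}; f -> op) under assignment env."""
    if isinstance(t, str):
        return env[t]
    return op(*(evaluate(s, op, env) for s in t[1:]))

def satisfies(op, u, v):
    """Does the algebra ({0,1}; op) satisfy the identity u ~ v?"""
    xs = sorted(variables(u) | variables(v))
    return all(evaluate(u, op, dict(zip(xs, vals))) ==
               evaluate(v, op, dict(zip(xs, vals)))
               for vals in product((0, 1), repeat=len(xs)))

op = lambda a, b: a & (1 - b)                 # a non-commutative base operation
sigma = {'f': ('f', 'x2', 'x1')}              # the dualizing hypersubstitution
# The operation of the derived algebra sigma(A) is the term operation sigma(f)^A:
derived = lambda a, b: evaluate(sigma['f'], op, {'x1': a, 'x2': b})

u = ('f', ('f', 'x', 'y'), 'z')
v = ('f', 'x', ('f', 'y', 'z'))
sh = extend(sigma)
# sigma(A) satisfies u ~ v  iff  A satisfies sigma-hat[u] ~ sigma-hat[v]:
print(satisfies(derived, u, v) == satisfies(op, sh(u), sh(v)))   # True
```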

Once we know that our two additive closure operators form a conjugate pair,
we can apply our Main Theorem for such conjugate pairs, Theorem 13.1.5.
Translating that theorem into the specific case here, we have the following
description of the closed objects.

Theorem 14.3.4 Let M be a monoid of hypersubstitutions of type τ. For any variety V of type τ, the following conditions are equivalent:

(i) V = H_M Mod H_M Id V.

(ii) χ_M^A[V] = V.

(iii) Id V = H_M Id V.

(iv) χ_M^E[Id V] = Id V.

And dually, for any equational theory Σ of type τ, the following conditions are equivalent:

(i′) Σ = H_M Id H_M Mod Σ.

(ii′) χ_M^E[Σ] = Σ.

(iii′) Mod Σ = H_M Mod Σ.

(iv′) χ_M^A[Mod Σ] = Mod Σ.

Since for any variety V we have H_M Id V ⊆ Id V, condition (iii) of this theorem says that V is an M-solid variety, since every identity of V is an M-hyperidentity. In analogy with the (Id, Mod) case, a variety which satisfies condition (i) is called an M-hyperequational class. Thus our M-solid
varieties are precisely the M -hyperequational classes. Dual results hold for
M -hyperequational theories, using the second half of the theorem. In ad-
dition, this tells us that the relation of hypersatisfaction is a Galois-closed
subrelation of the satisfaction relation. Moreover, from Theorem 13.1.7 we
have the following result.

Theorem 14.3.5 Let M be a monoid of hypersubstitutions of type τ. Then the class S_M(τ) of all M-solid varieties of type τ forms a complete sublattice of the lattice L(τ) of all varieties of type τ. Dually, the class of all M-hyperequational theories forms a complete sublattice of the lattice of all equational theories of type τ.

When M1 and M2 are both submonoids of Hyp(τ) and M1 is a submonoid of M2, then the corresponding complete lattices satisfy S_{M2}(τ) ⊆ S_{M1}(τ). As a special case, for any M ⊆ Hyp(τ) we see that the lattice S(τ) of all solid varieties of type τ is always a sublattice of the lattice S_M(τ). At the other extreme, for the smallest possible submonoid M = {σid} the corresponding lattice of M-solid varieties is the whole lattice L(τ) of all varieties of type τ. Thus we obtain a range of complete sublattices, from all of L(τ) down to S(τ). The following definition lists a number of interesting submonoids M for which the corresponding lattices S_M(τ) have been studied, both in the general setting and for specific types τ.

Definition 14.3.6 Let τ be a fixed type.


(i) A hypersubstitution σ ∈ Hyp(τ) is said to be leftmost if for every i ∈ I, the first variable in σ̂[fi(x1, . . . , xni)] is x1. The set of all leftmost hypersubstitutions of type τ forms a submonoid of Hyp(τ). The monoid of all rightmost hypersubstitutions is defined dually.

(ii) A hypersubstitution σ ∈ Hyp(τ ) is said to be outermost if it is both


leftmost and rightmost. The set of all outermost hypersubstitutions of type
τ forms a submonoid of Hyp(τ ).

(iii) A hypersubstitution σ ∈ Hyp(τ ) is called regular if for every i ∈ I, all


the variables x1 , . . ., xni occur in the term σ̂[fi (x1 , . . . , xni )]. The set Reg(τ )
of all regular hypersubstitutions of type τ forms a submonoid of Hyp(τ ),
and a variety which is M -solid for this submonoid M is called regular-solid.
(iv) A hypersubstitution σ ∈ Hyp(τ ) is called a pre-hypersubstitution if


for every i ∈ I, the term σ(fi ) is not a variable. The set of all pre-
hypersubstitutions of type τ forms a submonoid of Hyp(τ ), and a variety
which is M -solid for this monoid is called presolid.

(v) A hypersubstitution σ ∈ Hyp(τ ) is called symmetrical, or a permuta-


tion hypersubstitution, if for every i ∈ I there is a permutation π on the set
of indices {1, 2, . . . , ni } such that σ̂[fi (x1 , . . . , xni )] = fi (xπ(1) , . . . , xπ(ni ) ).
The set of all symmetrical hypersubstitutions of type τ forms a submonoid
of Hyp(τ ), and a variety which is M -solid for this submonoid is called
permutation-solid.

When V is a variety of type τ , we can form the lattice L(V ) of all subvarieties
of V (see Section 6.6). Then the intersection SM (V ) := SM (τ ) ∩ L(V ) is the
lattice of all M -solid subvarieties of V . Such lattices have been investigated
for a number of choices of V and M , but most work has been done for the
case that V is the type (2) variety Sem of all semigroups. We give here some
examples of results in this direction.

For any variety W ⊆ Sem, the associative law is an identity in W . This


means that a necessary condition for W to be solid is that it is hyperassocia-
tive, meaning that it satisfies the associative law as a hyperidentity. It is easy
to see that the trivial variety T and the rectangular variety RB of type (2) are
both solid, and that any non-trivial solid variety of semigroups must contain
the variety RB. At the other extreme, the largest solid variety of semigroups
must be the hypermodel class H_{Hyp(τ)} Mod{f(f(x, y), z) ≈ f(x, f(y, z))} of
the associative law: we know from Theorem 14.3.4 that this hypermodel
class is a solid variety in which the associative law is a hyperidentity, and
any solid semigroup variety must be contained in this one. This variety was
called VHS , the hyperassociative variety, by K. Denecke and J. Koppitz, who
first gave a finite (but very large) basis for it in [26]. Another much smaller
basis was given by L. Polák in [91].

Theorem 14.3.7 ([91]) The largest solid variety of semigroups is the variety

V_HS = Mod{x(yz) ≈ (xy)z, x^2 ≈ x^4, xyxzxyx ≈ xyzyx, xy^2z^2 ≈ xyz^2yz^2, x^2y^2z ≈ x^2yx^2yz}.

One direction of this theorem is easy to prove. The variety V_HS must of course satisfy the associative law, and by applying the hypersubstitutions taking the binary operation symbol f to the four semigroup terms x^2, xyx, x^2y and xy^2 we get the other four identities in the claimed basis. This shows that V_HS is contained in the model class of the set of five identities given in the theorem. The proof of the other direction involves showing that the variety defined by these five identities is indeed hyperassociative, and is too complex for us to give here.
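This derivation is easy to mechanize. In the sketch below (our own encoding, not from the text), semigroup terms are binary trees over one-letter variables, each hypersubstitution is given by the word function for σ(f), and the result of σ̂ is flattened to a semigroup word; the four terms x^2, xyx, x^2y and xy^2 are those named in the text:

```python
def sigma_hat(rule, t):
    """Extension of a binary hypersubstitution, flattened to semigroup words.
    rule(u, v) returns the word sigma(f)(u, v); variables are one-letter words."""
    if isinstance(t, str):
        return t
    _, a, b = t
    return rule(sigma_hat(rule, a), sigma_hat(rule, b))

left = ('f', ('f', 'x', 'y'), 'z')    # the term (xy)z
right = ('f', 'x', ('f', 'y', 'z'))   # the term x(yz)

rules = {
    'x^2':  lambda u, v: u + u,       # sigma(f) = x^2
    'xyx':  lambda u, v: u + v + u,   # sigma(f) = xyx
    'x^2y': lambda u, v: u + u + v,   # sigma(f) = x^2 y
    'xy^2': lambda u, v: u + v + v,   # sigma(f) = x y^2
}

for name, rule in rules.items():
    print(name, sigma_hat(rule, left), '~', sigma_hat(rule, right))
# x^2   xxxx    ~ xx        (x^2 ≈ x^4)
# xyx   xyxzxyx ~ xyzyx
# x^2y  xxyxxyz ~ xxyyz     (x^2yx^2yz ≈ x^2y^2z)
# xy^2  xyyzz   ~ xyzzyzz   (xy^2z^2 ≈ xyz^2yz^2)
```

Each applied hypersubstitution turns the associative law into exactly one of the four non-associativity identities in the basis of Theorem 14.3.7.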

Using this equational basis for the largest solid variety of semigroups, L.
Polák also gave a characterization of all solid semigroup varieties.

Theorem 14.3.8 ([92]) Let V be a non-trivial variety of semigroups. Then V is solid iff either Mod{x(yz) ≈ (xy)z ≈ xz} ⊆ V ⊆ V_HS and V is permutation-solid, or V is one of the three varieties

RB = Mod{x(yz) ≈ (xy)z, x^2 ≈ x, xyz ≈ xz},
NB = Mod{x(yz) ≈ (xy)z, x^2 ≈ x, xyzw ≈ xzyw}, or
RegB = Mod{x(yz) ≈ (xy)z, x^2 ≈ x, xyxzxyx ≈ xyzyx}.

As a consequence of this characterization theorem it can be shown that there


are infinitely many solid semigroup varieties.
M -solidity for semigroup varieties has been investigated for other choices of
M besides the solid case of M = Hyp(τ ).

Theorem 14.3.9 ([27]) (i) The largest presolid but not solid variety of semigroups is the variety

V_PS := Mod{(xy)z ≈ x(yz), xyxzxyx ≈ xyzyx, x^2 ≈ y^2, x^3 ≈ y^3}.

(ii) A non-trivial variety V of semigroups is presolid iff either V is solid, or V is permutation-solid and Mod{xy ≈ zw} ⊆ V ⊆ V_PS.

Theorem 14.3.10 ([23]) The largest regular-solid variety of semigroups is the variety

V_RS := Mod{(xy)z ≈ x(yz), x^2y^2z ≈ x^2yx^2yz, xy^2z^2 ≈ xyz^2yz^2, xyxzxyx ≈ xyzyx}.
Similar characterizations have been given for the cases of edge-solid,


permutation-solid and other M -solid varieties of semigroups. Hyperidenti-
ties and M -solid varieties have also been studied for other kinds of algebras
besides semigroups. K. Denecke and P. Jampachon studied regular-solid va-
rieties of commutative and idempotent groupoids in [25], and Denecke and
S. Arworn looked at left- and right-edge solid varieties of entropic groupoids
in [3]. For type (2, 1) algebras, inverse semigroups were studied by D. Cowan
and S. L. Wismath in [11] and [12], star-bands by J. Koppitz and S. L. Wis-
math in [67] and graph algebras by K. Denecke and T. Poomsa-ard in [34].
Quasigroups were investigated by K. Denecke and M. Reichel in [36]. For
algebras of type (2, 2), D. Schweigert studied lattices in [105] and Denecke
and Hounnon looked at solid varieties of semirings in [24].

14.4 Intervals in the Lattice L(τ)

In Section 13.3 we described a general approach for finding intervals in a complete lattice, based on conjugate pairs of closure operators. Now we apply this general theory to the specific lattice L(τ), with the conjugate pair (χ_M^E, χ_M^A) of additive closure operators for a monoid M of hypersubstitutions. The results of this section were proved by S. Arworn and K. Denecke in [2]. In this setting, Definition 13.3.1 takes the following concrete form.

Definition 14.4.1 Let K ⊆ Alg(τ) be an arbitrary class of algebras of type τ and let M ⊆ Hyp(τ) be an arbitrary monoid of hypersubstitutions of type τ. Then we define

χ^M K := ⋂{V′ | V′ is M-solid and V′ ⊇ K}, and

χ_M K := ⋁{V′ | V′ is M-solid and V′ ⊆ K}.

We also set

[χ_M K, χ^M K] := {V′ | V′ is a variety of type τ and χ_M K ⊆ V′ ⊆ χ^M K},

and call this set an interval in the lattice L(τ).

Applying the theory of Section 13.3 gives the following proposition.


Proposition 14.4.2 Let V be a variety of type τ and let M be a monoid of hypersubstitutions. Then

(i) χ_M V = V iff χ^M V = V iff V is M-solid, and so [χ_M V, χ^M V] = {V} iff V is M-solid.

(ii) χ_M V = H_M Mod Id V = Mod χ_M^E[Id V], and χ^M V = H_M Mod H_M Id V = Mod Id χ_M^A[V].

(iii) If V1 ⊆ V2 then χ^M V1 ⊆ χ^M V2 and χ_M V1 ⊆ χ_M V2; also χ_M V1 ∧ χ_M V2 = χ_M(V1 ∧ V2) and χ^M V1 ∨ χ^M V2 = χ^M(V1 ∨ V2).

(iv) χ^M defines a closure operator satisfying χ^M[⋁{Vj | j ∈ J}] = ⋁{χ^M Vj | j ∈ J}, and χ_M defines a kernel operator satisfying χ_M[⋂{Vj | j ∈ J}] = ⋂{χ_M Vj | j ∈ J}.

Lemma 14.4.3 Let M1 and M2 be submonoids of Hyp(τ) and let V be a variety of type τ. If M1 ⊆ M2, then

(i) χ^{M1} V ⊆ χ^{M2} V,

(ii) χ_{M1} V ⊇ χ_{M2} V,

(iii) [χ_{M1} V, χ^{M1} V] ⊆ [χ_{M2} V, χ^{M2} V].

Proof: (i) Additivity of the operator χ_M^A means that if M1 ⊆ M2, then χ_{M1}^A[V] ⊆ χ_{M2}^A[V]. Applying the monotone closure operator Mod Id to each side gives Mod Id χ_{M1}^A[V] ⊆ Mod Id χ_{M2}^A[V], and by 14.4.2(ii) we have χ^{M1} V ⊆ χ^{M2} V.

(ii) Additivity of the operator χ_M^E means that if M1 ⊆ M2 then χ_{M1}^E[Id V] ⊆ χ_{M2}^E[Id V]. Application of the anti-isotone operator Mod on both sides gives Mod χ_{M1}^E[Id V] ⊇ Mod χ_{M2}^E[Id V]. Again 14.4.2(ii) gives χ_{M1} V ⊇ χ_{M2} V.

(iii) It is clear that for intervals, [χ_{M1} V, χ^{M1} V] ⊆ [χ_{M2} V, χ^{M2} V] iff χ_{M1} V ⊇ χ_{M2} V and χ^{M1} V ⊆ χ^{M2} V; and then we apply (i) and (ii).

By Lemma 14.4.3, the map ϕ from the lattice Sub(Hyp(τ)) of all submonoids of Hyp(τ) to the lattice of all varieties of type τ, which associates to each M ⊆ Hyp(τ) the M-solid variety χ^M V, is order-preserving. Dually, the mapping ψ which associates to every submonoid M ⊆ Hyp(τ) the M-solid variety χ_M V is order-reversing. The behaviour of these mappings with respect to meets and joins is considered next.

Lemma 14.4.4 Let V be a variety of type τ. Let ϕ and ψ be mappings from Sub(Hyp(τ)) to L(τ) defined by ϕ : M ↦ χ^M V and ψ : M ↦ χ_M V, respectively, for M ⊆ Hyp(τ). Then the following are satisfied, for any submonoids M1 and M2 of Hyp(τ):

(i) ϕ(M1 ∩ M2) ⊆ ϕ(M1) ∩ ϕ(M2), and ϕ(M1 ∨ M2) ⊇ ϕ(M1) ∨ ϕ(M2),

(ii) ψ(M1 ∩ M2) ⊇ ψ(M1) ∨ ψ(M2), and ψ(M1 ∨ M2) ⊆ ψ(M1) ∩ ψ(M2).

Proof: Both claims follow directly from Lemma 14.4.3.

As a consequence, we have the following inclusions for intervals:

[χ_{M1} V ∩ χ_{M2} V, χ^{M1} V ∨ χ^{M2} V] ⊆ [χ_{(M1∨M2)} V, χ^{(M1∨M2)} V] = [ψ(M1 ∨ M2), ϕ(M1 ∨ M2)],

[χ_{M1} V ∨ χ_{M2} V, χ^{M1} V ∩ χ^{M2} V] ⊇ [χ_{(M1∩M2)} V, χ^{(M1∩M2)} V] = [ψ(M1 ∩ M2), ϕ(M1 ∩ M2)].

We investigate next when we obtain equality for these intervals.

Lemma 14.4.5 Let M1 and M2 be submonoids of Hyp(τ) and let V be a variety of type τ. If M1 ∨ M2 = M1 ∪ M2, then ϕ(M1 ∨ M2) = ϕ(M1) ∨ ϕ(M2) and ψ(M1 ∨ M2) = ψ(M1) ∩ ψ(M2). Therefore

[χ_{M1} V ∩ χ_{M2} V, χ^{M1} V ∨ χ^{M2} V] = [χ_{(M1∨M2)} V, χ^{(M1∨M2)} V].

Proof: Using the assumption, the additivity of the closure operator χ_M^A and the properties of the Galois-connection (Id, Mod), we have

ϕ(M1 ∨ M2) = χ^{(M1∨M2)} V = χ^{(M1∪M2)} V = Mod Id χ_{(M1∪M2)}^A[V]
= Mod Id (χ_{M1}^A[V] ∪ χ_{M2}^A[V]) = Mod(Id χ_{M1}^A[V] ∩ Id χ_{M2}^A[V])
= Mod Id χ_{M1}^A[V] ∨ Mod Id χ_{M2}^A[V] = χ^{M1} V ∨ χ^{M2} V
= ϕ(M1) ∨ ϕ(M2).

Similarly, we have

ψ(M1 ∨ M2) = χ_{(M1∨M2)} V = Mod χ_{(M1∪M2)}^E[Id V]
= Mod(χ_{M1}^E[Id V] ∪ χ_{M2}^E[Id V]) = Mod χ_{M1}^E[Id V] ∩ Mod χ_{M2}^E[Id V]
= χ_{M1} V ∩ χ_{M2} V = ψ(M1) ∩ ψ(M2).

14.5 Exercises
14.5.1. Prove Lemma 14.2.1.

14.5.2. Prove that for any two hypersubstitutions σ and ρ of a fixed type τ, we have (σ ◦h ρ)ˆ = (σ̂ ◦ ρ)ˆ = σ̂ ◦ ρ̂.

14.5.3. Let M ⊆ Hyp(τ) be a monoid of hypersubstitutions. A variety V of type τ is called M-hyperequationally simple if V has no M-solid subvarieties other than the trivial variety T. Prove the following:

a) V is M-hyperequationally simple iff χ_M V = T.

b) If M1 and M2 are submonoids of Hyp(τ) with M1 ⊆ M2, then when V is M1-hyperequationally simple it is also M2-hyperequationally simple.

14.5.4. Let V be a variety of semigroups. Show that if V satisfies an identity


u ≈ v in which the leftmost variables in u and v are different, then V is
hyperequationally simple.

14.5.5. A variety V of type τ is called M-hyperidentity-free if the only identities satisfied as M-hyperidentities by V are trivial ones of the form u ≈ u. Prove that V is M-hyperidentity-free iff χ^M V = Alg(τ).

14.5.6. Prove that if M1 and M2 are submonoids of Hyp(τ ) with M1 ⊆ M2 ,


then when V is M1 -hyperidentity-free it is also M2 -hyperidentity-free.

14.5.7. Construct some examples of M -hyperidentity-free varieties.

14.5.8. Determine the interval [χ_M V, χ^M V], for M = Hyp(τ), for V equal to the variety of semilattices, the variety of left-zero semigroups, and the variety of right-zero semigroups.


Chapter 15

Hypersubstitutions and
Machines

In this chapter we apply the hypersubstitution operation studied in the pre-


vious chapter to the Computer Science concepts from Chapters 7 and 8. Our
first application is a generalization of the unification problem, which plays
an important role in term rewriting systems and logical programming. It
is natural to consider the hyperunification problem, which arises when we
use hypersubstitutions instead of ordinary substitutions. The second appli-
cation in this chapter is the generalization of tree-recognizers to hyper-tree-
recognizers. Finally, we consider tree transformations and tree transducers
generated by hypersubstitutions.

15.1 The Hyperunification Problem


Let V be a variety of type τ . We recall from Chapter 7 that the word problem
for V is the problem of deciding, given any two terms u and v of type τ ,
whether u ≈ v holds as an identity in V . The concept of a unifier is important
in solving the word problem. We fix a countably infinite alphabet X, and let
Wτ (X) denote the set of all terms of type τ . Recall that a substitution is a
mapping s : X → Wτ (X), and that any such map has a unique extension ŝ :
Wτ (X) → Wτ (X). For any term t, the term ŝ(t) is obtained by substitution
of the term s(x) for each occurrence of a variable x in t. A substitution s
for which ŝ(u) ≈ ŝ(v) is an identity in V is called a unifier for the terms u
and v with respect to the variety V . In the special case that V is the variety
Alg(τ ) of all algebras of type τ , we have ŝ(u) ≈ ŝ(v) an identity of V if


and only if ŝ(u) = ŝ(v). In this case we refer to a unifier as a syntactical


unifier; otherwise, when V is a proper subvariety of Alg(τ ), we refer to a
unifier with respect to V as a semantical unifier. To solve the unification
problem means to decide, given two terms, whether there exists a unifier for
the terms or not. For more information on this topic see J. H. Siekmann,
[108].
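As a small illustration (with our own nested-tuple term encoding, not notation from the text), here is the extension ŝ of a substitution and a syntactic unifier for two terms:

```python
def apply_sub(s, t):
    """The extension s-hat: replace each variable of t by its image under s."""
    if isinstance(t, str):
        return s.get(t, t)          # variables not moved by s stay fixed
    return (t[0],) + tuple(apply_sub(s, a) for a in t[1:])

u = ('f', 'x', ('f', 'y', 'z'))     # the term f(x, f(y, z))
v = ('f', ('f', 'y', 'z'), 'x')     # the term f(f(y, z), x)

s = {'x': ('f', 'y', 'z')}          # the substitution x -> f(y, z)
print(apply_sub(s, u) == apply_sub(s, v))   # True: s is a syntactic unifier
```

Both ŝ(u) and ŝ(v) become f(f(y, z), f(y, z)), so u and v are syntactically unifiable.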

In this section we describe the work of K. Denecke, J. Koppitz and S. Ni-


wczyk, from [28], regarding the generalization of unifiers and the unification
problem to the hyperidentity setting. Instead of looking for substitution
mappings s which unify two terms, we look for unifying hypersubstitutions
σ.

Definition 15.1.1 Let u and v be two terms of type τ . A hypersubstitution


σ of type τ is called a (syntactical) hyperunifier for u and v if σ̂[u] =
σ̂[v]. When such a hyperunifier exists, we say that the terms u and v are
hyperunifiable.
By the definition of the kernel of a mapping, a hypersubstitution σ is a
hyperunifier for two terms u and v exactly when the pair (u, v) is in the
kernel of the extension mapping σ̂ defined on Wτ (X). We will refer to this
kernel as the kernel of the hypersubstitution σ, and denote it by kerσ. The
first step is to show that any such kernel is a fully invariant congruence
relation on the free algebra Fτ (X) defined on the set Wτ (X).

Lemma 15.1.2 ([28]) Let τ = (ni )i∈I be a type of algebras with ni ≥ 1 for
all i ∈ I. Let σ be a hypersubstitution of type τ . Then the relation kerσ is a
fully invariant congruence on the absolutely free algebra Fτ (X).

In the case of type (n), for n ≥ 1, Denecke, Koppitz and Niwczyk have
completely determined all the congruence relations kerσ for any hypersub-
stitution σ of type (n). To describe their results we need some notation for
terms regarded as trees, from Section 5.1. To each node or vertex of a tree
representing a term of type (n) we can assign a sequence of integers from
the set {1, 2, . . . , n}, called the address of the node or vertex. For each value
1 ≤ i ≤ n, there is a uniquely determined variable obtained by following the
address ii · · · i in the tree until we have the address of a variable; we shall de-
note this variable by vari (t). Let K be any non-empty subset of {1, 2, . . . , n}.
An address is called a terminating K-sequence for t if it is the address of a
variable, it contains only indices from K and no subsequence of the address


gives a variable.

Proposition 15.1.3 ([28]) Let τ = (n), with n ≥ 1, with one n-ary opera-
tion symbol f .
(i) If σ is a regular hypersubstitution of type τ , then kerσ is the diagonal
relation on Wτ (X); that is, σ̂[u] = σ̂[v] iff u = v.
(ii) If σ is a projection hypersubstitution, so that σ(f ) = xi for some
1 ≤ i ≤ n, then kerσ = {(u, v) ∈ Wτ (X)2 | vari (u) = vari (v)}.
(iii) Let n ≥ 2, and let σ be a non-projection hypersubstitution. Let K be
the set of variables used in the term σ(f ), with 1 ≤ |K| < n. Then a pair
(u, v) from Wτ (X)2 is in kerσ iff any terminating K-sequence of u is a ter-
minating K-sequence of v and vice versa, and any such sequence addresses
the same variable in both u and v.
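The notions vari (t) and terminating K-sequence from Proposition 15.1.3 can be sketched as follows (a sketch under the same hypothetical tuple encoding of terms: variables are strings, a term f(t1, …, tn) is ('f', t1, …, tn); all function names are ours):

```python
def var_i(t, i):
    # Follow the address i i i ... until a variable (a string) is reached;
    # in a tuple ('f', c1, ..., cn) the i-th child is t[i].
    while not isinstance(t, str):
        t = t[i]
    return t

def terminating_K_sequences(t, K, prefix=()):
    # All addresses over K that lead from the root of t to a variable,
    # paired with that variable.  The recursion stops at variables, so no
    # proper initial segment of a returned address addresses a variable.
    if isinstance(t, str):
        return {(prefix, t)}
    seqs = set()
    for i in K:
        seqs |= terminating_K_sequences(t[i], K, prefix + (i,))
    return seqs

def apply_subst(t, env):
    if isinstance(t, str):
        return env[t]
    f, *args = t
    return (f, *(apply_subst(a, env) for a in args))

def hyper_ext(sigma, t):
    # The extension sigma-hat, used here to cross-check the kernel description.
    if isinstance(t, str):
        return t
    f, *args = t
    images = [hyper_ext(sigma, a) for a in args]
    return apply_subst(sigma[f], {f'x{k+1}': s for k, s in enumerate(images)})
```

As a spot check of part (iii): for the non-projection hypersubstitution σ(f ) = f (x1 , x1 ) we have K = {1}, and two terms with the same terminating {1}-sequences are identified by σ̂.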

In the same paper, the authors also extended their results to an arbitrary
type τ having no nullary operation symbols, for certain restricted kinds of
hypersubstitutions.

In addition to this syntactical hyperunification problem, we may consider


the semantical version for any variety V . This is equivalent to studying, for
any variety V and hypersubstitution σ of type τ , the relation

kerV σ := {(u, v) ∈ Wτ (X)2 | V satisfies σ̂[u] ≈ σ̂[v]}.

This concept of the V -kernel of a hypersubstitution has been studied by K.


Denecke, J. Koppitz and S. L. Wismath in [29], where it was shown that
any such V -kernel is a fully invariant congruence relation on the free algebra
defined on Wτ (X).
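For a finite algebra A, membership in the V (A)-kernel kerV (A) σ can be decided by brute force: check whether σ̂[u] and σ̂[v] induce the same term operation on A. A sketch (our own encoding and names; the algebra used in the test is the two-element meet-semilattice, chosen only as an illustration):

```python
from itertools import product

def apply_subst(t, env):
    if isinstance(t, str):
        return env[t]
    f, *args = t
    return (f, *(apply_subst(a, env) for a in args))

def hyper_ext(sigma, t):
    if isinstance(t, str):
        return t
    f, *args = t
    images = [hyper_ext(sigma, a) for a in args]
    return apply_subst(sigma[f], {f'x{k+1}': s for k, s in enumerate(images)})

def variables(t):
    return {t} if isinstance(t, str) else {x for a in t[1:] for x in variables(a)}

def induced(t, ops, env):
    # Value of the term operation t^A at the assignment env.
    if isinstance(t, str):
        return env[t]
    f, *args = t
    return ops[f](*(induced(a, ops, env) for a in args))

def in_V_kernel(sigma, ops, carrier, u, v):
    # (u, v) is in ker_{V(A)} sigma iff A satisfies sigma-hat[u] ≈ sigma-hat[v],
    # i.e. both sides agree under every assignment of carrier elements.
    su, sv = hyper_ext(sigma, u), hyper_ext(sigma, v)
    xs = sorted(variables(su) | variables(sv))
    return all(induced(su, ops, dict(zip(xs, vals))) ==
               induced(sv, ops, dict(zip(xs, vals)))
               for vals in product(carrier, repeat=len(xs)))
```

With A the meet-semilattice ({0, 1}, ∧) and σ the identity hypersubstitution, the pair (f (x1 , x1 ), x1 ) lies in the kernel because A is idempotent, while (x1 , x2 ) does not.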

15.2 Hyper Tree Recognizers


In Section 8.4 we introduced the concept of a tree-recognizer. A term or
tree t of a given type is recognized by a tree-recognizer if there is a finite
algebra A of the type, a subset A0 of the universe of A, and an evaluation
mapping α which sends the variables occurring in t to elements of A such
that α̂[t] belongs to A0 . The evaluation mapping carries out the operation of
substitution, by replacing the leaves of the tree (corresponding to variables
or nullary operation symbols) by elements of the algebra. If in addition we allow


the replacement of all vertices of the tree (operation symbols of the term)
by term operations of the algebra A of the same arity, we implement a hy-
persubstitution. In this way our tree-recognizer becomes a hyperrecognizer.
One might expect that this kind of “parallel working” would allow us to recognize larger families of languages. But it was shown by K. Denecke and
N. Pabhapote in [33] that in the case of a finite alphabet, a language is
recognizable iff it is hyperrecognizable. This result means that all vertices in
a tree-recognizer can be evaluated at the same time in an appropriate way,
giving us additional insight into the power of a tree-recognizer.

We begin by recalling some notation from Section 8.4, where algebras were
described by a set Σ of operation symbols rather than a type τ . That is,
we let Σ = Σ0 ∪ Σ1 ∪ · · · ∪ Σm be a set of operation symbols, where the
operation symbols in each Σn are n-ary. We will also assume here that Σ
is finite. As usual we have a countably infinite set X of variables, while for
each natural number n we let Xn be the set of variables x1 , . . . , xn . We will
denote by WΣ (X) and WΣ (Xn ) the sets of all terms which can be built up
from the operation symbols from Σ and the variables from X or from Xn ,
respectively. We write A = (A, ΣA ) for a (finite) algebra whose fundamental
operations correspond to the operation symbols from Σ (of the type of Σ).
As we saw in Chapter 5, in the case that both Σ and X are finite we can
consider terms as trees.

Definition 15.2.1 A Σ − Xn -tree-hyperrecognizer is a sequence


HA := (clonen A, Σ, Xn , σ A , CnA ).
Here clonen A is the clone of all n-ary term operations of a finite algebra A,
and σ A : Σ ∪ Xn → clonen A is a mapping which is defined in the following
way:

σ A (xi ) := e_i^{n,A} if xi ∈ Xn is a variable, and
σ A (fi ) := t_i^A for an ni -ary operation symbol fi ∈ Σni and an ni -ary term operation t_i^A of A.

(Note that n has to be greater than the greatest arity of the operation
symbols in Σ.) CnA is a subset of clonen A.
The mapping σ A thus maps each variable and each operation symbol to a term operation of the algebra A. We mention that any such mapping can

be extended to a mapping (σ A )ˆ : WΣ (Xn ) → clonen A, in the following


inductive way:

(i) (σ A )ˆ[xi ] = σ A (xi ) if xi ∈ Xn is a variable,

(ii) (σ A )ˆ[f0 ] = σ A (f0 ) if f0 ∈ Σ0 is nullary,

(iii) (σ A )ˆ[fi (t1 , . . . , tni )] = σ A (fi )((σ A )ˆ[t1 ], . . . , (σ A )ˆ[tni ]) if fi is an ni -


ary operation symbol and if (σ A )ˆ[tj ] is already defined for 1 ≤ j ≤ ni .

We recall from Section 14.3 that a hypersubstitution (of type Σ) is an arity-


preserving mapping σ : Σ → WΣ (Xn ), and that any such hypersubstitution
can be uniquely extended to a mapping σ̂ : WΣ (X) → WΣ (X). If σ is a hypersubstitution which maps each operation symbol fi ∈ Σni to a term inducing the term operation σ A (fi ) on A, then for any term t we have (σ A )ˆ[t] = (σ̂[t])A , i.e., the term operation induced by the term σ̂[t] on A agrees with (σ A )ˆ[t]. We remark that hypersubstitutions correspond to the tree-homomorphisms h : FΣ (X) → FΣ (X) introduced in Section 8.6 (as in F. Gécseg and M. Steinby, [48]), with the additional restriction that h(xi ) = xi for all i = 1, . . . , n.

Definition 15.2.2 Let HA be a Σ−Xn -tree-hyperrecognizer. The language


hyperrecognized by HA is the Σ − Xn -language

T (HA) := {t ∈ WΣ (Xn ) | (σ A )ˆ[t] ∈ CnA },

where (σ A )ˆ is the extension of σ A . A language T ⊆ WΣ (Xn ) is called


hyperrecognizable if there is a Σ − Xn -tree-hyperrecognizer HA such that
T = T (HA).

It is not difficult to see that every hyperrecognizable language is recognizable.


To prove this, we have to show that given a hyperrecognizer HA, we can find
an algebra B, an evaluation mapping and a subset of B to use in a recognizer
B, in such a way that T (HA) = T (B). A possible algebra B satisfying this
condition is B = (clonen A, σ(Σ)B ) where σ(Σ) is the set of all images of
the operation symbols from Σ under the hypersubstitution σ. As evaluation
mapping we use the restriction of σ A to Xn . Then it is clear that HA and
B recognize the same language. Note that the concept of a derived algebra,
mentioned in Section 14.3, is involved here.

Example 15.2.3 We set X2 = {x1 , x2 }, Σ = Σ2 = {f } and A = ({0, 1}; f A ) with f A equal to the conjunction operation ∧. Then clone2 A = {e_1^{2,A} , e_2^{2,A} , (x1 ∧ x2 )A }. We consider the hyperrecognizer HA = (clone2 A, Σ, X2 , σ A , CnA ) in which CnA = {(x1 ∧ x2 )A } and σ A is given by σ A (f ) = (x1 ∧ x2 )A . Then T (HA) = WΣ (X2 )\{x_1^r , x_2^s | r, s ≥ 1}, where x_1^r := x1 ∧ · · · ∧ x1 means that x1 occurs r times.
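Example 15.2.3 can be replayed mechanically. The sketch below (our own encoding: a binary term operation on A = {0, 1} is tabulated as a dict from argument pairs to values; the names are ours) implements the inductive extension (σ A )ˆ of Definition 15.2.1 by superposition of term operations:

```python
from itertools import product

A = (0, 1)

def table(f):
    # Tabulate a binary operation on A as a dict from argument pairs to values.
    return {(a, b): f(a, b) for a, b in product(A, A)}

e1 = table(lambda a, b: a)        # projection e_1^{2,A}
e2 = table(lambda a, b: b)        # projection e_2^{2,A}
meet = table(lambda a, b: a & b)  # (x1 ∧ x2)^A

def superpose(g, ops):
    # Superposition: p |-> g(op_1(p), ..., op_k(p)) for binary term operations.
    return {p: g[tuple(op[p] for op in ops)] for p in product(A, A)}

sigmaA = {'x1': e1, 'x2': e2, 'f': meet}

def ext(t):
    # The inductive extension (sigma^A)-hat on terms over X2
    # (terms encoded as strings / nested tuples).
    if isinstance(t, str):
        return sigmaA[t]
    f, *args = t
    return superpose(sigmaA[f], [ext(a) for a in args])

def hyperrecognized(t):
    return ext(t) == meet         # C_n^A = {(x1 ∧ x2)^A}
```

A term built from x1 alone collapses to the projection e1, so exactly the terms that use both variables are hyperrecognized, matching T (HA) = WΣ (X2 )\{x_1^r , x_2^s | r, s ≥ 1}.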

Example 15.2.4 We choose X2 , and let Σ contain two operation symbols,


one binary and one nullary. Let A = (V4 , ·, e) be the Klein-four group. Let
C2A = {eA } and σ A (f ) = eA (the identity element eA considered as binary
term operation). Then we have σ A (xi ) = e_i^{2,A} for i = 1, 2. Inductively, when t = f (t1 , t2 ) for some t1 , t2 ∈ WΣ (X2 ), we have (σ A )ˆ[t] = (σ A )ˆ[f (t1 , t2 )] = σ A (f )((σ A )ˆ[t1 ], (σ A )ˆ[t2 ]) = eA ((σ A )ˆ[t1 ], (σ A )ˆ[t2 ]) = eA . This shows that T (HA) = WΣ (X2 )\{x1 , x2 }.
The following proposition connects tree-hyperrecognizers with identities and
varieties. We recall the notation IdA for the set of identities of the algebra
A and sA for the term operation on A induced by a term s.

Proposition 15.2.5 The language T ⊆ WΣ (Xn ) is hyperrecognizable iff


there is a finite algebra A = (A; ΣA ), a subset CnA ⊆ clonen A and a hy-
persubstitution σ of type Σ such that
(i) for each t ∈ T there exists an element sA ∈ CnA with σ̂[t] ≈ s ∈ IdA

(ii) for each t ∈ WΣ (Xn )\T and each sA ∈ CnA , we have σ̂[t] ≈ s ∉ IdA.
Proof: Suppose that T is hyperrecognizable. Then there exists a Σ−Xn -tree-
hyperrecognizer HA = (clonen A, Σ, Xn , σ A , CnA ) such that T (HA) = T .
If t ∈ T then there is an element sA ∈ CnA with (σ A )ˆ[t] = sA . But then
there is a hypersubstitution σ with (σ A )ˆ[t] = (σ̂[t])A . (Note that σ is not uniquely determined by σ A , since every σ′ for which σ(fi ) ≈ σ′(fi ) ∈ IdA for every operation symbol fi from Σ satisfies the same equation.) From the last equation we obtain σ̂[t] ≈ s ∈ IdA.

If t ∉ T then for each sA ∈ CnA we have (σ A )ˆ[t] ≠ sA . This means σ̂[t] ≈ s ∉ IdA if σ is a hypersubstitution which satisfies (σ A )ˆ[t] = (σ̂[t])A .

Conversely, suppose now that there is a finite algebra A = (A; ΣA ) with the finite n-clone clonen A and a hypersubstitution σ satisfying (i) and (ii). We will show that the hyperrecognizer HA = (clonen A, Σ, Xn , σ A , CnA ) satisfies T (HA) = T . If t ∈ T (HA) then there is a term operation sA ∈ CnA such that (σ A )ˆ[t] = sA , i.e., (σ̂[t])A = sA and σ̂[t] ≈ s ∈ IdA. Because of (ii) we have t ∈ T . Conversely, if t ∈ T then there exists a term s with sA ∈ CnA and with σ̂[t] ≈ s ∈ IdA by (i), and thus (σ̂[t])A = (σ A )ˆ[t] = sA ∈ CnA and t belongs to T (HA).

The proof of Proposition 15.2.5 (i) shows that the term s which satisfies
σ̂[t] ≈ s ∈ IdA for t ∈ T is not uniquely determined. Therefore we use the
following binary relation ∼V (A) defined by J. Plonka in [90] on the set of all
hypersubstitutions:

σ1 ∼V (A) σ2 :⇔ σ1 (fi ) ≈ σ2 (fi ) ∈ IdA,

for all operation symbols fi ∈ Σ. Then we have the following proposition.

Proposition 15.2.6 (i) Let HA1 := (clonen A, Σ, Xn , σ1A , CnA ) and HA2 := (clonen A, Σ, Xn , σ2A , CnA ). If σ1 ∼V (A) σ2 then T (HA1 ) = T (HA2 ).

(ii) Let A1 and A2 be Σ-algebras with IdA1 = IdA2 and let HA1 := (clonen A1 , Σ, Xn , σ A1 , CnA1 ) be a tree-hyperrecognizer based on the algebra A1 . Then there exists a tree-hyperrecognizer HA2 based on A2 with T (HA1 ) = T (HA2 ).

Proof: (i) If t ∈ T (HA1 ) then there is an element sA ∈ CnA and a corresponding term s with σ̂1 [t] ≈ s ∈ IdA. From the definition of the relation ∼V (A) it follows that σ̂1 [t] ≈ σ̂2 [t] ∈ IdA for all terms t. But then σ̂2 [t] ≈ s ∈ IdA, and by Proposition 15.2.5 we have t ∈ T (HA2 ).

If t ∈ WΣ (Xn )\T (HA1 ) then for each sA ∈ CnA we have σ̂1 [t] ≈ s ∉ IdA. But then also σ̂2 [t] ≈ s ∉ IdA, so t ∉ T (HA2 ). Together these two observations show that T (HA1 ) = T (HA2 ).

(ii) Let V (A1 ) and V (A2 ) be the varieties generated by A1 and by A2 , respectively, and let FV (A1 ) (Xn ) and FV (A2 ) (Xn ) be the free algebras relative to V (A1 ) and to V (A2 ), respectively. We have seen that the clone of an algebra can be regarded as a multi-based algebra where the m-ary operations for all 0 ≤ m ≤ n are the different sorts and where the operations are the superposition operations. It is also well-known that clonen A1 is isomorphic to the clone of the free algebra FV (A1 ) (Xn ). Here we have IdA1 = IdA2 , which tells us that FV (A1 ) (Xn ) = FV (A2 ) (Xn ) and therefore the clones are equal. But then clonen A1 and clonen A2 are isomorphic. Let CnA2 be the image of CnA1 under this isomorphism and let σ A2 be the composition of σ A1 with this isomorphism. Using this mapping it can be shown that T (HA1 ) = T (HA2 ).

Now we are ready to prove that for finite alphabets the concepts of recog-
nizability and hyperrecognizability are equivalent.

Theorem 15.2.7 When Xn is a finite alphabet, a Σ − Xn language is hy-


perrecognizable iff it is recognizable.
Proof: Since we have already shown that any hyperrecognizable language
is recognizable, we have only to prove the converse. Let T be a recogniz-
able Σ − Xn language, with recognizer A = (A, Σ, X, α, A0 ) such that T =
T (A). We denote by T (A)A the set of all term operations which are induced by the terms from T (A). Note that the term operations from T (A)A are n-ary. We set CnA = T (A)A and consider the hyperrecognizer HA = (clonen A, Σ, Xn , σidA , CnA ), where σidA is the mapping which maps each operation symbol f from Σ to the induced fundamental term operation f A from A. Then we have:

t ∈ T (A) ⇔ (σidA )ˆ[t] = tA ∈ T (A)A = CnA ,

and this means that T (A) = T (HA).

Next we describe another way to construct a tree-recognizer recognizing a given hyperrecognizable language; this illustrates the interconnections between hyperrecognizers and equivalent tree-recognizers. Assume that T is hyperrecognizable. Then there exists a hyperrecognizer

HA = (clonen A, Σ, Xn , σ A , CnA )

such that T = T (HA). We want to show that there is a tree-recognizer A


such that T (HA) = T (A).

Since CnA is a finite subset of clonen A we can write CnA = {s_1^A , . . . , s_m^A }. Consider the tree-hyperrecognizers

HAi = (clonen A, Σ, Xn , σ A , {s_i^A }),

for 1 ≤ i ≤ m. Then we have



t ∈ T (HA) ⇔ (σ A )ˆ[t] ∈ CnA
⇔ (σ A )ˆ[t] = s_i^A for some i ∈ {1, . . . , m}
⇔ t ∈ T (HAi ) for some i ∈ {1, . . . , m}
⇔ t ∈ ⋃_{i=1}^{m} T (HAi ).

Therefore T (HA) = ⋃_{i=1}^{m} T (HAi ).
Consider now an n-ary term t ∈ T (HAi ). Then we have

t ∈ T (HAi ) ⇒ (σ A )ˆ[t] = s_i^A
⇒ (σ̂[t])A = s_i^A if σ is a hypersubstitution satisfying (σ A )ˆ[t] = (σ̂[t])A for all t ∈ WΣ (Xn )
⇒ (σ̂[t])A (a1 , . . . , an ) = s_i^A (a1 , . . . , an ) for all a1 , . . . , an ∈ A.

Since the image of s_i^A is finite we can write Im s_i^A = {ci1 , . . . , ciki } where ci1 , . . . , ciki ∈ A. Consider the tree-recognizers Ail = (A, Σ, Xn , α′il , {cil }), for 1 ≤ l ≤ ki . Here the evaluation mapping α′il is defined by α̂′il = α̂il ◦ σ̂, where αil maps (x1 , . . . , xn ) to an n-tuple in (s_i^A )−1 (cil ) for all l = 1, . . . , ki and σ is a hypersubstitution with (σ̂[t])A = (σ A )ˆ[t].
Then we have

t ∈ T (HAi ) ⇔ (σ A )ˆ[t] = s_i^A
⇔ (σ̂[t])A = s_i^A
⇔ α̂il [σ̂[t]] = α̂il [si ] for all l = 1, . . . , ki
⇔ (α̂il ◦ σ̂)[t] = cil for all l = 1, . . . , ki
⇔ t ∈ T (Ail ) for all l = 1, . . . , ki
⇔ t ∈ ⋂_{l=1}^{ki} T (Ail ),

showing that T (HAi ) = ⋂_{l=1}^{ki} T (Ail ).

As we saw in Chapter 8, it is well known that the intersection of recognizable languages is recognizable (see for instance [48]). Therefore there exists a tree-recognizer Ai such that T (Ai ) = ⋂_{l=1}^{ki} T (Ail ) for each i = 1, . . . , m, and then also T (HA) = ⋃_{i=1}^{m} T (Ai ).

Since the union of recognizable languages is also recognizable, there is a tree-recognizer B such that T (B) = ⋃_{i=1}^{m} T (Ai ), and then T (HA) = T (B).

Tree-recognizers were introduced as generalizations of finite automata inde-


pendently by J. E. Doner ([40], [41]) and by J. W. Thatcher and J. B. Wright
([114], [115]). They can be defined for both finite and infinite alphabets. Our
definition and results so far have been for hyperrecognizers with a finite al-
phabet. But if we extend our definition to the case of a countably infinite
alphabet X, it turns out that recognizability and hyperrecognizability are
no longer equivalent.

Proposition 15.2.8 If T ⊆ WΣ (X) is a recognizable language for which


there is no n ∈ IN such that T ⊆ WΣ (Xn ), then T is not hyperrecognizable.

Proof: Assume that T is hyperrecognizable. Then there is a hyperrecognizer HA = (clonen A, Σ, Xn , σ A , CnA ) such that T = T (HA). Since T ⊈ WΣ (Xn ), there is a tree t ∈ T with t ∉ WΣ (Xn ). Therefore tA ∉ clonen A, (σ A )ˆ[t] ∉ clonen A and (σ A )ˆ[t] ∉ CnA . This contradicts t ∈ T = T (HA).

Example 15.2.9 Here is an example of a language which is recogniz-


able but not hyperrecognizable. Let us take Σ = Σ2 = {f } and T :=
{f (x1 , xj ) | 2 ≤ j ∈ IN}. For our tree-recognizer, we let A = ({a, b, c, d}; f A ),
with f A (a, b) = c and f A (x, y) = d on all other inputs. We choose A0 = {c},
and let α be defined by α(x1 ) = a and α(xj ) = b for all 2 ≤ j ∈ IN. We will
show that the tree-recognizer A = (A, Σ, X, α, A0 ) recognizes our language
T . If t ∈ T (A), then α̂[t] = c and t cannot be a variable. This means there exist terms r, s ∈ WΣ (X) such that t = f (r, s). Then α̂[t] = f A (α̂[r], α̂[s]) = c, so α̂[r] = a and α̂[s] = b. Since every term in WΣ (X)\X evaluates under α̂ to an element of {c, d}, both r and s must be variables. Therefore r = x1 and s ∈ {xi | 2 ≤ i ∈ IN}. Thus t = f (x1 , xj ) for some
natural number j ≥ 2, and we have shown that T (A) ⊆ T . Conversely, if
t ∈ T , then α̂[t] = f A (a, b) = c, so that t ∈ T (A).

Thus T is recognizable. However, since there is no n ∈ IN such that T ⊆


WΣ (Xn ), we see by Proposition 15.2.8 that T is not hyperrecognizable.
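The recognizer of Example 15.2.9 is small enough to execute directly (a sketch with our own encoding of terms as nested tuples and variables as strings; the function names are ours):

```python
def fA(x, y):
    # f^A(a, b) = c, and d on all other inputs.
    return 'c' if (x, y) == ('a', 'b') else 'd'

def alpha(x):
    # The evaluation mapping: alpha(x1) = a and alpha(xj) = b for all j >= 2.
    return 'a' if x == 'x1' else 'b'

def evaluate(t):
    # The extension alpha-hat on terms encoded as ('f', r, s) or variables.
    if isinstance(t, str):
        return alpha(t)
    _, r, s = t
    return fA(evaluate(r), evaluate(s))

def recognized(t):
    return evaluate(t) == 'c'     # A0 = {c}
```

Exactly the terms f (x1 , xj ) with j ≥ 2 evaluate to c, in line with the argument above.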

We remark that our approach to hyperrecognizers uses the concept of a clone.


Instead of clones one could also consider algebraic theories in the sense of
Lawvere, as was done by Z. Ésik in [43].

15.3 Tree Transformations


The tree transducers of Section 8.8 are generalizations of automata: just as
automata transform words (terms of a particular type) into words, transduc-
ers transform terms of any one fixed type into terms of a second fixed type.
In doing so they produce transformations, which are sets consisting of pairs
of trees, where the first components are trees from the first type or language
and the second components are trees of the second type.

Our definition of a hypersubstitution, from Section 14.3, gave us a mapping


which took operation symbols of one type or language to terms of that same
type. Now we generalize this definition too, to include mappings from op-
eration symbols of one language into terms of a second language. We also
consider the corresponding tree transformations. We shall prove that the set
of all tree transformations which are defined by hypersubstitutions of a given
type forms a monoid with respect to the composition of binary relations, and
that this monoid is isomorphic to the monoid of all hypersubstitutions of this
type. We characterize transitivity, reflexivity and symmetry of tree transfor-
mations by properties of the corresponding hypersubstitutions. The results
will be illustrated for type (2), with Σ = Σ2 = {f }.

In general, let Σ := {fi | i ∈ I} be a set of operation symbols of type


τ1 = (ni )i∈I , where fi is ni -ary, ni ∈ IN and let Ω = {gj | j ∈ J} be a set
of operation symbols of type τ2 = (nj )j∈J where gj is nj -ary. As usual, we
denote by Wτ1 (X) and by Wτ2 (X) the sets of all terms of types τ1 and τ2 ,
respectively.

Definition 15.3.1 A (τ1 − τ2 )-hypersubstitution is a mapping

σ : {fi | i ∈ I} → Wτ2 (X)

which maps each operation symbol fi of type τ1 to a term σ(fi ) of type τ2


of the same arity as fi .
As before, every (τ1 − τ2 )-hypersubstitution σ can be extended to a mapping

σ̂ : Wτ1 (X) → Wτ2 (X)

in the following inductive way:

(i) σ̂[x] := x,

(ii) σ̂[fi (t1 , . . . , tni )] := σ(fi )(σ̂[t1 ], . . . , σ̂[tni ]).

Definition 15.3.2 Let σ be a (τ1 − τ2 )-hypersubstitution. Then


Tσ := {(t, σ̂[t]) | t ∈ Wτ1 (X)}
is called the tree transformation defined by σ.
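Definition 15.3.1 can be illustrated with a small sketch. Below, a hypothetical (τ1 − τ2 )-hypersubstitution sends the binary τ1 -symbol f to the binary τ2 -term g(x2 , g(x1 , x1 )) (an arbitrary choice of image, not taken from the book), and T_sigma collects pairs of the tree transformation Tσ for finitely many input terms (tuple encoding of terms as before; all names are ours):

```python
def apply_subst(t, env):
    if isinstance(t, str):
        return env[t]
    f, *args = t
    return (f, *(apply_subst(a, env) for a in args))

def hyper_ext(sigma, t):
    # Extension of a (tau1 - tau2)-hypersubstitution: variables are fixed,
    # every tau1-symbol is replaced by its tau2-image.
    if isinstance(t, str):
        return t
    f, *args = t
    images = [hyper_ext(sigma, a) for a in args]
    return apply_subst(sigma[f], {f'x{k+1}': s for k, s in enumerate(images)})

# Hypothetical example image: f |-> g(x2, g(x1, x1)).
sigma = {'f': ('g', 'x2', ('g', 'x1', 'x1'))}

def T_sigma(terms):
    # The tree transformation defined by sigma, restricted to the given terms.
    return {(t, hyper_ext(sigma, t)) for t in terms}
```

Every output term uses only the type τ2 symbol g, as the definition requires.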
When the types τ1 and τ2 are the same, this definition reduces to the usual
definition of a (type τ1 ) hypersubstitution. But there is another way to con-
sider these new mixed-type hypersubstitutions as hypersubstitutions of one
type. Given the two different types τ1 and τ2 , we form a new type τ by tak-
ing the union of the types; that is, we form the union Σ ∪ Ω of the sets of
operation symbols of the two types. As we saw in Section 14.3, the set of all
type τ hypersubstitutions forms a monoid Hyp(τ ), under the composition
operation
σ1 ◦h σ2 := σ̂1 ◦ σ2 .
Now every (τ1 − τ2 )-hypersubstitution can be considered as a hypersubstitu-
tion of type τ which fixes the symbols of type τ2 , in the sense that each opera-
tion symbol gj of type τ2 is mapped to the fundamental term gj (x1 , . . . , xnj ).
Since the composition of two hypersubstitutions which fix the operation sym-
bols from Ω is again a hypersubstitution which fixes the operation symbols
from Ω, the set of all (τ1 − τ2 )-hypersubstitutions regarded as hypersubstitutions of the type τ forms a submonoid of the monoid Hyp(τ ). This allows
us to consider tree transformations Tσ where σ ∈ Hyp(τ ), t ∈ Wτ (X) and
σ̂[t] ∈ Wτ (X).

Tree transducers were defined in Section 8.8. It turns out that tree transfor-
mations Tσ for a hypersubstitution σ are induced by tree transducers. The
following proposition is a special case of 8.8.5, since the extensions of hyper-
substitutions are a particular kind of tree homomorphisms as introduced in
Section 8.6.

Proposition 15.3.3 If σ is a (τ1 − τ2 )-hypersubstitution and if A =


(Σ, X, A, Ω, P, A0 ) is the tree transducer with A = A0 = {a} and
P = {x →A ax | x ∈ X} ∪ {fi (a(ξ1 ), · · · , a(ξni )) →A aσ(fi )(ξ1 , · · · , ξni ) | i ∈
I}, then TA = Tσ .
Note that tree transducers considered in the previous proposition are exam-
ples of the H-transducers introduced in Section 8.8.

In the remainder of this section we assume that we have only one type τ . We
denote by Tσ1 ◦ Tσ2 the composition of the tree transformations Tσ1 and Tσ2 .
We can also consider inverses, domains and ranges of tree transformations.
We define THyp(τ ) := {Tσ | σ ∈ Hyp(τ )}.

Theorem 15.3.4 (THyp(τ ) ; ◦, Tσid ) is a monoid which is isomorphic to the


monoid Hyp(τ ) of all hypersubstitutions of type τ .

Proof: We define a mapping ϕ : Hyp(τ ) →THyp(τ ) by σ 7→ Tσ . Clearly, ϕ is


well defined. To show that ϕ is a homomorphism, we will show that Tσ1 ◦ Tσ2
= Tσ1 ◦h σ2 , so that ϕ(σ1 ◦h σ2 ) = ϕ(σ1 ) ◦ ϕ(σ2 ). We have

(t, t′′ ) ∈ Tσ1 ◦ Tσ2
⇔ ∃t′ ((t, t′ ) ∈ Tσ2 and (t′ , t′′ ) ∈ Tσ1 )
⇔ ∃t′ (t′ = σ̂2 [t] and t′′ = σ̂1 [t′ ])
⇔ t′′ = σ̂1 [σ̂2 [t]]
⇔ t′′ = (σ1 ◦h σ2 )ˆ[t]
⇔ (t, t′′ ) ∈ Tσ1 ◦h σ2 .

To see that ϕ is one-to-one, let Tσ1 = Tσ2 . Then for all t ∈ Wτ (X) we have
σ̂1 [t] = σ̂2 [t]. But this means that for all operation symbols fi we also have

σ̂1 [fi (x1 , · · · , xni )] = σ1 (fi ) = σ2 (fi ) = σ̂2 [fi (x1 , · · · , xni )],

and therefore σ1 = σ2 . Finally, since Tσ1 ◦ Tσ2 = Tσ1 ◦h σ2 , the tree transfor-
mation Tσid is an identity element with respect to the composition ◦.
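The key identity Tσ1 ◦ Tσ2 = Tσ1 ◦h σ2 behind Theorem 15.3.4 can be checked on sample terms (a sketch; the tuple encoding of terms and the names apply_subst, hyper_ext, circ_h are our own):

```python
def apply_subst(t, env):
    if isinstance(t, str):
        return env[t]
    f, *args = t
    return (f, *(apply_subst(a, env) for a in args))

def hyper_ext(sigma, t):
    # The extension sigma-hat on terms.
    if isinstance(t, str):
        return t
    f, *args = t
    images = [hyper_ext(sigma, a) for a in args]
    return apply_subst(sigma[f], {f'x{k+1}': s for k, s in enumerate(images)})

def circ_h(s1, s2):
    # The composition s1 ∘h s2 = s1-hat ∘ s2, computed symbol by symbol.
    return {f: hyper_ext(s1, t) for f, t in s2.items()}
```

Applying σ̂2 and then σ̂1 to a term agrees with applying the single hypersubstitution σ1 ◦h σ2 , which is exactly the homomorphism property of ϕ.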

The previous theorem now allows us to describe properties of the relation


Tσ by properties of the hypersubstitution σ, and vice versa.

Theorem 15.3.5 Let σ ∈ Hyp(τ ) be a hypersubstitution of type τ and let


Tσ be the corresponding tree transformation. Then

(i) Tσ is transitive iff σ is idempotent,

(ii) Tσ is reflexive iff σ = σid ,

(iii) Tσ is symmetric iff σ ◦h σ = σid .

Proof: (i) When σ is idempotent, we have Tσ◦h σ = Tσ ◦Tσ = Tσ by Theorem


15.3.4, and Tσ is transitive. Conversely, when Tσ is transitive, we have Tσ ◦

Tσ ⊆ Tσ , so that Tσ◦h σ ⊆ Tσ . Then

(t, (σ ◦h σ)ˆ[t]) ∈ Tσ◦h σ ⇒ (t, (σ ◦h σ)ˆ[t]) ∈ Tσ ⇒ (σ ◦h σ)ˆ[t] = σ̂[t],


for all t ∈ Wτ (X), and σ is idempotent.

(ii) Assume that Tσ is reflexive, so that Tσid = ∆Wτ (X) ⊆ Tσ . Therefore


(t, t) ∈ Tσ for all t ∈ Wτ (X) and then σ̂[t] = t for all t ∈ Wτ (X), making σ
= σid .

If conversely σ = σid , then Tσid = {(t, σ̂id [t]) | t ∈ Wτ (X)} = {(t, t) | t ∈


Wτ (X)} = ∆Wτ (X) and Tσ is reflexive.

(iii) If Tσ is symmetric, then for all t ∈ Wτ (X) we have


(t, σ̂[t]) ∈ Tσ ⇒ (σ̂[t], t) ∈ Tσ .
Therefore t = σ̂[σ̂[t]] and σ̂id [t] = (σ ◦h σ) ˆ[t] for all t ∈ Wτ (X), and we have
σ ◦h σ = σid .

If conversely σ ◦h σ = σid then we have Tσ◦h σ = Tσ ◦ Tσ = Tσid . But this


means Tσ = (Tσ )−1 , and Tσ is symmetric.
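Theorem 15.3.5 (i) can likewise be checked on an instance: the hypersubstitution σ(f ) = f (x1 , x1 ) is idempotent, so its tree transformation is transitive (a sketch under the same hypothetical tuple encoding; all names are ours):

```python
def apply_subst(t, env):
    if isinstance(t, str):
        return env[t]
    f, *args = t
    return (f, *(apply_subst(a, env) for a in args))

def hyper_ext(sigma, t):
    # The extension sigma-hat on terms.
    if isinstance(t, str):
        return t
    f, *args = t
    images = [hyper_ext(sigma, a) for a in args]
    return apply_subst(sigma[f], {f'x{k+1}': s for k, s in enumerate(images)})

def circ_h(s1, s2):
    # (s1 ∘h s2)(f) = s1-hat applied to s2(f).
    return {f: hyper_ext(s1, t) for f, t in s2.items()}

sigma = {'f': ('f', 'x1', 'x1')}   # idempotent: sigma ∘h sigma = sigma
```

Since σ̂ applied twice agrees with σ̂ applied once, whenever (t, u) and (u, w) lie in Tσ we get w = σ̂[σ̂[t]] = σ̂[t] = u, so (t, w) ∈ Tσ .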

In general, the range of the tree transformation Tσ , which is the set

σ̂(Wτ (X)) = {t′ | ∃t ∈ Wτ (X)(t′ = σ̂[t])},

is a subset of Wτ (X). Therefore, we consider Tσ as a relation between Wτ (X) and σ̂(Wτ (X)), so that Tσ ⊆ Wτ (X) × σ̂(Wτ (X)). We notice that Tσ ◦ (Tσ )−1 = Tσid = ∆Wτ (X) and that (Tσ )−1 ◦ Tσ = {(t, t′ ) | σ̂[t] = σ̂[t′ ]} = kerσ (the kernel of σ). This gives the following result.

Proposition 15.3.6 Let σ ∈ Hyp(τ ) be a hypersubstitution of type τ and let Tσ ⊆ Wτ (X) × σ̂(Wτ (X)) be the corresponding tree transformation. Then Tσ is bijective iff kerσ = ∆Wτ (X) = Tσid .

Proof: Tσ is bijective iff Tσ ◦ (Tσ )−1 = (Tσ )−1 ◦ Tσ = Tσid = ∆Wτ (X) . Now we use the previous remark.

In Proposition 15.1.3 it was shown that for the type τ = (n) with n ≥ 2, any regular hypersubstitution σ has the property that kerσ is equal to the diagonal relation ∆Wτ (X) .

Corollary 15.3.7 Let σ be a regular hypersubstitution of type τ = (n), for


n ≥ 2, and let Tσ be the tree transformation defined by σ. Then Tσ is bijec-
tive.

15.4 Exercises
15.4.1. Let V be a variety and σ be a hypersubstitution of type τ . The set

TσV := {(t, t0 ) | t, t0 ∈ Wτ (X) and σ̂[t] ≈ t0 ∈ IdV }

is called the V -tree-transformation defined by σ. Prove that when σ1 ∼V σ2 ,


under the relation ∼V defined just after Proposition 15.2.5, then TσV1 = TσV2 .

15.4.2. For which hypersubstitutions σ1 and σ2 is it the case that

TσV1 ◦ TσV2 = TσV1 ◦h σ2 ?

15.4.3. Prove that TσV is surjective iff TσV ◦ (TσV )−1 = IdV .

15.4.4. Prove that TσV is injective iff (TσV )−1 ◦ (TσV ) = kerV σ = ∆Wτ (X) .

15.4.5. A mapping σ : {fi | i ∈ I} → Wτ (X), from the set of operation


symbols of a type τ to the set of terms of that type, which does not nec-
essarily preserve arities, is called a generalized hypersubstitution. Prove that
the kernel of a generalized hypersubstitution σ is a fully invariant congruence
relation on the free algebra Fτ (X) iff σ does not map any ni -ary operation
symbol fi to a variable different from x1 , . . . , xni .
Bibliography

[1] Arworn, S., Groupoids of Hypersubstitutions and G-solid varieties,


Shaker-Verlag, Aachen, 2000.

[2] Arworn, S. and K. Denecke, Intervals defined by M-solid varieties, in:


General Algebra and Applications, Proc. 59th Workshop on General
Algebra and 15th Conference for Young Algebraists, Potsdam 2000,
Shaker-Verlag, Aachen, 2000, 1 - 18.

[3] Arworn, S. and K. Denecke, Left- and right-edge solid varieties of en-
tropic groupoids, Demonstratio Mathematica, Vol. XXXII No. 1 (1999),
1 - 11.

[4] Arworn, S., K. Denecke and R. Pöschel, Closure Operators on Com-


plete Lattices, to appear in Proc. International Conference on Ordered
Algebraic Structures, Nanjing, 1998.

[5] Baker, K. A., Finite equational bases for finite algebras in congruence-
distributive equational classes, Adv. in Math. 24 (1977), 201 - 243.

[6] Baker, K. and A. Pixley, Polynomial interpolation and the Chinese Re-
mainder Theorem for algebraic systems, Math. Z. 143 (1975), 165 -
174.

[7] Berman, J., A proof of Lyndon’s finite basis theorem, Discrete Math. 29
(1980), 229 - 233.

[8] Birkhoff, G., The structure of abstract algebras, Proc. Cambridge Philo-
sophical Society 31 (1935), 433 - 454.

[9] Birkhoff, G. and J. D. Lipson, Heterogeneous algebras, J. Combinat.


Theory 8(1970), 115 - 133.


[10] Cohn, P. M., Universal Algebra, Harper & Row, New York, 1965.

[11] Cowan, D. and S. L. Wismath, Unary iterative hyperidentities for semi-


groups and inverse semigroups, Semigroup Forum 55 (1997), 221 - 231.

[12] Cowan, D. and S. L. Wismath, Unary hyperidentities for varieties of


inverse semigroups, Semigroup Forum, 58 (1999), 106 - 125.

[13] Csákány, B. and T. Gavalcová, Finite homogeneous algebras I., Acta


Sci. Math. (Szeged) 42 (1980), 57 - 65.

[14] Czedli, G., A characterization for congruence semi-distributiv-


ity, Universal Algebra and Lattice Theory, Proceedings Puebla 1982,
Springer-Verlag, Berlin, Heidelberg, New York, Tokyo, 1983, 104 - 110.

[15] Davis, M. Computability and Unsolvability, New York, 1958.

[16] Day, A., A characterization of modularity for congruence lattices of al-


gebras, Canad. Math. Bull. 12 (1969), 167 - 173.

[17] Demetrovics, J. and L. Hannák, On the cardinality of self-dual closed


classes in k-valued logics, Közl.-MTA Számitástech. Aut. Kutató Int.
Budapest, 23 (1979), 7 - 17.

[18] Demetrovics, J., L. Hannák and S.S. Marchenkov, On closed classes of


self-dual functions in P3 , Colloq. Math. Soc. Janos Bolyai 28, Finite
Algebra and Multiple-valued logic Szeged (Hungary) 1979, 183 - 189.

[19] Demetrovics, J., L. Hannák and S.S. Marchenkov, On closed classes of


self-dual functions in P3 (Russian), Metodi diskretnogo analiza v resh-
enii kombinatornych zadach, Sbornik trudov Instituta Matematiki SO
Akademii Nauk SSSR 34 (1980), 38 - 73.

[20] Denecke, K., Preprimal Algebras, Akademie-Verlag Berlin 1982.

[21] Denecke, K., Eine Charakterisierung der funktionalen Vollstän-


digkeit in kongruenzvertauschbaren Varietäten durch Hyperidentitäten,
Rostocker Mathematisches Kolloquium 36 (1989), 73 - 80.

[22] Denecke, K., Clones closed with respect to closure operators, Multi. Val.
Logic, Vol. 4 (1999), 229 - 247.

[23] Denecke, K., L. Freiberg and J. Koppitz, Algorithmic problems in M-


solid varieties of semigroups, in: Semigroups, Proceedings of the In-
ternational Conference in Semigroup and its Related Topics, Kunming
1995, Springer-Verlag, Singapore, 1998, 104 - 117.
[24] Denecke, K. and H. Hounnon, All solid varieties of semirings, preprint,
2000.
[25] Denecke, K. and P. Jampachon, Regular-solid varieties of commutative
and idempotent groupoids, Algebras and Combinatorics, Proc. of the
International Congress, ICAC 97, Hong Kong, Springer-Verlag, Singa-
pore, 1999, 177 - 188.
[26] Denecke, K. and J. Koppitz, Hyperassociative varieties of semigroups,
Semigroup Forum 49 (1994), 41 - 48.
[27] Denecke, K. and J. Koppitz, Pre-solid varieties of semigroups, Archivum
Mathematicum (Brno) 31 (1995), 171 - 181.
[28] Denecke, K., J. Koppitz and S. Niwczyk, Equational Theories Generated
by Hypersubstitutions of Type (n), preprint, 2001.
[29] Denecke, K. J. Koppitz and S. L. Wismath, The Semantical Hyperuni-
fication Problem, preprint, 2001.
[30] Denecke, K. and S. Leeratanavalee, Weak hypersubstitutions and weakly
derived algebras, Contributions to General Algebra 11, Verlag Johannes
Heyn, Klagenfurt 1999, 59 - 75.
[31] Denecke, K. and S. Leeratanavalee, Solid polynomial varieties of semi-
groups which are definable by identities, Contributions to General Al-
gebra 12, Verlag Johannes Heyn, Klagenfurt 2000, 155 - 164.
[32] Denecke, K., D. Lau, R. Pöschel, and D. Schweigert, Hyperidentities,
Hyperequational classes and clone congruences, Contributions to Gen-
eral Algebra 7, Verlag Hölder-Pichler-Tempsky, Wien, 1991, 97 - 118.
[33] Denecke, K. and N. Pabhapote, Tree-recognizers and tree-hyperrecogn-
izers, Contributions to General Algebra 13, Verlag Johannes Heyn, Klagenfurt, 2001, 107 - 114.
[34] Denecke, K. and T. Poomsa-ard, Hyperidentities in graph algebras, Gen-
eral Algebra and Applications in Discrete Mathematics, Shaker-Verlag,
Aachen 1997, 59 - 68.

[35] Denecke, K. and M. Reichel, Monoids of hypersubstitutions and M -solid


varieties, Contributions to General Algebra 9, Verlag Hölder-Pichler-
Tempsky, Wien 1995 - Verlag B.G. Teubner, Stuttgart, 117 - 125.

[36] Denecke, K. and M. Reichel, Hyperidentities in Quasigroups, Beiträge


zur Jahrestagung Algebra und Grenzgebiete, Güstrow, 1990, 67 - 75.

[37] Denecke, K. and S. L. Wismath, Hyperidentities and Clones, Gordon


and Breach Science Publishers, 2000.

[38] Denecke, K. and S. L. Wismath, Galois connections and complete sub-


lattices, preprint, 2001.

[39] Dikranjan, D. and E. Giuli, Closure Operators I, Topology Appl.


27(1987), 129 - 143.

[40] Doner, J. E., Decidability of the weak second-order theory of two succes-
sors, Generalized finite automata, Notices Amer. Math. Soc. 12 (1965),
abstract No. 65T-468, 819.

[41] Doner, J. E., Tree acceptors and some of their applications, J. CSS 4
(1970), 406 - 451.

[42] Eder, E., Properties of substitutions and unifications, J. Symbolic Com-


putation, Vol.1 (1985), 31 - 46.

[43] Ésik, Z., A variety theorem for trees and theories, Publicationes Math-
ematicae, Debrecen, Tomus 54 Supplement (1999), 711 - 762.

[44] Fichtner, K., Distributivity and modularity in varieties of algebras, Acta


Sci. Math.(Szeged) 33 (1972), 343 - 346.

[45] Foster, A. L., Generalized Boolean theory of universal algebras, Part I,


Math. Zeitschr. 58 (1953), 306-336; Part II, Math. Zeitschr. 59 (1953),
191 - 199.

[46] Foster, A. L., Functional completeness in the small. Algebraic structure


theorems and identities, Math. Ann. 143(1961), 127 - 146.

[47] Ganter, B. and R. Wille, Formale Begriffsanalyse, Springer-Verlag,


1996.

[48] Gécseg, F. and M. Steinby, Tree Automata, Akademiai Kiado, Budapest,


1984.
[49] Gluschkow, W. M., G. J. Zeitlin and J. L. Justschenko, Algebra, Sprachen, Programmierung, Akademie-Verlag, Berlin, 1980.

[50] Gorlov, V. V. and R. Pöschel, Clones closed with respect to permutation groups or transformation semigroups, Beiträge zur Algebra und Geometrie 39 (1998), no. 1, 181 - 204.

[51] Grätzer, G., Universal Algebra, 2nd edition, Springer-Verlag, Berlin, Heidelberg, New York, 1979.

[52] Gumm, H. P., Congruence modularity is permutability composed with distributivity, Arch. Math. 36 (1981), 569 - 576.

[53] Hagemann, J. A. and A. Mitschke, On n-permutable congruences, Algebra Universalis 3 (1973), 8 - 12.

[54] Handbook of Formal Languages, Vol. 3, Springer, 1997.

[55] Higgins, P. J., Algebras with a scheme of operators, Math. Nachr. 27 (1963), 115 - 132.

[56] van Hoa, N., On the structure of self-dual closed classes of three-valued logic, Diskr. Mathematika 4 (1992), 82 - 95.

[57] Hobby, D. and R. McKenzie, The Structure of Finite Algebras (Tame Congruence Theory), AMS Contemporary Mathematics Series, Providence, Rhode Island, 1988.

[58] Ihringer, Th., Allgemeine Algebra, Verlag B.G. Teubner, Stuttgart, 1993.

[59] Jablonskij, S. V., Functional constructions in multivalued logics (Russian), Trudy Inst. Mat. Steklov 51 (1958), 5 - 142.

[60] Jablonskij, S. V., G. P. Gawrilow and W. B. Kudrjawzew, Boolesche Funktionen und Postsche Klassen, Akademie-Verlag, Berlin, 1970.

[61] Jonsson, B., Algebras whose congruence lattices are distributive, Math. Scand. 21 (1967), 110 - 121.

[62] Kiss, E. W. and E. Pröhle, Problems and results in tame congruence theory, Mathematical Institute of the Hungarian Academy of Sciences, Preprint No. 60 (1988).
[63] Kleene, S. C., Introduction to Metamathematics, D. Van Nostrand Co., Inc., 1950.

[64] Klein, F., Vergleichende Betrachtungen über neuere geometrische Forschungen, Math. Ann. 43 (1893), 63 - 100.

[65] Knoebel, A., The equational classes generated by single functionally precomplete algebras, Memoirs of the Amer. Math. Soc. 57, 332, Providence, Rhode Island, 1985.

[66] Knuth, D. E. and P. E. Bendix, Simple word problems in universal algebras, in: Computational Problems in Abstract Algebra, Pergamon Press, Oxford, 1970, 263 - 297.

[67] Koppitz, J. and S. L. Wismath, Hyperidentities for varieties of star bands, Sci. Math. 3 no. 3 (2000), 299 - 307.

[68] Kruse, R. L., Identities satisfied by a finite ring, J. Algebra 26 (1973), 298 - 318.

[69] Lau, D., On closed subsets of Boolean functions (A new proof for Post's theorem), J. Inform. Process. Cybernet. EIK 27 (1991), 167 - 178.

[70] Lau, D., Ein neuer Beweis für Rosenberg's Vollständigkeitskriterium, J. Inform. Process. Cybernet. EIK 28, 4 (1992), 149 - 195.

[71] Lawvere, F. W., Functorial semantics of algebraic theories, Proc. Nat. Acad. Sci. 50 (1963), 869 - 872.

[72] Lyndon, R. C., Identities in two-valued calculi, Trans. Amer. Math. Soc. 71 (1954), 457 - 465.

[73] Lyndon, R. C., Identities in finite algebras, Proc. Amer. Math. Soc. 5 (1954), 8 - 9.

[74] Mal'cev, A. I., On the general theory of algebraic systems (Russian), Mat. Sbornik 35, 77 (1954), 3 - 20.

[75] Mal'cev, A. I., Algebraic Systems, Akademie-Verlag, Berlin, 1973.

[76] Marchenkov, S. S., On closed classes of self-dual functions of multiple-valued logic (Russian), Problemy Kibernetiki 36, 5 - 22.
[77] McKenzie, R., Para-primal varieties: A study of finite axiomatizability and definable principal congruences in locally finite varieties, Alg. Universalis 8 (1978), 336 - 348.

[78] McKenzie, R., On minimal, locally finite varieties with permuting congruence relations, preprint, 1976.

[79] McKenzie, R., The residual bound of a finite algebra is not computable, J. of Algebra and Computation 6 No. 1 (1996), 29 - 48.

[80] McKenzie, R., Tarski's finite basis problem is undecidable, J. of Algebra and Computation 6 No. 1 (1996), 49 - 104.

[81] McKenzie, R., The residual bounds of finite algebras, J. of Algebra and Computation 6 No. 1 (1996), 1 - 28.

[82] McNulty, G., Residual finiteness and finite equational bases: undecidable properties of finite algebras, Lectures, 2000.

[83] Murskij, V. L., The existence in the three-valued logic of a closed class with a finite basis not having a finite complete system of identities, Soviet Math. Dokl. 6 (1965), 1020 - 1024.

[84] Oates, S. and M. B. Powell, Identical relations in finite groups, J. Algebra 1 (1965), 11 - 39.

[85] Papert, D., Congruence relations in semilattices, London Math. Soc. 39 (1964), 723 - 729.

[86] Perkins, P., Bases for equational theories of semigroups, J. Algebra 11 (1969), 298 - 314.

[87] Pixley, A. F., The ternary discriminator function in Universal Algebra, Math. Ann. 191 (1971), 167 - 180.

[88] Pixley, A. F., A note on hemi-primal algebras, Math. Zeitschr. 124 (1972), 213 - 214.

[89] Pixley, A. F., Functional and affine completeness and arithmetical varieties, in: Algebras and Orders, NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., 389, Kluwer Acad. Publ., Dordrecht, 1993, 317 - 357.
[90] Plonka, J., Proper and inner hypersubstitutions of varieties, in: Proceedings of the International Conference: Summer School on General Algebra and Ordered Sets 1994, Palacky University Olomouc, 1994, 106 - 115.

[91] Polák, L., On hyperassociativity, Algebra Universalis 36 No. 3 (1996), 363 - 378.

[92] Polák, L., All solid varieties of semigroups, J. Algebra 219 no. 2 (1999), 421 - 436.

[93] Polin, S. V., Identities of finite algebras, Siberian Math. J. 17 (1976), 992 - 999.

[94] Post, E. L., Introduction to a general theory of elementary propositions, Amer. J. Math. 43 (1921), 163 - 185.

[95] Post, E. L., The two-valued iterative systems of mathematical logic, Ann. Math. Studies 5, Princeton Univ. Press, 1941.

[96] Pöschel, R. and L. A. Kalužnin, Funktionen- und Relationenalgebren, VEB Deutscher Verlag der Wissenschaften, Berlin, 1979.

[97] Quackenbush, R. W., A new proof of Rosenberg's primal algebra characterization theorem, in: Finite Algebra and Multiple-valued Logic (Proc. Conf. Szeged, 1979), Colloq. Math. Soc. J. Bolyai, vol. 28, North-Holland, Amsterdam, 603 - 634.

[98] Reichel, M., Bi-Homomorphismen und Hyperidentitäten, Dissertation, Universität Potsdam, 1994.

[99] Reschke, M. and K. Denecke, Ein neuer Beweis für die Ergebnisse von E. L. Post über abgeschlossene Klassen Boolescher Funktionen, J. Inform. Process. Cybernet. EIK 25 (1981), 361 - 380.

[100] Reschke, M., O. Lüders and K. Denecke, Kongruenzdistributivität, Kongruenzvertauschbarkeit und Kongruenzmodularität zweielementiger Algebren, J. Inform. Process. Cybern. EIK 24 (1988) 1/2, 65 - 78.

[101] Robinson, J. A., A machine oriented logic based on the Resolution Principle, Journal of the ACM, Vol. 12 (1965), 23 - 41.

[102] Rosenberg, I. G., La structure des fonctions de plusieurs variables sur un ensemble fini, C. R. Acad. Sci. Paris Ser. A-B 260 (1965), 3817 - 3819.
[103] Rosenberg, I. G., Über die funktionale Vollständigkeit in den mehrwertigen Logiken, Rozpr. ČSAV, Řada Mat. Přír. Věd. Praha 80, 1 (1970), 3 - 93.

[104] Rousseau, G., Completeness in finite algebras with a single operation, Proc. Amer. Math. Soc. 18 (1967), 1009 - 1013.

[105] Schweigert, D., Hyperidentities, in: Algebras and Orders, Kluwer Academic Publishers, Dordrecht, Boston, London, 1993, 405 - 506.

[106] Šešelja, B. and A. Tepavčević, On a partial closure operator, preprint, 2000.

[107] Shum, K. P. and A. Yang, Interior operators on complete lattices, Pu. M. A. Ser. A, Vol. 3 (1992), No. 1-2, 73 - 80.

[108] Siekmann, J. H., Universal Unification, in: Lecture Notes in Computer Science, eds. G. Goos and J. Hartmanis, 7th International Conference on Automated Deduction, Napa, California, May 1984, Springer-Verlag, Berlin, Heidelberg, New York.

[109] Slupecki, J., Completeness criterion for systems of many-valued propositional calculus, Studia Logica 30 (1972), 153 - 157.

[110] Tardos, G., A maximal clone of monotone operations which is not finitely generated, Order 3 (1986), 211 - 218.

[111] Tarski, A., A lattice-theoretical fixpoint theorem and its applications, Pacific J. Math. 5 (1955), 285 - 310.

[112] Taylor, W., Characterizing Mal'cev conditions, Algebra Universalis 3 (1973), 351 - 397.

[113] Thatcher, J. W., Tree automata: an informal survey, in: Currents in the Theory of Computing, ed. A. V. Aho, Prentice-Hall, Englewood Cliffs, NJ, 1973, 143 - 172.

[114] Thatcher, J. W. and J. B. Wright, Generalized finite automata, Notices Amer. Math. Soc. 12 (1965), abstract no. 65T-649, 820.

[115] Thatcher, J. W. and J. B. Wright, Generalized finite automata theory with an application to a decision problem of second-order logic, MST 2 (1968), 57 - 81.
[116] Traczyk, T., An equational definition of a class of Post algebras, Bull. Acad. Pol. Sc. 12 (1964), 147 - 149.

[117] Webb, D. L., Definition of Post's generalized negative and maximum in terms of one binary operation, Amer. J. Math. 58 (1936), 193 - 194.

[118] Werner, H., Discriminator Algebras, Akademie-Verlag, Berlin, 1970.

[119] Willard, R., Extending Baker's theorem, Conference on Lattices and Universal Algebra (Szeged, 1998), Algebra Universalis 45 (2001), no. 2-3, 335 - 344.

[120] Wille, R., Restructuring lattice theory: an approach based on hierarchies of concepts, in: Ordered Sets, ed. I. Rival, Reidel, Dordrecht-Boston, 1982, 445 - 470.

[121] Wille, R., Kongruenzklassengeometrien, Lecture Notes in Mathematics 113, Springer, Berlin, 1970.
Index

Ésik, Z., 356

abelian, 5
abelian algebra, 280, 290
abelian group, 272, 280
absolutely free algebra, 80
absorption law, 7
acceptor, 11
additive closure operator, 302
address of a node, 79, 348
affine algebra, 239
affine complete algebra, 241
Alg(τ), 109
algebra, 4
  abelian, 280, 290
  absolutely free, 80
  affine, 239
  Boolean, 9
  congruence modular, 203
  congruence regular, 205
  constantive, 246
  demiprimal, 241
  derived, 337, 351
  directly irreducible, 68
  equivalent, 220
  factor, 27
  finite, 251
  finitely axiomatizable, 110
  finitely based, 110
  finitely generated, 19
  free, 115
  full iterative, 10, 215
  functionally complete, 231
  hemiprimal, 241
  heterogeneous, 11, 159, 216
  homogeneous, 11, 215
  indexed, 4
  induced on a set, 252
  infinitely generated, 19
  join-semidistributive, 281
  locally finite, 108
  meet-semidistributive, 281
  minimal, 256, 258
  multi-based, 11, 159
  multi-sorted, 11
  non-deterministic, 168
  non-indexed, 4
  one-based, 11
  paraprimal, 241
  permutation, 257, 269
  polynomially equivalent, 296
  Post, 233
  preprimal, 245
  primal, 231, 278, 286
  quasiprimal, 241
  quotient, 27, 115
  regular subprimal, 241
  relatively free, 98
  semidistributive, 281
  simple, 23, 71
  singular subprimal, 241
  solvable, 296
  strongly abelian, 280
  subdirectly irreducible, 71
  subprimal, 241
  tame, 262
  term, 80
  two-element, 206
    affine complete, 241
    cryptoprimal, 241
    semiprimal, 240
algebraic lattice, 37
algebraic theories, 216
alphabet, 76
  ranked, 165
arithmetical variety, 201
arity of a function, 3
Arworn, S., 308, 309, 342
associative law, 5
attribute, 42
automaton, 11, 147
  deterministic, 158
  equivalent, 162
  finite, 12
  initial, 157
  non-deterministic, 158
  quotient, 161
  reduced, 163
  weak initial, 157
  with output, 158
automorphism, 49, 52
  relative, 53
automorphism group, 52

Baker's Theorem, 112
Baker, K., 110, 199, 223, 224
Baker-Pixley Theorem, 199, 234
base set, 4
basis, 28
basis of identities, 110
Bendix, P.E., 129, 136, 139
Berman, J., 219
binary operation, 4
Birkhoff's Theorem, 106
Birkhoff, G., 72, 216
Boolean algebra, 110
Boolean clone lattice, 219
Boolean clones, 274
Boolean function, 38
  dual, 218
  linear, 219
  monotone, 219
  self-dual, 218
Boolean operation, 3, 206, 218
  linear, 222
Boolean type, 276

canonical normal form, 123
categories, 216
center of a relation, 231
central relation, 231
centroid element, 246
centroid of an algebra, 246
Church-Rosser property, 121
Church-Rosser Theorem, 122
class
  equational, 93
  locally finite, 108
clone, 9, 38, 83, 87, 215
  generated by a set of operations, 83
  maximal, 245
  relational, 218
  polynomial, 231
  term, 231
closed set, 32, 93
closure, 33
closure operator, 16, 32, 102, 301
  additive, 302
  inductive, 36
closure properties, 16, 25
closure system, 33
coatom, 262
Cohn, P., 37
commutative, 5
commutator, 289
  i-th iterated, 295
commutator congruence, 294
commutator group, 289
compact element, 37
compatibility, 22, 217
complement, 9
complement laws, 9
complete lattice, 93
complete reduction system, 123
completeness, 98
Completeness Criterion, 21
completion of a reduction system, 125
complexity of a term, 77
composition of operations, 9
confluent relation, 121
congruence, 161, 177
  fully invariant, 95
  generated by set, 25
  modulo m, 22
  n-permutability, 204
  permutable, 204
  regular, 205
  relation, 23
  tame, 263
  permutable, 193
congruence distributive algebra, 196
congruence distributive variety, 112, 196, 220, 274
congruence lattice of an algebra, 26
congruence modular algebra, 203
congruence modular variety, 203, 219
congruence permutable algebra, 204
congruence permutable variety, 274
congruence regular, 205
congruence regular algebra, 205
congruence regular variety, 205
conjugate pair of closure operators, 302
connected tree-recognizer, 179
consistency, 98
Consistency and Completeness Theorem, 98
constant, 86
constantive algebra, 246
Cowan, D., 342
critical pair, 126, 138
cryptoprimal algebra, 241
Csákány, B., 327
Czedli, G., 283

Davis, M., 116
Day, A., 203
decidable problem, 116
decision procedure, 116
deduction rules, 97, 130
Demetrovics, J., 327
demiprimal algebra, 241
Denecke, K., 86, 206, 224, 246, 248, 285, 309, 323, 330, 335, 340–342, 348–350
depth of a term, 77
derivation rules, 97, 169, 183
derived algebra, 337, 351
derived language, 154
deterministic automaton, 158
Dikranjan, D., 302
direct product, 63
directly irreducible algebra, 68
disjunctive normal form, 229
distributive laws, 6
Doner, J.E., 356
dual Boolean functions, 218

Eder, E., 134
effectively solvable problem, 115
embedding, 49
empty word, 148
endomorphism, 49, 52
  join-, 260
  meet-, 260
equation, 91
equational class, 93
equational logic, 98
equational theory, 93
  Main Theorem, 106
equationally complete variety, 109
equationally definable class of algebras, 93
equivalence relation, 115
equivalent algebras, 220
equivalent automata, 162
equivalent grammars, 170
equivalent reduction systems, 125
equivalent states, 162
essentially binary operation, 269
evaluation mapping, 165
extended regular grammar, 172
extension of a function, 53
extensivity, 16, 25, 32, 260
extent, 42

factor algebra, 27
Fichtner, K., 198
field, 6, 32
finite algebra, 251
finite automaton, 12, 147
finite state machine, 147
finitely axiomatizable
  algebra, 110
  variety, 110
finitely based algebra, 110
finitely based variety, 110
finitely generated, 19
First Isomorphism Theorem, 58
fixed point, 52, 93
formal concept, 42
formal context, 42
formal language, 76
Foster, A.L., 233, 246
free algebra, 115
free monoid, 101
free semigroup, 101
Freiberg, L., 341
full clone, 9, 83
full iterative algebra, 10, 215
fully invariant congruence, 95
functional completeness, 215
functionally complete algebra, 231
fundamental operation, 4

G-clone, 327
G-relational clone, 327
Gécseg, F., 176, 351, 355
Galois-closed subrelation, 308
Galois-connection, 40, 92, 217, 301
  (HM Id, HM Mod), 336
  (GPol, GInv), 328
  (HPol, HInv), 332
  (Id, Mod), 92, 110, 335
  (Pol, Inv), 39, 217, 325
Ganter, B., 44, 309, 310
Gavalcová, T., 327
Gawrilow, G.P., 330
generalized hypersubstitution, 361
generated, finitely, 19
generated, infinitely, 19
generating system, 16, 19
Gluschkow, W.M., 21
Gorlov, V.V., 327, 331
Grätzer, G., 37
grammar
  extended regular, 172
  normal form, 171
  tree, 169
graph of a relation, 3
graph of an operation, 14
graph-algebra, 29
group, 5
  abelian, 280
groupoid, 5
  abelian, 5
  commutative, 5
Giuli, E., 302
Gumm, H.P., 203, 296

H-clones, 333
Hagemann, J.A., 204
Halting Problem, 187
Hannák, L., 327
hemiprimal algebra, 241
heterogeneous algebra, 11, 159, 216
Higgins, P.J., 216
Hoa, N. van, 327
Hobby, D., 251, 277, 281
homogeneous algebra, 11, 215
Homomorphic Image Theorem, 57
Homomorphic Image Theorem for Tree-Recognizers, 178
homomorphism, 49, 176
  natural, 50
Homomorphism Theorem for Automata, 161
Hounnon, H., 342
hyperassociative variety, 340
hyperidentity, 336
hypersatisfaction, 336
hypersubstitution, 175, 357
  leftmost, 339
  outermost, 339
  permutation, 340
  regular, 339
  symmetrical, 340
hyperunifiable terms, 348
hyperunifier, 348

idempotency, 16, 25, 32
idempotent law, 7
idempotent mapping, 252
identity, 91
Ihringer, Th., 36, 67
individual variable, 165, 169
induced algebra, 252
induction, principle of structural, 18
infinitely generated, 19
initial automaton, 157
initial symbol, 169
input alphabet, 185
intent, 42
interpolation property, 232
intersection of algebras, 15
invariant, 217
invertibility, 5
irreducible element, 119
isomorphism, 49
  relative, 53
Isomorphism Theorem
  First, 58
  Second, 60
iteration, 149

Jablonskij, S.V., 229, 330
Jampachon, P., 342
join operation, 7
join-endomorphism, 260
join-semidistributive, 281
Jonsson's Condition, 197
Jonsson, B., 197, 201
Justschenko, J.L., 21

K-free algebra, 98
Kalužnin, L.A., 39, 218
kernel, 161
  of a function, 22
  of a homomorphism, 54
kernel operator, 45, 319
kernel system, 45
Kiss, E.W., 282
Kleene's Theorem, 151, 157
Kleene, S.C., 187
Klein, F., 39
Knoebel, A., 248
Knuth, D.E., 129, 136, 139
Knuth-Bendix completion procedure, 129, 140
Knuth-Bendix ordering, 142
Knuth-Bendix Theorem, 139
Koppitz, J., 340–342, 348, 349
Kruse, R.L., 110
Kudrjawzew, W.B., 330

Lüders, O., 206, 285
language, 166
  derived, 154
  recognizable, 151, 166
  regular, 149
lattice, 6
  0-1-simple, 267
  algebraic, 37
  bounded, 8
  complete, 8, 36
  distributive, 7, 110
  modular, 7
  of Boolean clones, 219
  Post's, 219
  tight, 267, 277
lattice homomorphism
  0-1-separating, 267
  0-separating, 267
  1-separating, 267
lattice type, 276
Lau, D., 219, 229, 335
Lawvere, F.W., 216
Leeratanavalee, S., 86
leftmost hypersubstitution, 339
length of a word, 148
linear Boolean function, 219
Lipson, J.D., 216
locally confluent reduction system, 123
locally finite
  algebra, 108
  class, 108
  variety, 282, 286
loop, 271
Lyndon, R.C., 110

M-hyperidentity, 336
M-solid variety, 336
Main Theorem for Conjugate Pairs of Closure Operators, 304
Main Theorem of Equational Theory, 106
majority term, 196, 223, 234
Mal'cev term, 193, 194
Mal'cev, A.I., 193, 215, 229
Mal'cev-type condition, 193
Marchenkov, S.S., 327
match of terms, 133
maximal clone, 245
maximal subalgebra, 20
McKenzie, R., 112, 187, 222, 239, 251, 277, 281
meet operation, 7
meet-endomorphism, 260
meet-semidistributive, 281
meet-semidistributive law, 112
minimal algebra, 256, 258
  type of, 276
minimal set, 258
minimal set of an algebra, 256
minimal tree-recognizer, 179
minimal variety, 109
Mitschke, A., 204
modular law, 203
module, 10
monoid, 5
monotone Boolean function, 219
monotone operation, 221
monotonicity, 16, 25, 32
multi-based algebra, 159
Murskij, V.L., 111

near-unanimity function, 224
Niwczyk, S., 348
noetherian relation, 120
non-deterministic
  algebra, 168
  automaton, 158
  operation, 167
  tree-recognizer, 168
non-terminating symbol, 169
normal form, 101, 119, 130
  canonical, 123
normal form grammar, 171
nullary operation, 4

Oates, S., 110
Omitting Type Theorem, 281
operation
  Boolean, 3, 206, 218
  depending on i-th variable, 269
  essentially binary, 269
  fundamental, 4
  join, 7
  linear, 38
  meet, 7
  monotone, 221
  non-deterministic, 167
  nullary, 3
  on a set, 3
  polynomial, 87
  term, 82
operators H, S, P, 102
outermost hypersubstitution, 339

Pöschel, R., 39, 218, 309, 327, 331, 335
Plonka, J., 353
Pabhapote, N., 350
Papert, D., 285
paraprimal algebra, 241
partial closure operator, 324
partial order, 7
Perkins, P., 111
permutable congruences, 193
permutable relations, 65
permutation algebra, 257, 269
permutation group, 257
permutation hypersubstitution, 340
permutation-solid variety, 340
Pixley, A.F., 199, 223, 224, 235, 244
Polák, L., 340
Polin, S.V., 111
polymorphism, 217
polynomial, 86
polynomial clone, 231
polynomial equivalence, 270, 296
polynomial isomorphism, 265
polynomial operations, 87
Poomsa-ard, T., 342
Post Algebra, 233
Post's lattice, 219
Post, E.L., 207, 218, 219, 228, 229, 232, 233, 245, 330
Powell, M.B., 110
Pröhle, E., 282
pre-hypersubstitution, 340
prefix order, 132
preprimal algebra, 245
preservation relation, 38, 217, 325
presolid variety, 340
primal algebra, 231, 278, 286
prime interval, 277
Principle of Noetherian Induction, 124
product
  direct, 63
  subdirect, 69
production, 169, 183
projection, 9, 64
proper order relation, 141

Quackenbush, R.W., 229
quasigroup, 6, 270
quasilinear function, 230
quasiprimal algebra, 241
quotient algebra, 27, 115
quotient automaton, 161

R-module, 10
ranked alphabet, 165
recognition relation, 147
recognizable
  language, 151, 166
  word, 151
recognizer, 11
reduced automaton, 163
reduced tree-recognizer, 179
reducible element, 119
reduct element, 119
reduction
  non-terminating, 120
  terminating, 119
reduction order, 141
reduction rule, 119, 129, 132
reduction system, 123
  complete, 123
  completion, 125
  equivalent, 125
  locally confluent, 123
reflexive closure, 117
reflexive relation, 246
regular expressions, 149
regular hypersubstitution, 339
regular language, 149
regular subprimal algebra, 241
regular-solid variety, 339
Reichel, M., 302, 330, 342
relation
  central, 231
  Church-Rosser, 121
  confluent, 121
  invariant, 39
  noetherian, 120
  proper order, 141
  reduction, 141
  reflexive, 246
  reflexive closure, 117
  symmetric closure, 117
  t-universal, 231
  terminating, 120
  totally reflexive, 230, 246
  totally symmetric, 230
  transitive closure, 117
  trivial, 23
relational clones, 218
relative automorphism, 53
relative isomorphism, 53
relatively free algebra, 98
replacement rule, 97, 130
Reschke, M., 206, 224, 285
residual bound, 112
residually finite variety, 112, 187
residually small variety, 112, 286
restriction, 53
  of a mapping, 251
  of an algebra to a set, 252
  of an equivalence relation, 252
  of an operation, 252
ring, 6
Robinson, J.A., 134
root of a tree, 78
Rosenberg, I.G., 229, 327, 328
Rousseau, G., 240
rules of consequence, 97

Slupecki criterion, 231
Slupecki, J., 231
satisfaction, 91
scalar, 11
Schweigert, D., 335, 342
Šešelja, B., 324
Second Isomorphism Theorem, 60
self-dual, 218
semantic tree, 78
semantical unifier, 348
semidistributive algebra, 281
semidistributive variety, 281
semigroup, 5
  free, 101, 148
  left-zero, 110
  right-zero, 110
  zero, 110
semilattice, 8, 110
semilattice type, 276
semiprimal algebra, 240
Shum, K.P., 323
Siekmann, J.H., 348
simple algebra, 23, 71
singular subprimal algebra, 241
skew field, 6
snag, 282
solid variety, 336
solvable algebra, 296
Steinby, M., 176, 351, 355
strong term condition, 279
strongly abelian algebra, 280
strongly extensive mapping, 260
structural induction, 18
subalgebra, 14
  maximal, 20
Subalgebra Criterion, 14
subalgebra lattice, 19
subautomaton, 159
subclone, 83
subdirect product, 69
subdirectly irreducible algebra, 71, 104, 112
subprimal algebra, 241
substitution, 130, 347
substitution rule, 97, 130
subvariety, 109
subvariety lattice, 110
superposition, 9, 216
symbol
  initial, 169
  non-terminating, 169
symmetric closure, 117
symmetrical hypersubstitution, 340
syntactical hyperunifier, 348
syntactical unifier, 348
system of sets
  inductive, 35
  upward directed, 35

t-universal relation, 231
tame algebra, 262
tame congruence, 263
tape symbol, 185
Tardos, G., 248
Tarski's Finite Basis Problem, 187
Tarski, A., 320
Taylor, W., 216
Tepavčević, A., 324
term
  inductive definition, 76
  majority, 196
  Mal'cev, 193
  nullary, 77
term algebra, 80
term clone, 84, 220, 231
term condition, 280, 290
  strong, 279
term condition for congruences, 292
term operation, 82, 91
term reduction system, 129, 132
term rewriting system, 129, 132
terminating reduction, 119
terminating relation, 120
ternary discriminator, 242
Thatcher, J.W., 182, 356
tight lattice, 267, 277
totally reflexive relation, 230, 246
totally symmetric relation, 230
trace of a minimal algebra, 259
trace of an interval, 277
Traczyk, T., 234
transformation, 331
transitive closure, 117
transitive hull operator, 26
translation, 24
tree, 78
tree grammar, 169
tree homomorphism, 175
tree transducer, 183
tree transformation, 183, 358
tree-recognizer, 165
  connected, 179
  deterministic, 168
  minimal, 179
  non-deterministic, 168
  reduced, 179
trivial relation, 23
trivial variety, 109
Turing machine, 185
Turing, A., 185
type, 76
  of a minimal algebra, 276
  of a prime interval, 277
  of a tame algebra, 276
  of a tame interval, 277
  of a variety, 277
  of an algebra, 4
  of terms, 76

unar, 5
unary operation, 4
unary type, 276
unifiable terms, 133
unifier, 347
  semantical, 348
  syntactical, 348
universe of an algebra, 4

variable, 76
  individual, 169
variety, 102
  arithmetical, 201
  congruence distributive, 112, 196, 274
  congruence modular, 203
  congruence permutable, 274
  congruence regular, 205
  finitely based, 110
  generated by class K, 104
  hyperassociative, 340
  locally finite, 282, 286
  M-solid, 336
  minimal, 109
  permutation-solid, 340
  presolid, 340
  regular-solid, 339
  residually finite, 112, 187
  residually small, 112, 286
  semidistributive, 281
  solid, 336
vector, 11
vector space, 11, 32, 270
vector space type, 276

weak initial automaton, 157
Webb, D.L., 233
weight function, 142
Werner, H., 235
Willard, R., 113
Wille, R., 42, 44, 205, 309, 310
Wismath, S.L., 323, 342, 349
word, 148
word problem, 116, 347
words in the free semigroup, 101
Wright, J.B., 356

Yang, A., 323
yields, 97

Zeitlin, G.J., 21
