Abstract
In the first part we introduce binary representations of both lambda
calculus and combinatory logic, together with very concise interpreters
that witness their simplicity. Along the way we present a simple graphical
notation for lambda calculus, a new empty list representation, improved
bracket abstraction, and a new fixpoint combinator. In the second part
we review Algorithmic Information Theory, for which these interpreters
provide a convenient vehicle. We demonstrate this with several concrete
upper bounds on program-size complexity.
1 Introduction
The ability to represent programs as data and to map such data back to pro-
grams (known as reification and reflection [10]), is of both practical use in
metaprogramming [16] and theoretical use in computability and logic [19].
It comes as no surprise that the pure lambda calculus, which represents both
programs and data as functions, is well equipped to offer these features. In [7],
Kleene was the first to propose an encoding of lambda terms, mapping them
to Gödel numbers, which can in turn be represented as so-called Church
numerals. Decoding such numbers is somewhat cumbersome, and not particularly
efficient. In search of simpler constructions, various alternative encodings have
been proposed using higher-order abstract syntax [9] combined with the stan-
dard lambda representation of signatures [12]. A particularly simple encoding
was proposed by Mogensen [25], for which the term λm.m(λx.x)(λx.x) acts as
a self-interpreter. The prevalent data format, both in information theory and
in practice, however, is not numbers, or syntax trees, but bits. We propose
binary encodings of both lambda and combinatory logic terms, and exhibit rel-
atively simple and efficient interpreters (using the standard representation of
bit-streams as lists of booleans).
This gives us a representation-neutral notion of the size of a term, measured
in bits. More importantly, it provides a way to describe arbitrary data with, in a
sense, the least number of bits possible. We review the notion of how a computer
reading bits and outputting some result constitutes a description method, and
how universal computers correspond to optimal description methods. We then
pick specific universal computers based on our interpreters and prove several of
the basic results of Algorithmic Information Theory with explicit constants.
2 Lambda Calculus
We only summarize the basics here. For a comprehensive treatment we refer
the reader to the standard reference [20].
Assume a countably infinite set of variables

a, b, . . . , x, y, z, x0, x1, . . .

The set Λ of lambda terms is built up from variables by abstraction

(λx.M)

and application

(M N),
where x is any variable and M, N are lambda terms. (λx.M ) is the function
that maps x to M , while (M N ) is the application of function M to argument
N . We sometimes omit parentheses, understanding abstraction to associate to
the right, and application to associate to the left, e.g. λx.λy.x y x denotes
(λx.(λy.((x y)x))). We also join consecutive abstractions as in λx y.x y x.
The free variables FV(M) of a term M are those variables not bound by
an enclosing abstraction. Λ0 denotes the set of closed terms, i.e. with no free
variables. The simplest closed term is the identity λx.x.
We consider two terms identical if they only differ in the names of bound
variables, and denote this with ≡, e.g. λy.y x ≡ λz.z x. The essence of λ
calculus is embodied in the β-conversion rule, which equates

(λx.M) N = M[x := N],

where M[x := N] denotes the result of substituting N for all free occurrences of x in M.
A term with no β-redex, that is, no subterm of the form (λx.M )N , is said to
be in normal form. Terms may be viewed as denoting computations of which
β-reductions form the steps, and which may halt with a normal form as the end
result.
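To make these notions concrete, here is a minimal Haskell sketch (ours, not part of the implementation at [27]) of lambda terms and normal-order β-reduction; it uses the de Bruijn indices introduced further below in place of variable names, sidestepping variable capture.

-- Lambda terms with de Bruijn indices: Var 0 refers to the nearest enclosing Lam.
data Term = Var Int | Lam Term | App Term Term deriving (Eq, Show)

-- shift free indices >= c by d
shift :: Int -> Int -> Term -> Term
shift d c (Var n)   = Var (if n >= c then n + d else n)
shift d c (Lam t)   = Lam (shift d (c + 1) t)
shift d c (App t u) = App (shift d c t) (shift d c u)

-- substitute s for index j in t
subst :: Int -> Term -> Term -> Term
subst j s (Var n)   = if n == j then s else Var n
subst j s (Lam t)   = Lam (subst (j + 1) (shift 1 0 s) t)
subst j s (App t u) = App (subst j s t) (subst j s u)

-- one normal-order (leftmost-outermost) β-step, if a redex exists
step :: Term -> Maybe Term
step (App (Lam t) u) = Just (shift (-1) 0 (subst 0 (shift 1 0 u) t))
step (App t u)       = case step t of
                         Just t' -> Just (App t' u)
                         Nothing -> App t <$> step u
step (Lam t)         = Lam <$> step t
step (Var _)         = Nothing

-- normal form, if one exists; diverges on terms with none, like Ω below
nf :: Term -> Term
nf t = maybe t nf (step t)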
2.1 Some useful lambda terms
Define (for any M, P, Q, . . . , R)
I ≡ λx.x
true ≡ λx y.x
nil ≡ false ≡ λx y.y
⟨P, Q, . . . , R⟩ ≡ λz.z P Q . . . R
M [0] ≡ M true
M [i + 1] ≡ (M false)[i]
Y ≡ λf.((λx.x x)(λx.f (x x)))
Ω ≡ (λx.x x)(λx.x x)
Note that

true P Q = P,    nil P Q = false P Q = Q,    ⟨P, Q, . . . , R⟩ M = M P Q . . . R.

A sequence is thus represented by pairing its first element with its tail, the
sequence of remaining elements; the empty sequence is represented by nil. The
i’th element of a sequence M may be selected as M[i]. To wit:

⟨P, Q⟩[0] = ⟨P, Q⟩ true = P,    ⟨P, Q⟩[i + 1] = (⟨P, Q⟩ false)[i] = Q[i].

This notation also supports a more complicated list processing expression like
s (λa b c.c a b) M X N, which for s = nil reduces to M X N, and for s ≡ ⟨P, Q⟩
reduces to M P Q X N.
Y is the fixpoint operator, which satisfies

Y f = (λx.f (x x))(λx.f (x x)) = f (Y f).
This allows one to transform a recursive definition f = . . . f . . . into f =
Y(λf.(. . . f . . .)), which behaves exactly as desired.
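In Haskell the same transformation is provided by fix from Data.Function; a small illustration (ours):

import Data.Function (fix)   -- fix f = f (fix f), the analogue of Y

-- the recursive definition  len = \xs -> ... len ...  becomes:
len :: [a] -> Int
len = fix (\f xs -> case xs of { [] -> 0; _:t -> 1 + f t })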
Ω is the prime example of a term with no normal form, the equivalent of
an infinite loop.
Terms can be written in de Bruijn notation [13], where a variable is replaced
by the index n counting the number of λs between it and its binder; e.g., the
identity becomes λ0. The code M̂ of a de Bruijn term M is defined inductively by

n̂ ≡ 1^{n+1} 0
\widehat{λM} ≡ 00 M̂
\widehat{M N} ≡ 01 M̂ N̂

We call |M̂| the size of M. Binary strings themselves are represented as lists
of booleans, with bit 0 represented by true and bit 1 by false.
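With the Term type from the earlier sketch, the code and size are immediate (again ours, for illustration):

-- the code of M, as a string of '0'/'1' bits
code :: Term -> String
code (Var n)   = replicate (n + 1) '1' ++ "0"   -- 1^{n+1} 0
code (Lam m)   = "00" ++ code m
code (App m n) = "01" ++ code m ++ code n

-- the size |M̂| of M in bits; e.g. size (Lam (Var 0)) == 4, with code "0010"
size :: Term -> Int
size = length . code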
Theorem 2 There is a self-interpreter E such that for every term M and
arbitrary terms C, N:

E C (M̂ : N) = C (λz.M^{z[]}) N,

where M^{z[]} denotes M with its free indices n looked up as z[n] in the
bindings list z.
Proof: We take
E ≡ Y (λe c s.s (λa t.t (λb.a E0 E1 )))
E0 ≡ e (λx.b (c (λz y.x ⟨y, z⟩)) (e (λy.c (λz.x z (y z)))))
E1 ≡ b (c (λz.z b)) (λs.e (λx.c (λz.x (z b))) t)
of size 217, and note that the beta reduction from Y M to (λx.x x)(λx.M (x x))
saves 7 bits, while, as observed by Felgenhauer [15], replacing E by
Y (λe c s.s (λa t e c.t (λb.a E0 E1)) e c) and performing one more beta
reduction of (λx.M (x x)) saves another 4 bits.
Recall from the discussion of Y that the above is a transformed recursive
definition where e will take the value of E.
Intuitively, E works as follows. Given a continuation c and sequence s, it
extracts the leading bit a and tail t of s, extracts the next bit b from t, and
selects E0 to deal with a = true (abstraction or application), or E1 to deal
with a = false (an index).
E0 calls E recursively, extracting a decoded term x. In case b = true
(abstraction), it prepends a new variable y to bindings list z, and returns the
continuation applied to the decoded term provided with the new bindings. In
case b = false (application), it calls E recursively again, extracting another
decoded term y, and returns the continuation applied to the application of the
decoded terms provided with shared bindings.
E1 , in case b = true, decodes to the 0 binding selector. In case b = false,
it calls E recursively on t (coding for an index one less) to extract a binding
selector x, which is provided with the tail z b of the binding list to obtain the
correct selector.
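The same decoding can be written at the meta level in a few lines of Haskell (our sketch, mirroring what E computes; E itself operates entirely within the λ calculus):

-- decode a self-delimiting code, returning the term and the unread remainder
decode :: String -> (Term, String)
decode ('0':'0':s) = let (m, t) = decode s in (Lam m, t)   -- abstraction
decode ('0':'1':s) = let (m, t) = decode s                 -- application
                         (n, u) = decode t in (App m n, u)
decode ('1':s)     = index 0 s                             -- index 1^{n+1} 0
  where index n ('1':t) = index (n + 1) t
        index n ('0':t) = (Var n, t)
        index _ []      = error "truncated input"
decode _           = error "truncated input"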
We continue with the formal proof, using induction on M .
Consider first the case where M = 0. Then

E C (M̂ : N) = E C (10 : N)
  = ⟨false, ⟨true, N⟩⟩ (λa t.t (λb.a E0 E1))
  = ⟨true, N⟩ (λb.false E0 E1)
  = (E1 N)[b := true]
  = C (λz.z true) N,

as required. Next consider the case where M = n + 1. Then, by induction,

E C (M̂ : N) = E C (1^{n+2} 0 : N)
  = ⟨false, ⟨false, (1^n 0 : N)⟩⟩ (λa t.t (λb.a E0 E1))
  = (λs.e (λx.C (λz.x (z false))) (1^{n+1} 0 : N)) (1^n 0 : N)
  = E (λx.C (λz.x (z false))) (n̂ : N)
  = (λx.C (λz.x (z false))) (λz.n^{z[]}) N
  = C (λz.n^{(z false)[]}) N
  = C (λz.(z false)[n]) N
  = C (λz.z[n + 1]) N
  = C (λz.(n + 1)^{z[]}) N,
as required. Next consider the case M = λM′. Then, by induction and claim 1,

E C (\widehat{λM′} : N) = E C (00M̂′ : N)
  = ⟨true, ⟨true, (M̂′ : N)⟩⟩ (λa t.t (λb.a E0 E1))
  = e (λx.C (λz y.x ⟨y, z⟩)) (M̂′ : N)
  = (λx.C (λz y.x ⟨y, z⟩)) (λz.M′^{z[]}) N
  = C (λz y.(λz.M′^{z[]}) ⟨y, z⟩) N
  = C (λz.(λy.M′^{⟨y,z⟩[]})) N
  = C (λz.(λM′)^{z[]}) N,

as required. Finally consider the case M = M′ M″. Then, by induction,

E C (\widehat{M′ M″} : N) = E C (01M̂′ M̂″ : N)
  = ⟨true, ⟨false, (M̂′ M̂″ : N)⟩⟩ (λa t.t (λb.a E0 E1))
  = e (λx.e (λy.C (λz.x z (y z)))) (M̂′ M̂″ : N)
  = (λx.e (λy.C (λz.x z (y z)))) (λz.M′^{z[]}) (M̂″ : N)
  = e (λy.C (λz.(λz.M′^{z[]}) z (y z))) (M̂″ : N)
  = (λy.C (λz.M′^{z[]} (y z))) (λz.M″^{z[]}) N
  = C (λz.M′^{z[]} M″^{z[]}) N
  = C (λz.(M′ M″)^{z[]}) N,

as required, which completes the proof.
3 Combinatory Logic
Combinatory Logic (CL) is the equational theory of combinators—terms built
up, using application only, from the two constants K and S, which satisfy
S M N L = M L (N L)
KM N = M
CL may be viewed as a subset of the λ calculus, with K ≡ λx y.x and
S ≡ λx y z.x z (y z). Conversely, λ terms can be translated into combinators by
a process known as bracket abstraction, based on the following identities,
which are easily verified:
λx.x = I = SKK
λx.M = KM (x not free in M )
λx.M N = S (λx.M ) (λx.N )
For example,

λx y.y x ≡ λx.(λy.y x)
  = λx.(S I (K x))
  = S (K (S I)) (S (K K) I).
This gives the basic bracket abstraction λ⁰, defined by:

λ⁰x. x ≡ I
λ⁰x. M ≡ K M (x ∉ M)
λ⁰x. (M N) ≡ S (λ⁰x. M) (λ⁰x. N)
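A Haskell sketch (ours) of combinator terms and the abstraction λ⁰ just defined, where V names the free variables that bracket abstraction eliminates:

data CL = K | S | V String | CL :@ CL deriving (Eq, Show)
infixl 9 :@

-- CL has no binders, so any occurrence of a variable is free
occurs :: String -> CL -> Bool
occurs x (V y)    = x == y
occurs x (a :@ b) = occurs x a || occurs x b
occurs _ _        = False

-- λ⁰x.M, with I rendered as S K K
abs0 :: String -> CL -> CL
abs0 x (V y) | x == y       = S :@ K :@ K
abs0 x m | not (occurs x m) = K :@ m
abs0 x (a :@ b)             = S :@ abs0 x a :@ abs0 x b
abs0 _ m                    = K :@ m   -- atoms are already caught by the second rule

For example, abs0 "x" (V "y" :@ V "x") yields S :@ (K :@ V "y") :@ (S :@ K :@ K), i.e. S (K y) I, matching λ⁰x.(y x) = S (K y) I.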
Combinators have an even simpler binary encoding than lambda terms:

K̃ ≡ 00
S̃ ≡ 01
\widetilde{C D} ≡ 1 C̃ D̃

We call |M̃| the size of combinator M. Analogous to Theorem 2, we have:

Theorem 3 There is an interpreter F such that for every combinator M and
arbitrary terms C, N:

F C (M̃ : N) = C M N.
Proof: We take
F ≡ Y (λe c s.s(λa.a F0 F1 ))
F0 ≡ λt.t (λb.c (b K S))
F1 ≡ e (λx.e (λy.(c (x y))))
of size 131 and note that a toplevel beta reduction saves 7 bits in size, while
replacing K by b saves another 5 bits (we don’t define F that way because of
its negative impact on bracket abstraction).
Given a continuation c and sequence s, it extracts the leading bit a of s and,
from its tail t, the next bit b, and selects F0 to deal with a = true (K or S), or
F1 to deal with a = false (application). Verification is straightforward and left
as an exercise to the reader.
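At the meta level, this format decodes in a few lines of Haskell (our sketch, reusing the CL type from the bracket abstraction example above; F itself works within the λ calculus):

decodeCL :: String -> (CL, String)
decodeCL ('0':'0':s) = (K, s)                     -- 00
decodeCL ('0':'1':s) = (S, s)                     -- 01
decodeCL ('1':s)     = let (c, t) = decodeCL s    -- 1 C̃ D̃
                           (d, u) = decodeCL t in (c :@ d, u)
decodeCL _           = error "truncated input"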
We conjecture that any self-interpreter for any binary representation of com-
binatory logic must be at least 14 bytes in size. The next section considers
translations of F which yield a self-interpreter of CL.
Bracket abstraction λ⁰ yields a rather large combinator translation of F. A
first improvement, λ¹, additionally applies the η-rule

λ¹x. (M x) ≡ M (x ∉ M)

whenever possible. Now the size of F as a combinator is only 281, just over half
as big.
Turner [26] noticed that repeated use of bracket abstraction can lead to a
quadratic expansion on terms such as

X ≡ λa b . . . z.(a b . . . z) (a b . . . z).

Our improved bracket abstraction λ² avoids this blowup by applying the
following rules, in order:

λ²x. (S K M) ≡ S K (for all M)
λ²x. M ≡ K M (x ∉ M)
λ²x. x ≡ I
λ²x. (M x) ≡ M (x ∉ M)
λ²x. (x M x) ≡ λ²x. (S S K x M)
λ²x. (M (N L)) ≡ λ²x. (S (λ²x. M) N L) (M, N combinators)
λ²x. ((M N) L) ≡ λ²x. (S M (λ²x. L) N) (M, L combinators)
λ²x. ((M L) (N L)) ≡ λ²x. (S M N L) (M, N combinators)
λ²x. (M N) ≡ S (λ²x. M) (λ²x. N)
The first rule exploits the fact that S K M behaves as identity, whether M
equals K, x or anything else. The fifth rule avoids introduction of two Is. The
sixth rule prevents occurrences of x in L from becoming too deeply nested,
while the seventh does the same for occurrences of x in N . The eighth rule
abstracts an entire expression L to avoid duplication. The operation λ2 x. M
for combinators M will normally evaluate to K M , but takes advantage of
the first rule by considering any S K M a combinator. Where λ1 gives an X
combinator of size 2030, λ2 brings this down to 374 bits.
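The rules translate directly into Haskell (our sketch, reusing CL and occurs from above); note how comb treats any S K M as a combinator, in line with the first rule:

-- a "combinator" has no variables, except that any S K M counts as one
comb :: CL -> Bool
comb (S :@ K :@ _) = True
comb (V _)         = False
comb (a :@ b)      = comb a && comb b
comb _             = True

-- λ²x.M, trying the special rules in order before the general last one
abs2 :: String -> CL -> CL
abs2 _ (S :@ K :@ _)                         = S :@ K
abs2 x m | not (occurs x m)                  = K :@ m
abs2 x (V y) | x == y                        = S :@ K :@ K
abs2 x (m :@ V y) | x == y, not (occurs x m) = m
abs2 x (V y :@ m :@ V z) | x == y, x == z    = abs2 x (S :@ S :@ K :@ V x :@ m)
abs2 x (m :@ (n :@ l)) | comb m, comb n      = abs2 x (S :@ abs2 x m :@ n :@ l)
abs2 x (m :@ n :@ l) | comb m, comb l        = abs2 x (S :@ m :@ abs2 x l :@ n)
abs2 x (m :@ l :@ (n :@ l')) | l == l', comb m, comb n = abs2 x (S :@ m :@ n :@ l)
abs2 x (m :@ n)                              = S :@ abs2 x m :@ abs2 x n
abs2 _ m                                     = K :@ m   -- unreachable: atoms fall under rule two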
For F the improvement is more modest, to 275 bits. For further improve-
ments we turn our attention to the unavoidable fixpoint operator.
Y, due to Curry, is of minimal size in the λ calculus. At 25 bits, it's 5 bits
shorter than Turing's alternative fixpoint operator. In combinatory logic we
use instead the new fixpoint combinator

S S K (S (K (S S (S (S S K)))) K)

of size 35 bits; writing A ≡ S (K (S S (S (S S K)))) K, one checks that
S S K A f = A f A = f (S S K A f).
4 Algorithmic Information Theory
The theory of program size complexity, which has become known as Algo-
rithmic Information Theory or Kolmogorov complexity after one of its founding
fathers, has found fruitful application in many fields such as combinatorics,
algorithm analysis, machine learning, machine models, and logic.
In this section we propose a concrete definition of Kolmogorov complexity
that is (arguably) as simple as possible, by turning the above interpreters into
a ‘universal computer’.
Intuitively, a computer is any device that can read bits from an input stream,
perform computations, and (possibly) output a result. Thus, a computer is a
method of description in the sense that the string of bits read from the input
describes the result. A universal computer is one that can emulate the behaviour
of any other computer when provided with its description. Our objective is to
define, concretely, for any object x, a measure of complexity of description
C(x) that shall be the length of its shortest description. This requires fixing
a description method, i.e. a computer. By choosing a universal computer, we
achieve invariance: the complexity of an object under the chosen universal
computer is at most a constant greater than its complexity under any other
description method.
Various types of computers have been considered in the past as description
methods.
Turing machines are an obvious choice, but turn out to be less than ideal:
The operating logic of a Turing machine—its finite control—is of an irregular
nature, having no straightforward encoding into a bitstring. This makes con-
struction of a universal Turing machine that has to parse and interpret a finite
control description quite challenging. Roger Penrose takes up this challenge in
his book [1], at the end of Chapter 2, resulting in a universal Turing machine
whose own encoding is an impressive 5495 bits in size, over 26 times that of E.
The ominously named language ‘Brainfuck’ which advertises itself as “An
Eight-Instruction Turing-Complete Programming Language” [24], can be con-
sidered a streamlined form of Turing machine. Indeed, Oleg Mazonka and Daniel
B. Cristofani [18] managed to write a very clever BF self-interpreter of only 423
instructions, which translates to 423 · log(8) = 1269 bits (the alphabet used is
actually ASCII at 7 or 8 bits per symbol, but the interpreter could be redesigned
to use 3-bit symbols and an alternative program delimiter).
In [5], Levin stresses the importance of a (descriptional complexity) measure,
which, when compared with other natural measures, yields small constants, of
at most a few hundred bits. His approach is based on constructive objects
(c.o.’s) which are functions from and to lower ranked c.o.’s. Levin stops short of
exhibiting a specific universal computer though, and the abstract, almost topo-
logical, nature of algorithms in the model complicates a study of the constants
achievable.
In [2], Gregory Chaitin paraphrases John McCarthy about his invention of
LISP, as “This is a better universal Turing machine. Let’s do recursive function
theory that way!” Later, Chaitin continues with “So I’ve done that using LISP
because LISP is simple enough, LISP is in the intersection between theoretical
and practical programming. Lambda calculus is even simpler and more elegant
than LISP, but it’s unusable. Pure lambda calculus with combinators S and K,
it’s beautifully elegant, but you can’t really run programs that way, they’re too
slow.”
There is however nothing intrinsic to λ calculus or CL that is slow; only
such choices as Church numerals for arithmetic can be said to be slow, but
one is free to do arithmetic in binary rather than in unary. Frandsen and
Sturtivant [11] amply demonstrate the efficiency of λ calculus with a linear
time implementation of k-tree Turing Machines. Clear semantics should be a
primary concern, and Lisp is somewhat lacking in this regard [4]. This paper
thus develops the approach suggested but discarded by Chaitin.
Combining the interpreters with trivial continuations yields universal
computers: take U ≡ E ⟨Ω⟩ and U′ ≡ F I. By Theorems 2 and 3,

U (M̂ : N) = M N
U′ (M̃ : N) = M N

for every closed λ-term or combinator M and arbitrary N, immediately estab-
lishing their universality.
The universal computers essentially define new binary languages, which
we may call universal binary lambda calculus and universal combinatory logic,
whose programs comprise two parts. The first part is a program in one of the
original binary languages, while the second part is all the binary data that is
consumed when the first part is interpreted. It is precisely this ability to embed
arbitrary binary data in a program that allows for universality.
Note that by Theorem 2, the continuation ⟨Ω⟩ in U results in a term M^{Ω[]}.
For closed M, this term is identical to M, but in case M is not closed, a free
index n at λ-depth i is now bound to Ω[n − i], meaning that any attempt
to apply a free index diverges. Thus the universal computer essentially forces
programs to be closed terms.
We can now define the Kolmogorov complexity of a term x, which comes in
four flavors. In the simple version, programs are terminated with N = nil and
the result must equal x. In the prefix version, programs are terminated with the
unsolvable Ω, and the result must equal x. In a former prefix definition, now considered
unnecessarily complicated and renamed Kp, programs were not terminated, and
the result had to equal the pair of x and the remainder of the input. In all cases
the complexity is conditional on zero or more terms yi .
Definition 4
KS(x|y1, . . . , yk) = min{l(p) | U (p : nil) y1 . . . yk = x}
KP(x|y1, . . . , yk) = min{l(p) | U (p : Ω) y1 . . . yk = x}
Kp(x|y1, . . . , yk) = min{l(p) | U (p : z) y1 . . . yk = ⟨x, z⟩}
The definition also applies to infinite terms according to the infinitary lambda
calculus of [21].
In the special case of k = 0 we obtain the unconditional complexities KS(x)
and KP (x).
Finally, for a binary string s, we can define its monotone complexity as
4.2 Monadic IO
The reason for preserving the remainder of input in the prefix case is to facilitate
the processing of concatenated descriptions, in the style of monadic IO [22].
Although a pure functional language like λ calculus cannot define functions with
side effects, as traditionally used to implement IO, it can express an abstract
data type representing IO actions; the IO monad. In general, a monad consists
of a type constructor and two functions, return and bind (also written >>= in
infix notation) which need to satisfy certain axioms [22]. IO actions can be seen
as functions operating on the whole state of the world, and returning a new state
of the world. Type restrictions ensure that IO actions can be combined only
through the bind function, which according to the axioms, enforces a sequential
composition in which the world is single-threaded. Thus, the state of the world
is never duplicated or lost. In our case, the world of the universal machine
consists of only the input stream. The only IO primitive needed is readBit,
which maps the world onto a pair of the bit read and the new world. But a list
is exactly that; a pair of the first element and the remainder. So readBit is
simply the identity function! The return function, applied to some x, should
map the world onto the pair of x and the unchanged world, so it is defined
by return ≡ λx y.⟨x, y⟩. Finally, the bind function, given an action x and
a function f, should subject the world y to action x (producing some ⟨a, y′⟩)
followed by action f a, which is defined by bind ≡ λx f y.x y f (note that
⟨a, y′⟩ f = f a y′). One may readily verify that these definitions satisfy the
monad axioms. Thus, we can write programs for U either by processing the
input stream explicitly, or by writing the program in monadic style. The latter
can be done in the pure functional language ‘Haskell’ [23], which is essentially
typed lambda calculus with a lot of syntactic sugar.
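Rendered in Haskell (a sketch of ours; the actual implementation at [27] may differ), this is precisely a state monad whose state is the remaining input:

newtype IOBit a = IOBit { runIOBit :: [Bool] -> (a, [Bool]) }

instance Functor IOBit where
  fmap f m = IOBit $ \w -> let (a, w') = runIOBit m w in (f a, w')

instance Applicative IOBit where
  pure x = IOBit $ \w -> (x, w)            -- return: pair x with the unchanged world
  mf <*> ma = IOBit $ \w -> let (f, w')  = runIOBit mf w
                                (a, w'') = runIOBit ma w'
                            in (f a, w'')

instance Monad IOBit where
  m >>= f = IOBit $ \w -> let (a, w') = runIOBit m w in runIOBit (f a) w'

-- readBit: on lists, splitting off the head is "the identity" on a pair
readBit :: IOBit Bool
readBit = IOBit $ \(b:w) -> (b, w)         -- partial: assumes a non-empty world

-- example: count the leading 1-bits (True) of the input
countOnes :: IOBit Int
countOnes = do b <- readBit
               if b then fmap succ countOnes else pure 0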
4.3 An Invariance Theorem
The following theorem is the first concrete instance of the Invariance Theorem,
2.1.1 in [6].
n ∈ N:       0  1  2  3   4   5   6   7    8    9    . . .
x ∈ {0, 1}∗: ε  0  1  00  01  10  11  000  001  010  . . .
y ∈ {1, 2}∗: ε  1  2  11  12  21  22  111  112  121  . . .

Figure 1: binary natural tree
In [8], Vladimir Levenshtein defines a universal code for the natural numbers
that corresponds to concatenating the edges on the path from 0 to n, prefixed
with a unary encoding of the depth of vertex n in the tree. The resulting set of
codewords is prefix-free, meaning that no string is a proper prefix of another,
which is the same as saying that the strings in the set are self-delimiting.
Prefix-free sets satisfy the Kraft inequality:

Σ_s 2^{−|s|} ≤ 1.

We've already seen two important examples of prefix-free sets, namely the set
of λ term encodings M̂ and the set of combinator encodings M̃. The Levenshtein
code n̄ of a number n can be defined recursively as

0̄ ≡ 0        \overline{n + 1} ≡ 1 \overline{l(n)} n,

where the trailing n denotes the binary string corresponding to n in Figure 1,
and l(n) its length.
Figure 2 shows the codes as segments of the unit interval, where code x covers
all the real numbers whose binary expansion starts as 0.x, and lexicographic
order translates into left-to-right order.
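A Haskell sketch (ours) of this code, where bin n computes the binary string that Figure 1 associates with n:

-- binary string for n: the binary expansion of n+1 with its leading 1 removed
bin :: Integer -> String
bin n = go (n + 1) ""
  where go 1 acc = acc
        go m acc = go (m `div` 2) ((if odd m then '1' else '0') : acc)

-- Levenshtein code: lev 0 = "0";  lev (n+1) = '1' : lev (l(n)) ++ bin n
lev :: Integer -> String
lev 0 = "0"
lev n = '1' : lev (toInteger (length s)) ++ s
  where s = bin (n - 1)
-- e.g. map lev [0..4] == ["0","10","1100","1101","1110000"]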
5 Upper bounds on complexity
Having provided concrete definitions of all key ingredients of algorithmic infor-
mation theory, it is time to prove some concrete results about the complexity
of strings.
The simple complexity of a string is upper bounded by its length:

KS(x) ≤ |Î| + l(x) = l(x) + 4.

The prefix complexity of a string satisfies

KP(x) ≤ |\widehat{delimit}| + l(x) = l(x) + 326.
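For instance (our illustration of the first bound), the witnessing program is the 4-bit code of the identity followed by the bits of x itself:

-- U parses 0010 as the identity, which returns the remainder of the input, i.e. x
ksWitness :: String -> String
ksWitness x = "0010" ++ x   -- length: l(x) + 4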
The corresponding program in Chaitin's LISP is

((' (lambda (loop) ((' (lambda (x*) ((' (lambda (x) ((' (lambda (y) (cons x
(cons y nil)))) (eval (cons (' (read-exp)) (cons (cons ' (cons x* nil))
nil)))))) (car (cdr (try no-time-limit (' (eval (read-exp))) x*)))))) (loop
nil)))) (' (lambda (p) (if (= success (car (try no-time-limit (' (eval
(read-exp))) p))) p (loop (append p (cons (read-bit) nil)))))))
of length 2872 bits.
We constructed an equivalent of “try” from scratch. The constant 784 is the
size of the term pairup defined below, containing a monadic lambda calculus
interpreter that tracks the input bits read so far (which due to space restrictions
is only sparsely commented):
\io. let
id = \x x;
true = \x \y x;
false = \x \y y;
nil = false;
-- parse binary lambda calculus using HOAS, capturing program.
-- uni :: ((t -> t) -> t) -- abstraction
-- -> (t -> (t -> t)) -- application
-- -> (([t] -> t) -> ([Bool] -> [Bool]) -> [Bool] -> r)
-- -- continuation taking program, parsed string, remainder of input
-- -> ([Bool] -> [Bool]) -- initial difference list (id)
-- -> [Bool] -- input
-- -> r
uni = \abs\app.let uni0 = \cnt\ps\xs.
xs (\b0.let ps0 = \ts.ps (\p.p b0 ts) in
\ys\uni0\cnt.ys (\b1.
let ps1 = \ts.ps0 (\p.p b1 ts) in
b0 (uni0 (\v1.(b1 (cnt (\ctx.abs (\v2.v1 (\p.p v2 ctx))))
(uni0 (\v2.cnt (\ctx.app (v1 ctx) (v2 ctx)))))))
(b1 (cnt (\ctx.ctx b1))
(\d\d.uni0 (\v.cnt (\ctx.v (ctx b1))) ps0 ys)) ps1)) uni0 cnt
in uni0;
absT = A;
boolT = \x. absT (\t. absT (\f. x t f));
consT = \x\xs. absT (\p. appT (appT p x) xs);
rest = \prog\ps.
let go = \is\pspi\xs.
caseT (nfT (appT (prog xs) (is O)))
xs -- A: impossible case
(result (pspi nil) xs) -- V: we’ve found the remainder
-- O: start over with a longer input list
(xs (\x\xs. go (\tl. is (consT (boolT x) tl))
(\tl. pspi (\p. p x tl)) xs))
in go id ps;
6 Future Research
It would be nice to have an objective measure of the simplicity and expres-
siveness of a universal machine. Sizes of constants in fundamental theorems
are an indication, but one that is all too easily abused. Perhaps diophantine
equations can serve as a non-arbitrary language into which to express the com-
putations underlying a proposed definition of algorithmic complexity, as Chaitin
has demonstrated for relating the existence of infinitely many solutions to the
random halting probability Ω. Speaking of Ω, our model provides a well-defined
notion of halting as well, namely when U (p : z) = ⟨M, z⟩ for any term M
(we might as well allow M without normal form). Computing upper and lower
bounds on the value of Ωλ , as Chaitin did for his LISP-based Ω, and Calude
et al. for various other languages, should be of interest as well. A big task
remains in finding a good constant for the other direction of the ‘Symmetry of
Information’ theorem, for which Chaitin has sketched a program. That con-
stant is bigger by an order of magnitude, making its optimization an everlasting
challenge.
7 Conclusion
The λ-calculus is a surprisingly versatile and concise language, in which not
only standard programming constructs like bits, tests, recursion, pairs and lists,
but also reflection, reification, and marshalling are readily defined, offering an
elegant concrete foundation of algorithmic information theory.
An implementation of Lambda Calculus and Combinatory Logic, along with
their binary and universal versions, written in Haskell, is available at [27].
8 Acknowledgements
I am greatly indebted to Paul Vitányi for fostering my research into concrete
definitions of Kolmogorov complexity, and to Robert Solovay, Christopher Hen-
drie and Bertram Felgenhauer for illuminating discussions on my definitions and
improvements in program sizes.
References
[1] R. Penrose, The Emperor’s New Mind, Oxford University press, 1989.
[11] Gudmund S. Frandsen and Carl Sturtivant, What is an Efficient Imple-
mentation of the λ-calculus?, Proc. ACM Conference on Functional Pro-
gramming and Computer Architecture (J. Hughes, ed.), LNCS 523, 289–312,
1991.
[12] J. Steensgaard-Madsen, Typed representation of objects by functions,
TOPLAS 11-1, 67–89, 1989.
[13] N.G. de Bruijn, Lambda calculus notation with nameless dummies, a tool
for automatic formula manipulation, Indagationes Mathematicae 34, 381–
392, 1972.
[14] H.P. Barendregt, Discriminating coded lambda terms, in (A. Anderson and
M. Zeleny eds.) Logic, Meaning and Computation, Kluwer, 275–285, 2001.
[15] Bertram Felgenhauer, private communication, August 27, 2011.
[16] François-Nicola Demers and Jacques Malenfant, Reflection in logic, func-
tional and object-oriented programming: a Short Comparative Study, Proc.
IJCAI Workshop on Reflection and Metalevel Architectures and their Ap-
plications in AI, 29–38, 1995.
[17] Daniel P. Friedman, Mitchell Wand, and Christopher T. Haynes, Essentials
of Programming Languages – 2nd ed, MIT Press, 2001.
[18] Oleg Mazonka and Daniel B. Cristofani, A Very Short Self-Interpreter,
http://arxiv.org/html/cs.PL/0311032, 2003.
[19] D. Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid, Basic Books,
Inc., 1979.
[20] H.P. Barendregt, The Lambda Calculus, its Syntax and Semantics, revised
edition, North-Holland, Amsterdam, 1984.
[21] J.R. Kennaway, J.W. Klop, M.R. Sleep and F.J. de Vries, Infinitary
Lambda Calculus, Theoretical Computer Science 175(1), 93–125, 1997.
[22] Simon Peyton Jones, Tackling the awkward squad: monadic input/output,
concurrency, exceptions, and foreign-language calls in Haskell, in Engineering
Theories of Software Construction (Tony Hoare, Manfred Broy, Ralf
Steinbruggen eds.), IOS Press, 47–96, 2001.
[23] The Haskell Home Page, http://haskell.org/.
[24] Brainfuck homepage, http://www.muppetlabs.com/~breadbox/bf/.
[25] Torben Æ. Mogensen, Linear-Time Self-Interpretation of the Pure Lambda
Calculus, Higher-Order and Symbolic Computation 13(3), 217–237, 2000.
[26] D. A. Turner, Another algorithm for bracket abstraction, J. Symbol. Logic
44(2), 267–270, 1979.
[27] J. T. Tromp, http://tromp.github.io/cl/cl.html, 2004.