Imperative Functional Programming Philip Wadler
Imperative Functional Programming Philip Wadler
1
Haskell array operations in terms of incremental ar- mainIO :: IO ()
rays. Hudak have proposed a similar method based mainIO = putcIO '!'
on continuations. Our method is more general than
his in the following sense: monads can implement This is the point at which being is converted to doing:
continuations, but not the converse. when executed, the putcIO action will be performed, and
write an exclamation mark to the standard output.
It is based (only) on the Hindley-Milner type system.
Some other proposals require linear types or existen-
tial types; ours does not. 2.1 Composing I/O operations
The functions dened above allow one to dene a single
We have implemented all that we describe in the con- action, but how can actions be combined? For example,
text of a compiler for Haskell (Hudak et al. [1992]), with how can we write a program to print two exclamation
the exception of the extension to arrays and reference marks? To do so, we introduce two \glue" combinators:
types. The entire I/O system provided by our compiler
is written in Haskell, using the non-standard extensions doneIO :: IO ()
we describe below. The language's standard Dialogue seqIO :: IO a -> IO b -> IO b
interface for I/O is supported by providing a function to
convert a Dialogue into our IO monad. The system is The compound action m `seqIO` n is performed, by rst
freely available by FTP. performing m and then performing n, returning whatever
n returns as the result of the compound action. (Back-
We do not claim any fundamental expressiveness or e- quotes are Haskell's syntax for an inx operator.) The
ciency which is not obtainable through existing systems, action doneIO does no I/O and returns the unit value,
except where arrays are concerned. Nevertheless we feel (). To illustrate, here is an action putsIO, which puts a
that the entire system works particularly smoothly as a string to the standard output:
whole, from the standpoint of both programmer and im-
plementor. putsIO :: [Char] -> IO ()
putsIO [] = doneIO
putsIO (a:as) = putcIO a `seqIO`
2 Overview putsIO as
We need a way to reconcile being with doing: an expres- We can now use putsIO to dene a program which prints
sion in a functional language denotes a value, while an \hello" twice:
I/O command should perform an action. We integrate
these worlds by providing a type IO a denoting actions mainIO = hello `seqIO` hello
that, when performed, may do some I/O and then return where
a value of type a. The following provide simple Unix- hello = putsIO "hello"
avoured I/O operations. This example illustrates the distinction between an action
getcIO :: IO Char and its performance: hello is an action which happens to
putcIO :: Char -> IO () be performed twice. The program is precisely equivalent
to one in which putsIO "hello" is substituted for either
Here getcIO is an action which, when performed, reads a or both of the occurrences of hello. In short, programs
character from the standard input, and returns that char- remain referentially transparent.
acter; and putcIO a is an action which, when performed,
writes the character a to the standard output. Actions In general, an action may also return a value. Again,
which have nothing interesting to return, such as putcIO, there are two combinators. The rst is again trivial:
return the empty tuple (), whose type is also written (). unitIO :: a -> IO a
Notice the distinction between an action and its perfor- If x is of type a, then unitIO x denotes the action that,
mance. Think of an action as a \script", which is per- when performed, does nothing save return x. The second
formed by executing it. Actions themselves are rst-class combines two actions:
citizens. How, then, are actions performed? In our sys-
tem, the value of the entire program is a single (perhaps bindIO :: IO a -> (a -> IO b) -> IO b
large) action, called mainIO, and the program is executed
by performing this action. For example, the following is If m :: IO a and k :: a -> IO b then m `bindIO` k
a legal Haskell program. denotes the action that, when performed, behaves as fol-
2
lows: rst perform action m, yielding a value x of type easier, by dening new action-manipulating combinators.
a, then perform action k x, yielding a value y of type b, For example, the denition of putsIO given above uses
and then return value y. To illustrate, here is an action explicit recursion. Here is an alternative way to write
that echoes the standard input to the standard output. putsIO which does not do so:
(In Haskell, \x -> e stands for a lambda abstraction; the
body of the abstraction extends as far as possible.) putsIO as = seqsIO (map putcIO as)
It will not have escaped the reader's notice that programs The functions take and repeat are standard list-
written in the monadic style look rather similar to imper- processing functions (with nothing to do with I/O) from
ative programs. For example, the echo program in C Haskell's standard prelude. The function repeat takes a
might look something like this: value and returns an innite list each of whose elements
is the given value. The function take takes a prex of
echo() { given length from a list.
loop: a = getchar(a);
These necessarily small examples could easily be pro-
if (a == eof)
grammed with explicit recursion without signicant loss
return;
of clarity (or even a gain!). The point we are making is
else { putchar(a);
that it is easy for the programmer to dene new \glue" to
goto loop; }
combine actions in just the way which is suitable for the
}
program being written. It's a bit like being able to dene
(Indeed, as we discuss later, our compiler translates the your own control structures in an imperative language.
echo function into essentially this C code.) Does the
monadic style force one, in eect, to write a functional 2.3 Calling C directly
facsimile of an imperative program, thereby losing any
advantages of writing in a functional language? We be- Since the \primitive" functions putcIO, getcIO, and so
lieve not. on must ultimately be implemented by a call to the un-
derlying operating system, it is natural to provide the
Firstly, the style in which one writes the functional pro- ability to call any operating system function directly. To
gram's internal computation is unaected. For instance, achieve this, we provide a new form of expression, the
the argument to putsIO can be computed using the usual ccall, whose general form is:
list-processing operations provided by a functional lan-
guage (list comprehensions, map, append, and the like). ccall proc e1 en
:::
Secondly, the power of higher-order functions and non- Here, proc is the name of a C procedure, and e1 , , en:::
strict semantics can be used to make I/O programming are the parameters to be passed to it. This expression
3
is an action, with type IO Int; when performed, it calls Request and Response are algebraic data types which
the named procedure, and delivers its result as the value embody all the possible I/O operations and their results,
of the action. Here, for example, are the denitions of respectively:
getcIO and putcIO:
data Request = Putc Char | Getc
putcIO a = ccall putchar a data Response = OK | OKCh Char
getcIO = ccall getchar
(For the purposes of exposition we have grossly simpli-
These ccalls directly invoke the system-provided func- ed these data types compared with those in standard
tions; no further runtime support is necessary. Using this Haskell.) A system \wrapper program" repeatedly gets
single primitive allows us to implement our entire I/O the next request from the list of requests returned by
system in Haskell. main, interprets and performs it, and attaches the re-
We dene ccall to be a language construct rather than sponse to the end of the response list to which main is
simply a function because: applied.
Here, for example, is the echo program written using a
The rst \argument" must be the literal name of the Dialogue. (In Haskell xs!!n extracts the n'th element
C procedures to be called, and not (say) an expres- from the list xs.)
sion which evaluates to a string which is the name of echo :: Dialogue
the function. Type information alone cannot express echo resps = Getc :
this. if (a == eof)
then []
Dierent C procedures take dierent numbers of ar- else Putc a :
guments, and some take a variable number of ar- echo (drop 2 resps)
guments. (It would be possible to check the type- where
correctness of the C call by reading the signature of OKCh a = resps!!1
the C procedure, but we do not at present do so.)
The diculties with this programming style are all too
Dierent C procedures take arguments of dierent obvious, and have been well rehearsed elsewhere (Perry
types and sizes. (At present, we only permit the [1991]):
arguments to be of base types, such as Char, Int,
Float, Double and so on, though we are working on
extensions which allow structured arguments to be It is easy to extract the wrong element of the re-
sponses, a synchronisation error. This may show up
built.) in a variety of ways. If the \2" in the above program
was erroneously written as \1" the program would
Treating ccall as a construct allows these variations to fail with a pattern-mathing error in getCharIO; if it
be accomodated without diculty. were written \3" it would deadlock.
The Response data type has to contain a constructor
3 Comparison with other I/O styles for every possible response to every request. Even
though Putc may only ever return a response OKChar,
In this section we brie
y compare our approach with two the pattern-matching performed by get has to take
other popular ones, dialogues and continuations. account of all these other responses.
Even more seriously, the style is not composable:
3.1 Dialogues there is no direct way to take two values of type
Dialogue and combine them to make a larger value
The I/O system specied for the Haskell language (Hudak
et al. [1992]) is based on dialogues, also called lazy streams of type Dialogue (try it!).
(Dwelly [1989]; O'Donnell [1985]; Thompson [1989]). In
Haskell, the value of the program has type Dialogue, a Dialogues and the IO monad have equal expressive power,
synonym for a function between a list of I/O responses to as Figure 1 demonstrates, by using Dialogues to emu-
a list of I/O requests: late the IO monad, and vice versa. The function dToIO,
which emulates Dialogues in terms of IO is rather cu-
type Dialogue = [Response] -> [Request] rious, because it involves applying the single dialogue
main :: Dialogue d to both bottom (?) and (later) to the \real" list
4
Dialogue to IO getcC :: (Char -> Result) -> Result
doneC :: Result
dToIO :: Dialogue -> IO ()
dToIO d Using these primitives, the echo program can be written
= case (d bottom) of as follows:
[] -> doneIO
(q:qs) -> doReq q `bindIO` \r -> echo :: Result -> Result
dToIO (\rs -> tail (d (r:rs))) echo c = getcC (\a ->
if (a == eof) then
bottom :: a then c
bottom = error "Should never be evaluated" else putcC a (echo c))
doReq :: Request -> IO Response Since we might want to do some more I/O after the echo-
doReq (GetChar f) ing is completed, we must provide echo with a continua-
= getCharIO f `bindIO` (\c -> tion, c, to express what to do when echo is nished. This
unitIO (OKChar c)) \extra argument" is required for every I/O-performing
doReq (PutChar f c) function if it is to be composable, a pervasive and tire-
= putCharIO f c `seqIO` unitIO OK some feature.
The above presentation of continuation-style I/O is a lit-
IO to Dialogue tle dierent from those cited above. In all those descrip-
type IO a = [Response]
tions, Result is an algebraic data type, with a construc-
-> (a, [Request], [Response])
tor for each primitive I/O operation. As with Dialogues,
execution is driven by a \wrapper" program, which eval-
ioToD :: IO () -> Dialogue
uates main, performs the operation indicated by the con-
ioToD action = \rs -> case (action rs) of
structor, and applies the continuation inside the construc-
(_, qs, _) -> qs
tor to the result. This approach has the disadvantage
that it requires existential types if polymorphic opera-
unitIO v = \rs -> (v, [], rs)
tions, such as those we introduce later in Section 5.3, are
bindIO op fop
to be supported.
= \rs -> let (v1, qs1, rs1) = op rs An obvious improvement, which we have not seen previ-
(v2, qs2, rs2) = fop v1 rs1 ously suggested, is to implement the primitive continu-
in (v2, qs1++qs2, rs2) ation operations (such as putcC, getcC and doneC) di-
rectly, making the Result type an abstract data type
Figure 1: Converting between Dialogue and IO with no operations dened on it other than the primi-
tives themselves. This solves the problem.
of responses (Hudak & Sundaresh [1989]; Peyton Jones Continuations are easily emulated by the IO monad, and
[1988]). This causes both duplicated work and a space vice versa, as Figure 2 shows. The comparison between
leak, but no more ecient purely-functional emulation is the monadic and continuation approach is further ex-
known. The reverse function, ioToD does not suer from plored in Section 6.
these problems, and this asymmetry is the main reason
that Dialogues are specied as primitive in Haskell. We 4 Implementing monadic I/O
return to this this matter in Section 5.3.
So far we have shown that an entire I/O system can be
3.2 Continuations expressed in terms of ccall, bindIO, and unitIO, and of
course the IO type itself. How are these combinators to
The continuation-style I/O model (Gordon [1989]; Hudak be implemented? One possibility is to build them in as
& Sundaresh [1989]; Karlsson [1982]; Perry [1991]) pro- primitives, but it turns out to be both simpler and more
vides primitive I/O operations which take as one of their ecient to implement all except ccall in Haskell.
arguments a continuation which says what to do after the
I/O operation is performed: The idea is that an action of type IO a is implemented as
a function, which takes as its input a value representing
main :: Result the entire current state of the world, and returns a pair,
putcC :: Char -> Result -> Result consisting of (a value representing) the new state of the
5
Continuations to IO unitIO a w = MkIORes a w
bindIO m k w = case (m w) of
type Result = IO ()
MkIORes a w' -> k a w'
cToIO :: Result -> IO () Notice that bindIO and unitIO carefully avoid duplicat-
cToIO r = r ing the world. Provided that the primitive ccall ac-
tions are combined only with these combinators, we can
putCharC :: File -> Char -> Result -> Result guarantee that the ccalls will be linked in a single, lin-
putCharC f c k = putCharIO f c `seqIO` k ear chain, connected by data dependencies in which each
ccall consumes the world state produced by the previous
getCharC :: File -> Char one. In turn this means that the ccall operations can
-> (Char -> Result) -> Result update the real world \in place".
getCharC f k = getCharIO f `thenIO` k
6
case x of to be instantiated at an unboxed type, such as Int#.)
MkInt x# -> MkInt (x# +# x#)
Thirdly, the ccall# primitive is recognised by the code
(Unboxed types and ccall are not part of standard generator and expanded to an actual call to C. Speci-
Haskell. They are mainly used internally in our compiler, cally, the expression:
though we do also make them available to programmers
as a non-standard extension.) case (ccall# proc a# b# c# w) of
MkIORes# n# w' -> ...
We apply exactly the same ideas to ccall. In particular,
instead of implementing ccall directly, we unfold every generates the C statement
use of ccall to make the argument evaluation explicit n# = proc(a#,b#,c#);
before using the truly primitive operation ccall#. For ...
example, the uses of ccall in the denitions of putcIO
and getcIO given above (Section 2.3), are unfolded thus: This simple translation is all that the code generator is
putcIO a = \w ->
required to do. The rest is done by generic program trans-
case a of
formations; that is, transformations which are not specic
MkChar a# ->
to I/O or even to unboxing (Peyton Jones & Launchbury
case (ccall# putchar a# w) of
[1991]).
MkIORes# n# w' -> MkIORes () w'
getcIO = \w ->
4.2 Where has the world gone?
case (ccall# getchar w) of But what has become of the world values in the nal
MkIORes# n# w' -> C code? The world value manipulated by the program
MkIORes (MkChar n#) w' represents the current state of the real world, but since the
Like Int, the type Char is implemented as an algebraic real world is updated \in place" the world value carries no
data type thus: useful information. Hence we simply arrange that no code
is ever generated to move values of type World. This is
data Char = MkChar Int# easy to do, as type information is preserved throughout
the compiler. In particular, the world is never loaded
The outer case expression of putcIO, therefore, evalu- into a register, stored in a data structure, or passed to C
ates a and extracts the bit-pattern a#, which is passed procedure calls.
to ccall#. The inner case expression evaluates the ex-
pression (ccall# putchar a# w), which returns a pair, Is it possible, then, to dispense with the world in the func-
constructed by MkIORes#, consisting of the value n# re- tional part of the implementation as well? For example,
turned by the C procedure putchar (which is ignored), can we dene the IORes type and bindIO combinators
and a new world w' (which is returned). like this?
In the case of getcIO, the (primitive, unboxed) value n# data IORes a = MkIORes a
returned by getchar is not ignored as it is in putcIO; bindIO m k w = case (m w) of
rather it is wrapped in a MkChar constructor, and re- MkIORes a -> k a w
turned as part of the result. No, we cannot! To see this, suppose that bindIO was ap-
The dierences between ccall and ccall# are as follows. plied to a function k which discarded its argument. Then,
Firstly, ccall# takes only unboxed arguments, ready to if bindIO was unfolded, and the expression (k r w) was
call C directly. simplied, there would be no remaining data dependency
to force the call of k to occur after that of m. A compiler
Secondly, it returns a pair built with MkIORes#, contain-would be free to call them in either order, which destroys
ing an unboxed integer result direct from the C call. Thethe I/O sequencing.
IORes# type is very similar to IORes:
To reiterate, the world is there to form a linear chain
data IORes# = MkIORes# Int# World of data dependencies between successive ccalls. It is
(IORes and IORes# are distinct types, because while our quite
to
safe to expose the representation of the IO type
code-improving transformations, because the chain of
extended type system recognises unboxed types, it does data dependencies will prevent any transformations which
not permit polymorphic type constructors, such as IORes, reorder the ccalls. Once the code generator is reached,
7
though, the work of the world values is done, so it is safe 4.4 A continuation-passing implementation
to generate no code for them.
Like most abstract data types, there is more than one way
to implement IO. In particular, it is possible to implement
the IO abstract type using a continuation-passing style.
4.3 echo revisited The type IO a is represented by a function which takes
a continuation expecting a value of type a, and returns a
The implementation we have outlined is certainly simple, value of the opaque type Result.
but is it ecient? Perhaps surprisingly, the answer is an type IO a = (a -> Result) -> Result
emphatic yes. The reason for this is that because the
combinators are written in Haskell, the compiler can un- It is easy to implement bindIO and unitIO:
fold them at all their call sites; that is, perform procedure
inlining. bindIO m k cont = m (\a -> k a cont)
Very little special-purpose code is required in the compiler unitIO r cont = cont r
to achieve this eect | essentially all that is required is What is there to choose between these this representation
that the Haskell denitions of bindIO, unitIO, putcIO of the IO type and the one we described initially (Sec-
and so on, be unfolded by the compiler. In contrast, if tion 4)? The major tradeo seems to be this: with the
bindIO were primitive, then every call to bindIO will re- continuation-passing representation, every use of bindIO
quire the construction of two heap-allocated closures for (even if unfolded) requires the construction of one heap-
its two arguments. Even if bindIO itself took no time at allocated continuation. In contrast, the implementation
all, this would be a heavy cost. we described earlier keeps the continuation implicitly on
To illustrate the eectiveness of the approach we have the stack, which is slightly cheaper in our system.
outlined, we return to the echo program of Section 2.1. If There is a cost to pay for the earlier representation,
we take the code there, unfold the calls of seqIO, doneIO, namely that a heavily left-skewed composition of bindIOs
eof, putcIO and getcIO, and do some simplication, we can cause the stack to grow rather large. In contrast,
get the following: the continuation-passing implementation may use a lot of
echo = \w ->
heap for such a composition, but its stack usage is con-
case (ccall# getchar w) of
stant.
MkIORes# a# w1 -> The main point is that the implementor is free to choose
case (a# ==# eof#) of the representation for IO based only on considerations of
T# -> MkIORes () w1 eciency and resource usage; the choice makes no dier-
F# -> case (ccall# putchar a# w1) of ence to the interface seen by the programmer.
MkIORes# n# w2 -> echo w2
8
What is required is a new combinator for the IO monad, 5.2 Asynchronous I/O
delayIO, which forks o a new branch from the main
\trunk": An even more dangerous but still useful combinator is
performIO, whose type is as follows:
delayIO :: IO a -> IO a
performIO :: IO a -> a
When performed, (delayIO action) immediately re-
turns a suspension which when it is subsequently forced It allows potentially side-eecting operations to take place
will perform the I/O specied by action. The relative which are not attached to the main \trunk" at all! The
interleaving of the I/O operations on the \trunk" and the proof obligation here is that any such side eects do not
\branch" is therefore dependent on the evaluation order aect the behaviour of the rest of the program. An obvi-
of the program. ous application is when one wishes to call a C procedure
which really is a pure function; procedures from a numer-
The delayIO combinator is dangerous (albeit useful), be- ical analysis library are one example.
cause the correctness of the program now requires that
arbitrary interleaving of I/O operations on the \trunk" Implementation. The implementation is quite simple:
and \branch" cannot aect the result. This condition performIO m = case (m newWorld) of
cannot be guaranteed by the compiler; it is a proof obli- MkIORes r w' -> r
gation for the programmer. In practice, we expect that
delayIO will be used mainly by system programmers. Here, newWorld is a value of type World conjured up out
of thin air, and discarded when the action m has been
With the aid of delayIO (and a few new primitives such performed.
as fOpenIO), it is easy to write a lazy readFile:
readFile :: [Char] -> IO [Char] 5.3 Assignment and reference variables
readFile s = fOpenIO s `bindIO` \f ->
delayIO (lazyRd f) Earlier, in Section 3.1, we discussed the apparently in-
soluble ineciency of dToIO, the function which emu-
lazyRd :: File -> IO [Char] lates Dialogues using the IO monad. We can solve this
lazyRd f problem by providing an extra general-purpose mecha-
= readChar f `bindIO` \a -> nism, that of assignable reference types and operations
if (a == eof) then over them (Ireland [1989]):
fCloseIO f `seqIO`
unitIO [] newVar :: a -> IO (Ref a)
else assignVar :: Ref a -> a -> IO ()
delayIO (lazyRd f) `bindIO` \as -> deRefVar :: Ref a -> IO a
unitIO (a:as)
The call newVar x allocates a fresh variable containing
The delayIO combinator provides essentially the power the value x; the call assignVar v x assigns value x to
of Gordon's suspend operator (Gordon [1989]). variable v; and the call deRefVar v fetches the value in
variable v. By making these side-eecting operations part
Implementation. A nice feature of the implementation of the IO monad, we make sure that their order of evalu-
technique outlined in Section 4 is that delayIO is very ation, and hence semantics, is readily explicable.
easy to dene:
With the aid of these primitives it is possible to write
delayIO m = \w -> let res = case (m w) of an ecent emulation of Dialogues using IO (Figure 3).
MkIORes r w' -> r The idea is to mimic a system which directly implements
in Dialogues, which follows the processing of each request
MkIORes res w with a destructive update to add a new response to the
end of the list of responses. Notice the uses of delayIO,
In contrast to bindIO, notice how delayIO duplicates the which re
ects the fact that there is no guarantee that
world w, and then discards the nal world w' of the de- dialogue will not evaluate a response before it has emit-
layed branch; it is this which allows the unsynchronised ted a request. If this occurs, the un-assigned variable is
interleaving of I/O operations on the \branch" with those evaluated, which elicits a suitable error message.
on the \trunk".
References in languages such as ML require a weakened
form of polymorphism in order to maintain type safety
9
dToIO :: Dialogue -> IO () The behaviour of these operations is specied by the usual
dToIO dialogue laws.
= newVar (error "Synch") `bindIO` \rsV ->
delayIO (deRefVar rsV) `bindIO` \rs -> lookup i (new v) = v
run (dialogue rs) rsV lookup i (update i v x) = v
lookup i (update j v x) = lookup i x
run :: [Request] -> Ref [Response] -> IO ()
where i 6= j in the last equation. In practice, these oper-
run [] v = doneIO
ations would be more complex; one needs a way to specify
run (req:reqs) v
the array bounds, for instance. But the above suces to
= doReq req `bindIO`
newVar (error "Synch") `bindIO`
\r ->
\rsV ->
explicate the main points.
delayIO (deRefVar rsV) `bindIO` \rs -> The ecient way to implement the update operation is
assignVar v (r:rs) `seqIO` to overwrite the specied entry of the array, but in a pure
run reqs rsV functional language this is only safe if there are no other
pointers to the array extant when the update operation
Figure 3: Ecient conversion from Dialogue to IO is performed. An array satisfying this property is called
single threaded, following Schmidt (Schmidt [1985]).
(Tofte [1990]). For instance, in ML a fresh reference to As an example, consider the following problem. An oc-
an empty list has type '_a list ref, where the type currence is either a denition pairing an index with a
variable '_a is weak, and so may be instantiated only value, or a use of an index.
once. In contrast, here a fresh reference to an empty list data Occ = Def Ind Val | Use Ind
has type IO (Ref a), and the type variable a is normal.
But no lack of safety arises, because an expression of this For illustration take Ind = Int and Val = Char. Given
type allocates a new reference each time it is evaluated. a list os of occurrences, the call uses os returns for each
The only way to change a value of type IO (Ref a) to one use the most recently dened value (or '-' if there is no
of type Ref a is via bindIO, but now the variable of type previous denition). If
Ref a is not let-bound, and so can only be instantiated
once anyway. Hence the extra complication of weak type os = [Def 1 'a', Def 2 'b', Use 1,
variables, required in languages with side eects, seems Def 1 'c', Use 2, Use 1]
unnecessary here. (We're indebted to Martin Odersky for then
this observation.)
uses os = ['a', 'b', 'c'].
10
6.1 Monadic arrays unitA (v:vs)
We believe that single threading is too important to leave This is somewhat lengthier than the previous example,
to the vagaries of an analyser. Instead, we use monads to but it is guaranteed safe to implement update by over-
guarantee single threading, in much the same way as was writing.
done with I/O. Analogous to the type IO a (the monad of
I/O actions), we provide an abstract type A a (the monad
of array transformers). 6.2 Continuation arrays
newA :: Val -> A a -> a An alternative method of guaranteeing single threading
lookupA :: Ind -> A Val for arrays has been proposed by Hudak [1992]. Like the
updateA :: Ind -> Val -> A () previous work of Swarup, Reddy & Ireland [1991], it is
unitA :: a -> A a based on continuations, but unlike that work it requires
bindA :: A a -> (a -> A b) -> A b no change to the type system.
For purposes of specication, we can dene these in terms As with the array monad, one denes an abstract type
of the proceeding operations as follows. supporting various operations. The type is C z, and the
operations are as follows.
type A a = Arr -> (a, Arr)
newC :: Val -> C z -> z
newA v m = fst (m (new v)) lookupC :: Ind -> (Val -> C z) -> C z
lookupA i = \x -> (lookup i x, x) updateC :: Ind -> Val -> C z -> C z
updateA i v = \x -> ((), update i v x) unitC :: z -> C z
unitA a = \x -> (a,x)
m `bindA` k = \x -> let (a,y) = m x in k a y Here a continuation, of type C z, represents the remaining
series of actions to be performed on the array, eventually
A little thought shows that these operations are indeed returning (via unitC) a value of type z.
single threaded. The only operation that could duplicate For purposes of specication, we can dene these in terms
the array is lookupA, but this may be implemented as of the array operations as follows.
follows: rst fetch the entry at the given index in the
array, and then return the pair consisting of this value type C z = Arr -> z
and the pointer to the array. To enforce the necessary
sequencing, we augment the above specication with the newC v c = c (new v)
requirement that lookupA and updateA are strict in the lookupC i d = \x -> d (lookup i x) x
index and array arguments (but need not be strict in the updateC i v c = \x -> c (update i v x)
value). unitC z = \x -> z
The above is given for purposes of specication only { the Again, these operations are single threaded if lookupC
actual implementation is along the lines of Section 4. and updateC are strict in the index and array arguments.
For convenience, dene seqA in terms of bindA in the For convenience, dene
usual way.
m $ c = m c
m `seqA` n = m `bindA` \a -> n
Here is the `denition-use' problem, recoded in monadic This lets us omit some parentheses, since m (\x -> n)
becomes m $ \x -> n.
style.
Here is the `denition-use' problem, recoded in continua-
uses :: [Occ] -> [Val] tion style.
uses os = newA '-' (loopA os)
uses :: [Occ] -> [Val]
loopA :: [Occ] -> A [Val] uses os = newC '-' (loopC os unitC)
loopA [] = unitA []
loopA (Def i v : os) = updateA i v `seqA` loopC :: [Occ] -> ([Val] -> C z) -> C z
loopA os loopC [] c = c []
loopA (Use i : os) = lookupA i `bindA` \v -> loopC (Def i v : os) c = updateC i v $
loopA os `bindA` \vs -> loopC os c
11
loopC (Use i : os) c = lookupC i $ \v -> type B a z = (a -> C z) -> C z
loopC os $ \vs ->
c (v:vs) newB v m = newC v (m unitC)
lookupB i = \d -> lookupC i d
This is remarkably similar to the monadic style, where updateB i v = \d -> updateC i v (d ())
$ takes the place of bindA and seqA, and the current unitB a = \d -> d a
continuation c takes the place of unitA. (If c plays the m `bindB` k = \d -> m (\a -> k a d)
role of unitA, why do we need unitC? Because it acts as
the `top level' continuation.) Again, it is easy to prove this implementation satises the
However, there are two things to note about the contin- given specications.
uation style. First, the types are rather more complex So monads are more powerful than continuations, but
{ compare the types of loopA and loopC. Second, the only because of the types! It is not clear whether this
monadic style abstracts away from the notion of contin- is simply an artifact of the Hindley-Milner type system,
uation { so there are no occurrences of c cluttering the or whether the types are revealing a dierence of funda-
dention of loopA. mental importance. (Our own intuition is the latter { but
it's only an intuition.)
6.3 Monads vs. continuations 6.4 Conclusion
We can formally compare the power of the two approaches The I/O approach outlined earlier manipulates a global
by attempting to implement each in terms of the other. state, namely the entire state of the machine accessible
Despite their similarities, the two approaches are not via a C program. What has been shown in this section
equivalent. Monads are powerful enough to implement is that this approach extends smoothly to manipulating
continuations, but not (quite) vice versa. local state, such as a single array. Further, although the
To implement continuations in terms of monads is sim- monad and continuation approaches are interconvertible
plicity itself. for I/O, they are not for arrays: monads are powerful
enough to dene continuations, but not the reverse.
type C z = A z
For actual use with Haskell, we require a slightly more so-
newC v c = newA v c
phisticated set of operations. The type A must take extra
lookupC i d = lookupA i `bindA` d
parameters corresponding to the index and value types,
updateC i v c = updateA i v `seqA` c the operation newA should take the array bounds, and so
unitC = unitA
on. By using a variant of newA that creates an unini-
tialised array, and returns the array after all updates are
It is an easy exercise in equational reasoning to to prove nished, it is possible to implement Haskell primitives
that this implementation is correct in terms of the speci- for creating arrays in terms of the simpler monad opera-
cations in Sections 6.1 and 6.2. tions. Thus the same strategy that works for implement-
ing I/O should work for implementing arrays: use a small
The reverse implementation is not possible. The trouble set of primitives based on monads, and depend on pro-
is the annoying extra type variable, z, appearing in the gram transformation to make this adequately ecient.
types of lookupC and updateC. This forces the introduc-
tion of a spurious type variable into any attempt to dene One question that remains is how well this approach ex-
monads in terms of continuations. Instead of a type A a, tends to situations where one wishes to manipulate more
the best one can do is to dene a type B a z. Here are than one state at a time, as when combining I/O with
the types of the new operations. array operations, or operating on two arrays. In this re-
spect eect systems or linear types may be superior; see
newB :: Val -> B a a -> a below.
lookupB :: Ind -> B Val z
updateB :: Ind -> Val -> B () z
unitB :: a -> B a z 7 Related work
bindB :: B a z -> (a -> B b z) -> B b z
7.1 Eect systems
And here are the implementations in terms of continua-
tions. Giord and Lucassen introduced `eect systems' which
use types to record the side-eects performed by a pro-
12
gram, and to determine which components of a program For example, here is the echo program again, written in
can run in parallel without interference (Giord & Lu- the style suggested by the Clean I/O system:
cassen [1986]). The original notion of eect was fairly
crude, there being only four possible eects: pure (no ef- echo :: File -> File -> World -> World
fect), allocate (may allocate storage), function (may read echo fi fo w = if a == eof
storage), procedure (may write storage). New systems are then w1
more rened, allowing eects to be expressed separately else echo (putChar fo a w1)
for dierent regions of store (Jouvelot & Giord [1991]). where
(w1,a) = getChar fi w
A theoretical precursor of the eects work is that of
Reynolds, which also used types to record where eects Compared to the monad approach, this suers from a
could occur and where parallelism was allowed (Reynolds number of drawbacks: programs become more cluttered;
[1981]; Reynolds [1989]). the linear type system has to be explained to the pro-
grammer and implemented in the compiler; and code-
Our work is similar to the above in its commitment to improving transformations need to be re-examined to en-
use types to indicate eects. But eect systems are de- sure they preserve linearity. The latter problem may be
signed for impure, strict functional languaes, where the important; Wakeling found that some standard transfor-
order of sequencing is implicit. Our work is designed for mations could not be performed in the presence of linear-
pure, lazy functional languages, and the purpose of the ity (Wakeling [1990]).
`bind' operation is to make sequencing explicit where it
is required. The big advantage of a linear type system is that it en-
ables us to write programs which manipulate more than
With eect systems, one may use the usual laws of one piece of updatable state at a time. The monadic and
equational reasoning on any program segment without continuation-passing presentations of arrays given above
a `write' side eect. Our work diers in that the laws pass the array around implicitly, and hence can only eas-
of equational reasoning apply even where side eects are ily handle one at a time. This is an important area for
allowed. This is essential, because the optimisation phase future work.
of our compiler is based on equational reasoning.
On the practical side, the Clean work is impressive. They
On the other hand, eect systems make it very easy have written a library of high-level routines to call the
to combine programs with dierent eects. In our ap- Macintosh window system, and demonstrated that it is
proach, each dierent eect would correspond to a dier- possible to build pure functional programs with sophisti-
ent monad type (one for IO, one for each array manip- cated user interfaces. The same approach should work for
ulated, and so on), and it is not so clear how one goes monads, and another area for future work is to conrm
about combining eects. that this is the case.
13
Hammond, Wadler & Brady [1991]). P Hudak [July 1992], \Continuation-based mutable ab-
stract datatypes, or how to have your state and
munge it too," YALEU/DCS/RR-914, Depart-
Acknowledgements ment of Computer Science, Yale University.
This work took place in the context of the team building P Hudak, SL Peyton Jones, PL Wadler, Arvind, B Boutel,
the Glasgow Haskell compiler: Cordy Hall, Kevin Ham- J Fairbairn, J Fasel, M Guzman, K Hammond, J
mond and Will Partain. David Watt, Joe Morris, John Hughes, T Johnsson, R Kieburtz, RS Nikhil, W
Launchbury also made very helpful suggestions about our Partain & J Peterson [May 1992], \Report on the
presentation. We gratefully acknowledge their help. functional programming language Haskell, Ver-
sion 1.2," SIGPLAN Notices 27.
References P Hudak & RS Sundaresh [March 1989], \On the ex-
pressiveness of purely-functional I/O systems,"
YALEU/DCS/RR-665, Department of Com-
S Abramsky[1990], \Computational interpretations of puter Science, Yale University.
linear logic," DOC 90/20, Dept of Computing,
Imperial College. Paul Hudak [Aug 1986], \A semantic model of reference
counting and its abstraction," Proc ACM Con-
PM Achten, JHG van Groningen & MJ Plasmeijer ference on Lisp and Functional Programming.
[1992], \High-level specication of I/O in func-
tional languages," in Proc Glasgow Workshop on E Ireland [March 1989], \Writing interactive and le-
Functional Programming, Launchbury et al, ed., processing functional programs," MSc thesis,
Springer Verlag. Victoria University of Wellington.
A Bloss [Sept 1989], \Update analysis and the ecient im- P Jouvelot & D Giord [Jan 1991], \Algebraic reconstruc-
plementation of functional aggregates," in Func- tion of types and eects," in 18'th ACM Sympo-
tional Programming Languages and Computer sium on Principles of Programming Languages
Architecture, London, ACM. (POPL), Orlando, ACM.
A Dwelly [Sept 1989], \Dialogue combinators and dy- Kent Karlsson [1982], \Nebula - a functional operating
namic user interfaces," in Functional Program- system," Chalmers Inst, Goteborg.
ming Languages and Computer Architecture,
London, ACM. BW Kernighan & DM Ritchie [1978], The C programming
language, Prentice Hall.
DK Giord & JM Lucassen [Aug 1986], \Integrating func-
tional and imperative programming," in ACM E Moggi [June 1989], \Computational lambda calculus
Conference on Lisp and Functional Program- and monads," in Logic in Computer Science, Cal-
ming, MIT, ACM, 28{38. ifornia, IEEE.
J-Y Girard [1987], \Linear Logic," Theoretical Computer JT O'Donnell[1985], \Dialogues: a basis for construct-
Science 50, 1{102. ing programming environments," in Proc ACM
Symposium on Language Issues in Programming
A Gordon[Feb 1989], \PFL+: a kernel scheme for func- Environments, Seattle, ACM, 19{27.
tional I/O," TR 160, Computer Lab, University
of Cambridge. N Perry [1991], \The implementation of practical func-
tional programming languages," PhD thesis, Im-
JC Guzman & P Hudak [1990], \Single-threaded poly- perial College, London.
morphic lambda calculus," in Proc 5th Annual
IEEE Symposium on Logic in Computer Science. SL Peyton Jones [Apr 1992], \Implementing lazy func-
tional languages on stock hardware: the Spine-
K Hammond, PL Wadler & D Brady [1991], \Imperate: be less Tagless G-machine," Journal of Functional
imperative," Department of Computer Science, Programming 2, 127{202.
Univ of Glasgow.
14
SL Peyton Jones [Oct 1988], \Converting streams to con-
tinuations and vice versa," Electronic mail on
Haskell mailing list.
SL Peyton Jones & J Launchbury [Sept 1991], \Unboxed
values as rst class citizens," in Functional Pro-
gramming Languages and Computer Architec-
ture, Boston, Hughes, ed., LNCS 523, Springer
Verlag.
J Reynolds [1981], \The essence of Algol," in Algorithmic
Languages, de Bakker & van Vliet, eds., North
Holland, 345{372.
J Reynolds [1989], \Syntactic control of interference, part
II," in International Colloquium on Automata,
Languages, and Programming.
DA Schmidt[Apr 1985], \Detecting global variables in de-
notational specications," TOPLAS 7, 299{310.
V Swarup, US Reddy & E Ireland [Sept 1991], \Assign-
ments for applicative languages," in Functional
Programming Languages and Computer Archi-
tecture, Boston, Hughes, ed., LNCS 523, Springer
Verlag, 192{214.
SJ Thompson [1989], \Interactive functional programs - a
method and a formal semantics," in Declarative
Programming, DA Turner, ed., Addison Wesley.
M Tofte [Nov 1990], \Type inference for polymorphic ref-
erences," Information and Computation89.
PL Wadler[1990], \Linear types can change the world!,"
in Programming concepts and methods, M Broy
& C Jones, eds., North Holland.
PL Wadler[Jan 1992], \The essence of functional pro-
gramming," in Proc Principles of Programming
Languages, ACM.
PL Wadler[June 1990], \Comprehending monads," in
Proc ACM Conference on Lisp and Functional
Programming, Nice, ACM.
D Wakeling[Nov 1990], \Linearity and laziness," PhD
thesis, Department of Computer Science, Univer-
sity of York.
15