Oberon 2
to recount how Oberon entered the picture in our case. It happened that around
the time of the beginning of our effort, the space probe Voyager made headlines
with a series of spectacular pictures taken of the planet Uranus and of its moons,
the largest of which is named Oberon. Since its launch I had considered the
Voyager project as a singularly well-planned and successful endeavor, and as a
small tribute to it I picked the name of its latest object of investigation. There
are indeed very few engineering projects whose products perform way beyond
expectations and beyond their anticipated lifetime; mostly they fail much earlier,
particularly in the domain of software. And, last but not least, we recall that
Oberon is famous as the king of elves.
The consciously planned shortage of manpower enforced a single, but healthy,
guideline: Concentrate on essential functions and omit embellishments that
merely cater to established conventions and passing tastes. Of course, the
essential core had first to be recognized and crystallized. But the basis had been
laid. The ground rule became even more crucial when we decided that the result
should be usable as teaching material. I remembered C.A.R. Hoare’s
plea that books should be written presenting actually operational systems rather
than half-baked, abstract principles. He had complained in the early 1970s that
in our field engineers were told to constantly create new artifacts without being
given the chance to study previous works that had proven their worth in the
field. How right was he, even to the present day!
The emerging goal to publish the result with all its details let the choice of
programming language appear in a new light: it became crucial. Modula-2, which
we had planned to use, appeared not quite satisfactory. Firstly, it lacked a
facility to express extensibility in an adequate way. And we had put extensibility
among the principal properties of the new system. “Adequate” here includes
machine-independence. Our programs should be expressed in a manner that
makes no reference to machine peculiarities and low-level programming facilities,
perhaps with the exception of device interfaces, where dependence is inherent.
Hence, Modula-2 was extended with a feature that is now known as type
extension. We also recognized that Modula-2 contained several facilities that
we would not need, that do not genuinely contribute to its power of expression,
but at the same time increase the complexity of the compiler. But the compiler
would not only have to be implemented, but also to be described, studied, and
understood. This led to the decision to start from a clean slate also in the domain
of language design, and to apply the same principle to it: concentrate on the
essential, purge the rest. The new language, which still bears much resemblance
to Modula-2, was given the same name as the system: Oberon [1][2]. In contrast
to its ancestor it is terser and, above all, a significant step towards expressing
programs on a high level of abstraction without reference to machine-specific
features.
[1] N. Wirth. The programming language Oberon. Software - Practice and Experience 18,
7, (July 1988) 671-690.
[2] M. Reiser and N. Wirth. Programming in Oberon - Steps beyond Pascal and Modula.
Addison-Wesley, 1992.
We started designing the system in late fall 1985, and programming in early
1986. As a vehicle we used our workstation Lilith and its language Modula-2.
First, a cross-compiler was developed, then followed the modules of the inner
core together with the necessary testing and down-loading facilities. The
development of the display and the text system proceeded simultaneously, without the
possibility of testing, of course. We learned how the absence of a debugger, and
even more so the absence of a compiler, can contribute to careful programming.
Thereafter followed the translation of the compiler into Oberon. This was
swiftly done, because the original had been written with anticipation of the later
translation. After its availability on the target computer Ceres, together with
the operability of the text editing facility, the umbilical cord to Lilith could be
cut off. The Oberon System had become real, at least its draft version. This
happened around the middle of 1987; its description was published thereafter[3],
and a manual and guide followed in 1991[5].
The system’s completion took another year and concentrated on connecting
the workstations in a network for file transfer[4], on a central printing facility,
and on maintenance tools. The goal of completing the system within three years
had been met. The system was introduced in the middle of 1988 to a wider
user community, and work on applications could start. A service for electronic
mail was developed, a graphics system was added, and various efforts for general
document preparation systems proceeded. The display facility was extended to
accommodate two screens, including color. At the same time, feedback from
experience in its use was incorporated by improving existing parts. Since 1989,
Oberon has replaced Modula-2 in our introductory programming courses.
[3] N. Wirth and J. Gutknecht. The Oberon System. Software - Practice and Experience,
19, 9 (Sept. 1989), 857-893.
[5] M. Reiser. The Oberon System - User Guide and Programmer’s Manual. Addison-Wesley,
1991.
[4] N. Wirth. Ceres-Net: A low-cost computer network. Software - Practice and Experience,
20, 1 (Jan. 1990), 13-24.
CHAPTER 2
2.1. INTRODUCTION
In order to warrant the sizeable effort of designing and constructing an entire
operating system from scratch, a number of basic concepts need to be novel.
We start this chapter with a discussion of the principal concepts underlying
the Oberon System and of the dominant design decisions. On this basis, a
presentation of the system’s structure follows. It will be restricted to its coarsest
level, namely the composition and interdependence of the largest building blocks,
the modules. The chapter ends with an overview of the remainder of the book.
It should help the reader to understand the role, place, and significance of the
parts described in the individual chapters.
The fundamental objective of an operating system is to present the computer
to the user and to the programmer at a certain level of abstraction. For example,
the store is presented in terms of requestable pieces or variables of a specified data
type, the disk is presented in terms of sequences of characters (or bytes) called
files, the display is presented as rectangular areas called viewers, the keyboard
is presented as an input stream of characters, and the mouse appears as a pair
of coordinates and a set of key states. Every abstraction is characterized by
certain properties and governed by a set of operations. It is the task of the
system to implement these operations and to manage them, constrained by the
available resources of the underlying computer. This is commonly called resource
management.
Every abstraction inherently hides details, namely those from which it
abstracts. Hiding may occur at different levels. For example, the computer may
allow certain parts of the store, or certain devices, to be made inaccessible
according to its mode of operation (user/supervisor mode), or the programming
language may make certain parts inaccessible through a hiding facility
inherent in its visibility rules. The latter is of course much more flexible and
powerful, and the former indeed plays an almost negligible role in our system.
Hiding is important because it allows maintenance of certain properties (called
invariants) of an abstraction to be guaranteed. Abstraction is indeed the key
of any modularization, and without modularization every hope of being able
to guarantee reliability and correctness vanishes. Clearly, the Oberon System
was designed with the goal of establishing a modular structure on the basis of
purpose-oriented abstractions. The availability of an appropriate programming
language is an indispensable prerequisite, and the importance of its choice cannot
be over-emphasized.
2.2. CONCEPTS
2.2.1. Viewers.
Whereas the abstractions of individual variables representing parts of the
primary store, and of files representing parts of the disk store are well established
notions and have significance in every computer system, abstractions regarding
input and output devices became important with the advent of high interactivity
between user and computer. High interactivity requires high bandwidth, and the
only channel of human users with high bandwidth is the eye. Consequently, the
computer’s visual output unit must be properly matched with the human eye.
This occurred with the advent of the high-resolution display in the mid 1970s,
which in turn had become feasible due to faster and cheaper electronic memory
components. The high-resolution display marked one of the few very significant
break-throughs in the history of computer development. The typical bandwidth
of a modern display is in the order of 100 MHz. Primarily the high-resolution
display made visual output a subject of abstraction and resource management. In
the Oberon System, the display is partitioned into viewers, also called windows,
or more precisely, into frames, rectangular areas of the screen(s). A viewer
typically consists of two frames, a title bar containing a subject name and a
menu of commands, and a main frame containing some text, graphic, picture,
or other object. A viewer itself is a frame; frames can be nested, in principle to
any depth.
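The nesting of frames can be pictured as a small tree. The following Python sketch is an illustration only and not the Oberon declaration: the class name Frame, the field children, and the helper frame_at are assumptions introduced here, and coordinates are taken to refer to the lower left corner.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """A rectangular screen area; (x, y) is the lower left corner."""
    name: str
    x: int
    y: int
    w: int
    h: int
    children: List["Frame"] = field(default_factory=list)

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def frame_at(f: Frame, px: int, py: int) -> Optional[Frame]:
    """Return the innermost frame containing the point, or None."""
    if not f.contains(px, py):
        return None
    for child in f.children:
        hit = frame_at(child, px, py)
        if hit is not None:
            return hit
    return f


# A viewer is itself a frame holding a title bar (menu frame) and a main frame.
menu = Frame("menu", 0, 380, 300, 20)
main = Frame("main", 0, 0, 300, 380)
viewer = Frame("viewer", 0, 0, 300, 400, [menu, main])
```

Because a viewer is just a frame with sub-frames, the same search works for any depth of nesting.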
The System provides routines for generating a frame (viewer), for moving
and for closing it. It allocates a new viewer at a specified place, and upon request
delivers hints as to where it might best be placed. It keeps track of the set of
opened viewers. This is what is called viewer management, in contrast to the
handling of their displayed contents.
But high interactivity requires not only a high bandwidth for visual output,
it demands also flexibility of input. Surely, there is no need for an equally large
bandwidth, but a keyboard limited by the speed of typing to about 100 Hz is
not good enough. The break-through on this front was achieved by the so-called
mouse, a pointing device which appeared roughly at the same time as the
high-resolution display.
This was by no means just a lucky coincidence. The mouse comes to fruition
only through appropriate software and the high-resolution display. It is itself
a conceptually very simple device delivering signals when moved on the table.
These signals allow the computer to update the position of a mark—the cursor—
on the display. Since feedback occurs through the human eye, no great precision
is required from the mouse. For example, when the user wishes to identify a
certain object on the screen, such as a letter, he moves the mouse as long as
required until the mapped cursor reaches the object. This stands in marked
contrast to a digitizer which is supposed to deliver exact coordinates. The
Oberon System relies very much on the availability of a mouse.
Perhaps the cleverest idea was to equip mice with buttons. By being able
to signal a request with the same hand that determines the cursor position,
2.2.2. Commands.
Position-dependent commands with fixed meaning (fixed for each type of
viewer) must be supplemented by general commands. Conventionally, such
commands are issued through the keyboard by typing the program’s name that is
to be executed into a special command text. In this respect, the Oberon System
offers a novel and much more flexible solution which is presented in the following
paragraphs.
First of all we remark that a program in the common sense of a text compiled
as a unit is mostly a far too large unit of action to serve as a command. Compare
it, for example, with the insertion of a piece of text through a mouse command.
In Oberon, the notion of a unit of action is separated from the notion of unit of
compilation. The former is a command represented by an (exported) procedure,
the latter is a module. Hence, a module may, and typically does, define several,
even many commands. Such a (general) command may be invoked at any time
by pointing at its name in any text visible in any viewer on the display, and by
clicking the middle mouse button. The command name has the form M.P, where
P is the procedure’s identifier and M that of the module in which P is declared.
As a consequence, any command click may cause the loading of one or several
modules, if M is not already present in main store. The next invocation of M.P
occurs instantaneously, since M is already loaded. A further consequence is that
modules are never (automatically) removed, because a next command may well
refer to the same module.
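The activation of a command M.P, including the loading of M on first use, may be modeled in a few lines. The sketch below uses Python's own module mechanism as a stand-in for the Oberon loader; the function name execute is hypothetical, and platform.system merely serves as an example of a parameterless procedure in a module.

```python
import importlib
import sys


def execute(command: str):
    """Run a command given as 'M.P': load module M if needed, then call P."""
    module_name, _, proc_name = command.rpartition(".")
    if not module_name:
        raise ValueError("command must have the form M.P")
    # import_module loads M on first use; later calls reuse the entry
    # in sys.modules, so the module is not loaded a second time.
    module = importlib.import_module(module_name)
    proc = getattr(module, proc_name)
    return proc()  # commands are parameterless procedures


first = execute("platform.system")   # may load module 'platform'
second = execute("platform.system")  # module is already present
```

As in Oberon, nothing unloads the module after the call, so the second activation finds it in place.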
display. All viewers—and with them also their contents—are organized in a data
structure that is rooted in a global variable (in module Viewers). Parts of this
variable therefore constitute visible states, and it is highly appropriate to refer
to them as command parameters.
One of the rules of what may be called the Oberon Programming Style
is therefore to avoid hidden states, and to reduce the introduction of global
variables. We do not, however, raise this rule to the rank of a dogma. There
exist genuinely useful exceptions, even if the variables have no visible parts.
There remains the question of how to denote visible objects as command
parameters. An obvious case is the use of the most recent selection as parameter.
A procedure for locating that selection is provided by module Oberon. (It is
restricted to text selections). Another possibility is the use of the caret position
in a text. This is used in the case of inserting new text; the pressing of a key on
the keyboard is also considered to be a command, and it causes the character’s
insertion at the caret position.
A special facility is introduced for designating viewers as operands: the star
marker. It is placed at the cursor position when the keyboard’s mark key (SETUP)
is pressed. The procedure Oberon.MarkedViewer identifies the viewer in whose
area the star lies. Commands which take it as their parameter are typically
followed by an asterisk in the text. Whether the text contained in a text viewer,
or a graph contained in a graphic viewer, or any other part of the marked viewer
is taken as the actual parameter depends on how the command procedure is
programmed.
Finally, a most welcome property of the system should not remain
unmentioned. It is a direct consequence of the persistent nature of global variables
and becomes manifest when a command fails. Detected failures result in a trap.
Such a trap should be regarded as an abnormal command termination. In the
worst case, global data may be left in an inconsistent state, but they are not
lost, and a next command can be initiated based on their current state. A trap
opens a small viewer and lists the sequence of invoked procedures with their local
variables and current values. This information helps a programmer to identify
the cause of the trap.
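The effect of a trap on global data can be illustrated by a small model. In this Python sketch a list named log stands in for the trap viewer and a dictionary named counters for global variables; both names, and the procedure run_command, are assumptions. Unlike the Oberon trap viewer, the recorded text shows only the call chain and not the local variables.

```python
import traceback

log = []                    # stands in for the trap viewer
counters = {"runs": 0}      # global data that outlives every command


def run_command(proc):
    """Call a command; on failure report a trap and return to the caller."""
    try:
        proc()
    except Exception:
        # Abnormal termination: record the call chain, keep global data.
        log.append(traceback.format_exc())


def count():
    counters["runs"] += 1


def faulty():
    counters["runs"] += 1
    raise ZeroDivisionError("simulated trap")


run_command(count)
run_command(faulty)   # fails after updating the global counter
run_command(count)    # the next command starts from the current state
```

The failing command leaves its partial update behind, and the following command simply continues from there.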
2.2.3. Tasks.
From the presentations above it follows that the Oberon System is
distinguished by a highly flexible scheme of command activation. The notion of
a command extends from the insertion of a single character and the setting
of a marker to computations that may take hours or days. It is moreover
distinguished by a highly flexible notion of operand selection not restricted to
registered, named files and, most importantly, by the virtual absence of hidden
states. The state of the system is practically determined by what is visible to
the user.
This makes it unnecessary to remember a long history of previously activated
commands, started programs, entered modes, etc. Modes are in our view the
hallmark of user-unfriendly systems. It should at this point have become obvious
that the system allows a user to pursue several different tasks concurrently.
They are manifest in the form of viewers containing texts, graphics, or other
displayable objects. The user switches between tasks implicitly when choosing
a different viewer as operand for the next command. The characteristic of this
concept is that task switching is under explicit control of the user, and the atomic
units of action are the commands.
At the same time, we classify Oberon as a single-process (or single-thread)
system. How is this apparent paradox to be understood? Perhaps it is best
explained by considering the basic mode of operation. Unless engaged in the
interpretation of a command, the processor is engaged in a loop continuously
polling event sources. This loop is called the central loop; it is contained in
module Oberon which may be regarded as the system’s heart. The two fixed event
sources are the mouse and the keyboard. If a keyboard event is sensed, control
is dispatched to the handler installed in the so-called focus viewer, designated as
the one holding the caret. If a mouse event (key) is sensed, control is dispatched
to the handler in which the cursor currently lies. This is all possible under the
paradigm of a single, uninterruptible process.
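One pass of the central loop may be sketched as follows. This is a simplified Python model, not the code of module Oberon: the names Viewer, viewer_at, and poll_once are hypothetical, events are passed in as arguments instead of being read from devices, and giving the keyboard precedence over the mouse is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Viewer:
    name: str
    x: int
    y: int
    w: int
    h: int
    received: List[Tuple[str, object]] = field(default_factory=list)

    def handle(self, kind: str, data: object) -> None:
        self.received.append((kind, data))


def viewer_at(viewers: List[Viewer], x: int, y: int) -> Optional[Viewer]:
    for v in viewers:
        if v.x <= x < v.x + v.w and v.y <= y < v.y + v.h:
            return v
    return None


def poll_once(viewers, focus, key: Optional[str], mouse) -> None:
    """One pass of the loop: key is a typed character or None;
    mouse is (keys, x, y) or None when no mouse key is pressed."""
    if key is not None:
        focus.handle("consume", key)          # keyboard goes to the focus viewer
    elif mouse is not None:
        keys, x, y = mouse
        target = viewer_at(viewers, x, y)     # mouse goes to the viewer under the cursor
        if target is not None:
            target.handle("track", (keys, x, y))


left = Viewer("left", 0, 0, 100, 100)
right = Viewer("right", 100, 0, 100, 100)
poll_once([left, right], focus=left, key="a", mouse=None)
poll_once([left, right], focus=left, key=None, mouse=({1}, 150, 20))
```

Each handler call returns before the next event is polled, which is all that the single-process paradigm requires.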
The notion of a single process implies non-interruptibility, and therefore
also that commands cannot interact with the user. Interaction is confined to
the selection of commands before their execution. Hence, there exists no input
statement in typical Oberon programs. Inputs are given by parameters supplied
and designated before command invocation.
This scheme at first appears gravely restrictive. In practice it is not, if one
considers single-user operation. It is this single user who carries out a dialog with
the computer. A human might be capable of engaging in simultaneous dialogs
with several processes only if the commands issued are very time-consuming.
We suggest that execution of time-consuming computations might better be
delegated to loosely coupled compute-servers in a distributed system.
The primary advantage of a system dealing with a single process is that task
switches occur at user-defined points only, where no local process state has to be
preserved until resumption. Furthermore, because the switches are user-chosen,
the tasks cannot interfere in unexpected and uncontrollable ways by accessing
common variables. The system designer can therefore omit all kinds of protection
mechanisms that exclude such interference. This is a significant simplification.
The essential difference between Oberon and multiprocess-systems is that
in the former task switches occur between commands only, whereas in the latter
a switch may be invoked after any single instruction. Evidently, the difference
is one of granularity of action. Oberon’s granularity is coarse, which is entirely
acceptable for a single-user system.
The system offers the possibility to insert further polling commands in the
central loop. This is necessary if additional event sources are to be introduced.
The prominent example is a network, where commands may be sent from other
workstations. The central loop scans a list of so-called task descriptors. Each
descriptor refers to a command procedure. The two standard events are selected
only if their guard permits, i.e. if either keyboard input is present, or if a mouse
event occurs. Inserted tasks must provide their own guard in the beginning of
the installed procedure.
The example of a network inserting commands, called requests, raises a
question: what happens if the processor is engaged in the execution of another
command when the request arrives? Evidently, the request would be lost unless
measures are taken. The problem is easily remedied by buffering the input. This
is done in every driver of an input device, in the keyboard driver as well as
the network driver. The incoming signal triggers an interrupt, and the invoked
interrupt handler accepts the input and buffers it. We emphasize that such
interrupt handling is confined to drivers, system components at the lowest level.
An interrupt does not evoke a task selection and a task switch. Control simply
returns to the point of interruption, and the interrupt remains unnoticeable to
programs. There exists, as with every rule, an exception: an interrupt due to
keyboard input of the abort character returns control to the central loop.
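The division of labor between an interrupt handler that buffers and an installed task that polls can be modeled in a few lines of Python. The names buffer, interrupt_handler, and network_task are hypothetical, and an ordinary function call stands in for the hardware interrupt.

```python
from collections import deque

buffer = deque()     # filled by the driver, emptied by a polling task
handled = []


def interrupt_handler(request: str) -> None:
    """Driver level: accept the input and buffer it; no task switch occurs."""
    buffer.append(request)


def network_task() -> None:
    """Installed task: the guard comes first, then one request is served."""
    if not buffer:             # guard: nothing arrived, return at once
        return
    handled.append(buffer.popleft())


# Two requests arrive while another command is still running ...
interrupt_handler("request 1")
interrupt_handler("request 2")
# ... and are served later, one per pass of the central loop.
for _ in range(3):
    network_task()
```

No request is lost, and the requests are served in the order of their arrival.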
texts as input. It is therefore possible to compile a text, execute the program, and
to recompile the re-edited text without storing it on disk between compilations
and tests. The ubiquitous editability of text together with the persistence of
global data (in particular viewers) allows many steps that do not contribute to
the progress of the task actually pursued to be avoided.
2.2.5. Extensibility.
An important objective in the design of the Oberon System was extensibility.
It should be easy to extend the system with new facilities by adding modules
that make use of the already existing resources. Equally important, it should
also be possible to reduce the system to those facilities that are currently and actually used.
For example, a document editor processing documents free of graphics should
not require the loading of an extensive graphics editor, a workstation operating
as a stand-alone system should not require the loading of extensive network
software, and a system used for clerical purposes need include neither compiler
nor assembler. Also, a system introducing a new kind of display frame should
not include procedures for managing viewers containing such frames. Instead, it
should make use of existing viewer management. The staggering consumption
of memory space by many widely used systems is due to violation of such
fundamental rules of engineering. The requirement of many megabytes of store
for an operating system is, albeit commonly tolerated, absurd and another
hallmark of user-unfriendliness, or perhaps manufacturer friendliness. Its reason
is none other than inadequate extensibility.
We do not restrict this notion to procedural extensibility, which is easy
to realize. The important point is that extensions may not only add further
procedures and functions, but introduce their own data types built on the basis
of those provided by the system: data extensibility. For example, a graphics
system should be able to define its graphics frames based on frames provided by
the basic display module and by extending them with attributes appropriate for
graphics.
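Data extensibility in this sense can be illustrated with a small model. The Python sketch below uses subclassing as a stand-in for Oberon's type extension; the attributes grid and ticks and the procedure area are hypothetical and serve only to show that code written for the base type accepts the extension unchanged.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Base type, as provided by the basic display module."""
    x: int
    y: int
    w: int
    h: int


@dataclass
class GraphicFrame(Frame):
    """Extension defined by a graphics package: all Frame fields plus its own."""
    grid: int = 1          # hypothetical attribute
    ticks: bool = False    # hypothetical attribute


def area(f: Frame) -> int:
    """Written against the base type only; knows nothing about extensions."""
    return f.w * f.h


g = GraphicFrame(0, 0, 200, 100, grid=4)
```

The base module need not be changed, or even recompiled, when the extension is added.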
This requires an adequate language feature. The language Oberon provides
precisely this facility in the form of type extensions. The language was designed
for this reason; Modula-2 would have been the choice, had it not been for the lack
of a type extension feature. Its influence on system structure was profound, and
the results have been most encouraging. In the meantime, many additions have
been created with surprising ease. One of them is described at the end of this
book. The basic system is nevertheless quite modest in its resource requirements
(see Table at the end of Section 2.3).
For example, a document editor loads a graphics package when a graphic element
appears in the processed document, but not otherwise.
The Oberon System features no separate linker. A module is linked with its
imports when it is loaded, and never before. As a consequence, every module is
present only once, in main store (linked) as well as on backing store (unlinked,
as file). Avoiding the generation of multiple copies in different, linked object files
is the key to storage economy. Prelinked mega-files do not occur in the Oberon
System, and every module is freely reusable.
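Linking at load time, with every module present only once, may be modeled as follows. In this Python sketch the table IMPORTS is purely hypothetical: its import lists are invented for illustration and do not reproduce those of the real modules of the same names.

```python
from typing import Dict, List

# Hypothetical import lists, standing in for what each object file records.
IMPORTS: Dict[str, List[str]] = {
    "Edit": ["TextFrames", "Texts"],
    "TextFrames": ["Texts", "Display"],
    "Texts": ["Files"],
    "Display": [],
    "Files": [],
}

loaded: List[str] = []     # modules in main store, in loading order


def load(name: str) -> None:
    """Link-load a module: load its imports first, each module at most once."""
    if name in loaded:
        return
    for imp in IMPORTS[name]:
        load(imp)
    loaded.append(name)


load("Edit")
load("TextFrames")   # already present: nothing happens
```

Module Texts is imported twice in this table but appears only once in the list of loaded modules.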
Its principles and techniques are explained in [6]. The source language and
the target computer’s RISC architecture must both be understood before
studying a compiler; both are presented in the Appendix.
Although here the compiler appears as an application module, it naturally
plays a distinguished role, because the system (and the compiler itself) is
formulated in the language which the compiler translates into code. Together with
the text editor it was the principal tool in the system’s development. The use
of straight-forward algorithms for parsing and symbol table organization led to
a reasonably compact piece of software. A main contributor to this result is
the language’s definition: the language is devoid of complicated structures and
rarely used embellishments.
The compiler, and thereby the chapter, is partitioned into two main parts.
The first is language-specific, but does not refer to any particular target
computer. It consists of the scanner and the parser. This part is therefore of most
general interest to the readership. The second part is, essentially,
language-independent, but is specifically tailored to the instruction set of the target
computer. It is called the code generator.
Texts play a predominant role in the Oberon System. Their preparation is
supported by the system’s major tool, the editor. In Chapter 13 we describe
another editor, one that handles graphic objects. At first, only horizontal and
vertical lines and short captions are introduced as objects. The major difference
to texts lies in the fact that their coordinates in the drawing plane do not follow
from those of their predecessor automatically, because they form a set rather than
a sequence. Each object carries its own, independent coordinates. The influence
of this seemingly small difference upon an editor is far-reaching and permeates
the entire design. There exist hardly any similarities between a text and a
graphics editor. Perhaps one should be mentioned: the partitioning into three
parts. The bottom module defines the respective abstract data structure for texts
or graphics, together with, of course, the procedures handling the structure, such
as searches, insertions, and deletions. The middle module in the hierarchy defines
a respective frame and contains all procedures concerned with displaying the
respective objects including the frame handler defining interpretation of mouse
and keyboard events. The top modules are the respective tool modules (Edit,
Draw). The presented graphics editor is particularly interesting in so far as it
constitutes a convincing example of Oberon’s extensibility. The graphics editor
is integrated into the entire system; it embeds its graphic frames into
menu-viewers and uses the facilities of the text system for its caption elements. And
lastly, new kinds of elements can be incorporated by the mere addition of new
modules, i.e. without expanding, even without recompiling the existing ones.
Two examples are shown in Chapter 13 itself: rectangles and circles.
The Draw System has been extensively used for the preparation of diagrams
of electronic circuits. This application suggests a concept that is useful elsewhere
too, namely a recursive definition of the notion of object. A set of objects may
be regarded as an object itself and be given a name. Such an object is called a
macro. It is a challenge to the designer to implement a macro facility such that
it is also extensible, i.e. in no way refers to the type of its elements, not even in
its input operations of files on which macros are stored.
Chapter 14 presents two other tools, namely one used for installing an
Oberon System on a bare machine, and one used to recover from failures of
the file store. Although rarely employed, the first was indispensable for the
development of the system. The maintenance or recovery tools are invaluable
assets when failures occur. And they do! Chapter 14 covers material that is
rarely presented in the literature.
Chapter 15 is devoted to tools that are not used by the Oberon System
presented so far, but may be essential in some applications. The first is a data
link with a protocol based on the RS-232 standard shown in Chapter 9. Another
is a standard set of basic mathematical functions. And the third is a tool for
creating new macros for the Draw System.
The third part of this book is devoted to a detailed description of the
hardware. Chapter 16 defines the processor, for which the compiler generates code.
The target computer is a truly simple and regular processor called RISC with only
14 instructions, represented not by a commercial processor, but implemented
with an FPGA, a Field Programmable Gate Array. It allows its structure to
be described in full detail. It is a straight-forward, von Neumann type device
consisting of a register bank and an arithmetic-logic unit, including a floating-point
unit. Typical optimization facilities, like pipelining and cache memory, have
been omitted for the sake of transparency and simplicity. The processor circuit
is described in the language Verilog.
Chapter 17 describes the environment in which the processor is embedded.
This environment consists of the interfaces to main memory and to all external
devices.
CHAPTER 3
In order to get firmer ground under our feet, we now present the programmed
declaration of type Viewer in a slightly abstracted form:
  Viewer = POINTER TO ViewerDesc;
  ViewerDesc = RECORD
    X, Y, W, H: INTEGER;
    handle: Handler;
    state: INTEGER;
  END;
X, Y, W, H define the viewer’s rectangle on the screen, i.e. location X, Y of the
lower left corner relative to the display origin, width W and height H. The variable
state informs about the current state of visibility (visible, closed, covered), while
handle represents the functional interface of viewers. The type of the handler is
Handler = PROCEDURE (V: Viewer; VAR M: ViewerMsg);
where ViewerMsg is some base type of messages whose exact declaration is of
minor importance for the moment:
ViewerMsg = RECORD ... (*basic parameter fields*) END;
However, we should point out the use of object-oriented terminology. It is
justified because handle is a procedure variable (a handler) whose identity depends
on the specific viewer. A call V.handle(V, M) can therefore be interpreted as
the sending of a message M to be handled by the method of the receiving viewer
V.
We recognize an important difference between the standard object-oriented
model and our handler paradigm. The standard model is closed in the sense that
only a fixed set of messages is understood by a given class of objects. In contrast,
the handler paradigm is open because it defines just the root (ViewerMsg) of a
potentially unlimited tree of extending message types. For example, a concrete
handler might be able to handle messages of type MyViewerMsg, where
  MyViewerMsg = RECORD (ViewerMsg)
    mypar: MyParameters
  END;
is an extended type of ViewerMsg.
It is worth noting that our open object-oriented model is extremely flexible.
Notably, extending the set of message types that are handled by an object is a
mere implementation issue, that is, it has no effect at all on the object’s
compile-time interface and on the system integrity. It is fair to mention though that such
a high degree of extensibility does not come for free. The price to pay is the
obligation of explicit message dispatching at runtime. The following chapters
will capitalize on this property.
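The open handler paradigm and its explicit dispatching can be modeled as follows. This Python sketch mirrors the declarations above only loosely: mypar is reduced to an integer, and the names my_handler, plain_handler, broadcast, and seen are hypothetical. A run-time type test (isinstance) plays the role of the Oberon type test.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ViewerMsg:
    """Root of the message type tree."""


@dataclass
class MyViewerMsg(ViewerMsg):
    """Extension known only to the handlers that care about it."""
    mypar: int = 0


@dataclass
class Viewer:
    handle: Callable[["Viewer", ViewerMsg], None]


seen: List[int] = []


def my_handler(v: Viewer, m: ViewerMsg) -> None:
    # Explicit dispatch at run time: test the message type, then act.
    if isinstance(m, MyViewerMsg):
        seen.append(m.mypar)
    # Messages of unknown type are simply ignored.


def plain_handler(v: Viewer, m: ViewerMsg) -> None:
    pass  # understands no extended messages at all


def broadcast(viewers: List[Viewer], m: ViewerMsg) -> None:
    for v in viewers:
        v.handle(v, m)   # corresponds to V.handle(V, M)


broadcast([Viewer(my_handler), Viewer(plain_handler)], MyViewerMsg(mypar=7))
broadcast([Viewer(my_handler)], ViewerMsg())
```

A handler that does not know a message type is not an error; it merely leaves the message unanswered, which is what keeps the interface of Viewer unchanged when new message types are added.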
Coming back to the perspective of tasks, we note that each sending of a
message to a viewer corresponds to an activation or reactivation of the interactive
task that it represents.
The procedures Install and Remove are called explicitly in order to transfer the
state of the specified task from offline to idle and from idle to offline respectively.
Installed tasks take their turns in becoming active, that is, in being executed.
The installed handlers are simple, parameterless procedures specifying their own
actions and conditions for execution, with one exception: Resumption may be
delayed until a certain period of time has elapsed. This period is specified in
milliseconds when a task is created.
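The life cycle of an installed task with a period may be sketched as below. This is a Python model under simplifying assumptions, not the code of module Oberon: the class Task and the procedures install, remove, and run_due_tasks are hypothetical, time is supplied by the caller, and all due tasks are activated in one call, whereas a real scheduler may select one task per pass.

```python
from typing import Callable, List


class Task:
    def __init__(self, handler: Callable[[], None], period_ms: int) -> None:
        self.handler = handler
        self.period_ms = period_ms
        self.next_time = 0      # earliest time of the next activation


tasks: List[Task] = []          # idle tasks; everything else is offline


def install(t: Task) -> None:
    if t not in tasks:
        tasks.append(t)


def remove(t: Task) -> None:
    if t in tasks:
        tasks.remove(t)


def run_due_tasks(now_ms: int) -> None:
    """Activate, in turn, every installed task whose period has elapsed."""
    for t in list(tasks):       # copy: a handler may install or remove tasks
        if now_ms >= t.next_time:
            t.next_time = now_ms + t.period_ms
            t.handler()         # an ordinary call; it returns before the next task runs


ticks = []
clock = Task(lambda: ticks.append("tick"), period_ms=1000)
install(clock)
for now in (0, 500, 1000, 1500, 2000):
    run_due_tasks(now)
remove(clock)
run_due_tasks(3000)
```

The handler runs at times 0, 1000, and 2000 only, and not at all once the task has been removed.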
The following two examples of concrete background tasks may serve a better
understanding of our explanations. The first one is a system-wide garbage
collector collecting unused memory. The second example is a network monitor
accepting incoming data on a local area network. In both examples the state of
the task is captured entirely by global system variables. We shall come back to
these topics in Chapters 8 and 10 respectively.
We should not end this Section without drawing an important conclusion.
Transfers of control between tasks are implemented in Oberon as ordinary calls
and returns of ordinary procedures (procedure variables, actually). Preemp-
tion is not possible. From that we conclude that active periods of tasks are
sequentially ordered and can be controlled by a single thread of control. This
simplification pays well: Locks of common resources are completely dispensable
and deadlocks are not a topic.
The field id specifies the exact request transmitted with this specific reactivation.
In the case of InputMsg the possible requests are consume (the character specified
by field ch) and track (mouse, starting from state given by keys and X, Y). In
case of ControlMsg the choice is mark (the viewer at position X, Y) or neutralize.
Mark means moving the global system pointer (typically represented as a star-
shaped mark) to the current position of the mouse. Neutralizing a viewer is
equivalent to removing all marks and graphical attributes from this viewer.
All tasking facilities are collected in one program module, called Oberon. In
particular, the module’s definition exposes the declarations of the abstract data
type Task and of the message types InputMsg and ControlMsg. The module’s
most important contribution, however, is the task scheduler (often referred to as
“Oberon loop”) that can be regarded as the system’s dynamic center.
Before studying the scheduler in detail we need some more preparation. We
start with the institution of the focus viewer. By definition, this is a distinguished
viewer that by convention consumes subsequent keyboard input. Note that we
identify the focus viewer with the focus task, hereby making use of the one-to-one
correspondence between viewers and tasks.
Module Oberon provides the following facilities in connection with the focus
viewer: A global variable FocusViewer, a procedure PassFocus for transferring
the role of focus to a new viewer, and a defocus variant of ControlMsg for
notifying the old focus viewer of such a transfer.
The implementation details of the abstract data type Task are hidden from
the clients. It is sufficient to know that all task descriptors are organized in
a ring and that a pointer points to the previously activated task. The ring is
guaranteed never to be empty because the above mentioned garbage collector is
installed as a permanent sentinel task at system loading time.
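A task ring of this kind might be sketched as follows. The sketch is ours and
deliberately simplified: the names RingSketch, Run, and ticks are hypothetical,
the sentinel merely counts its activations instead of collecting garbage, and
neither delays nor removal of tasks are modelled.

```oberon
MODULE RingSketch;  (*illustrative only; names are hypothetical*)

  TYPE
    Handler* = PROCEDURE;
    Task* = POINTER TO TaskDesc;
    TaskDesc* = RECORD
      next: Task;
      handle: Handler
    END;

  VAR
    prev: Task;  (*previously activated task; the ring is never empty*)
    ticks*: INTEGER;  (*counts activations of the sentinel*)

  PROCEDURE Sentinel;  (*stands in for the garbage collector task*)
  BEGIN INC(ticks)
  END Sentinel;

  PROCEDURE Install* (h: Handler);
    VAR t: Task;
  BEGIN
    NEW(t); t.handle := h;
    t.next := prev.next; prev.next := t  (*insert after prev*)
  END Install;

  (*activate the next n tasks in turn; an activation is an ordinary call*)
  PROCEDURE Run* (n: INTEGER);
  BEGIN
    WHILE n > 0 DO
      prev := prev.next;
      prev.handle;  (*runs to completion; no preemption*)
      DEC(n)
    END
  END Run;

BEGIN
  NEW(prev); prev.handle := Sentinel; prev.next := prev; ticks := 0
END RingSketch.
```

Because every activation is a plain procedure call that returns before the
next task is selected, no locking of the ring is needed.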
The following is a slightly abstracted version of the actual scheduler code
operating on the task ring. It should be associated with procedure Loop in the
module Oberon.
get mouse position and state of keys;
REPEAT
Essentially, the system state is determined by the values of all global and local
variables at a given time. The trap handler typically opens an extra viewer
displaying the cause of the trap and the saved system state. Notice in the
program fragment above that background tasks are removed from the ring after
failing. This is an effective precaution against cascades of repeated failures.
Obviously, no such precaution is necessary in the case of interactive tasks because
their reactivation is under control of the user of the system.
interruptions may occur at any time, in Oberon they can occur only after the
completion of a task, of a command.
Quintessentially, Oberon programs are represented in the form of commands,
that is, exported parameterless procedures that do not interact
with the user of the system.
Returning to the calling and execution of programs we now arrive at the
following refined code version:
name is the name of the desired command in the form M.P, par is the list of
actual parameters, and res is a result code. But in fact we have separated the
setting of parameters from the actual call. Parameters are set by calling
The pair (T, pos) specifies the starting position of a textual parameter list. F
indicates the calling viewer. Notice the occurrence of yet another abstract data
type of name Text that is exported by module Texts. We shall devote Chapter
5 to a thorough discussion of Oberon’s text system. For the moment we can
simply look at a text as a sequence of characters.
The list of actual parameters is handed over to the called command by
module Oberon in the form of an exported global variable Par:
In principle, commands operate on the entire system and can access the current
global state via the system’s powerful abstract modular interface, of which the
list of actual parameters is just one component. Another one is the so-called
system log which is a system-wide protocol reporting on the progress of command
3.4. TOOLBOXES
Modules typically appear in three different forms. The first is a module
that encapsulates some data, letting them be accessed only through exported
procedures and functions. A good example is module FileDir, encapsulating
the file directory and protecting it from disruptive access. A second kind is the
module representing an abstract data type, exporting a type and its associated
operators. Typical examples are modules Files, Modules, Viewers, and Texts.
A third kind is the collection of procedures pertaining to the same topic, such
as module RS-232 handling communication over a serial line.
Oberon adds a fourth form: the toolbox. By definition, this is a pure col-
lection of commands in the sense of the previous section. Toolboxes distinguish
themselves principally from the other forms of modules by the fact that they lie
on top of the modular hierarchy. Toolbox modules are “imported” by system
users at run-time. In other words, their definitions define the user interface.
Typical examples are modules System and Edit. As a rule of thumb there exists
a toolbox for every topic or application.
As an example of a toolbox definition we quote an annotated version of
module System:
DEFINITION System;
PROCEDURE CloseTrack;
PROCEDURE Recall; (*most recently closed viewer*)
PROCEDURE Copy; (*viewer*)
PROCEDURE Grow; (*viewer*)
PROCEDURE Clear; (*clear log*)
If present, the parameter list is made available to the called command via fields
text and pos in the global variable Par that is exported from module Oberon.
Because this parameter list is interpreted individually by each command, its
format is completely open. However, we postulate some conventions and rules
for the purpose of a standardized user interface:
1. The elements of a textual parameter list are universal syntactical tokens like
name, literal string, integer, real number, and special character.
2. An arrow “^” in the textual parameter list refers to the current text selection
for continuation. In the special case of the arrow following the command
name immediately, the entire parameter list is represented by the text se-
lection.
3. An asterisk “*” in the textual parameter list refers to the currently marked
viewer. Typically, the asterisk replaces the name of a file. In such a case
the contents of the viewer marked by the system pointer (star) is processed
by the command interpreter instead of the contents of a file.
4. An at-character “@” in the textual parameter list indicates that the selection
marks the (beginning of the) text which is taken as operand.
The display screen is the most important part of the interface presented by a
personal workstation to its users. At first sight, it simply represents a rectangular
output area. However, in combination with the mouse, it quickly develops into
a sophisticated interactive input/output platform of almost unlimited flexibility.
It is mainly its Janus-faced characteristic that makes the display screen stand
out from ordinary external devices to be managed by the operating system. In
the current chapter we shall give more detailed insight into the reasons for the
central position the display system takes within the operating system, and for
its determining influence on the entire system architecture. In particular, we
shall show that the display system is a natural basis or anchor for functional
extensibility.
The desktop metaphor is used by many modern operating systems and user
interface shells both as a natural model for the system to separate displayed
data belonging to different tasks, and as a powerful tool for users to organize
the display screen interactively, according to individual taste and preference.
However, there are inherent drawbacks in the metaphor. They are primarily
connected with overlapping. Firstly, any efficient management of overlapping
viewers must rely on a subordinate management of (arbitrary) sub-rectangles
and on sophisticated clipping operations. This is so because partially overlapped
viewers must be partially restored under control of the viewer manager. For
example, in Figure 4.1, rectangles a, b, and c in viewer B ought to be restored
individually after closing of viewer A. Secondly, there is a significant danger of
covering viewers completely and losing them forever. And thirdly, no canonical
heuristic algorithms exist for automatic allocation of screen space to newly
opened viewers.
Experience has shown that partial overlapping is desirable and beneficial in
rare cases only, and so the additional complexity of its management [1] [2] is hard
to justify. Therefore, alternate strategies to structure a display screen have been
looked for. An interesting class of established solutions can be titled as tiling.
There are several variants of tiling [3]. Perhaps the most obvious one (because
the most unconstrained one) is based on iterated horizontal or vertical splitting
of existing viewers. Starting with the full screen and successively opening viewers
A, B, C, D, E, and F we get to a configuration as in Figure 4.2.
[1] C. Binding, User Interface Components based on a Multiple Window Package, University
of Washington, Seattle, Technical Report 85-08-07.
[2] M. Wille, Overview: Entwurf und Realisierung eines Fenstersystems für
Arbeitsplatzrechner, Diss. ETH Nr. 8771, 1988.
[3] E.S. Cohen, E.T. Smith, L.A. Iverson, Constraint-Based Tiled Windows, IEEE, 1985.
black boxes so far we have not revealed anything about the continuation of the
hierarchy. As a matter of fact, viewers are neither elementary display entities nor
atoms. They are just a special case of so-called display frames. Display frames
or frames in short are arbitrary rectangles displaying a collection of objects or
an excerpt of a document. In particular, frames may recursively contain other
frames, a capability that makes them an extremely powerful tool for any display
organizer.
The type Frame is declared as
Frame = POINTER TO FrameDesc;
FrameDesc = RECORD
next, dsc: Frame;
X, Y, W, H: INTEGER;
handle: Handler
END;
The components next and dsc are connections to further frames. Their names
suggest a multi-level recursive hierarchical structure: next points to the next
frame on the same level, while dsc points to the (first) descendant, i.e. to the
next lower level of the hierarchy of nested frames. X, Y, W, H, and the handler
handle serve the purpose for which we originally introduced them. In particular, the
handler allows frames to react individually to the receipt of messages. Its type
is
Handler = PROCEDURE (F: Frame; VAR M: FrameMsg);
where FrameMsg represents the root of a potentially unlimited tree hierarchy of
possible messages to frames:
FrameMsg = RECORD END;
Having now introduced the concept of frames, we can reveal the whole truth
about viewers. As a matter of fact, type Viewer is a derived type, namely a type
extension of Frame:
Viewer = POINTER TO ViewerDesc;
ViewerDesc = RECORD (FrameDesc)
state: INTEGER
END;
These declarations formally express the fact that viewers are nothing but a
special case (or variant or subclass) of general frames, additionally featuring
a state of visibility. In particular, viewers inherit the hierarchical structure of
frames. This is an extremely useful property immediately opening an unlimited
spectrum of possibilities for designers of a specific subclass of viewers to organize
the representing rectangular area. For example, the area of viewers of, say,
class Desktop may take the role of a background being covered by an arbitrary
collection of possibly mutually overlapping frames. In other words, our decision
of using a tiling viewer scheme globally can easily be overridden locally.
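The recursive structure spanned by next and dsc can be explored with a few lines
of code. The following sketch repeats the declarations given above in a
self-contained module; the module name FrameSketch and the procedures Count and
Broadcast are ours, and we assume, for simplicity, that sibling lists are
terminated by NIL.

```oberon
MODULE FrameSketch;  (*illustrative only*)

  TYPE
    Frame* = POINTER TO FrameDesc;
    FrameMsg* = RECORD END;
    Handler* = PROCEDURE (F: Frame; VAR M: FrameMsg);
    FrameDesc* = RECORD
      next*, dsc*: Frame;
      X*, Y*, W*, H*: INTEGER;
      handle*: Handler
    END;

    Viewer* = POINTER TO ViewerDesc;
    ViewerDesc* = RECORD (FrameDesc)
      state*: INTEGER
    END;

  (*number of frames in the list starting at F, including all descendants;
    assumes NIL-terminated sibling lists*)
  PROCEDURE Count* (F: Frame): INTEGER;
    VAR n: INTEGER;
  BEGIN n := 0;
    WHILE F # NIL DO
      n := n + 1 + Count(F.dsc);
      F := F.next
    END;
    RETURN n
  END Count;

  (*send M to every frame in the list starting at F and to all descendants*)
  PROCEDURE Broadcast* (F: Frame; VAR M: FrameMsg);
  BEGIN
    WHILE F # NIL DO
      IF F.handle # NIL THEN F.handle(F, M) END;
      Broadcast(F.dsc, M);
      F := F.next
    END
  END Broadcast;

END FrameSketch.
```

Because Viewer is an extension of Frame, a variable of type Viewer can be passed
to Count or Broadcast unchanged.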
4.4.1. Viewers.
Focusing first on module Viewers we can roughly define the domain of its
responsibility as “initializing and maintaining the global layout of the display
area”. From the previous discussion we are well acquainted already with the
structure of the global display space as well as with its building blocks: The
display area is hierarchically tiled with display frames, where the first two levels
in the frame hierarchy correspond to tracks and viewers respectively.
This is the formal definition:
DEFINITION Viewers;
IMPORT Display;
CONST restore = 0; modify = 1; suspend = 2; (*message ids*)
TYPE Viewer = POINTER TO ViewerDesc;
ViewerDesc = RECORD (Display.FrameDesc)
state: INTEGER
END;
ViewerMsg = RECORD (Display.FrameMsg)
id: INTEGER;
X, Y, W, H: INTEGER;
state: INTEGER
END;
(*track handling*)
PROCEDURE InitTrack (W, H: INTEGER; Filler: Viewer);
PROCEDURE OpenTrack (X, W: INTEGER; Filler: Viewer);
PROCEDURE CloseTrack (X: INTEGER);
(*viewer handling*)
PROCEDURE Open (V: Viewer; X, Y: INTEGER);
PROCEDURE Change (V: Viewer; Y: INTEGER);
PROCEDURE Close (V: Viewer);
(*miscellaneous*)
PROCEDURE This (X, Y: INTEGER): Viewer;
PROCEDURE Next (V: Viewer): Viewer;
PROCEDURE Recall (VAR V: Viewer);
PROCEDURE Locate (X, H: INTEGER; VAR fil, bot, alt, max: Viewer);
PROCEDURE Broadcast (VAR M: Display.FrameMsg);
END Viewers.
It is now a good time to throw a glance behind the scenes. Let us start with
revealing module Viewers' internal data structure. Remember that according to
the principle of information hiding an internal data structure is fully private to
the containing module and accessible through the module's procedural interface
only. Figure 4.6 shows a data structure view of the display snapshot taken in
Figure 4.4. Note that the overlaid tracks and viewers are still part of the internal
data structure.
In the data structure we recognize an anchor that represents the display
area and points to a list of tracks, each of them in turn pointing to a list of
viewers, each of them in turn pointing to a list of arbitrary sub-frames. Both
the list of tracks and the list of viewers are closed to a ring, where the filler track
(filling up the display area) and the filler viewers (filling up the tracks) act as
anchors. Additionally, each track points to a (possibly empty) list of tracks lying
underneath. These frames are invisible on the display, and shaded in Figure 4.6.
Fig. 4.6 A snapshot of the internal data structure corresponding to Figure 4.3
FrameDesc = RECORD
next, dsc: Frame;
X, Y, W, H: INTEGER;
handle: Handler
END;
It is noteworthy that the data structure of the viewer manager is heterogeneous
with Frame as base type. It provides a nice example of a nested hierarchy of
frames with the additional property that the first two levels correspond to the
first two levels in the type hierarchy defined by Track, Viewer, and Frame.
In an object-oriented environment objects are autonomous entities in prin-
ciple. However, they may be bound to some higher instance (other than the
system) temporarily. For example, we can look at the objects belonging to a
module’s private data structure as bound to this module. Deciding if an object
is currently bound is then a fundamental problem. In the case of viewers, this
information is contained in an extra instance variable called state.
As a system invariant, we have for every viewer V
V is bound to module Viewers ⇐⇒ V.state ≠ 0
If we call visible any displayed viewer and suspended any viewer that is covered
by an overlaying track we can refine this invariant to
(V is visible ⇐⇒ V.state > 0) ∧ (V is suspended ⇐⇒ V.state < 0)
In addition, more detailed information about the kind of viewer V is given by the
magnitude | V.state |:
| V.state |   Kind of viewer
0             closed
1             filler
> 1           productive
The magnitude | V.state | is kept invariant by module Viewers. It could
be used, for example, to distinguish different levels of importance or preference
with the aim of supporting a smarter algorithm for heuristic allocation of new
viewers. The variable state is treated as read-only by every module other than
Viewers.
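The invariants on state translate directly into predicates. The following
sketch is ours (module StateSketch and its procedure names are hypothetical);
in particular, representing suspension and resumption as a change of sign is
an assumption that is merely consistent with the invariants stated above.

```oberon
MODULE StateSketch;  (*illustrative only*)

  CONST closed* = 0; filler* = 1;  (*magnitudes of state*)

  PROCEDURE Bound* (state: INTEGER): BOOLEAN;
  BEGIN RETURN state # 0
  END Bound;

  PROCEDURE Visible* (state: INTEGER): BOOLEAN;
  BEGIN RETURN state > 0
  END Visible;

  PROCEDURE Suspended* (state: INTEGER): BOOLEAN;
  BEGIN RETURN state < 0
  END Suspended;

  (*assumption: suspending or resuming flips the sign and so
    preserves the magnitude*)
  PROCEDURE Toggle* (state: INTEGER): INTEGER;
  BEGIN RETURN -state
  END Toggle;

END StateSketch.
```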
We are now sufficiently prepared to understand how the exported procedures
of module Viewers work behind the scenes. All of them operate on the internal
dynamic data structure just explained. Some use the structure as a reference
only or operate on individual elements (procedures This, Next, Locate, Change),
others add new elements to the structure (procedures InitTrack, OpenTrack,
Open), and even others remove elements (procedures CloseTrack, Close). Most
procedures have side-effects on the size or state of existing elements.
Let us now change perspective and look at module Viewers as a general
low-level manager of viewers whose exact contents are unknown to it (and whose
controlling software might have been developed years later). In short, let us look
at module Viewers as a manager of black boxes. Such an abstraction immediately
makes it impossible for the implementation to call fixed procedures for, say,
changing a viewer’s size or state. The facility needed is a message-oriented
interface.
TYPE ViewerMsg = RECORD (Display.FrameMsg)
id: INTEGER;
X, Y, W, H: INTEGER;
state: INTEGER
END;
There exist three variants of Viewer messages, discriminated by the field id:
Restore contents, modify height (extend or reduce at bottom), and suspend
(close temporarily or permanently). The additional components of the message
inform about the desired new location, size, and state.
The following table lists senders, messages, and recipients of viewer messages.
Originator   Message               Recipients
OpenTrack    Suspend temporarily   Viewers covered by opening track
CloseTrack   Suspend permanently   Viewers in closing track
Open         Modify or suspend     Upper neighbor of opening viewer
Change       Modify                Upper neighbor of changing viewer
Close        Suspend permanently   Closing viewer
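From the recipient's point of view, the three variants are discriminated by
inspecting the field id. The following sketch shows the shape of such a handler
under simplifying assumptions: the viewer type is reduced to two hypothetical
fields (shown and drawnH), and the reactions are placeholders rather than the
behavior of any actual viewer class.

```oberon
MODULE ViewerMsgSketch;  (*illustrative only; not the code of module Viewers*)

  CONST restore* = 0; modify* = 1; suspend* = 2;  (*message ids*)

  TYPE
    FrameMsg* = RECORD END;
    ViewerMsg* = RECORD (FrameMsg)
      id*: INTEGER;
      X*, Y*, W*, H*: INTEGER;
      state*: INTEGER
    END;

    Viewer* = POINTER TO ViewerDesc;
    ViewerDesc* = RECORD
      shown*: BOOLEAN;  (*hypothetical: contents currently drawn*)
      drawnH*: INTEGER  (*hypothetical: height of the drawn contents*)
    END;

  PROCEDURE React (V: Viewer; VAR M: ViewerMsg);
  BEGIN
    IF M.id = restore THEN V.shown := TRUE; V.drawnH := M.H
    ELSIF M.id = modify THEN V.drawnH := M.H
    ELSIF M.id = suspend THEN V.shown := FALSE; V.drawnH := 0
    END
  END React;

  PROCEDURE Handle* (V: Viewer; VAR M: FrameMsg);
  BEGIN
    (*messages of other types are ignored*)
    IF M IS ViewerMsg THEN React(V, M(ViewerMsg)) END
  END Handle;

END ViewerMsgSketch.
```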
TYPE
Viewer = POINTER TO ViewerDesc;
ViewerDesc = RECORD (Viewers.ViewerDesc)
menuH: INTEGER
END;
ModifyMsg = RECORD (Display.FrameMsg)
id: INTEGER;
dY, Y, H: INTEGER
END;
PROCEDURE Handle (V: Display.Frame; VAR M: Display.FrameMsg);
PROCEDURE New (Menu, Main: Display.Frame;
menuH, X, Y: INTEGER): Viewer;
END MenuViewers.
create V; open V at X, Y
ELSE (*reduce*)
send modify message to main frame to make it reduce;
send modify message to menu frame to make it reduce;
reduce viewer area and border
END
ELSIF operation is suspend THEN
send modify message to main frame to make it reduce to height 0;
send modify message to menu frame to make it reduce to height 0
END
END
The field id specifies one of two variants: extend or reduce. The first variant
of the message requests the receiving frame to move by the vertical translation
vector dY and then to extend to height H at bottom. The second variant requests
the frame to reduce to height H at bottom and then to move by dY. In both cases
Y indicates the Y-coordinate of the new lower-left corner. Figure 4.7 summarizes
this graphically.
Messages arriving from the viewer manager and requesting the receiving
viewer to extend or reduce at its bottom are also mapped into messages of type
ModifyMsg. Of course, no translation is needed in these cases, and dY is 0.
The attentive reader might perhaps have asked why the standard handler
is exported by module MenuViewers at all. The idea behind this is reusability
of code. For example, a message handler for a subclass of menu viewers could
be implemented effectively by reusing the menu viewers' standard handler. After
having handled all new or differing cases first, it would simply (super-)call the
standard handler.
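The pattern of such a super-call can be sketched as follows. All names in the
sketch are hypothetical: StandardHandle merely stands in for the exported
standard handler, and BeepMsg is an invented message type handled by the
subclass only.

```oberon
MODULE SuperCallSketch;  (*illustrative only; names are hypothetical*)

  TYPE
    Msg* = RECORD END;
    ModifyMsg* = RECORD (Msg) H*: INTEGER END;
    BeepMsg* = RECORD (Msg) END;  (*new message known to the subclass only*)

    Viewer* = POINTER TO ViewerDesc;
    ViewerDesc* = RECORD H*, beeps*: INTEGER END;

  (*stands in for the standard handler exported by the base module*)
  PROCEDURE StandardHandle* (V: Viewer; VAR M: Msg);
  BEGIN
    IF M IS ModifyMsg THEN V.H := M(ModifyMsg).H END
  END StandardHandle;

  (*handler of the subclass*)
  PROCEDURE Handle* (V: Viewer; VAR M: Msg);
  BEGIN
    IF M IS BeepMsg THEN INC(V.beeps)  (*new case, handled first*)
    ELSE StandardHandle(V, M)  (*super-call for everything else*)
    END
  END Handle;

END SuperCallSketch.
```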
The viewer manager and the cursor handler are two concurrent users of
the same display area. Actually, we should imagine two parallel planes, one
displaying viewers and the other displaying cursors. If there is just one physical
plane we take care of painting markers non-destructively, for example in inverse-
video mode. Then, no precondition must be established before drawing a marker.
However, in the case of a viewer task painting destructively in its viewer’s area,
all markers in the area must first be turned invisible, and the area must then be locked.
The technical support of cursor management is again contained in module
Oberon. The corresponding application programming interface is
DEFINITION Oberon;
TYPE Marker = RECORD
Fade, Draw: PROCEDURE (x, y: INTEGER)
END;
Cursor = RECORD
marker: Marker;
on: BOOLEAN;
X, Y: INTEGER
END;
VAR Arrow, Star: Marker;
Mouse, Pointer: Cursor;
PROCEDURE OpenCursor (VAR c: Cursor);
PROCEDURE FadeCursor (VAR c: Cursor);
PROCEDURE DrawCursor (VAR c: Cursor; VAR m: Marker;
X, Y: INTEGER);
PROCEDURE MarkedViewer (): Viewers.Viewer;
PROCEDURE RemoveMarks (X, Y, W, H: INTEGER);
...
END Oberon.
The state of a cursor is given by its mode of visibility (on), its position (X, Y)
in the display area, and the current marker. Marker is an abstract data type
with an interface consisting of two operations Fade and Draw. The main benefit
we can draw from this abstraction is once more conceptual independence of the
underlying hardware. For example, Fade and Draw can adapt to a given monitor
hardware with built-in cursor support or, in case of absence of such support,
can simply be implemented as identical procedures (an involution) drawing the
marker pattern in inverse video mode.
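The involution property can be illustrated with a toy model. In the following
sketch, which is ours and not part of module Oberon, a small array of sets
stands in for a one-bit display, and a single procedure Invert serves as both
Fade and Draw of a hypothetical one-pixel marker Dot.

```oberon
MODULE MarkerSketch;  (*illustrative only; a 32 x 8 one-bit "screen"*)

  CONST W = 32; H = 8;

  TYPE
    Marker* = RECORD
      Fade*, Draw*: PROCEDURE (x, y: INTEGER)
    END;

  VAR
    screen: ARRAY H OF SET;  (*bit x of screen[y] is pixel (x, y)*)
    Dot*: Marker;

  (*invert a single pixel; applying it twice restores the old state*)
  PROCEDURE Invert (x, y: INTEGER);
  BEGIN
    IF (x >= 0) & (x < W) & (y >= 0) & (y < H) THEN
      screen[y] := screen[y] / {x}  (*symmetric difference, i.e. XOR*)
    END
  END Invert;

  (*requires 0 <= x < W and 0 <= y < H*)
  PROCEDURE IsSet* (x, y: INTEGER): BOOLEAN;
  BEGIN RETURN x IN screen[y]
  END IsSet;

  PROCEDURE Clear;
    VAR y: INTEGER;
  BEGIN
    FOR y := 0 TO H - 1 DO screen[y] := {} END
  END Clear;

BEGIN
  Clear;
  Dot.Fade := Invert; Dot.Draw := Invert  (*identical procedures*)
END MarkerSketch.
```

Calling Dot.Draw(3, 2) followed by Dot.Fade(3, 2) leaves the screen exactly as
it was before, whatever had been painted at that position.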
The functional interface to cursors consists of three operations: OpenCursor
to open a new cursor, FadeCursor to switch off the marker of an open cursor,
and DrawCursor to extend the path of a cursor to a new position and mark
it with the given marker. We emphasize that the marker representing a given
cursor can change its shape dynamically on the fly.
Two cursors, Mouse and Pointer are predefined. They represent the mouse
and an interactively controlled global system pointer respectively. Typically
(but not necessarily) these cursors are visualized by the built-in markers Arrow
Fig. 4.8 A pattern and its encoding as an array of bytes (in hex)
Some standard patterns are included in module Display and exported as
global variables. Among them are patterns arrow, hook, and star intended to
represent the cursor, the caret, and the marker. A second group of predefined
patterns supports drawing graphics.
The parameter col in the pattern-oriented raster operations specifies the
pattern’s foreground color. Colors black (background) and white are predefined.
[1] A. Goldberg, Smalltalk-80: The Interactive Programming Environment, Addison-Wesley
1984.
At the beginning of the computing era, text was the only medium mediating
information between users and computers. Not only was a textual notation used
to denote all kinds of data and objects via names and numbers (represented by
sequences of characters and digits respectively), but also for the specification of
programs (based on the notions of formal language and syntax) and tasks. Actually,
not even the most modern and most sophisticated computing environments
have been able to substantially weaken the dominating role of text. At most,
they have introduced alternative models like graphical user interfaces (GUI) as
a graphical replacement for command lines.
There are many reasons for the popularity of text in general and in con-
nection with computers in particular. To name but a few: Text containing any
arbitrary amount of information can be built from a small alphabet of widely
standardized elements (characters), their building pattern is extremely simple
(lining up elements), and the resulting structure is most elementary (a sequence).
And perhaps most importantly, syntactically structured text can be parsed and
interpreted by a machine.
In computing terminology, sequences of elements are called files and, in
particular, sequences of characters are known as text files. Looking at their
binary representation, we find text files excellently suited to be stored in com-
puter memories and on external media. Remember that individual characters
are usually encoded in one byte each (ASCII-code). We can therefore identify
the binary structure of text files with sequences of bytes, matching perfectly the
structure of any underlying computer storage. We should recall at this point
that, with the possible exception of line-break control characters, rendering
information is not part of ordinary text files. For example, the choices of
character style and of paragraph formatting parameters are entirely left to the
rendering interpreter.
Unfortunately, in conventional computing environments, text is merely used
for input/output, and its potential is far from fully exploited. Input
texts are typically read from the keyboard under control of some text editor,
interpreted and then discarded. Output text is volatile. Once displayed on the
screen it is no longer available to any other parts of the program. The root of
the problem is easily located: Conventional operating systems feature neither an
integrated management nor an abstract programming interface (API) for texts.
Of course, such poor support of text on the level of programming must
reflect itself on the user surface. More often than not, users are forced to retype
a certain piece of text instead of simply copy/pasting it from elsewhere on the
and vertical offset together are often referred to as looks. With that, we can now
define a (rich) text as a sequence of characters with looks. We shall treat the
topic of fonts and glyphs thoroughly in Section 5.4.
For the moment, however, let us continue our discussion of the abstract data
type Text. Formally, we define it as
There is only one state variable and one method. The variable len represents
the current length of the described text (i.e. the number of characters in the
sequence). The procedure variable notify is included as a method (occasionally
called after-method) to notify interested clients of state changes.
By definition, each abstract data type comes with a complete set of operations.
In the case of Text, three different groups corresponding to three different
topics need to be considered, namely loading (from file) and storing (to file),
editing, and accessing (reading and writing) respectively.
Logical entities like texts are stored in Oberon on external media in the form
of sections. A section is addressed by a pair (file, pos) consisting of a file
descriptor and a starting position. In general, the structure of sections obeys the
following syntax:
Procedure Open internalizes a named text file (consisting of a single text section),
procedure Load internalizes an arbitrary text section starting at (f, pos), and
procedure Store externalizes a text section to (f, pos). The parameter T
designates the internalized text. len returns the length of the section. Note
that in case of Load the identification of the section must have been read and
consumed before the loader is called.
By far the most important application of the notifier is updating the display,
i.e. adjusting all affected views of the text that are currently displayed to the
new state of the text (the model). We shall come back to this important matter
when discussing text frames in Section 5.3.
In concluding this Section it is worth noting that the groups of operations
just discussed have been designed to be as useful for interactive text editors
as for programmed text generators/manipulators.
(and vary) the granularity at which a text and its views are updated. Finally,
an after-method is used to allow context-dependent post-processing of editing
operations. It is used primarily for preserving consistency between text models
and their views.
used for Xerox PARC’s Bravo text editor and also for ETH’s former document
editors Dyna and Lara [1]. The original piece list is able to describe a vanilla
text without looks. It is based on two principles:
1. A text is regarded as a sequence of pieces, where a piece is a section of a
text file consisting of a sequence of contiguous characters.
2. Every piece is represented by a descriptor (f, pos, len), where the com-
ponents designate a file, a starting position, and a length respectively. The
whole text is represented as a list of piece descriptors (in short: piece list).
The editing operations operate on the piece list rather than on the pieces
themselves.
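The two principles can be made concrete with a small sketch. It is ours and
simplified in several respects: an in-memory character array stands in for a
file, the list is singly linked and terminated by NIL, and all names
(PieceSketch, FileDesc, NewPiece, CharAt, Demo) are hypothetical.

```oberon
MODULE PieceSketch;  (*illustrative only; an array stands in for a file*)

  TYPE
    File* = POINTER TO FileDesc;
    FileDesc* = RECORD data*: ARRAY 64 OF CHAR END;

    Piece* = POINTER TO PieceDesc;
    PieceDesc* = RECORD
      f*: File; pos*, len*: INTEGER;  (*descriptor (f, pos, len)*)
      next*: Piece
    END;

  PROCEDURE NewPiece* (f: File; pos, len: INTEGER; next: Piece): Piece;
    VAR p: Piece;
  BEGIN NEW(p); p.f := f; p.pos := pos; p.len := len; p.next := next;
    RETURN p
  END NewPiece;

  (*character at text position pos (counted from 0) in the piece list
    starting at first, or 0X if pos lies outside the text*)
  PROCEDURE CharAt* (first: Piece; pos: INTEGER): CHAR;
    VAR p: Piece; ch: CHAR;
  BEGIN p := first;
    WHILE (p # NIL) & (pos >= p.len) DO DEC(pos, p.len); p := p.next END;
    IF (p # NIL) & (pos >= 0) THEN ch := p.f.data[p.pos + pos] ELSE ch := 0X END;
    RETURN ch
  END CharAt;

  (*the text "Hello, World", built from two files without copying characters*)
  PROCEDURE Demo* (): Piece;
    VAR a, b: File;
  BEGIN
    NEW(a); a.data := "HelloWorld";
    NEW(b); b.data := ", ";
    RETURN NewPiece(a, 0, 5, NewPiece(b, 0, 2, NewPiece(a, 5, 5, NIL)))
  END Demo;

END PieceSketch.
```

In Demo, inserting ", " into "HelloWorld" amounts to replacing the descriptor
(a, 0, 10) by three descriptors; the contents of file a remain untouched.
CharAt(Demo(), 5) yields the comma taken from file b.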
from the previous Section. Type Piece is completely private and hidden from
the clients.
TextDesc = RECORD
len: INTEGER;
notify: Notifier;
. trailer: Piece;
. org: INTEGER;
. pce: Piece
END;
Reader = RECORD
eot: BOOLEAN;
fnt: Fonts.Font;
col, voff: INTEGER;
. ref: Piece;
. org, off: INTEGER;
rider: Files.Rider
END;
As depicted in Figure 5.1, the piece list is implemented as a doubly linked list
with a sentinel piece closing it to a ring. The field trailer in type TextDesc points
to the sentinel piece. Fields org and pce implement a translation cache consisting
of merely one entry (org, pce). It links a position org with a piece pce. The
fields header and last in type Buffer refer to the implementation of buffers
as piece lists. They point to the first and last piece descriptors respectively.
Finally, the fields ref, org, and off in type Reader memorize the current piece,
its origin, and the current offset within this piece.
The fields f, off, and len in type Piece specify the underlying file, starting
position in the file, and length of the piece. fnt, col, and voff are its looks.
Finally prev and next are pointers to the previous piece and to the next piece
in the list respectively.
FindPiece and SplitPiece are auxiliary procedures that are used by almost
all piece-oriented operations.
PROCEDURE FindPiece (T: Text; pos: INTEGER;
VAR org: INTEGER; VAR pce: Piece);
VAR p: Piece;
porg: INTEGER;
BEGIN
p := T.pce; (*start at the cached piece*)
porg := T.org;
1 IF pos >= porg THEN
WHILE pos >= porg + p.len DO INC(porg, p.len); p := p.next END
2 ELSE p := p.prev; DEC(porg, p.len);
WHILE pos < porg DO p := p.prev; DEC(porg, p.len) END
END;
3 T.pce := p; T.org := porg; (*update cache*)
pce := p; org := porg (*return the piece containing pos and its origin*)
END FindPiece;
Explanations (referring to the line numbers in the above code excerpt)
1 search to the right (next)
2 search to the left (prev)
3 update cache if more than 50 pieces traversed
ELSE pr := p
END
END SplitPiece;
Explanations:
1 return right part piece pr after split
2 generate new piece only if remaining length > 0
3 insert new piece in forward chain
4 insert new piece in backward chain
Procedure Insert handles text insertion. It operates on a buffer that contains
the stretch of text to be inserted:
Explanations:
1 split piece to isolate point of insertion
2 adjust cache if necessary
3 merge pieces if possible
4 insert buffer
5 update text length
6 empty buffer
7 notify
Explanations:
1 read character from file and update looks in reader
2 if piece boundary reached
3 check if sentinel piece reached
4 move reader to next piece
5 position file rider
Procedure Read is typically used as a primitive by text scanners and in particular
by the built-in scanner Scan for the recognition of universal tokens, as they were
defined in the previous section. Scanning is a rather complex operation that, for
example, includes the conversion of a sequence of digits into an internal floating-point
representation and vice versa. Scanning a real number involves recognizing
m and d, and computing x = m × 10^d. This is done using procedure Ten(d),
which computes 10^d by repeated multiplication, maintaining the invariant
t × p^n = 10^{n_0}, where n_0 is the initial value of n.
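The invariant can be checked mechanically. The sketch below is written in Python rather than Oberon and assumes that the repeated multiplication is organized as squaring p and halving n, which is one way to maintain the stated invariant; the text itself gives only the invariant. Integers are used instead of reals so that the check is exact, and the helper scale is a hypothetical name.

```python
def ten(n):
    """Compute 10**n for n >= 0 by repeated multiplication."""
    n0 = n
    t, p = 1, 10
    while n > 0:
        assert t * p**n == 10**n0    # invariant: t * p^n = 10^(n0)
        if n % 2 == 1:
            t *= p                   # odd n: move one factor p into t
        p *= p                       # square p ...
        n //= 2                      # ... and halve n
    return t                         # n = 0, hence t = 10^(n0)


def scale(m, d):
    """Combine mantissa m and decimal exponent d into x = m * 10**d."""
    return m * ten(d) if d >= 0 else m / ten(-d)
```

At loop exit n is 0, so the invariant reduces to t = 10^{n_0}, which is the desired result. The number of multiplications grows with the logarithm of d only.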
Remarks:
For compatibility reasons, plain ASCII-files are accepted as text
files as well. They are mapped to texts consisting of a single run with
standard looks.
Internalizing a text section from a file is extremely efficient because
it is obviously sufficient to read the header and translate it into the
initial state of the piece list.
Summary: The mechanism used for the implementation of the abstract data type
Text is completely hidden from clients. It is a generalized version of the original
piece list technique, adapted to texts with looks. The piece list technique in turn
is based on the principle of indirection: Operations operate on descriptors of texts
rather than on texts themselves. The benefits are efficiency and non-destructive
operations. However, the technique works properly only in combination with an
efficient (and reliable) garbage collector and a suitable file system.
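The principle of indirection can be illustrated with a toy model. This is a sketch in Python under simplifying assumptions, not the implementation of module Texts: a text is a Python list of descriptors (buf, off, len), the underlying buffers are immutable strings instead of files, looks are omitted, and the function names are hypothetical.

```python
def split(pieces, pos):
    """Ensure a piece boundary at text position pos; return the index of the piece starting there."""
    org = 0
    for i, (buf, off, length) in enumerate(pieces):
        if pos == org:
            return i
        if pos < org + length:
            k = pos - org
            pieces[i:i + 1] = [(buf, off, k), (buf, off + k, length - k)]
            return i + 1
        org += length
    return len(pieces)               # pos is the end of the text


def insert(pieces, pos, buf):
    """Insert the contents of buf at pos by adding one descriptor."""
    i = split(pieces, pos)
    pieces.insert(i, (buf, 0, len(buf)))


def delete(pieces, beg, end):
    """Delete the stretch [beg, end) by dropping descriptors."""
    i = split(pieces, beg)
    j = split(pieces, end)
    del pieces[i:j]


def contents(pieces):
    """Assemble the text described by the piece list."""
    return "".join(buf[off:off + n] for buf, off, n in pieces)
```

Neither insert nor delete touches the characters themselves; only descriptors are created, split, or dropped. This is why the operations are cheap and non-destructive, and also why buffers that are no longer referenced must eventually be reclaimed by other means.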
x, y specify the envisioned location relative to the text frame’s origin, and dx is
the width of the character at this location. pos is the corresponding position in
the text and org is the origin position of the corresponding text line.
The following is a simplified version of the message handler employed by
text frames. It fully determines the behavior and capabilities of text frames.
PROCEDURE Handle* (F: Display.Frame; VAR M: Display.FrameMsg);
VAR F1: Frame;
buf: Texts.Buffer;
BEGIN
CASE M OF
Oberon.InputMsg:
1 IF M.id = Oberon.track THEN Edit(F(Frame), M.X, M.Y, M.keys)
ELSIF M.id = Oberon.consume THEN
2 IF F(Frame).hasCar THEN Write(F(Frame), M.ch, M.fnt, M.col, M.voff)
END END |
Oberon.ControlMsg:
3 IF M.id = Oberon.defocus THEN Defocus(F(Frame))
4 ELSIF M.id = Oberon.neutralize THEN Neutralize(F(Frame))
END |
5 Oberon.SelectionMsg:
GetSelection(F(Frame), M.text, M.beg, M.end, M.time) |
6 Oberon.CopyMsg:
Copy(F(Frame), F1); M.F := F1 |
7 MenuViewers.ModifyMsg:
Modify(F(Frame), M.id, M.dY, M.Y, M.H) |
8 CopyOverMsg:
CopyOver(F(Frame), M.text, M.beg, M.end) |
9 UpdateMsg:
IF F(Frame).text = M.text THEN Update(F(Frame), M) END
END
END Handle;
Explanations:
1 Mouse tracking message: Call built-in editor immediately
2 Consume message: In case of valid caret insert character
3 Defocus message: Remove caret
4 Neutralize message: Remove caret and selection
5 Selection message: Return current selection with time stamp
6 Copy message: Create a copy (clone)
7 Modify message: Translate and change size
8 Copyover message: Copy given stretch of text to caret
9 Update message: If text was changed then update display
We recognize again our categories of universal messages introduced in Chapter
4, Table 4.6: Messages in lines 1 and 2 report about user interactions. Messages
in 3, 4, 5, 6, and 8 specify generic operations. Messages in 7 require a change of
location or size. Messages of the latter kind arrive from the ancestor menu viewer
via delegation. They are generated by the interaction handler and preprocessed
by the original viewer message handler. Finally, messages in line 9 report about
changes of contents.
The text frame handler is encapsulated in a module called TextFrames. This
module exports the above introduced types Frame (text frame) and Location,
as well as the procedure Handle. Furthermore, it exports type UpdateMsg to
report on changes made to a displayable text.
UpdateMsg = RECORD (Display.FrameMsg)
id: INTEGER;
text: Texts.Text;
beg, end: INTEGER
END;
Field id names one of the operators replace, insert, or delete. The remaining
fields text, beg, and end restrict the change to a range. Additional procedures
generate a new standard menu text frame and contents text frame respectively:
PROCEDURE NewMenu (name, commands: ARRAY OF CHAR): Frame;
PROCEDURE NewText (text: Texts.Text; pos: INTEGER): Frame;
This completes the minimum definition of module TextFrames. In addition, this
module exports a set of useful service procedures supporting the composition of
custom handlers from elements of the standard handler:
PROCEDURE Edit (F: Frame; X, Y: INTEGER; Keys: SET);
PROCEDURE Write (F: Frame; ch: CHAR; fnt: Fonts.Font;
col, voff: INTEGER);
PROCEDURE Defocus (F: Frame);
Let us now take a closer look at the implementation of some selected operations.
For this purpose, we must first explain the notion of line descriptor that is used
to optimize the operation of locating positions within text frames.
LineDesc = RECORD
len, wid: INTEGER;
eot: BOOLEAN;
next: Line
END;
Each line descriptor provides detailed information about a single line of text
that is currently displayed: len is the number of characters on the line, wid is
the line width, eot indicates terminating line, and next points to the next line
descriptor.
Text frames maintain a private data structure called line list that describes
the list of text lines displayed:
Field trailer represents a sentinel element that closes the line list to a ring.
5.3 TEXT FRAMES 75
The line list contains useful summary information about the current contents
of the text frame. It can be used beneficially by some related data types, for
example by type Location that was introduced earlier:
Location = RECORD
org, pos, dx, x, y: INTEGER;
lin: Line
END;
The built-in editor procedure Edit is a worthwhile part to look at in more detail.
It is called by the task scheduler to handle mouse events within a text frame.
The following code excerpt shows nicely how the different components of the
text system interoperate.
BEGIN
1 LocateLine(F, y, loc);
2 lim := loc.org + loc.lin.len - 1;
3 pos := loc.org; ox := F.left; dx := eolW;
4 Texts.OpenReader(R, F.text, loc.org);
5 WHILE pos # lim DO
6 Texts.Read(R, nextCh);
7 Fonts.GetPat(R.fnt.raster, nextCh, dx, u, v, w, h, patadr);
IF ox + dx <= x THEN
INC(pos); ox := ox + dx;
IF pos = lim THEN dx := eolW END
ELSE lim := pos
END
END ;
8 loc.pos := pos; loc.dx := dx; loc.x := ox
END LocateChar;
Explanations:
1 locate text line corresponding to y
2 set limit to the last actual character on this line
3 start locating loop with first character on this line
4 setup reader and read first character of this line
5–7 scan through characters of this line until limit or x is reached
7 get character width dx of current character
8 return location found
Note that the need to read characters from the text (again) in LocateChar
has its roots in the so-called proportional fonts in which our rich texts are repre-
sented. We found that keeping character widths is an unnecessary optimization
thanks to the buffering capabilities of the underlying file system. In the case of
fixed-pitch fonts a simple division by the character width would be sufficient, of
course.
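The difference between the two cases can be made concrete with a small model. The following Python sketch mirrors the loop of LocateChar under simplifying assumptions: the line is a string, the font is a dictionary mapping characters to widths, and the parameter names left and eol_w (standing for F.left and eolW) are chosen for illustration.

```python
def locate_char(line, widths, x, left=0, eol_w=4):
    """Return (pos, ox, dx): index of the character at x, its left edge, and its width."""
    lim = len(line)                  # limit: last actual character on the line
    pos, ox, dx = 0, left, eol_w
    while pos != lim:
        dx = widths[line[pos]]       # width of the current character
        if ox + dx <= x:
            pos += 1
            ox += dx
            if pos == lim:
                dx = eol_w           # beyond the last character
        else:
            lim = pos                # x reached: terminate the loop
    return pos, ox, dx


def locate_fixed(n, w, x, left=0):
    """Fixed-pitch case: a division replaces the scan over n characters of width w."""
    return min(n, (x - left) // w)
```

With a proportional font every character up to x must be inspected, because each contributes a different width; with a fixed pitch the position follows directly from x.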
Finally, procedure LocateLine uses the line list to locate the desired text
line without reading text at all.
PROCEDURE LocateLine (F: Frame; y: INTEGER; VAR loc: Location);
VAR L: Line; org, cury: INTEGER;
BEGIN
1 org := F.org; L := F.trailer.next; cury := F.H - F.top - asr;
2 WHILE (L.next # F.trailer) & (cury > y + dsr) DO
org := org + L.len; L := L.next; cury := cury - lsp
3 END;
4 loc.org := org; loc.lin := L; loc.y := cury
END LocateLine;
Explanations:
1 start with first line in the frame
Let us now take the perspective of a text frame receiving an update message.
Looking at line 9 in the text frame handler, we see that procedure Update is
called, which in turn calls procedure Insert in TextFrames:
Of course, the rules governing the rendering and formatting process crucially
influence the complexity of procedures like Insert. For text frames we have
consciously chosen the simplest possible set of formatting rules. They can be
summarized as:
1. For a given text frame the distance between lines is constant.
2. There are no implicit line breaks.
It is exactly this set of rules that makes it possible to display a text line in one
pass. Two passes are inevitable if line distances have to adjust to font sizes or if
lines must be broken implicitly.
Updating algorithms make use of the following one-pass rendering proce-
dures Width and DisplayLine:
PROCEDURE Width (VAR R: Texts.Reader; len: INTEGER): INTEGER;
VAR patadr, pos, ox, dx, x, y, w, h: INTEGER;
1 BEGIN pos:=0; ox:=0;
WHILE pos < len DO
Fonts.GetPat(R.fnt, nextCh, dx, x, y, w, h, patadr);
ox := ox + dx; INC(pos); Texts.Read(R, nextCh)
2 END;
3 RETURN ox
END Width;
Explanations:
1–2 scan through len characters of this line
3 return accumulated width
Procedures Width and LocateChar are similar. Therefore the above comment
about relying on the buffering capabilities of the underlying file system applies
to procedure Width equally well.
PROCEDURE DisplayLine (F: Frame; L: Line;
VAR R: Texts.Reader;
X, Y, len: INTEGER);
VAR patadr, NX, Xlim, dx, x, y, w, h: INTEGER;
BEGIN
1 NX := F.X + F.W; Xlim := NX - 40;
2 WHILE (nextCh # CR) & ((nextCh > " ") OR (X < Xlim)) & (R.fnt # NIL) DO
3 Fonts.GetPat(R.fnt, nextCh, dx, x, y, w, h, patadr);
4 IF (X + x + w <= NX) & (h # 0) THEN
5 Display.CopyPattern(R.col, patadr, X + x, Y + y, Display.invert)
6 END;
7 X := X + dx; INC(len); Texts.Read(R, nextCh)
8 END;
9 L.len := len + 1; L.wid := X+eolW - (F.X + F.left);
10 L.eot := R.fnt = NIL; Texts.Read(R, nextCh)
END DisplayLine;
Explanations:
1 set right margin
2–8 display characters of this line
3 get width dx, box x, y, w, h, and pattern address of next character
4 if there is enough space in the rectangle of contents
5 display pattern
7 jump to location of next character; read next character
9–10 setup line descriptor
Procedure DisplayLine is again similar to LocateChar, and the comment about
relying on the file system’s buffering capabilities applies once more. The principal
difference between LocateChar and Width on the one hand and DisplayLine on the
other hand is that the latter physically accesses the display screen.
Therefore, possession of the screen lock is a tacit precondition for calling
DisplayLine.
A quick look at an auxiliary procedure that updates the position marker
concludes our tour behind the scenes of the text system:
PROCEDURE UpdateMark (F: Frame);
VAR oldH: INTEGER;
BEGIN
1 oldH := F.markH; F.markH := F.org * F.H DIV (F.text.len + 1);
IF (F.mark > 0) & (F.left >= barW) & (F.markH # oldH) THEN
2 Display.ReplConst(Display.white, F.X + 1, F.Y + F.H - 1 - oldH, markW, 1, Display.invert);
3 Display.ReplConst(Display.white, F.X + 1, F.Y + F.H - 1 - F.markH, markW, 1, Display.invert)
END
END UpdateMark;
Explanations:
1 shows how the marker’s position is calculated. Loosely speaking, the invariant
is distance from top of frame / frame height = text position of first character
in frame / text length
2 erase the old marker
3 draw the new marker
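The calculation in line 1 can be examined with a few numbers. The Python function below restates the formula; the function and parameter names are chosen for illustration. The remark on the role of the added 1 in the divisor is an interpretation, not a statement from the text.

```python
def mark_height(org, frame_h, text_len):
    """Distance of the position marker from the top of the frame.

    org: text position of the first character displayed in the frame
    frame_h: height of the frame
    text_len: length of the text
    """
    return org * frame_h // (text_len + 1)
```

Dividing by text_len + 1 instead of text_len appears to serve two purposes: the division is defined even for an empty text, and since org never exceeds text_len, the result is always smaller than frame_h, so the marker stays inside the frame.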
And this in turn concludes our section on text frames. Recapitulating the most
important points: The tasks of text editing (input oriented) and text rendering
(output oriented) are combined in the concept of text frames. Text frames
constitute a subclass of display frames, and they are implemented in a separate
module called TextFrames. The implementation of TextFrames accesses the
displayed text exclusively via the “official” abstract interface of module Texts
discussed in Section 5.2. It maintains a private data structure of line lists to
accelerate locating requests. Text frames use simple formatting rules that allow
super-efficient rendering of text in a single pass. In particular, line spacing is
fixed for every text frame. Therefore, different styles of a base font are possible
within a given text frame while different sizes are not.
of runs within the ASCII-code range (intervals occupied without gaps) and every
pair [beg, end) describes one run. The tuples (dx, x, y, w, h) are the metric data
of the corresponding characters (in their ASCII-code order), and the sequence of
rasterByte gives the total of raster information.
In summary, fonts in Oberon are indexed libraries of objects. The objects
are descriptions of character images in two levels of abstraction: As metric data
of black boxes and as binary patterns (glyphs). Type Font is an abstract data
type with intrinsic operations to internalize and to get character object data.
Internalized fonts are cached in a private list.
symbol file—does not specify the entry addresses of its exported procedures, but
merely specifies a unique number (pno) for each one of them. The reason for this
is that in this way the implementation of M0 may be modified, causing a change
of entry addresses, without affecting its interface specification. And this is a
crucial property of the scheme of separate compilation of modules: changes of
the implementation of M0 must not necessitate the recompilation of clients (M1).
The consequence is that the binding of entry addresses to procedure numbers
must be performed by the linker. In order to make this possible, the object file
must contain a list (table) of its entry addresses, one for each procedure number
used as index to the table.
Similarly, the object file must contain a table of imported modules, contain-
ing their names. An external reference in the program code then appears in the
form of a pair consisting of a module number (mno) - used as index to the import
table (of modules) - and a procedure number (pno), used as index to the entry
table of this module.
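The double indexing can be modelled in a few lines. The following Python sketch is an illustration only: the class Module, its field names, and the concrete addresses are assumptions. It takes module numbers to start at 1, which matches the index computation (mno-1) used by the loader.

```python
class Module:
    """Hypothetical stand-in for a loaded module."""
    def __init__(self, name, code_base, entries, imports=()):
        self.name = name
        self.code_base = code_base       # load address of the code section
        self.entries = list(entries)     # entry table: pno -> offset within the code
        self.imports = list(imports)     # import table: mno - 1 -> Module


def resolve(client, mno, pno):
    """Translate the external reference (mno, pno) of client into an absolute address."""
    imported = client.imports[mno - 1]
    return imported.code_base + imported.entries[pno]
```

If M0 is recompiled and its procedures move, only its entry table changes. The pair (mno, pno) stored in the client M1 remains valid, which is exactly why M1 need not be recompiled.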
Certain linkage information must not only be provided in each object file,
but also be present along with each loaded module’s program code, because a
module to be loaded must be linkable with modules loaded at any earlier time
without reading their object files again.
key is the module’s key used for version consistency checking. The key changes
if, and only if, the module’s interface and thereby its symbol file changes. num is
the module’s number, which is the index of the module’s entry in a global module
table, referenced by the processor’s MT register. The invariant relationship is
ModTable[mod.mno] = mod.data
for all mod in the module list. size is the entire module block’s size excluding the
descriptor, and refcnt is the number of other modules importing this module.
This number is used to check whether a module can be released by procedure
Modules.Free.
The section with meta data follows the data and code areas and consists
of several parts. Imports is an array of the modules imported by this module,
each entry being the address of the respective module descriptor. Commands is
a sequence of procedure identifiers followed by their offset in the code section.
This section is used when activating a command. Entries is an array of offsets
of all exported entities (including commands). This section is used by the loader
itself for linking. Pointer refs is an array of offsets of global pointer variables in
the data section. These are used by the garbage collector as the roots of graphs
of heap objects in use.
the list searching for the named module. Only if it is not present, the module is
loaded and added to the list. Duplications therefore cannot occur.
mod := root;
WHILE (mod # NIL) & (name # mod.name) DO mod := mod.next END ;
IF mod = NIL THEN (*load*) F := ThisFile(name); Files.Set(R, F, 0); ...
First, the header of the respective object file is read. It specifies the required size
of the block which is allocated in the module area at the position indicated by the
global variable AllocPtr. Then the list of imports of the module being imported
is read, and these modules are imported. Evidently procedure Load is used
recursively. Because cyclic imports are excluded, recursion always terminates.
Files.ReadString(R, impname); (*imports*)
WHILE (impname[0] # 0X) & (res = 0) DO
Files.ReadInt(R, impkey);
Load(impname, impmod); import[nofimps] := impmod; importing := name1;
IF res = 0 THEN
IF impmod.key = impkey THEN INC(impmod.refcnt); INC(nofimps)
ELSE error(3, name); imported := impname
END
END ;
Files.ReadString(R, impname)
END
The loading process stops if a key mismatch is detected (err = 3). After
successful loading of all imports, the loading of the actual module proceeds by
allocating a descriptor and then reading the remaining sections of the file. The
data is allocated (and cleared) and the code section is read in a straight-forward
way without alteration.
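The control flow of loading, with its recursion over imports and the key check, is summarized by the following Python sketch. It is a model under stated assumptions, not the code of module Modules: object files are represented by a dictionary, the module list by another dictionary, and an exception takes the place of the result code. As a simplification, reference counts are incremented only after all imports have been loaded successfully.

```python
class LoadError(Exception):
    """Raised when an import's key differs from the key the client expects."""


def load(name, object_files, loaded):
    """Return the descriptor of module `name`, loading it and its imports if needed.

    object_files: hypothetical stand-in for the file system,
                  name -> {"key": int, "imports": [(impname, impkey), ...]}
    loaded:       dict of already loaded modules (stands in for the module list)
    """
    mod = loaded.get(name)
    if mod is not None:                  # already present: no duplicate is created
        return mod
    header = object_files[name]
    imports = []
    for impname, impkey in header["imports"]:
        impmod = load(impname, object_files, loaded)   # recursive use of load
        if impmod["key"] != impkey:      # version consistency check
            raise LoadError(f"{name} imports {impname} with wrong key")
        imports.append(impmod)
    for impmod in imports:               # count this client only on success
        impmod["refcnt"] += 1
    mod = {"name": name, "key": header["key"], "refcnt": 0, "imports": imports}
    loaded[name] = mod
    return mod
```

Since imports are loaded before the importing module is entered into the list, modules always appear after everything they depend on, and the absence of cyclic imports guarantees termination.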
At the very end of the file three integers called fixorgP, fixorgD, and
fixorgT are read. They are the anchors of linked lists in the program code of
instructions that need fixups. These fixups are performed only after the entire file
has been read. Traversing the P-list, the pairs mno-pno are replaced by computed
offsets in BL instructions (procedure calls). Traversing the D-list, addresses of
LDR instructions and instruction pairs are fixed up, and traversing the T-list,
addresses of type descriptors are computed and inserted. This low-level piece
of code is shown below for call instructions (BL). Those for the D-List and the
T-list are analogous.
adr := mod.code + fixorgP*4;
WHILE adr # mod.code DO
SYSTEM.GET(adr, inst);
mno := inst DIV 100000H MOD 10H; (*decompose*)
pno := inst DIV 1000H MOD 100H;
disp := inst MOD 1000H;
SYSTEM.GET(mod.imp + (mno-1)*4, impmod);
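The decomposition step of this excerpt can be reproduced with ordinary integer arithmetic. The Python sketch below models the code section as a list of words. The field layout is taken from the DIV and MOD operations above; the function encode and the rule for following the chain (disp taken as the distance in words back to the previous placeholder, with the chain ending at offset 0) are assumptions made for illustration.

```python
def decode(inst):
    """Split a placeholder word into its fields (mno, pno, disp)."""
    mno = inst // 0x100000 % 0x10        # 4 bits: index into the import table
    pno = inst // 0x1000 % 0x100         # 8 bits: index into the entry table
    disp = inst % 0x1000                 # 12 bits: link to the next placeholder
    return mno, pno, disp


def encode(mno, pno, disp):
    """Hypothetical inverse of decode, used here to build test data."""
    return (mno * 0x100 + pno) * 0x1000 + disp


def fixup_chain(code, fixorg):
    """Collect (index, mno, pno) for all placeholders, starting at word index fixorg."""
    found = []
    i = fixorg
    while i != 0:                        # the chain ends at the start of the code
        mno, pno, disp = decode(code[i])
        found.append((i, mno, pno))
        i -= disp                        # assumption: disp counts words backwards
    return found
```

Threading the list through the instructions themselves means that the object file needs no separate fixup table; a single anchor per list suffices.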
fixP, fixD, fixT are the origins of chains of instructions to be updated (fixed
up). body is the entry point offset of the module body.
CHAPTER 7
7.1. FILES
It is essential that a computer system has a facility for storing data over
longer periods of time and for retrieving the stored data. Such a facility is
called a file system. Evidently, a file system cannot accommodate all possible
data types and structures that will be programmed in the future. Hence, it
is necessary to provide a simple, yet flexible enough base structure that allows
any data structure to be mapped onto this base structure (and vice-versa) in a
reasonably straight-forward and efficient way. This base structure, called file, is a
sequence of bytes. As a consequence, any given structure to be transformed into
a file must be sequentialized. The notion of sequence is indeed fundamental, and
it requires no further explanation and theory. We recall that texts are sequences
of characters, and that characters are typically represented as bytes.
The sequence is also the natural abstraction of all physically moving storage
media. Among them are magnetic tapes and disks. Magnetic media have
the welcome property of non-volatility and are therefore the primary choices
for storing data over longer periods of time, especially over periods where the
equipment is switched off. Sequential access is also necessary for media that
allow access only by large blocks, such as flash-RAMs and SD-cards.
A further advantage of the sequence is that its transmission between media
is simple too. The reason is that its structural information is inherent and need
not be encoded and transmitted in addition to the actual data. This implicit-
ness of structural information is particularly convenient in the case of moving
storage media, because they impose strict timing constraints on transmission of
consecutive elements. Therefore, the process which generates (or consumes) the
data must be effectively decoupled from the transmission process that observes
the timing constraints. In the case of sequences, this decoupling is simple to
achieve by dividing a sequence into subsequences which are buffered. A sequence
is output to the storage medium by alternately generating data (and filling the
buffer holding the current subsequence) and transmitting data (fetching elements
from the buffer and transmitting them). The size of the subsequences (and the
buffer) depends on the storage medium under consideration: there must be no
timing constraints between accesses to consecutive subsequences.
The file is not a static data structure like the array or the record, because the
length may increase dynamically, i.e. during program execution. On the other
hand, the sequence is less flexible than general dynamic structures, because it
cannot change its form, but only its length, since elements can only be appended
but not inserted. It might therefore be called a semi-dynamic structure.
The discipline of purely sequential access to a file is enforced by restricting
access to calls of specific procedures, typically read and write procedures for
scanning and generating a file. In the jargon of data processing, a file must be
opened before reading or writing is possible. The opening implies the initializa-
tion of a reading and writing mechanism, and in particular the fixing of its initial
position. Hence each (opened) file not only has a value and a length, but also
a position attributed to it. If reading must occur from several positions (still
sequentially) alternately, the file is “multiply opened”; it implies that the same
file is represented by several variables, each denoting a different position.
This widespread view of files is conceptually unappealing, and the Oberon
file system therefore departs from it by introducing the notion of a rider. A file
simply has a value, the sequence of bytes, and a length, the number of bytes
in the sequence. Reading and writing occur through a rider, which denotes a
position. “Multiple opening” is achieved by simply using several riders riding
on the same file. Thereby the two concepts of data structure (file) and access
mechanism (rider) are clearly distinct and properly disentangled.
Given a file f, a rider r is placed on a file by the call Files.Set (r, f,
pos), where pos indicates the position from which reading or writing is to start.
Calls of Files.Read (r, x) and Files.Write (r, x) implicitly increment the
position beyond the element read or written, and the file is implicitly denoted
via the explicit parameter r, which denotes a rider. The rider has two (visible)
attributes, namely r.eof and r.res. The former is set to FALSE by Files.Set,
and to TRUE when a read operation could not be performed, because the end
of the file had been reached. r.res serves as a result variable in procedures
ReadBytes and WriteBytes allowing one to check for correct termination.
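The separation of file and rider can be expressed compactly. The following Python sketch is a model for illustration, not the interface of module Files: a file is held in memory as a byte array, the function names are chosen to resemble Files.Set, Files.Read and Files.Write, and the conventions that a read at the end yields 0 and that Set clamps the position to the file length are assumptions.

```python
class File:
    """A file has a value (a sequence of bytes) and a length, nothing else."""
    def __init__(self):
        self.data = bytearray()


class Rider:
    """An access position; any number of riders may ride on the same file."""
    def __init__(self):
        self.file = None
        self.pos = 0
        self.eof = False


def set_rider(r, f, pos):
    """Place rider r on file f at position pos (clamped to the file length)."""
    r.file = f
    r.pos = min(pos, len(f.data))
    r.eof = False


def read(r):
    """Read one byte and advance; set r.eof if the end has been reached."""
    if r.pos < len(r.file.data):
        x = r.file.data[r.pos]
        r.pos += 1
        return x
    r.eof = True
    return 0


def write(r, x):
    """Write one byte at the rider's position and advance; appends at the end."""
    if r.pos < len(r.file.data):
        r.file.data[r.pos] = x
    else:
        r.file.data.append(x)
    r.pos += 1
```

Two riders placed on the same file advance independently, which is all that "multiple opening" amounts to in this view.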
A file system must not only provide the concept of a sequence with its
accessing mechanism, but also a registry. This implies that files be identified,
that they can be given a name by which they are registered and retrieved. The
registry or collection of registered names is called the file system’s directory.
Here we wish to emphasize that the concepts of files as data structure with
associated access facilities on the one hand, and the concept of file naming and
directory management on the other hand must also be considered separately
and as independent notions. In fact, in the Oberon system their implementation
underscores this separation by the existence of two modules: Files and FileDir.
The following procedures are available. They are summarized by the interface
specification (definition) of module Files.
DEFINITION Files;
TYPE File = POINTER TO FileDesc;
FileDesc = RECORD END ;
Rider = RECORD eof: BOOLEAN; res: INTEGER END ;
... x ...;
Files.Read(r, x)
END
END
The analogous template for a purely sequential writing is:
f := Files.New(name);
Files.Set(r, f, 0);
WHILE ... DO Files.Write (r, x); ... END ;
Files.Register(f)
There exist two further procedures; they do not change any files, but only
affect the directory. Delete(name, res) causes the removal of the named entry
from the directory. Rename(old, new, res) causes the replacement of the
directory entry old by new.
It may surprise the reader that these two procedures, which affect the
directory only, are exported from module Files instead of FileDir. The reason
is that the presence of the two modules, together forming the file system, is also
used for separating the interface into a public and a private (or semi-public) part.
The definition (in the form of a symbol file) of FileDir is not intended to be
freely available, but restricted to use by system programmers. This allows the
export of certain sensitive data (such as file headers) and sensitive procedures
(such as Enumerate) without the danger of misuse by inadvertent users.
Module Files constitutes a most important interface whose stability is
utterly essential, because it is used by almost every module programmed. During
the entire time span of development of the Oberon system, this interface had
changed only once. We also note that this interface is very terse, a factor
contributing to its stability. Yet, the offered facilities have in practice over years
proved to be both necessary and sufficient.
The header contains some additional data, namely the length of the file (in
bytes), its name, and date and time of its creation. The size of the header
is 352 bytes; the remaining 672 bytes of the first sector are used for data.
Hence, truly short files occupy a single sector only. The declaration of the file
header is contained in the definition of module FileDir. An abbreviated version
7.2 IMPLEMENTATION OF FILES ON A RANDOM-ACCESS STORE 99
We emphasize that in all but one out of 1024 cases only three instructions and
a single test are to be executed. This improvement therefore is crucial to the
efficiency of file access, and to that of the entire Oberon System. We now present
the entire file module (for files on a random-access store).
CONST
HS = FileDir.HeaderSize;
SS = FileDir.SectorSize;
STS = FileDir.SecTabSize;
XS = FileDir.IndexSize;
Rider* =
RECORD eof*: BOOLEAN;
res*, pos, adr: INTEGER;
file: File END ;
FileDesc =
RECORD mark: INTEGER;
name: FileDir.FileName;
len, date: INTEGER;
ext: ARRAY FileDir.ExTabSize OF Index;
sec: FileDir.SectorTable END ;
Allocation of a new sector occurs upon creating a file (Files.New), and when
writing at the end of a file after the current sector had been filled. Procedure
AllocSector yields the address of the allocated sector. It is determined by a
search in the sector reservation table for a free sector. In this table, every sector
is represented by a single bit indicating whether or not the sector is allocated.
Although conceptually belonging to the file system, this table resides within
module Kernel.
Deallocation of a file’s sectors could occur as soon as the file is no longer
accessible, neither through a variable of any loaded module nor from the file
directory. However, this moment is difficult to determine. Therefore, the method
of garbage collection is used in Oberon for the deallocation of file space. In
consideration of the fact that file space is large and the collection of unused
sectors relatively time-consuming, we confine this process to system initialization.
It is represented by procedure FileDir.Init. At that time, the only referenced
files are those registered in the directory. Init therefore scans the entire directory
and records the sectors referenced in each file in the sector reservation table (see
Sect. 7.4).
For applications where system startup and initialization is supposed to
occur very infrequently, such as for server systems, a procedure Files.Purge
is provided. Its effect is to return the sectors used by the specified file to the
pool of free sectors. Evidently, the programmer then bears the responsibility to
guarantee that no references to the purged file continue to exist. This may be
possible in a closed server system, but files should not be purged under normal
circumstances, as a violation of said precondition will lead to unpredictable
disaster.
The following procedures used for allocating, deallocating, and marking
sectors in the sector reservation table are defined in module Kernel:
PROCEDURE AllocSector(hint: INTEGER; VAR sec: INTEGER); (*used in WriteByte*)
PROCEDURE MarkSector(sec: INTEGER); (*used in Init*)
PROCEDURE FreeSector(sec: INTEGER); (*used in Purge*)
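A sector reservation table with one bit per sector might be modelled as follows. This Python sketch is not the code of module Kernel: the class name, the method names, and the table size are assumptions, sector numbers are used directly, and the search is assumed to start just above the hint and to wrap around at the end of the table.

```python
class SectorMap:
    """One bit per sector: 1 = allocated, 0 = free."""
    def __init__(self, nof_sectors):
        self.n = nof_sectors
        self.bits = bytearray((nof_sectors + 7) // 8)

    def is_allocated(self, sec):
        return ((self.bits[sec // 8] >> (sec % 8)) & 1) == 1

    def mark_sector(self, sec):
        """Record sec as in use (as done for every referenced sector at initialization)."""
        self.bits[sec // 8] |= 1 << (sec % 8)

    def free_sector(self, sec):
        """Return sec to the pool of free sectors."""
        self.bits[sec // 8] &= ~(1 << (sec % 8)) & 0xFF

    def alloc_sector(self, hint):
        """Allocate and return the first free sector above hint, wrapping around."""
        sec = hint
        for _ in range(self.n):
            sec = (sec + 1) % self.n
            if not self.is_allocated(sec):
                self.mark_sector(sec)
                return sec
        raise RuntimeError("no free sector")
```

Garbage collection at initialization then amounts to starting with an empty table and calling mark_sector for every sector referenced by a registered file; whatever remains unmarked is free.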
In order to increase efficiency of access, riders have been provided with a field
containing the address of the element of the rider’s position. From the conditions
stated above for the allocation of buffers, it is evident that the value of this field
can be a hint only. This implies that there can be no reliance on its information.
Whenever it is used, its validity has to be checked. The check consists in a
comparison of the rider’s position r.apos with the hinted buffer’s actual position
r.buf.apos. If they differ, a buffer with the desired position must be searched
and, if not present, allocated. The advantage of the hint lies in the fact that the
hint is correct with a very high probability. The check is included in procedures
Read, ReadByte, Write, and WriteByte.
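The validity check of the hint follows a simple pattern, sketched below in Python. The classes and the function buffer_for are hypothetical; the limit on the number of buffers per file and the copying of modified buffers to disk are left out.

```python
class Buffer:
    def __init__(self, apos):
        self.apos = apos                 # sector position currently covered by this buffer


class File:
    def __init__(self):
        self.bufs = []                   # buffers currently allocated to the file


class Rider:
    def __init__(self, file, apos):
        self.file = file
        self.apos = apos                 # sector position of the rider
        self.buf = None                  # hint only, may be outdated


def buffer_for(r):
    """Return the buffer covering r.apos, trusting the hint only after checking it."""
    buf = r.buf
    if buf is None or buf.apos != r.apos:            # hint missing or invalid
        buf = next((b for b in r.file.bufs if b.apos == r.apos), None)
        if buf is None:                              # not present: allocate
            buf = Buffer(r.apos)
            r.file.bufs.append(buf)
        r.buf = buf                                  # refresh the hint
    return buf
```

In the common case the hint is valid, and the cost of the check is a single comparison. The search is needed only after a buffer has been reassigned to another sector or the rider has been repositioned.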
Some fields of the record types require additional explanations:
1. The length is stored in a “preprocessed” form, namely by the two integers
aleng and bleng such that aleng is a sector number and
The same holds for the form of the position in riders (apos, bpos).
2. The field nofbufs indicates the number of buffers in the list headed by
firstbuf:
1 ≤ nofbufs ≤ Maxbufs
3. Whenever data are written into a buffer, the file becomes inconsistent, i.e. the
data on the disk are outdated. The file is updated, i.e. the buffer is copied into
the corresponding disk sector, whenever the buffer is reallocated, e.g. during
sequential writing after the buffer is full and is “advanced”. During sequential
reading, a buffer is also advanced and reused, but need not be copied onto disk,
because it is still consistent. Whether a buffer is consistent or not is indicated by
its state variable mod (modified). Similarly, the field modH in the file descriptor
indicates whether or not the header had been modified.
4. The field sechint records the number of the last sector allocated to the file
and serves as a hint to the kernel’s allocation procedure, which allocates a next
sector with an address larger than the hint. This is a measure to gain speed in
sequential scans.
5. The buffer’s position is specified by its field apos. Used as index in the file
header’s sector table, it yields the sector corresponding to the current buffer
contents. The field lim specifies the number of bytes stored in the buffer.
Reading cannot proceed beyond this limiting index; writing beyond it implies
an increase in the file’s length. All buffers except the one for the last sector are
filled and specify lim = SS.
6. The hidden rider field buf is merely a hint to speed up localization of the
concerned buffer. A hint is likely, but not guaranteed to be correct. Its validity
must be checked before use. The buffer hint is invalidated when a buffer is
reallocated and/or a rider is repositioned.
The structure of riders remains practically the same as for files using main
store. The hidden field adr is merely replaced by a pointer to the buffer covering
the rider’s position. A configuration of a file f with two riders is shown in Fig. 7.2.
Some comments concerning module Files follow.
1. After the writing of a file has been completed, its name is usually registered in
the directory. Register invokes procedure Unbuffer. It inspects the associated
buffers and copies those onto disk which had been modified. During this process,
new index sectors may have to be transferred as well. If a file is to remain
anonymous and local to a module or command, i.e. is not to be registered, but
merely to be read, the release of buffers must be specified by an explicit call to
Close (meaning “close buffers”), which also invokes Unbuffer.
2. Procedure Old (and for reasons of consistency also New) deviates from the
general Oberon programming rule that an object be allocated by the calling
(instead of the called) module. This rule would suggest the statements
New(f); Files.Open(f, name)
instead of f := Files.Old(name). The justification for the rule is that any
extension of the type of f could be allocated, providing for more flexibility. And
the reason for our deviation in the case of files is that, upon closer inspection, not
a new file, but only a new descriptor is to be allocated. The distinction becomes
evident when we consider that several statements f := Files.Old(name) with
different f and identical name may occur, probably in different modules. In
this case, it is necessary that the same descriptor is referenced by the delivered
pointers in order to avoid file inconsistency. Each (opened) file must have exactly
one descriptor. When a file is opened, the first action is therefore to inspect
whether a descriptor of this file already exists. For this purpose, all descriptors
are linked together in a list anchored by the global variable root and linked by the
descriptor field next. This measure may seem to solve the problem of avoiding
inconsistencies smoothly. However, there exists a pitfall that is easily overlooked:
All opened files would permanently remain accessible via root, and the garbage
collector could never remove a file descriptor nor its associated buffers. This
would be unacceptable. In order to hide this list from the garbage collector, it
is represented by integers (addresses) instead of pointers.
3. Sector pointers are represented by sector numbers of type INTEGER.
Actually, we use the numbers multiplied by 29. This implies that any single-bit
error leads to a number which is not a multiple of 29, and hence can easily
be detected. Thereby the crucial sector addresses are software parity checked
and are safe (against single-bit errors) even on computers without hardware
parity check. The check is performed by procedures Kernel.GetSector and
Kernel.PutSector.
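The claim about multiples of 29 can be checked mechanically. Flipping one bit changes a number by plus or minus a power of 2, and no power of 2 is divisible by the odd prime 29. The following Python sketch uses hypothetical helper names (encode, is_valid, single_bit_errors_detected); it does not show how Kernel.GetSector and Kernel.PutSector actually perform the test.

```python
def encode(sector_no: int) -> int:
    """Represent a sector number as a multiple of 29."""
    return sector_no * 29


def is_valid(adr: int) -> bool:
    """A sector address is accepted only if it is a multiple of 29."""
    return adr % 29 == 0


def single_bit_errors_detected(sector_no: int, width: int = 32) -> bool:
    """True if flipping any one of the low `width` bits yields an invalid address."""
    adr = encode(sector_no)
    return all(not is_valid(adr ^ (1 << bit)) for bit in range(width))
```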
[1] R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes.
Acta Informatica, 1, 3, (1972), 173-189.
[2] D. Comer. The ubiquitous B-tree. ACM Comp Surveys, 11, 2, (June 1979), 121-137.
7.4 THE FILE DIRECTORY
A B-tree of height h and order 12 may contain the following minimal and
maximal number of elements:
height    minimum    maximum
1               0         24
2              25        624
3             625      15624
4           15625     390624
It follows that the height of the B-tree will never be larger than 4, if the
disk has a capacity of less than about 400 Mbyte, and assuming that each file
occupies a single 1K sector. It is rarely larger than 3 in practice.
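The maximum column can be recomputed from the order alone. A page of a B-tree of order N holds at most 2N elements and has at most 2N + 1 descendants, so a completely filled tree of height h holds (2N + 1)^h − 1 elements. The function name max_elements in this Python sketch is ours.

```python
def max_elements(order: int, height: int) -> int:
    """Largest number of elements in a B-tree of the given order and height."""
    fanout = 2 * order + 1             # descendants of a full page
    return fanout ** height - 1


# Order 12 corresponds to directory pages of at most 24 elements.
capacities = [max_elements(12, h) for h in range(1, 5)]
```

With one 1K sector per file, the 390624 entries of a tree of height 4 correspond to roughly 400 Mbyte of disk space, which is where the stated bound comes from.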
The definition of module FileDir shows the available directory operations.
Apart from the procedures Search, Insert, Delete, and Enumerate, it contains
some data definitions, and it should be considered as the non-public part of the
file system’s interface.
DEFINITION FileDir;
IMPORT SYSTEM, Kernel;
CONST
FnLength = 32; (*max length of file name*)
SecTabSize = 64; (*no. of entries in primary table*)
ExTabSize = 12;
SectorSize = 1024;
IndexSize = SectorSize DIV 4; (*no. of entries in index sector*)
HeaderSize = 352;
DirRootAdr = 29;
DirPgSize = 24; (*max no. of elements on page*)
DirEntry = RECORD
name: FileName;
adr, p: DiskAdr
END ;
DirPage = RECORD
mark: INTEGER;
m: INTEGER; (*no. of elements on page*)
p0: DiskAdr;
e: ARRAY DirPgSize OF DirEntry;
END ;
L := 0; R := m;
WHILE L < R DO
i := (L+R) DIV 2;
IF x <= e[i] THEN R := i ELSE L := i + 1 END
END;
IF (R < m) & (x = e[R]) THEN found END
The invariant is e[k] < x for all k < L, and x <= e[k] for all k with R <= k < m.
If the desired element is not found, the search continues on the appropriate
descendant page, if there is one. Otherwise the element is not contained in the
tree.
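For readers who wish to experiment, the search within a page can be transcribed into Python as follows. The function name search_page and the use of a plain list of keys in place of the array e are our assumptions; the loop itself mirrors the Oberon fragment.

```python
def search_page(e, x):
    """Binary search for x in the sorted list e.

    Returns (found, R), where R is the least index with x <= e[R],
    or len(e) if every element is smaller than x.
    """
    L, R = 0, len(e)
    while L < R:
        # Every e[k] with k < L is smaller than x,
        # and every e[k] with k >= R is at least x.
        i = (L + R) // 2
        if x <= e[i]:
            R = i
        else:
            L = i + 1
    found = R < len(e) and e[R] == x
    return found, R
```

If x is not on the page, R is still useful: it marks the position between two elements where x would belong, and thereby selects the descendant on which the search continues.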
Procedures insert and delete use the same algorithm for searching an
element within a page. However, they use recursion instead of iteration to
proceed along the search path of pages. We recall that the depth of recursion
is at most four. The reason for the use of recursion is that it facilitates the
formulation of structural changes, which are performed during the “unwinding”
of recursion, i.e. on the return path. First, the insertion point (respectively
the position of the element to be deleted) is searched, and then the element is
inserted (deleted).
Upon insertion, the number of elements on the insertion page may become
larger than 2N, violating B-tree condition 1. This situation is called page
overflow. The invariant must be reestablished immediately. It could be achieved
by moving one element from either end of the array e onto a neighbouring page.
However, we choose not to do this, and instead to split the overflowing page into
two pages immediately. The process of a page split is visualized by Fig. 7.4, in
which we distinguish between three cases, namely R < N, R = N, and R > N,
where R marks the insertion point. a denotes the overflowing, b the new page,
and u the inserted element.
The 2N + 1 elements (2N from the full page a, plus the one element u to be
inserted) are equally distributed onto pages a and b. One element v is pushed
up in the tree. It must be inserted in the ancestor page of a. Since that page
obtains an additional descendant, it must also obtain an additional element in
order to maintain B-tree rule 2.
A page split may thus propagate, because the insertion of element v in the
ancestor page may require a split once again. If the root page is full, it is split
too, and the emerging element v is inserted in a new root page containing a
single element. This is the only way in which the height of a B-tree can increase.
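The effect of a page split on the keys can be sketched in a few lines of Python. This is a simplification under stated assumptions: pages are plain sorted lists of keys, descendant pointers are ignored, and the 2N + 1 keys are simply sorted, whereas an actual implementation would move elements directly according to the three cases of the insertion point R. The name split_page is hypothetical.

```python
def split_page(elems, u, N):
    """Split a full page of 2N sorted keys when key u is inserted.

    Returns (a, v, b): the N keys remaining on page a, the key v that
    is pushed up to the ancestor page, and the N keys of the new page b.
    """
    assert len(elems) == 2 * N, "only a full page overflows"
    merged = sorted(elems + [u])       # the 2N + 1 keys in order
    return merged[:N], merged[N], merged[N + 1:]
```

Note that when u lands exactly in the middle (the case R = N), it is u itself that moves up to the ancestor page.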
k = (b.m − N + 1) ÷ 2
elements. The process of page balancing then distributes the elements of
the underflowing and its neighbouring page equally to both pages (see procedure
underflow).
However, if (and only if) the neighbouring page has no elements to spare, the
two pages can and must be united. This action, called page merge, places the N −
1 elements from the underflowing page, the N elements from the neighbouring
page, plus one element from the ancestor page onto a single page of size 2N .
One element must be taken from the ancestor page, because that page loses
one descendant and invariant rule 2 must be maintained. The events of page
balancing and merging are illustrated in Fig 7.5. a is the underflowing page, b
its neighbouring page, and c their ancestor; s is the position in the ancestor page
of (the pointer to the) underflowing page a. Two cases are distinguished, namely
whether the underflowing page is the rightmost element (s = c.m) or not (see
procedure underflow).
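The choice between balancing and merging depends only on the element counts. The following Python sketch computes the resulting page sizes; it deals with counts only, not with the movement of elements. The function name rebalance is ours, and the count b.m − k for the neighbouring page after balancing is our inference from the element counts, not taken from the listing.

```python
def rebalance(a_m: int, b_m: int, N: int):
    """Decide how an underflowing page a (a_m = N-1 elements) is repaired.

    b_m is the number of elements on the neighbouring page b.
    Returns (action, new size of a or of the united page, new size of b).
    """
    assert a_m == N - 1, "page a underflows by exactly one element"
    k = (b_m - N + 1) // 2             # elements that b can spare
    if k > 0:
        return ("balance", a_m + k, b_m - k)
    # b holds only N elements: unite a, b and one element from the ancestor
    return ("merge", a_m + 1 + b_m, 0)
```

For N = 3, a neighbour with 6 elements yields two pages of 4 elements each, whereas a neighbour with 3 elements forces a merge into one full page of 6.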
MODULE BTree;
IMPORT Texts, Oberon;
CONST N = 3;
TYPE Page = POINTER TO PageRec;
Entry = RECORD
key, data: INTEGER;
p: Page
END ;
PageRec = RECORD
m: INTEGER; (*no. of entries on page*)
p0: Page;
e: ARRAY 2*N OF Entry
END ;
VAR root: Page; W: Texts.Writer;
c.e[s].p := a; a.m := N-1+k; h := FALSE ELSE (*merge pages a and b, discard a*)
c.e[s].p := a.p0; b.e[N] := c.e[s]; i := 0;
WHILE i < N-1 DO b.e[i+N+1] := a.e[i]; INC(i) END ;
b.m := 2*N; DEC(c.m); h := c.m < N
END
END
END underflow;
From this sketch we may conclude that during the process of traversal the tree
structure must not change, because the function NextEntry quite evidently
relies on the structural information stored in the elements of the structure itself.
Hence, the actions of the parametric procedure must not affect the tree structure.
Enumeration must not be used, for example, to delete a given set of files. In
order to prevent the misuse of the indispensable facility of element enumeration,
the interface of FileDir is not available to users in general.
The handling of the directory stored on disk follows exactly the same al-
gorithms. The accessed pages are fetched from the disk as a whole (each page
fits onto a single disk sector) and stored in buffers of type DirPage, from where
individual elements can be accessed. In principle, these buffers can be local to
procedures insert and delete. A single buffer is allocated globally, namely the
one used by procedure Search. The reason for this exception is not only that
iterative searching requires one buffer only, but also that procedure Files.Old
and in turn Search may be called when the processor is in the supervisor mode
and hence uses the system- (instead of the user-) stack, which is small and would
not accommodate sector buffers.
Naturally, an updated page needs to be stored back onto disk. Omission of
sector restoration is a programming error that is very hard to diagnose, because
some parts of the program are executed very rarely, and hence the error may
look sporadic and mistakenly be attributed to malfunctioning hardware.
Oberon’s file directory represents a single, ordered set of name-file pairs. It
is therefore also called a flat directory. Its internal tree structure is not visible
to the outside. In contrast, some file systems use a directory with a visible tree
structure, notably UNIX. In a search, the name (key) guides the search path;
the name itself displays structure, in fact, it is a sequence of names (usually
separated by slashes or periods). The first name is then searched in the root
directory, whose descendants are not files but subdirectories. The process is
repeated, until the last name in the sequence has been used (and hopefully
denotes a file).
Since the search path length in a tree increases with the logarithm of the
number of elements, any subdivision of the tree inherently decreases performance
since log (m + n) < log (m) + log (n) for any m, n > 2. It is justified only if there
exist sets of elements with common properties. If these property values are stored
once, namely in the subdirectory referencing all elements with common property
values, instead of in every element, not only a gain in storage economy results,
but possibly also in accesses which depend on those properties. The common
properties are typically an owner’s name, a password, and access rights (read
7.5 THE TOOLBOX OF FILE UTILITIES
one or several asterisks (wild cards), the test consists of a sequence of searches
of the pattern parts (separated by the asterisks) in the file name. In order to
reduce the number of calls of List, Enumerate is called with the first part of
the pattern as parameter prefix. Enumeration then starts with the least name
having the specified prefix, and terminates as soon as all names with this prefix
have been scanned.
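The combination of prefix enumeration and sequential part search can be sketched in Python as follows. The names matches and list_files are hypothetical, the directory is modelled as a sorted list of names, and we assume that a pattern not ending in an asterisk must match the end of the name; the text does not state how the last part is anchored.

```python
import bisect


def matches(pattern: str, name: str) -> bool:
    """Test name against a pattern whose parts are separated by asterisks."""
    parts = pattern.split("*")
    if len(parts) == 1:                # no wild card: names must be equal
        return name == pattern
    if not name.startswith(parts[0]):
        return False
    pos = len(parts[0])
    for part in parts[1:-1]:           # search the inner parts left to right
        idx = name.find(part, pos)
        if idx < 0:
            return False
        pos = idx + len(part)
    last = parts[-1]                   # assumption: last part anchors at the end
    return len(name) - pos >= len(last) and name.endswith(last)


def list_files(sorted_names, pattern):
    """Enumerate only the names sharing the pattern's first part as prefix."""
    prefix = pattern.split("*")[0]
    start = bisect.bisect_left(sorted_names, prefix)   # least name >= prefix
    result = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break                      # all names with this prefix are scanned
        if matches(pattern, name):
            result.append(name)
    return result
```

Because the names are ordered, all names with a common prefix are adjacent, which is why the scan may stop at the first name lacking the prefix.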
If the specified pattern is followed by an option directive “!”, then not
only file names are listed, but also the listed files’ creation date and length.
This requires that not only the directory sectors on the disk are traversed, but
that additionally for each listed file its header sector must be read. The two
procedures use the global variables pat and diroption.
CHAPTER 8
pointer to the descriptor of its type called the type tag. It is used by the garbage
collector.
Unfortunately, the number of distinct spaces is larger than two. If it were
two, no arbitrary size limitation would be necessary; merely the sum of their sizes
would be inherently limited by the size of the store. In the case of three spaces,
arbitrarily determined size limits are unavoidable. Address-mapping hardware
can alleviate (and delegate) this problem using a virtual address space which is
so large that limits will hardly ever be reached.
Such a scheme is implemented by tables mapping virtual into physical ad-
dresses, requiring multiple memory accesses for every reference. Of course, the
need for a double or a triple access for every memory reference is avoided
by a translation cache in the (hardware) unit. Nevertheless, a decrease in
performance is unavoidable for each cache miss. Furthermore, an additional
subcycle is required for every access in order to look up the cached translation
table. Without a virtual address scheme, each module block must consist of an
integral number of physically adjacent pages. Holes generated by the release of
modules must be reused. We employ the simple scheme of marking the released
space as a hole, and of allocating a new block in the first hole encountered
that is large enough (first-fit strategy). Considering the relative infrequency of
module releases, efforts to improve the strategy are not worth the resulting added
complexity.
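A first-fit allocator over a list of holes takes only a few lines. The sketch below is in Python and rests on assumptions of ours: holes are kept as (address, length) pairs in address order, and the unused remainder of a hole stays in the list as a smaller hole. The name first_fit is hypothetical.

```python
def first_fit(holes, size):
    """Allocate size units from the first hole that is large enough.

    holes is a list of (address, length) pairs and is updated in place.
    Returns the address of the allocated block, or None if nothing fits.
    """
    for i, (adr, length) in enumerate(holes):
        if length >= size:
            if length == size:
                del holes[i]                          # hole used up entirely
            else:
                holes[i] = (adr + size, length - size)  # keep the remainder
            return adr
    return None
```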
It is remarkable that the code for module allocation and release without
virtual addressing is only marginally more complicated than with it. The only
remaining advantages of an MMU are a better storage utilization, because no
holes occur (a negligible advantage), and that inadvertent references to unloaded
modules, e.g. via installed procedures, lead to an invalid address trap.
It is worth recalling that the concept of address mapping was introduced
as a requirement for virtual memory implemented with disks as backing store,
where pages could be moved into the background in order to obtain space for
newly required pages, and could then be retrieved from disk on demand, i.e.
when access was requested. This scheme is called demand paging. It is not used
in the Oberon system, and one may fairly state that demand paging has lost its
significance with the availability of large, primary stores.
Experience in the use of the RISC predecessor Ceres leads to the conclusion
that whereas address translation through an MMU was an essential feature for
multi-user operating systems, it constitutes a dispensable overkill for single-user
workstations. The fact that modern semiconductor technology made it possible
to integrate the entire translation and caching scheme into a single chip, or
even into the processor itself, led to the hiding (and ignoring) of the scheme’s
considerable complexity. Its side effects on execution speed are essentially
unpredictable. This makes systems with an MMU virtually unusable for applications
with tight real-time constraints. The RISC processor does indeed not feature an
address mapping unit.
The RISC processor features 16 registers (of 32 bits). R0–R11 are
used for expression evaluation. R12–R15 have fixed, system-wide usage:
8.2 MANAGEMENT OF DYNAMIC STORAGE
Fig. 8.2 Allocation of dynamic variable p in the heap by procedure NEW(p)
We note that only three local variables are required, independent of the size of
the tree to be traversed. The third, r, is in fact merely an auxiliary variable
to perform the rotation of values p.L, p.R, q, and p as shown in Fig. 8.3. A
snapshot of a tree traversal is shown in Fig. 8.4.
The pair p, q of pointers marks the position of the process. The algorithm
traverses the tree in a left to right, depth first fashion. When it returns to the
root, all nodes have been marked.
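A traversal of this kind, using only the three variables p, q and r and a rotation of the values p.L, p.R, q and p, can be sketched in Python as follows. This is our reconstruction of the well-known pointer-rotation technique and not necessarily identical to the procedure discussed here; in particular the visit counter m, which reaches 3 for every node, is an assumption.

```python
class Node:
    def __init__(self, L=None, R=None):
        self.L, self.R = L, R
        self.m = 0                     # visit counter; 0 means unmarked


def traverse(root):
    """Mark all nodes of a binary tree without recursion or a stack."""
    if root is None:
        return
    p = q = root
    while True:
        p.m += 1                       # mark (each node is visited 3 times)
        if p.L is not None:
            # rotate p.L, p.R, q, p: descend or return via p.L
            r = p.L
            p.L = p.R
            p.R = q
            q = p
            p = r
        else:
            # no node to move to: rotate the fields and stay on p
            p.L = p.R
            p.R = q
            q = None
        if p is q:                     # back at the root for the last time
            break
```

After three visits the fields L and R of every node hold their original values again, so the tree is left unchanged apart from the marks.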
How are these claims convincingly supported? The best way is by analyzing
the algorithm at an arbitrary node. We start with the hypothesis H that, given
the initial state P, the algorithm will reach state Q (see Fig. 8.5).
State Q differs from P by the node and its descendants B and C having
been marked, and by an exchange of p and q. We now apply the algorithm to
state P, assuming that B and C are not empty. The process is illustrated in Fig.
8.5. P 0 stands for P in Fig. 8.4.
We now modify the algorithm of tree traversal to the case where the structure
is not confined to a binary tree, but may be a tree of any degree, i.e. each node
may have any number n of descendants. For practical purposes, however, we
restrict n to be in the range 0 ≤ n ≤ N, and therefore may represent all nodes
by the type
Node = RECORD
m, n: INTEGER;
dsc: ARRAY N OF Node
END
In principle, the binary tree traversal algorithm might be adopted almost with-
out change, merely extending the rotation of pointers from p.L, p.R, q, p to
p.dsc[0], ..., p.dsc[n − 1], q, p. However, this would be an unnecessarily inefficient
solution. The following is a more effective variant. Moreover, it caters for the
case of inhomogeneous graphs, where different nodes have different numbers of
descendants. The key lies in associating with every node, in addition to the tag,
a second private field mk. It serves two purposes. The first is as a mark, with
mk > 0 indicating that the node had been visited. The second is to store the
address of the next descendant to be visited. The underlying data structure is
shown in Figure 8.8. Type descriptors consist of the following fields:
• size in bytes, of the described record type,
• base a table of pointers to the descriptors of the base types (3 elements only)
• offsets of the descendant pointers in the described type (1 word each)
We note that the mark value, starting with zero (unmarked), is used as a counter
of descendants already traversed, and hence as an index to the descendant field
to be processed next. The algorithm can be applied not only to trees, but to
arbitrary structures, including circular ones, if the continuation condition p #
0 (actually p >= heapOrg) is extended to (p >= heapOrg) & (offadr = 0).
This causes a descendant that is already marked to be skipped. Here the array
M stands for the entire memory.
The mark is included in each record’s hidden prefix. The prefix takes 2 words
only; the first is used for the tag. The other is reserved for the garbage collector
and used as mark and offset address. The end of the list of descendant pointers
is marked by an entry with value -1. And finally, assignments involving M are
expressed as
SYSTEM.GET(a, x) for x := M[a]
SYSTEM.PUT(a, x) for M[a] := x
The scan phase is performed by a relatively straight-forward algorithm.
The heap, i.e. the storage area between HeapOrg and HeapLimit (the latter
is a variable), is scanned element by element, starting at HeapOrg. Elements
marked are unmarked, and unmarked elements are freed by linking them into
the appropriate list of available space.
As the heap may always contain free elements, the scan phase must be able
to recognize them in order to skip them or merge them with an adjacent free
element. For this purpose, the free elements are also considered as prefixed. The
prefix serves to determine the element’s size and to recognize it as free due to a
special (negative) mark value. The encountered mark values and the action to
be taken are:
mk value   state      action
= 0        unmarked   collect, mark free
> 0        marked     unmark
< 0        free       skip or merge
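The three cases of the scan phase translate directly into a loop over the heap elements. The Python sketch below models the heap as a list of [size, mk] pairs in address order. The constant FREE = -1, the function name scan, and the use of one result list instead of several lists of available space are simplifying assumptions of ours.

```python
FREE = -1                              # assumed value of the special negative mark


def scan(blocks):
    """One scan over the heap, given as [size, mk] pairs in address order.

    Marked elements are unmarked, unmarked elements are freed, and
    adjacent free elements are merged. Returns the new list of elements.
    """
    out = []
    for size, mk in blocks:
        if mk > 0:
            out.append([size, 0])      # marked: element survives, unmark it
        elif out and out[-1][1] == FREE:
            out[-1][0] += size         # free or collected: merge with neighbour
        else:
            out.append([size, FREE])   # free or collected: start a free element
    return out
```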
Fig. 8.9 Encoding of date and time (year starting with 2000)
PROCEDURE ResetDisk;
PROCEDURE MarkSector(sec: INTEGER);
PROCEDURE FreeSector(sec: INTEGER);
PROCEDURE AllocSector(hint: INTEGER; VAR sec: INTEGER);
PROCEDURE GetSector(src: INTEGER; VAR dest:
PROCEDURE Clock(): INTEGER;
PROCEDURE SetClock(dt: INTEGER);
PROCEDURE Install(adr, procadr: INTEGER);
PROCEDURE Init;
END Kernel.
DEVICE DRIVERS
9.1. OVERVIEW
Device drivers are collections of procedures that constitute the immediate
interface between hardware and software. They refer to those parts of the com-
puter hardware that are usually called peripheral. Computers typically contain
a system bus which transmits data among its different parts. Processor and
memory are considered as its internal parts; the remaining parts, such as disk,
keyboard, display, etc, are considered as external or peripheral, notwithstanding
the fact that they are often contained in the same cabinet or board.
Such peripheral devices are connected to the system bus via special registers
(data buffers) and transceivers (switches, buffers in the sense of digital electron-
ics). These registers and transceivers are addressed by the processor in the
same way as memory locations—they are said to be memory-mapped—and they
constitute the hardware interface between processor bus and device. References
to them are typically confined to specific driver procedures which constitute the
software interface.
Drivers are inherently hardware specific, and the justification of their exis-
tence is precisely that they encapsulate these specifics and present to their clients
an appropriate abstraction of the device. Evidently, this abstraction must still
reflect the essential characteristics of the device, but not the details (such as
the addresses of its interface registers).
Our justification to present the drivers connecting the Oberon system with
the RISC computer in detail is on the one hand the desire for completeness. But
on the other hand it is also in recognition of the fact that their design represents
an essential part of the engineering task in building a system. This part may
look trivial from a conceptual point of view; it certainly is not so in practice.
In order to reduce the number of interface types, standards have been
established. The RISC computer also uses such interface standards, and we
will concentrate on them in the following presentations. The following devices
are presented:
1. The Keyboard is considered as a serial device delivering one byte of input
data per key stroke. It is connected by a serial line according to the PS/2 and
ASCII (American Standard Code for Information Interchange) standards. The
software is contained in module Input (Sect. 9.2), and the hardware is explained
in Sect. 17.2.1.
2. The mouse is a pointing device delivering coordinates in addition to key
states. The software is also part of module Input (Sect. 9.2).
9.2 KEYBOARD AND MOUSE
is necessary. This is so, because modern keyboards treat all keys in the same
way, including the ones for upper case, control, alternative, etc. Separate codes
are sent to signal the pushing down and the release of a key, followed by another
code identifying which key had been pressed or released. This requires, besides
a translation table from codes to characters, a set of state variables. They are
the global, Boolean variables Recd, Up, Shift, Ctrl, and Ext. Procedure Peek
determines whether an actual character is present, or merely a code signalling a
key shift. Peek controls the state.
Procedure Mouse fetches a word from the mouse interface register and de-
composes it into its components (key state and coordinates). (kb is the bit
indicating whether a code had been received from the keyboard).
THE COMPILER
12.1. INTRODUCTION
The compiler is the primary tool of the system builder. It therefore plays a
prominent role in the Oberon System, although it is not part of the basic system.
Instead, it constitutes a tool module—an application—with a single command:
Compile. It translates program texts into machine code. Therefore, it is as a
program inherently machine-dependent; it acts as the interface between source
language and target computer.
In order to understand the process of compilation, the reader needs to be
familiar with the source language Oberon defined in Appendix 1, and with the
target computer RISC, defined in Appendix 2.
The language is defined as an infinite set of sequences of symbols taken
from the language’s vocabulary. It is described by a set of equations called
syntax. Each equation defines a syntactic construct, or more precisely, the set of
sequences of symbols belonging to that construct. It specifies how that construct
is composed of other syntactic constructs. The meaning of programs is defined
in terms of semantic rules governing each such construct.
Compilation of a program text proceeds by analyzing the text and thereby
decomposing it recursively into its constructs according to the syntax. When
a construct is identified, code is generated according to the semantic rule asso-
ciated with the construct. The components of the identified construct supply
parameters for the generated code.
It follows that we distinguish between two kinds of actions: analyzing steps
and code generating steps. In a rough approximation we may say that the former
are source language dependent and target computer independent, whereas the
latter are source language independent and target computer dependent. Al-
though reality is somewhat more complex, the module structure of this compiler
clearly reflects this division. The main module of the compiler is ORP (for Oberon-
to-RISC Parser). It is primarily dedicated to syntactic analysis, parsing. Upon
recognition of a syntactic construct, an appropriate procedure is called in the code
generator module ORG (for Oberon-to-RISC Generator). Apart from parsing, ORP
checks for type consistency of operands, and it computes the attributes of objects
identified in declarations.
Whereas ORP mirrors the source language and is independent of a target
computer, ORG reflects the target computer, but is independent of the source
language.
In spite of the high degree of regularity of the target computer, the process of
code generation is more complicated, as shown by module ORG.
The resulting module structure of the compiler is shown in Fig. 12.1 in a
slightly simplified manner. In reality OCS is imported by all other modules due
to their need for procedure OCS.Mark. This, however, will be explained later.
The size of an array is the size of the element type multiplied by the number of
elements. The size of a record is the sum of the sizes of its fields.
A complication arises due to so-called alignment. By alignment is meant
the adjustment of an address to a multiple of the variable’s size. Alignment is
performed for variable addresses as well as for record field offsets. The motivation
for alignment is the avoidance of double memory references for variables being
“distributed” over two adjacent words. Proper alignment enhances processing
speed quite significantly. Variable allocation using alignment is shown by the
example in Fig. 12.2.
We note in passing that a reordering of the four variables lessens the number of
unused bytes, as shown in Fig. 12.3.
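The effect of alignment on allocation, and the gain from reordering, can be reproduced with a short Python sketch. The functions align and allocate are hypothetical names. We assume that a CHAR occupies 1 byte and that INTEGER, REAL and SET occupy 4 bytes each, and the variable sequences used here are our own examples, not those of the figures.

```python
def align(adr: int, size: int) -> int:
    """Round adr up to the next multiple of size."""
    return (adr + size - 1) // size * size


def allocate(sizes):
    """Assign aligned offsets to variables of the given sizes, in order.

    Returns (offsets, total) where total is the first unused address.
    """
    offsets, adr = [], 0
    for size in sizes:
        adr = align(adr, size)
        offsets.append(adr)
        adr += size
    return offsets, adr


mixed = allocate([1, 4, 1, 4])         # e.g. CHAR, INTEGER, CHAR, INTEGER
sorted_by_size = allocate([4, 4, 1, 1])
```

In the first order six bytes remain unused between the variables, in the second none.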
MODULE Pattern1;
VAR ch: CHAR; 0
k: INTEGER; 4
x: REAL; 8
s: SET; 12
BEGIN module entry code
ch := "0"; 40000030 MOV R0 R0 30H
B0D00000 STR R0 SB 0
k := 10; 4000000A MOV R0 R0 10
A0D00004 STR R0 SB 4
x := 1.0; 60003F80 MOV' R0 R0 3F800000H
A0D00008 STR R0 SB 8
s := {0, 4, 8} 40000111 MOV R0 R0 111H
A0D0000C STR R0 SB 12
END Pattern1. module exit code
MODULE Pattern2;
VAR i, j, k, n: INTEGER; 0, 4, 8, 12
x, y: REAL; 16, 20
s, t, u: SET; 24, 28, 32
BEGIN
i := (i + 1) * (i - 1); LDR R0 SB 0
ADD R0 R0 1
LDR R1 SB 0
SUB R1 R1 1
MUL R0 R0 R1
STR R0 SB 0
k := k DIV 17; LDR R0 SB 8
DIV R0 R0 17
STR R0 SB 8
k := 8*n; LDR R0 SB 12
LSL R0 R0 3
STR R0 SB 8
k := n DIV 2; LDR R0 SB 12
ASR R0 R0 1
STR R0 SB 8
k := n MOD 16; LDR R0 SB 12
AND R0 R0 15
STR R0 SB 8
x := -y / (x - 1.0); LDR R0 SB 16
MOV' R1 R0 3F80H
FSB R0 R0 R1
LDR R1 SB 20
FDV R0 R1 R0
MOV R1 R0 0
FSB R0 R1 R0
STR R0 SB 16
s := s + t * u LDR R0 SB 28
LDR R1 SB 32
AND R0 R0 R1
LDR R1 SB 24
OR R0 R1 R0
STR R0 SB 24
END Pattern2.
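The listing above replaces multiplication by 8, division by 2, and the remainder modulo 16 by a shift or a mask (LSL, ASR, AND). These substitutions hold for negative operands too, provided DIV and MOD are defined by flooring, as they are in Oberon. Python's // and % also floor, so the equivalences can be checked directly; the function names below are ours.

```python
def mul_pow2(n: int, k: int) -> int:
    """n * 2**k by a left shift (LSL)."""
    return n << k


def div_pow2(n: int, k: int) -> int:
    """n DIV 2**k by an arithmetic right shift (ASR); rounds toward minus infinity."""
    return n >> k


def mod_pow2(n: int, k: int) -> int:
    """n MOD 2**k by masking the low k bits (AND); result is never negative."""
    return n & ((1 << k) - 1)
```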
Pattern 3: Indexed variables: References to elements of arrays make use of the
possibility to add an index value to an offset. The index must be present in a
register and be multiplied by the size of the array elements. (For integers with
size 4 this is done by a shift of 2 bits). Then this index is checked whether it
lies within the bounds specified in the array’s declaration. This is achieved by a
comparison, actually a subtraction, and a subsequent branch instruction causing
a trap, if the index is either negative or beyond the upper bound.
If the reference is to an element of a multi-dimensional array (matrix),
its address computation involves several multiplications and additions. The
address of an element $A[i_{k-1}, \ldots, i_1, i_0]$ of a k-dimensional array A with lengths
$n_{k-1}, \ldots, n_1, n_0$ is
$$adr(A) + ((\ldots((i_{k-1} \cdot n_{k-2}) + i_{k-2}) \cdot n_{k-3} + \ldots) \cdot n_1 + i_1) \cdot n_0 + i_0$$
Note that for index checks CMP is written instead of SUB to mark that the
subtraction is merely a comparison, that the result remains unused and only
the condition flag registers hold the result.
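The address computation, including the index checks, can be reproduced in Python. The names checked and element_address are hypothetical; we assume 4-byte elements and model the single comparison as an unsigned one, so that a negative index appears as a very large number and is rejected by the same test as an index beyond the upper bound.

```python
def checked(i: int, n: int) -> int:
    """Index check by one unsigned comparison against the length n."""
    if (i & 0xFFFFFFFF) >= n:          # negative i wraps to a huge value
        raise IndexError("index out of bounds")
    return i


def element_address(base: int, lengths, indices, elem_size: int = 4) -> int:
    """Offset of A[indices] for an array with the given lengths per dimension."""
    off = 0
    for i, n in zip(indices, lengths):
        off = off * n + checked(i, n)  # one multiplication and addition per index
    return base + off * elem_size
```

With the offsets of module Pattern3 (a at 16, x at 56, y at 456), a[5] lies at 36 and y[3, 4, 5] at 1836, in agreement with the constants that the compiler computes for these references.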
MODULE Pattern3;
VAR i, j, k, n: INTEGER; 0, 4, 8, 12
a: ARRAY 10 OF INTEGER; 16
x: ARRAY 10, 10 OF INTEGER; 56
y: ARRAY 10, 10, 10 OF INTEGER; 456
BEGIN
k := a[i]; LDR R0 SB 0
CMP R1 R0 10
BLHI R12
LSL R0 R0 2
ADD R0 SB R0
LDR R0 R0 16
STR R0 SB 8
n := a[5]; LDR R0 SB 36
STR R0 SB 12
x[i, j] := 2; LDR R0 SB 0
CMP R1 R0 10
BLHI R12
MUL R0 R0 40
ADD R0 SB R0
LDR R1 SB 4
CMP R2 R1 10
BLHI R12
LSL R1 R1 2
ADD R0 R0 R1
MOV R1 R0 2
STR R1 R0 56
y[i,j,k]:=3; LDR R0 SB 0
CMP R1 R0 10
BLHI R12
MUL R0 R0 400
ADD R0 SB R0
LDR R1 SB 4
CMP R2 R1 10
BLHI R12
MUL R1 R1 40
ADD R0 R0 R1
LDR R1 SB 8
CMP R2 R1 10
BLHI R12
LSL R1 R1 2
ADD R0 R0 R1
MOV R1 R0 3
STR R1 R0 456
y[3, 4, 5] := 6 MOV R0 R0 6
STR R0 SB 1836
END Pattern3.
Pattern 4: Record fields and pointers: Fields of records are accessed by com-
puting the sum of the record’s (base) address and the field’s offset. If the record
variable is statically declared, the sum is computed by the compiler.
MODULE Pattern4;
TYPE Ptr = POINTER TO Node;
Node = RECORD num: INTEGER; 0
MODULE Pattern5;
VAR n: INTEGER; s: SET; 0,4
BEGIN
IF n = 0 THEN LDR R0 SB 0
CMP R0 R0 0
BNE 3
INC(n) LDR R0 SB 0
ADD R0 R0 1
STR R0 SB 0
END ;
IF (n >= 0) & (n < 100) THEN LDR SB R0 ...
LDR R0 SB 0 (n)
CMP R0 R0 0
BLT 6
LDR R0 SB 0
CMP R0 R0 100
BGE 3
DEC(n) LDR R0 SB 0
SUB R0 R0 1
STR R0 SB 0
END ;
IF ODD(n) OR (n IN s) THEN LDR SB R0 ...
LDR R0 SB 0 (n)
AND R0 R0 1
BNE 5
LDR R0 SB 4 (s)
LDR R1 SB 0
ADD R1 R1 1
ROR R0 R0 R1
BPL 2
n := -1000 MOV R0 R0 -1000
STR R0 SB 0
END ;
IF n < 0 THEN LDR SB R0 ...
LDR R0 SB 0
CMP R0 R0 0
BGE 3
s := {} MOV R0 R0 0 {}
STR R0 SB 4
B 17
ELSIF n < 10 THEN LDR SB R0 ...
LDR R0 SB 0
CMP R0 R0 10
BGE 3
s := {0} MOV R0 R0 1
STR R0 SB 4
B 10
ELSIF n < 100 THEN LDR SB R0 ...
LDR R0 SB 0
CMP R0 R0 100
BGE 3
s := {1} MOV R0 R0 2
STR R0 SB 4
B 3
ELSE
s := {2} MOV R0 R0 4
LDR SB R0 ...
STR R0 SB 4
END
END Pattern5.
Pattern 6: While and repeat statements.
MODULE Pattern6;
VAR i: INTEGER;
BEGIN i := 0; MOV R0 R0 0
STR R0 SB 0
WHILE i < 10 DO LDR SB R0 ...
LDR R0 SB 0
CMP R0 R0 10
BGE 4
i := i + 2 LDR R0 SB 0
ADD R0 R0 2
STR R0 SB 0
END ; B -8
REPEAT i := i - 1 LDR SB R0 ...
LDR R0 SB 0
SUB R0 R0 1
STR R0 SB 0
UNTIL i = 0 LDR R0 SB 0
CMP R0 R0 0
BNE -7
END Pattern6.
Pattern 7: For statements.
MODULE Pattern7;
VAR i, m, n: INTEGER;
BEGIN
END Pattern8.
Pattern 9: Function procedures. They are handled in exactly the same manner
as proper procedures, except that a result is returned in register R0. If the
function is called in an expression at a place where intermediate results are held
in registers, these values are put onto the stack before the call, and they are
restored after return (not shown here).
MODULE Pattern9;
VAR x: REAL;
PROCEDURE F(x: REAL): REAL;
BEGIN SUB SP SP 8
STR LNK SP 0 push ret adr
STR R0 SP 4 push x
IF x >= 1.0 THEN LDR R0 SP 4
MOV' R1 R0 3F80H
FSB R0 R0 R1
BLT 4
x := F(F(x)) LDR R0 SP 4
BL -9
BL -10
STR R0 SP 4
END ;
RETURN x LDR R0 SP 4
END F; LDR LNK SP 0 pop ret adr
ADD SP SP 8
B R15
END Pattern9.
MODULE Pattern10;
VAR a: ARRAY 12 OF INTEGER;
PROCEDURE P(x: ARRAY OF INTEGER);
VAR i, n: INTEGER;
BEGIN SUB SP SP 20
STR LNK SP 0
STR R0 SP 4 x
STR R1 SP 8 x.len
n := x[i]; LDR R0 SP 12 i
LDR R1 SP 8 x.len
CMP R2 R0 R1
BLHI R12
LSL R0 R0 2
LDR R1 SP 4 x
ADD R0 R1 R0
LDR R0 R0 0
STR R0 SP 16
x[i+1] := n+5 LDR R0 SP 12 i
ADD R0 R0 1
LDR R1 SP 8 x.len
CMP R2 R0 R1
BLHI R12
LSL R0 R0 2
LDR R1 SP 4 x
ADD R0 R1 R0
LDR R1 SP 16 n
ADD R1 R1 5
STR R1 R0 0
END P; LDR LNK SP 0
ADD SP SP 20
B R15
BEGIN P(a); ADD R0 SB 0 a
MOV R1 R0 12 a.len
BL -29
END Pattern10.
Pattern 11: Sets. This code pattern exhibits the construction of sets. If the
specified elements are constants, the set value is computed by the compiler. Oth-
erwise, sequences of move and shift instructions are used. Since shift instructions
do not check whether the shift count is within sensible bounds, the results are
unpredictable, if elements outside the range 0 .. 31 are involved.
MODULE Pattern11;
VAR s: SET; m, n: INTEGER;
BEGIN
s := {m}; LDR R0 SB 4 m
MOV R1 R0 1
LSL R0 R1 R0
STR R0 SB 0 s
s := {0 .. n}; LDR R0 SB 8 n
MOV R1 R0 -2
LSL R0 R1 R0
XOR R0 R0 -1
STR R0 SB 0
s := {m .. 31}; LDR R0 SB 4 m
MOV R1 R0 31
MOV R2 R0 -2
LSL R1 R2 R1
MOV R2 R0 -1
LSL R0 R2 R0
XOR R0 R0 R1
STR R0 SB 0 s
s := {m .. n}; LDR R0 SB 4 m
LDR R1 SB 8 n
MOV R2 R0 -2
LSL R1 R2 R1
MOV R2 R0 -1
LSL R0 R2 R0
XOR R0 R0 R1
STR R0 SB 0 s
IF n IN {2,3,5,7,11,13} THEN MOV R0 R0 28ACH
LDR R1 SB 8
ADD R1 R1 1
ASR' R0 R0 R1
BPL 2
m := 1 MOV R0 R0 1
STR R0 SB 4 m
END
END Pattern11.
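The shift-and-XOR sequences of Pattern 11 can be checked with a small model. The following Python sketch is not part of the compiler; it merely emulates 32-bit registers and reproduces the instruction sequences for {m} and {m .. n}. The helper names lsl, set_single, set_range and elements are hypothetical, and all elements are assumed to lie in the range 0 .. 31.

```python
MASK = 0xFFFFFFFF  # model of a 32-bit register


def lsl(value, count):
    """Logical shift left, truncated to 32 bits (count assumed in 0 .. 31)."""
    return (value << count) & MASK


def set_single(m):
    """s := {m}        MOV R1 R0 1; LSL R0 R1 R0"""
    return lsl(1, m)


def set_range(m, n):
    """s := {m .. n}   (-2 LSL n) XOR (-1 LSL m)"""
    upper = lsl(-2 & MASK, n)  # bits n+1 .. 31
    lower = lsl(-1 & MASK, m)  # bits m .. 31
    return upper ^ lower       # only bits m .. n remain


def elements(s):
    """List the members of a set value, for inspection only."""
    return [i for i in range(32) if (s >> i) & 1]
```

For m = 0 the second operand is -1, which gives the shorter sequence used for {0 .. n}. For n = 31 the first operand is shifted out entirely, which gives {m .. 31}. Summing set_single over 2, 3, 5, 7, 11 and 13 yields 28ACH, the constant that the compiler computes for the IN test.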
Pattern 12: Imported variables and procedures: When a procedure is imported
from another module, its address is unavailable to the compiler. Instead, the
procedure is identified by a number obtained from the imported module’s symbol
file. In place of the offset, the branch instruction holds (1) the number of the
imported module, (2) the number of the imported procedure, and (3) a link in
the list of BL instructions calling an external procedure. This list is traversed by
the linking loader, which computes the actual offset (fixup, see Chapter 6).
Imported variables are also referenced by a variable’s number. In general,
an access requires two instructions. The first loads the static base register SB
from a global table with the address of that module’s data section. The module
number of the imported variable serves as index. The second instruction loads
the address of the variable, using the actual offset fixed up by the loader.
In the following example, modules Pattern12a and Pattern12b both export
a procedure and a variable. They are referenced from the importing module
Pattern12c.
MODULE Pattern12a;
VAR k*: INTEGER;
PROCEDURE P*;
BEGIN k := 1
END P;
END Pattern12a.
MODULE Pattern12b;
VAR x*: REAL;
PROCEDURE Q*;
BEGIN x := 1
END Q;
END Pattern12b.
MODULE Pattern12c;
IMPORT Pattern12a, Pattern12b;
VAR i: INTEGER; y: REAL;
BEGIN
The following example features 3 record types with associated pointer types, and
hence also 3 type descriptors. Each descriptor is 5 words long. Their addresses,
and therefore their tags, are 0, 20, and 40 respectively.
MODULE Pattern13;
TYPE
P0 = POINTER TO R0;
P1 = POINTER TO R1;
P2 = POINTER TO R2;
R0 = RECORD x: INTEGER END ;
R1 = RECORD (R0) y: INTEGER END ;
R2 = RECORD (R1) z: INTEGER END ;
VAR
p0: P0; 60
p1: P1; 64
p2: P2; 68
BEGIN
p0.x := 0; LDR R0 SB 60
MOV R1 R0 0 p0.x
STR R1 R0 0 no type check
p1.y := 1; LDR R0 SB 64
MOV R1 R0 1
STR R1 R0 4 p1.y
p0(P1).y := 3; LDR R0 SB 60 p0
LDR R1 R0 -8 tag(p0)
LDR R1 R1 4
ADD R2 SB 20 TD P1
CMP R3 R2 R1
BLNE R12
MOV R1 R0 3
STR R1 R0 4 p0(P1).y
p0(P2).z := 5; LDR R0 SB 60 p0
LDR R1 R0 -8 tag(p0)
LDR R1 R1 8
ADD R2 SB 40 TD P2
CMP R3 R2 R1
BLNE R12
MOV R1 R0 5
STR R1 R0 8 p0.z
IF p1 IS P2 THEN LDR R0 SB 64 p1
LDR R1 R0 -8 tag(p1)
LDR R1 R1 8
ADD R2 SB 40 TD P2
CMP R3 R2 R1
BNE 2
p0 := p2 LDR R0 SB 68
STR R0 SB 60
END
END Pattern13.
Pattern 14: Record extensions as VAR parameters: Records occurring as VAR-
parameters may also require a type test at program execution time. This is
because VAR-parameters effectively constitute hidden pointers. Type tests and
type guards on VAR-parameters are handled in the same way as for variables
referenced via pointers, with a slight difference, however. Statically declared
record variables may be used as actual parameters, and they are not prefixed
by a type tag. Therefore, the tag has to be supplied together with the variable’s
address when the procedure is called, i.e. when the actual parameter is
established. Record structured VAR-parameters therefore consist of address and
type tag. This is similar to dynamic array descriptors consisting of address and
length.
0 00000020 FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
20 00000020 00014006 FFFFFFFF FFFFFFFF FFFFFFFF
MODULE Pattern14;
TYPE
R0 = RECORD a, b, c: INTEGER END ;
R1 = RECORD (R0) d, e: INTEGER END ;
VAR
r0: R0; 40
r1: R1; 52
PROCEDURE P(VAR r: R0);
BEGIN ...
r.a := 1; LDR R1 SP 4 r
STR R0 R1 0 r.a
r(R1).d := 2 LDR R0 SP 8 tag(r)
LDR R0 R0 4
ADD R1 SB 20 R1
CMP R2 R1 R0
BLNE R12
MOV R0 R0 2
LDR R1 SP 4 r
STR R0 R1 12 r.d
END P; ...
BEGIN ...
P(r0); ADD R0 SB 40 r0
ADD R1 SB 0 tag(R0)
BL P
P(r1) ADD R0 SB 52 r1
ADD R1 SB 20 tag(R1)
BL P
END Pattern14. ...
Pattern 15: Array assignments and strings.
MODULE Pattern15;
VAR s0, s1: ARRAY 32 OF CHAR;
PROCEDURE P(x: ARRAY OF CHAR);
END P;
BEGIN s0 := "ABCDEF"; ADD R0 SB 0 @s0
ADD R1 SB 64 @"ABCDEF"
LDR R2 R1 0
ADD R1 R1 4
STR R2 R0 0
ADD R0 R0 4
ASR R2 R2 24 test for 0X
BNE -6
s0 := s1; ADD R0 SB 0 @s0
ADD R1 SB 32 @s1
MOV R2 R0 8 len
LDR R3 R1 0
ADD R1 R1 4
STR R3 R0 0
ADD R0 R0 4
SUB R2 R2 1
BNE -6
P(s1); ADD R0 SB 32 @s1
MOV R1 R0 32 len
BL -38 P
P("012345"); ADD R0 SB 72 @"012345"
MOV R1 R0 7 len (incl 0X)
BL -42 P
STB R0 SB 16 b
n := ORD(ch); LDB R0 SB 17 ch
STR R0 SB 4 n
n := FLOOR(x); LDR R0 SB 8 x
MOV' R1 R0 4B00H
FAD" R0 R0 R1 floor
STR R0 SB 4 n
y := FLT(m); LDR R0 SB 0 m
MOV' R1 R0 4B00H
FAD' R0 R0 R1 float
STR R0 SB 12 y
n := LSL(m, 3); LDR R0 SB 0 m
LSL R0 R0 3
STR R0 SB 4 n
n := ASR(m, 8); LDR R0 SB 0
ASR R0 R0 8
STR R0 SB 4
m := ROR(m, n); LDR R0 SB 0
LDR R1 SB 4
ROR R0 R0 R1
STR R0 SB 0
END Pattern17.
The kind of object which a table entry represents is indicated by the field
class. Its values are denoted by declared integer constants: Var indicates that
the entry describes a variable, Con a constant, Fld a record field, Par a VAR-
parameter, and Proc a procedure. Different kinds of entries carry different
attributes. A variable or a parameter carries an address, a constant has a value,
a record field has an offset, and a procedure has an entry address, a list of
parameters, and a result type. For each class the introduction of an extended
record type would seem advisable. This was not done, however, for three reasons.
First, the compiler was first formulated in (a subset of) Modula-2 which does not
feature type extension. Second, not making use of type extensions would make it
simpler to translate the compiler into other languages for porting the language to
other computers. And third, all extensions were known at the time the compiler
was planned. Hence extensibility provided no argument for the introduction of
a considerable variety of types. The simplest solution lies in using the multi-
purpose fields val and dsc for class-specific attributes. For example, val holds
an address for variables, parameters, and procedures, an offset for record fields,
and a value for constants.
The definition of a type yields a record of type Struct, regardless of whether
it occurs within a type declaration, in which case also a record of type Object
(class = Typ) is generated, or in a variable declaration, in which case the
type remains anonymous. All types are characterized by a form and a size. A
type is either a basic type or a constructed type. In the latter case it refers to
one or more other types. Constructed types are arrays, records, pointers, and
procedural types. The attribute form refers to this classification. Its value is an
integer.
Just as different object classes are characterized by different attributes,
different forms have different attributes. Again, the introduction of extensions
of type Struct was avoided. Instead, some of the fields of type Struct remain
unused in some cases, such as for basic types, and others are used for form-
specific attributes. For example, the attribute base refers to the element type
in the case of an array, to the result type in the case of a procedural type, to
the type to which a pointer is bound, or to the base type of an (extended) record
type. The attribute dsc refers to the parameter list in the case of a procedural
type, or to the list of fields in the case of a record type.
As an example, consider the following declarations. The corresponding data
structure is shown in Fig. 12.5. For details, the reader is referred to the program
listing of module ORB and the respective explanations.
CONST N = 100;
TYPE Ptr = POINTER TO Rec;
Rec = RECORD n: INTEGER; p, q: Ptr END ;
VAR k: INTEGER;
a: ARRAY N OF INTEGER;
PROCEDURE P(x: INTEGER): INTEGER;
identifiers and constitute the context in which statements and expressions are
compiled, compilations of expressions typically generate anonymous entities of
additional, non-basic modes. Such entities reflect selectors, factors, terms, etc.,
i.e. constituents of expressions and statements. As such, they are of a transitory
nature and hence are not represented by records allocated on the heap. Instead,
they are represented by record variables local to the processing procedures and
are therefore allocated on the stack. Their type is called Item and is a slight
variation of the type Object. Items are not referenced via pointers.
Let us assume, for instance, that a term x*y is parsed. This implies that
the operator and both factors have been parsed already. The factors x and y
are represented by two variables of type Item of Var mode. The resulting term
is again described by an Item, and since the product is transitory, i.e. has
significance only within the expression of which the term is a constituent, it is to
be held in a temporary location, in a register. In order to express that an item
is located in a register, a new, non-basic mode Reg is introduced.
Effectively, all non-basic modes reflect the target computer’s architecture, in
particular its addressing modes. The more addressing modes a computer offers,
the more item modes are needed to represent them. The additional item modes
required by the RISC processor are Reg, RegI, and Cond. They are declared in module ORG.
The use of the types Object, Item, and Struct for the various modes and
forms, and the meaning of their attributes are explained in the following tables:
              Objects    Items
class         val        a        b        r
0  Undef
1  Const      val        val
2  Var        adr        adr               base
3  Par        adr        adr      off
4  Fld        off        off
5  Typ        TDadr      TDadr             modno
6  SProc      num
7  SFunc      num
8  Mod
10 Reg                                     regno
11 RegI                  off               regno
12 Cond                  Tjmp     Fjmp     condition code
Structures
form          nofpar     len        dsc      base
7  Pointer                                   base type
10 ProcTyp    nofpar                param    result type
12 Array                 nofel               element type
13 Record     ext lev    desc adr   fields   extension type
Items have an attribute called lev which is part of the address of the item.
Positive values denote the level of nesting of the procedure in which the item
is declared; lev = 0 implies a global object. Negative values indicate that the
object is imported from the module with number -lev.
The three types Object, Item, and Struct are defined in module ORB, which
also contains procedures for accessing the symbol table.
DEFINITION ORG;
CONST WordSize* = 4;
TYPE Item* = RECORD
mode*: INTEGER;
type*: ORB.Type;
a*, b*, r: INTEGER;
rdo*: BOOLEAN (*read only*)
END ;
VAR pc: INTEGER;
PROCEDURE VarParam*(VAR x: Item; ftype: ORB.Type); (*parameters*)
PROCEDURE ValueParam*(VAR x: Item);
PROCEDURE OpenArrayParam*(VAR x: Item);
PROCEDURE StringParam*(VAR x: Item);
PROCEDURE For0*(VAR x, y: Item); (*For Statements*)
PROCEDURE For1*(VAR x, y, z, w: Item; VAR L: LONGINT);
PROCEDURE For2*(VAR x, y, w: Item);
(* Branches, procedure calls, procedure prolog and epilog *)
PROCEDURE Here*(): LONGINT;
PROCEDURE FJump*(VAR L: LONGINT);
PROCEDURE CFJump*(VAR x: Item);
PROCEDURE BJump*(L: LONGINT);
PROCEDURE CBJump*(VAR x: Item; L: LONGINT);
PROCEDURE Fixup*(VAR x: Item);
PROCEDURE PrepCall*(VAR x: Item; VAR r: LONGINT);
PROCEDURE Call*(VAR x: Item; r: LONGINT);
PROCEDURE Enter*(parblksize, locblksize: LONGINT; int: BOOLEAN);
PROCEDURE Return*(form: INTEGER; VAR x: Item; size: LONGINT; int: BOOLEAN);
(* In-line code procedures*)
PROCEDURE Increment*(upordown: LONGINT; VAR x, y: Item);
PROCEDURE Include*(inorex: LONGINT; VAR x, y: Item);
PROCEDURE Assert*(VAR x: Item);
PROCEDURE New*(VAR x: Item);
PROCEDURE Pack*(VAR x, y: Item);
PROCEDURE Unpk*(VAR x, y: Item);
PROCEDURE Led*(VAR x: Item);
PROCEDURE Get*(VAR x, y: Item);
PROCEDURE Put*(VAR x, y: Item);
PROCEDURE Copy*(VAR x, y, z: Item);
PROCEDURE LDPSR*(VAR x: Item);
PROCEDURE LDREG*(VAR x, y: Item);
(*In-line code functions*)
PROCEDURE Abs*(VAR x: Item);
PROCEDURE Odd*(VAR x: Item);
PROCEDURE Floor*(VAR x: Item);
PROCEDURE Float*(VAR x: Item);
PROCEDURE Ord*(VAR x: Item);
PROCEDURE Len*(VAR x: Item);
PROCEDURE Shift*(fct: LONGINT; VAR x, y: Item);
PROCEDURE ADC*(VAR x, y: Item);
PROCEDURE SBC*(VAR x, y: Item);
PROCEDURE UML*(VAR x, y: Item);
PROCEDURE Bit*(VAR x, y: Item);
File names and the characters @ and ˆ may be followed by an option specification
/s. Option s enables the compiler to overwrite an existing symbol file, thereby
invalidating clients.
The parser is designed according to the proven method of top-down, recursive
descent parsing with a look-ahead of a single symbol. The last symbol read
is represented by the global variable sym. Syntactic entities are mirrored by
procedures of the same name. Their goal is to recognize the specified construct
in the source text. The start symbol and corresponding procedure is Module.
The principal parser procedures are shown in Fig. 12.6, which also exhibits
their calling hierarchy. Loops in the diagram indicate recursion in the syntactic
definition.
IF x IS T1 THEN
WITH x: T1 DO ... x ... END
ELSIF x IS T2 THEN
WITH x: T2 DO ... x ... END
ELSIF ... END
CASE x OF
T1: ... x ... T2: ... x ... ...
END
where T1 and T2 are extensions of the type T0 of the case variable x. Compilation
of this form of case statement merges the regional type guard of the former
with statements with the type test of the former if statements. This case
statement represents the only case where a symbol table entry—the type of
x—is modified during compilation of statements. When the end of the with
statement is reached, the change must be reverted.
The scanner module ORS embodies the lexicographic definitions of the language,
i.e. the definition of abstract symbols in terms of characters. The
scanner’s substance is procedure Get, which scans the source text and, for each
call, identifies the next symbol and yields the corresponding integer code. It
is most important that this process be as efficient as possible. Procedure Get
recognizes letters indicating the presence of an identifier (or reserved word), and
digits signalling the presence of a number. Also, the scanner recognizes comments
and skips them. The global variable ch stands for the last character read.
A sequence of letters and digits may either denote an identifier or a key word.
In order to determine which is the case, a search is made in a table containing all
key words for each would-be identifier. This table is sorted alphabetically and
according to the length of reserved words. It is initialized when the compiler is
loaded.
The presence of a digit signals a number. Procedure Number first scans the
subsequent digits (and letters) and stores them in a buffer. This is necessary,
because hexadecimal numbers are denoted by the postfix letter H (rather than a
prefix character). The postfix letter X specifies that the digits denote a character.
There exists one case in the language Oberon, where a look-ahead of a single
character does not suffice to identify the next symbol. When a sequence of digits
is followed by a period, this period may either be the decimal point of a real
number, or it may be the first element of a range symbol (..). Fortunately, the
problem can be solved locally as follows: If, after reading digits and a period,
a second period is present, the number symbol is returned, and the look-ahead
variable ch is assigned the special value 7FX. A subsequent call of Get then
delivers the range symbol. Otherwise the period after the digit sequence belongs
to the (real) number.
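The local solution for the range symbol can be illustrated by a strongly reduced scanner. The following Python sketch is only a model and not the code of ORS: the function name scan, the constant RANGE_MARK and the representation of symbols as Python values are assumptions made for illustration. It handles decimal digits and periods only; hexadecimal digits, the postfixes H and X, and scale factors are deliberately left out.

```python
RANGE_MARK = "\x7f"  # stands for the special value 7FX assigned to ch


def scan(src):
    """Return the numbers and range symbols found in src; other characters are skipped."""
    symbols = []
    pos = 0
    ch = src[0] if src else ""

    def read():
        nonlocal pos, ch
        pos += 1
        ch = src[pos] if pos < len(src) else ""

    while ch:
        if ch == RANGE_MARK:            # left behind by the number scanner
            symbols.append("..")
            read()                      # moves past the second period
        elif ch.isdigit():
            digits = ""
            while ch.isdigit():
                digits += ch
                read()
            if ch == ".":
                read()
                if ch == ".":           # second period: the number ends here
                    symbols.append(int(digits))
                    ch = RANGE_MARK     # the next symbol will be the range symbol
                else:                   # the period was a decimal point
                    frac = ""
                    while ch.isdigit():
                        frac += ch
                        read()
                    symbols.append(float(digits + "." + frac))
            else:
                symbols.append(int(digits))
        elif ch == ".":
            read()
            if ch == ".":
                symbols.append("..")
                read()
            else:
                symbols.append(".")
        else:
            read()                      # blanks and everything not modeled here
    return symbols
```

The point of the sketch is that the scanner never has to back up in the source text. The marker replaces the look-ahead character, so the second period is consumed by the following call.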
A search of an identifier proceeds first through the scope list, and for each
header its list of object records is scanned. This mirrors the scope rule of the
language and guarantees that if several entities carry the same identifier, the
most local one is selected. The linear list of objects represents the simplest
implementation by far. A tree structure would in many cases be more efficient
for searching, and would therefore seem more recommendable. Experiments have
shown, however, that the gain in speed is marginal. The reason is that the lists
are typically quite short. The superiority of a tree structure becomes manifest
only when a large number of global objects is declared. We emphasize that
when a tree structure is used for each scope, the linear lists must still be present,
because the order of declarations is sometimes relevant in interpretation, e.g. in
parameter lists.
Not only procedures, but also record types establish their own local scope.
The list of record fields is anchored in the type record’s field dsc, and it is
searched by procedure thisField. If a record type R1 is an extension of R0,
then R1’s field list contains only the fields of the extension proper. The base
type R0 is referenced by the BaseTyp field of R1. Hence, a search for a field may
have to proceed through the field lists of an entire sequence of record base types.
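The two searches described above, through the scope list and through the field lists of a sequence of base types, follow the same pattern. The following Python sketch models them with linear lists. The classes Obj, Scope and Record, the function names, and the attribute level are hypothetical simplifications; the actual declarations are those of module ORB.

```python
class Obj:
    """One entry of a linear object list."""
    def __init__(self, name, level, next=None):
        self.name, self.level, self.next = name, level, next


class Scope:
    """Header of one scope; outer links to the enclosing scope."""
    def __init__(self, first=None, outer=None):
        self.first, self.outer = first, outer


def this_obj(name, top):
    """Search the scopes from the innermost outward; the most local match wins."""
    scope = top
    while scope is not None:
        obj = scope.first
        while obj is not None:
            if obj.name == name:
                return obj
            obj = obj.next
        scope = scope.outer
    return None


class Record:
    """A record type: dsc anchors its own fields, base is the type it extends."""
    def __init__(self, fields=None, base=None):
        self.dsc, self.base = fields, base


def this_field(name, rec):
    """Search the fields of rec, then those of its base types in turn."""
    while rec is not None:
        fld = rec.dsc
        while fld is not None:
            if fld.name == name:
                return fld
            fld = fld.next
        rec = rec.base
    return None
```

With a global and a local object both named x, this_obj delivers the local one, because the local scope is searched first. A field declared in R0 is found through R1 only via the link to the base type.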
Objects have types, and types are referenced by pointers. These cannot
be written on a file. The straight-forward solution would be to use the type
identifiers as they appear in the program to denote types. However, this would
be rather crude and inefficient; moreover, there are anonymous types, for which
artificial identifiers would have to be generated.
An elegant solution lies in consecutively numbering types. Whenever a type
is encountered the first time, it is assigned a unique reference number. For this
purpose, records (in the compiler) of type Type contain the field ref. Following
the number, a description of the type is then written to the symbol file. When
the type is encountered again during the traversal of the data structure, only
the reference number is issued, with negative sign. The global variable ORB.Ref
functions as the running reference number.
When reading a symbol file, a positive reference number is followed by the
type’s description. A pointer to the type read is assigned to the global table
typtab with the reference number as index. When a negative reference number
is read (it is not followed by a type description), then the type is identified
by typtab[-ref] (see procedure InType). In the following example, types are
identified by their reference number (e.g. R #14), and later referenced by this
number (ˆ14).
MODULE A;
CONST Ten* = 10; Dollar* = "$";
TYPE R* = RECORD u*: INTEGER; v*: SET END ;
S* = RECORD w*: ARRAY 4 OF R END ; P* = POINTER TO R;
A* = ARRAY 8 OF INTEGER;
B* = ARRAY 4, 5 OF REAL;
C* = ARRAY 10 OF S;
D* = ARRAY OF CHAR;
VAR x*: INTEGER;
PROCEDURE Q0*;
BEGIN END Q0;
PROCEDURE Q1*(x, y: INTEGER): INTEGER;
BEGIN RETURN x+y END Q1;
END A.
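The scheme of reference numbers can be sketched independently of the actual file format. In the following Python model a type is written as a positive reference number followed by its description when it is met for the first time, and as the negated number alone on every later occurrence. The class Type, the functions out_type and in_type, the list used as a file, and the start of the numbering at 1 are all assumptions of this sketch; the real format is defined by module ORB.

```python
class Type:
    def __init__(self, form, base=None):
        self.form, self.base = form, base
        self.ref = 0                      # 0 means: not yet written


def out_type(t, stream, state):
    """Write type t to the stream; state['ref'] is the running reference number."""
    if t.ref > 0:
        stream.append(-t.ref)             # already written: negative number only
    else:
        state["ref"] += 1
        t.ref = state["ref"]
        stream.append(t.ref)              # first encounter: number, then description
        stream.append(t.form)
        if t.base is not None:
            stream.append(1)
            out_type(t.base, stream, state)
        else:
            stream.append(0)


def in_type(stream, typtab):
    """Read one type; typtab maps reference numbers to the types read so far."""
    ref = stream.pop(0)
    if ref < 0:
        return typtab[-ref]               # no description follows
    t = Type(stream.pop(0))
    typtab[ref] = t
    if stream.pop(0) == 1:
        t.base = in_type(stream, typtab)
    return t
```

When a pointer type and an array type with the same record as base are written in sequence, the record's description appears once. The reader then delivers the same object for both bases, which is what identifying a type by its address requires.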
After a symbol file has been generated, it is compared with the file from a
previous compilation of the same module, if one exists. Only if the two files
differ and if the compiler’s s-option is enabled, is the old file replaced by the
new version. The comparison is made by comparing byte after byte without
consideration of the file’s structure. This somewhat crude approach was chosen
because of its simplicity and yielded good results in practice.
A symbol file must not contain addresses (of variables or procedures). If
they did, most changes in the program would result in a change of the symbol
file. This must be avoided, because changes in the implementation (rather
than the interface) of a module are supposed to remain invisible to the clients.
Only changes in the interface are allowed to effect changes in the symbol file,
requiring recompilation of all clients. Therefore, addresses are replaced by
export numbers. The variable exno (global in ORP) serves as running number
(see ORP.Declarations and ORP.ProcedureDecl). The translation from export
number to address is performed by the loader. Every code file contains a list
(table) of addresses (of variables and entry points for procedures). The export
number serves as index in this table to obtain the requested address. Export
numbers are generated by the parser.
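A minimal sketch may illustrate why export numbers keep the symbol file stable. The following Python model assumes that exported objects are numbered in the order of their declaration, starting with 1, and that the address table of the code file is indexed by these numbers. The function names and the data representation are hypothetical.

```python
def export_numbers(declarations):
    """Parser side: assign running export numbers to the exported names.

    declarations is a list of (name, exported) pairs in order of declaration.
    """
    exno, table = 0, {}
    for name, exported in declarations:
        if exported:
            exno += 1                  # the start value 1 is an assumption
            table[name] = exno
    return table


def address_of(entries, exno):
    """Loader side: the export number serves as index into the address table."""
    return entries[exno]
```

If the body of a procedure grows, only the address table in the code file changes. The export numbers, and with them the symbol file, remain the same, so that clients need not be recompiled.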
Objects exported from some module M1 may refer in their declaration to
some other module M0 imported by M1. It would be unacceptable, if an import
of M1 would then also require the import of M0, i.e. imply the automatic reading
of M0’s symbol file. It would trigger a chain reaction of imports that must be
avoided. Fortunately, such a chain reaction can be avoided by making symbol
files self-contained, i.e. by including in every symbol file the description of entities
that stem from other modules. Such entities are always types.
The inclusion of types imported from other modules seems simple enough
to handle: type descriptions must include a reference to the module from which
the type was imported. This reference is the name and key of the respective
module. However, there exists one additional complication that cannot be
ignored. Consider a module M2 importing a variable x from a module M1. Let the
type T of x be defined in module M0. Also, assume M2 to contain a variable y of
type M0.T. Evidently, x and y are of the same type, and the compiler compiling
M2 must recognize this fact. Hence, when importing M1 during compilation of
M2, the imported element T must not only be registered in the symbol table, but
it must also be recognized as being identical to the T already imported from M0
directly. It is rather fortunate that the language definition specifies equivalence
of types on the basis of names rather than structure, because it allows type tests
at execution time to be implemented by a simple address comparison.
(Appendix 2) and the generated code patterns for individual language constructs
(Section 12.2).
A distinguishing feature of this compiler is that parsing proceeds top-down
according to the principle of recursive descent in the parsing tree. This implies
that for every syntactic construct a specific procedure is called. It carries the
same name as the construct. It also implies that properties of the parsed
construct can be represented by parameters of the parsing procedures. Consider,
for example, the construct of simple expression:
SimpleExpression = term {"+" term}.
The corresponding parsing procedure is
PROCEDURE SimpleExpression(VAR x: Item);
VAR y: Item;
BEGIN term(x);
WHILE sym = plus DO ORS.Get(sym); term(y); ORG.AddOp(x, y)
END
END SimpleExpression
The generating procedure AddOp receives two parameters representing the operands,
and returns the result through the first parameter. This scheme carries
the invaluable advantage of using operands efficiently allocated on the stack
rather than dynamically allocated on the heap and subject to automatic storage
retrieval (garbage collection). Here the processed operands quietly disappear
from the stack upon exit from the parser procedure.
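The interplay of parsing procedure, items and generator can be tried out in a reduced model. The following Python sketch is an illustration under several assumptions and not the code of ORP or ORG: terms are plain identifiers of global variables, their offsets from SB are given in a table, results are returned by the functions instead of being passed through VAR parameters, and the emitted instructions are represented as strings.

```python
class Item:
    def __init__(self, mode, a=0, r=0):
        self.mode, self.a, self.r = mode, a, r


class Parser:
    def __init__(self, symbols, addresses):
        self.symbols = list(symbols) + ["eof"]
        self.addresses = addresses        # identifier -> offset from SB (assumed given)
        self.pos = 0
        self.sym = self.symbols[0]        # look-ahead of a single symbol
        self.code = []                    # emitted instructions
        self.rh = 0                       # next free register

    def get(self):
        self.pos += 1
        self.sym = self.symbols[self.pos]

    def load(self, x):
        """Bring a variable into the next free register; mode becomes Reg."""
        if x.mode == "Var":
            self.code.append(f"LDR R{self.rh} SB {x.a}")
            x.mode, x.r = "Reg", self.rh
            self.rh += 1

    def add_op(self, x, y):
        """x := x + y; the register of y is released (stack principle)."""
        self.load(x)
        self.load(y)
        self.code.append(f"ADD R{x.r} R{x.r} R{y.r}")
        self.rh = x.r + 1

    def term(self):
        x = Item("Var", a=self.addresses[self.sym])
        self.get()
        return x

    def simple_expression(self):
        x = self.term()
        while self.sym == "+":
            self.get()
            y = self.term()
            self.add_op(x, y)
        return x
```

Parsing x + y + z with the offsets 0, 4 and 8 emits two loads, an addition, a further load and a second addition. The result item is of mode Reg with register 0, and the item y ceases to exist as soon as the loop body has been left.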
The parameters representing syntactic constructs are of type Item defined
in ORG. This data type is rather similar to the type Object (in ORB). After all,
it serves the same purpose; but it represents internal items rather than declared
objects.
TYPE Item = RECORD
mode: INTEGER;
type: ORB.Type;
a, b, r: INTEGER;
rdo: BOOLEAN (*read only*)
END
The attribute class of Object is renamed mode in Item. In fact, in some sense
different classes evoke different (corresponding) addressing modes as featured by
the processor architecture. According to the architecture, additional modes may
have to be introduced. Thanks to the simplicity of RISC, only three are needed:
• Reg = 10; The item x is located in register x.r
• RegI = 11; The item x is addressed indirectly through register x.r plus
offset x.a
• Cond = 12; The item is represented by the condition bit registers
Instructions are emitted sequentially by the four procedures Put0,
Put1, Put2, Put3. They directly correspond to the instruction formats of the
RISC processor (see Chapter 11). The instructions are stored in the array code
and the compiler variable pc serves as running index.
PROCEDURE Put0(op, a, b, c: INTEGER); format F0
PROCEDURE Put1(op, a, b, im: INTEGER); format F1
PROCEDURE Put2(op, a, b, off: INTEGER); format F2
PROCEDURE Put3(op, cond, off: INTEGER); format F3
12.7.1. Expressions.
Expressions consist of operands and operators. They are evaluated and have
a value. First, a number of make-procedures transform objects into items (see
Section 12.3.2). The principal one is MakeItem. Typical objects are variables
(class, mode = Var). Global variables are addressed with base register SB (x.r
= 13), local variables with the stack pointer SP (x.r = 14). VAR-parameters are
addressed indirectly; the address is on the stack (class, mode = Par, Ind). x.a
is the offset from the stack pointer.
Before an operator can be applied to operands, these must first be transferred
(loaded) into registers. This is because the RISC performs operations
only on registers. The loading is achieved by procedure load (and loadAdr) in
ORG. The resulting mode is Reg. In allocating registers, a strict stack principle
is used, starting with R0, up to R11. This is certainly not an optimal strategy
and provides ample room for improvement (usually called optimization). The
compiler variable RH indicates the next free register (top of register stack).
Base address SB is, as the name suggests, static. But this holds only within
a module. It implies that on every transfer to a procedure in another module,
the static base must be adjusted. The simplest way is to load SB before every
external call, and to restore it to its old value after return from the procedure.
We chose a different strategy: loading on demand (see below: global variables).
If a variable is indexed, has a field selector, is dereferenced, or has a type
guard, this is detected in the parser by procedure selector. It calls generators
Index, Field, DeRef, or TypeTest accordingly (see Section 12.3.2 and patterns
1–4 in Section 12.2). These procedures cause item modes to change as follows:
Mode transition of x    Instructions emitted    Construct
1. Index(x, y)  (y is loaded into y.r)
   Var → RegI           ADD y.r, SP, y.r        array variable
   Par → RegI           LDR RH, SP, x.a         array parameter
                        ADD y.r, RH, y.r
   RegI → RegI          ADD x.r, x.r, y.r       indexed array
2. Field(x, y)  (y.mode = Fld, y.a = field offset)
   Var → Var            none                    field designator, add offset to x.a
   RegI → RegI          none                    add field offset to x.a
   Par → Par            none                    add field offset to x.b
3. DeRef(x)
12.7.2. Relations.
RISC does not feature any compare instruction. Instead, subtraction is used,
because an implicit comparison with 0 is performed along with any arithmetic
(or load) instruction. Instead of x < y we use x−y < 0. This is possible, because
in addition to the computed difference deposited in a register, also the result of
the comparison is deposited in the condition flags N (difference negative) and
Z (difference zero). Relations therefore yield a result item x with mode Cond.
x.r (= relmap[sym]) identifies the relation. Branch instructions (jumps) are
executed conditionally depending on these flags. The value x.r is then used when
generating branch instructions. For example, the relation x < y is translated
simply into
LDR R0, SP, x
LDR R1, SP, y
CMP R0, R0, R1
and the resulting item mode is x.mode = Cond, x.r := "less". (The mnemonic
CMP is synonymous with SUB.) More about relations and Boolean expressions
will be explained in Section 12.7.6.
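How a subtraction replaces a comparison can be shown with a few lines. The following Python sketch derives the flags N and Z from the difference and evaluates the six relations from these flags alone. It is a simplified model: Python integers are unbounded, so the overflow that a 32-bit subtraction may produce is ignored, and the names cmp_flags and CONDITIONS are hypothetical.

```python
def cmp_flags(x, y):
    """Model of CMP (= SUB): only the flags of the difference x - y are kept."""
    diff = x - y
    return {"N": diff < 0, "Z": diff == 0}


# Each relation is a condition on the flags, to be tested by a branch instruction.
CONDITIONS = {
    "=":  lambda f: f["Z"],
    "#":  lambda f: not f["Z"],
    "<":  lambda f: f["N"],
    ">=": lambda f: not f["N"],
    "<=": lambda f: f["N"] or f["Z"],
    ">":  lambda f: not (f["N"] or f["Z"]),
}


def relation(op, x, y):
    """Evaluate x op y in the manner of the generated code."""
    return CONDITIONS[op](cmp_flags(x, y))
```

In the compiler the relation is not evaluated at this point. The item of mode Cond merely records which condition is to be tested, and the decision is left to the branch instruction generated later.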
12.7.4. Assignments.
Statements have an effect but, unlike expressions, no result. Statements are
executed, not evaluated. Assignments alter the value of variables through store
instructions. The computation of the address of the affected variable follows the
same scheme as for loading. The value to be assigned must be in a register.
Assignments of arrays (and records) are an exceptional case in so far as they
are performed not by a single store instruction, but by a repetition. Consider
y := x, where x and y are both arrays of n integers. Assume that the address
of y is in register R0, that of x in R1, and the value n in R2. Then the resulting
code is
L LDR R3, R1, 0 source
ADD R1, R1, 4
STR R3, R0, 0 destination
ADD R0, R0, 4
SUB R2, R2, 1 counter
BNE L
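The copying loop can be traced with a small simulation. In the following Python sketch the memory is modeled as a dictionary from byte addresses to words, and the local variables r0 to r3 play the role of the registers. The function name assign_array and the memory model are assumptions made for illustration.

```python
def assign_array(mem, dst, src, n):
    """Copy n words from byte address src to byte address dst (n >= 1 assumed)."""
    r0, r1, r2 = dst, src, n
    while True:
        r3 = mem[r1]        # L  LDR R3, R1, 0    source
        r1 += 4             #    ADD R1, R1, 4
        mem[r0] = r3        #    STR R3, R0, 0    destination
        r0 += 4             #    ADD R0, R0, 4
        r2 -= 1             #    SUB R2, R2, 1    counter
        if r2 == 0:         #    BNE L
            break
```

Since the counter is tested at the end, the loop body is executed at least once. The sketch, like the code sequence shown, therefore relies on n being greater than zero.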
12.7.7. Procedures.
Before embarking on an explanation of procedure calls, entries and exits,
we need to know how recursion is handled and how storage for local variables is
allocated. Procedure calls cause a sequence of frames to be allocated in a stack
fashion. These frames are the storage space for local variables. Each frame is
headed by a single word containing the return address of the call. This address
is deposited in R15 by the call instructions (BL, branch and link). The compiler
“knows” the size of the frame to be allocated, and thus merely decrements the
stack pointer SP (R14) by this amount. Upon return, SP is incremented by
the same amount, and PC is restored by a branch instruction. In the following
example, a procedure P is called, which in turn calls Q, and Q calls P again (recursion).
The stack then contains 3 frames (see Figure 12.7).
Scheme and layout determine the code sequences for call, entry and exit of
procedures. Here is an example of a procedure P with 2 parameters:
Call: LDR R0, param0
LDR R1, param1
BL P
In the symbol table, the field base refers to the ancestor of a given record type.
Thus base of the type representing T11 points to T1, etc. Run-time checks,
however, must be fast, and hence cannot proceed through chains of pointers.
Instead, each TD contains an array with references to the ancestor TDs (including
itself). For the example above, the TDs are as follows:
TD(T) = [T]
TD(T0) = [T, T0]
TD(T1) = [T, T1]
TD(T00) = [T, T0, T00]
TD(T01) = [T, T0, T01]
TD(T10) = [T, T1, T10]
TD(T11) = [T, T1, T11]
Evidently, the first element can be omitted, as it always refers to the common
base of the type hierarchy. The last element always points to the TD’s owner. TDs
are allocated in the data area, the area for variables.
References to TDs are called type tags. They are required in two cases. The
first is for records referenced by pointers. Such dynamically allocated records
carry an additional, hidden field holding their type tag. (A second additional
word is reserved for use by the garbage collector. The offset of the tag field
is therefore -8). The second case is that of record-typed VAR-parameters. In
this case the type tag is explicitly passed along with the address of the actual
parameter. Such parameters therefore require two words/registers.
A type test then consists of a test for equality of two type tags. In p IS T
the first tag is that of the nth entry of the TD of pˆ, where n is the extension
level of T. The second tag is that of type T. This is shown in Pattern13 in Section
12.2 (see also Fig. 12.4). The test then is as follows:
pˆ.tagˆ[n] = adr(T), where n is the extension level of T
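The type test can be reproduced with the hierarchy used above. In the following Python sketch a type descriptor holds the list of its ancestors including itself, and the identity of objects stands for the comparison of addresses. The class TD and the function is_type are hypothetical. The length check is needed only because the lists of this model vary in length; it is an artifact of the sketch and not a step of the generated code.

```python
class TD:
    """Type descriptor: ancestors[n] is the ancestor at extension level n."""
    def __init__(self, name, base=None):
        self.name = name
        self.level = 0 if base is None else base.level + 1
        inherited = [] if base is None else list(base.ancestors)
        self.ancestors = inherited + [self]   # the last entry is the owner itself


def is_type(tag, T):
    """p IS T, where tag is the type tag of the record that p refers to."""
    n = T.level
    return n < len(tag.ancestors) and tag.ancestors[n] is T
```

A record of type T01 passes the test for T, T0 and T01, but not for T1. No chain of base types is traversed; a single indexed access and one comparison suffice.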
When declaring a record type, it is not known how many extensions, nor
how many levels will be built on this type. Therefore TDs should actually be
infinite arrays. We decided to restrict them to 3 levels only. The first entry,
which is never used for checking, is replaced by the size of the record.
the same module by the same technique. Their level number is 0. One might
use a specific base register for the base of the current module. Its content would
then have to be reloaded upon every procedure call and after every return. This
is a common technique, but we have here chosen to reload only when necessary,
i.e. only when an access is at hand. This strategy rewards the programmer who
sensibly uses global variables rarely.
12.7.10. Traps.
This compiler provides an extensive system of safeguards in the form of
run-time checks (aborts) in several cases:
trap number trap cause
1 array index out of range
2 type guard failure
3 array or string copy overflow
4 access via NIL pointer
5 illegal procedure call
6 integer division by zero
7 assertion violated
These checks are implemented very efficiently in order not to downgrade
a program’s performance. Involved is typically a single compare instruction,
plus a conditional branch (BLR MT). It is assumed that entry 0 of the module
table contains not a base address (module numbers start with 1), but a branch
instruction to an appropriate trap routine. The trap number is encoded in bits
4:7 of the branch instruction.
The predefined procedure Assert generates a conditional trap with trap
number 7. For example, the statement Assert(m = n) generates
LDR R0, m
LDR R1, n
CMP R0, R0, R1
BLR 1, 7CH        branch and link if unequal through R12, trap number 7
Procedure New, representing the operator NEW, has been implemented with the
aid of the trap mechanism. (This avoids any reference in ORG to module Kernel,
which contains the allocation procedure New.) The generated code for the
statement NEW(p) is
ADD R0, SP, p     address of p
ADD R1, SB, tag   type tag
BLR 7, 0CH        branch and link unconditionally through R12 (MT), trap number 0
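The operands 7CH and 0CH in the two code sequences can be read according to the encoding stated for the branch instruction: bits 4:7 hold the trap number, and, judging from the accompanying comments, the low four bits designate register R12. The following minimal sketch extracts both fields; the module and procedure names are hypothetical.

```oberon
MODULE TrapCode;  (*illustrative sketch; names are assumptions*)

  PROCEDURE TrapNo*(x: INTEGER): INTEGER;  (*bits 4:7 of x*)
  BEGIN RETURN x DIV 10H MOD 10H
  END TrapNo;

  PROCEDURE Reg*(x: INTEGER): INTEGER;  (*bits 0:3 of x*)
  BEGIN RETURN x MOD 10H
  END Reg;

END TrapCode.
```

With these definitions, TrapNo(7CH) = 7 and Reg(7CH) = 12 for the Assert example, and TrapNo(0CH) = 0 and Reg(0CH) = 12 for NEW.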
CHAPTER 13
A GRAPHICS EDITOR
The goal of the original SIL program was to support the design of electronic
circuit diagrams. Primarily, SIL was a line drawing system. This implies that
the drawings remain uninterpreted. However, in a properly integrated system,
the addition of modules containing operators that interpret the drawings is a
reasonably straight-forward proposition. In fact, the Oberon system is ideally
suited for such steps, particularly due to its command facility.
At first, we shall ignore features specially tailored to circuit design. The
primary one is a macro facility to be discussed in a later chapter.
The basic system consists of the modules Draw, GraphicFrames, and Graph-
ics. These modules contain the facilities to generate and handle horizontal and
vertical lines, text captions, and macros. Additional modules serve to introduce
other elements, such as rectangles and circles, and the system is extensible, i.e.
further modules may be introduced to handle further types of elements.
in both their x and y coordinates, the end point is adjusted so that the line is
either horizontal or vertical.
Writing a caption. First the cursor is positioned where the caption is to
appear. Then the left key is clicked, causing a crosshair to appear. It is called
the caret. Then the text is typed. Only single lines of text are accepted. The
DEL key may be used to retract characters (backspace).
Selecting. Most commands require the specification of operands, and many
implicitly assume the previously selected elements—the selection—to be their
operands. A single element is selected by pointing at it with the cursor and then
clicking the right mouse button. This also causes previously selected elements to
be deselected. If the left key is also clicked, their selection is retained. This action
is called an interclick. To select several elements at once, the cursor is moved
from P0 to P1 while the right key is held. Then all elements lying within the
rectangle with diagonally opposite corners at P0 and P1 are selected. Selected
lines are displayed as dotted lines, selected captions (and macros) by inverse
video mode. A macro is selected by pointing at its lower left corner. This corner
is called the sensitive area.
Moving. To move (displace) a set of elements, the elements are first selected
and then the cursor is moved from P0 to P1 while the middle key is held. The
vector from P0 to P1 specifies the movement and is called the displacement
vector. P0 and P1 may lie in different viewers displaying the same graph. Small
displacements may be achieved by using the keyboard’s cursor keys.
Copying. Similarly, the selected elements may be copied (duplicated). In
addition to pressing the middle key while indicating the displacement vector, the
left key is interclicked. The copy command may also be used to copy elements
from one graph into another graph by moving the cursor from one viewer into
another viewer displaying the destination graph. A text caption may be copied
from a text frame into a graphic frame and vice-versa. There exist two ways to
accomplish this: 1. First the caret is placed at the destination position, then
the text is selected and the middle key is interclicked. 2. First the text is
selected, then the caret is placed at the destination position and the middle key
is interclicked.
Shifting the plane. You may shift the entire drawing plane behind the viewer
by specifying a displacement vector while pressing the middle button (as in a
move command) and interclicking the right button.
The following table shows a summary of the mouse actions:
The ChangeColor command takes either a color number in the range 1..15
or a string as parameter. It serves to copy the color from the selected character.
13.2.4. Macros.
A macro is a (small) drawing that can be identified as a whole and be used as
an element within a (larger) drawing. Macros are typically stored in collections
called libraries, from where they can be selected and copied individually.
Draw.Macro lib mac — The macro mac is selected from the library named
lib and inserted in the drawing at the caret’s position.
An example of the use of macros is the drawing of electronic circuit diagrams. The
basic library file containing frequently used TTL components is called TTL0.Lib,
and a drawing showing its elements is called TTL0.Graph (see Figure 13.2).
13.2.5. Rectangles.
Rectangles can be created as individual elements. They are frequently used
for framing sets of elements. Rectangles consist of four lines which are se-
lectable as a unit. The attribute commands Draw.SetWidth, System.SetColor,
Draw.ChangeWidth, and Draw.ChangeColor also apply to rectangles. Rectangles
are selected by pointing at their lower left corner and are created by the following
steps:
1. The caret is placed where the lower left corner of the new rectangle is to lie.
2. A secondary caret is placed where the opposite corner is to lie (ML + MR).
3. The command Rectangles.Make is activated.
13.2.7. Spline curves.
Spline curves are created by the following steps:
1. The caret is placed where the starting point is to lie.
2. Secondary carets are placed at the spline’s fixed points (at most 20).
3. The command Splines.MakeOpen or Splines.MakeClosed is activated.
in their visual appearance in some way, and gives the user an opportunity to
verify the selection (and to change it, if necessary) before applying the operation
(such as deletion). For an object to be selectable means that it must record
a state (selected/unselected). We note that it is important that this state is
reflected by visual appearance.
As a consequence, the property selected is added to every object record.
We now specify the data types representing lines and captions as follows and
note that both types must be extensions of the same base type in order to be
members of one and the same data structure.
TYPE Object = POINTER TO ObjectDesc;
  ObjectDesc = RECORD
    x, y, w, h, col: INTEGER;
    selected: BOOLEAN;
    next: Object
  END ;
  Line = POINTER TO LineDesc;
  LineDesc = RECORD (ObjectDesc) END ;
  Caption = POINTER TO CaptionDesc;
  CaptionDesc = RECORD (ObjectDesc)
    pos, len: INTEGER
  END
The two procedures typically are placed in different modules, one containing
operations on objects, the other those on graphics. Here the former is the
service module, the latter the former’s client. Procedures for, e.g., copying
elements, or determining whether an object is selectable, follow the same pattern
as drawGraphic.
This solution has the unpleasant property that all object types are anchored
in the base module. If any new types are to be added, the base module has to be
modified (and all clients are to be—at least—recompiled). The object-oriented
paradigm eliminates this difficulty by inverting the roles of the two modules.
It rests on binding the operations pertaining to an object type to each object
individually in the form of procedure-typed record fields as shown in the following
sample declaration:
ObjectDesc = RECORD
x, y, w, h, col: INTEGER; selected: BOOLEAN;
draw: PROCEDURE (obj: Object);
write: PROCEDURE (obj: Object; VAR R: Files.Rider);
next: Object
END
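With the procedures bound to the objects, a client can traverse a graphic without knowing any of the extensions. The following minimal sketch assumes a simplified ObjectDesc (the write field is omitted to avoid importing Files) and a hypothetical parameter first denoting the head of the object list; drawGraphic is the name used in the text.

```oberon
MODULE ObjSketch;  (*illustrative sketch; simplified declarations*)
  TYPE Object* = POINTER TO ObjectDesc;
    ObjectDesc* = RECORD
      x*, y*, w*, h*, col*: INTEGER; selected*: BOOLEAN;
      draw*: PROCEDURE (obj: Object);
      next*: Object
    END ;

  PROCEDURE drawGraphic*(first: Object);
    VAR obj: Object;
  BEGIN obj := first;
    WHILE obj # NIL DO obj.draw(obj); obj := obj.next END  (*dispatch through the field*)
  END drawGraphic;
END ObjSketch.
```

A module defining a new object type assigns its own drawing procedure to the draw field when it creates an object; ObjSketch itself requires no change.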
field (called do) in each record (analogous to the handler). This field is a pointer
to a method record containing the procedures declared for the base type. At
least one of them uses a message parameter, i.e. a parameter of record structure
that is extensible.
The modules in the top row implement the individual object types’ methods,
and additionally provide commands, in particular Make for creating new objects.
The base module specifies the base types and procedures operating on graphics
as a whole.
Our system, however, deviates from this scheme somewhat for several rea-
sons:
1. Implementation of the few methods requires relatively short programs for
the basic objects. Although a sensible modularization is desirable, we wish
to avoid an atomization, and therefore merge parts that would result in tiny
modules with the base module.
2. The elements of a graphic refer to fonts used in captions and to libraries used
in macros. The writing and reading procedures therefore carry a context
consisting of fonts and libraries as an additional parameter. Routines for
mapping a font (library) to a number according to a given context on output,
and a number to a font (library) on input are contained in module Graphics.
3. In the design of the Oberon System, a hierarchy of four modules has proven
to be most appropriate:
0. Module with base type handling the abstract data structure.
1. Module containing procedures for the representation of objects in frames
(display handling).
2. Module containing the primary command interpreter and connecting
frames with a viewer.
3. A command module scanning command lines and invoking the appro-
priate interpreters.
The module hierarchy of the Graphics System is shown here together with
its analogue in the Text System:
   Function           Graphics        Text
3. Command Scanner    Draw            Edit
2. Viewer Handler     MenuViewers     MenuViewers
1. Frame Handler      GraphicFrames   TextFrames
0. Base               Graphics        Texts
As a result, module Graphics does not only contain the base type Object,
but also its extensions Line and Caption (and Macro). Their methods are also
defined in Graphics, with the exception of drawing methods, which are defined
in GraphicFrames, because they refer to frames.
So far, we have discussed operations on individual objects and the structure
resulting from the desire to be able to add new object types without affecting
the base module. We now turn our attention briefly to operations on graphics
as a whole. They can be grouped into two kinds, namely operations involving a
graphic as a set, and those applying to the selection, i.e. to a subset only.
The former kind consists of procedures Add, which inserts a new object, Draw,
which traverses the set of objects and invokes their drawing methods, ThisObj,
which searches for an object at a given position, SelectObj, which marks an
object to be selected, SelectArea, which identifies all objects lying within a
given rectangular area and marks them, Selectable, a Boolean function, and
Enumerate, which applies the parametric procedure handle to all objects of
a graphic. Furthermore, the procedures Load, Store, Print, and WriteFile
belong to this kind.
The set of operations applying to selected objects only consists of the follow-
ing procedures: Deselect, DrawSel (drawing the selection according to a spec-
ified mode), Change (changing certain attributes of selected objects like width,
font, color), Move, Copy, CopyOver (copying from one graphic into another), and
finally Delete. Also, there exists the important procedure Open which creates
a new graphic, either loading a graphic stored as a file, or generating an empty
graphic.
The declarations of types and procedures that have emerged so far are sum-
marized in the following excerpt of the module’s interface definition.
Every frame specifies its coordinates X, Y within the display area, its size by
the attributes W (width) and H (height), and its background color col. Just as
a frame represents a (rectangular) section of the entire screen, it also shows an
excerpt of the drawing plane of the graphic. The coordinate origin need coincide
with neither the frame origin nor the display origin. The frame’s position relative
to the graphic plane’s origin is recorded in the frame descriptor by the coordinates
Xg, Yg.
X1 = X + W,  Y1 = Y + H,  x = X + Xg,  y = Y1 + Yg
X and Y (and hence also X1 and Y1) are changed when a viewer is modified, i.e.
when the frame is moved or resized. Xg and Yg are changed when the graph’s
origin is moved within a frame. The meaning of the various values is illustrated
in Figure 13.4.
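The four relations can be written as a small procedure that recomputes the derived values whenever X, Y, W, H, Xg, or Yg change. This is a minimal sketch: the record Frame below is a stand-in containing only the fields named in the text, and the procedure name SetDerived is hypothetical.

```oberon
MODULE FrameCoords;  (*illustrative sketch; not the declarations of GraphicFrames*)
  TYPE Frame* = RECORD
      X*, Y*, W*, H*: INTEGER;   (*frame position and size within the display*)
      Xg*, Yg*: INTEGER;         (*position relative to the graphic plane's origin*)
      X1*, Y1*, x*, y*: INTEGER  (*derived values*)
    END ;

  PROCEDURE SetDerived*(VAR F: Frame);
  BEGIN F.X1 := F.X + F.W; F.Y1 := F.Y + F.H;
    F.x := F.X + F.Xg; F.y := F.Y1 + F.Yg
  END SetDerived;
END FrameCoords.
```

For example, with X = 100, Y = 50, W = 400, H = 300, Xg = -20, and Yg = 0, the procedure yields X1 = 500, Y1 = 350, x = 80, and y = 350.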
END
The meanings of the mode parameter’s four possible values are the following:
mode = 0: draw object according to its state,
mode = 1: draw reflecting a transition from normal to selected state,
mode = 2: draw reflecting a transition from selected to normal state,
mode = 3: erase.
In the case of captions, for instance, the transitions are indicated by simply
inverting the rectangular area covered by the caption. No rewriting of the
captions’ character patterns is required.
A mode parameter is also necessary for reflecting object deletion. First, the
selected objects are drawn with mode indicating erasure. Only afterwards are
they removed from the graphic’s linked list.
Furthermore, the message parameter of the drawing procedure contains two
offsets x and y. They are added to the object’s coordinates, and their significance
will become apparent in connection with macros. The same holds for the color
parameter.
The drawing procedures are fairly straight-forward and use the four basic
raster operations of module Display. The only complication arises from the need
to clip the drawing at the frame boundaries. In the case of captions, a character
is drawn only if it fits into the frame in its entirety. The raster operations do
not test (again) whether the indicated position is valid.
At this point we recall that copies of a viewer (and its frames) can be
generated by the System.Copy command. Such copies display the same graphic,
but possibly different excerpts of it. When a graphic is changed by an
insertion, deletion, or any other operation, at a place that is visible in several
frames, all affected views must reflect the change. A direct call to a drawing
procedure indicating a frame and the change does therefore not suffice. Here
again, the object-oriented style solves the problem neatly: In place of a direct
call a message is broadcast to all frames, the message specifying the nature of
the required updates.
The broadcast is performed by the general procedure Viewers.Broadcast(M).
It invokes the handlers of all viewers with the parameter M. The viewer handlers
either interpret the message or propagate it to the handlers of their subframes.
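A minimal sketch of such a broadcast follows. Viewers.Broadcast is taken from the text; that it accepts any extension of Display.FrameMsg as a VAR parameter is an assumption, and the message type ChangeMsg with its field id is hypothetical and introduced only for illustration.

```oberon
MODULE BroadcastSketch;  (*illustrative sketch*)
  IMPORT Display, Viewers;

  TYPE ChangeMsg* = RECORD (Display.FrameMsg)  (*hypothetical message type*)
      id*: INTEGER  (*nature of the required update*)
    END ;

  PROCEDURE Notify*(id: INTEGER);
    VAR M: ChangeMsg;
  BEGIN M.id := id; Viewers.Broadcast(M)  (*reaches the handlers of all viewers*)
  END Notify;
END BroadcastSketch.
```

A frame handler that recognizes the message by a type test updates its own excerpt of the graphic; handlers of other frame types simply ignore it.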
Procedure obj.handle is called with a control message as parameter when
pointing at the object and clicking the middle mouse button. This allows control
to be passed to the handler of an individual object.
The definition of module GraphicFrames is summarized by the following
interface:
DEFINITION GraphicFrames;
  IMPORT Display, Graphics;
  TYPE Frame = POINTER TO FrameDesc;
    Location = POINTER TO LocDesc;
    LocDesc = RECORD
      x, y: INTEGER;
      next: Location
    END ;
    FrameDesc = RECORD (Display.FrameDesc)
      graph: Graphics.Graph;
      Xg, Yg, X1, Y1, x, y, col: INTEGER;
      marked, ticked: BOOLEAN;
      mark: LocDesc
    END ;
issues are a matter of subjective judgement, and all too often convention is being
mixed up with convenience. Nevertheless, a few criteria have emerged as fairly
generally accepted.
We base our discussion on the premise that input is provided by a key-
board and a mouse, and that keyboard input is essentially to be reserved for
textual input. The critical issue is that a mouse, apart from providing a cursor
position, allows actions to be signalled by the state of its keys. Typically, there are
far more actions than there are keys. Some mice feature a single key only, a
situation that we deem highly unfortunate. There are, however, several ways to
“enrich” key states:
1. Position. Key states are interpreted depending on the current position of
the mouse represented by the cursor. Typically, interpretation occurs by the
handler installed in the viewer covering the cursor position, and different
handlers are associated with different viewer types. The handler chosen for
interpretation may even be associated with an individual (graphic) object
and depend on that object’s type.
2. Multiple clicks. Interpretation may depend on the number of repeated clicks
(of the same key), and/or on the duration of clicks.
3. Interclicks. Interpretation may depend on the combination of keys depressed
until the last one is released. This method is obviously inapplicable for
single-key mice.
Apart from position dependence, we have quite successfully used interclicks.
A ground rule to be observed is that frequent actions should be triggered by
single-key clicks, and only variants of them should be signalled by interclicks.
The essential art is to avoid overloading this method.
Less frequent operations may as well be triggered by textual commands, i.e.
by pointing at the command word and clicking the middle button. Even for this
kind of activation, Oberon offers two variations:
1. The command is listed in a menu (title bar). This solution is favoured
when the respective viewer is itself a parameter to the command, and it
is recommended when the command is reasonably frequent, because the
necessary mouse movement is relatively short.
2. The command lies elsewhere, typically in a viewer containing a tool text.
Lastly, we note that any package such as Draw is integrated within an
entire system together with other packages. Hence it is important that the rules
governing the user interfaces of the various packages do not differ unnecessarily,
but that they display common ground rules and a common design “philosophy”.
Draw’s conventions were, as far as possible and sensible, adapted to those of the
text system. The right key serves for selection, the left for setting the caret, and
the middle key for activating general commands, in this case moving and copying
the entire graphic. Inherently, drawing involves certain commands that cannot
be dealt with in the same way as for texts. A character is created by typing on
the keyboard; a line is created by dragging the mouse while holding the left key.
Interclicks left-middle and right-middle are treated in the same way as in the
text system (copying a caption from the selection to the caret), and this is not
surprising, because text and graphics are properly integrated, i.e. captions can
be copied from texts into graphics and vice-versa.
Using different conventions depending on whether the command was acti-
vated by pointing at the caption within a text frame or within a graphics frame
would be confusing indeed.
13.6. MACROS
For many applications it is indispensable that certain sets of objects may
be named and used as objects themselves. Such a named subgraph is called a
macro. A macro thus closely mirrors the sequence of statements in a program
text that is given a name and can be referenced from within other statements:
the procedure. The notion of a graphic object becomes recursive, too. The
facility of recursive objects is so fundamental that it was incorporated in the
base module Graphics as the third class of objects.
Its representation is straight-forward: in addition to the attributes common
to all objects, a field is provided storing the head of the list of elements which
constitute the macro. In the present system, a special node is introduced
representing the head of the element list. It is of type MacHeadDesc and carries
also the name of the macro and the width and height of the rectangle covering
all elements. These values serve to speed up the selection process, avoiding their
recomputation by scanning the entire element list.
The recursive nature of macros manifests itself in recursive calls of display
procedures. In order to draw a macro, drawing procedures of the macro’s element
types are called (which may be macros again). The coordinates of the macro are
added to the coordinates of each element, which function as offsets. The color
value of the macro, also a field of the parameter of type DrawMsg, overrides the
colors of the elements. This implies that macros always appear monochrome.
An application of the macro facility is the design of schematics of electronic
circuits. Circuit components correspond to macros. Most components are rep-
resented by a rectangular frame and by labelled connectors (pins). Some of the
most elementary components, such as gates, diodes, transistors, resistors, and
capacitors are represented by standardized symbols. Such symbols, which may be
regarded as forming an alphabet of electronic circuit diagrams, are appropriately
provided in the form of a special font, i.e. a collection of raster patterns. Three
such macros are shown in Figure 13.5, together with the components from which
they are assembled. The definitions of the data types involved are:
Macro = POINTER TO MacroDesc;
MacroDesc = RECORD (ObjectDesc) mac: MacHead END ;
MacHead = POINTER TO MacHeadDesc;
MacHeadDesc = RECORD name: Name;
w, h: INTEGER; lib: Library
END ;
Library = POINTER TO LibraryDesc;
LibraryDesc = RECORD name: Name END
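The recursive drawing of a macro with accumulated offsets can be sketched as follows. The types are deliberately simplified and are not those of module Graphics: the field first, standing for the head of the element list, and the procedure name DrawMacro are hypothetical, and the treatment of colors in nested macros is an assumption.

```oberon
MODULE MacroSketch;  (*illustrative sketch; simplified types*)
  TYPE Object = POINTER TO ObjectDesc;
    DrawMsg = RECORD x, y, col: INTEGER END ;  (*offsets and color*)
    ObjectDesc = RECORD
      x, y, col: INTEGER;
      draw: PROCEDURE (obj: Object; VAR M: DrawMsg);
      next: Object
    END ;
    Macro = POINTER TO MacroDesc;
    MacroDesc = RECORD (ObjectDesc)
      first: Object  (*hypothetical: first element of the macro*)
    END ;

  PROCEDURE DrawMacro(obj: Object; VAR M: DrawMsg);
    VAR M1: DrawMsg; elem: Object;
  BEGIN
    M1.x := M.x + obj.x; M1.y := M.y + obj.y;  (*macro position becomes the offset*)
    M1.col := obj.col;  (*macro color overrides element colors*)
    elem := obj(Macro).first;
    WHILE elem # NIL DO elem.draw(elem, M1); elem := elem.next END
  END DrawMacro;
END MacroSketch.
```

If an element is itself a macro, its draw field refers to DrawMacro again, which yields the recursion described in the text.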
13.7. OBJECT CLASSES
expected that a modern graphics system allow the addition of further types of
objects. The emphasis lies here on the word addition instead of change. New
facilities are to be providable by the inclusion of new modules without requiring
any kind of adjustment, not even recompilation of the existing modules. In
practice, their source code would quite likely not be available. It is the triumph
of the object-oriented programming technique that this is elegantly possible. The
means are the extensible record type and the procedure variable, features of the
programming language, and the possibility to load modules on demand from
statements within a program, a facility provided by the operating environment.
We call, informally, any extension of the type Object a class. Hence, the
types Line, Caption, and Macro constitute classes. Additional classes can be
defined in other modules importing the type Object. In every such case, a set
of methods must be declared and assigned to a variable of type MethodDesc.
They form a so-called method suite. Every such module must also contain a
procedure, typically a command, to generate a new instance of the new class.
This command, likely to be called Make, assigns the method suite to the do field
of the new object.
This successful decoupling of additions from the system’s base suffices, al-
most. Only one further link is unavoidable: When a new graphic, containing
objects of a class not defined in the system’s core, is loaded from a file, then
that class must be identified, the corresponding module with its handlers must
be loaded—this is called dynamic loading—and the object must be generated
(allocated). Because the object in question does not already exist at the time
when reading the object’s attribute values, the generating procedure cannot
possibly be installed in the very same object, i.e. it cannot be a member of
the method suite. We have chosen the following solution to this problem:
1. Every new class is implemented in the form of a module, and every class
is identified by the module name. Every such module contains a command
whose effect is to allocate an object of the class, to assign the method suite
to it, and to assign the object to the global variable Graphics.new.
2. When a graphics file is read, the class of each object is identified and a call to
the respective module’s allocation procedure delivers the desired object. The
call consists of two parts: a call to Modules.ThisMod, which may cause the
loading of the respective class module M, and a call of Modules.ThisCommand.
Then the data of the base type Object are read, and lastly the data of the
extension are read by a call to the class method read.
The following may serve as a template for any module defining a new object
class X. Two examples are given in Section 13.9, namely Rectangles and Curves.
MODULE Xs;
  IMPORT Files, Oberon, Graphics, GraphicFrames;
  TYPE X* = POINTER TO XDesc;
    XDesc = RECORD (Graphics.ObjectDesc)
      (*additional data fields*)
    END ;
  VAR method: Graphics.Method;
      x.w := ... ;
      x.h := ... ;
      x.col := Oberon.CurCol;
      x.do := method;
      GraphicFrames.Defocus(F);
      Graphics.Add(F.graph, x);
      GraphicFrames.DrawObj(F, x)
    END
  END Make;

BEGIN
  NEW(method);
  method.module := "Xs";
  method.allocator := "New";
  method.copy := Copy;
  method.draw := Draw;
  method.selectable := Selectable;
  method.handle := Handle;
  method.read := Read;
  method.write := Write;
  method.print := Print
END Xs.
We wish to point out that also the macro and library facilities are capable of
integrating objects of new classes, i.e. of types not occurring in the declarations
of macro and library facilities. The complete interface definition of module
Graphics is obtained from its excerpt given in Sect. 13.3, augmented by the
declarations of types and procedures in Sect. 13.6 and 13.7.
PROCEDURE Open;
PROCEDURE Delete;
PROCEDURE SetWidth;
PROCEDURE ChangeColor;
PROCEDURE Store;
PROCEDURE Macro;
PROCEDURE OpenMacro;
PROCEDURE MakeMacro;
PROCEDURE LoadLibrary;
END Draw.
Fig. 13.6 Data structure for two libraries, each with three macros
data = x y w h color.
All class numbers are at least 4; the values 1, 2, and 3 are assigned to lines,
captions, and macros. x, y, w, h are two-byte integer attributes of the base type
Object. The attribute color takes a single byte. The first byte of an item being
0 signifies that the item is an identification of a new font, library, or class. If the
second byte is 0, a new font is announced, if 1 a new library, and if 2 a new class
of elements.
The same procedures are used for loading and storing a library file. In
fact, Load and Store read and write a file stretch representing a sequence of
elements which is terminated by a special value (255). In a library file each macro
corresponds to a stretch, and the terminator is followed by values specifying the
macro’s overall width, height, and its name. The structure of library files is
defined by the following syntax:
libfile = libtag {macro}.
macro = stretch w h name.
The first byte of each element is a class number within the context of the file
and identifies the class to which the element belongs. An object of the given class
is allocated by calling the class’ allocation procedure, which is obtained from the
class dictionary in the given context. The class number is used as dictionary
index. The presence of the required allocation procedure in the dictionary is
guaranteed by the fact that a corresponding index/name pair had preceded the
element in the file.
The encounter of such a pair triggers the loading of the module specifying
the class and its methods. The name of the pair consists of two parts: the
first specifies the module in which the class is defined, and it is taken as the
parameter of the call to the loader (see procedure GetClass). The second part
is the name of the relevant allocation procedure which returns a fresh object to
variable Graphics.new. Thereafter, the data defined in the base type Object
are read.
Data belonging to an extension follow those of the base type, and they are
read by the extension’s read method. This part must always be headed by a byte
specifying the number of bytes which follow. This information is used in the case
where a requested module is not present; it indicates the number of bytes to be
skipped in order to continue reading further elements.
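Skipping the data of an extension whose module is not available can be sketched as follows. The sketch assumes that module Files offers the rider operations ReadByte, Base, Pos, and Set with the parameters used here; the module and procedure names are hypothetical.

```oberon
MODULE SkipSketch;  (*illustrative sketch; Files operations are assumed as used below*)
  IMPORT Files;

  PROCEDURE SkipExtension*(VAR R: Files.Rider);
    VAR len: BYTE;
  BEGIN Files.ReadByte(R, len);  (*number of bytes that follow*)
    Files.Set(R, Files.Base(R), Files.Pos(R) + len)  (*continue behind the extension data*)
  END SkipExtension;
END SkipSketch.
```

Because the length occupies a single byte, this scheme limits the data of an extension to 255 bytes per object.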
A last noteworthy detail concerns the Move operation which appears as
surprisingly complicated, particularly in comparison with the related copy oper-
ation. The reason is our deviation from the principle that a graphics editor must
refrain from an interpretation of drawings. Responsible for this deviation was
the circumstance that the editor was at first primarily used for the preparation
of circuit diagrams. They suggested the view that adjoining, perpendicular lines
be connected. Consequently, the horizontal or vertical displacement of a line was
to preserve connections. Procedure Move must therefore identify all connected
lines, and subsequently extend or shorten them.
13.9.1. Rectangles.
In this section, we present two extensions of the basic graphics system
which introduce new classes of objects. The first implements rectangles which
are typically used for framing a set of objects. They are, for example, used
in the representation of electronic components (macros, see Fig. 13.2). Their
implementation follows the scheme presented at the end of Section 13.7 and is
reasonably straight-forward, considering that each rectangle merely consists of
four lines. Additionally, a background raster may be specified.
One of the design decisions occurring for every new class concerns the way
to display the selection. In this case we chose, in contrast to the cases of captions
and macros, not inverse video, but a small square dot in the lower right corner
of the rectangle. The data type Rectangle contains one additional field: lw
indicates the line width.
In spite of the simplicity of the notion of rectangles, their drawing method
is more complex than might be expected. The reason is that drawing methods
are responsible for appropriate clipping at frame boundaries. In this case, some
of the component lines may have to be shortened, and some may disappear
altogether.
Procedure Handle provides an example of a receiver of a control message.
It is activated as soon as the middle mouse button is pressed, in contrast to
other actions, which are initiated after the release of all buttons. Therefore,
this message allows for the implementation of actions under control of individual
handlers interpreting further mouse movements. In this example, the action
serves to change the size of the rectangle, namely by moving its lower left corner.
DEFINITION Rectangles;
  IMPORT Graphics;
  TYPE Rectangle = POINTER TO RectDesc;
    RectDesc = RECORD (Graphics.ObjectDesc)
      lw: INTEGER
    END ;
  VAR method: Graphics.Method;
  PROCEDURE New;
  PROCEDURE Make;
END Rectangles.
DEFINITION Curves;
  IMPORT Graphics;
  TYPE Curve = POINTER TO CurveDesc;
    CurveDesc = RECORD (Graphics.ObjectDesc)
      kind, lw: INTEGER
    END ;
    (*kind: 0 = up-line, 1 = down-line, 2 = circle*)
  VAR method: Graphics.Method;
  PROCEDURE MakeLine;
  PROCEDURE MakeCircle;
END Curves.
CHAPTER 14
This describes the normal case of startup. But, how did the boot loader ever
get into the platform-flash, and how did the inner core ever get into the boot
area of the disk, and how did the files of the outer core get into the file store?
In fact, how did the file store get initialized? This is described in the following
section on building tools.
Precisely to solve this problem, the boot loader has been provided with a
second source of the boot data. Instead of from the disk, it may be fetched over
a data link, in this case the RS-232 data line. This choice is set by switch 0.
0 load from the “boot track” of the disk (sectors 2 - 63)
1 load from the RS-232 line (or a network, if available)
In case 1, the data stream originates at a host computer, on which pre-
sumably the boot file had been generated or even the entire system had been
built.
A simple boot loader reading from the RS-232 line and using the stream
format described above is shown here:
MODULE* BootLoad;
  IMPORT SYSTEM;
  CONST MT = 12; SP = 14; MemLim = 0E7F00H;
    (*device addresses*)
    swi = -60; led = -60; data = -56; ctrl = -52;

  PROCEDURE Load;
    VAR len, adr, dat: INTEGER;
  BEGIN RecInt(len);  (*length of the first block in bytes*)
    WHILE len > 0 DO
      RecInt(adr);  (*destination address of the block*)
      REPEAT RecInt(dat);
        SYSTEM.PUT(adr, dat);
        adr := adr + 4;
        len := len - 4
      UNTIL len = 0;
      RecInt(len)  (*length of the next block; 0 ends the stream*)
    END ;
    SYSTEM.GET(4, adr);
    SYSTEM.LDREG(13, adr);
    SYSTEM.LDREG(12, 20H)
  END Load;

BEGIN
  SYSTEM.LDREG(SP, MemLim);
  SYSTEM.LDREG(MT, 20H);
  SYSTEM.PUT(led, 128);
END BootLoad.
Another detail that must not be ignored is the handling of traps. They are
implemented as a single BRL instruction, jumping conditionally to the address
stored in register MT, that is, to entry 0 of the module table (which is not a module
address). This address is deposited by the initialization of module System, which
contains the trap handler. However, traps may also occur during the startup
process. So, a temporary trap handler must also be installed at the very start,
that is, when initializing Kernel.
Finally, it is worth mentioning that small Oberon programs can also be
loaded and executed without the Oberon core. In fact, the boot loader is just one
such example. Programs of this kind must be marked by an asterisk immediately
after the symbol MODULE. This causes the compiler to generate a different starting
sequence. Such programs are loaded, like the boot loader in Stage 0, by the Xilinx
downloader. They must not import other modules.
DEFINITION RS232;
PROCEDURE Send(x: BYTE);
PROCEDURE Rec(VAR x: BYTE);
PROCEDURE SendInt(x: INTEGER);
PROCEDURE SendHex(x: INTEGER);
PROCEDURE SendReal(x: REAL);
PROCEDURE SendStr(x: ARRAY OF CHAR);
PROCEDURE RecInt(VAR x: INTEGER);
PROCEDURE RecReal(VAR x: REAL);
PROCEDURE RecStr(VAR x: ARRAY OF CHAR);
PROCEDURE Line;
PROCEDURE End;
END RS232.
DEFINITION PCLink1;
PROCEDURE Run*;
PROCEDURE Stop*;
END PCLink1.
Stages 1 and 2 without any access to the disk. Operating DiskCheck requires
care and knowledge of the structure of the file system (Chapter 7). The available
commands are the following:
command  parameters  action
0        s           send and mirror integer (test)
1        a, n        show (in hex) M[a], M[a + 4], ..., M[a + n * 4]
2        secno       show disk sector
3        secno       show header sector
4        secno       show directory sector
5        -           traverse directory
6        secno       clear header sector
7        -           clear directory (root page)
The essential command is the file directory traversal (5). It lists all faulty
directory sectors, showing their numbers. It also lists faulty header sectors. No
changes are made to the file system.
If a faulty header is encountered, it can subsequently be cleared (6). Thereby
the file is lost. It is not removed from the directory, though. But its length will
be zero.
Program DiskCheck must be extremely robust. No data read can be as-
sumed to be correct, no index can be assumed to lie within its declared bounds,
no sector number can be assumed to be valid, and no directory or header page
may be assumed to have the expected format. Guards and error diagnostics take
a prominent place.
Whereas a faulty sector in a file in the worst case leads to the loss of that file,
a fault in a sector carrying a directory page is quite disastrous: not only
the files referenced from that page, but also those referenced from descendant
pages become inaccessible. A fault in the root page even causes the loss of all
files. The catastrophe is of such proportions that measures should be taken even
if the case is very unlikely. After all, it may happen, and it indeed has occurred.
The only way to recover files that are no longer accessible from the directory
is by scanning the entire disk. In order to make a search at all possible, every
file header carries a mark field that is given a fixed, constant value. It is very
unlikely, but not entirely impossible, that data sectors which happen to have the
same value at the location corresponding to that of the mark, may be mistaken
to be headers.
The tool performing such a scan is called Scavenger. It is, like DiskCheck,
a simple command interpreter with the following available commands:
command  parameters  action
0        s           Send and mirror integer (test)
1        n           Scan the first n sectors and collect headers
2        -           Display names of collected files
3        -           Build new directory
4        -           Transfer new directory to the disk
5        -           Clear display
During the scan, a new directory is gradually built up in primary store.
Sectors marked as headers are recorded by their name and creation date. The
scavenger is the reason for recording the file name in the header, although it
remains unused there by the Oberon System. Recovery of the date is essential,
because several files with the same name may be found. If one is found with a
newer creation date, the older entry is overwritten.
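The collection rule described above (keep a sector only if it carries the header mark, and let a newer creation date replace an older entry of the same name) can be sketched as follows. This is a simplified model, not the Scavenger's code: the mark value, the tuple layout of a scanned sector, and the name collect_headers are assumptions made for illustration.

```python
HEADER_MARK = 0x9BA71D86   # placeholder constant; treat the concrete value as an assumption

def collect_headers(sectors):
    """sectors: iterable of (secno, mark, name, date).
    Returns a directory mapping name -> (date, secno)."""
    directory = {}
    for secno, mark, name, date in sectors:
        if mark != HEADER_MARK:
            continue                          # not a file header
        known = directory.get(name)
        if known is None or date > known[0]:
            directory[name] = (date, secno)   # newer version overwrites the older entry
    return directory

scan = [
    (29, HEADER_MARK, "Edit.Mod", 100),
    (58, 0, "", 0),                           # ordinary data sector
    (87, HEADER_MARK, "Edit.Mod", 250),       # same name, newer date
    (116, HEADER_MARK, "Draw.Mod", 90),
]
found = collect_headers(scan)
```

In this example the entry for Edit.Mod ends up pointing at sector 87, the version with the later date.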
Command 4 transfers the new directory to the disk. For this purpose, it is
necessary to have free sectors available. These have been collected during the
scan: both old directory sectors (identified by a directory mark similar to the
header mark) and overwritten headers are used as free locations.
The scavenger has proven its worth on more than one occasion. Its main
drawback is that it may rediscover files that had been deleted. The deletion
operation by definition affects only the directory, but not the file. Therefore, the
header carrying the name remains unchanged and is discovered by the scan. All
in all, however, it is a small deficiency.
CHAPTER 15
In this chapter, a few modules are presented that do not belong to Oberon’s
system core. However, they belong to the system in the sense of being basic,
and of assistance in some way, either to construct application programs, to
communicate with external computers, or to analyze existing programs.
y := c1*x; (*1/ln(2)*)
n := FLOOR(y + 0.5);
y := y - FLT(n);
yy := y*y;
p := ((p2*yy + p1)*yy + p0)*y;
p := p/((yy + q1)*yy + q0 - p) + 0.5;
PACK(p, n+1);
RETURN p
END exp;
ln (a × b) = ln a + ln b
First, the argument x is transposed into the interval [0, π/4] by computing
n := FLOOR(y+0.5); y := (y - n)
We then distinguish between two approximating polynomials, depending on
whether x < π/4.
PROCEDURE sin(x: REAL): REAL;
CONST c1 = 6.3661977E-1; (*2/pi*)
p0 = 7.8539816E-1;
p1 = -8.0745512E-2;
p2 = 2.4903946E-3;
p3 = -3.6576204E-5;
p4 = 3.1336162E-7;
p5 = -1.7571493E-9;
p6 = 6.8771004E-12;
q0 = 9.9999999E-1;
q1 = -3.0842514E-1;
q2 = 1.5854344E-2;
q3 = -3.2599189E-4;
q4 = 3.5908591E-6;
q5 = -2.4609457E-8;
q6 = 1.1363813E-10;
VAR n: INTEGER; y, yy, f: REAL;
BEGIN
y := c1*x;
IF y >= 0.0 THEN n := FLOOR(y + 0.5)
ELSE n := FLOOR(y - 0.5)
END ;
y := (y - FLT(n)) * 2.0;
yy := y*y;
IF ODD(n) THEN f := (((((q6*yy + q5)*yy + q4)*yy + q3)*yy +
q2)*yy + q1)*yy + q0
ELSE f := ((((((p6*yy + p5)*yy + p4)*yy + p3)*yy + p2)*yy +
p1)*yy + p0)*y
END ;
IF ODD(n DIV 2) THEN f := -f END ;
RETURN f
END sin;
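To experiment with the procedure outside the Oberon system, it can be transcribed into Python. The sketch below copies the constants and control flow of sin as listed; it runs in double precision rather than REAL, so it only indicates the behaviour of the approximation. The names sin_approx and horner are hypothetical.

```python
import math

C1 = 6.3661977e-1  # 2/pi
P = [7.8539816e-1, -8.0745512e-2, 2.4903946e-3, -3.6576204e-5,
     3.1336162e-7, -1.7571493e-9, 6.8771004e-12]
Q = [9.9999999e-1, -3.0842514e-1, 1.5854344e-2, -3.2599189e-4,
     3.5908591e-6, -2.4609457e-8, 1.1363813e-10]

def horner(coeffs, t):
    """Evaluate coeffs[0] + coeffs[1]*t + ... by Horner's scheme."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * t + c
    return acc

def sin_approx(x: float) -> float:
    y = C1 * x
    n = math.floor(y + 0.5) if y >= 0.0 else math.floor(y - 0.5)
    y = (y - n) * 2.0
    yy = y * y
    if n % 2 == 1:              # ODD(n): second polynomial (coefficients q)
        f = horner(Q, yy)
    else:                       # first polynomial (coefficients p), odd in y
        f = horner(P, yy) * y
    if (n // 2) % 2 == 1:       # ODD(n DIV 2)
        f = -f
    return f
```

Comparing sin_approx(x) with math.sin(x) over a few periods is a quick way to check a transcription of the coefficients.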
END RecName;
PROCEDURE Task;
VAR len, n, i: INTEGER;
x, ack, len1, code: BYTE;
name: ARRAY 32 OF CHAR;
F: Files.File;
R: Files.Rider;
buf: ARRAY 256 OF BYTE;
BEGIN
IF SYSTEM.BIT(stat, 0) THEN (*byte available*)
Rec(code);
IF code = SND THEN (*send file*)
RecName(name);
F := Files.Old(name);
IF F # NIL THEN
Send(ACK); len := Files.Length(F); Files.Set(R, F, 0);
REPEAT
IF len >= BlkLen THEN len1 := BlkLen
ELSE len1 := len
END ;
Send(len1);
n := len1;
len := len - len1;
WHILE n > 0 DO Files.ReadByte(R, x); Send(x); DEC(n) END ;
Rec(ack);
IF ack # ACK THEN len := 0 END
UNTIL len1 < BlkLen
ELSE Send(11H)
END
ELSIF code = REC
THEN (*receive file*) RecName(name); F := Files.New(name);
IF F # NIL THEN
Files.Set(R, F, 0);
Send(ACK);
REPEAT
Rec(x);
len := x; i := 0;
WHILE i < len DO Rec(x); buf[i] := x; INC(i) END ;
i := 0;
PROCEDURE Run*;
BEGIN
Oberon.Install(T);
Texts.WriteString(W, "PCLink started");
Texts.WriteLn(W); Texts.Append(Oberon.Log, W.buf)
END Run;
PROCEDURE Stop*;
BEGIN
Oberon.Remove(T);
Texts.WriteString(W, "PCLink stopped");
Texts.WriteLn(W);
Texts.Append(Oberon.Log, W.buf)
END Stop;
BEGIN
Texts.OpenWriter(W);
T := Oberon.NewTask(Task, 0)
END PCLink.
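The block framing used by Task when sending a file can be illustrated in isolation: every block is preceded by a length byte, and the first block shorter than BlkLen (possibly empty) ends the transfer. The Python sketch below models only this framing and leaves out the acknowledgement bytes. The value 255 for the block length is an assumption, since BlkLen is not defined in the excerpt, and the function names are hypothetical.

```python
BLK_LEN = 255  # assumed block length

def frame_blocks(data: bytes, blk_len: int = BLK_LEN) -> list:
    """Split data into blocks, each prefixed by its length byte.
    The final block is shorter than blk_len and terminates the transfer."""
    blocks = []
    pos = 0
    while True:
        chunk = data[pos:pos + blk_len]
        blocks.append(bytes([len(chunk)]) + chunk)
        pos += len(chunk)
        if len(chunk) < blk_len:
            break
    return blocks

def unframe(blocks: list) -> bytes:
    """Receiver side: drop the length bytes and concatenate the payloads."""
    return b"".join(b[1:] for b in blocks)

payload = bytes(range(256)) * 2      # 512 bytes
blocks = frame_blocks(payload)
```

A file whose length is an exact multiple of the block length is followed by an empty block, which is how the receiver recognizes the end.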
16.1. INTRODUCTION
The design of the processor to be described here in detail was guided by
two intentions. The first was to present an architecture that is distinct in its
regularity, minimal in the number of features, yet complete and realistic. It
should be ideal to present and explain the main principles of processors. In
particular, it should connect the subjects of architectural and compiler design,
of hardware and software, which are so closely interconnected.
Clearly “real”, commercial processors are far more complex than the one
presented here. We concentrate on the fundamental concepts rather than on
their elaboration. We strive for a fair degree of completeness of facilities, but
refrain from their “optimization”. In fact, the dominant part of the vast size
and complexity of modern processors and software is due to speed-up called
optimization. It is the main culprit in obfuscating the basic principles, making
them hard, if not impossible to study. In this light, the choice of a RISC (Reduced
Instruction Set Computer) is obvious.
The use of an FPGA provides a substantial amount of freedom for design.
Yet, the hardware designer must be much more aware of availability of resources
and of limitations than the software developer. Also, timing is a concern that
usually does not occur in software, but pops up unavoidably in circuit design.
Nowadays circuits are no longer described in terms of elaborate diagrams, but
rather as a formal text. This lets circuit and program design appear quite similar.
The circuit description language we use here, Verilog, appears almost the same as
a programming language. But one must be aware that differences still exist, the
main one being that in software we create mostly sequential processes, whereas in
hardware everything “runs” concurrently. However, the presence of a language—
a textual definition—is an enormous advantage over graphical schemata. Even
more so are systems (tools) that compile such texts into circuits, taking over
the arduous task of placing components and connecting them (routing). This
holds in particular for FPGAs, where components and wires connecting them are
limited, and routing is a very difficult and time-consuming matter.
The development of this RISC progressed through several stages. The first
was the design of the architecture itself, (more or less) independent of subsequent
implementation considerations. Then followed a first implementation called
RISC-0. For this a Harvard Architecture was chosen, implying that two distinct
memories are used for program and for data. For both chip-internal block
RAMs were used. The Harvard architecture allows for a neat separation of the
arithmetic from the control unit.
But these blocks of RAM are relatively small on the used Spartan-3 develop-
ment board (1–4K words). This board, however, provides also an FPGA-external
static RAM with a capacity of 1 MB. In a second effort, the BRAM for data was
replaced by this SRAM. Both instructions and data are placed into the SRAM,
resulting in a von Neumann architecture.
The RISC hardware is characterized by three interfaces. The first is the
programmer’s interface, the architecture, that is, those aspects that are rele-
vant to the programmer, in particular, the instruction set. It is described in
Appendix A2. The second is the hardware interface between the processor
core and its environment, described here. The third is that which connects
the environment with physical devices such as memory, keyboard and display.
This is described in Chapter 17.
module RISC5(
input clk, rst, stallX,
input [31:0] inbus, codebus,
output [19:0] adr, // memory and device addresses
output rd, wr, ben, // read, write, byte enable
output [31:0] outbus); // -- control signals for memory
The main parts of the hardware interface are three busses, the data input
and output busses, the code bus, and the address bus. Signals rd and wr indicate
whether a read or a write operation is to be performed. ben indicates a byte
(rather than word) access. The entire processor operates synchronously on the
clock clk (25 MHz on Spartan-3), rst is the reset signal (from a push button
on the development board), and stallX is the input to stall the processor.
B and C0 are the outputs from the register bank, and A is its input. The
register numbers ira for port A, irb for port B, and irc for port C0 are taken from
4-bit fields of the instruction register IR. C1 is the multiplexer selecting among
the register output C0 and the immediate field imm. s3 and t3 are outputs of
the shift units (Sect. 16.2.1). product is the output of the multiplier (16.2.2),
quotient and remainder those of the divider (16.2.3), fsum that of the floating-
point adder (16.2.4), fprod that of the floating-point multiplier (16.2.5), and
fquot the output of the floating-point divider (16.2.6).
assign A = R[ira];
assign B = R[irb];
assign C0 = R[irc];
assign C1 = q ? {{16{v}}, imm} : C0;
The following represents the main instruction decoding and selection of
results. The opcodes refer to specific values of fields p and op of IR. Note that
if x then y else z is denoted in Verilog by x ? y : z.
assign aluRes =
MOV ? (q ? (~u ? {{16{v}}, imm} : {imm, 16'b0}) :
16.2.1. Shifters.
Shifters are multi-way multiplexers. For a 32-bit word, the simplest solution
would be 32 32-way multiplexers. But this is hardly economical. On the FPGA
used here, 4-way muxes are basic cells. It is therefore beneficial, to compose a
shifter out of 4-way muxes. Now the obvious solution is to use 3 levels of muxes
through which data flow. The first level shifts by amounts of 0, 1, 2, or 3, the
second by amounts of 0, 4, 8, 12, and the third by 0 or 16. This scheme is
programmed as follows for left shifts (instruction LSL) with B as input, sc0 =
C1[1:0] and sc1 = C1[3:2] as shift counts, and t3 as output:
assign t1 = (sc0 == 3) ? {B[28:0], 3'b0} :
(sc0 == 2) ? {B[29:0], 2'b0} :
(sc0 == 1) ? {B[30:0], 1'b0} : B;
assign t2 = (sc1 == 3) ? {t1[19:0], 12'b0} :
(sc1 == 2) ? {t1[23:0], 8'b0} :
(sc1 == 1) ? {t1[27:0], 4'b0} : t1;
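The three-level scheme can be checked with a small software model. The Python sketch below mirrors t1 and t2 and adds the third level described in the text (shift by 0 or 16); taking bit 4 of the shift count as the selector of that level is an assumption, as the corresponding assignment is not shown above. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def lsl_staged(b: int, count: int) -> int:
    """Left shift of a 32-bit word composed of three multiplexer levels."""
    sc0 = count & 3            # C1[1:0]: shift by 0, 1, 2, 3
    sc1 = (count >> 2) & 3     # C1[3:2]: shift by 0, 4, 8, 12
    sc2 = (count >> 4) & 1     # assumed C1[4]: shift by 0 or 16
    t1 = (b << sc0) & MASK32
    t2 = (t1 << (4 * sc1)) & MASK32
    t3 = (t2 << (16 * sc2)) & MASK32
    return t3
```

For every count from 0 to 31 the result equals a direct 32-bit left shift by that count, since the three partial shift amounts add up to the count.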
16.2.2. Multiplication.
Multiplication is an inherently more complex operation than addition and
subtraction. After all, multiplication can be composed (of a sequence) of addi-
tions. There are many methods to implement multiplication, all—of course—
based on the same concept of a series of additions. They show the fundamental
problem of trade-off between time and space (circuitry). Some solutions operate
with a minimum of circuitry, namely a single adder used for all 32 additions
executed sequentially (in time). They obviously sacrifice speed. The other
extreme is multiplication in a single cycle, using 32 adders in series (in space).
This solution is fast, but the amount of required circuitry is high.
Before we present the sequential solution, let us briefly recapitulate the
basics of a multiplication p ← x × y. Here p is the product, x the multiplier,
and y the multiplicand. Let x and y be unsigned integers. Consider x in binary
form.
$$x = x_{31} \times 2^{31} + x_{30} \times 2^{30} + \dots + x_1 \times 2^1 + x_0 \times 2^0$$
Evidently, the product is the sum of 32 terms of the form $x_k \times 2^k \times y$, i.e. of y
left shifted by k positions multiplied by $x_k$. Since $x_k$ is either 0 or 1, each term
is either 0 or y (shifted). Multiplication is thus performed by an adder and a
selector. The selector is controlled by $x_k$, a bit of the multiplier. Instead of
selecting this bit among $x_0 \dots x_{31}$, we right shift x by one bit in each step. Then
the selection is always according to $x_0$. The add-shift step then is
IF ODD(x) THEN p := p + y END ;
y := 2*y; x := x DIV 2
whereby multiplication by 2 is done by a left shift, and division by 2 by a right
shift. As an example, consider the multiplication of two 4-bit integers x = 5 and
y = 3, requiring 4 steps:
             p          x     y
             0000’0000  0101  0000’0011
add y to p   0000’0011  0101  0000’0011
shift        0000’0011  0010  0000’0110
add 0 to p   0000’0011  0010  0000’0110
shift        0000’0011  0001  0000’1100
add y to p   0000’1111  0001  0000’1100
shift        0000’1111  0000  0001’1000
add 0 to p   0000’1111  0000  0001’1000
shift        0000’1111  0000  0011’0000   p = 15
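The add-shift step and the trace above can be reproduced with a few lines of Python. This is a plain software model for unsigned operands; the function name and the bits parameter are illustrative.

```python
def multiply_add_shift(x: int, y: int, bits: int = 32) -> int:
    """Unsigned multiplication by repeated add-shift steps."""
    p = 0
    for _ in range(bits):
        if x & 1:      # IF ODD(x) THEN p := p + y END
            p += y
        y <<= 1        # y := 2*y
        x >>= 1        # x := x DIV 2
    return p

result = multiply_add_shift(5, 3, bits=4)
```

With x = 5, y = 3 and four steps the model follows the rows of the table and yields 15.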
The shifting of x to the right also suggests that instead of shifting y to the
left in each step, we keep y in the same position and shift the partial sum p
to the right. We notice that the size of x decreases by 1 in each step, whereas
the size of p increases by 1. This allows to pack p and x into a single double
register <B, A> with a shifting border line. At the end, it contains the product
p = x × y.
             p         x
             0000      0101
add y to p   0011      0101
shift        00011     010
add 0 to p   00011     010
shift        000011    01
add y to p   001111    01
shift        0001111   0
add 0 to p   0001111   0
shift        00001111        p = 15
p = {B[31:0], A[31:32-k]},
x = A[31-k:0],
k = 0 ... 31
The multiplier is controlled by a rudimentary state machine S, actually a simple
5-bit counter running from 0 to 31. The multiplier is shown schematically in
Figure 16.3.
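The packing of p and x into the double register <B, A> can likewise be modelled in software. In the Python sketch below, b and a stand for B and A; the sum may temporarily need one extra bit, which is shifted back into range in the same step. This models unsigned operands only, and the function name is hypothetical.

```python
def multiply_packed(x: int, y: int, bits: int = 32) -> int:
    """Unsigned multiplication with partial sum and multiplier sharing <B, A>."""
    mask = (1 << bits) - 1
    b, a = 0, x & mask                 # B: partial sum, A: multiplier
    for _ in range(bits):              # counter S runs over all bit positions
        s = b + ((y & mask) if (a & 1) else 0)    # add y or 0; may need bits+1 bits
        a = ((s & 1) << (bits - 1)) | (a >> 1)    # low bit of the sum enters A from the left
        b = s >> 1                                # shift the partial sum right
    return (b << bits) | a             # {B, A} holds the product

result = multiply_packed(5, 3, bits=4)
```

For the 4-bit example, the intermediate values of b and a after each step correspond to the rows of the second table.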
The multiplier interprets its operands as signed (u = 0) or unsigned (u =
1) integers. The difference between unsigned and signed representation is that
in the latter case the first term has a negative weight ($-x_{31} \times 2^{31}$). Therefore,
implementation of signed multiplication requires very little change: Term 31 is
subtracted instead of added (see complete program listing below).
16.2.3. Division.
Division is similar to multiplication in structure, but slightly more compli-
cated. We present its implementation by a sequence of 32 shift-subtract steps,
the complement of add-shift. We here discuss division of unsigned integers only.
q = x DIV y
r = x MOD y
x = q × y + r with 0 ≤ r < y
Both q and r are held in registers. Initially we set r to x, the dividend, and then
subtract multiples of y (the divisor) from it, each time checking that the result
is not negative. This shift-subtract step is
r := 2*r; q := 2*q;
IF r - y >= 0 THEN r := r - y; q := q + 1 END
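A software model of shift-subtract division is sketched below in Python. It uses a common formulation in which the remainder register starts at 0 and the bits of the dividend are shifted into it from q, one per step; this arrangement is an assumption for the purpose of illustration and may differ in detail from the circuit. The function name is hypothetical.

```python
def divide_shift_subtract(x: int, y: int, bits: int = 32):
    """Unsigned division by 'bits' shift-subtract steps. Returns (q, r)."""
    if y <= 0:
        raise ValueError("divisor must be positive")
    mask = (1 << bits) - 1
    r, q = 0, x & mask
    for _ in range(bits):
        # shift the pair <r, q> left by one bit
        r = (r << 1) | ((q >> (bits - 1)) & 1)
        q = (q << 1) & mask
        if r - y >= 0:        # subtract only if the result is not negative
            r -= y
            q += 1
    return q, r

quotient, remainder = divide_shift_subtract(13, 3, bits=4)
```

For x = 13 and y = 3 this gives q = 4 and r = 1, satisfying x = q × y + r with 0 ≤ r < y.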
Numbers are represented in sign-magnitude form. This implies that for sign
inversion only the sign bit must be inverted, and exponent and mantissa remain
unchanged.
Zero is a special case represented by 32 0-bits, and therefore has to be treated
separately. Furthermore, e = 255 denotes “not a number”. It is generated in the
case of arithmetic overflow.
module FPAdder(
input clk, run, u, v,
input [31:0] x, y,
output stall,
output [31:0] z);
is
module FPMultiplier(
input clk, run,
input [31:0] x, y,
output stall,
output [31:0] z);
module FPDivider(
input clk, run,
input [31:0] x, y,
output stall,
output [31:0] z);
This is reflected by the following program text, and shown in Figure 16.8.
reg [17:0] PC;
reg [31:0] IRBuf;
wire [31:0] IR;
wire [31:0] pmout;
wire [17:0] pcmux, nxpc;
wire cond;
assign IR = codebus;
assign nxpc = PC + 1;
assign pcmux = (~rst) ? 0 :
(stall) ? PC : // stall
(BR & cond & u) ? off + nxpc :
(BR & cond & ~u) ? C0[19:2] :
nxpc;
always @ (posedge clk) begin PC <= pcmux; end
wire S;
assign S = N ^ OV;
assign cond = IR[27] ^
((cc == 0) & N | // MI, PL
(cc == 1) & Z | // EQ, NE
(cc == 2) & C | // CS, CC
(cc == 3) & OV | // VS, VC
(cc == 4) & (C| Z) | // LS, HI
(cc == 5) & S | // LT, GE
(cc == 6) & (S| Z) | // LE, GT
(cc == 7)); // T, F
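The condition logic can be tabulated with a small Python model, which helps when checking which flag combinations cause a branch. The flags are passed as 0 or 1, and invert stands for IR[27]; the function name is hypothetical.

```python
def branch_condition(cc: int, invert: int, N: int, Z: int, C: int, OV: int) -> bool:
    """Model of cond: the selected flag expression, inverted if IR[27] is set."""
    S = N ^ OV
    selected = [
        N,        # 0: MI, PL
        Z,        # 1: EQ, NE
        C,        # 2: CS, CC
        OV,       # 3: VS, VC
        C | Z,    # 4: LS, HI
        S,        # 5: LT, GE
        S | Z,    # 6: LE, GT
        1,        # 7: T, F
    ][cc]
    return bool(invert ^ selected)

taken_eq = branch_condition(cc=1, invert=0, N=0, Z=1, C=0, OV=0)   # EQ with Z set
taken_ne = branch_condition(cc=1, invert=1, N=0, Z=1, C=0, OV=0)   # NE with Z set
```

Each code thus yields a pair of complementary conditions, selected by the single bit IR[27].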
There is, unfortunately, a complication obfuscating the simple scheme presented
so far. It stems from the necessity to initialize the processor. Only registers and
memory blocks (BRAM) can be initialized and loaded by the available FPGA-
tools. How, then, is a program (in our case the boot loader) moved into memory,
the chip-external SRAM? The following scheme has been chosen:
The initial program is loaded into a BRAM (1K × 32). This block is memory-
mapped into high-end addresses in the range of the data stack. On startup, the
flag PMsel is set and IR is loaded from pmout (from the BRAM) at StartAdr.
At the end of the program (boot loader), a branch instruction with destination 0
jumps to the beginning of the program that had just been loaded into SRAM by
the boot loader. This is, presumably, but not necessarily, the operating system.
The following changes and additions are required:
localparam StartAdr = 18'b111111100000000000; // 0FE000H
reg PMsel; // memory select for instruction fetch
reg [31:0] IRBuf;
dbram32 PM ( // BRAM
.clka (clk),
.rdb (pmout), // output port
.ab (pcmux[10:0])); // address
The decoder for output and the multiplexer for input determine the various
addresses of devices:
adr (dec, hex)   input                    output
 0  0FFFFFFC0    millisecond counter      reserved
 4  0FFFFFFC4    switches                 LEDs
 8  0FFFFFFC8    RS-232 data              RS-232 data
12  0FFFFFFCC    RS-232 status            RS-232 control
16  0FFFFFFD0    SPI data (SD-card, net)  SPI data (SD-card, net)
20  0FFFFFFD4    SPI status               SPI control
24  0FFFFFFD8    PS/2 keyboard
28  0FFFFFFDC    mouse
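Programs usually refer to these devices by small negative numbers, as in the constants swi = -60, data = -56 and ctrl = -52 of the boot loader. Interpreted as 32-bit two's complement values, they denote the addresses of the table. The following Python lines show the correspondence; the helper name to_address is hypothetical.

```python
def to_address(offset: int) -> int:
    """Interpret a (negative) integer as an unsigned 32-bit address."""
    return offset & 0xFFFFFFFF

devices = {"swi/led": -60, "data": -56, "ctrl": -52}
addresses = {name: f"{to_address(off):08X}" for name, off in devices.items()}
```

For example, -60 corresponds to FFFFFFC4, the entry for switches and LEDs.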
The circuitry connecting with the SRAM is part of this module, whereas
the drivers for the other devices are described in separate modules. Note: The
signals to and from devices must be listed in the heading of the top module,
which is not imported by any other module. Their pin numbers are specified
in a configuration file (.ucf). For details, the reader is referred to the program
listing, as several items are rather dependent on the given Spartan-3 board.
In the driver for the keyboard a 16-byte fifo buffer is inserted, forming a
queue. This is necessary in order to avoid loss of characters when the processor
is tied up in computation.
module PS2(
input clk, rst,
input done, // "byte has been read"
output rdy, // "byte is available"
output shift, // shift in, transmitter
output [7:0] data,
input PS2C, // clock
input PS2D); // serial input
direction of the movement (left or right). A second light and sensor solve this
problem. The distance between the two lights is half the distance of adjacent
spokes.
17.2.3. The SPI interface for the SD-card (disk) and the Net.
SPI (Serial Peripheral Interface) is similar to PS/2, and also synchronous.
However, there may be many participants. They are configured in a loop as
shown in Figure 17.7, and the clock is provided by a master, namely the RISC.
SPI requires 3 wires (apart from ground).
Here, however, no use is made of SPI’s ring topology. Instead, one master
interface is serving both the disk and the net. The connection is determined in
module RISC5Top. The packet (and thus the shift register) is 32 bits long.
Transmission frequency is 0.4 MHz at startup (as required by the SD-card),
and then is raised to 8.33 MHz. Details are shown in the respective program
listing.
Fig. 17.8 Connections between SPI, SD-card, and Net (see RISC5Top.v)
bits for each pixel on the screen. In this case, there is exactly one bit per pixel,
signalling black or white. For a 1024 × 768 pixel display area, 96 KB are required.
The pixel position on the display is not determined by an address. Instead,
data are received by the display purely sequentially, and the position is indirectly
determined by two synchronization signals, hsync (for horizontal sync) at the end
of each line, and vsync (for vertical sync) at the end of every frame. This scheme
originates from cathode ray tube (CRT) monitors, where an electron beam is
sweeping the screen. It is deflected by magnetic fields, which require some time
to sweep back. The timing with retrace periods was retained for LCD displays
as a legacy.
The heart of the controller consists of a data buffer (32 bits) fed from memory
and shifted out bit by bit to the display, and of two counters hcnt and vcnt,
representing the horizontal and vertical coordinates. The memory word address
is derived from hcnt and vcnt:
Every line consists of 1024 pixels (32 words). The challenge is to find a design
with as few registers and comparators as possible. There are two signals for
suppressing video data: hblank, vblank. They are needed for turning the light
off during retrace.
Let us generalize this scheme to displays of w pixels per line and h lines per
frame. Also, let w0 be the number of pixels per line including those of the retrace
time, and h0 be the number of lines including the vertical retrace. Also, let the
number of displayed frames per second be n. Then the pixel frequency is
f = w0 × h0 × n
This will in all probability be different from the system clock’s frequency. Therefore
the need arises for a different pixel clock. It is generated by the FPGA’s
built-in digital clock manager (dcm). It multiplies and divides the system clock
by selectable factors. Note that the refresh rate may vary within certain bounds
for all brands of monitors. Therefore, a simple factor may be chosen for division
and multiplication. Examples:
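One worked instance of the formula, as a minimal Python sketch: the line and frame totals 1344 and 806 are the commonly published timing figures for 1024 × 768 at 60 frames per second and are an assumption here, not values taken from this text. The function names are illustrative.

```python
def frame_buffer_bytes(w: int, h: int, bits_per_pixel: int = 1) -> int:
    """Memory needed for a w x h display area."""
    return w * h * bits_per_pixel // 8

def pixel_frequency(w0: int, h0: int, n: int) -> int:
    """f = w0 * h0 * n, with w0 and h0 including the retrace periods."""
    return w0 * h0 * n

size = frame_buffer_bytes(1024, 768)      # one bit per pixel
f = pixel_frequency(1344, 806, 60)        # assumed totals including retrace
```

The buffer size comes to 98304 bytes, that is 96 KB, and the pixel frequency to 64,995,840 Hz, or roughly 65 MHz.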
module VID(
input clk, clk25, inv,
input [31:0] viddata,
output reg req, // read request
output hsync, vsync, // to display
output [17:0] vidadr, output [2:0] RGB);
The input signal start triggers the state machine by setting register run.
The transmitter has 2 counters and a shift register. Counter tick runs from
0 to 1302, yielding a frequency of 25000 / 1302 = 19.2 kHz, the transmission
rate for bits. The signal endtick advances counter bitcnt, running from 0 to 9
(the number of bits in a packet). Signal endbit resets run and the counter to 0.
Signal rdy indicates whether or not a next byte can be loaded and sent.
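The numbers in this paragraph can be checked with a short Python sketch. The bit rate follows from the clock frequency and the tick count given above. The framing of a packet as one start bit (0), eight data bits with the least significant bit first, and one stop bit (1) is the conventional asynchronous format and is assumed here rather than taken from the listing. The function names are hypothetical.

```python
CLOCK_HZ = 25_000_000     # system clock
TICKS_PER_BIT = 1302      # clock cycles per transmitted bit

def bit_rate(clock_hz: int = CLOCK_HZ, ticks: int = TICKS_PER_BIT) -> float:
    return clock_hz / ticks

def frame_byte(value: int) -> list:
    """Ten bits per packet: start bit, eight data bits (LSB first), stop bit."""
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]

rate = bit_rate()
packet = frame_byte(0x41)
```

The rate is within a fraction of a percent of 19200 bits per second, and each byte occupies ten bit periods on the line.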
module RS232T(
input clk, rst, // system clock, 25 MHz
input start, // request to accept and send a byte
input [7:0] data,
output rdy, // status
output TxD); // serial data
reg run;
reg [11:0] tick;
reg [3:0] bitcnt;