SwiftForth, SwiftX, SwiftOS, pF/x, polyFORTH, and chipFORTH are trademarks of FORTH, Inc. All
other brand and product names are trademarks or registered trademarks of their respective
companies.
This document contains information proprietary to FORTH, Inc. Any reproduction, disclosure, or
unauthorized use of this document, either in whole or in part, is expressly forbidden without
prior permission in writing from:
FORTH, Inc.
Los Angeles, California
www.forth.com
Forth Application Techniques
CONTENTS
Section 1: Introduction
1.1 A Brief History of Forth
1.2 The Philosophy of Forth
1.3 Course Hardware & Software
1.4 Typographic Conventions
1.5 Some Definitions of Terms
This book was designed to support classes in Forth programming taught at FORTH,
Inc. My colleagues and I at FORTH, Inc. have been training Forth programmers since
1973, and literally thousands of programmers from all over the world have taken
our courses.
Some of the material in this book, therefore, found its way into our training curriculum
years ago. Other sections are new, reflecting not only changes in Forth systems
and usage, but also new ways to present challenging material, such as defining
words (Section 8.6). I have written some parts, and others have been written by
some of the excellent instructors I've worked with over the years. Among these are
Kim Harris, Al Krever, Randy Leberknight, Ned Conklin, and Gary Sprung. I'm grate-
ful to all of them.
The course is intended to introduce Forth to people with little or no prior experi-
ence in the language. However, it does assume the reader is familiar with general
computing concepts (bytes, memory, addresses, data objects, etc.). Prior program-
ming experience in some language is helpful, though not required.
There are three major components of a course:
1. Lectures
2. Supporting text with examples
3. Problems for the student to solve
If you are not using this book in a class, the first element is missing. Realistically,
this means you'll miss a lot of background details and explanations of why things
work the way they do. In that case, we strongly recommend that you supplement
this book with a more detailed text on Forth, such as Forth Programmer's Handbook
(available at www.forth.com).
However, of these three elements, the most important is student problem solving.
No lectures or text can do more than provide a context for working the problems,
which is where the real learning takes place. We urge you, therefore, to work all the
problems in this book on a Forth system. Although we've used FORTH, Inc. products
for our classes and some clearly marked, product-specific functions are discussed
in this book, virtually all the problems can be worked on any ANS Forth system. Use
a programmer's editor to edit your work in files, and load them following instruc-
tions for the system you're using.
When you work problems, always strive for the shortest and simplest solution. No
problem in this book requires more than a few lines of code. Also, try to follow the
style guidelines in Section 11. The most important stylistic element, which we
require of all our students, is to start each definition with a stack comment (dis-
cussed in Section 2.2.2 and Section 11.1.1). By doing this you'll help yourself visual-
ize how your definition should work as you write it.
By following these guidelines you can become not only a Forth programmer, but a
good Forth programmer.
Solutions to the problems in this book are available on our web site:
www.forth.com/forth/answers5.txt
Good luck!
Elizabeth D. Rather
FORTH, Inc.
SECTION 1: INTRODUCTION
Reference The paper "The Evolution of Forth" gives in-depth details of Forth's history:
www.forth.com/resources/evolution
Forth expert Wil Baden¹ supplied the following definition of Forth to the committee
working on an ANSI standard for the language:
"Forth is a language for direct communication between human beings and machines.
Using natural language diction and machine-oriented syntax, Forth provides an eco-
nomical, productive environment for interactive compilation and interpretation of
programs, low-level access to computer-controlled hardware, and the ability to
extend the language itself."
Wil's discussion of this paragraph was interesting, and we include it here:
The first sentence tries to capture the general spirit of Forth. The second sentence
gives the six most important features of Forth:
Using natural language diction and machine-oriented syntax ...
All computer languages claim to use natural language diction. (Diction is what you
find in dictionaries.) Forth encourages you to choose real and complete words -
not words with scattered parts of their insides removed. Forth does this better than
COBOL; the prolixity of COBOL syntax makes programmers economize in their
choice of identifiers. In other languages, restrictions on the length of identifiers
force the prgmr to mmbl. This is the basis for the power of Forth - making it easy
for the computer to understand what you say saves precious resources for more
profitable activity.
Forth provides an economical and productive ...
The economy of Forth is its first, last, and greatest attraction. That's why Forth
excels when resources are scarce, and is why programmers are attracted to it.
Productivity keeps them there.
... environment ...
Forth is not just a language, it's an environment. You and the language become one
and invade the machine. Forth is the principal research, development, and production
tool for Forth applications.
... for interactive compilation and interpretation of programs,
Forth is the most interactive of programming languages. Programming and check-
out are not separate phases, but are intermingled in one interactive sharing.
low-level access to computer-controlled hardware,
Assemblers and other languages can get at the innards of machines, but not with
the immediacy of Forth.
and the ability to extend the language itself
Forth is a living language. You do not build or assemble applications. You grow
them.
1. Wil Baden's home page (along with his alter ego "Neil Bawd") is at:
http://home.earthlink.net/~neilbawd/
This course is based on SwiftForth, a product of FORTH, Inc., running under Windows
and Linux.
SwiftForth is a 32-bit Forth implementation, meaning the basic unit of data (single-
precision numbers, addresses, etc.) is 32 bits wide. In Forth, this is called a cell.
However, we've attempted to identify all Forth issues that depend on features of
FORTH, Inc. products, and write all student problems such that they can be worked
on any ANS Forth-compliant system, as described in the next section.
If the input parameters to a command are described generally rather than given
explicitly, they are shown in a monospaced typeface inside brackets:
<number of milliseconds> CONSTANT MotorDelay
Where we show data, such as strings being entered or examples with computer output,
the data and/or output is shown in monospaced type. Commands you type are
bold, but the computer responses are not:
1024 VALUE ROOMSIZE ok
A line of input you type at the Forth command line is always terminated by the
Enter (or Return) key. In cases where there is some ambiguity, we may indicate the
point at which you hit the Enter key with <cr>.
ROOMSIZE . <cr> 1024 ok
which, if any, optional wordsets it includes. So, if a word you're looking for appears
not to be in the system you're using, consult its documentation.
Forth allows any kind of character string to be a valid name, so certain ambiguities
can arise. For instance, Forth subroutines are referred to as words, but word can
mean an addressable unit of memory. To resolve this we will use certain conven-
tions when speaking of common items:
• An 8-bit unit of data is called a byte. For purposes of this book, a byte is considered
the same as a character, although we know in some circumstances characters are of
other sizes.
• The word-length of the processor (e.g., 16 or 32 bits) is called a cell. This is also the
size of an address and of a single item on a stack.
• On a 32-bit processor, a 16-bit item may be referred to as a half-cell.
Each type of value has its own operators, as you will find in the next section.
Forth is designed to promote intimate communication between you and the com-
puter. The process of working with Forth is intended to be natural, unencumbered
by complex syntax or by the necessity to change programs and environments dur-
ing the programming process. This section describes basic elements of this process.
The element of communication with Forth is the word, a string of non-blank charac-
ters. Words are separated from each other by one or more spaces (or "white space
characters" in text files). You interact with Forth by typing a word or a group of
words and then pressing the Enter or Return key (denoted in this text as <cr>).
For example, try saying the following to Forth:
What you have done is to execute several words. The word is the basic executable
unit in the Forth language. There are two general types of words:
• Primitives, characterized as machine dependent.
• High-level words, defined in terms of primitives and other high-level words.
When you use a word, it doesn't matter whether it is a primitive or high-level word,
but internally they have very different characteristics.
Examples of primitives:
DUP DROP + */
Examples of high-level words:
FORTH QUIT INTERPRET .S DUMP
Try typing DUP DROP <cr> and then FORTH <cr>. Is there any difference?
You can type multiple words and parameters on a single Forth command line. All
the keys you type are echoed on the command line and are saved for future inter-
pretation, with a few exceptions:
• Backspace (or Delete) erases the character at the cursor position.
• Enter (or Return) terminates command line input, making the received text available
for interpretation.
If you try to use a word Forth does not know - for example, DIFFICULT - Forth dis-
plays that word followed by an error message, such as a single question mark.
When you press the Enter key, Forth will process your command line in strict left-to-right
order. Numbers will be placed on the stack, and words will be executed. If a
word expects one or more parameters, you must type them before the word so
they'll be on the stack when the word executes.
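As a small illustration of this left-to-right processing, the line below pushes two numbers, adds them, and displays the sum. This is only a sketch; the exact echo and prompt vary from system to system.

```forth
\ The numbers are placed on the stack first; + then finds both parameters there.
3 4 + .        \ displays 7
\ Typing  + 3 4  would not work as intended: + would execute before
\ its parameters had been placed on the stack.
```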
Key                     Action
Page Up, Page Down      Scroll through the history of the current session.
Up and Down Arrows      Retrieve commands you have typed.
Left and Right Arrows   Move the cursor on the command line.
Insert                  Toggle the insert/overwrite mode of typing.
Enter                   Execute the command line the cursor is on.
Ctrl-C                  Copy selected text.
Ctrl-V                  Paste text at the insertion point.
Double-click            Selects the word under the cursor, making it available for
                        the right-click and copy functions.
Right-click             Displays a pop-up menu of options to perform on the
                        selected word or at the cursor position.
Rule The general rule for stack usage in Forth words is that a word removes its argu-
ments from the stack and leaves only explicit results.
The stack notation shown in Table 2 is often used to indicate the nature of items on
the stack.
4. Some systems require a special word to clear the data stack. Consult your system documentation.
Item Meaning
addr address
char character (or byte)
d signed double number
ud unsigned double number
len length (usually of a string)
n signed number
u unsigned number
flag Boolean flag (-1 = true, 0 = false)
x single cell of unspecified type
xt execution token
Where there are multiple items of a particular type on the stack, they are distinguished
by indexes (e.g., x1 x2). In some special circumstances, additional notation
may be used for clarity.
A stack comment describes only the use of the data stack by the word being documented.
It doesn't imply anything about the absolute state of the stack. For example,
the stack comment ( -- ) means this word expects nothing and leaves nothing;
it won't change the stack's state. This does not imply that the stack is empty. It is
generally inappropriate to expect any absolute state of the stack, beyond the items
a word expects or leaves. If words worry only about their own arguments and
results, they have the useful property of being "context independent"; that is, they
work the same, regardless of the absolute state of the stack, as long as their explicit
parameters are provided. Always make sure the words you write follow this basic
rule of stack usage. This will help you write more error-free code.
Now that you understand stack notation, we can describe a few of the words we've
used in examples so far, in a glossary. You'll see more glossaries as we go along.
Glossary
FORTH ( -- )
Selects the "vocabulary" (list of available words) containing basic Forth commands.
.S ( -- )
Displays whatever numbers are on the stack without changing the stack (also called
a non-destructive stack display).
DUP ( x -- x x )
Duplicates the top stack item.
DROP ( x -- )
Discards the top stack item.
+ ( n1 n2 -- n3 )
Adds the top two numbers on the stack, leaving the sum.
. ( n -- )
Converts the top number from binary to a string of numeric digits and displays it.
("dot")
The diagrams in this section illustrate the behavior of some common Forth words
used to manage data items on the stack. The glossary that follows lists common
stack operators with their definitions.
Figure 1. Stack operator DUP
Glossary
DUP ( x -- x x )
Duplicates the top stack item.
DROP ( x -- )
Discards the top stack item.
SWAP ( x1 x2 -- x2 x1 )
Swaps the order of the top two items.
OVER ( x1 x2 -- x1 x2 x1 )
Copies the second stack item to the top.
ROT ( x1 x2 x3 -- x2 x3 x1 )
Rotates the top three stack items to bring the third one to the top. (Pronounced
"rote" - think of "rotate".)
-ROT ( x1 x2 x3 -- x3 x1 x2 )
Rotates the top three stack items to send the top one to third place. ("minus-rote")
Note ROT and -ROT work specifically on the top three stack items. They do not rotate the
entire stack, nor will they work on other than three stack items.
The best way to understand the stack operator words is to put a few numbers on
the stack and try them. Use .S to display the stack, if your Forth system doesn't
automatically show it for you.
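A short session along these lines might look like the sketch below. The comment on each line shows the expected stack contents after that line, with the top item on the right; the layout of the .S display itself differs between systems.

```forth
1 2 3        \ stack: 1 2 3     (3 is on top)
DUP          \ stack: 1 2 3 3
DROP         \ stack: 1 2 3
SWAP         \ stack: 1 3 2
OVER         \ stack: 1 3 2 3
ROT          \ stack: 1 2 3 3   (only the top three items move)
.S           \ non-destructive display of the four items
DROP DROP DROP DROP   \ discard them, leaving the stack as it was
```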
Similar operations work on pairs of stack items. A stack pair might be a double-pre-
cision number, XY coordinates, a string (address and count) or even two unrelated
items.
You'll note that these words all begin with a "2". This is part of each word's name, a
reminder that it deals with pairs of items. The "2" isn't a parameter, and you cannot
invent soup just by typing its name!
In all cases, these operators will preserve the order of items within the pair, manip-
ulating the pair as though it were one double-sized item. These operators are, of
course, used with double-precision numbers, but they are equally useful with other
number pairs such as the address and length of a string, and on any occasion when
it's more convenient to manage stack items two at a time.
The diagrams in this section illustrate the behavior of some common words used to
manipulate double-cell items on the stack. The glossary that follows lists common
double-cell stack operators with their definitions.
Figure 6. Stack operator 2DUP
Glossary
2DUP ( x1 x2 -- x1 x2 x1 x2 )
Duplicates the top pair of stack items.
2DROP ( x1 x2 -- )
Discards the top pair of stack items.
2SWAP ( x1 x2 x3 x4 -- x3 x4 x1 x2 )
Swaps the top two pairs of stack items.
2OVER ( x1 x2 x3 x4 -- x1 x2 x3 x4 x1 x2 )
Copies the second pair of stack items to the top.
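The pair operators can be tried in the same way as the single-cell ones. In this sketch, each comment shows the expected stack after the line; note that the order of items within each pair never changes.

```forth
1 2 3 4      \ stack: 1 2 3 4       (two pairs: 1 2 and 3 4)
2SWAP        \ stack: 3 4 1 2
2OVER        \ stack: 3 4 1 2 3 4
2DROP        \ stack: 3 4 1 2
2DUP         \ stack: 3 4 1 2 1 2
.S           \ display the six items
2DROP 2DROP 2DROP   \ discard all three pairs
```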
2.2.5 Problems
In the problems below, show the operations needed to make the stack picture on
the left (input) look like the one on the right (output). Some problems have more
It would be good practice to review the previous list of exercises and see if you can
think up a set of circumstances that would require each of the possible transforma-
tions.
For example: 1 2 3 to 1 2 3 1 2 3
This could be required if the three top items were a boundary set and a constant.
Perhaps items 3 and 2 are the XY coordinates of a point in a graph and item 1 is its
intensity, or maybe it's a three-dimensional graph.
Certainly, in a language where almost all the operations you can perform on num-
bers happen on a stack and leave something other than you started with, being able
to duplicate the top several items is essential. What do you think is the likelihood of
having to deal with more than three items? Can you think of any more stack opera-
tors that would be useful?
Forth is one of very few high-level languages to make a stack directly available to
the programmer, and for this reason, many people who are new to Forth find it
alien. However, it has some advantages:
• It saves having to define a lot of variables that are used only for temporary storage.
• It saves the cycles required to fetch and store values, as in many Forth implementa-
tions the top stack items are in a register or otherwise are readily available.
• It gives Forth the characteristic of being concatenative, a fancy word that means you
can add an operation to a prior sequence of operations that left a value on the stack
simply by doing it; you don't have to fetch the value again, or re-arrange parenthe-
ses, or re-order an "equation."
People who have learned Forth report that learning to use the stack comfortably
was a lot like learning to ride a bicycle: it feels wobbly and awkward at first, but at
some point it's like a switch being thrown in your brain. Suddenly it becomes intui-
tive and natural. Once that happens, you'll wonder why it ever felt so hard!
A primary objective of the first several chapters in this book is to help you get to
that point. In a five-day course, most people find it happens sometime Wednesday!
Before we proceed, let's make a Forth word that extends the system and acts in a
user-friendly manner.
The syntax for this is very simple:
: <name> <some other Forth words> ;
Let's make a word that says "HELLO." To do this, type the following sequence on the
Forth command line exactly as it appears here:
the word. If the word is found, it will be executed; if it can't be found, Forth stops
and gives an error message.
When Forth is compiling a definition (i.e., everything following a : and name until
the concluding ;), it compiles references to words instead of executing them. The
form of the reference depends on the Forth implementation. When the compiler
encounters a number, it compiles it as a literal. A definition may only reference pre-
viously defined words; "forward references" are not allowed (although in certain
limited circumstances you can achieve that effect using techniques discussed in
Section 7.2).
Note that every word defined with a : must also have a name (GREET in this case)
and an ending ;. Even though a word consisting of only these three things wouldn't
actually do anything, it does have uses, as we shall see later.
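The three required parts of a colon definition can be seen in the sketch below. The names WELCOME and NOTHING are hypothetical and are not taken from this book's own example, and the sketch assumes the standard word ." (which displays the text up to the next quotation mark) is available on your system.

```forth
: WELCOME ( -- )   ." HELLO" ;   \ colon, name, body, closing semicolon
WELCOME                          \ displays HELLO

: NOTHING ( -- ) ;               \ only a colon, a name, and a semicolon
NOTHING                          \ executes, but does nothing
```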
Unlike other programming languages, Forth supports names that consist of almost
any combination of characters you wish to use.
This provides extremely flexible naming conventions. For example, in a financial
package one might have words called +DEBITS to add a column of numbers and
%RATE to calculate interest on a loan.
These names may be as long as 255 characters. They may contain upper- and lower-
case letters, numbers, control codes or special characters (for example, * (? ! @ is a
valid name). Although control codes may be used to name a word, we don't recom-
mend using them, as some of these characters do strange things to terminals and
printers, and they can make text difficult to view and edit. Also, be careful of names
that look like valid numeric input!
Forth words can't contain spaces (which act as delimiters between Forth words), or
any control characters that directly control the command-line input process (e.g.
backspace, delete, return/enter).
Every Forth implementation has some way of managing definitions in source form
on disk that can be compiled. Early Forths used the concept of 1024-byte "blocks"
of source, displayed as 16 lines of 64 characters each, with an internal mechanism
mapping blocks to disk sectors or to files. Most modern Forths, including SwiftForth,
use text source files. Here we will briefly describe SwiftForth's approach; if
you are using a different implementation, consult its documentation. As you write
more complex definitions to solve problems in this book, we recommend that you
edit your work in files.
SwiftForth lets you use any text editor (e.g., Notepad or a programmer's editor) to
prepare your source. You can also type definitions directly into SwiftForth's command
window. If you wish, you can copy the text of a definition from the command
window to a file or vice versa. If you have source in a file, you may direct Forth's
text interpreter to it with the INCLUDE command.
INCLUDE <filename>
... causes Forth to interpret the text in that file as if it were being typed in the com-
mand window. The only difference is that, when interpreting from a file, SwiftForth
will ignore whitespace characters such as Tab.
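As an illustration, a small source file might contain the definitions below. The file name demo.f and the words SQUARED and CUBED are hypothetical, chosen only for this sketch.

```forth
\ Contents of a hypothetical source file, saved as demo.f
: SQUARED ( n1 -- n2 )   DUP * ;
: CUBED   ( n1 -- n2 )   DUP SQUARED * ;

\ At the command line you would then type:
\   INCLUDE demo.f
\   3 CUBED .          ( displays 27 )
```

Once the file has been included, its words behave exactly as if they had been typed at the command line.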
For more information on using SwiftForth with source files, see the SwiftForth Reference
Manual supplied in PDF format with the SwiftForth system.
Before we go much further, let's look briefly at some words found in all FORTH, Inc.
products (as well as in many other Forths) that help you examine the system you're
running on as well as your own code and data structures.
Glossary
DUMP ( addr u -- )
Display u bytes of memory, starting at address addr (generally formatted for output
as hex digits, but may vary by system).
LOCATE <name> ( -- )
Display the source (if available) for word name.
SEE <name> ( -- )
Display the actual compiled code for word name.
Other programming aids are available on most systems; consult your product docu-
mentation for details.
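A sketch of how these tools might be used follows. The names SCRATCH and DOUBLED are hypothetical, the sketch assumes the common words CREATE, ALLOT, and ERASE are available, and the format of the DUMP and SEE displays depends entirely on your system.

```forth
CREATE SCRATCH  16 ALLOT        \ reserve a 16-byte region named SCRATCH
SCRATCH 16 ERASE                \ fill it with zeros
SCRATCH 16 DUMP                 \ display those 16 bytes

: DOUBLED ( n1 -- n2 )   2* ;   \ a small definition to inspect
SEE DOUBLED                     \ show its compiled form
\ LOCATE DOUBLED                \ where supported, shows the source instead
```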
Always remember, though, that the best debugging aid in Forth is Forth's own inter-
activity. You don't need test harnesses, single steppers, or debuggers to test your
words. All you have to do is supply stack arguments and type a word's name to exe-
cute it. If you keep your definitions short, you will find it incredibly easy to test
your code thoroughly.
Your definitions should all have stack comments, plus other comments to indicate
usage and other information. The following commands support comments in Forth.
Each of these is a Forth word, and must be followed by a space.
Glossary
( ( -- )
Begin a comment terminated by a ). It can span multiple lines, and can appear
inside or outside a definition. Text within the comment will be ignored.
.( ( -- )
Like (, but the text will be displayed when the file is interpreted.
{ ( -- )
Like (, but terminated by }. This allows parentheses to appear within the comment.
\ ( -- )
Begin a comment terminated by the end of the line on which it appears.
\\ ( -- )
Ignore all following text until the end of the file.
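The sketch below uses three of these comment forms in one small definition. AVERAGE is a hypothetical example word. The { and \\ forms are left out because they may not be available on every Forth system.

```forth
\ Everything from the backslash to the end of this line is ignored.
: AVERAGE ( n1 n2 -- n3 )       \ the stack comment comes first
   +  ( sum of the two )  2/ ;
.( AVERAGE is now defined )     \ displayed while this source is interpreted
```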
The size of a single-precision number, a single stack item, and an address are
always the same in ANS Forth. This size is called the cell size (in Forth we use the
term "cell" to avoid confusion with "word," as the latter has other special meanings
in Forth). Forth is available in 16-bit, 32-bit, and 64-bit cell versions. The 16-bit cell
size is typically used with microprocessors and microcontrollers whose native cell
size is 8 or 16 bits, and is used in FORTH, Inc.'s SwiftX cross-development systems
for 8-bit and 16-bit targets.
On 32-bit systems (such as PCs and many embedded microcontrollers and microprocessors),
single-precision numbers are 32 bits wide and there is less need for
double-precision math. Similarly, a 64-bit implementation would have 64-bit single-
precision numbers. For maximum portability across all implementations, however,
Forth provides the same set of single, double, and mixed-precision operations on all
systems regardless of cell size, and number conversion follows the same rules.
Because our classroom computers are typically 32-bit systems, we will throughout
this book describe numbers assuming a 32-bit architecture.
Question Are these valid or invalid numbers on the system you're using?
1234 3.1415 345678 1,234 1982 AD DOG YES 76999
Double-precision numbers may have a preceding minus sign but are distinguished
from single-precision numbers because they have other punctuation.
ANS Forth specifies that a number with a decimal point to the right of the rightmost
digit will be converted as a double-precision integer. FORTH, Inc. products also
allow other punctuation for readability and convenience; any of the characters { + . , / : }
and - (anywhere but the leading position) cause the number to be interpreted as
double-precision. These are valid double-precision numbers:
111-22-3333
1,234,610
1.00,234
555-1212
12:27:32
7/23/48
Double-precision numbers take up two stack cells, where single-precision numbers
only use one. The high-order cell is on top of the stack.
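The two-cell layout can be observed directly. This sketch uses only the trailing-decimal-point form, which ANS Forth specifies, and assumes the double-number display word D. is available on your system.

```forth
5.           \ the trailing decimal point makes this a double-precision 5
.S           \ two cells: the low-order 5 below, the high-order 0 on top
D.           \ displays 5 and removes both cells
-3. D.       \ a negative double; displays -3
```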
In the last section, we said that a word in Forth can contain, among other things,
any combination of numbers. You might be wondering how Forth can differentiate
between the two. For now, we will just tell you that if a word isn't recognized, it will
be treated as a number; if that doesn't work, an error message is given. This is dis-
cussed in more detail in Section 6.1 and Section 8.3.
Arithmetic in Forth is done using postfix notation, which means that operands (e.g.
numbers) precede operators (such as +). This is the most convenient method to use
when working with stacks. Complex functions can be written and solved without
the use of parentheses and, due to the nature of integer math, calculations are very
fast. A large set of math operators is provided, some of which are listed below.
Glossary
+ ( n1 n2 -- n3 )
D+ ( d1 d2 -- d3 )
M+ ( d1 n -- d2 )
Add the top two numbers, returning the sum.
- ( n1 n2 -- n3 )
D- ( d1 d2 -- d3 )
Subtract the top number from the second, returning the difference.
NEGATE ( n1 -- n2 )
DNEGATE ( d1 -- d2 )
Returns the two's complement (negative) of the number on the stack.
ABS ( n1 -- n2 )
DABS ( d1 -- d2 )
Returns the absolute value of the number on the stack.
MAX ( n1 n2 -- n3 )
DMAX ( d1 d2 -- d3 )
Returns the greater of two signed numbers.
MIN ( n1 n2 -- n3 )
DMIN ( d1 d2 -- d3 )
Returns the smaller of two signed numbers.
* ( n1 n2 -- n3 )
M* ( n1 n2 -- d )
UM* ( u1 u2 -- ud )
Multiply two numbers, returning the product.
/ ( n1 n2 -- n3 )
M/ ( d n1 -- n2 )
Divide the first number by the second (i.e., top stack item), returning the quotient.
/MOD ( n1 n2 -- n3 n4 )
UM/MOD ( ud u1 -- u2 u3 )
Divide the first number by the second, returning the remainder and quotient (top
stack item).
MOD ( n1 n2 -- n3 )
Divide the first number by the second, returning only the remainder. There is no
mixed-precision version of this operator.
*/ ( n1 n2 n3 -- n4 )
M*/ ( d1 n1 n2 -- d2 )
Multiply the first two numbers together, divide by the third, returning the result.
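A few of the single-precision operators above can be tried as follows. The sketch uses only positive operands in the divisions, so the results do not depend on how a particular system rounds negative quotients.

```forth
7 3 + .          \ displays 10
7 3 - .          \ displays 4   (second item minus top item)
-7 ABS .         \ displays 7
7 3 MAX .        \ displays 7
7 3 /MOD . .     \ displays 2 then 1   (quotient on top, remainder beneath)
7 3 MOD .        \ displays 1
6 4 3 */ .       \ displays 8   (6 times 4, divided by 3)
```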
The following glossary lists some simple operators implemented to act upon single-
precision numbers in the most machine-efficient manner.
Glossary
1+ ( x1 -- x2 )
2+ ( x1 -- x2 )
CELL+ ( x1 -- x2 )
Add one, two, or the number of bytes in one cell (depends on the target CPU's cell
size) to the top stack item.
1- ( x1 -- x2 )
2- ( x1 -- x2 )
CELL- ( x1 -- x2 )
Subtract one, two, or the number of bytes in one cell from the top stack item.
2* ( n1 -- n2 )
2/ ( n1 -- n2 )
Multiply or divide the top stack item by two.
CELLS ( n1 -- n2 )
Multiply the top stack item by the number of bytes in one cell.
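CELL+ and CELLS are most often used to step through memory one cell at a time. The sketch below assumes the standard words CREATE, the comma operator, and @ (fetch), none of which have been introduced at this point in the text; SAMPLES is a hypothetical name.

```forth
CREATE SAMPLES  10 , 20 , 30 ,   \ three consecutive cells holding 10, 20, 30
SAMPLES @ .                      \ displays 10
SAMPLES CELL+ @ .                \ displays 20: CELL+ advances the address by one cell
SAMPLES 2 CELLS + @ .            \ displays 30: 2 CELLS is the byte offset of the third cell
5 1+ .   5 2* .   5 2/ .         \ displays 6, 10, and 2
```

Because the offsets are written with CELL+ and CELLS rather than literal byte counts, the same source works unchanged on systems with different cell sizes.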
Most of the unusual operators in Forth are designed to take advantage of computer
architecture. For example, most machine multiplication operators leave a double-length
product, and most division operators divide a double by a single. By combining
them, */ gives you extra precision at very low cost. Similarly, most division
leaves both a quotient and remainder; /MOD makes both available to you from a sin-
gle division operation. Simple operators like 1+ and 1- take advantage of increment
and decrement instructions, while 2* and 2/ are usually implemented as left and
right arithmetic shifts, respectively.
... to disassemble the code for 2* so you can see how it is implemented. Try this with
some of the other operators listed above, too!
2.5.2 Scaling
Sometimes you may be asked to multiply two values, say A and B, and divide by
another, C.
One could try A B * C /, but that might give a wrong answer in some cases. Why?
Scaling operators like */ are used to conserve precision. Given A B C on the stack, */
performs the following steps:
1. First multiply A by B, yielding a double-precision intermediate result,
Among the most obvious of all applications for scaling is the calculation of percent-
ages. Given that a percentage can be defined as a fraction, one can define a % opera-
tor that would return the integer percentage required.
Try this First try to define a word called % with the following stack picture:
( nl n% - n2)
Pretty easy! After all, one only has to perform the following sequence to obtain 10%
of 12300:
12300 10 100 */
How would you calculate 105% of 12300?
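The precision that */ preserves matters most when the intermediate product is too large for a single cell. The sketch below shows the difference on a system of any cell size; BIGGEST is a hypothetical name, and the sketch assumes the standard words RSHIFT and CONSTANT.

```forth
-1 1 RSHIFT CONSTANT BIGGEST       \ largest positive single-cell number
BIGGEST 2 2 */    BIGGEST = .      \ displays -1 (true): the double-precision
                                   \ intermediate product holds the full value
BIGGEST 2 * 2 /   BIGGEST = .      \ displays 0 (false): the single-cell
                                   \ product overflowed before the division
```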
The concept behind this useful technique is that any irrational constant (that is,
numbers like π that cannot be exactly represented but potentially have an infinite
number of decimal places) can be represented by a rational approximation with an
error of less than 10E-8. As a convenience, a table of rational approximations is
reprinted below.
As an example, calculate the circumference of a circle. The formula is two times the
radius times π. Looking in the table below, we find that the value of π can be approximated
by 355/113. The formula can now be stated as:
r 2* 355 113 */
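Wrapped in a definition, the formula becomes a reusable word. CIRCUMFERENCE is a hypothetical name used only for this sketch.

```forth
: CIRCUMFERENCE ( radius -- circumference )   2* 355 113 */ ;
100 CIRCUMFERENCE .      \ displays 628   (200 * 355 / 113, truncated)
```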
5. Hart, John F., et al., Computer Approximations, Krieger Publishing Co., Inc.
2.5.5 Division
By now you may have noticed there are no multiply or divide operators for double-precision
numbers. Rather than creating a D/ or D*, we use the mixed-mode scaling
operation M*/. It works like this:
Given da b c on the stack, first multiply da by b, yielding a triple-precision intermediate
product. Then divide the result by c, yielding double precision once again.
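A brief sketch of M*/ in use, assuming your system provides it along with the double-number display word D.:

```forth
1000000. 3 4 M*/ D.      \ displays 750000   (1000000 * 3 / 4)
10. 2 3 M*/ D.           \ displays 6        (10 * 2 / 3, truncated)
```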
Given any arbitrary polynomial, it may be factored for easier solution. After the fac-
toring, integer math will be sufficient for calculation. For example:
x^5 + 4x^4 + 8x^3 - 12x^2 - 6x + 17
x(x^4 + 4x^3 + 8x^2 - 12x - 6) + 17
x(x(x^3 + 4x^2 + 8x - 12) - 6) + 17
x(x(x(x^2 + 4x + 8) - 12) - 6) + 17
x(x(x(x(x + 4) + 8) - 12) - 6) + 17
To calculate this answer using Forth, define a word like this:
: POLY ( n1 -- n2 ) DUP 4 + OVER * 8 +
OVER * 12 - OVER * 6 - * 17 + ;
We now know something about mixed-mode operators, so let us go back and exam-
ine how the operator */ might be implemented. Remember, */ generates a double-
precision intermediate product, then uses a single-precision divide to produce a sin-
gle-precision result. This could be accomplished by using M* and M/.
Try this Try defining */ in terms of those operations and notice what the stack problems
are. The problem is clearly what to do with the denominator.
This is not an unusual occurrence. Forth is equipped to handle situations where get-
ting a stack item out of the way for a moment would be perfect, by letting you use
its return stack.
The return stack is not just used for temporary storage. Its principal use is for nest-
ing return addresses for definitions made by the word : ("colon"), hence its name.
For purposes of saving and restoring parameters, the following operators support
limited use of the return stack:
Glossary Note the use of "R:" to denote the return stack picture.
>R ( x - ) ( R: - x )
Takes the top item off the data stack and pushes it onto the return stack.
R> ( - x ) ( R: x - )
Pops the top item off the return stack and pushes it onto the data stack.
R@ ( - x ) ( R: x - x )
Pushes a copy of the top return stack item onto the data stack.
2>R ( x1 x2 -- ) ( R: -- x1 x2 )
Takes the top two items off the data stack and pushes them onto the return stack.
2R> ( -- x1 x2 ) ( R: x1 x2 -- )
Pops the top two items off the return stack and pushes them onto the data stack.
In the example a * (b+c), where the stack picture is ( b c a ), the return stack could be
used this way:
>R + R> *
Warning! What goes on the return stack in a definition must come off the return stack before
the ; at the end of the definition is reached (see note 6). Also, these operators are only legal
inside colon definitions.
6. Or before an early EXIT is reached. We'll learn more about EXIT later.
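The two uses of the return stack discussed above can be sketched as follows. The names SUM-TIMES and SCALE are hypothetical, and SCALE uses the Standard Forth word SM/REM (symmetric division, returning remainder and quotient) in place of M/, which may not be present on every system. Whether a given system's own */ rounds the same way for negative operands is implementation-dependent.

```forth
\ a * (b + c): park a on the return stack while b and c are added.
: SUM-TIMES ( b c a -- n )  >R + R> * ;

\ A work-alike of */ : the divisor waits on the return stack
\ while M* forms the double-precision product.
: SCALE ( n1 n2 n3 -- n4 )  >R M* R> SM/REM NIP ;

2 3 4 SUM-TIMES .       \ displays 20
12300 10 100 SCALE .    \ displays 1230
```

In both definitions every >R is balanced by an R> before the ; is reached.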
2.7 Problems
2. In the problems below, rewrite the equations using postfix notation and the proper
arithmetic operators.
A B C    A^2 + 2AB + B^2 + C
Express all arguments and results in whole degrees, without rounding. Use these
names:
F>C
F>K
C>F
C>K
K>F
K>C
Edit your definitions into a file. Each definition should have this stack picture:
( n1 -- n2 )
... where n1 is the input in one set of units, and n2 is the output in the converted
units. Test your words with the following values:
• 0 °F in Centigrade
• 212 °F in Centigrade
• -32 °F in Centigrade
• 16 °C in Fahrenheit
• 233 °K in Centigrade
• -40 °C in Fahrenheit
3.1 Introduction
3.2 Conditionals
In the above example, statements following THEN will be executed if A and B are
equal, or the statements following ELSE will be executed if A and B are not equal.
The traditional syntax may be read like this:
"IF A and B are equal THEN do this stuff, ELSE do this other thing and then CON-
TINUE."
Most programming languages have some form of this statement though the actual
syntax differs among them.
7. O.-J. Dahl, E.W. Dijkstra, C.A.R. Hoare. Structured Programming, Academic Press, London, 1972. ISBN
0-12-200550-3
IF is a destructive word: it removes the top stack item and uses it as a flag. If it is
any non-zero value, the words immediately following IF are executed. If it is zero,
words following an optional ELSE are executed, and THEN is the point at which
unconditional execution continues.
Glossary
IF ( flag -- )
If flag is true (non-zero), execute the words that follow; otherwise branch to the
words following ELSE (or following THEN if there's no ELSE).
ELSE (-)
Begin an optional false clause in an IF ... THEN structure. Words following ELSE will
be executed if the value passed to IF is zero.
THEN (- )
Terminates a conditional structure that begins with IF. Note: every IF must have
exactly one THEN in the same definition.
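As a minimal sketch of the structure, here is one way a test word named TRY could be written. The definition and its messages are assumptions made for illustration; it simply reports how IF treats the value it is given:

```forth
: TRY ( x -- )
   IF    ." True "     \ executed for any non-zero x
   ELSE  ." False "    \ executed only when x is zero
   THEN ;
```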
What happens?
Try these examples:
12 12 - TRY
123 0 MAX TRY
-123 123 MIN TRY
123 0= TRY
0 INVERT TRY
How does the - get used as an operator for comparison? What logical operation is
performed? What did INVERT do?
It is not necessary to have both a true clause and a false clause; the false clause
(beginning with ELSE) is optional. However, every IF must have a THEN to terminate
the structure. It's "bad form" (inefficient and less readable) to have only a false
clause; if you're tempted to do that, invert the condition before IF using 0= or NOT
(described in the next section) so that you have only a true clause.
Important IF, ELSE, and THEN (as well as all other "structure" words in Forth) should only be
used inside colon definitions. This is because these words are active during the pro-
cess of compilation and they direct the formation of the compiled statement.
The comparison operators (which we'll get to shortly) and all Forth stack and math
operators - in fact, any words that leave a number on the data stack - are good
input to IF.
IF structures may be nested arbitrarily deeply, providing you nest an entire struc-
ture within the next outer structure. You may not use an IF to branch out of or into
another structure.
We strongly recommend that you avoid nesting too deeply, as this can lead to
unreadable, unmaintainable code.
If you are testing for a list of conditions, consider using a CASE statement (discussed
in Section 3.3).
The Forth comparison operators are very easy to use and are also postfix in nature,
just like the math words. They remove items from the stack and return a flag that
IF (or any other word) can use. Here is a list of them (plus some other useful words)
and their behaviors:
0< ( n -- flag )
Returns true if n is less than zero.
0> ( n -- flag )
Returns true if n is greater than zero.
0= ( n -- flag )
Returns true if n is zero, false for non-zero.
D0= ( d -- flag )
Returns true for double-precision zero.
NOT ( flag1 -- flag2 )
Returns true for zero, false for non-zero.
INVERT ( x1 -- x2 )
Inverts all the bits in x1 to give x2 (its one's complement).
Note the difference between NOT and INVERT: NOT is a logical (Boolean) operator that
treats the entire flag as true/false. INVERT performs a bitwise operation.
= ( x1 x2 -- flag )
Returns true if x1 and x2 are equal.
<> ( x1 x2 -- flag )
Returns true if x1 and x2 are not equal.
> ( n1 n2 -- flag )
Returns true if n1 > n2.
< ( n1 n2 -- flag )
Returns true if n1 < n2.
D< ( d1 d2 -- flag )
Returns true if d1 < d2. Note double-number comparison.
U< ( u1 u2 -- flag )
Returns true if u1 < u2. Note unsigned comparison.
?DUP ( x -- 0 | x x )
Duplicate top stack item only if it is non-zero. This is useful preceding IF when you
need a copy of the value in the true part but not the false part (it saves an ELSE
DROP). Note that the stack comment uses a vertical bar on the right side to indicate
alternative results.
WITHIN ( n1 n2 n3 -- flag )
Returns true if n1 >= n2 and n1 < n3. Note that the range is inclusive on the low end
and exclusive on the high end.
TRUE ( - flag)
Returns a true flag (-1).
FALSE ( - flag)
Returns a false flag (0).
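Two short sketches show ?DUP and WITHIN at work. The word names are hypothetical and chosen only for illustration:

```forth
\ Display n only if it is non-zero; ?DUP makes an ELSE DROP unnecessary.
: .NONZERO ( n -- )  ?DUP IF . THEN ;

\ True if n is in the range 0 through 9 (WITHIN excludes the upper bound).
: DIGIT? ( n -- flag )  0 10 WITHIN ;

9 DIGIT? .     \ displays -1 (true)
10 DIGIT? .    \ displays 0 (false)
```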
Tip When you're first learning to use the stack, you may find it helpful to format your
definitions vertically, showing stack effects on each line. For example:
: RANGE ( n -- )          \ n
   500 1000               \ n 500 1000
   WITHIN                 \ t
   IF                     \ [stack empty]
      ." In range"        \ Print if t TRUE
   ELSE
      ." Out of range"    \ Print if t FALSE
   THEN ;
As you get more accustomed to using the stack, you may find a less vertical style
easier to read, and priority in your comments should be directed at explaining the
logic of your code.
The structure begins with CASE, which requires a case selector x on the stack. A
series of OF . . . ENDOF clauses follows, each OF preceded by a comparison value on the
stack. The case selector is compared against the test values, in order. If a match is
found, the case selector is dropped from the stack and the code following the successful
OF is executed, up to its terminating ENDOF. Execution then continues after
the ENDCASE. If the case selector does not match any of the test values, it remains on
the stack after the last ENDOF, and some default action may be taken. Any default
action should preserve the stack depth (use DUP if necessary), because ENDCASE per-
forms a DROP (presumably on the case selector) before continuing execution.
The CASE structure is flexible, and is more readable than nested IF statements if
there are more than two or so comparisons. As with all Forth control structures,
CASE statements may be nested; there may be any number of OF . .. ENDOF pairs; and
there may be any amount of logic inside an OF . . . ENDOF clause. However, if the logic
Glossary
CASE ( -)
Begins a structure that is terminated by ENDCASE, which may have any number of OF
.. . ENDOF clauses between CASE and ENDCASE.
OF ( x1 x2 -- x1 | )
Begins a conditional clause in a CASE ... ENDCASE structure. If x1 = x2, removes both
values from the stack and executes the words between OF and ENDOF. Otherwise,
retains x1 on the stack and continues after the ENDOF.
ENDOF ( -)
Terminates a conditional clause begun by OF and transfers to the location following
ENDCASE.
ENDCASE ( x- )
Terminates a structure that is begun with CASE, discarding a value (presumably an
"unknown" not consumed by a successful OF).
Try it with various values of n. Why is the DUP necessary in the next-to-last line?
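A CASE structure of the kind described in this section might look like the following sketch. The word name and messages are illustrative assumptions. Note the default action in the next-to-last line: it displays a copy of the selector, leaving the original for ENDCASE to drop.

```forth
: CLASSIFY ( n -- )
   CASE
      0 OF  ." zero"  ENDOF
      1 OF  ." one"   ENDOF
      2 OF  ." two"   ENDOF
      DUP . ." is something else"   \ default: use a copy, keep the selector
   ENDCASE ;
```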
3.4 Problems
1. Define a word that will return true if the top two stack items are both zero.
2. Define a word that will display the phrase "VALID CHARACTER" if n is a printable
character. You can use 32 to 127 as the range of printable characters.
3. Define a word that will leave the value n on the stack if n is not equal to zero. You
could show the stack effect as ( n -- n | ), where the vertical bar character | separates
the stack effect in the true case from the false case.
4. Define a word that will return false if any of the following values equals n, other-
wise returns true:
15,10,5,0
5. Define a word named MAXIMUM (without using the standard word MAX) that accepts
two parameters from the stack, compares them, retains the larger, and discards the
smaller.
MAXIMUM is used in the following way:
1 2 MAXIMUM
2 1 MAXIMUM
After MAXIMUM is executed, the larger of the two numbers is on the stack.
Like the IF ELSE THEN structure, these must be used inside colon definitions. Like
IF, UNTIL and WHILE consider any non-zero value of flag to be true.
Simply stated, these loops work until some condition is met. They are considered
useful for applications where an indefinite number of iterations must be performed
before being terminated by some event. This could be a key being struck, or a timer
going off, or perhaps a boundary being approached.
These two loops fall under the classical definitions of pre-test and post-test loops.
statement.
Here's a similar example that counts up:
: UP ( n -- ) BEGIN ." VALUE IS " DUP . CR
1+ DUP 10 = UNTIL DROP ;
Note that DOWN is able to use ?DUP (which only duplicates non-zero values) because
it's waiting for a zero, but UP has to do a DROP after the loop exits.
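A post-test DOWN consistent with this description might be written as follows. This is a sketch under that assumption, intended for arguments greater than zero:

```forth
: DOWN ( n -- )  BEGIN  ." VALUE IS " DUP . CR
   1- ?DUP 0= UNTIL ;
```

When the count reaches zero, ?DUP leaves only the zero, 0= turns it into a true flag, and UNTIL exits with nothing left on the stack.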
If you execute this word, will the loop ever terminate? If no, why not? If yes, under
what circumstances?
The form of an indefinite loop that performs its test before performing the repeat-
able operation is:
BEGIN <words executed at least once> <flag> WHILE
<words executed while flag is true> REPEAT
This structure provides a loop that can execute "zero" times, by putting the test
between BEGIN and WHILE and executing the words between WHILE and REPEAT only if
the argument to WHILE is true: "Do it while this condition is true."
Here are examples similar to the ones in the previous section, but using a pre-test-
ing loop. Note that if your argument to these is the same as the value you're testing
for (0 and 10, respectively), you will get no output:
: DOWN ( n -- ) BEGIN ?DUP WHILE
." VALUE IS " DUP . CR 1- REPEAT ;
... and:
: UP ( n -- ) BEGIN DUP 10 < WHILE
." VALUE IS " DUP . CR 1+ REPEAT DROP ;
A closely related type of loop, the BEGIN ... AGAIN loop, provides an infinite loop -
one for which there is no exit condition.
BEGIN <words to be repeated indefinitely> AGAIN
An infinite loop is useful for specifying the natural behavior of a system, that is, the
highest-level definition in an application. The behavior of a background (or "con-
trol") task is usually defined this way. A typical task assignment might look like:
: RUN ( -- ) SAMPLER ACTIVATE INITIALIZE
BEGIN SAMPLE RECORD 1 #SAMPLES +! AGAIN ;
... where SAMPLER is the name of a task, and the rest is part of the application.
Question How is BEGIN ... AGAIN different from BEGIN ... 0 UNTIL?
Glossary
BEGIN ( -)
Begins a structure that is terminated by REPEAT, UNTIL, or AGAIN.
WHILE ( flag -- )
Begins the conditional part of a BEGIN ... WHILE ... REPEAT structure. Executes the
words between WHILE and REPEAT so long as flag is true (non-zero). Exits the structure
by branching to the point following REPEAT when flag is false.
REPEAT ( -- )
Terminates a BEGIN ... WHILE ... REPEAT structure with an unconditional branch to
the word immediately following BEGIN.
UNTIL ( flag - )
Terminates a BEGIN .. . UNTIL structure. If flag is false (zero), repeats the loop from
the word immediately following BEGIN. When flag becomes true (non-zero), it exits
the loop.
AGAIN (- )
Terminates a BEGIN ... AGAIN structure with an unconditional branch to the word
immediately following BEGIN.
• Process the key, generating output as shown for the key codes given in this table:
Key   Output
I     UP
J     LEFT
K     HOME
L     RIGHT
M     DOWN
Take care to factor this definition into clearly separate functions; the key to this is
simplicity.
A good name for this is ?WAY (pronounced "which way?").
In this problem, you'll need to use the following words (discussed in Section 5):
KEY ( - char)
Awaits a character from the keyboard and returns its code.
EMIT ( char - )
Displays the character whose code is char.
CHAR ( - char)
Returns the code for the character that follows in the input stream. Generally used
on the command line (or while interpreting a file), not inside colon definitions.
[CHAR] ( - char)
Compiles a literal value of the code for the character that follows. Used only inside
colon definitions. The character code is returned at run time.
Note that you can use KEY interactively to find out what the character code for any
letter is. For example, if you type:
KEY . <cr> I
You need to press <cr> to start executing KEY, which will wait for you to type the
letter and then execute . to type out the value of the character code.
Alternatively, use CHAR to get a character code:
CHAR I . <cr>
In this case, CHAR parses the next token in the input stream and returns its character
code, unlike KEY, which awaits its own input.
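The difference between CHAR and [CHAR] can be seen in a short sketch (the word YES? is hypothetical). [CHAR] is used inside the definition, where the code for Y is compiled as a literal; CHAR is used on the command line to supply test input:

```forth
: YES? ( char -- flag )  [CHAR] Y = ;

CHAR Y YES? .    \ displays -1 (true)
CHAR N YES? .    \ displays 0 (false)
```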
Optional Make ?WAY handle both upper- and lower-case characters. This problem will expand
later on, so do a good job!
The next kind of control structure is the definite, or finite, loop. We use this kind of
iteration technique when we know (or can calculate) how many times the loop must
be performed, or at least the maximum number of times it must be performed.
The basic form of a finite loop in Forth is:
<limit> <index> DO <repeated words> LOOP
DO expects two items on the stack, the starting value of the index and a limit. These
items are removed from the stack and are maintained internally (usually on the
return stack). On each iteration, LOOP increments the index by 1 and compares it to
limit; it exits the loop when index becomes equal to limit. If you need a copy of the
current value of the loop index, you can get it by using the word I. Note that I is
only available inside the loop, inside this definition.
There are good examples of this type of structure in Fortran and BASIC, and in most
all other high-level languages. For example, to display the numbers between 0 and
10 in BASIC, one would use the following sequence of statements:
10 FOR I=0 TO 10
20 PRINT I;
30 NEXT I
In Forth, the same effect (0 through 10, inclusive) would be accomplished by the following
word:
: COUNT-UP ( -- ) 11 0 DO I . LOOP ;
Question What is the difference between these two structures? How many times is the itera-
tion performed in each case?
Loop parameters are checked at the end of the loop, so any loop will always execute
at least once, regardless of the initial values of its parameters. Because the parame-
ters are checked at the end, and if the end condition is met the loop terminates
immediately, it is necessary to use 11 in COUNT-UP to get the number 10 to display.
Because a DO loop with equal input parameters will execute not once but a very large
number of times (equal to the largest possible single-cell unsigned number), the
word ?DO should be used in preference to DO if the loop parameters are being calculated
and might be equal to each other. ?DO will skip immediately to the end of the
loop if the parameters are equal.
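As a minimal sketch of why ?DO matters, consider a word whose loop limit arrives on the stack and may legitimately be zero. The name SUM-BELOW is hypothetical, and the word is intended for n of zero or more:

```forth
\ Sum of the integers 0 .. n-1. For n = 0 the loop is skipped entirely,
\ so the initial 0 is returned unchanged.
: SUM-BELOW ( n -- sum )  0 SWAP 0 ?DO  I +  LOOP ;

5 SUM-BELOW .    \ displays 10
0 SUM-BELOW .    \ displays 0
```

Had DO been used instead, 0 SUM-BELOW would not stop after zero iterations.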
Following is a summary of the important words used in finite loop structures. Some
of these are discussed in more detail in the following sections.
Glossary
DO ( n1 n2 -- )
Begins a structure that is terminated by LOOP, using n2 as the starting value of the
loop index and n1 as the limit.
LOOP ( -- )
Increments the loop index by one, then compares it with the limit and exits the loop
when they are equal. The index and limit are kept in an internal location, often the
return stack.
?DO ( n1 n2 -- )
The same as DO, but skips the loop entirely if n1=n2. This extra comparison and con-
ditional transfer impose some extra overhead, so don't use ?DO if you know n1
and n2 can't be equal.
+LOOP ( n -- )
Increments the current value of the loop index by the signed value n. Exits the loop
when the index passes the limit (they don't have to become exactly equal).
I ( -- n )
Returns a copy of the current value of the loop index for the innermost DO LOOP
structure (see note 8). May only be used inside a DO LOOP and within the same colon definition.
I' ( - n)
Like I, but returns the current loop limit. I' is a non-standard word that is in fairly
common use. As such, it may not be available in all Forth implementations.
J ( - n)
Like I, but returns the current loop index for the next-outer DO LOOP structure.
LEAVE (- )
Exits a DO ... LOOP immediately, to the point following the next LOOP or +LOOP. Usually
used inside an IF ... THEN structure for an early exit when some exception condition
has become true.
UNLOOP (- )
Discards the current loop index and limit. Must be used if you leave the loop using
EXIT rather than LEAVE.
It is important to remember that the finite loop words cannot be arbitrarily mingled
with indefinite loop words. The important distinction is that a finite loop is main-
taining a loop index and limit, whereas the indefinite loops have no such concept.
For this reason, I inside a BEGIN . .. UNTIL structure, for example, is meaningless, and
you cannot use LEAVE to exit from an indefinite loop.
8. For the purposes of this entire section, a DO LOOP refers to any structure that begins with either DO or
?DO and ends with either LOOP or +LOOP.
The significant aspect of the DO LOOP in Forth is that the iteration control values are
on the stack prior to the execution of the loop and are removed by the DO word.
Like the IF ELSE THEN and indefinite looping structures, the words DO and LOOP
must be used within a colon definition because they direct the compiler. The param-
eters of the iteration do not have to come from within the same definition as the
loop. You may pass the values to DO from the stack.
In the word COUNT-UP above, the execution would proceed as follows:
: COUNT-UP 11 0 DO I . LOOP ;
1. The values 11 and 0 are removed from the stack by the DO and are established as the
loop parameters, with 0 as the initial index and 11 the limit.
2. The commands between DO and LOOP are executed in order:
I copies the current value of the loop index to the data stack.
. ("dot") prints the value from the stack and removes it.
3. LOOP increments the current loop index by one. If the new index is less than the
limit, it will begin again at the word following the DO. Otherwise it will exit.
How many times was the structure executed?
Question What is the smallest number of times you can make DO LOOP execute?
The reason for the order of the loop parameters - with the starting value on top of
the stack and the limit beneath it - is that this makes it easy to structure a definition
that will do something a number of times specified by a value that is on top
of the stack. For example:
: TRYS ( n -- ) 0 DO CR ." TRY " I . LOOP ;
• 100 90
• -50 -60
• -10 -20
• 0 -20
• 9 -20
What is the behavior of the LOOP test?
There are times when one would like to count by a number other than + 1. Can you
think of a few examples?
To do this, Forth provides another kind of looping structure that uses the same DO
word, but whose looping word takes a signed increment from the data stack.
That word is +LOOP and it is used in the form:
<n> +LOOP
...where n is the increment added to the loop index each time the loop executes,
prior to the test for +LOOP termination. Note that n can be negative, so it is easy to
construct a loop that counts down instead of up.
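The following sketches (all names hypothetical) show a positive and a negative increment. With a negative increment the limit value itself is included, because the loop ends only when the index crosses below the limit:

```forth
: EVENS ( -- )      11 0 DO  I .   2 +LOOP ;    \ displays 0 2 4 6 8 10
: COUNTDOWN ( -- )  0 10 DO  I .  -1 +LOOP ;    \ displays 10 9 8 ... 1 0

\ The same stepping, but accumulating instead of displaying.
: SUM-EVENS ( -- n )  0  11 0 DO  I +  2 +LOOP ;

SUM-EVENS .    \ displays 30
```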
There are times when it would be nice to stop a loop before the limit has been
reached. This type of condition arises in process control when testing for a limit
switch, or during pattern recognition problems when testing for matches or for con-
vergence of a matrix operation. This is facilitated in Forth by the word LEAVE. When
LEAVE is executed, the loop terminates immediately.
Question How does LEAVE appear to operate? How would you construct such a word?
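As an illustrative sketch of an early exit (the word and its search range are assumptions, not taken from the course), the following definition searches for the first integer whose square exceeds n and uses LEAVE to stop as soon as it is found:

```forth
\ Return the first i in 0..100 whose square exceeds n, or 0 if there is none.
: FIRST-SQUARE-ABOVE ( n -- i )
   0 SWAP                      \ default result, kept underneath n
   101 0 DO
      I DUP * OVER > IF        \ is i*i greater than n?
         NIP I SWAP  LEAVE     \ replace the default with i and stop looking
      THEN
   LOOP  DROP ;                \ discard n, leaving the result

10 FIRST-SQUARE-ABOVE .    \ displays 4
```

LEAVE takes care of the loop parameters itself, so execution simply continues with the DROP that follows LOOP.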
Sometimes you need not only to exit the loop, but to also exit the word in which the
loop appears. This strategy should be used infrequently, because it violates one of
the basic principles of structured programming - each routine should have only
one entry and one exit. Multiple exits from a word should be avoided, in general,
because they make the program flow harder to follow and your code more difficult
to understand, debug, and maintain. But in situations when it would take a lot more
code and complexity to avoid such an exit, the words UNLOOP and EXIT are useful.
The word EXIT causes Forth to leave a definition immediately, and to resume execu-
tion of the next word in the definition that called the word containing EXIT.
A trivial example of EXIT is:
: TEST ( n -- ) 1 . IF EXIT THEN 2 . ;
0 TEST 1 2
1 TEST 1
However, if you use EXIT inside a DO LOOP structure, you can create a problem,
because you would be leaving the loop parameters on the return stack. The word
UNLOOP discards the loop parameters for the current nesting level of a DO LOOP structure.
This word is not needed when LOOP completes normally (or via LEAVE), but it is
required before leaving a definition by calling EXIT. One UNLOOP call for each level of
loop nesting is required before leaving a definition.
Warning If you find yourself using EXIT more than once in a word, you should consider re-
factoring it into two or more simpler definitions.
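A minimal sketch of UNLOOP used with EXIT follows; the word DIVISOR? is hypothetical. The definition returns as soon as a divisor is found, and UNLOOP discards the loop parameters first:

```forth
\ True if some i in 2..9 divides n evenly.
: DIVISOR? ( n -- flag )
   10 2 DO
      DUP I MOD 0= IF  DROP TRUE  UNLOOP EXIT  THEN
   LOOP  DROP FALSE ;

35 DIVISOR? .    \ displays -1 (true: 5 divides 35)
11 DIVISOR? .    \ displays 0 (false)
```

There is one level of loop nesting here, so exactly one UNLOOP precedes EXIT.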
3.6.4 Problems
For example:
17 AVALANCHE 52 26 13 40 20 10 5 16 8 4 2 1
2. Define a word called RANGE that takes a range of numbers and loops between those
numbers until termination or until the iteration counter equals 50. Within the loop,
just type out the current value of the loop index.
For example:
0 40 RANGE displays 0 1 2 ... 39, and
30 60 RANGE displays 30 31 32 ... 50 (then stops).
3. Define a word called STAR that displays an asterisk character (*) on the terminal.
(Hint: This might be implemented using either EMIT or ." which were introduced
earlier.)
4. Define a word called STARS that displays <n> STARS upon request.
5. Define a word called BOX that displays a figure in the following manner.
For example, 3 5 BOX would display a box like this:
*****
*****
*****
6. Define a word called IBOX, which produces a "slanted" box (Hint: SPACE outputs a
single space and <n> SPACES outputs the given number of spaces.)
For example, 5 4 IBOX would display a slanted box like this:
****
****
****
****
****
The following rules must be observed whenever you are using finite loops:
1. DO range values come from the stack. After DO gets them, they are removed and
placed elsewhere (typically on the return stack), and when the loop terminates they
are discarded completely.
2. DO always needs two parameters, although it does not care how they get there. Fre-
quently the limit is supplied as a parameter to the word containing a loop.
3. +LOOP also requires a parameter every time it is executed.
4. If you DUP something inside a LOOP (to keep a copy because it would otherwise be
consumed), it will be on the stack when the LOOP terminates. If you don't want it,
DROP it before ending the definition.
5. The stack effect of the phrase inside the loop must be no net change. Otherwise,
In a nested DO LOOP, I always returns the current index for the innermost loop at the
time I is executed.
To get a copy of the index of the next outer loop from within an inner loop, use the word J. If
you need to nest deeper than that, you should factor your inner or outer loop into
separate definitions to facilitate testing and maintainability.
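A small sketch of I and J in a nested loop (the word TABLE is hypothetical): it displays a 3 by 3 multiplication table, with J supplying the row number from the outer loop and I the column number from the inner loop.

```forth
: TABLE ( -- )
   4 1 DO  CR                 \ outer loop: rows 1..3
      4 1 DO  I J * .  LOOP   \ inner loop: columns 1..3
   LOOP ;
```

Executing TABLE displays the rows 1 2 3, then 2 4 6, then 3 6 9.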
We have noted previously that in most Forth implementations, DO removes its argu-
ments from the data stack and pushes them onto the return stack. This is where
LOOP or +LOOP will manipulate and test them. When the loop terminates, LOOP or
+LOOP removes those arguments from the return stack.
Question Knowing this, can you understand why it's imperative that you not try to "jump
out" of a loop?
Because of this behavior, you must obey some other rules when working with loops:
1. Both the beginning and ending of a DO LOOP structure must be inside the same defi-
nition.
2. LOOP and +LOOP are the only valid terminators for a structure beginning with DO or ?DO.
3. If you use >R or R> in the same definition as a DO LOOP, they must be either both out-
side or both inside the loop (and at the same level, if there are nested loops).
3.6.8 Problems
3. Define a word called RAMP that could be used, for example, to control a stepper
motor or robot arm. This word takes a single number as input, representing dis-
tance; and it generates that many values, which rise to a maximum amplitude, level
off, and then decrease. Use 7 as the maximum amplitude.
For example, 15 RAMP would give:
1 2 3 4 5 6 7 7 7 6 5 4 3 2 1 ok
4 RAMP will give:
1 2 2 1
A good solution to this problem uses only one LOOP.
Selection:
<flag> IF ... THEN
<flag> IF ... ELSE ... THEN
Indefinite iteration:
BEGIN ... <flag> UNTIL
BEGIN ... <flag> WHILE ... REPEAT
BEGIN ... AGAIN
Definite iteration:
<n1> <n2> DO ... LOOP
<n1> <n2> ?DO ... LOOP
<n1> <n2> DO ... <n3> +LOOP
<n1> <n2> ?DO ... <n3> +LOOP
Up to this point, none of the exercises in this book have needed named data stor-
age. We delayed introducing Forth constants and variables in order to ensure that
you get plenty of practice designing routines without using named data items.
In other languages, you can't do much without naming data. In Forth, you can do
quite a lot with just the stack. This is one of Forth's big sources of efficiency. But
more-complex applications usually require named data items and structures. Forth
not only supports this, but offers a unique level of flexibility by allowing the pro-
grammer to define new kinds of data structures, which we'll get to in Section 8.6.
The two generic kinds of data objects are variables (named storage locations) and
constants (named values). A third kind of data object has characteristics of both: it
is a named value that can be changed, whereas constants cannot be changed.
4.1.1 Variables
The word VARIABLE defines a one-cell location in memory for data storage. VARIABLE
is used in the following way:
VARIABLE PLACE
We may describe words like VARIABLE as having two behaviors: a defining behavior
(which creates a member of this class of words), and an instance behavior that is
shared by all words defined by VARIABLE.
These two behaviors of VARIABLE are as follows:
1. The defining behavior creates a dictionary header and allots one cell of data space.
2. When an instance (a word defined by VARIABLE such as PLACE in the example above)
is executed, it pushes the address of its data space onto the stack.
Glossary The following glossary lists defining words for different kinds of variables. Note the
two different stack pictures: the first is for the defining behavior of the word itself
and the second shows the stack effect for an instance of the defining word.
VARIABLE ( - )
( - addr)
Defines a word with one cell of data storage. When a word defined by VARIABLE is
The two things we must be able to do with data are fetch it from an address, and
store it into an address. Each word in the VARIABLE class returns the address of its
data storage.
@ ( addr -- x )
Fetch the one-cell value x from memory at address addr. Pronounced "fetch."
! ( x addr -- )
Store the one-cell value x to memory at address addr. Pronounced "store."
+! ( x addr -- )
Add the value x to the cell in memory at address addr. Pronounced "plus-store."
Glossary Prefixing these operators with 'C' (for "character") gives us these byte operators:
C@ ( addr - char)
Fetch the char from memory at address addr. Pronounced "C-fetch."
C! ( char addr -- )
Store the one-byte value char to memory at address addr. Pronounced "C-store."
C+! ( char addr -- )
Add char to the byte in memory at address addr. Pronounced "C-plus-store." C+! is
not a standard word and may not be available in all implementations.
Glossary Prefixing the cell operator names with '2' (for double) gives us these double-cell
operators:
2@ ( addr - d)
Fetch the two cells from memory at address addr. Pronounced "2-fetch."
2! ( d addr -- )
Store a two-cell value to memory at address addr. Pronounced "2-store." The top
cell on the stack is stored in the first cell in memory.
For all the "store" operators defined above, the address is always on top of the
stack. Therefore, if you have derived data from a complex process, all you need to
do is say, for example, EVENT ! to store that data in a variable named EVENT. Failure
to put these arguments in the right order can be a fatal error.
Continuing with our example above, you could type:
1024 PLACE !
PLACE @ . 1024
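The same operators combine naturally inside definitions. In this sketch the names COUNTER and BUMP are hypothetical; the variable is set explicitly because VARIABLE does not promise an initial value:

```forth
VARIABLE COUNTER
0 COUNTER !                    \ give the cell a known starting value

: BUMP ( -- )  1 COUNTER +! ;  \ add 1 to the stored value in place

BUMP BUMP
COUNTER @ .                    \ displays 2
```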
The order in which the cells are managed is preserved in 2@ and 2! . The high-order
(i.e., the most significant) part of a double-length number is always on top of the
stack, and the top stack item will be stored in the cell with the lower address in
memory (the first cell).
The prefix letters are especially important, because they help you match the correct
operator to each data type. Forth has distinct data types (characters, single-cell
items, double-length items, etc.), but makes no attempt to enforce the matching of
the various operators to specific data types. You can, for example, fetch individual
characters or bytes from a VARIABLE (one cell):
VARIABLE DATA
DATA C@
DATA 1+ C@
Warning Individual access to the bytes of a variable depends on the byte order of the CPU.
For example, many processors are little-endian (the least significant byte of data is
at the lower address in memory) whereas others are big-endian.
Similarly, you can access the individual cells of a 2VARIABLE:
2VARIABLE MORE-DATA
MORE-DATA @
MORE-DATA CELL+ @
This example does not have the byte-order dependency noted above. Further, using
CELL+ to increment the address by one cell can ensure that this usage is portable
across systems of differing cell sizes.
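A short sketch makes the cell ordering visible. The name PAIR is hypothetical; the values 1 and 2 are stored as a pair and then read back one cell at a time:

```forth
2VARIABLE PAIR
1 2 PAIR 2!        \ 2 is on top of the stack, so it goes into the first cell

PAIR @ .           \ displays 2
PAIR CELL+ @ .     \ displays 1
PAIR 2@ . .        \ displays 2 1 (the original stack order is restored)
```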
However, most of the time you'll want to use 2@ and 2! with instances of 2VARIABLE,
C@ and C! with words defined by CVARIABLE, and so on. It is almost always an error
to use a larger fetch or store operator with a smaller data type. For example,
DATA 2@
...will certainly fetch two cells, but you have no way of knowing what's in the cell
that isn't part of DATA (which was defined as a single-cell variable above).
Tip A word that is particularly convenient for debugging is ?, which queries a variable. It
is usually defined as:
: ? ( addr - ) @ . ;
4.1.3 Constants
The defining word CONSTANT defines a class of words whose behaviors are as fol-
lows:
1. The defining behavior of CONSTANT (used to define members of the class) creates a
header in the dictionary and compiles the number that is on top of the stack into
the dictionary.
2. The instance behavior of CONSTANT (executed by members of the class) pushes onto
the stack the number compiled when the instance was defined.
CONSTANT is used in the following way.
Note that you don't need any words, other than the instance name, to retrieve the
value of a CONSTANT.
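As a minimal sketch of this usage (CAPACITY is a hypothetical name chosen for illustration, and decimal radix is assumed):

```forth
1000 CONSTANT CAPACITY   \ defining behavior: consumes 1000 and creates CAPACITY
CAPACITY .               \ instance behavior: pushes 1000, which . displays
CAPACITY 2* .            \ the value can be used like any other number: displays 2000
```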
The word 2CONSTANT is available to define a constant whose value may be either a
double-precision value or two single-precision values, depending on your usage.
VALUE lets you assign a name to a number, like CONSTANT does, and later execution of
that name leaves the assigned value on the stack. But the value can be changed with
the word TO (whereas the value of a CONSTANT cannot be changed).
VALUE is used in the following way:
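A minimal sketch of that usage, assuming decimal radix; SPEED is a hypothetical name:

```forth
0 VALUE SPEED      \ define SPEED with an initial value of 0
SPEED .            \ executing the name leaves its value: displays 0
25 TO SPEED        \ TO changes the value of the VALUE whose name follows it
SPEED .            \ displays 25
```

Note that no @ or ! is needed, which is the main stylistic difference from a VARIABLE.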
The choice between VALUE and VARIABLE is one of optimization and personal style,
but the choice between VALUE and CONSTANT is more technical: a CONSTANT cannot be
changed at run-time on most systems whereas a VALUE can. This may also affect
how the compiler allocates storage for the value, especially in an embedded applica-
tion with a mix of read-only code space and writable data space.
Glossary
CONSTANT (x- )
( - x)
Defines a constant whose value is x.
2CONSTANT ( x1 x2 - )
( - x1 x2 )
Defines a two-cell constant whose cells hold the values x1 and x2. (In the stack nota-
tion, these separate values may be specified as one double-precision value d.)
VALUE (x - )
( - x)
Defines a changeable constant whose initial value is x.
TO (x - )
Changes the value of the VALUE whose name follows TO to x.
Prior to this chapter, we had only one kind of "defining word," : (colon). Now we've
introduced several others, and there will be more before this book is done. So let's
take a few minutes to explore the concept of a "defining word" in Forth.
Consider a colon definition. The colon itself is a defining word, and it is followed by
the name of a new definition. The defining behavior of colon creates a header in the
Forth dictionary for the name, and compiles references to all the words following
that name, up to the semi-colon that terminates the definition. The instance behav-
ior of the new word defined by : (colon) executes the words compiled in the defini-
tion.
So, we can generalize a few things about defining words:
• Each defining word makes an instance of a specific class of words, with a class-spe-
cific defining behavior and instance behavior.
• A defining word is always immediately followed by the name of the instance being
defined.
• The defining behavior creates a dictionary entry for the instance; it may do other
things as well, such as allocate data space or compile code.
• The defining behavior may have a stack effect (e.g., the value assigned to a constant
is consumed when the constant is defined).
Forth makes no distinction between "nouns" and "verbs," between data objects and
functions. For example, instances of VARIABLE are executable, just as colon definitions
are; the action of an instance of VARIABLE is to push the address of its data
space onto the stack.
In the absence of scoping mechanisms such as word lists (discussed in Section 8.2),
all words - including those defined by CONSTANT and VARIABLE - are global, and
any data space assigned at compile time is static. A common mistake made by pro-
grammers learning Forth is to attempt to use defining words such as VARIABLE
inside a colon definition, hoping to make their instances local to that definition. But
defining words used inside a colon definition are just like other words in that defi-
nition: they will only be executed when the definition is executed, not when it is
compiled! So, appropriate usage would be something like this, with the constant (or
variable, or value, etc.) defined outside the colon definition:
1000 CONSTANT SIZE
: SHOW ( - ) SIZE 0 DO ...
4.1.5 Problems
1. In 1626, Dutch traders bought Manhattan Island from Indians belonging to the Wap-
pinger Confederacy for fishhooks and other goods worth 60 guilders ($24 according
to the 1626 exchange rate [9]). Suppose those clever Wappingers had deposited their
$24 in the Bank of New Amsterdam at 5.5%. What would their investment be worth
in 1926?
• First do this problem using simple interest:
amount = principal * interest rate * time
• Then do it using compound interest, compounding annually (compute the simple
interest every year and add it to the principal).
Hint: Your numbers are going to get very large.
2. Your application needs a set of parameters called UPPER-LIMIT, LOWER-LIMIT,
STARTING-VALUE, CURRENT-VALUE, and INCREMENT. All of these have fixed default
values, but their actual values may change in use.
Define these data items, and a user word LIMITS to set upper and lower limits, and
sets STARTING-VALUE and CURRENT-VALUE to LOWER-LIMIT. Write a word STEP that will
add INCREMENT to CURRENT-VALUE, and then display CURRENT-VALUE. If CURRENT-VALUE
reaches or passes UPPER-LIMIT or LOWER-LIMIT, it should be reset to STARTING-VALUE.
Include a word called DEFAULTS that re-establishes all the default values.
You may define these items using VALUE, VARIABLE, or a combination of the two. If
you wish, write this problem twice, once using VALUE and then using VARIABLE. Com-
pare the resulting code.
9. This is about 0.2 cents per acre. Historians estimate that the purchasing power of 60 guilders was
equivalent to several thousand dollars. This still looks like a good deal until you realize that the Wappingers
didn't actually own the island. They nonetheless set an example of business dealing that has
inspired residents of the area to this day.
An array is a named entity referring to more than one item of data. Arrays are com-
monly used to hold groups of numerical data or strings of text.
BUFFER: ( u- )
( - addr)
Create a named array of length u bytes. Its instance behavior is to return the start-
ing address of the array. BUFFER: is not a standard word, but it is available in most
FORTH, Inc. systems.
CREATE ( - )
( - addr)
Creates a named data object associated with the next location in data space. Its
instance behavior is to return the address of this data space. However, CREATE
doesn't allocate any data space; this must be done with ALLOT or one of the other
memory management words described later.
HERE ( - addr)
Returns the address of the next available location of data space.
ALLOT ( n -)
Allots n bytes of data space, starting at HERE.
Simple arrays may be built using the word BUFFER: preceded by a size in bytes and
followed by a name. When the name is invoked, it will return the address of the
beginning of the buffer. Equivalent results may be obtained by CREATE and ALLOT.
The following lines construct two identical arrays, each 100 bytes long:
CREATE STUFF 100 ALLOT
100 BUFFER: STUFF
The CREATE ... ALLOT sequence is just slightly more "manual," and the individual
words in it can be used to build other kinds of structures as well.
The address and length of a region of memory form the principal arguments for
words used to manage arrays:
Glossary
ERASE (addr u - )
Erases a region of memory (clears it to all binary zeros), given its starting address
and length.
BLANK ( addr u - )
Sets the specified region of memory to blanks (20H).
Try this The following example defines a buffer named STUFF that is SIZE (50) bytes in
length. We then fill the buffer with various values and dump its contents.
50 CONSTANT SIZE
SIZE BUFFER: STUFF
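The fill-and-dump steps might look like the sketch below. It is an illustration under stated assumptions rather than the course's own listing: the buffer is built with CREATE ... ALLOT (described above as equivalent to BUFFER:) so that the sketch stands alone, FILL and DUMP are the standard words of those names, and the particular offsets and fill character are arbitrary choices.

```forth
DECIMAL
50 CONSTANT SIZE
CREATE STUFF  SIZE ALLOT     \ same result as SIZE BUFFER: STUFF

STUFF SIZE ERASE             \ clear all 50 bytes to binary zeros
STUFF 10 BLANK               \ set the first 10 bytes to blanks (20H)
STUFF 10 +  5  CHAR * FILL   \ set the next 5 bytes to the character *
STUFF SIZE DUMP              \ display the contents of the buffer
```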
: CELLS ( n1 -- n2) 2* 2* ;
: CELL+ ( addr1 -- addr2) 4 + ;
Get in the habit of using CELLS and CELL+ to enhance the portability and readability
of your code.
Analogous words CHARS and CHAR+ are in ANS Forth for incrementing addresses by
characters. That is important if you're writing code that may run on processors for
which a character, a byte, and an address unit are not always the same size. Such
platforms are rare, however, and if you accept the environmental dependency that
these are the same size, you probably won't limit the portability of your code signif-
icantly. For the purpose of this book, we assume they are the same size.
Let's create an array named CATALOG with space for five cells (elements).
We'll define a word named ELEMENT that, given a parameter n on the stack, returns
the address of the nth element of the array.
You can store and fetch data from CATALOG using the index (0-4) of each ELEMENT in
the array.
Finally, we'll define a word named SHOW that prints the entire array.
<n> ELEMENT (The address of the nth element is now on the stack.)
Element 2 contains 10
Element 3 contains 15
Element 4 contains 20
ok
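The steps above can be sketched as follows. This is a reconstruction under assumptions, not the original listing: FILL-CATALOG is a hypothetical helper, and the stored values (five times the index) were chosen only because they agree with the sample output shown above.

```forth
DECIMAL
CREATE CATALOG  5 CELLS ALLOT                 \ space for five one-cell elements

: ELEMENT ( n -- addr )  CELLS CATALOG + ;    \ address of the nth element (0-4)

: FILL-CATALOG ( -- )                         \ hypothetical: element n gets 5*n
   5 0 DO  I 5 *  I ELEMENT !  LOOP ;

: SHOW ( -- )                                 \ print every element of the array
   5 0 DO
      CR ." Element " I .  ." contains " I ELEMENT @ .
   LOOP ;

FILL-CATALOG  SHOW
```

Because ELEMENT uses CELLS rather than a literal multiplier, the same source should work unchanged on systems with different cell sizes.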
4.2.3 Problems
1. Create a word named ARRAY that will be used to define a class of arrays in the fol-
lowing way:
<n> ARRAY <name>
...where n is the number of one-cell elements in the array and name is the name of
the array.
ARRAY should create a header in the dictionary and allot the specified number of
bytes. Now write a word INDEX that expects a parameter on the stack that is added
to the address of the array instance as an index into the array.
ARRAY is used in the following way:
10 ARRAY STUFF
4 STUFF INDEX
1 STUFF INDEX
2. Use ARRAY to define an array four cells in size. Make words that name each of the
four cells. For example:
4 CELLS ARRAY ITEMS
After your definitions, FIRST returns the address of the 0th cell, SECOND returns the
address of the next, etc.
4.3 Tables
A table is an array whose initial content is specified at compile time. A table defini-
tion usually starts with CREATE followed by one or more uses of , (pronounced
"comma") or c, ("C-comma") to compile specific values.
The word , allots one cell of data space in the next available dictionary location and
stores the top stack item in it. The most common use of , is to compile values into
a table whose name and starting address is defined by using CREATE. Consider this
example:
CREATE TENS 1 , 10 , 100 , 1000 , 10000 ,
This establishes a table whose starting address is given by TENS. It contains powers
of ten from zero through four. Indexing this table by a power of ten will give the
appropriate value. A possible use might be:
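A minimal sketch of such a lookup, assuming decimal radix. TEN-POWER is a hypothetical name, and the table is repeated here only so that the sketch stands alone:

```forth
DECIMAL
CREATE TENS  1 , 10 , 100 , 1000 , 10000 ,

: TEN-POWER ( n -- 10**n )   \ n must be in the range 0 through 4
   CELLS TENS + @ ;

3 TEN-POWER .                \ displays 1000
```

No range check is made; an index outside 0 through 4 simply fetches whatever follows the table in memory.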
Forth provides many words used to reference single characters or strings of charac-
ters. Characters may be grouped together and thought of as a string; this group can
then be operated on as a single entity.
...you'll get the character code for the letter "A" on the stack. However, inside a definition,
you might write:
: ?DIGIT ( char -- flag) [CHAR] 0 [CHAR] 9 1+ WITHIN ;
... which returns true if the character supplied on the stack is a decimal digit. Speci-
fying a character this way makes your code much more readable than simply plug-
ging in the numeric code as a literal, and because [CHAR] compiles the character
code as a literal, there's no difference in program size or performance.
But this strategy is only practical with visible graphic characters. If you're working
with control codes, we recommend defining them as constants, such as:
27 CONSTANT ESC
10. Or inside a colon definition that needs to parse a character code from the input stream and then do
something with it.
Many Forth systems, including those from FORTH, Inc., define BL to return the code
for the space character ("blank") because that character is so frequently used.
5.1.2 Strings
Forth contains several words used to reference strings, compare them, and move
them between different locations. In addition, other words are used for string input
and output.
Most words that operate on exactly one string of characters expect the length of
that string to be on top of the stack with its address beneath:
( addr u - )
Words that operate on two strings expect three or four items on top of the stack:
( addr1 addr2 u - )
( addr1 u1 addr2 u2 - )
In the first stack picture above (three arguments), the single length u applies to both
of the strings instead of using two separate character counts. In the second stack
picture (four arguments), a length is given for each string.
Glossary The primary string management words are shown in the following glossary:
[Figure: overlapping string moves. Numbered boxes indicate the order in which characters are moved.]
Question What would be the effect of using CMOVE> on the strings overlapped as in Figure 11
above? Or of using CMOVE on the second pair? Draw a picture.
The word MOVE tests for overlap, then uses either CMOVE or CMOVE> to do the string
move. Most of the time you will use MOVE, unless you're interested in the following
special behavior of the CMOVE words. (PAD is described in Section 5.1.3.)
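One such behavior is propagation: because CMOVE copies character by character from lower addresses to higher ones, moving a string onto itself one position to the right repeats its first character. The sketch below assumes a system where CMOVE follows the standard definition; the choice of * and a length of 10 is arbitrary.

```forth
CHAR * PAD C!              \ seed the first character of the area
PAD  PAD 1+  9  CMOVE      \ each character is copied onto its right-hand neighbor
PAD 10 TYPE                \ displays **********
```

MOVE would not do this, since it chooses the direction that preserves the source string.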
A standard working area is used to hold most character strings for processing. This
area is of indefinite size, and its location in memory usually is defined as an offset
from the current top of the dictionary.
Executing the word PAD returns the address of this working area. Because PAD usu-
ally is located relative to the dictionary, any operation that reserves or releases dic-
tionary space, such as VARIABLE or ALLOT, may change the location of PAD, rendering
any data left there inaccessible; therefore, PAD is best used only for temporary oper-
ations, not for storing strings for later use. PAD provides a convenient way to supply
Glossary
PAD (- addr)
Returns the address of a scratch pad that may be used for temporary storage of
strings (or other data).
Many Forth words that manage strings use an internal format called a counted
string. This format stores the string's length (up to 255) in the first byte:
[Figure: a counted string, consisting of a count byte followed by n bytes of characters.]
This format is more efficient than a "null-terminated string" (common in other lan-
guages) because the count is stored when the string is acquired, and doesn't impose
a run-time penalty of re-counting the string every time it is used or of testing for the
null terminator.
COUNT is frequently used with counted strings. COUNT takes as its parameter the
address of a counted string. It returns the address of the string's first character and
the length of the string:
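[Figure: a string of text in memory. The counted string at addr1 holds the count 5 followed by the characters HELLO. The phrase addr1 COUNT returns addr2 (the address of the first character, H) and 5 (the length of the string) on the stack.]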
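In high-level Forth, the behavior of COUNT can be sketched as shown below. MY-COUNT and GREET are hypothetical names, used so that the system's own COUNT is not redefined; the sketch assumes one character occupies one address unit, as elsewhere in this book.

```forth
: MY-COUNT ( addr1 -- addr2 u )
   DUP 1+           \ addr2: the first character follows the count byte
   SWAP C@ ;        \ u: the count byte itself

\ A counted string built by hand for the demonstration: count 5, then HELLO
CREATE GREET  5 C,  CHAR H C,  CHAR E C,  CHAR L C,  CHAR L C,  CHAR O C,

GREET MY-COUNT TYPE    \ displays HELLO
```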
Glossary
11. This is just an example. The actual implementation of COUNT is usually done in assembler to take
advantage of the CPU's instruction set.
Glossary These words provide for string input from a user input device, such as a keyboard:
ACCEPT ( addr u1 - u2 )
Awaits up to u1 characters from the current user input device, placing them at
addr. Input is terminated by a CR (e.g., the Enter or Return key). ACCEPT returns the
actual count u2 of characters received. If more than u1 characters are sent, the
excess characters are discarded.
KEY ( - char)
Waits for a single character from the input device and leaves its character code on
the stack.
KEY? ( - flag)
Returns true if a character is available to be returned by KEY.
EKEY ( - u)
Waits for a single keyboard event on the input device and leaves its value on the
stack. The encoding of keyboard events is implementation defined.
EKEY? ( - flag)
Returns true if a keyboard event is available to be returned by EKEY.
Note that KEY returns a character code in the implementation-defined character set.
In the old days of serial terminals, that was usually just a 7-bit ASCII code. But more
complex input terminals and PC environments have made that inadequate.
A more general word capable of returning any keyboard event (including control
codes and function keys) is EKEY.
Both KEY and EKEY will wait until a key (or keyboard event) is available. If you want
to ask whether a new KEY or EKEY is available, the words KEY? and EKEY? can be used;
these tests are non-destructive. When either returns true, a subsequent use of KEY
(after KEY?) or EKEY (after EKEY?) will retrieve the incoming character or keyboard
event without waiting.
Exercise this definition, typing a mix of letter keys and function keys. Now try the
same definition using KEY? and KEY instead of EKEY? and EKEY. How does its behav-
ior differ?
The word ACCEPT is the main word used by Forth to receive its input stream from
the keyboard. Because ACCEPT will immediately respond to any "editing" characters
such as BS and DEL it encounters (by backspacing in the string), and because CR
(Enter or Return) will terminate input, this behavior makes ACCEPT unsuitable for
communications or for receiving binary data across a serial line. For these purposes,
we recommend KEY or EKEY in a loop.
Just as Forth has both single-character and multi-character words for input, it also
has two for output:
TYPE ( addr u -)
Outputs u characters from addr to the output device.
EMIT ( char - )
Takes one character from the top of the stack and sends it to the output device.
1. Define INSERT ( n char - ) so that the character on top of the stack is inserted into
the counted string at PAD after character n.
2. Define DELETE ( n - ) so that the nth character is deleted from the counted string at
PAD and the string at PAD is shortened accordingly.
3. Define UPPER ( char1 - char2 ) to convert lower-case character codes to upper case,
and define LOWER to do the opposite.
WORD is the main workhorse of Forth's text interpreter. It fetches characters from the
input stream (e.g., the terminal input buffer or a text file) - starting at the offset
given by a variable called >IN - to a specified delimiter or the end of the input
stream, whichever comes first, according to the following rules:
1. The input characters are placed in storage, conventionally at the next available loca-
tion in the dictionary (whose address is returned by the word HERE), with the length
of the string in the first byte (i.e., a counted string).
2. The dictionary pointer is not modified.
3. The area where the characters are placed is not pre-initialized.
4. WORD returns the address of the counted string on the stack. This is a convenience
for the words that conventionally follow it, such as COUNT TYPE.
The buffer in which WORD returns its parsed string is a transient area subject to frequent
re-use. Therefore, when you use WORD to read a string from the input stream,
you should finish working with the string or move it to another area promptly.
5.2.1 Problems
1. Define a word named I'M that accepts a string from the input stream and prints the
string to the terminal:
I'M FORTH <cr> FORTH
I'M NAME <cr> NAME
2. Define a word named MEET that accepts a string from the input stream, prints HI
and the string on the line below, then causes Forth's "ok" prompt to appear on the
line below the string:
MEET FORTH <cr>
HI FORTH
ok
Compiled strings are used for labeling and display, for user prompts, error mes-
," ( - )
Compiles a string terminated by a " as a counted string. Typically used to build a
string data structure. ("comma-quote")
S" ( - addr u )
Accepts a string from the input stream, either inside or outside a colon definition.
When S" is used inside a colon definition, the string will be compiled such that,
when the definition is executed, the string address and count will be on the stack.
When S" is used interpretively, addr u are returned. The string is terminated by a "
(double-quote) character.
C" ( - addr )
Like S", but returns the address of a counted string.
." ( - )
Used inside a colon definition, accepts from the input stream a string of characters
terminated by a " (double-quote) and compiles it in the definition such that, when
the definition is executed, the string will be typed out. ("dot-quote")
This will display the message WARM if the temperature value on the stack is greater
than 68, or COOL otherwise.
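A definition with that behavior might be sketched as follows. WARMTH and .WARMTH are hypothetical names; splitting the work into a word that selects the string and a word that types it is a choice made here for clarity, not necessarily the form of the original example.

```forth
DECIMAL
: WARMTH ( n -- addr u )      \ select a message for temperature n
   68 > IF  S" WARM"  ELSE  S" COOL"  THEN ;

: .WARMTH ( n -- )  WARMTH TYPE SPACE ;

72 .WARMTH     \ displays WARM
68 .WARMTH     \ displays COOL (68 is not greater than 68)
```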
C" is very similar to S", but it compiles a counted string and returns its address as a
single argument. Beyond that, it depends on the intended use whether the conve-
nience of passing a single address outweighs the need to use COUNT to get the actual
address and length.
The word ." is used inside colon definitions only to compile a string that will be
output when the word in which it appears is executed. For example:
: GREETING ( -- ) ." Hi there" ;
S" and C" also may be executed interpretively, if you need temporary access to a
string outside a colon definition. For example, INCLUDED loads a file, given the
address and count of a string containing the filename. The syntax would be:
s" <filename>" INCLUDED
On many implementations, interpreted S" and C" use a single buffer to hold the
string. Therefore, successive uses of s" or c" may overwrite the buffer.
Unlike S" and C", which provide temporary string access while interpreting, the word
," ("comma-quote") is used to compile a string into data space. This may be used
after CREATE to compile a named string as in the following example:
CREATE NAME ," Acme Widgets, Inc."
: .NAME ( -- ) NAME COUNT TYPE ;
SwiftForth from FORTH, Inc. includes similar string-defining words to support zero-
terminated strings, strings with embedded control characters, and other similar
structures. See the SwiftForth Reference Manual for details.
Glossary
CR ( - )
Causes the cursor to move to column 0 of the next line on a display device, or to
output the current line and advance to the next line on a printer.
PAGE (- )
Clears the display and moves the cursor to row 0, column 0. If the device is a
printer, PAGE does a form feed.
AT-XY ( x y - )
Moves the cursor to line y, column x.
GET-XY ( - x y )
Returns the current column and row position of the cursor.
Convert the ?WAY problem (Section 3.5.5) to move the cursor one position for each
press of a direction key. The HOME function should position the cursor in the upper-
left corner of the screen (position 0,0). This version should use AT-XY to move the
cursor; all other characters must be sent to the display screen using EMIT.
Glossary
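[Figure: the first string, "It was an ancient mariner, and he stoppeth one of three." (length = 56), starts at addr1; addr3 marks the start of "mariner" within it. The second string, "mariner" (length = 7), starts at addr2.]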
This phrase:
<addr1> 56 <addr2> 7 SEARCH
... returns these results:
<addr3> 38 -1
... because there is a match starting at addr3 with 38 characters left in the first
string.
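The same search can be tried at the keyboard. In this sketch VERSE, TARGET, and FIND-IT are hypothetical names, and the strings are compiled with S" inside colon definitions so that each word returns an address and length.

```forth
DECIMAL
: VERSE  ( -- addr u )  S" It was an ancient mariner, and he stoppeth one of three." ;
: TARGET ( -- addr u )  S" mariner" ;

: FIND-IT ( -- )
   VERSE TARGET SEARCH               \ ( addr3 u3 flag )
   IF    ." found, " NIP .  ." characters remain"
   ELSE  2DROP ." not found"
   THEN ;

FIND-IT     \ displays: found, 38 characters remain
```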
Input            Output
ACCEPT           TYPE
KEY, EKEY        EMIT
KEY?, EKEY?      MARK

Compare          Initialize
COMPARE          ERASE
SEARCH           BLANK
                 FILL

Compiling        Miscellaneous
,"               PAD
S"               -TRAILING
C"               CHAR and [CHAR]
                 BL
                 /STRING
When you are testing an application using command-line input, you can take advan-
tage of Forth's interactive nature. Thus, a hypothetical word SCANS whose function
is to perform a user-specified number of scans should expect its parameter on the
stack. Then to perform 100 scans, you could type:
100 SCANS
Such a usage is natural and convenient for the operator and requires no special pro-
gramming to handle the input parameter.
In dialog boxes or menu-driven user interfaces, however, normal Forth syntax is
inadequate. Forth provides several words to help you handle input numbers in a
variety of circumstances. This section describes those methods.
First, let's review the basic algorithm for converting a string to a number:
1. Start with an "accumulator" (normally double-precision) on the stack, whose initial
value is 0.
2. Take the most significant (leftmost) character in the string. Convert that digit to its
binary equivalent. For decimal digits, this is as simple as subtracting the character
code for the digit 0. Hex digits are more complicated; we'll get to them later.
3. Multiply the accumulator by the value in BASE (the current radix) and add the digit.
4. Repeat steps 2 and 3 until there are no more digits. The resulting integer is in the
accumulator.
This limited summary leaves some open issues, such as what to do when the string
is exhausted, or when you encounter a character that is not a digit. These issues are
handled at various levels by the following input number conversion words.
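The four steps above can be sketched in high-level Forth. This is an illustration only: STR>UD and UD*BASE are hypothetical names, every character is assumed to be a digit 0 through 9 that is valid in the current radix, and none of the open issues just mentioned is handled.

```forth
: UD*BASE ( ud1 -- ud2 )        \ multiply a double-length number by BASE
   BASE @ *  SWAP               \ high cell times BASE (overflow ignored)
   BASE @ UM*  ROT + ;          \ low cell times BASE, carry added to high cell

: STR>UD ( addr u -- ud )
   0 0 2SWAP                    \ step 1: accumulator starts at zero
   OVER + SWAP ?DO              \ visit each character, leftmost first
      UD*BASE                   \ step 3: multiply the accumulator by BASE...
      I C@ [CHAR] 0 -           \ step 2: convert the character to a digit value
      0 D+                      \ ...and add the digit
   LOOP ;                       \ step 4: repeat until the string is exhausted

DECIMAL  S" 1024" STR>UD D.     \ displays 1024
```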
Glossary
NUMBER ( addr u - n / d )
Attempts to convert string addr u into a binary number, using the radix (e.g., 10 for
decimal, 16 for hex) in BASE. If valid punctuation ( , . + - / : ) is found, returns d; if
there is no punctuation, returns n. If conversion fails due to a character that is nei-
ther a digit nor punctuation, an ABORT occurs.
NUMBER? ( addr u - 0 / n 1 / d 2 )
Like NUMBER, but returns a flag on top of the stack describing the results:
0 = the string does not represent a number
1 = no punctuation in string, single number returned
2 = punctuation in string, double number returned.
attempt to convert that string to binary and, if successful, will leave the result on
the stack, below a flag that describes the result:
... returns the actual length of the string that was received into PAD.
NUMBER or NUMBER? may be used to convert a string from another location (e.g., a
string that has not been fetched by use of WORD or ACCEPT). If it is not feasible to
guarantee the trailing space, you may prefer to move the string to PAD, as shown
below, or use >NUMBER and decide what to do when it encounters a non-digit.
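As a minimal sketch of the >NUMBER approach, the word below converts as many digits as it can and reports whether the whole string was consumed. PARSE-UNSIGNED is a hypothetical name, and treating any leftover character as failure is only one possible policy.

```forth
: PARSE-UNSIGNED ( addr u -- ud flag )   \ flag is true if every character converted
   0 0 2SWAP          \ initial accumulator beneath the string
   >NUMBER            \ ( ud addr2 u2 ) u2 counts the unconverted characters
   NIP 0= ;           \ success only if nothing is left over

DECIMAL
S" 123" PARSE-UNSIGNED .  D.     \ displays -1 123
```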
12. The trailing space is not included in the count returned by ACCEPT.
Write a word GET-IP that will accept from the input stream a number of the form
nnn.nnn.nnn.nnn and will break it into four numeric segments that are left in four
consecutive bytes of a buffer called PARTS. Assume that any individual part may
contain one to three digits, and no individual part will be greater than 255.
Forth contains a set of words that allow numeric quantities to be output through use
of a pictured output control. These words allow specification of all aspects of the
numeric output format.
In Forth, the description of these words starts at the low-order portion of the field
and continues to the high-order portion. Although this is the reverse of the method
apparently used in other languages, it is the internal conversion process in all lan-
guages. Recall that BASE contains the current conversion radix. The basic algorithm
is the exact converse of the input conversion algorithm, as follows:
1. Start with an unsigned, double-length number. If you'll be treating this as a signed
number, you must first save the sign information (later we'll see how).
2. Divide the number by the value in BASE, getting a quotient and remainder. The
remainder is the value of a single digit; the first time, it's the low-order (rightmost)
digit, moving successively toward higher-order digits.
3. Convert the remainder to a printable character by adding the character code for 0
and appending the resulting character to a string being built from right to left.
4. Continue for as many digits as you want, or until the quotient reaches zero (depending
on your choice of words).
5. Discard the quotient, and return the address and length of the string.
These steps are performed by a set of words described in Section 6.2.2, which provides
complete control over the process and the appearance of the resulting string.
These words convert numbers on the stack into printable character strings con-
forming to the format requirements. The converted string can be displayed using
TYPE or can be used in some other way.
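To make the algorithm concrete, here is a sketch that carries out the steps by hand instead of using the pictured output words. It simplifies in labeled ways: it converts a single-cell unsigned number, not a double-length one, it assumes BASE is 10 or less (so every digit is a character 0 through 9), and OUTBUF, OUTPTR, and U>STR are hypothetical names.

```forth
CREATE OUTBUF  64 ALLOT          \ scratch area; the string grows from its right end
VARIABLE OUTPTR                  \ address of the leftmost character built so far

: U>STR ( u -- addr len )
   OUTBUF 64 + OUTPTR !          \ start with an empty string at the far right
   BEGIN
      0 BASE @ UM/MOD            \ step 2: ( rem quot ) divide by BASE
      SWAP [CHAR] 0 +            \ step 3: remainder becomes a printable digit
      -1 OUTPTR +!  OUTPTR @ C!  \ ...placed to the left of the earlier digits
      DUP 0=                     \ step 4: stop when the quotient reaches zero
   UNTIL
   DROP                          \ step 5: discard the quotient
   OUTPTR @  OUTBUF 64 + OVER - ;

DECIMAL  1024 U>STR TYPE         \ displays 1024
```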
In many systems, strings are built in an area in memory that immediately follows
the end of the dictionary (the address left by HERE). This area is large enough to
accommodate at least 32 characters of output (64 characters on 32-bit machines).
All the standard numeric output words use the same region. As a result, these
words may not be executed while a pictured numeric output conversion is in pro-
cess (e.g., during debugging). Furthermore, you must not make new definitions dur-
ing the pictured numeric conversion process, as this would overwrite the area in
which the string is being generated.
. ( n - )
Displays n as a signed, single-precision integer followed by one space.
.R ( n1 +n2 - )
Displays the signed, single-precision integer n1 with leading spaces, to fill a field of
width +n2, right-justified. The width of the printed string that would be output by .
is used to determine the number of leading blanks. No trailing blanks are printed. If
the magnitude of the number to be printed prevents printing within the number of
spaces specified, all digits are displayed with no leading spaces in a field as wide as
necessary.
? ( addr - )
Displays the contents of addr. Equivalent to the phrase: @ .
D. ( d -)
Displays d as a signed, double-precision integer.
D.R ( d +n - )
Displays d as a signed, double-precision integer in a field of width +n, as for .R.
U. (u - )
Displays u as an unsigned, single-precision integer followed by one space.
U.R ( u +n - )
Unsigned version of .R. Displays u with leading spaces to fill a field of width +n,
right-justified.
These words provide control over conversion of binary numbers into digits. Conversion
is initiated by <#. Throughout the process, the number being operated on is on
the stack, repeatedly divided by the number radix (in BASE) as digits are converted.
The remaining number is discarded by #> at the end of the process.
Glossary
<# ( - )
Begins formatted output of an unsigned double-precision integer.
# ( ud1 - ud2 )
Prepends the next digit to the string. Must be used between <# and #> (though not
necessarily in the same definition). The first digit added is the lowest-order digit
(units), the next digit is the tens digit, etc. Each time # is used, a digit is generated,
regardless of whether or not it is a significant digit.
#S ( ud1 - ud2 )
Converts digits repetitively until all significant digits have been converted, at which
point conversion is complete. Must be used between <# and #>. #S always results in
at least one output character, even if the number to be converted is a zero.
SIGN ( n -)
Inserts a minus sign at the current position in the output string if n is negative. The
magnitude of n is irrelevant, only its sign is of interest. In order for the sign to
appear at the left of the number (the usual place), SIGN must be called after all digits
have been converted.
HOLD ( char - )
Inserts a character at the current position in the output string. The character code
to be inserted must be on the stack.
#> ( ud - addr u )
Completes the conversion process after all digits have been converted. This word
discards the (presumably exhausted) double-precision number, and pushes onto the
stack the address of the output string and its count.
Consider one possible definition of the standard Forth word . ("dot"):
: . ( n -- ) \ Display n
DUP ABS 0 \ Prepare
<# #S ROT SIGN #> \ Convert string
TYPE SPACE ; \ Output string
DUP ABS leaves two numbers on the stack: the absolute value of the number on top
of the number itself, which is now useful only for its sign. 0 adds a cell on top of the
stack, so that the 0 cell and the ABS cell form a double-precision integer to be used
by the <# ... #> routines.
To print a signed, double-precision integer with the low-order three digits always
appearing, regardless of their value, you could use the following definition:
: NNN ( d -- ) \ Display d, with at least 3 digits
   SWAP OVER DABS \ Prepare
   <# # # #S ROT SIGN #> \ Convert string
   TYPE SPACE ; \ Output string
The phrase SWAP OVER DABS puts the signed value beneath the absolute value of the
number to be printed, for use by the word SIGN. The sequence # # converts the low-
order two digits, regardless of value. The word #S converts the remaining digits and
always results in at least one character of output, even if the value is zero.
From the time the initialization word <# executes until the terminating #> executes,
the number being converted is on the stack. It's possible to use the stack for inter-
mediate results during formatted processing, but anything put on the stack must be
removed before any subsequent picture editing or fill characters may be processed.
: '.' ( -- ) [CHAR] . HOLD ;
The word '.' produces a decimal point at the current position in the pictured
numeric output. To illustrate, the word .$ below will print double-precision integers
as signed amounts with two decimal places:
: .$ ( d -- ) \ Display a number as dollars & cents
   SWAP OVER DABS \ Prepare
   <# # # '.' #S ROT SIGN #> \ Convert string
   TYPE SPACE ; \ Display
If fill characters are likely to be used in several definitions, you may wish to add
commands similar to '.' above.
The pictured output capabilities described in the preceding two sections are suffi-
cient to handle most output requirements. Special cases, however, such as the intro-
duction of commas in a number or the floating of a character (such as a currency
symbol), require special processing. In order to perform certain of these operations,
it is necessary to refer to the unconverted portion of a number being printed.
This unconverted portion is a number equivalent to the original number divided by
10 (or the current radix) for each numeric digit already generated. For example, if
the initial number is 123, the intermediate number is 12 (following the conversion
of the first digit) and 1 (following conversion of the second digit).
The value of this number may be tested, and logical decisions may be based on its
value. To illustrate, consider the following definitions. The word D.ENG prints a
double-precision integer in engineering format:
: ',' ( -- ) [CHAR] , HOLD ;
Using techniques similar to those above, you can do almost any kind of numeric
output formatting in Forth.
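As an illustration of testing the unconverted portion of the number, here is a minimal sketch that inserts a comma between each group of three digits. It is not the D.ENG definition referred to above; the names (D.COMMAS) and D.COMMAS are hypothetical, and the Double-Number words DABS and D0= are assumed to be available. The helper ',' is repeated so the block loads on its own.

```forth
: ',' ( -- )  [CHAR] , HOLD ;          \ fill character, as in the text

\ Hypothetical name: format d, leaving the string instead of printing it.
: (D.COMMAS) ( d -- addr u )
   SWAP OVER DABS                      \ sign cell beneath the absolute value
   <# BEGIN
         #                             \ always convert one digit
         2DUP D0= 0= IF  #             \ anything left? convert a second digit
         2DUP D0= 0= IF  #             \ and a third
         2DUP D0= 0= IF  ','  THEN     \ still more to come: separate the groups
         THEN THEN
      2DUP D0= UNTIL                   \ stop when the unconverted portion is zero
      ROT SIGN #> ;

\ Hypothetical name: display d with grouping commas.
: D.COMMAS ( d -- )  (D.COMMAS) TYPE SPACE ;
```

Typing 1234567. D.COMMAS should display 1,234,567 under these assumptions. The phrase 2DUP D0= examines the unconverted portion without consuming it, which is what lets the comma be suppressed once the remaining number is exhausted.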
6.2.6 Problems
1. Define a word named .SSN that accepts a double-precision number from the stack
and prints it in the form of a social security number.
.SSN is used in the following way:
4. Define a word named DF. that accepts two numbers from the stack. The second
number on the stack is a double-precision number and at the top of the stack is a
single-precision number. The double-precision number is to be printed, together
with a preceding minus sign if the number is negative, with the single-cell number
representing the number of digits to the right of the decimal point.
DF. is used in the following way:
5. The commands BASE ? yield 10 regardless of the base in which Forth is operating at
the time. Define a word named ?BASE that will print the base in decimal without per-
manently altering it.
?BASE is used in the following way:
DECIMAL ?BASE 10
HEX ?BASE 16
OCTAL ?BASE 8
2 BASE ! ?BASE 2
... or ...
['] <name>
...inside a definition.
The execution token of name may then be passed to the word EXECUTE, which will
execute it. On some implementations, an xt is an address; on others, it is a special
kind of pointer, table index, or offset.
The word EXECUTE expects on the stack the execution token of a definition.
Glossary
Note that where two stack pictures are given below, the first is the compile-time
stack effect and the second is the run-time stack effect.
' ( - xt )
Parses the next word from the input stream and returns its xt. Aborts if the word
isn't found in the dictionary. Pronounced "tick."
['] ( - )
( - xt )
Used inside a colon definition. Parses the next word in the input stream at compile
time, and compiles its xt as a literal. Aborts if the word isn't found in the dictionary.
EXECUTE ( xt - )
Executes the word whose xt is on the stack. Arguments to the word to be executed
(if any) must be on the stack below the xt.
: 7AM ( -- ) CR ." Breakfast" ;
: 12PM ( -- ) CR ." Lunch" ;
: 6PM ( -- ) CR ." Supper" ;
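The selector words and SERVE used in the discussion that follows can be sketched like this. The words MORNING, NOON, NIGHT, and SERVE are named in the text; the variable name MEAL and the exact definitions are assumptions made for illustration.

```forth
VARIABLE MEAL                          \ hypothetical name for the execution vector

: 7AM   ( -- )  CR ." Breakfast" ;
: 12PM  ( -- )  CR ." Lunch" ;
: 6PM   ( -- )  CR ." Supper" ;

: MORNING ( -- )  ['] 7AM  MEAL ! ;    \ each selector stores one xt in the vector
: NOON    ( -- )  ['] 12PM MEAL ! ;
: NIGHT   ( -- )  ['] 6PM  MEAL ! ;

: SERVE ( -- )  MEAL @ EXECUTE ;       \ run whichever behavior was selected last

MORNING                                \ give the vector a valid xt before first use
```

Because SERVE uses plain @ EXECUTE, the vector must hold a valid xt before SERVE runs; that is why the sketch ends by executing MORNING.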
Typing:
NIGHT SERVE
Note that other actions can occur between NIGHT and SERVE without changing this
behavior. Further, SERVE may be executed as many times as needed without its
action changing: only typing MORNING, NOON, or NIGHT will change the action of SERVE.
The word ['] searches the dictionary for the next word in the definition. If it finds
the word, it compiles the word's execution token into the dictionary as a literal. [']
must be used inside a colon definition.
To get the xt of a word from the input stream, use ' ("tick"). It gets the next word
from the input stream, looks it up in the dictionary, and returns its xt on the stack.
The phrase @ EXECUTE is so common that FORTH, Inc. systems provide the word
@EXECUTE to save space and CPU time. The behavior of @EXECUTE is the same as the
phrase @ EXECUTE, with the addition of a check on the contents of the address sup-
plied. If it contains zero (which is not a valid xt), @EXECUTE will simply return to the
calling definition without performing any operation. 14 This means that execution
vectors may not require special initialization.
All members of a set of words to be vectored through a single execution vector
must share the same stack effect. That is, they must all require or leave the same
number of items on the stack.
The word DEFER provides a convenient means of managing a single execution vector.
The word IS provides a method to put another word's xt into the vector.
The syntax is:
DEFER <name>
<xt> IS <name>
14. Ignoring a zero xt is appropriate only in cases where there are no stack arguments.
DEFER defines name and makes it an execution vector. The execution token of the
word to be executed is stored into the data area of name by the word IS. An error
will occur if name is executed before it has been initialized by IS.
DEFER lets you change the execution of previously defined commands by creating a
slot that can be loaded with different behaviors at different times.
Try this
The "mealtime" example above could be defined this way using DEFER and IS:
DEFER SERVE
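A complete version of the mealtime example with DEFER and IS might look like the following sketch. The selector definitions are assumptions; they rely on IS being usable inside a colon definition, as it is in Forth 2012 and in FORTH, Inc. systems.

```forth
: 7AM   ( -- )  CR ." Breakfast" ;
: 12PM  ( -- )  CR ." Lunch" ;
: 6PM   ( -- )  CR ." Supper" ;

DEFER SERVE                            \ the execution vector itself

: MORNING ( -- )  ['] 7AM  IS SERVE ;  \ each selector re-points SERVE
: NOON    ( -- )  ['] 12PM IS SERVE ;
: NIGHT   ( -- )  ['] 6PM  IS SERVE ;

MORNING                                \ initialize SERVE before it is first executed
```

Compared with a variable and @ EXECUTE, the deferred word needs no separate SERVE definition: executing SERVE directly runs whatever xt was last installed by IS.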
CREATE GREETINGS
' ENGLISH-GREETING , \ 0
' FRENCH-GREETING , \ 1
' GERMAN-GREETING , \ 2
' AUSSIE-GREETING , \ 3
0 CONSTANT ENGLISH
1 CONSTANT FRENCH
2 CONSTANT GERMAN
3 CONSTANT AUSSIE
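A table of execution tokens like this one can be driven by a small handler that indexes into it. The sketch below repeats the table so that it loads on its own; the greeting texts and the definition of GREETING are assumptions made for illustration.

```forth
: ENGLISH-GREETING ( -- )  ." Hello" ;      \ greeting texts are placeholders
: FRENCH-GREETING  ( -- )  ." Bonjour" ;
: GERMAN-GREETING  ( -- )  ." Guten Tag" ;
: AUSSIE-GREETING  ( -- )  ." G'day" ;

CREATE GREETINGS
   ' ENGLISH-GREETING ,   \ 0
   ' FRENCH-GREETING ,    \ 1
   ' GERMAN-GREETING ,    \ 2
   ' AUSSIE-GREETING ,    \ 3

0 CONSTANT ENGLISH
1 CONSTANT FRENCH
2 CONSTANT GERMAN
3 CONSTANT AUSSIE

\ Select entry n of the table and execute it (no range check in this sketch).
: GREETING ( n -- )  CELLS GREETINGS + @ EXECUTE ;
```

With these definitions, FRENCH GREETING executes the second entry of the table. All table entries must exist before the table is built, because ' looks each word up when the line is interpreted.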
Convert your ?WAY problem (Section 3.5.5 and Section 5.4.1) to use vectored
execution, using your input key (I, J, K, L, M) as a selector.
CREATE BUTTONS 16 CELLS ALLOT
: IGNORE ;
' IGNORE BUTTONS ! BUTTONS DUP CELL+ 15 CELLS CMOVE
These lines create a table with one cell for each button, and initialize all positions to
contain the address of an empty definition (effectively ignoring any undefined but-
ton). The move and replication of the IGNORE address must be done with a CMOVE
instead of a MOVE, because CMOVE moves bytes from lower to higher overlapping loca-
tions, achieving the replication of the address.
Next, we'll define a word that will insert an xt into a specified cell of the table:
: B: ( n -- ) ' SWAP CELLS BUTTONS + ! ;
Now we can create definitions and attach them to certain buttons by using B: with
the button number as a parameter. Each such definition will have a name, to allow it
to be tested independently of the button pad. For example,
: ESCAPE 1 ABORT" ? " ;
0 B: ESCAPE
... defines Button 0 to be an "escape" button, using the standard Forth abort-with-
message word ABORT" (discussed in Section 8.5.1). This strategy allows you to define
buttons in different parts of the application, whereas the previous method requires
all table entries to be defined before you define the GREETING handler.
All that remains is to define a routine to monitor the button pad and to handle
responses:
: MONITOR ( -- ) \ Respond to button presses
BEGIN BUTTON CELLS
BUTTONS + @EXECUTE
AGAIN ;
Typing MONITOR will place the terminal task in an infinite loop that responds to but-
tons. Button 0 will cause an abort and return control to the terminal.
In practice, MONITOR may very likely be executed by a background task (on systems
that support multitasking). In this case, you may need techniques other than ABORT"
(which requires some kind of output device) for halting. Background tasks, and mul-
titasking in general, are discussed in Section 10.
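The button-table technique can be tried without button hardware using the sketch below. It makes several assumptions: the pad has 16 buttons, the table is filled with a loop rather than the overlapping CMOVE, plain @ EXECUTE stands in for @EXECUTE, and the hypothetical word PRESS simulates a single button press in place of the BUTTON input routine and the endless MONITOR loop.

```forth
16 CONSTANT #BUTTONS                   \ assumed size of the button pad
CREATE BUTTONS  #BUTTONS CELLS ALLOT   \ one cell per button

: IGNORE ( -- ) ;                      \ default action: do nothing

: FILL-BUTTONS ( -- )                  \ point every slot at IGNORE
   #BUTTONS 0 DO  ['] IGNORE  BUTTONS I CELLS +  !  LOOP ;
FILL-BUTTONS

\ Attach the next word in the input stream to button n.
: B: ( n "name" -- )  ' SWAP CELLS BUTTONS + ! ;

\ Hypothetical test aid: act as if button n had been pressed once.
: PRESS ( n -- )  CELLS BUTTONS + @ EXECUTE ;

VARIABLE HITS                          \ example action: count presses
: COUNT-IT ( -- )  1 HITS +! ;
3 B: COUNT-IT
```

After this loads, 3 PRESS increments HITS, while pressing any other button executes IGNORE. Because each action is an ordinary named word, it can be tested by typing its name before it is ever attached to a button.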
8.1 Dictionary
The dictionary is a linked list of Forth words. Before a word can be executed, it must
be found in the dictionary. This is done by sequentially searching the dictionary,
starting with the latest definition and searching backward through earlier ones. To
speed these searches, the dictionary may be organized into multiple threads; the
particular thread a word will be in depends upon a hash value computed from its
name and the word list in which it is defined. Word lists are discussed in more
detail beginning in Section 8.2.
Each word in the dictionary shares the same basic structure as shown in Figure 13.
[Figure 13. Dictionary structure. Surviving labels: link to previous definition; parameter field.]
The head of a Forth word consists of a link field that points to the previous defini-
tion in the dictionary, a length byte that holds the actual number of characters in
the name field, and other fields.
The longest possible name in SwiftForth is 254 characters (and 32 in many other
systems), so a byte is enough to express the count of characters in the word's name.
The head also has control bits used to determine how the word will be handled. One
is called the smudge bit. This bit is set by : (colon) to render the word invisible to
dictionary searches, and is reset by ; (semicolon) to make the word visible again.
This prevents inadvertent recursion caused by a word compiling a reference to
itself, and also prevents calling a word that did not finish compiling due to an error.
Another control bit is the precedence bit. When it is set, the word will be executed by
the compiler when encountered inside a colon definition (instead of being compiled,
like normal words). Words that behave in this manner are called immediate words
(because they execute immediately, rather than later). Examples include DO, LOOP, IF,
ELSE, THEN, and other flow-of-control structure words.
The rest of a dictionary entry consists of the word's code field and parameter field.
The code field identifies a word's run-time code. The parameter field follows the
code field, and varies in length depending upon the type of word. The content of the
code and parameter fields depends upon the strategy used to implement Forth on
this platform. For example, in some cases you might find the parameter field of a
colon definition contains pointers to previously defined words; in other cases, refer-
ences to other words are subroutine calls. Some systems embed code in place.
In SwiftForth, data objects have both a code field and a parameter field (the latter
containing the data or a pointer to the data), but colon definitions only have execut-
able code fields. In the SwiftX cross-compiler, the target system's dictionary is split:
the heads remain in the host, and only executable code and data fields reside in the
target. An optional add-on to SwiftX supports a target-resident interpreter, in which
case some or all target words may also have heads.
The general relationship between dictionary entries is shown in Figure 14.
[Figure 14. Linked dictionary entries. A pointer to the top entry leads to the latest definition; each entry consists of link, name, and content, and each link points to the previous definition.]
A word in Forth is defined when its entry is created in the dictionary. This process
involves the following steps:
1. The next space-delimited string is parsed from the input stream. This will be the
word's name.
2. The definition is linked to the previous definition in the chain controlling the dictio-
nary search.
3. A pointer is set to the head of the chain containing the new word.
4. Space is allotted for the new word's name.
5. The code field is set to point to the instance behavior code for CREATE.
If there is not enough memory remaining to create the new entry in the dictionary,
some implementation-specific action will occur. This may be as simple as aborting
with an error message ("Dictionary full") or something more complex, like request-
ing more memory from the host operating system.
In Section 8.6, we'll see how you can define new classes of words.
If you have definitions you no longer need and you would like to recover the space
they use, you can use the word EMPTY.
EMPTY resets your working dictionary back to its prior state. This has the effect of
forgetting, or clearing, all the definitions you have entered into the dictionary. This
is frequently useful when you are repeatedly loading a large application to test.
Note that the kernel and the system dictionary are not affected by EMPTY (the system
dictionary contains words loaded when the system boots).
You may wish to subdivide a large application during development so you can
repeatedly reload the portion under test while leaving more stable, underlying sup-
port functions untouched. This may be done with overlays. Because Forth compilers
are so fast, Forth systems rarely support or require run-time overlays of pre-com-
piled code; the kind of overlays we're speaking of here are groups of functions com-
piled from source.
Overlays are facilitated by the word MARKER. The phrase MARKER <name> creates a
dictionary entry for name. When name is executed, it will discard the definition
name and all words defined after name. The dictionary pointer will be reset to the
last definition in the vocabulary before name. Other system-dependent actions may
be taken as well, such as restoration of interrupt vectors (see your system documen-
tation).
MARKER has two uses:
• To discard only some of your definitions. For example, when testing, you may wish
to reload only the last file, not your entire application.
• To create additional levels of overlays.
Suppose your application includes an overlay called GRAPHICS. After GRAPHICS is
loaded, you want to be able to load one of two additional overlays, called COLOR and
B&W, thus creating a second level of overlay. Here is the procedure to follow:
1. Define a marker as the final definition of GRAPHICS, using any word you want as a
dictionary marker. For example:
MARKER OVERLAY
A good place for this definition would be at the end of the graphics load file.
2. Execute OVERLAY to discard any definitions added since it was defined, and then
redefine it (because it forgets itself) on the first line of the load file of each level-two
overlay. The following example might be the first line of an overlay load file Color.f:
OVERLAY MARKER OVERLAY
When this line is interpreted, the system will forget any definitions compiled after
the original definition of OVERLAY (the one defined in Step 1 above), and will restore
the marker definition of OVERLAY in the event you want to either discard the color
definitions and reload them, or load an alternate level-two set of definitions, such as B&W.
Use different names for your markers to create any number of overlay levels.
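The two-level procedure can be tried at the keyboard with the sketch below, in which single definitions stand in for whole load files. The names GRAPHICS-WORD, COLOR-WORD, and B&W-WORD, and the helper DEFINED?, are hypothetical.

```forth
\ Hypothetical helper: true if the next word in the input stream can be found.
: DEFINED? ( "name" -- flag )  BL WORD FIND NIP 0<> ;

: GRAPHICS-WORD ( -- n )  1 ;    \ stands in for the GRAPHICS overlay
MARKER OVERLAY                   \ final definition of the first level

\ What the first line of the color load file would do:
OVERLAY MARKER OVERLAY           \ nothing to discard yet; marker is re-planted
: COLOR-WORD ( -- n )  2 ;       \ stands in for the color definitions

\ What the first line of the B&W load file would do:
OVERLAY MARKER OVERLAY           \ discards COLOR-WORD, re-plants the marker
: B&W-WORD ( -- n )  3 ;         \ stands in for the B&W definitions
```

After this sequence, DEFINED? COLOR-WORD returns false while DEFINED? GRAPHICS-WORD and DEFINED? B&W-WORD return true, showing that only the second-level definitions were replaced.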
SwiftForth includes a more complex overlay support facility capable of saving and
restoring elements of the program's state in addition to the dictionary, such as
default values for DEFER words, Windows settings, and other environmental issues.
This is documented in the SwiftForth Reference Manual.
Now that we have discussed the dictionary in more detail, you might be wondering
how words like IF work. In many systems, there are two versions of IF: one for con-
ditionals in high-level Forth, and another for analogous structures in the assembler.
They are distinguishable from each other in the dictionary because they are defined
in different word lists.
Space in the dictionary is allotted sequentially, with new entries having higher
addresses than older entries. However, when the text interpreter searches the dic-
tionary, it follows a linked list or chain of definitions, starting with the most recent
definition in that chain. There may be several such linked lists intermingled in the
memory space occupied by the dictionary. Such a linked list is called a word list.
Multiple word lists may be searched sequentially. If searching the first fails to yield
a match, a second may be searched, and so on through a specified sequence. A
defined sequence of word lists is called a search order.
At least two standard word lists are provided by most Forth implementations: FORTH
and ASSEMBLER. These hold regular Forth words and assembler definitions, respec-
tively. You may also define your own word lists.
A Forth system usually contains several built-in word lists. To get a list of all the
words in a word list, put the word list in the search order by typing its name fol-
lowed by WORDS. For example:
FORTH WORDS
On SwiftForth, you can also use the Words toolbar button. SwiftForth's Words
dialog box includes a pull-down list of all the word lists available for display.
Each of the word lists in the system has a unique numeric identifier, called a wid, or
word list identifier. Here are some words for manipulating word lists:
Glossary
<word-list-name> ( - )
Makes the specified word list the first one in the search order, replacing the one
that was previously first.
ALSO ( - )
Duplicates the first word list in the search order, increasing the number of word
lists in the search order by one. ALSO is commonly followed by the name of a word
list, which replaces the top word list, so the effect is that the new word list is
added to the previous search order.
CONTEXT ( - addr)
Returns the address of a user variable that determines the dictionary search order.
CURRENT ( - addr)
Returns the address of a user variable specifying the word list in which new word
definitions will be appended.
DEFINITIONS ( - )
The compilation word list (the one specified by CURRENT) is changed to be the same
as the first word list in the search order.
ONLY (- )
Reduces the search order to contain only the minimum word lists, usually FORTH.
ORDER (- )
Displays the word list names forming the search order in their present search order
sequence. Also displays the word list into which new definitions will be placed (the
CURRENT word list).
PREVIOUS (- )
Removes the first word list (the one in the CONTEXT position) from the search order.
This may be used to undo the effect of an ALSO.
VOCABULARY <name> ( - )
A dictionary entry for name is created which specifies a new ordered list of word
definitions. Subsequent execution of name replaces the first word list in the search
order with name.
WORDS ( -)
Displays the names of all the words of the first word list in the search order.
Word lists may be searched individually or in groups in a specified order. Each word
list may have multiple chains to speed dictionary searches. The user variable
CONTEXT contains the sequence of word lists to be searched. The contents of CONTEXT
may be changed by naming the desired word list, for example:
ASSEMBLER
Hereafter, future searches will begin with the ASSEMBLER word list.
You may display the current search order by typing ORDER.
New definitions are compiled into the word list indicated by the value of CURRENT.
The word list specified by CURRENT may be changed by using the name of the desired
word list, followed by the word DEFINITIONS, which sets CURRENT equal to CONTEXT.
So, to make Forth search the ASSEMBLER word list and add new definitions to it, use:
ASSEMBLER DEFINITIONS
The word ASSEMBLER sets CONTEXT to that word list. DEFINITIONS then sets CURRENT
equal to CONTEXT. All new words will be compiled into the ASSEMBLER word list until
CURRENT is explicitly changed by a similar statement.
The default word list for both CONTEXT and CURRENT is FORTH. This is set whenever
the system is powered up or the dictionary is emptied.
The word list mechanism offers the potential for an exceptionally powerful security
technique. You can implement this by setting up a special application word list con-
sisting of a limited number of commands guaranteed to be safe for users. You then
ensure no application word can change CONTEXT, and CONTEXT is set so the text inter-
preter will only search the application word list.
This has the effect of sealing a task into its limited word list and rendering all other
words unfindable. Here is how a sealed word list is constructed:
1. Define a new word list for the findable words. For example:
VOCABULARY APPLICATION
2. Place all user definitions in the APPLICATION word list by declaring:
APPLICATION DEFINITIONS
Note that, when definitions for the APPLICATION word list are being compiled, FORTH
must be included in the search order:
ONLY FORTH ALSO APPLICATION DEFINITIONS
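The effect of the search order on what can be found is easy to observe with a sketch like the following. It assumes that VOCABULARY is available, as in the text (the standard Search-Order word set provides WORDLIST instead); the names SAFE-CMD and DEFINED? are hypothetical. The sketch stops short of actually sealing the task, so the keyboard stays usable.

```forth
\ Hypothetical helper, compiled into FORTH: true if the next word can be found.
: DEFINED? ( "name" -- flag )  BL WORD FIND NIP 0<> ;

VOCABULARY APPLICATION

ONLY FORTH ALSO APPLICATION DEFINITIONS   \ search FORTH too, compile into APPLICATION
: SAFE-CMD ( -- )  ." safe" ;             \ one "guaranteed safe" user command

ONLY FORTH DEFINITIONS                    \ back to the default search order
```

At this point DEFINED? SAFE-CMD returns false, because APPLICATION is no longer searched. After ALSO APPLICATION it returns true, and PREVIOUS hides the word again. Sealing a task amounts to the reverse arrangement: only APPLICATION is left in the search order, so nothing outside it can be found.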
The words you type are processed by a text interpreter. Although its basic function-
ality is the same in all Forth systems, factoring and implementation details vary. A
general diagram of the process is shown in Figure 15.
Functions for processing the input stream, searching the dictionary, and attempting
number conversion are shared by the compiler. The difference in behavior depends
upon a variable called STATE. If STATE is zero, a word found in the dictionary will be
executed or a successfully converted number will be pushed on the stack. If STATE is
non-zero, a word will be compiled (unless it is IMMEDIATE, as described in Section
8.4) or a number will be compiled as a literal.
The text interpreter character pointer >IN points to the character immediately
following the last word that was interpreted from the input stream. This is a relative
pointer indicating (in characters) how far into the input stream the interpreter has
gone.
There are three ways of leaving the interpreter:
1. By successfully reaching the end of the input stream, in which case Forth says ok.
2. By aborting on a stack underflow (e.g., if a word needed an argument that was not
there).
3. By aborting if the string was not a valid word or valid number.
Either of the last two exit conditions will generate an error message.
The terminal is the default source for the input stream passed to INTERPRET. The
interaction between your typing and the text interpreter's processing of the com-
mands you type is controlled by the word QUIT.
QUIT is the outer loop of an interactive Forth (e.g., in SwiftForth's command
window). This means that all you must do to use Forth is start up the system and you
2. Upon receiving the end-of-line character, QUIT calls INTERPRET to attempt to execute
each word in sequence.
3. Upon successfully finishing the line in INTERPRET, QUIT displays the message "ok."
[Figure: flowchart of QUIT. Clear the return stack and initialize the interpreter; refill the input stream from the keyboard or a file; interpret the input stream; if a stack underflow occurred, abort with an error message.]
Try this
Type the Enter (or Return) key over and over again. Try typing the word QUIT.
Note: The "ok" message does not mean that what you executed was okay, it just
means that Forth was able to complete whatever tasks you assigned it and is ready
for a new command line to be entered.
ANS Forth defines a total of four potential sources for the input stream that is
passed to the text interpreter:
1. The terminal input buffer (as acquired by QUIT), discussed in Section 8.3.1.
2. Disk blocks, or 1024-byte chunks of disk, typically supported by native Forth sys-
tems and classical Forth implementations. Blocks are discussed in detail in Forth
Programmer's Handbook.
3. Text files, as discussed in Section 9.1.4.
4. An arbitrary string passed to the word EVALUATE.
The current input stream is defined by the word SOURCE-ID and the variable BLK, as
follows:
• If SOURCE-ID returns zero and BLK contains zero, the source is the Terminal Input
Buffer (TIB) and its length is in #TIB.
• If BLK contains non-zero, its value is assumed to be the number of a disk block to
interpret, and SOURCE-ID is ignored.
• If BLK contains zero and SOURCE-ID returns -1, the input stream is a text string
whose address and length were passed to EVALUATE.
• If BLK contains zero and SOURCE-ID returns a non-zero value (other than -1 as noted
above), the value returned by SOURCE-ID is assumed to be the fileid of a text file to
interpret line-by-line by using INCLUDE-FILE (or higher-level INCLUDE words) or by
reading successive lines into a buffer using REFILL.
The state of the current input stream also includes >IN, which starts with a value of
zero when a new input source is being processed and progresses with each inter-
preted word. There also may be implementation-specific parameters.
It is possible to nest input streams. For example, if you type:
INCLUDE <filename>
...the current state of the terminal input buffer will be saved during the processing
of filename and, if that file also contains INCLUDE commands, they will nest as well.
The word SOURCE returns the address and length of the current input stream, what-
ever it may be. If the input stream is coming from disk, the address is that of text
that has been read from disk and is available for processing.
Glossary
Following is a summary of the words used for managing the input stream.
BLK ( - addr )
Contains zero or the block number of a block being interpreted.
SOURCE-ID ( - n )
Returns zero, -1, or the fileid of a file being interpreted.
>IN ( - addr)
Contains a relative pointer to the next character in the current input stream to be
interpreted.
SOURCE ( - addr u )
Returns the address and length of the current input stream.
REFILL ( - flag)
Attempts to refill the current input buffer, either from the keyboard or, when
SOURCE-ID is non-zero, from the file indicated by its fileid. flag is true if the opera-
tion succeeded.
EVALUATE ( i*x addr u - j*x )
Makes the string at addr u the current input buffer and interprets it. Stack
comments i*x and j*x represent possible parameters and results, respectively.
EVALUATE typically is used to pass a command string to another task (or even to a
remote Forth system), or to process a command string acquired from a source other
than the normal input stream (such as a Windows dialog box).
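A small sketch shows EVALUATE processing a string as if it had been typed, and shows what SOURCE-ID reports while that string is the input source. The names DOUBLE-IT and WHERE-FROM are hypothetical. Both words are meant to be called from the keyboard, where STATE is zero, so the evaluated text is executed rather than compiled.

```forth
\ Interpret the text "DUP +" against whatever is on the stack.
: DOUBLE-IT ( n -- 2n )  S" DUP +" EVALUATE ;

\ Report the input source as seen from inside an evaluated string.
: WHERE-FROM ( -- n )  S" SOURCE-ID" EVALUATE ;
```

Typing 21 DOUBLE-IT leaves 42 on the stack. WHERE-FROM leaves -1, the value that identifies a string passed to EVALUATE; the previous input source is restored when EVALUATE returns.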
8.4 Compiler
The actions of the words : and ; have been implied throughout this book. Each of
them actually has two behaviors, one at compile time and another at run time.
: NAME <words> ;
At compile time, the word : is executed. It makes a new dictionary entry (e.g., using
CREATE) and sets a smudge bit in the name field to prevent unintended recursion
(the word RECURSE may be used in the definition if the word must call itself). Then :
proceeds by compiling the words making up the definition into the body of the
word being defined. The exceptions to this are immediate words and literal values
that execute at compile-time. The run-time behavior of : is responsible for causing
the instructions at these addresses to be executed.
The compile-time behavior of the word ; is to un-smudge the definition (making it
findable in the current search order) and to compile run-time code to exit the defini-
tion so the word can return to its caller.
As we will soon see, it is sometimes useful to be able to turn the compiler off and
on while compiling a colon definition. The words that do this are:
When [ is encountered, compilation stops and the interpreter pushes the two values
onto the stack, performs the division, and leaves the result 64 on the stack. Then ]
resumes compilation, and the word LITERAL compiles the 64 from the stack into the
dictionary. When FAST is executed, LITERAL pushes the contents of the next cell in
the dictionary onto the stack. There are several advantages to this method. First, it
is faster because the arithmetic is performed only once. Second, it uses less diction-
ary space.
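The difference between computing a value at run time and at compile time can be sketched as follows. The operands 1024 and 16 are assumptions chosen only because their quotient is 64; SLOW is a hypothetical name for the unoptimized version.

```forth
: SLOW ( -- n )  1024 16 / ;              \ divides every time SLOW runs
: FAST ( -- n )  [ 1024 16 / ] LITERAL ;  \ divides once, while FAST is being compiled
```

Both words leave 64 on the stack. SLOW compiles two literals and a division, whereas FAST compiles a single literal.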
This scheme can also be used to document the origin or derivation of a number. For
example:
<words> [ #MEALS #DAYS * ] LITERAL <more words> ...
... makes sense as a number of meals per day times the number of serving days, both
previously defined as constants. The arithmetic product itself might be less mean-
ingful if it were inserted as a numeric literal.
xt of a definition but have no need for a name. Of course, in most such cases, having
a superfluous name is not harmful, but you may wish to avoid it to save space or to
provide a measure of security.
The word :NONAME begins compiling a definition which is, in most respects, identical
to a normal colon definition - but it has no head. Instead, its xt is left on the stack.
For example:
:NONAME ( -- ) 1 ABORT" uninitialized vector" ;
VALUE DEFAULT
DEFER ACTION   DEFAULT IS ACTION
Here, the xt returned by :NONAME is the argument to VALUE, becoming the value
returned by DEFAULT. It is used to initialize ACTION and may be used to reset it or to
initialize other DEFER words.
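Because :NONAME simply leaves an xt on the stack, that xt can be kept in any convenient place and passed around like other data. A minimal sketch, with the hypothetical names SQUARE-XT and APPLY-TWICE:

```forth
:NONAME ( n -- n*n )  DUP * ;          \ headless definition; its xt is left on the stack
CONSTANT SQUARE-XT                     \ keep the xt in a constant

\ Execute the word identified by xt twice in succession.
: APPLY-TWICE ( n xt -- n' )  DUP >R EXECUTE  R> EXECUTE ;
```

Typing 3 SQUARE-XT APPLY-TWICE squares 3 and then squares the result, leaving 81.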
There are two layers of exception-handling words in Forth. The high-level words
described in Section 8.5.1 have been in Forth for many years. The more flexible,
low-level words in Section 8.5.2 were added by ANS Forth in 1994.
Glossary
Forth provides several error handling methods. ABORT and ABORT" may be used to
detect errors. However, they are relatively inflexible: they unconditionally terminate
program execution and return to the idle state. The words CATCH and THROW, dis-
cussed in this section, provide a method for propagating error handling to any
desired level in an application program.
Glossary
...is the typical syntax. At the time CATCH executes, there may be other items on the
data stack. Before CATCH executes the word, it will save information about the cur-
rent data and return stacks, and possibly other environmental data (called an excep-
tion frame), so that if an error occurs it can use this information to attempt a
recovery.
After the routine called via CATCH has executed and control has returned to the rou-
tine that did the CATCH, there are two possible situations. If the lower-level routine
(and any words it called) did not cause a THROW to execute, the top stack item after
the CATCH will be zero and the remainder of the data stack may be different than it
was before, changed by the behavior of the lower-level routine. If a THROW did occur,
the top stack item after the CATCH will contain the non-zero throw code, and the
remainder of the data stack will be restored to the same depth (although not neces-
sarily to the same data) it had just before the CATCH. The return stack will also be
restored to the depth it had before the CATCH.
When THROW executes, it takes a throw code from the top of the stack. If this code is
zero, THROW does nothing except to remove the zero; the remainder of the stack is
unchanged. If the throw code is non-zero, THROW returns the code on top of the
stack, restores the data stack depth (but not necessarily the data) to its value when
CATCH was executed, restores the return stack depth, and passes control back to the
routine that made the CATCH. If a non-zero THROW occurs without a corresponding
application-program CATCH to return to, it is treated as an ABORT.
Exception frames are placed on an exception stack in order to allow nesting of
CATCH and THROW. Each use of CATCH pushes an exception frame onto the exception
stack. If execution proceeds normally, CATCH pops the frame; if an error occurs,
THROW pops the frame and uses its information for restoration.
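The stack behavior described above can be seen in a small non-interactive sketch. The names RISKY and SAFE-DOUBLE and the throw code 99 are illustrative assumptions, not part of any system.

```forth
: RISKY ( n -- 2n )               \ throws 99 for a negative argument
   DUP 0< IF  99 THROW  THEN  2* ;

: SAFE-DOUBLE ( n -- n' flag )    \ flag is true if RISKY succeeded
   ['] RISKY CATCH                ( 2n 0 | x 99 )
   IF  DROP 0 FALSE  ELSE  TRUE  THEN ;

5 SAFE-DOUBLE . .     \ displays -1 10
-3 SAFE-DOUBLE . .    \ displays 0 0
```

After the THROW, the one item that was under the execution token is back on the stack (its value is not guaranteed), which is why SAFE-DOUBLE discards it with DROP before returning its own result.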
The upper-level word TRY-IT calls the high-risk operation DO-IT (which, in turn, calls
COULD-FAIL) using CATCH. Following the CATCH, the data stack contains either the
character returned by KEY and a zero on top, or two otherwise-undefined items (to
restore it to its depth prior to the CATCH) and a 1 on top. Because any non-zero value
is interpreted as true, the returned throw code is suitable for direct input to the IF
clause in TRY-IT.
As a further example of the use of CATCH and THROW, here is a possible implementa-
tion of our ?WAY problem:
: KEYCASE ( -- )
   KEY CASE
      [CHAR] I OF <"up" code> ENDOF
      [CHAR] J OF <"left" code> ENDOF
      ( etc. )
      $1B ( esc) OF 123 THROW ENDOF
      DUP EMIT
   ENDCASE ;
: SAFE-WAY ( -- )
   BEGIN ['] KEYCASE CATCH
      DUP IF DUP 123 <> IF THROW ( error )
         ELSE ." Escaped!"
      THEN THEN
   ( 0 or 123) UNTIL ;
Note that the code following the CATCH in SAFE-WAY checks whether the throw code
that was returned is one we're prepared to process here; if it is not, it just does a
THROW to a higher CATCH. You may similarly place a CATCH around any application
word that may generate an exception whose management you wish to control. Typi-
cally, a CATCH is followed by a CASE statement to process possible THROW codes that
may be returned, although if you are only interested in one or two possible THROW
codes at this level, an IF ... THEN structure may be more appropriate, as in our exam-
ple. This strategy lets you handle errors at whatever level in an application is best
positioned to take appropriate action.
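Where several throw codes are of interest, the CASE form might look like the following sketch. DIVIDE and SAFE-DIVIDE are hypothetical names; -10 is the code ANS Forth reserves for division by zero.

```forth
: DIVIDE ( n1 n2 -- quot )
   DUP 0= IF  -10 THROW  THEN  / ;     \ -10 is the reserved "division by zero" code

: SAFE-DIVIDE ( n1 n2 -- quot )
   ['] DIVIDE CATCH                    ( quot 0 | x1 x2 code )
   CASE
      0   OF  ENDOF                    \ no exception: quot is already on the stack
      -10 OF  2DROP 0  ." Division by zero "  ENDOF
      DUP THROW                        \ not ours: pass it to a higher CATCH
   ENDCASE ;

12 3 SAFE-DIVIDE .    \ displays 4
7 0 SAFE-DIVIDE .     \ displays Division by zero 0
```

The default branch re-throws any code this level is not prepared to handle, so control never reaches ENDCASE in that case.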
It's highly advisable to have a CATCH around the highest-level word in your applica-
tion to handle any THROW that wasn't handled at a lower level. At the top level of
SwiftForth, the text interpreter provides a CATCH around the interpretation of each
line processed. SwiftForth handles errors detected by the text interpreter's CATCH by
displaying a descriptive error message in the debug window. We recommend that
the highest level in your application provide a global CATCH, as well, for any excep-
tions you have not elected to handle at lower levels.
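A global CATCH of this kind can be very small. In this sketch APP-MAIN stands in for the application's top-level word and RUN is a hypothetical wrapper; a real handler would probably log the code or restart the application rather than just display it.

```forth
: APP-MAIN ( -- )  77 THROW ;      \ hypothetical top-level word that fails
: RUN ( -- )
   ['] APP-MAIN CATCH  ?DUP IF  ." Unhandled throw code " . CR  THEN ;

RUN     \ displays Unhandled throw code 77
```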
Certain negative throw codes have been given special meaning by ANS Forth, as
shown in the table below. Not all of these potential exceptions are actually checked
for, in most implementations, but their codes are reserved and should not be used
for any other purpose. Other negative codes are available to the system for system-
handled events. Positive throw codes are available for application use.
Table 7: Throw codes
Code Meaning
-1 ABORT
-2 ABORT"
-3 stack overflow
-4 stack underflow
-5 return stack overflow
-6 return stack underflow
-7 do-loops nested too deeply during execution
-8 dictionary overflow
-9 invalid memory address
-10 division by zero
-11 result out of range
-12 argument type mismatch
The most distinctive feature of Forth, the most powerful, and often the most difficult
for students to grasp, is its ability to create new classes of words and to specify the
run-time behavior of the instances of those classes.
We have seen a number of defining words so far. Each has associated with it two
behaviors:
• One is exhibited when the defining word itself is executed to make a new instance
of its class of words. We call that its defining behavior.
• The other is the behavior shared by all instances of this class when they are exe-
cuted. We call that its instance behavior.
Glossary This glossary summarizes some of the defining words we're already familiar with.
: ( - )
Makes a definition whose content consists of a list of procedure calls, terminated by
; . A word defined by : executes the procedures in the order specified.
CREATE ( -)
Makes a definition associated with the next location in data space (but doesn't allot
any space). A word defined by CREATE pushes onto the stack the address of the
associated data space (i.e., the parameter field address).
VARIABLE ( - )
Makes a definition associated with the next location in data space. Allots one cell of
data space. A word defined by VARIABLE pushes the address of its data space onto
the stack.
2VARIABLE (-)
Like VARIABLE, but allots two cells of data space.
VALUE ( x - )
Makes a definition with a specified single-cell value x. A word defined by VALUE
pushes its current value onto the stack (unlike a word defined by VARIABLE, which
pushes its address).
CONSTANT ( x - )
Makes a definition with a specified single-cell value x. A word defined by CONSTANT
pushes the value onto the stack.
2CONSTANT ( x1 x2 - )
Makes a definition with a specified cell pair. A word defined by 2CONSTANT pushes
the two cells onto the stack in the same order in which they were provided when the
definition was made.
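A short session shows the instance behavior of each of these defining words. The names are arbitrary, and TO (the standard word for changing a VALUE) is not covered in this glossary.

```forth
VARIABLE SPEED          \ SPEED pushes the address of one cell
60 SPEED !              \ store through that address
SPEED @ .               \ displays 60

100 VALUE THRESHOLD     \ THRESHOLD pushes its value directly
250 TO THRESHOLD        \ TO changes the value
THRESHOLD .             \ displays 250

12 CONSTANT DOZEN       \ DOZEN always pushes 12
DOZEN .                 \ displays 12

3 4 2CONSTANT PAIR      \ PAIR pushes 3, then 4
PAIR . .                \ displays 4 3

CREATE SIZES  10 , 20 , 30 ,   \ CREATE allots nothing; each , allots and stores a cell
SIZES 2 CELLS + @ .            \ displays 30
```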
The form of a defining word is:
: <name> <defining behavior>
   DOES> <instance behavior> ;
The defining behavior must include CREATE or a word that calls CREATE. The word
DOES> terminates the defining behavior, and sets the instance behavior of all words
defined by name to the words following DOES>. When the instance behavior begins
to execute, the address of the data space associated with the instance will be
pushed on the stack, on top of any explicit parameters the instance may expect.
As an example, one might define 2CONSTANT this way:
: 2CONSTANT ( n1 n2 -- ) CREATE , ,
   DOES> ( -- n1 n2 ) 2@ ;
As a general rule, we don't show in the stack comment the instance address for the
DOES> part of the definition. This stack comment is intended to show a user how to
use these words, and it isn't the user's responsibility to provide this address.
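A minimal sketch makes the hidden address visible. OFFSET: and >PRICE are hypothetical names chosen for illustration.

```forth
: OFFSET: ( n -- )  CREATE ,           \ defining behavior: record n in data space
   DOES> ( addr1 -- addr2 )  @ + ;     \ instance behavior: add n to addr1

8 OFFSET: >PRICE      \ >PRICE adds 8 to whatever address it is given
1000 >PRICE .         \ displays 1008
```

When >PRICE runs, the stack actually holds ( addr1 pfa ): the explicit parameter supplied by the user, with the instance's data-space address on top. The @ fetches the stored 8 and + applies it. The stack comment shows only addr1, following the convention just described.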
For example, the definition:
: ARRAY ( n -- ) CREATE DUP , CELLS ALLOT
   DOES> ( n -- a ) SWAP OVER @ OVER < OVER
   1 < OR ABORT" Out of Range" CELLS + ;
... will create a new class of words whose instance behavior (following DOES>) is to
index into an array, after verifying that the requested index n is valid. Its usage is:
10 ARRAY F-STOP
7 ARRAY SHUTTER
4 ARRAY ASA
The words F-STOP, SHUTTER, and ASA are instances of the class ARRAY. All will have
the same behavior, just as each word defined with CONSTANT will have the same
behavior (namely, to push its value on the stack). If our application is a computer-
controlled camera, we will need various values of F-STOP, SHUTTER speed, and ASA
film speed. We will store into or retrieve those values by index number:
3 SHUTTER
... will return the address of the third shutter setting. Let's discuss how this works.
The defining behavior of the word ARRAY is to compile the number of values the array
will hold into its first cell, and to ALLOT that many CELLS in the dictionary. The word
DOES> marks the end of the defining behavior. All the words following DOES> in the
definition make up the instance behavior, i.e., these words will be executed when an
instance of this class of words is executed. The instance's action (SHUTTER, for exam-
ple) is to retrieve the number of entries in the table, to verify that the requested off-
set is within the table and, if it is, to compute the address. The relationship of the
dictionary entries for ARRAY and the member word SHUTTER is shown in Figure 17.
Figure 17. Defining word's defining and instance behaviors
The various time sequences for this definition of ARRAY are shown in Figure 18.
[Figure 18. Time sequences for ARRAY: (1) : ARRAY ... defines ARRAY; (2) 7 ARRAY SHUTTER executes ARRAY to define SHUTTER; (3) <index> SHUTTER executes SHUTTER, returning addr.]
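The three sequences can be tried directly. This sketch repeats the ARRAY definition so that it is self-contained, then stores and retrieves a shutter value. The value 125 is arbitrary.

```forth
: ARRAY ( n -- )  CREATE DUP , CELLS ALLOT
   DOES> ( n -- a )  SWAP OVER @ OVER < OVER
   1 < OR ABORT" Out of Range"  CELLS + ;

7 ARRAY SHUTTER       \ sequence 2: ARRAY executes, defining SHUTTER
125 3 SHUTTER !       \ sequence 3: SHUTTER executes, returning an address
3 SHUTTER @ .         \ displays 125
```

Valid indexes here run from 1 through 7; an index of 0 or 8 reaches the ABORT" and displays Out of Range.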
We have just seen an example of the use of a defining word to create one-dimen-
sional arrays. Now create a defining word named 2ARRAY that will be used to define
a class of two-dimensional arrays in the following way:
<n1> <n2> 2ARRAY <name>
... where n1 is the number of rows, n2 is the number of columns, and name is the
name of the new array. Remember that the cell referred to by 2 3 is not the same as
that referred to by 3 2!
The compile-time portion of 2ARRAY must CREATE a header in the dictionary and
ALLOT the correct number of cells. The run-time portion of 2ARRAY expects two
parameters on the stack, which must be used to compute the index into the array.
CREATE
DOES>
VARIABLE
2VARIABLE
CONSTANT
2CONSTANT
Most modern Forth implementations run under host operating systems (such as
SwiftForth, which runs under Windows and Linux). ANS Forth introduced a useful
set of general-purpose words for accessing host OS files, whether they are used for
program source or data storage, called the File Wordset. The most important of
these are discussed in this section; a more extensive treatment of the File Wordset
may be found in Forth Programmer's Handbook.
The File Wordset depends on several basic assumptions and special terms:
• Files are provided by a host operating system.
• File state information (e.g., current position in the file, size, etc.) is managed by the
OS. File sizes are dynamically variable, so write operations will increase the size of a
file as necessary.
• Filenames are represented as character strings. The format of the names is deter-
mined by the host operating system. Filenames may include system-specific path-
names.
• A file identifier (fileid) is a single-cell value passed to file operators to refer to spe-
cific files. The nature of a fileid value depends on the host OS. Opening a file assigns
it a file identifier, which remains valid until the file is closed. When the text inter-
preter is using a file as the input, its fileid will be returned by SOURCE-ID. The other
possible values that SOURCE-ID can return are zero (if the user input device is the
source), and -1 (if the source is a character string passed by EVALUATE). See Section
8.3 for information about the text interpreter.
• File contents are accessed as a sequence of characters. The file position is the char-
acter offset from the start of the file. The file position is updated by all read, write,
and reposition commands.
• File read operations return an actual transfer count, which can differ from the
requested transfer count.
• A file access method (fam) is a single-cell value indicating a permissible means of
accessing a specific file, such as read/write or read-only.
• An I/O result (ior) is a single-cell value indicating the result of an I/O operation. A
value of zero indicates success; the meanings of non-zero values are defined by the
host OS. An operation reaching the end of a file is not considered an error and
returns a zero ior.
These are the words used to query file status. Note that the parameters for file posi-
tion and size are unsigned double-cell integers; this allows for reasonably large file
sizes even on 16-bit implementations.
Glossary
The words described in this section perform the basic operations for creating and
managing files. When creating and opening files, you can specify a "file access
method" or fam. These are specified by pre-defined constants (whose actual values
are system-dependent), as follows:
• R/W Read/write
• R/O Read only
• W/O Write only
Any of these may be followed by BIN, which additionally specifies that the file is a
binary file (see note 16).
Glossary
16. In the Windows and Linux operating systems, there is no distinction made between binary and text
files, so BIN in SwiftForth is a no-op.
Note Passing the name of a file that already exists to CREATE-FILE causes that file to be
opened and its length truncated to 0 (as if it were just created). If this is not what
you intended, do an OPEN-FILE or FILE-STATUS first, and if that fails (i.e., the file
does not already exist), then do a CREATE-FILE.
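The open-first strategy in this note might be packaged as follows. OPEN-OR-CREATE is a hypothetical name, and R/W is only one possible choice of access method.

```forth
: OPEN-OR-CREATE ( addr u -- fileid )
   2DUP R/W OPEN-FILE           ( addr u fileid ior )
   IF    DROP                   \ open failed: discard the invalid fileid
         R/W CREATE-FILE THROW  \ ...and create the file instead
   ELSE  NIP NIP                \ open succeeded: discard the name
   THEN ;
```

Because CREATE-FILE is only reached when OPEN-FILE fails, an existing file is never truncated. Any failure of CREATE-FILE itself is passed on as a THROW.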
The words that use text strings for file names typically get them either from the
command line (using WORD or PARSE) or by using S" (either inside a colon definition
or interpretively).
Try this Pick one of your classwork files to use for this exercise:
S" <filename>" R/O OPEN-FILE .
This will open your file and display the resulting ior, which should be zero. The
value that remains on the stack is the fileid, generally the file handle returned by the
OS. Type .S to display the fileid without removing it from the stack.
Now, using this fileid as your argument, type:
FILE-SIZE . D.
This should display another zero ior, followed by your file size. Is that correct?
If you will be working with the file for a while (which is normally the case in an
application), you'll want to save your fileid in a VARIABLE or VALUE.
With the fileid on the stack, close your file:
<fileid> CLOSE-FILE
The following words provide basic access to data in files. Note that READ-LINE and
WRITE-LINE are only appropriate for text (not binary) files.
READ-FILE ( addr u1 fileid - u2 ior)
Reads up to u1 characters from the file referenced by fileid to the buffer at addr,
and updates FILE-POSITION. The return value u2 is the number of characters
successfully read. If no exception occurred, ior = 0 and u2 = u1 or the number of
characters actually read before encountering the end of the file, whichever is smaller. If
FILE-POSITION was equal to FILE-SIZE before executing READ-FILE, u2 is zero. If ior
is non-zero, u2 is the number of characters successfully transferred before the
exception occurred.
READ-LINE ( addr u1 fileid - u2 flag ior)
Reads a line of up to u1 consecutive characters from the file referenced by fileid into a
buffer at addr, and updates FILE-POSITION. Terminates if a line-end is encountered.
The return value u2 is the actual number of characters read, not including the line-end
(if any). One or two line-ends may be read into memory at the end of the line in
addition to u2, so the buffer at addr should be at least u1+2 characters long. If u2 =
u1, the line-end was not reached. If no exception occurred, ior = 0 and flag is true. If
FILE-POSITION was equal to FILE-SIZE before executing READ-LINE, flag is false, ior
= 0, and u2 = 0. If ior is non-zero, the remaining returned parameters are undefined.
WRITE-FILE ( addr u fileid - ior)
Writes u characters from addr to the file referenced by fileid, starting at its current
file position, increasing FILE-SIZE if necessary. After this operation, FILE-POSITION
will return the position just after the last character written, and FILE-SIZE will
return a value equal to or greater than FILE-POSITION.
WRITE-LINE ( addr u fileid - ior)
Writes u characters from addr to the file referenced by fileid, starting at its current
file position, increasing FILE-SIZE if necessary. The text is followed by a line-end.
After this operation, FILE-POSITION will return the next file position after the last
character written to the file, and FILE-SIZE will return a value equal to or greater
than FILE-POSITION.
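As a sketch of these words working together, the following writes two lines to a new file and reads the first one back. The file name demo.txt and the names LINE-IN, WRITE-DEMO, and READ-FIRST are illustrative assumptions.

```forth
CREATE LINE-IN  82 CHARS ALLOT          \ room for 80 characters plus a line-end

: WRITE-DEMO ( addr u -- )              \ addr u is the filename
   W/O CREATE-FILE THROW  >R
   S" first line"  R@ WRITE-LINE THROW
   S" second line" R@ WRITE-LINE THROW
   R> CLOSE-FILE THROW ;

: READ-FIRST ( addr u -- addr2 u2 )     \ return the first line of the named file
   R/O OPEN-FILE THROW  >R
   LINE-IN 80 R@ READ-LINE THROW  DROP  ( u2 )   \ flag not needed here
   R> CLOSE-FILE THROW
   LINE-IN SWAP ;

S" demo.txt" WRITE-DEMO
S" demo.txt" READ-FIRST TYPE            \ displays first line
```

Each ior is passed to THROW, which does nothing when the ior is zero and otherwise hands the error to the nearest CATCH.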
Try this Here's some code that will read and display a line from the current position in a file:
258 BUFFER: LINE-BUF
You could use this in a word to display all lines in a file, like this:
: SHOW-FILE ( addr u -- )
   R/O OPEN-FILE ABORT" Can't open file"
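The listing above is incomplete in this copy. One way the two pieces could fit together is sketched here. It is not the original code: SHOW-FID and SHOW-LINE are hypothetical names, and CREATE ... ALLOT is used in place of BUFFER: only so the sketch runs on systems that lack BUFFER:.

```forth
CREATE LINE-BUF  258 CHARS ALLOT     \ 256 characters plus room for a line-end
0 VALUE SHOW-FID                     \ fileid of the file being displayed

: SHOW-LINE ( -- flag )              \ display one line; false at end of file
   LINE-BUF 256 SHOW-FID READ-LINE THROW   ( u2 flag )
   DUP IF  SWAP LINE-BUF SWAP TYPE CR  ELSE  NIP  THEN ;

: SHOW-FILE ( addr u -- )
   R/O OPEN-FILE ABORT" Can't open file"  TO SHOW-FID
   BEGIN  SHOW-LINE 0=  UNTIL
   SHOW-FID CLOSE-FILE THROW ;
```

Usage would be S" <filename>" SHOW-FILE. Lines longer than 256 characters are displayed in pieces, since READ-LINE stops when the buffer is full.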
Most applications don't need to interpret files, but it can be convenient to use a text
file to provide a script or other material to the Forth interpreter. To do so, you temporarily
redirect the input stream to the file. The following words are meant for this.
Glossary
INCLUDE-FILE ( fileid - )
Reads and interprets the file referenced by fileid, line by line, until the end of file is
reached. When the end of the file is reached, closes the file and restores the previ-
ous input stream specification.
INCLUDED ( addr u - )
Same as INCLUDE-FILE, except it first opens the file specified by its name, which is
given by the text string addr u.
INCLUDE <filename> ( - )
Like INCLUDE-FILE, but the file is specified by the filename that follows in the input.
Overall management of the input stream will be discussed further in Section 8.3.
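A quick way to see INCLUDED at work is to write a one-line source file and then interpret it. The file name demo-script.f and the words MAKE-SCRIPT and FROM-SCRIPT are hypothetical.

```forth
: MAKE-SCRIPT ( -- )                 \ write a one-line Forth source file
   S" demo-script.f" W/O CREATE-FILE THROW  >R
   S" : FROM-SCRIPT ( -- n ) 42 ;" R@ WRITE-LINE THROW
   R> CLOSE-FILE THROW ;

MAKE-SCRIPT
S" demo-script.f" INCLUDED           \ interpret the file; this defines FROM-SCRIPT
FROM-SCRIPT .                        \ displays 42
```

The same file could be loaded with INCLUDE demo-script.f where the name is typed at the keyboard. INCLUDED is the form to use when the name is computed by a program.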
1. Select one of your existing classwork files. Define SHOW to display each line in the
file on which a specified string occurs. For example, if you type:
SHOW DUP
It should select this file as the target for SHOW above. Modify SHOW to use this file.
3. Make a new file, and write a word COPY that works like SHOW except that instead of
displaying the selected lines it copies them to the new file.
Tip Don't forget to close your files when you're finished with them.
This entire section is specific to FORTH, Inc. products, describing the multitasking
model used in polyFORTH, chipFORTH, and SwiftX as well as the model adapted for
the Windows environment in SwiftForth.
SwiftOS solves several problems with its simple, fast multitasking scheme. SwiftOS
is a cooperative multitasker, so it achieves the benefits of fast context switches and
full programmer control. It minimizes the likelihood that any task can monopolize
the CPU by establishing the simple rule that all I/O operations include at least one
PAUSE, which allows other tasks to run. Because I/O is a frequent occurrence in
embedded and real-time applications, and because I/O operations are relatively
time consuming when measured in CPU cycles, most tasks spend most of their time
suspended - waiting for I/O - so no task wishing to run has to wait long. For infre-
quent situations that require CPU-intensive activity (e.g., long sorts or complex
mathematical calculations), it's easy to PAUSE in CPU-intensive functions to ensure
that other tasks have the opportunity to run.
Tasks are arranged in a round robin, which means each task on the system is given
a chance to use all the resources of the system before relinquishing control to the
next task in line (Figure 19). In SwiftOS, the programmer always knows exactly when
a task does and does not relinquish the CPU. Context switches always occur
between Forth words, so the job of saving and restoring is quite fast: on most pro-
cessors, a complete context switch can be done in just a few machine instructions.
Figure 19. Tasks in a round-robin multitasker
There are two phases to task management: definition and instantiation. When a task
is defined, it gets a dictionary entry containing a Task Control Block (TCB), which is
the table containing its size and other parameters. This happens when a program is
compiled, and the task's definition and TCB are permanent parts of the dictionary.
In a cross-compiler such as SwiftX, the TCB has an entry in the host's dictionary,
and values in the target's initialized data space.
In SwiftX, tasks are typically instantiated and assigned their behaviors as part of the
power-up initialization sequence in the target. Instantiation and assignment of
function are separate actions. When a task is instantiated, it is given a region of
RAM called its user area, which is initialized by values from the TCB but which also
contains dynamic information reflecting the activity of the task. Individual entries
in the task's user area are called user variables. The task also gets its own data and
return stacks.
After SwiftOS has instantiated a task, it may communicate with it via the shared
memory that is visible to both SwiftOS and the task, or via the task's user variables.
User variables are defined by USER, which takes an offset relative to the start of the
task's user area. About a dozen user variables are required by the system for a back-
ground task; more are required for a terminal task. Additional user variables may be
defined for application use. User variables differ from normal variables only in that
each task has its own private copy of its data space.
SwiftOS supports two kinds of tasks:
1. Control or background tasks
2. Terminal tasks
The difference is that terminal tasks have a much larger user area, enabling them to
perform serial I/O using TYPE, etc. Control tasks are primarily for tasks performing
simple functions such as monitoring custom I/O in targets where memory conser-
vation is important.
Control tasks are defined with the word BACKGROUND and are instantiated by BUILD,
whereas terminal tasks are defined by TERMINAL and are instantiated by CONSTRUCT.
These words are discussed in the glossary below.
The action of assigning an activity to a task must be done inside a colon definition,
using the form:
: <name> <taskname> ACTIVATE <words to execute> ;
When name is executed, the task taskname will begin executing the words that fol-
low ACTIVATE.
The task's assigned behavior, represented above by "words to execute," may be one
of two types:
1. transitory behavior, which the task simply executes and then terminates; and
2. persistent behavior, represented by an infinite loop the task will perform forever
(e.g., BEGIN .. . AGAIN).
Transitory behavior must be terminated by the word STOP, which leaves the task dis-
abled until it gets a new job assignment with another ACTIVATE.
Persistent behavior must include the infinite loop and, within that loop, provision
must be made for the task to relinquish the CPU using PAUSE, STOP, or a word that
calls one of these (such as MS or any I/O word such as TYPE, EMIT, KEY, etc.). These
words are also discussed in the glossary below.
Whether the task's behavior is persistent or transitory, the programmer must always
ensure that a task will never reach the semicolon that terminates the definition in
which its behavior is assigned with ACTIVATE.
Glossary The following glossary summarizes the words used in SwiftOS to define and control
tasks.
BACKGROUND <name> ( nu ns nr - )
Defines the background task name - with nu bytes of user area, ns bytes of data
stack, and nr bytes of return stack - and sets up its task control block based on
these parameters. Use of name returns the address of its TCB.
BUILD ( addr - )
Initializes the task at addr that was constructed by BACKGROUND. The task will be
linked in the round-robin following OPERATOR, and will be linked to the task previ-
ously linked to OPERATOR. This must be done at run time in the target system before
any attempt to ACTIVATE the task.
Usage: <taskname> BUILD
OPERATOR ( - addr )
Returns the address of the task definition table of the first task defined in the ker-
nel. OPERATOR is a TERMINAL task.
ACTIVATE ( addr - )
Starts the task at addr executing the words following ACTIVATE. ACTIVATE may only
be used inside a colon definition. The task executing the balance of the definition
must be prevented from ever returning from the definition. The task may execute
an infinite loop that describes its desired behavior, or use STOP or NOD.
NOD ( - )
Infinite loop designed to ensure a task remains inactive until assigned a new behav-
ior with ACTIVATE.
HALT ( addr - )
Causes the task at addr to perform NOD.
Usage: <taskname> HALT
If you are programming in SwiftX or chipFORTH, keep in mind which aspects of task
creation and control take place at target-compilation time (producing definitions
and tables in ROM) and which are executed in the ROM target system to initialize
and affect RAM.
SwiftOS is discussed in more detail in the SwiftX Reference Manual, and the docu-
mentation for each target includes examples of custom device drivers that meet the
SwiftOS requirement that each I/O operation has to relinquish the CPU.
The following example shows how SwiftOS tasks are constructed and activated.
{ =====================================================================
SwiftOS MULTITASKING example
Because EVENTS is a USER variable, each task has its own copy.
When the count goes negative (at 32768 on a 16-bit target, which this
is assumed to be) the tasks will stop.
Adjusting #DELAY will make both tasks run faster or slower. Use of MS
causes the task to PAUSE so others can run.
Monitor either task by typing:
<taskname> <n> SEE
===================================================================== }
32 128 64 BACKGROUND TASK1 32 128 64 BACKGROUND TASK2
\ Include in power-up code: TASK1 BUILD TASK2 BUILD
As in SwiftOS, there are two phases to task management: definition and instantia-
tion. When a task is defined, it gets a dictionary entry containing a Task Control
Block, or TCB, which is the table containing its size and other parameters. This hap-
pens when a program is compiled, and the task's definition and TCB are permanent
parts of the dictionary.
When a SwiftForth task is instantiated, Windows is requested to allocate a private
stack frame to it, within which SwiftForth sets up its data and return stacks and
user variables, all of which behave essentially like those in SwiftOS. At this time, the
task is also assigned its behavior, or words to execute.
After SwiftForth instantiates a task, it may communicate with it via the shared
memory visible to both SwiftForth and the task, or via the task's user variables.
In SwiftForth, a task is defined using the sequence:
<size> TASK <taskname>
... where size is the requested size of its user area and data stack, combined. The
minimum value for size is 4,096 bytes; a typical value is 8,192 bytes. The task's
return stack, also used for Windows calls, is always 16,384 bytes. When invoked,
taskname will return the address of the task's TCB.
The word ACTIVATE, which is used in SwiftX to assign a behavior to a task, also
instantiates it in SwiftForth; there is no equivalent to the words BUILD and CONSTRUCT.
Its usage is the same:
When name is executed, the task taskname will be instantiated and will begin exe-
cuting the words that follow ACTIVATE.
Transitory behavior in SwiftForth must be terminated by the word TERMINATE, which
uninstantiates the task. A task that has terminated in this fashion may be instanti-
ated again, to perform the same or a different transitory behavior. Although its
stacks and user area have been discarded, the TCB remains in the dictionary.
Persistent behavior (which leaves the task instantiated for an extended period) must
include an infinite loop and, within that loop, provision must be made for the task
to relinquish the CPU using PAUSE, STOP, sleep, or a word that calls one of these
(such as MS). These words are discussed in the glossary below. If this is not done, the
task will consume all available CPU time (subject to Windows time-slicing) and per-
formance of all other tasks and programs will degrade.
As in SwiftOS, whether the task's behavior is transitory or persistent, the program-
mer must ensure that it will never reach the semicolon (or an EXIT) that would
return from the definition that contains the ACTIVATE.
A task that assigns behavior to another task using ACTIVATE is that task's owner. A
task may SUSPEND another task, RESUME it (after a SUSPEND), or KILL (uninstantiate) it.
A task may also HALT another task, causing it to cease operation permanently the
next time it executes STOP or PAUSE while leaving it instantiated. The operational
distinction between HALT and KILL is that after HALT the task remains instantiated and
will retain any settings in its user variables for its next ACTIVATE.
A SwiftForth task might manage one or more windows. If it does, it must frequently
check its message queue and process any pending messages.
In the SwiftOS non-preemptive multitasker, the programmer has complete control
over when a task may relinquish the CPU, but this is not possible in OS environments
like Windows and Linux. Occasionally, it is necessary to perform a sequence
of operations that cannot be interrupted by other SwiftForth tasks. Such a sequence
is called a critical section, and Windows can ensure that a critical section is performed
without interruption. SwiftForth's API to this is in the form of a pair of
words, [C and C], which begin and end a critical section. No other SwiftForth task
will be permitted to run during the execution of any functions between these two
words.
Glossary Following are the principal SwiftForth task definition and control words.
TASK <name> ( u - )
Defines a task whose combined user area and data stack will be u bytes (4,096 min-
imum) in size. Invoking name returns the address of the Task Control Block (TCB).
ACTIVATE ( addr - )
Instantiates the task whose TCB is at addr, and starts it executing the words follow-
ing ACTIVATE. Must be used inside a definition. The words following ACTIVATE must
be structured as an infinite loop or must end with TERMINATE so the semicolon at the
end of the definition is never executed. Also, the code must call PAUSE or STOP so
task control can function properly.
If the task was already instantiated, ACTIVATE will simply set it to start executing the
words following ACTIVATE immediately after the owner next executes PAUSE or STOP.
TERMINATE (- )
Causes the task executing this word to cease operation and release all its memory
back to Windows. A task that terminates itself may be re-activated.
SUSPEND ( addr - )
Forces the task whose TCB is at addr to suspend operation indefinitely.
RESUME ( addr - )
Causes the task whose TCB is at addr to resume operation at the point at which it
was suspended.
HALT (addr - )
Causes the task whose TCB is at addr to cease operation permanently at the next
STOP or PAUSE but to remain instantiated.
sleep ( u -)
Relinquishes the CPU for approximately u milliseconds. If u is zero, the task relinquishes
the rest of its time slice (typically about 10 milliseconds). sleep is a Windows
call used by MS and PAUSE, and is appropriate when the task wishes to avoid
checking its message queue.
KILL (addr - )
Causes the task whose TCB is at addr to cease operation and release all its memory
back to Windows. A task that has been killed may be re-activated.
PAUSE (- )
Relinquishes the CPU while checking for messages (if the task has a message
queue).
STOP (- )
Checks for messages (if the task has a message queue) and suspends operation
indefinitely (until restarted by another task).
[C (- )
Begins a critical section in which other SwiftForth tasks cannot execute.
C] (- )
Concludes a critical section.
The following example shows how SwiftForth tasks are constructed and activated.
The obvious difference between this code and the SwiftX example in Section 10.2.3
is that tasks are defined differently and don't need to be instantiated separately.
Also, note that the user variable EVENTS was automatically added to the end of cur-
rently defined user variables, as indicated by #USER; its size in bytes was specified
and #USER was updated.
Finally, because SwiftForth is a 32-bit implementation, we added an upper limit; we
couldn't count on the number circle to go negative soon enough for this example!
{ =====================================================================
SwiftForth MULTITASKING example
Adjusting #DELAY will make both tasks run faster or slower. Use of MS
causes task to PAUSE so others can run.
In most operating systems there is a method of locking out other tasks from the use
of a specific system resource. Among commonly protected resources are:
• disk
• printers
• graphics screens
• specific data
• non-reentrant functions (such as sorts)
• application-specific device usage
This can become complex on systems that require queuing or arbitration, but a non-preemptive
implementation greatly simplifies it. In SwiftForth, judicious use of critical
sections along with the words described here can provide equivalent security.
The way a task knows whether a resource is available is by testing a facility variable.
This is a normal VARIABLE distinguished only by its purpose. This variable repre-
sents the status of the resource:
• zero if resource is available,
• non-zero if the resource is not available.
Forth provides GET and RELEASE to handle this test.
Assume, for example, that PRINTER is defined as a facility variable, like this:
VARIABLE PRINTER
No other task can GET this variable until the task that owns it does a RELEASE.
A task releases a facility by using the phrase PRINTER RELEASE to initiate the follow-
ing steps:
• It checks to see if the resource is owned by the executing task.
Glossary
GET ( addr -- )
Obtains control of the facility variable at addr, after first executing PAUSE to allow
other tasks to run. If the facility is owned by another task, the task executing GET
will wait until the facility is available.
RELEASE ( addr -- )
Relinquishes the facility variable at addr. If the task executing RELEASE did not pre-
viously own the facility, this operation is a no-op.
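The glossary behavior can be modeled in a few lines of standard Forth. This is a minimal sketch, not the SwiftForth or SwiftX source: ME (a stand-in for the running task's identifier), MY-PAUSE (a stand-in for PAUSE), MY-GET, and MY-RELEASE are hypothetical names chosen so that the code loads on a system without a multitasker. It assumes that a facility variable holds zero when free and the owner's identifier otherwise.

```forth
VARIABLE PRINTER   0 PRINTER !      \ facility variable; zero means available

1 VALUE ME                          \ hypothetical: identifier of the running task
: MY-PAUSE ( -- )   ;               \ hypothetical: a real PAUSE would yield here

\ Wait until the facility is free (or already ours), then claim it.
: MY-GET ( addr -- )
   BEGIN  MY-PAUSE  DUP @  DUP 0=  SWAP ME =  OR  UNTIL
   ME SWAP ! ;

\ Clear the facility only if this task is the owner; otherwise do nothing.
: MY-RELEASE ( addr -- )
   DUP @ ME = IF  0 SWAP !  ELSE  DROP  THEN ;
```

With only one task present, calling MY-GET on a facility owned by a different identifier would loop forever, because nothing else can run to release it. In a real multitasker, PAUSE is what gives the owner the chance to do so.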
Neither SwiftX nor SwiftForth has any explicit safeguards against deadlocks, in
which two (or more) tasks conflict because each wants a resource the other has.
For example:
: 1HANG MUX GET TAPE GET ;
: 2HANG TAPE GET MUX GET ;
If 1HANG and 2HANG are run by different tasks, the tasks could eventually deadlock.
The best way to avoid deadlocks is to get facilities one at a time, if possible. If you
have to get two resources at the same time, it is safest to always request them in the
same order. In the example above involving a multiplexer and tape, the programmer
could save values from the multiplexer in a buffer, then move them to tape. In
almost all cases, there is a simple way to avoid concurrent GET operations. However,
in a poorly written application, the conflicting requests might occur on different
nesting levels, hiding the problem until a conflict occurs.
It is better to design an application to GET only one resource at a time - deadlocks
are impossible in that case.
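When two facilities really must be held together, one way to guarantee a fixed request order is to hide it in a single pair of words that every task uses. The sketch below is illustrative only: MUX and TAPE are assumed to be facility variables; GET-BOTH, RELEASE-BOTH, 1SAFE, and 2SAFE are hypothetical names; and the stand-in definitions of GET and RELEASE exist only so that the file loads on a system that lacks them.

```forth
VARIABLE MUX   VARIABLE TAPE        \ facility variables, assumed for this sketch
0 MUX !   0 TAPE !

\ Stand-ins for systems without a multitasker. SwiftForth and SwiftX
\ supply the real GET and RELEASE, which are used when present.
[UNDEFINED] GET [IF]
: GET ( addr -- )   TRUE SWAP ! ;
: RELEASE ( addr -- )   0 SWAP ! ;
[THEN]

\ Every task requests MUX first and TAPE second, so no task can hold
\ TAPE while waiting for MUX.
: GET-BOTH ( -- )   MUX GET  TAPE GET ;
: RELEASE-BOTH ( -- )   TAPE RELEASE  MUX RELEASE ;

: 1SAFE ( -- )   GET-BOTH  ( use both devices here )  RELEASE-BOTH ;
: 2SAFE ( -- )   GET-BOTH  ( use both devices here )  RELEASE-BOTH ;
```

Because the order lives in one place, a later change to it cannot leave two definitions disagreeing at different nesting levels.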
The code below illustrates the concepts discussed so far by adding a facility vari-
able named EVENTER. If the OPERATOR task (you, typing in the command window)
does an EVENTER GET, all counting will stop until you do EVENTER RELEASE.
{ =====================================================================
SwiftForth MULTITASKING example
Because EVENTS is a USER variable, each task has its own copy.
Adjusting #DELAY will make both tasks run faster or slower. Use of MS
causes task to PAUSE so others can run.
===================================================================== }
This is an actual project submitted to FORTH, Inc. by a customer. We were given the
task of developing a microprocessor-based telephone switch:
• 128 lines, 8 trunks
• Each line supports:
• off-hook sensing
• dial-tone generation
• busy signal
• touch-tone sensing
• test for busy
• other features
[Figure: telephone switcher as specified by the customer, showing the device driver, scheduler, and message handler]
This design was specified by the customer. It assigned one task to each function (off
hook, etc.). Tasks communicated by sending messages through the scheduler/mes-
sage handler.
The off-hook task was continually scanning the 128 lines to detect an off-hook con-
dition. When one was detected, the task sent a message to the dial-tone task, which
issued a dial tone and sent a message to the dialing task, etc. The customer had
attempted to program this system, but the slow, 8-bit microcontroller was not
keeping up with the message traffic.
We re-designed the system using the FORTH, Inc. vertical approach, as in Figure 21.
Figure 21. Telephone switcher, vertical approach
Each task runs a single, high-level, re-entrant definition, of which a simplified ver-
sion is:
: PHONE ( -- )   BEGIN  OFFHOOK  TONE  DIAL
      -BUSY IF  CONNECT
      ELSE  BUSY  THEN
   AGAIN ;
Each task contains private user variables that control its status and action; for
example, one cell is 0 if the line is not in use, or points to the line to which it is con-
nected if it is busy.
By letting each task perform its functions sequentially, message passing was virtu-
ally eliminated and the only intertask communication was the act of establishing a
connection if the called line was available. Even though there were many more
tasks, overall performance improved dramatically and the project was a success.
This section presents two alternate sets of style guidelines plus some suggestions
for using symbols in names meaningfully. You may use them as a starting point in
developing a style for yourself. If you work in a programming team, we strongly rec-
ommend that the team agrees on a set of guidelines and all members follow them.
This will make it much easier to share code among yourselves and to maintain it
over time.
The purpose of this section is to describe the standards used at FORTH, Inc. for
editing Forth source code to ensure readability and notational consistency across all
Forth systems.
1. All colon or code definitions must include a comment identifying stack parameters
on entry and exit. If no stack parameters are used, an "empty" stack comment is
still required.
2. The format of the comment is: ( input -- output )
... with the rightmost item in each list representing the top of the stack.
Example 1: TYPE ( a n -- ) (input only)
Example 2: -FOUND ( -- a a' t ) (output only)
Example 3: CODE @ ( a -- n ) (both)
Example 4: NO-OP ( -- ) (no arguments)
3. The stack arguments comment begins one space after the name of the word. The
terminating parenthesis should follow the last character, with one space. Exactly
three spaces follow the right parenthesis before the code begins. Remember to leave
one space after the opening (.
4. The specific description of the stack item should follow these conventions:
addr     address
b        8-bit byte
char     ASCII character
n        single-length number, usually signed
f l+1    first and last limits, exclusive at end (as for PRINTING, etc.)
Other special situations may be dealt with similarly, if necessary to improve clarity,
but use single characters where possible. Remember to describe any special nota-
tion in source comments!
5. Where there are several arguments of the same type, and if clarity demands that
they be distinguished, use ' (prime) or suffix numerals. For example:
CODE RSWAP ( n a a' -- n a )
... shows that the address returned is the same as the first one input.
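As an illustration of rules 1 through 5, the definitions below carry stack comments in the format described: one space after the name, one space inside each parenthesis, and three spaces before the code. The words themselves (SQUARED, CLIP, .STARS, IDLE) are hypothetical examples, not part of any FORTH, Inc. product.

```forth
\ One input and one output of the same type, distinguished by a prime.
: SQUARED ( n -- n' )   DUP * ;

\ Suffix numerals distinguish the lower limit n1 from the upper limit n2.
: CLIP ( n n1 n2 -- n' )   ROT MIN MAX ;

\ Input only: nothing is left on the stack.
: .STARS ( n -- )   0 ?DO  [CHAR] * EMIT  LOOP ;

\ No parameters: the "empty" stack comment is still required.
: IDLE ( -- )   ;
```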
1. All source files should begin with a comment that succinctly describes the contents
of the file. This should be followed by any discussion that applies to the file as a
whole, a list of required support features that are not part of ANS Forth, and a list
of words in the file that are intended for public use (i.e., as distinct from words
intended for use only within this file as support words).
2. Before each closely related group of definitions should be a block comment describ-
ing the group as a whole (e.g., assumptions or rules of usage) and the individual
words in the group. A block comment begins with:
{ --------------------------------------------------------
...and ends with:
-------------------------------------------------------- }
3. Comments within definitions (other than stack effects) should be directed to help-
ing the reader understand what the code is doing from an application perspective,
or to elucidating a possibly obscure strategy.
1. Blank lines are valuable. Use them to separate definitions or groups of definitions.
Avoid a dense clump of lines with a lot of blank lines below, unless the clump is a
single definition. A blank line inside a definition is usually unhelpful and should be
avoided. Try to leave at least one blank line at the end.
2. Definitions should begin in the leftmost column of a line, with the following excep-
tions:
a. If the definition is prefaced by a bar ( | ) to make it headless, the bar should go
in the first column, followed by one space, and the definition begins immedi-
ately thereafter.
b. Two or three related variables, constants, or other data items may share a line
if there is room for three spaces between them.
c. Very short colon definitions may share a line if they are closely related, are
spaced properly internally and are separated from each other by at least three
spaces.
3. The name of a definition must be separated from its defining word by only one
space. If it is a constant or other object with a specified value, the value must be
separated from the defining word by only one space.
4. Individual instructions in a code definition must be separated by three spaces. Com-
ponents of each instruction must be separated by only one space. For example:
W R ) MOV   0 W ) MOV B   0 )+ 0 )+ CMP B
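The layout rules for high-level code can be seen together in a short fragment. This is only a sketch: the names ROW, COL, #COLS, #ROWS, HOME, and RIGHT are invented for the illustration.

```forth
\ Related data items share a line, three spaces apart; each name and
\ each value is one space from its defining word.
VARIABLE ROW   VARIABLE COL

80 CONSTANT #COLS   25 CONSTANT #ROWS

\ Each definition begins in the leftmost column; a blank line separates groups.
: HOME ( -- )   0 ROW !  0 COL ! ;
: RIGHT ( -- )   COL @ 1+  #COLS MOD  COL ! ;
```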
Open Firmware is a Forth-based standard for ROM boot and diagnostic firmware
used in many workstations. It was developed at Sun Microsystems in the 1980s, and
is used by Sun, Apple, Motorola, IBM, and other manufacturers.
This section describes the coding style in some Open Firmware implementations.
These guidelines are from a living document that first came into existence in 1985.
By following these guidelines in your code development, you will produce code that
is similar in style to a large body of existing Open Firmware work. This will make
your code more easily understood by others within the Open Firmware community.
Forth code can be very terse; the judicious use of spaces can increase the
readability of your code.
Two consecutive spaces are used to separate a definition's name from the beginning
of the stack diagram, another two consecutive spaces (or a new line) are used to
separate the stack diagram from the word's definition, and two consecutive spaces
(or a new line) separate the last word of a definition from the closing semi-colon.
For example:
: new-name__(_stack-before_--_stack-after_)__foo__bar__;

: new-name__(_stack-before_--_stack-after_)
___foo_bar_framus_dup_widget_foozle_ribbit_grindle
;
Forth words are usually separated by one space. If a phrase consisting of several
words performs some function, that phrase should be separated from other words
or phrases by two consecutive spaces or a new line.
When creating multiple-line definitions, all lines except the first and last should be
indented by three (3) spaces. If additional indentation is needed with control struc-
tures, the left margin of each additional level of indentation should start three (3)
spaces to the right of the preceding level.
: name__(_stack before_--_stack after_)
___xxx ...
______xxx ...
______xxx ...
___xxx
;
In if...then or if...else...then control structures that occupy no more than one line,
two spaces should be used both before and after each if, else, or then.
__if__xxx__then__
__if__xxx__else__xxx__then__
Longer constructs should be structured like this:
<code to generate flag>__if
___<true clause>
then

<code to generate flag>__if
___<true clause>
else
___<false clause>
then
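Applied to real words, the one-line and multi-line forms might look like the sketch below. The names nonneg and abs-diff are hypothetical, chosen only to show the spacing and the three-space indentation inside a definition.

```forth
\ One-line form: two spaces before and after each if and then.
: nonneg  ( n -- n' )  dup 0<  if  drop 0  then  ;

\ Multi-line form: each clause is indented three spaces further than
\ the line that holds its if or else.
: abs-diff  ( n1 n2 -- u )
   2dup <  if
      swap -
   else
      -
   then
;
```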
In do...loop constructs that occupy no more than one line, two spaces should be
used both before and after each do or loop.
<code to calculate limits>__do__xxx__loop__
Longer constructs should be structured like this:
<code to calculate limits>__do
___<body>
loop
The longer +loop constructs should be structured like this:
<code to calculate limits>__do
___<body>
<incremental value>__+loop
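A short sketch of both loop forms follows. The words sum-to and count-evens are hypothetical, and both assume a non-negative (for count-evens, positive) argument so that the do loop has a sensible range.

```forth
\ One-line form: adds the integers 0..n.
: sum-to  ( n -- sum )  0 swap 1+ 0  do  i +  loop  ;

\ Multi-line +loop form: counts the even indices below limit (limit > 0).
: count-evens  ( limit -- n )
   0 swap 0  do
      1+
   2  +loop
;
```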
In begin...while...repeat constructs that occupy no more than one line,
two spaces should be used both before and after each begin, while, or repeat.
__begin__<flag code>__while__<body>__repeat__
Longer constructs:
begin__<short flag code>__while
___<body>
repeat

begin
___<long flag code>
while
___<body>
repeat
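As a concrete instance of the short-flag-code form, the sketch below computes the integer base-2 logarithm of an unsigned number by repeated shifting. The word log2 is a hypothetical example; it returns 0 for inputs of 0 and 1.

```forth
: log2  ( u -- n )
   0 swap                       ( n u )
   begin  dup 1 >  while
      1 rshift  swap 1+ swap    ( n+1 u/2 )
   repeat
   drop                         ( n )
;
```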
In begin...until and begin...again constructs that occupy no more than one line, two
spaces should be used both before and after each begin, until, or again.
__begin__<body>__until__
__begin__<body>__again__
Longer constructs:
begin
___<body>
until

begin
___<body>
again
Block comments begin with \_. All text after the space is ignored until after the next
new line. It would be possible to delimit block comments with parentheses, but the
use of parentheses is reserved by convention for stack comments.
Precede each non-trivial definition with a block comment giving a clear and concise
explanation of what the word does. Put more comments at the very beginning of the
file to describe external words that could be used from the User Interface.
Stack comments begin with (_ and end with ). Use stack comments liberally within
definitions. Try to structure each definition so that, when you put stack comments
at the end of each line, the stack picture makes a nice pattern.
: name  ( stack before -- stack after )
___xxx xxx bar  ( stack condition after the execution of bar )
___xxx xxx foo  ( stack condition after the execution of foo )
___xxx xxx dup  ( stack condition after the execution of dup )
;
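With real words in place of xxx, the per-line stack pictures might look like this sketch. The word mean is hypothetical, and the division is the ordinary integer division of the host Forth.

```forth
: mean  ( n1 n2 n3 -- avg )
   +        ( n1 n2+n3 )
   +        ( sum )
   3 /      ( avg )
;
```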
Return stack comments are also delimited with parentheses. In addition, the nota-
tion r: is used at the beginning of the return stack comment to differentiate it from
a data stack comment.
Place return stack comments on any line that contains one or more words that
cause the return stack to change. (This limitation is a practical one; it is often
difficult to do otherwise due to lack of space.) The words >r and r> must be paired
inside colon definitions and inside do...loop constructs.
: name  ( stack before -- stack after )
___xxx >r  ( r: addr )
___xxx r>  ( r: )
;
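A small worked case shows data stack and return stack comments side by side. The word third-item is a hypothetical example that copies the third stack item to the top; note that >r and r> are paired within the one definition.

```forth
: third-item  ( a b c -- a b c a )
   >r         ( a b )       ( r: c )
   over       ( a b a )
   r> swap    ( a b c a )   ( r: )
;
```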
11.2.10 Numbers
17. Unlike FORTH, Inc. products, Open Firmware ignores periods anywhere except at the right-hand end
of a number. It is designed for large cell-size systems, and makes little use of double precision.
18. h# and d# are specific to Open Firmware. They are words, and must be followed by a space before
the number. For example: h# 0f.
In these tables, "name" refers to some word the programmer has chosen to repre-
sent a Forth routine.
Note: Where possible, a prefix before a name indicates the type or precision of the
value being operated on, whereas a suffix after a name indicates what the value is or
where it's kept.