Unit 4
There are two goals of this unit. The first is to introduce the subsymbolic quantity of
activation associated with chunks. The other is to show how those activation values are
learned through the history of usage of the chunks.
4.1 Introduction
We have seen retrieval requests in productions like this:
(P example-counting
=goal>
isa count
state counting
number =num1
=retrieval>
isa count-order
first =num1
second =num2
==>
=goal>
number =num2
+retrieval>
isa count-order
first =num2
)
In this case an attempt is being made to retrieve a count-order chunk with a particular
number (bound to =num2) in its first slot. Up to now we have been working with the
system at the symbolic level. If there was a chunk that matched that retrieval request, it
would be placed into the retrieval buffer, and if not, the retrieval request would fail
and the state of the declarative memory module would indicate an error. The system was
deterministic, and we did not consider any timing cost associated with that retrieval or the
possibility that a chunk in declarative memory might fail to be retrieved. For the simple tasks
we have looked at so far that was sufficient.
Most psychological tasks, however, are not that simple, and issues such as accuracy and
latency over time or across different conditions are measured. For modeling these more
involved tasks one will typically need to use the subsymbolic components of ACT-R to
accurately model and predict human performance. For the remainder of the
tutorial we will be looking at the subsymbolic components that control
the performance of the system. To use the subsymbolic components
we need to turn them on by setting the :esc parameter (enable
subsymbolic computations) to t:
(sgp :esc t)
That setting will be necessary for the rest of the models in the tutorial.
4.2 Activation
Every chunk in ACT-R's declarative memory has associated with it a numerical value
called its activation. The activation reflects the degree to which past experiences and
current context indicate that chunk will be useful at any particular moment. When a
retrieval request is made the chunk with the greatest activation among those that match
the specification of the request will be the one placed into the retrieval buffer. There is
one constraint on that, however. There is a parameter called the retrieval threshold which
sets the minimum activation a chunk can have and still be retrieved. It is set with the :rt
parameter.
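For example, this call sets it to 0, which is its default value:

(sgp :rt 0)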
If the chunk with the highest activation among those that match the request has an
activation which is less than the retrieval threshold, then no chunk will be placed into the
retrieval buffer and an error state will indicate the failure.
The activation $A_i$ of a chunk i is the sum of its base-level activation and a noise component:

$$A_i = B_i + \varepsilon$$

$B_i$: The base-level activation. This reflects the recency and frequency of practice of chunk i.

$\varepsilon$: The noise value. The noise is composed of two components: a permanent noise
associated with each chunk and an instantaneous noise computed at the time of a retrieval
request.
4.3 Base-Level Learning

The base-level activation of a chunk is learned from its history of use, as described by the base-level learning equation:

$$B_i = \ln\left(\sum_{j=1}^{n} t_j^{-d}\right)$$

$n$: The number of presentations of chunk i.
$t_j$: The time since the jth presentation.
$d$: The decay parameter, which is set with the :bll parameter.
This equation describes a process in which each time an item is presented there is an
increase in its base-level activation, which decays away as a power function of the time
since that presentation. These decay effects are summed and then passed through a
logarithmic transformation.
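For example, with the decay parameter d set to 0.5 (the value recommended later in this unit), a chunk that was presented 10 seconds ago and again 2 seconds ago would have a base-level activation of

$$B_i = \ln\left(10^{-0.5} + 2^{-0.5}\right) = \ln(0.316 + 0.707) \approx 0.02$$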
There are two types of events that are considered as presentations of a chunk. The first is
its initial entry into declarative memory. The other is any merging of that chunk with a
chunk that is already in declarative memory. The next two subsections describe those
events in more detail.
4.3.1 Chunks Entering Declarative Memory
When a chunk initially enters declarative memory, that is counted as its first
presentation. There are two ways for a chunk to enter declarative memory, both
of which have been discussed in the previous units. They are:
- Explicitly by the modeler using the add-dm command. These chunks are
entered at the time the call is executed, which is time 0 for a call in the body
of the model definition.
- When a chunk is cleared from a buffer. We have seen this happen in many
of the previous models: as visual locations, visual objects, and goal chunks are
cleared from their buffers, they can then be found among the chunks in
declarative memory.
4.3.2 Chunk Merging
Something we have not seen previously is what happens when the chunk cleared from a
buffer is an identical match to a chunk which is already in declarative memory. If a
chunk has the same chunk-type and all of its slot values are an exact match with one of
the existing chunks in declarative memory then instead of being added to declarative
memory it goes through a process we refer to as merging with that existing chunk.
Instead of adding the new chunk to declarative memory, the preexisting chunk in
declarative memory is credited with a presentation. Then the name of the new chunk (the
one which is being merged into declarative memory) is changed to reference the
chunk that was already in declarative memory, i.e., there is one chunk which now has two
(or possibly more) names. This mechanism results in repeated completions of the same
operation (the clearing of a duplicate chunk) reinforcing the chunk that represents that
situation instead of creating lots of identical chunks, each with only one presentation. So,
for example, repeatedly attending to the same visual stimulus would result in strengthening a
single chunk that represents that object.
4.5 Noise
The noise component of the activation equation contains two sources of noise. There is a
permanent noise which can be associated with a chunk and an instantaneous noise value
which will be recomputed at each retrieval attempt. Both noise values are generated
according to a logistic distribution characterized by a parameter s. The mean of the
logistic distribution is 0 and the variance, $\sigma^2$, is related to the s value by this equation:

$$\sigma^2 = \frac{\pi^2}{3}s^2$$
The permanent noise s value is set with the :pas parameter and the instantaneous noise s
value is set with the :ans parameter. Typically, we are only concerned with the
instantaneous noise (the variance from trial to trial) and leave the permanent noise turned
off (a value of nil).
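In a model that typical configuration would look something like this call (0.5 is the instantaneous noise value used in the models later in this unit):

(sgp :ans 0.5 :pas nil)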
If we make a retrieval request and there is a matching chunk, that chunk will only be
retrieved if its activation exceeds the retrieval threshold, τ. The probability of this
happening depends on the expected activation, $A_i$, and the amount of noise in the system,
which is controlled by the parameter s:
$$P(\text{recall}) = \frac{1}{1 + e^{-(A_i - \tau)/s}}$$
Inspection of that formula shows that as $A_i$ increases the probability of recall
approaches 1, whereas as τ increases the probability decreases. In fact, when $\tau = A_i$,
the probability of recall is .5. The s parameter controls the sensitivity of recall to changes
in activation. If s is close to 0, the transition from near 0% recall to near 100% recall will be
abrupt, whereas when s is larger the transition will be a slow sigmoidal curve.
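For example, with the noise parameter s set to 0.5 and a chunk whose expected activation is half a unit above the threshold, the probability of retrieval would be

$$P = \frac{1}{1 + e^{-0.5/0.5}} = \frac{1}{1 + e^{-1}} \approx .73$$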
If no chunk matches the retrieval request, or no chunk has an activation which is greater
than the retrieval threshold, then a retrieval failure will occur. The time it takes for the
failure to be signaled is:

$$\text{Time} = Fe^{-\tau}$$
$\tau$: The retrieval threshold.
F: The latency factor.
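The time for a successful retrieval is computed with the same equation using the activation of the chunk which is retrieved in place of τ, so more active chunks are retrieved faster. As a worked example of the failure time, with the latency factor of 0.4 and retrieval threshold of -2 used later in this unit, a retrieval failure takes $0.4e^{2} \approx 2.96$ seconds to be signaled, which is the delay between the start of the retrieval at 0.185 seconds and the failure at 3.141 seconds in the trace shown below.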
The first example task for this unit is a paired-associate learning experiment from:
Anderson, J.R. (1981). Interference: The relationship between response latency and
response accuracy. Journal of Experimental Psychology: Human Learning and Memory,
7, 326-343.
The Unit 4 folder contains the paired model for this experiment. The experiment code is
written to allow one to run a general form of the experiment. Both the number of pairs to
present and the number of trials to run can be specified. You can run through n trials of
m paired associates (m no greater than 20) by calling do-experiment with those
parameters:
(do-experiment m n)
For each of the m words you will see the stimulus for 5 seconds during which you have
the opportunity to make your response. Then you will see the associated number for 5
seconds. The simplest form of the experiment is one in which a single pair is presented
twice. Here is the trace of the model doing such a task. On the first presentation the model
has an opportunity to learn the pair, and on the second it has a chance to recall that learned
pair:
> (do-experiment 1 2)
0.000 GOAL SET-BUFFER-CHUNK GOAL GOAL REQUESTED NIL
0.000 VISION SET-BUFFER-CHUNK VISUAL-LOCATION VISUAL-LOCATION0-0 REQUESTED NIL
0.000 PROCEDURAL CONFLICT-RESOLUTION
0.050 PROCEDURAL PRODUCTION-FIRED ATTEND-PROBE
0.050 PROCEDURAL CLEAR-BUFFER VISUAL-LOCATION
0.050 PROCEDURAL CLEAR-BUFFER VISUAL
0.050 PROCEDURAL CONFLICT-RESOLUTION
0.135 VISION Encoding-complete VISUAL-LOCATION0-0-0 NIL
0.135 VISION SET-BUFFER-CHUNK VISUAL TEXT0
0.135 PROCEDURAL CONFLICT-RESOLUTION
0.185 PROCEDURAL PRODUCTION-FIRED READ-PROBE
0.185 PROCEDURAL CLEAR-BUFFER VISUAL
0.185 PROCEDURAL CLEAR-BUFFER IMAGINAL
0.185 PROCEDURAL CLEAR-BUFFER RETRIEVAL
0.185 DECLARATIVE START-RETRIEVAL
0.185 PROCEDURAL CONFLICT-RESOLUTION
0.385 IMAGINAL SET-BUFFER-CHUNK IMAGINAL PAIR0
0.385 PROCEDURAL CONFLICT-RESOLUTION
3.141 DECLARATIVE RETRIEVAL-FAILURE
The basic structure of the screen processing productions should be familiar by now. The
one thing to note is that because this model must wait for stimuli to appear on screen it
takes advantage of the buffer stuffing mechanism so that it can wait for the change
instead of continuously checking. The way it does so is that the first production that will
fire, for either the probe or the associated number, has a visual-location buffer test on its
LHS which will only match once buffer stuffing places a chunk into the buffer. Here are
the attend-probe and detect-study-item productions for reference:
(p attend-probe
=goal>
isa goal
state start
=visual-location>
isa visual-location
?visual>
state free
==>
+visual>
isa move-attention
screen-pos =visual-location
=goal>
state attending-probe
)
(p detect-study-item
=goal>
isa goal
state read-study-item
=visual-location>
isa visual-location
?visual>
state free
==>
+visual>
isa move-attention
screen-pos =visual-location
=goal>
state attending-target
)
Because the buffer is cleared automatically by strict harvesting and no later productions
issue a request for a visual-location, these productions must wait for buffer stuffing to put
a chunk into the visual-location buffer before they can match. Since none of the other
productions match in the meantime, the model will essentially just wait for the screen to
change before doing anything else.
Now we will focus on the productions which are responsible for forming the association
and retrieving the chunk. When the model attends to the probe with the read-probe
production two actions are taken (in addition to the updating of the goal state):
(p read-probe
=goal>
isa goal
state attending-probe
=visual>
isa text
value =val
==>
+imaginal>
isa pair
probe =val
+retrieval>
isa pair
probe =val
=goal>
state testing
)
It makes a request to the imaginal buffer to create a chunk of type pair which will hold
the value read from the screen in the probe slot. It also makes a request through the
retrieval buffer to retrieve a pair chunk from declarative memory which has that same
probe value.
We will come back to the retrieval request shortly. For now we will focus on the creation
of the pair chunk. This request will cause the imaginal module to create a new chunk
which it will place into the imaginal buffer. This chunk will encode the association
between the probe and the answer which is presented later. The associate production
fires after the model reads the number which is associated with the probe:
(p associate
=goal>
isa goal
state attending-target
=visual>
isa text
value =val
=imaginal>
isa pair
==>
=imaginal>
answer =val
-imaginal>
=goal>
state start
+visual>
isa clear
)
This production sets the answer slot of the pair chunk which is in the imaginal buffer to
the answer which was read from the screen. It also then clears that chunk from the buffer
so that it can be entered into declarative memory. That will result in a chunk like this
being added to the model’s declarative memory:
PAIR0-0
ISA PAIR
PROBE "zinc"
ANSWER "9"
This chunk serves as the memory of this trial. An important thing to note is that the
chunk in the buffer is not added to the model's declarative memory until that buffer is
cleared. Often that happens when the model later harvests the chunk from the buffer, but
in this case the model does not harvest the chunk later, so it is explicitly cleared at that
point. One could imagine adding additional productions which would rehearse that
chunk (a sketch of one possibility is shown below), but for the demonstration model that is not done.
This production also makes a clear request to the visual buffer to stop attending to the
item. That is done so that the model does not automatically re-encode the item when the
screen is updated.
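As an illustration of that rehearsal idea, here is a hypothetical sketch of such a production. It is not part of the demonstration model, and it assumes a rehearse goal state and a probe slot in the goal chunk, neither of which exists in the model as given. Retrieving the pair and then harvesting and clearing it would credit the chunk with another presentation through the merging mechanism described earlier:

(p rehearse-pair
   =goal>
      isa goal
      state rehearse
      probe =val
   ?retrieval>
      state free
==>
   +retrieval>
      isa pair
      probe =val
   =goal>
      state start
)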
The declarative memory module will attempt to retrieve a pair chunk with the requested
probe. Depending on whether a chunk can be retrieved, one of two production rules may
apply, corresponding to either the successful retrieval of such a chunk or the failure to
retrieve a matching chunk:
(p recall
=goal>
isa goal
state testing
=retrieval>
isa pair
answer =ans
?manual>
state free
==>
+manual>
isa press-key
key =ans
=goal>
state read-study-item
+visual>
isa clear
)
(p cannot-recall
=goal>
isa goal
state testing
?retrieval>
state error
==>
=goal>
state read-study-item
+visual>
isa clear
)
The probability of the recall production firing and the mean latency of the recall will be
determined by the activation of the chunk which is retrieved, and that activation will
increase with repeated presentations and harvested retrievals.
The model gives a pretty good fit to the data, as illustrated below in a run of 100
simulated subjects (because of stochasticity the results are more reliable with more
runs, and to generate that many runs in a reasonable amount of time one must turn off the
trace and remove the seed parameter to allow for differences from run to run):
Latency:
CORRELATION: 0.998
MEAN DEVIATION: 0.097
Trial 1 2 3 4 5 6 7 8
0.000 2.139 1.834 1.665 1.546 1.440 1.377 1.306
Accuracy:
CORRELATION: 0.994
MEAN DEVIATION: 0.043
Trial 1 2 3 4 5 6 7 8
0.000 0.554 0.769 0.855 0.904 0.932 0.955 0.960
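If the retrieval threshold is too high for the chunks to be retrieved, however, the results look like this: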
Latency:
CORRELATION: 0.000
MEAN DEVIATION: 1.619
Trial 1 2 3 4 5 6 7 8
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Accuracy:
CORRELATION: 0.000
MEAN DEVIATION: 0.777
Trial 1 2 3 4 5 6 7 8
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
which shows a complete failure to retrieve any of the facts. Just lowering the retrieval
threshold so that they can be retrieved results in something like this:
> (collect-data 5)
Latency:
CORRELATION: 0.972
MEAN DEVIATION: 1.187
Trial 1 2 3 4 5 6 7 8
0.000 3.512 3.871 3.359 2.793 2.473 2.265 2.136
Accuracy:
CORRELATION: 0.914
MEAN DEVIATION: 0.176
Trial 1 2 3 4 5 6 7 8
0.000 0.100 0.600 1.000 1.000 1.000 1.000 1.000
It shows some of the general trends, but does not fit the data well. The behavior of this
model, and of the one that you have to write, really depends on the settings of four
parameters. Here are those parameters and their settings in this model:

- The retrieval threshold is set to -2. This determines how active a chunk has to be to be retrieved 50% of the time.
- The instantaneous activation noise is set to 0.5. This determines how quickly the probability of retrieval changes as activation moves past the threshold.
- The latency factor is set to 0.4. This determines the magnitude of the activation effects on latency.
- The decay rate for base-level learning is set to 0.5, which is where we recommend it be set for most tasks that involve the base-level learning mechanism.
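Those four values correspond to the :rt, :ans, :lf, and :bll parameters, so the model's settings would look something like this call (a sketch using the standard parameter names):

(sgp :rt -2 :ans 0.5 :lf 0.4 :bll 0.5)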
Determining those values can be a tricky process because the equations are all
related, and thus the parameters cannot be manipulated independently to find a best fit.
Typically some sort of searching is required, and there are many ways to accomplish that.
For the tutorial models there will typically be only one or two parameters that you will
need to adjust, and we recommend that you work through the process "by hand", adjusting
the parameters individually to see the effect that they have on the model. There are other
ways of determining parameters that can be used, but we will not be covering any of them
in the tutorial.
You may find a detailed accounting of the activation computation useful in debugging
your models or just in understanding how the system computes activation values.
The assignment task for this unit is based on an alpha-arithmetic experiment by Zbrodoff,
in which participants judged equations such as "A + 2 = C" as true or false. She
manipulated whether the addend was 2, 3, or 4 and whether the problem was true or
false. She had 2 versions of each of the 6 kinds of problems (3 addends x 2 responses),
each with a different letter (a through f). She then manipulated the frequency with which
problems were studied in sets of 24 trials.
Each participant saw problems based on one of the three conditions. There were 8
repetitions of a set of 24 problems in a block (192 problems), and there were 3 blocks,
for 576 problems in all. The data are the times in seconds to judge the problems true
or false, as a function of the block and the addend, aggregated over both true and false
responses.
The interesting phenomenon concerns the interaction between the effect of the addend
and the amount of practice. Presumably, the addend effect originally occurs because subjects
have to engage in counting, but later they come to rely mostly on retrieval of answers
they have stored from previous computations.
The task for this unit is to develop a model of the control group data. Functions to run the
experiment and most of a model that can perform the task are provided in the model
called zbrodoff. The model as given does the task by counting through the alphabet and
numbers “in its head” (using the subvocalize action of the speech module to produce
reasonable timing data) to arrive at an answer which it compares to the initial equation to
determine how to respond. Here is the performance of this model on the task:
> (collect-data 1)
CORRELATION: 0.289
MEAN DEVIATION: 1.309
It is always correct (64 out of 64 for each cell) but does not get any faster from block to
block because it always uses the counting strategy. Your first task is to extend the model
so that it attempts to remember previous instances of the trials. If it can remember the
answer it does not have to resort to the counting strategy and can respond much faster.
The previous trials are encoded as problem chunks, which hold the result of the
counting. A completed problem chunk for a trial where the stimulus was "A + 2 = C" would
look like this:
PROBLEM0-0
ISA PROBLEM
ARG1 "a"
ARG2 "2"
RESULT "c"
The result slot contains the result of counting 2 letters from A. An important thing to note
is that the actual target letter for the trial is stored in the goal buffer for comparison after
the model has finished counting to a result. The model only encodes the result of the
counting in the problem chunks. Thus the same problem chunk will result from a trial
where the stimulus presented is “A + 2 = D”. The assumption is that the person is
actually learning the letter counting facts and not just memorizing the stimulus-response
pairings for the task. There will be one of those problem chunks for each of the additions
which is encountered, which will be a total of six after it completes the first set of trials.
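One way to begin (a hypothetical sketch; none of this is in the given model) is to have a production issue a retrieval request for a problem chunk once the two arguments have been read, with bindings such as:

+retrieval>
   isa problem
   arg1 =arg1
   arg2 =arg2

If a matching chunk is retrieved its result slot can be compared to the target letter stored in the goal, and a retrieval failure can lead into the counting strategy which is already provided.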
After your model is able to utilize a retrieval strategy along with the counting strategy
given, your next step is to adjust the parameters so that the model’s performance better
fits the experimental data. The results should look something like this after you have the
retrieval strategy working:
CORRELATION: 0.929
MEAN DEVIATION: 0.656
The model still always responds correctly on all trials and the correlation is good, but the
deviation is quite high because the model is too fast overall. The model's performance
will depend on the same four parameters as the paired associate model: latency factor,
activation noise, base-level decay rate, and retrieval threshold. In the model you are
given, the first three are set to the same values as in the paired associate model and
represent reasonable values for this task. You should not have to adjust any of those.
However, the retrieval threshold (the :rt parameter) is set to its default value of 0. This is
the parameter you should manipulate to improve the fit to the data; our own fit was
obtained by adjusting only the retrieval threshold.
This experiment is more complicated than the ones that you have seen previously. It runs
continuously for many trials, and the learning that occurs across trials is important. Thus
the model cannot treat each trial as an independent event and be reset before each one, as
has been done for the previous units. While writing your model and testing the fit to the
data you will probably want to test it on smaller runs than the whole task. There are four
functions you can use to test the experiment:

- collect-data takes one parameter, the number of times to run the full experiment. It
averages the results of running the full experiment that many times and reports the
correlation and deviation to the experimental data. It may take a long time to run,
especially if you request a lot of runs for comparing to the data (to speed up the runs,
once you are sure the model is doing the right thing you should definitely turn off the
trace, the :v parameter in the sgp command).
- do-block takes one optional parameter and runs 1 block of the experiment and displays
the results. The optional parameter controls whether or not to show the window while
the task is running: if you do not supply the parameter a virtual window is used, and if
you specify the parameter t then a real window will be shown.
- do-set is similar to do-block, except it only runs through the set of 24 items once.
- do-trial runs a single trial. It takes four parameters, which are all single-character
strings, and an optional fifth parameter like do-block and do-set to control the display.
The first three are the elements of the equation to present, i.e., "a" "2" "c" to present
a + 2 = c. The fourth is the correct key press, "K" for a true probe and "D" for a false
probe.

One thing to note is that the only one of those functions which calls reset is collect-data.
So if you are using the other functions while testing the model, keep in mind that unless
you call reset it will remember everything it has done since the last time it was reset.
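For example, assuming the functions work as described above, this call would run one trial which presents a + 2 = c in a real window, with "K" (true) as the correct response:

(do-trial "a" "2" "c" "K" t)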
There is one additional command in the provided model that you have not seen before, a call to set-all-base-levels of the following form:
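(set-all-base-levels 100000 -1000)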
This sets the base-level activation of all the chunks in declarative memory that exist when
it is called (which are the sequence chunks provided) to very large values by setting the
parameters n and L of the optimized base-level equation for each one. The first
parameter, 100000, specifies n and the second parameter, -1000, specifies the creation
time of the chunk. This ensures that they maintain a very high base-level activation and
do not fall below the retrieval threshold over the course of the task. The assumption is
that counting and the order of the alphabet are very well learned skills for both the model
and the human participants, and that the use of those skills does not lead to any significant
learning during the course of the experiment.
Another thing to notice is that the :ncnar parameter is set to nil in the starting model.
As with the paired model, if that parameter is set to t then it will take significantly longer to
run this model. You may want to set it to t to help with debugging your model as
you develop it, but you will probably want to set it back to nil once you start running the
model to compare its fit to the data.
Finally, because this model runs a lot of Lisp code to perform the experiment, you will see
improved performance in running the model if the Lisp code is compiled. Some Lisp
systems do this automatically (Macintosh Common Lisp, for example, does) but most do
not. Therefore, if you would like to improve the running time of the model, i.e., the real
time it takes to run the model through the experiment, not the simulated time the model
reports, then you should compile the file if your Lisp does not do so automatically.
Loading the model through the Environment will not compile the file. If you have a Lisp
with a GUI then one of the File menu options is likely “Compile and Load”. Using that
to load the model will result in the file being compiled and will improve performance.
One important thing to keep in mind is that if you change the model file you will have to
use the “Compile and Load” option again to load the file after you save the changes. The
“Reload” button on the Environment’s Control Panel and the reload command in ACT-R
will not automatically recompile the file if you change it. They will continue to load the
last compiled version, which will not reflect the changes you have made. Compiling the
model files is something that you will probably want to do for the remainder of the
models in the tutorial.