Vector Symbolic Architectures: A New Building Material For Artificial General Intelligence
1. Introduction
Perhaps more so than any other sub-field of computer science, artificial intelligence has
relied on the use of specialized data structures and algorithms to solve the broad variety
of problems that fall under the heading of “intelligence”. Although initial enthusiasm
about general problem-solving algorithms [1] was eventually supplanted by a “Society
of Mind” view of specialized agents acting in concert to accomplish different goals [2],
the dominant paradigm has always been one of discrete atomic symbols manipulated by
explicit rules.
The strongest challenge to this view came in the 1980s, with the emergence of connectionism [3], popularly (and somewhat misleadingly) referred to as neural networks. In contrast to the rigid, pre-specified solutions offered by “Good Old-Fashioned AI” (GOFAI), connectionism offered novel learning algorithms as solutions to a broad variety of problems. These solutions used mathematical tools like the learning of internal feature representations through supervised error correction, or back-propagation [4], the self-organization of features without supervision [5], and the construction of content-addressable memories through energy minimization [6]. In its most radical form, the connectionist ap-
Because the dimension of the tensor product increases with each binding operation, the size of the representation grows exponentially as more recursive embedding is performed. The solution is to collapse the N × N role/filler matrix back into a length-N vector. As shown in Figure 2, there are two ways of doing this. In Binary Spatter Coding, or BSC [15], only the elements along the main diagonal are kept, and the rest are discarded. If bit vectors are used, this operation is the same as taking the exclusive or (XOR) of the two vectors. In Holographic Reduced Representations, or HRR [16], the sum of each diagonal is taken, with wraparound (circular convolution) keeping the length of all diagonals equal. Both approaches use very large (N > 1000 elements) vectors of random values drawn from a fixed set or interval.
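To make the two binding operations concrete, here is a minimal sketch in Python with NumPy (the function and variable names are ours, not drawn from the VSA literature): BSC binds a role to a filler with elementwise XOR, while HRR binds them with circular convolution and recovers only a noisy copy of the filler on unbinding.

```python
import numpy as np

N = 1024                                   # VSA vectors are large (N > 1000)
rng = np.random.default_rng(0)

# --- BSC: binding is elementwise XOR of random bit vectors ---
role_b, filler_b = rng.integers(0, 2, (2, N))
bound_b = role_b ^ filler_b                # result stays length N
assert np.array_equal(bound_b ^ role_b, filler_b)  # XOR is its own inverse

# --- HRR: binding is circular convolution, computed via the FFT ---
def cconv(a, b):
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inv(a):                                # approximate convolutive inverse
    return np.roll(a[::-1], 1)

role_h, filler_h = rng.normal(0, 1 / np.sqrt(N), (2, N))
bound_h = cconv(role_h, filler_h)

# Unbinding returns only a *noisy* copy of the filler
noisy = cconv(bound_h, inv(role_h))
cos = noisy @ filler_h / (np.linalg.norm(noisy) * np.linalg.norm(filler_h))
print(f"similarity of recovered filler: {cos:.2f}")   # well above chance
```

Note that XOR is exactly invertible, while the HRR involution is only an approximate inverse; the residual noise motivates the cleanup memory discussed below.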
Despite the size of the vectors, VSA approaches are computationally efficient, requiring no costly backpropagation or other iterative algorithm, and can be performed in parallel. Even in a serial implementation, the BSC approach is O(N) for a vector of length N, and the HRR approach can be implemented using the Fast Fourier Transform, which is O(N log N). The price paid is that most of the crucial operations (circular convolution, vector addition) are a form of lossy compression that introduces noise into the representations. The introduction of noise requires that the unbinding process employ a “cleanup memory” to restore the fillers to their original form. The cleanup memory can be implemented using Hebbian auto-association, like a Hopfield Network [6] or Brain-State-in-a-Box model [17]. In such models the original fillers are attractor basins in the network’s dynamical state space. These methods can be simulated by using a table that stores the original vectors and returns the one closest to the noisy version.
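The table-based simulation mentioned above is easy to state in code. The following sketch (again Python/NumPy; the CleanupMemory class and its interface are our own invention, not a standard library) unbinds a noisy filler and restores it by nearest-neighbor lookup over the stored originals.

```python
import numpy as np

def cconv(a, b):                           # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inv(a):                                # approximate convolutive inverse
    return np.roll(a[::-1], 1)

class CleanupMemory:
    """Table that stores the original vectors and returns the stored name
    closest to a noisy probe (a stand-in for attractor-network dynamics)."""
    def __init__(self):
        self.names, self.vecs = [], []
    def add(self, name, v):
        self.names.append(name)
        self.vecs.append(v / np.linalg.norm(v))
    def cleanup(self, probe):
        sims = np.stack(self.vecs) @ (probe / np.linalg.norm(probe))
        return self.names[int(np.argmax(sims))]

N, rng = 1024, np.random.default_rng(0)
fillers = {w: rng.normal(0, 1 / np.sqrt(N), N) for w in ("cat", "dog", "fish")}
mem = CleanupMemory()
for name, v in fillers.items():
    mem.add(name, v)

role = rng.normal(0, 1 / np.sqrt(N), N)
bound = cconv(role, fillers["dog"])        # lossy binding introduces noise
noisy = cconv(bound, inv(role))            # unbinding yields a noisy filler
print(mem.cleanup(noisy))                  # -> 'dog'
```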
3. Applications
As a relatively new technology, VSA is just beginning to be used in the AI and cognitive science communities. Its support for compositional structure, associative memory, and efficient learning makes it an appealing “raw material” for a number of applications. In this concluding section we review some of these applications, and outline possibilities for future work.
Jones and Mewhort [18] report using a holographic/convolution approach similar to HRR to incorporate both word meaning and sequence information into a model lexicon. Their holographic BEAGLE model performed better than (300-dimensional) Latent Semantic Analysis [19] on a semantic-distance test, and, unlike LSA, BEAGLE can predict results from human experiments on word priming.
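BEAGLE's actual encoding uses a placeholder vector and a direction-sensitive variant of convolution, among other details we omit here; the following deliberately simplified sketch (Python/NumPy, all names ours) only illustrates how context and order information can be accumulated into a single lexical vector per word.

```python
import numpy as np

def cconv(a, b):                               # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

N, rng = 1024, np.random.default_rng(0)
p1, p2 = rng.permutation(N), rng.permutation(N)

def dconv(a, b):                               # direction-sensitive binding:
    return cconv(a[p1], b[p2])                 # permuting breaks commutativity

sentence = ["dogs", "chase", "cats"]
env = {w: rng.normal(0, 1 / np.sqrt(N), N) for w in sentence}  # fixed environment vectors
phi = rng.normal(0, 1 / np.sqrt(N), N)         # placeholder for the target word
mem = {w: np.zeros(N) for w in sentence}       # lexical vectors built by learning

for i, w in enumerate(sentence):
    # meaning: superpose the environment vectors of co-occurring words
    mem[w] += sum(env[o] for j, o in enumerate(sentence) if j != i)
    # order: bind the placeholder to each neighbor; left and right stay distinct
    if i > 0:
        mem[w] += dconv(env[sentence[i - 1]], phi)
    if i + 1 < len(sentence):
        mem[w] += dconv(phi, env[sentence[i + 1]])
```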
GOFAI has excelled at representing and reasoning with universally and existentially quantified variables; e.g., ∀x Computer(x) → HasBuggyProgram(x) ∨ Broken(x). It has been known for some time, however, that human performance on such reasoning tasks differs in interesting ways from simple deductive logic [22]. Recent work by Eliasmith [23] shows that HRR encodings of logical rules yield results similar to those seen in the experimental literature.
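Eliasmith's neural implementation differs in detail, so the following is only a hypothetical sketch of how a rule of this shape might be packed into a single HRR vector: predicate vectors are bound to antecedent and consequent role vectors and superposed, and unbinding the consequent role yields a noisy superposition of the two disjuncts.

```python
import numpy as np

def cconv(a, b):                           # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inv(a):                                # approximate convolutive inverse
    return np.roll(a[::-1], 1)

N, rng = 1024, np.random.default_rng(0)
antecedent, consequent, computer, buggy, broken = (
    rng.normal(0, 1 / np.sqrt(N), (5, N)))

# Pack  Computer(x) -> HasBuggyProgram(x) v Broken(x)  into one vector
rule = cconv(antecedent, computer) + cconv(consequent, buggy + broken)

# Unbinding the consequent role gives a noisy superposition of the disjuncts;
# a cleanup memory would resolve it to the two stored predicate vectors.
probe = cconv(rule, inv(consequent))
for name, v in [("computer", computer), ("buggy", buggy), ("broken", broken)]:
    sim = probe @ v / (np.linalg.norm(probe) * np.linalg.norm(v))
    print(f"{name}: {sim:.2f}")            # buggy, broken >> computer
```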
Arathorn’s Map Seeking Circuits (MSCs) [24] are recurrent neural networks for recognizing transformed images, using localist representations. We propose treating the localist MSC as an input recognition device, with its localist output values subsequently encoded into VSA representations indicating items in the agent’s environment and their spatial relationships to the agent. This would allow the representation and manipulation of multiple simultaneous items in the agent’s environment.
References
[1] Newell, A., Simon, H.A.: GPS, a program that simulates human thought. Lernende Automaten (1961)
109–124
[2] Minsky, M.L.: The Society of Mind. Simon and Schuster (1988)
[3] Rumelhart, D., McClelland, J., eds.: Parallel Distributed Processing: Explorations in the Microstructure
of Cognition. MIT Press (1986)
[4] Rumelhart, D., Hinton, G., Williams, R.: Learning internal representations by error propagation. In
Rumelhart, D., McClelland, J., eds.: Parallel Distributed Processing: Explorations in the Microstructure
of Cognition. Volume 1. MIT Press (1986)
[5] Kohonen, T.: Self-Organizing Maps. 3rd edn. Springer-Verlag, Secaucus, NJ (2001)
[6] Hopfield, J.: Neural networks and physical systems with emergent collective computational abilities.
Proceedings of the National Academy of Sciences 79 (1982) 2554–2558
[7] Elman, J.: Finding structure in time. Cognitive Science 14 (1990) 179–211
[8] McClelland, J., Rumelhart, D., Hinton, G.: The appeal of parallel distributed processing. In Rumelhart,
D., McClelland, J., eds.: Parallel Distributed Processing: Explorations in the Microstructure of Cogni-
tion. Volume 1. MIT Press (1986)
[9] Fodor, J., Pylyshyn, Z.: Connectionism and cognitive architecture: A critical analysis. Cognition 28
(1988) 3–71
[10] Pollack, J.: Recursive distributed representations. Artificial Intelligence 46 (1990) 77–105
[11] Minsky, M.L., Papert, S.A.: Perceptrons: Expanded Edition. MIT Press (1988)
[12] Voegtlin, T., Dominey, P.F.: Linear recursive distributed representations. Neural Networks 18(7) (2005)
878–895
[13] Gayler, R.: Vector symbolic architectures answer Jackendoff’s challenges for cognitive neuroscience.
In Slezak, P., ed.: ICCS/ASCS International Conference on Cognitive Science. CogPrints, Sydney, Aus-
tralia, University of New South Wales (2003) 133–138
[14] Smolensky, P.: Tensor product variable binding and the representation of symbolic structures in connec-
tionist systems. Artificial Intelligence 46 (1990) 159–216
[15] Kanerva, P.: The binary spatter code for encoding concepts at many levels. In Marinaro, M., Morasso,
P., eds.: ICANN ’94: Proceedings of International Conference on Artificial Neural Networks. Volume 1.,
London, Springer-Verlag (1994) 226–229
[16] Plate, T.: Holographic reduced representations. Technical Report CRG-TR-91-1, Department of Com-
puter Science, University of Toronto (1991)
[17] Anderson, J., Silverstein, J., Ritz, S., Jones, R.: Distinctive features, categorical perception, and probability learning: some applications of a neural model. Psychological Review 84(5) (1977) 413–451
[18] Jones, M.N., Mewhort, D.J.K.: Representing word meaning and order information in a composite holo-
graphic lexicon. Psychological Review 114 (2007)
[19] Landauer, T.K., Foltz, P.W., Laham, D.: An introduction to latent semantic analysis. Discourse Processes 25 (1998) 259–284
[20] Gentner, D., Toupin, C.: Systematicity and surface similarity in the development of analogy. Cognitive
Science 10 (1986) 277–300
[21] Eliasmith, C., Thagard, P.: Integrating structure and meaning: a distributed model of analogical mapping.
Cognitive Science 25(2) (2001) 245–286
[22] Wason, P.: Reasoning. In Foss, B., ed.: New Horizons in Psychology. Penguin, Harmondsworth (1966)
[23] Eliasmith, C.: Cognition with neurons: A large-scale, biologically realistic model of the Wason task. In Bara, B.G., Barsalou, L., Bucciarelli, M., eds.: Proceedings of the 27th Annual Meeting of the Cognitive Science Society. (2005)
[24] Arathorn, D.W.: Map-Seeking Circuits in Visual Cognition. Stanford University Press (2002)