Grammatical Theory
From transformational grammar to
constraint-based approaches
Fourth revised and extended edition
Stefan Müller
Textbooks in Language Sciences 1
Language Science Press
Textbooks in Language Sciences
In this series:
3. Freitas, Maria João & Ana Lúcia Santos (eds.). Aquisição de língua materna e não
materna: Questões gerais e dados do português.
ISSN: 2364-6209
Stefan Müller. 2020. Grammatical theory: From transformational grammar to
constraint-based approaches. Fourth revised and extended edition. (Textbooks in
Language Sciences 1). Berlin: Language Science Press.
This title can be downloaded at:
http://langsci-press.org/catalog/book/287
© 2020, Stefan Müller
Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0):
http://creativecommons.org/licenses/by/4.0/
ISBN: 978-3-96110-273-0 (Digital)
978-3-96110-277-8 (Hardcover)
978-3-96110-274-7 (Softcover)
ISSN: 2364-6209
DOI: 10.5281/zenodo.3992307
Source code available from www.github.com/langsci/25
Collaborative reading: paperhive.org/documents/remote?type=langsci&id=25
Contents
24 Conclusion 701
References 723
Index 827
  Name index 827
  Language index 841
  Subject index 843
Preface
This book is an extended and revised version of my German book Grammatiktheorie
(Müller 2013a). It introduces various grammatical theories that play a role in current
theorizing or have made contributions in the past which are still relevant today. I explain
some foundational assumptions and then apply the respective theories to what can be
called the “core grammar” of German. I have decided to stick to the object language that
I used in the German version of this book since many of the phenomena that will be
dealt with cannot be explained with English as the object language. Furthermore, many
theories have been developed by researchers with English as their native language and it
is illuminative to see these theories applied to another language. I show how the theories
under consideration deal with arguments and adjuncts, active/passive alternations, local
reorderings (so-called scrambling), verb position, and fronting of phrases over larger
distances (the verb-second property of the Germanic languages other than English).
The second part deals with foundational questions that are important for developing
theories. This includes a discussion of the question of whether we have innate domain-specific
knowledge of language (UG), the discussion of psycholinguistic evidence concerning
the processing of language by humans, a discussion of the status of empty elements
and of the question whether we construct and perceive utterances holistically or
rather compositionally, that is, whether we use phrasal or lexical constructions. The sec-
ond part is not intended as a standalone book although the printed version of the book
is distributed this way for technical reasons (see below). Rather it contains topics that
are discussed again and again when frameworks are compared. So instead of attaching
these discussions to the individual chapters they are organized in a separate part of the
book.
Unfortunately, linguistics is a scientific field with a considerable amount of termino-
logical chaos. I therefore wrote an introductory chapter that introduces terminology in
the way it is used later on in the book. The second chapter introduces phrase structure
grammars, which play a role in many of the theories that are covered in this book. I
use these two chapters (excluding Section 2.3 on interleaving phrase structure grammars
and semantics) in introductory courses of our BA curriculum for German studies.
Advanced readers may skip these introductory chapters. The following chapters are
structured in a way that should make it possible to understand the introduction of the
theories without any prior knowledge. The sections regarding new developments and
classification are more ambitious: they refer to chapters still to come and also point to
other publications that are relevant in the current theoretical discussion but cannot be
repeated or summarized in this book. These parts of the book address advanced stu-
dents and researchers. I use this book for teaching the syntactic aspects of the theories
in a seminar for advanced students in our BA. The slides are available on my web page.
The second part of the book, the general discussion, is more ambitious and contains the
discussion of advanced topics and current research literature.
This book only deals with relatively recent developments. For a historical overview,
see for instance Robins (1997); Jungen & Lohnstein (2006). I am aware of the fact that
chapters on Integrational Linguistics (Lieb 1983; Eisenberg 2004; Nolda 2007), Optimality
Theory (Prince & Smolensky 1993; Grimshaw 1997; G. Müller 2000), Role and Reference
Grammar (Van Valin 1993) and Relational Grammar (Perlmutter 1983; 1984) are missing.
I will leave these theories for later editions.
The original German book was planned to have 400 pages, but it finally was much
bigger: the first German edition has 525 pages and the second German edition has 564
pages. I added a chapter on Dependency Grammar and one on Minimalism to the English
version and now the book has 853 pages. I tried to represent the chosen theories appro-
priately and to cite all important work. Although the list of references is over 85 pages
long, I was probably not successful. I apologize for this and any other shortcomings.
Acknowledgments
I would like to thank David Adger, Jason Baldridge, Felix Bildhauer, Emily M. Ben-
der, Stefan Evert, Gisbert Fanselow, Sandiway Fong, Hans-Martin Gärtner, Kim Gerdes,
Adele Goldberg, Bob Levine, Paul Kay, Jakob Maché, Guido Mensching, Laura Michaelis,
Geoffrey Pullum, Uli Sauerland, Roland Schäfer, Jan Strunk, Remi van Trijp, Shravan Va-
sishth, Tom Wasow, and Stephen Wechsler for discussion and Monika Budde, Philippa
Cook, Laura Kallmeyer, Tibor Kiss, Gisela Klann-Delius, Jonas Kuhn, Timm Lichte, Anke
Lüdeling, Jens Michaelis, Bjarne Ørsnes, Andreas Pankau, Christian Pietsch, Frank Rich-
ter, Ivan Sag, and Eva Wittenberg for comments on earlier versions of the German edi-
tion of this book and Thomas Groß, Dick Hudson, Sylvain Kahane, Paul Kay, Haitao
Liu (刘海涛), Andrew McIntyre, Sebastian Nordhoff, Tim Osborne, Andreas Pankau, and
Christoph Schwarze for comments on earlier versions of this book. Thanks to Leonardo
Boiko and Sven Verdoolaege for pointing out typos. Special thanks go to Martin Haspel-
math for very detailed comments on an earlier version of the English book.
This book was the first Language Science Press book that had an open review phase
(see below). I thank Dick Hudson, Paul Kay, Antonio Machicao y Priemer, Andrew McIntyre,
Sebastian Nordhoff, and one anonymous open reviewer for their comments. These
comments are documented at the download page of this book. In addition, the book went
through a stage of community proofreading (see also below). Some of the proofreaders
did much more than proofreading; their comments are highly appreciated and I decided
to publish these comments as additional open reviews. Armin Buch, Leonel de Alencar,
Andreas Hölzl, Gianina Iordăchioaia, Timm Lichte, Antonio Machicao y Priemer, and
Neal Whitman deserve special mention here.
I thank Wolfgang Sternefeld and Frank Richter, who wrote a detailed review of the
German version of this book (Sternefeld & Richter 2012). They pointed out some mis-
takes and omissions that were corrected in the second edition of the German book and
which are of course not present in the English version.
Thanks to all the students who commented on the book and whose questions led to
improvements. Lisa Deringer, Aleksandra Gabryszak, Simon Lohmiller, Theresa Kallen-
bach, Steffen Neuschulz, Reka Meszaros-Segner, Lena Terhart and Elodie Winckel de-
serve special mention.
Since this book is built upon all my experience in the area of grammatical theory, I
want to thank all those with whom I ever discussed linguistics during and after talks at
conferences, workshops, summer schools or via email. Werner Abraham, John Bateman,
Dorothee Beermann, Rens Bod, Miriam Butt, Manfred Bierwisch, Ann Copestake, Hol-
ger Diessel, Kerstin Fischer, Dan Flickinger, Peter Gallmann, Petter Haugereid, Lars Hel-
lan, Tibor Kiss, Wolfgang Klein, Hans-Ulrich Krieger, Andrew McIntyre, Detmar Meu-
rers, Gereon Müller, Martin Neef, Manfred Sailer, Anatol Stefanowitsch, Peter Svenon-
ius, Michael Tomasello, Hans Uszkoreit, Gert Webelhuth, Daniel Wiechmann and Arne
Zeschel deserve special mention.
I thank Sebastian Nordhoff for a comment regarding the completion of the subject
index entry for recursion.
Andrew Murphy translated part of Chapter 1 and Chapters 2–3, 5–10, and 12–23.
Many thanks for this!
I also want to thank the 27 community proofreaders (Viola Auermann, Armin Buch,
Andreea Calude, Rong Chen, Matthew Czuba, Leonel de Alencar, Christian Döhler,
Joseph T. Farquharson, Andreas Hölzl, Gianina Iordăchioaia, Paul Kay, Anne Kilgus,
Sandra Kübler, Timm Lichte, Antonio Machicao y Priemer, Michelle Natolo, Stephanie
Natolo, Sebastian Nordhoff, Elizabeth Pankratz, Parviz Parsafar, Conor Pyle, Daniela
Schröder, Eva Schultze-Berndt, Alec Shaw, Benedikt Singpiel, Anelia Stefanova, Neal
Whitman, Viola Wiegand), who each worked on one or more chapters and really improved
this book. I got more comments from every one of them than I ever got for a
book done with a commercial publisher. Some comments were on content rather than
on typos and layout issues. No proofreader employed by a commercial publisher would
have spotted these mistakes and inconsistencies since commercial publishers do not have
staff that knows all the grammatical theories that are covered in this book.
During the past years, a number of workshops on theory comparison have taken place.
I was invited to three of them. I thank Helge Dyvik and Torbjørn Nordgård for inviting
me to the fall school for Norwegian PhD students Languages and Theories in Contrast,
which took place in 2005 in Bergen. Guido Mensching and Elisabeth Stark invited me to
the workshop Comparing Languages and Comparing Theories: Generative Grammar and
Construction Grammar, which took place in 2007 at the Freie Universität Berlin and An-
dreas Pankau invited me to the workshop Comparing Frameworks in 2009 in Utrecht. I
really enjoyed the discussion with all participants of these events and this book benefited
enormously from the interchange.
I thank Peter Gallmann for the discussion of his lecture notes on GB during my time
in Jena. Sections 3.1.3–3.4 have a structure that is similar to that of his lecture notes
and take over a lot from them. Thanks to David Reitter for the LaTeX macros for Combinatory
Categorial Grammar, to Mary Dalrymple and Jonas Kuhn for the LFG macros and example
structures, and to Laura Kallmeyer for the LaTeX sources of most of the TAG analyses.
Most of the trees have been adapted to the forest package because of compatibility issues
with XeLaTeX, but the original trees and texts were a great source of inspiration and
without them the figures in the respective chapters would not be half as pretty as they
are now.
I thank Sašo Živanović for implementing the LaTeX package forest. It really simplifies
typesetting of trees, dependency graphs, and type hierarchies. I also thank him
for individual help via email and on stackexchange. In general, those active on stackexchange
cannot be thanked enough: most of my questions regarding specific details
of the typesetting of this book or the implementation of the LaTeX classes that
are now used by Language Science Press were answered within several minutes.
Thank you! Since this book is a true open access book under the CC-BY license, it can
also be an open source book. The interested reader can find a copy of the source code at
https://github.com/langsci/25. By making the book open source I pass on the knowledge
provided by the LaTeX gurus and hope that others benefit from this and learn to typeset
their linguistics papers in nicer and/or more efficient ways.
Viola Auermann and Antje Bahlke, Sarah Dietzfelbinger, Lea Helmers, and Chiara
Jancke cannot be thanked enough for their work at the copy machines. Viola also helped
a lot with proofreading prefinal stages of the translation. I also want to thank my (former)
lab members Felix Bildhauer, Philippa Cook, Janna Lipenkova, Jakob Maché, Bjarne
Ørsnes and Roland Schäfer, who were mentioned above already for other reasons, for
their help with teaching. During the years from 2007 until the publication of the first
German edition of this book two of the three tenured positions in German Linguistics
were unfilled and I would not have been able to meet the teaching requirements
without their help and would never have finished the Grammatiktheorie book.
I thank Tibor Kiss for advice in questions of style. His diplomatic way always was a
shining example for me and I hope that this is also reflected in this book.
Stauffenburg. I think that this book has a broader relevance and should be accessible
for non-German-speaking readers as well. I therefore decided to have it translated into
English. Since Stauffenburg is focused on books in German, I had to look for another
publisher. Fortunately the situation in the publishing sector changed quite dramatically
in comparison to 1997: we now have high profile publishers with strict peer review that
are entirely open access. I am very glad about the fact that Brigitte Narr sold the rights
of my book back to me and that I can now publish the English version with Language
Science Press under a CC-BY license.
If you think that textbooks like this one should be freely available to whoever wants
to read them and that publishing scientific results should not be left to profit-oriented
publishers, then you can join the Language Science Press community and support us
in various ways: you can register with Language Science Press and have your name
listed on our supporter page with almost 600 other enthusiasts, you may devote your
time and help with proofreading and/or typesetting, or you may donate money for
specific books or for Language Science Press in general. We are also looking for in-
stitutional supporters like foundations, societies, linguistics departments or university
libraries. Detailed information on how to support us is provided at the following web-
page: http://langsci-press.org/supportUs. In case of questions, please contact me or the
Language Science Press coordinator at contact@langsci-press.org.
Section 13.1.8.2 and added lexical items that show that Pirahã-like modification without
recursion can be captured in a straightforward way in Categorial Grammar.
I reorganized the HPSG chapter to be in line with more recent approaches assuming
the valence features SPR and COMPS (Sag 1997; Müller 2019c) rather than a single valence
feature. I removed the section on the LOCAL feature in Sign-based Construction Grammar
(Section 10.6.2.2 in the first edition) since it was built on the wrong assumption that the
filler would be identical to the representation in the valence specification. In Sag (2012:
536) only the information in SYN and SEM is shared.
I added the example (60) on page 630 that shows a difference in choice of preposition
in a prepositional object in Dutch vs. German. Since the publication of the first En-
glish edition of the Grammatical Theory textbook I worked extensively on the phrasal
approach to benefactive constructions in LFG (Asudeh, Giorgolo & Toivonen 2014). Sec-
tion 21.2.2 was revised and adapted to what will be published as Müller (2018a). There
is now a brief chapter on complex predicates in TAG and Categorial Grammar/HPSG
(Chapter 22), that shows that valence-based approaches allow for an underspecification
of structure. Valence is potential structure, while theories like TAG operate with actual
structure.
Apart from this I fixed several minor typos, added and updated some references and
URLs. Thanks to Philippa Cook, Timm Lichte, and Antonio Machicao y Priemer for
pointing out typos. Thanks to Leonel Figueiredo de Alencar, Francis Bond, John Carroll,
Alexander Koller, Emily M. Bender, and Glenn C. Slayden for pointers to literature. Sašo
Živanović helped adapt version 2.0 of the forest package so that it could be used
with this large book. I am very grateful for this nice tree typesetting package and all the
work that went into it.
The source code of the book and the version history are available on GitHub. Issues
can be reported there: https://github.com/langsci/25. The book is also available on paperhive,
a platform for collective reading and annotation:
https://paperhive.org/documents/remote?type=langsci&id=25. It would be great if you would leave comments there.
added a figure depicting the architecture assumed in Minimalist theories with Phases
(right figure in Figure 4.1).
I thank Frank Van Eynde for pointing out eight typos in his review of the first edition.
They have been fixed. He also pointed out that the placement of ARG-ST in the feature
geometry of signs in HPSG did not correspond to Ginzburg & Sag (2000), where ARG-ST
is on the top level rather than under CAT. Note that earlier versions of this book had ARG-
ST under CAT and there had never been proper arguments for why it should not be there,
which is why many practitioners of HPSG have kept it in that position (Müller 2018a).
One reason to keep ARG-ST on the top level is that ARG-ST is appropriate for lexemes only.
If ARG-ST is on the sign level, this can be represented in the type hierarchy: lexemes and
words have an ARG-ST feature, phrases do not. If ARG-ST is on the CAT level, one would
have to distinguish between CAT values that belong to lexemes and words on the one
hand and phrasal CAT values on the other hand, which would require two additional
subtypes of the type cat. The most recent version of the computer implementation done
in Stanford by Dan Flickinger has ARG-ST under LOCAL (2019-01-24). So, I was tempted
to leave everything as it was in the second edition of the book. However, there is a real
argument for not having ARG-ST under CAT. CAT is assumed to be shared in coordinations
and CAT contains valence features for subjects and complements. The values of these
valence features are determined by a mapping from ARG-ST. In some analyses, extracted
elements are not mapped to the valence features and the same is sometimes assumed for
omitted elements. To take an example consider (1):
(1) He saw and helped the hikers.
saw and helped are coordinated and the members in the valence lists have to be compati-
ble. Now if one coordinates a ditransitive verb with one omitted argument with a strictly
transitive verb, this would work under the assumption that the omitted argument is not
part of the valence representation. But if ARG-ST is part of CAT, coordination would be
made impossible since a three-place argument structure list would be incompatible with
a two-place list. Hence I decided to change this in the third edition and represent ARG-ST
outside of CAT from now on.
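The coordination argument above can be made concrete with a small sketch. It is a deliberate simplification and not an HPSG implementation: feature values are plain strings, the entries are hypothetical, and compatibility is modelled as list equality rather than unification. The function names (make_verb, cat_value, can_coordinate) are invented for this illustration.

```python
# Toy model: a verb has an ARG-ST list and valence lists (SUBJ, COMPS).
# Assumption: omitted arguments stay on ARG-ST but are not mapped to valence.

def make_verb(arg_st, omitted=()):
    """Build a toy lexical entry from an argument structure list."""
    realized = [arg for i, arg in enumerate(arg_st) if i not in omitted]
    return {
        "ARG-ST": list(arg_st),
        "SUBJ": realized[:1],
        "COMPS": realized[1:],
    }

def cat_value(verb, arg_st_in_cat):
    """Return the information that coordination would have to share."""
    features = ["SUBJ", "COMPS"] + (["ARG-ST"] if arg_st_in_cat else [])
    return {feature: verb[feature] for feature in features}

def can_coordinate(v1, v2, arg_st_in_cat):
    """Equality of the shared values stands in for unification here."""
    return cat_value(v1, arg_st_in_cat) == cat_value(v2, arg_st_in_cat)

# Hypothetical entries: a strictly transitive verb and a ditransitive verb
# whose third argument (index 2 on ARG-ST) is omitted.
transitive = make_verb(["NP[subj]", "NP[obj]"])
ditransitive_omitted = make_verb(["NP[subj]", "NP[obj]", "NP[obj2]"], omitted={2})

print(can_coordinate(transitive, ditransitive_omitted, arg_st_in_cat=False))  # True
print(can_coordinate(transitive, ditransitive_omitted, arg_st_in_cat=True))   # False
```

With ARG-ST outside of CAT only the valence lists are compared, and they match. With ARG-ST inside CAT the three-place list clashes with the two-place list.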
I changed the section about Sign-Based Construction Grammar (SBCG) again. An
argument about nonlocal dependencies and locality was not correct, since Sag (2012:
166) does not share all information between filler and extraction site. The argument is
now revised and presented as Section 10.6.2.3. Reviewing Müller (2020b), Bob Borsley
pointed out to me that the XARG feature is a way to circumvent locality restrictions that
is actually used in SBCG. I added a footnote to the section on locality in SBCG.
A brief discussion of Welke’s (2019) analysis of the German clause structure was added
to the chapter about Construction Grammar (see Section 10.3).
The analysis of a verb-second sentence in LFG is now part of the LFG chapter (Fig-
ure 7.5 on page 242) and not just an exercise in the appendix. A new exercise was de-
signed instead of the old one and the old one was integrated into the main text.
I added a brief discussion of Osborne’s (2019) claim that Dependency Grammars are
simpler than phrase structure grammars (p. 411).
Geoffrey Pullum pointed out at the HPSG conference in 2019 that the label constraint-
based may not be the best for the theories that are usually referred to with it. Changing
the term in this work would require changing the title of the book. The label model-theoretic
may be more appropriate, but those doing implementational work in HPSG and LFG
that does not consider models may find the term inappropriate. I hence decided to stick to the
established term.
I followed the advice by Lisbeth Augustinus and added a preface to Part II of the book
that gives the reader some orientation as to what to expect.
I thank Mikhail Knyazev for pointing out to me that the treatment of V to I to C
movement in the German literature differs from the lowering that is assumed for English
and that some further references are needed in the chapter on Government & Binding.
Working on the Chinese translation of this book, Wang Lulu pointed out some typos
and a wrong example sentence in Chinese. Thanks for these comments!
I thank Bob Borsley, Gisbert Fanselow, Hubert Haider and Pavel Logacev for discussion,
Ina Baier for pointing out a mistake in a CG proof, and Jonas Benn for pointing out some
typos to me. Thanks to Tabea Reiner for a comment on gradedness. Thanks also to An-
tonio Machicao y Priemer for yet another set of comments on the second edition and to
Elizabeth Pankratz for proofreading parts of what I changed.
Part I
1.1 Why do syntax?
The sentences in (4a,b) show that a singular or a plural subject requires a verb with the
corresponding inflection. In (4a,b), the verb only requires one argument so the function
of die Frau ‘the woman’ and die Mädchen ‘the girls’ is clear. In (4c,d) the verb requires
two arguments and die Frau ‘the woman’ and die Mädchen ‘the girls’ could appear in
either argument position in German. The sentences could mean that the woman knows
somebody or that somebody knows the woman. However, due to the inflection on the
verb and knowledge of the syntactic rules of German, the hearer knows that there is
only one available reading for (4c) and (4d), respectively.
It is the role of syntax to discover, describe and explain such rules, patterns and struc-
tures.
1 Introduction and basic terms
1.3 Constituents
If we consider the sentence in (5), we have the intuition that certain words form a unit.
(5) Alle Studenten lesen während dieser Zeit Bücher.
all students read during this time books
‘All the students are reading books at this time.’
For example, the words alle ‘all’ and Studenten ‘students’ form a unit which says some-
thing about who is reading. während ‘during’, dieser ‘this’ and Zeit ‘time’ also form a
unit which refers to a period of time during which the reading takes place, and Bücher
‘books’ says something about what is being read. The first unit is itself made up of two
parts, namely alle ‘all’ and Studenten ‘students’. The unit während dieser Zeit ‘during
this time’ can also be divided into two subcomponents: während ‘during’ and dieser Zeit
‘this time’. dieser Zeit ‘this time’ is also composed of two parts, just like alle Studenten
‘all students’ is.
Recall that in connection with (1c) above we talked about the sets of Russian nesting
dolls (matryoshkas). Here, too, when we break down (5) we have smaller units which are
components of bigger units. However, in contrast to the Russian dolls, we do not just
have one smaller unit contained in a bigger one but rather, we can have several units
which are grouped together in a bigger one. The best way to envisage this is to imagine
a system of boxes: one big box contains the whole sentence. Inside this box, there are
four other boxes, which each contain alle Studenten ‘all students’, lesen ‘read’, während
dieser Zeit ‘during this time’ and Bücher ‘books’, respectively. Figure 1.1 illustrates this.
In the following section, I will introduce various tests which can be used to show how
certain words seem to “belong together” more than others. When I speak of a word se-
quence, I generally mean an arbitrary linear sequence of words which do not necessarily
need to have any syntactic or semantic relationship, e.g., Studenten lesen während ‘stu-
dents read during’ in (5). A sequence of words which form a structural entity, on the
other hand, is referred to as a phrase. Phrases can consist of words as in this time or of
combinations of words with other phrases as in during this time. The parts of a phrase
and the phrase itself are called constituents. So all elements that are in a box in Figure 1.1
are constituents of the sentence.
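The box metaphor can be sketched as a nested data structure. The following is an illustration under stated assumptions rather than an analysis from the text: the box labels are placeholders, single words are treated as leaves, and the names SENTENCE, words, constituents and is_constituent are invented here.

```python
# Each box is a (label, children) pair; a child is either a word or another box.
# The labels are placeholders, not syntactic categories.
SENTENCE = ("sentence", [
    ("unit", ["alle", "Studenten"]),
    ("unit", ["lesen"]),
    ("unit", ["während", ("unit", ["dieser", "Zeit"])]),
    ("unit", ["Bücher"]),
])

def words(box):
    """Flatten a box (or a single word) into its list of words."""
    if isinstance(box, str):
        return [box]
    _, children = box
    return [w for child in children for w in words(child)]

def constituents(box):
    """Return the word string of every box, outermost first."""
    if isinstance(box, str):
        return []
    _, children = box
    found = [" ".join(words(box))]
    for child in children:
        found.extend(constituents(child))
    return found

def is_constituent(sequence, box=SENTENCE):
    """A word sequence counts as a constituent here if it fills a box exactly."""
    return sequence in constituents(box)

for c in constituents(SENTENCE):
    print(c)

print(is_constituent("dieser Zeit"))              # True
print(is_constituent("Studenten lesen während"))  # False
```

The arbitrary word sequence Studenten lesen während spans parts of three boxes without filling any of them, which is why it is not returned as a constituent.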
Following these preliminary remarks, I will now introduce some tests which will help
us to identify whether a particular string of words is a constituent or not.
1.3.1.1 Substitution
If it is possible to replace a sequence of words in a sentence with a different sequence
of words and the acceptability of the sentence remains unaffected, then this constitutes
evidence for the fact that each sequence of words forms a constituent.
In (6), den Mann ‘the man’ can be replaced by the string eine Frau ‘a woman’. This is
an indication that both of these word sequences are constituents.
(6) a. Er kennt [den Mann].
he knows the man
‘He knows the man.’
b. Er kennt [eine Frau].
he knows a woman
‘He knows a woman.’
Similarly, in (7a), the string das Buch zu lesen ‘the book to read’ can be replaced by dem
Kind das Buch zu geben ‘the child the book to give’.
(7) a. Er versucht, [das Buch zu lesen].
he tries the book to read
‘He is trying to read the book.’
b. Er versucht, [dem Kind das Buch zu geben].
he tries the child the book to give
‘He is trying to give the child the book.’
This test is referred to as the substitution test.
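The mechanics of the test can be sketched in a few lines. Acceptability itself cannot be computed, so the sketch assumes a hand-made set of judgments (ACCEPTABLE) that stands in for a speaker's intuitions. The function names are invented for this illustration, and sentences are written without final punctuation to keep the tokenization trivial.

```python
def substitute(sentence, old, new):
    """Replace the first occurrence of the word sequence `old` with `new`."""
    tokens, old_t, new_t = sentence.split(), old.split(), new.split()
    for i in range(len(tokens) - len(old_t) + 1):
        if tokens[i:i + len(old_t)] == old_t:
            return " ".join(tokens[:i] + new_t + tokens[i + len(old_t):])
    raise ValueError(f"{old!r} does not occur in {sentence!r}")

# Assumed judgments: these sentences are taken to be acceptable, all others not.
ACCEPTABLE = {"Er kennt den Mann", "Er kennt eine Frau"}

def passes_substitution(sentence, old, new, judgments=ACCEPTABLE):
    """True if the sentence is acceptable before and after the replacement."""
    return sentence in judgments and substitute(sentence, old, new) in judgments

print(passes_substitution("Er kennt den Mann", "den Mann", "eine Frau"))   # True
print(passes_substitution("Er kennt den Mann", "kennt den", "eine Frau"))  # False
```

The second call replaces a sequence that crosses a constituent boundary and yields a string outside the assumed judgments, so the test gives no evidence for constituent status there.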
1.3.1.2 Pronominalization
Everything that can be replaced by a pronoun forms a constituent. In (8), one can for
example refer to der Mann ‘the man’ with the pronoun er ‘he’:
(8) a. [Der Mann] schläft.
the man sleeps
‘The man is sleeping.’
b. Er schläft.
he sleeps
‘He is sleeping.’
It is also possible to use a pronoun to refer to constituents such as das Buch zu lesen ‘the
book to read’ in (7a), as is shown in (9):
(9) a. Peter versucht, [das Buch zu lesen].
Peter tries the book to read
‘Peter is trying to read the book.’
b. Klaus versucht das auch.
Klaus tries that also
‘Klaus is trying to do that as well.’
The pronominalization test is another form of the substitution test.
2 I use the following notational conventions for all examples: ‘*’ indicates that a sentence is ungrammatical,
‘#’ denotes that the sentence has a reading which differs from the intended one and finally ‘§’ should be
understood as a sentence which is deviant for semantic or information-structural reasons, for example,
because the subject must be animate, but is in fact inanimate in the example in question, or because there
is a conflict between constituent order and the marking of given information through the use of pronouns.
1.3.1.5 Fronting
Fronting is a further variant of the movement test. In German declarative sentences,
only a single constituent may normally precede the finite verb:
(15) a. [Alle Studenten] lesen während der vorlesungsfreien Zeit Bücher.
all students read.3PL during the lecture.free time books
‘All students read books during the semester break.’
b. [Bücher] lesen alle Studenten während der vorlesungsfreien Zeit.
books read all students during the lecture.free time
c. * [Alle Studenten] [Bücher] lesen während der vorlesungsfreien Zeit.
all students books read during the lecture.free time
d. * [Bücher] [alle Studenten] lesen während der vorlesungsfreien Zeit.
books all students read during the lecture.free time
The possibility for a sequence of words to be fronted (that is to occur in front of the finite
verb) is a strong indicator of constituent status.
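The pattern in (15) can be checked mechanically once a bracketing is given. The sketch below assumes that each sentence is supplied as a list of constituents and that the finite verb is known. It only counts what precedes the verb and does not decide constituent status itself. The names prefield and obeys_verb_second are invented for this illustration.

```python
def prefield(constituents, finite_verb):
    """Return the constituents that precede the finite verb."""
    return constituents[:constituents.index(finite_verb)]

def obeys_verb_second(constituents, finite_verb):
    """True if exactly one constituent precedes the finite verb."""
    return len(prefield(constituents, finite_verb)) == 1

# Bracketings corresponding to (15a) to (15d).
ex_a = ["Alle Studenten", "lesen", "während der vorlesungsfreien Zeit", "Bücher"]
ex_b = ["Bücher", "lesen", "alle Studenten", "während der vorlesungsfreien Zeit"]
ex_c = ["Alle Studenten", "Bücher", "lesen", "während der vorlesungsfreien Zeit"]
ex_d = ["Bücher", "alle Studenten", "lesen", "während der vorlesungsfreien Zeit"]

for example in (ex_a, ex_b, ex_c, ex_d):
    print(obeys_verb_second(example, "lesen"))
```

Under these assumptions the check succeeds for (15a,b) and fails for the starred examples (15c,d), in line with the judgments given in the text. Since the text says that only a single constituent may normally precede the finite verb, the check is a heuristic and not an exceptionless rule.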
1.3.1.6 Coordination
If two sequences of words can be conjoined then this suggests that each sequence forms
a constituent.
In (16), der Mann ‘the man’ and die Frau ‘the woman’ are conjoined and the entire
coordination is the subject of the verb arbeiten ‘to work’. This is a good indication of the
fact that der Mann and die Frau each form a constituent.
(16) [Der Mann] und [die Frau] arbeiten.
the man and the woman work.3PL
‘The man and the woman work.’
The example in (17) shows that phrases with to-infinitives can be conjoined:
(17) Er hat versucht, [das Buch zu lesen] und [es dann unauffällig verschwinden zu lassen].
he has tried the book to read and it then secretly disappear to let
‘He tried to read the book and then make it quietly disappear.’
1.3.2.1 Expletives
There is a particular class of pronouns – so-called expletives – which do not denote
people, things, or events and are therefore non-referential. An example of this is es
‘it’ in (18).
(18) a. Es regnet.
it rains
‘It is raining.’
b. Regnet es?
rains it
‘Is it raining?’
c. dass es jetzt regnet
that it now rains
‘that it is raining now’
As the examples in (18) show, es can either precede the verb, or follow it. It can also be
separated from the verb by an adverb, which suggests that es should be viewed as an
independent unit.
Nevertheless, we observe certain problems with the aforementioned tests. Firstly, es
‘it’ is restricted with regard to its movement possibilities, as (19a) and (20b) show.
(19) a. * dass jetzt es regnet
that now it rains
Intended: ‘that it is raining now’
b. dass jetzt keiner klatscht
that now nobody claps
‘that nobody is clapping now’
(20) a. Er sah es regnen.
he saw it.ACC rain
‘He saw that it was raining.’
b. * Es sah er regnen.
it.ACC saw he rain
Intended: ‘He saw that it was raining.’
c. Er sah einen Mann klatschen.
he saw a.ACC man clap
‘He saw a man clapping.’
d. Einen Mann sah er klatschen.
a.ACC man saw he clap
‘A man, he saw clapping.’
Unlike the accusative object einen Mann ‘a man’ in (20c,d), the expletive in (20b) cannot
be fronted.
Secondly, substitution and question tests also fail:
(21) a. * Der Mann / er regnet.
the man he rains
b. * Wer / was regnet?
who what rains
Similarly, the coordination test cannot be applied either:
(22) * Es und der Mann regnet / regnen.
it and the man rains rain
The failure of these tests can be easily explained: weakly stressed pronouns such as es are
preferably placed before other arguments, directly after the conjunction (dass in (18c))
and directly after the finite verb in (20a) (see Abraham 1995: 570). If an element is placed
in front of the expletive, as in (19a), then the sentence is rendered ungrammatical. The
reason for the ungrammaticality of (20b) is the general ban on accusative es appearing
in clause-initial position. Although such cases exist, they are only possible if es ‘it’ is
referential (Lenerz 1994: 162; Gärtner & Steinbach 1997: 4).
The fact that we could not apply the substitution and question tests is also no longer
mysterious as es is not referential in these cases. We can only replace es ‘it’ with another
expletive such as das ‘that’. If we replace the expletive with a referential expression, we
derive a different semantic interpretation. It does not make sense to ask about something
semantically empty or to refer to it with a pronoun.
It follows from this that not all of the tests must deliver a positive result for a sequence
of words to count as a constituent. That is, passing the tests is not a necessary requirement
for constituent status.
1.3.2.2 Movement
The movement test is problematic for languages with relatively free constituent order,
since it is not always possible to tell what exactly has been moved. For example, the
string gestern dem Mann ‘yesterday the man’ occupies different positions in the following
examples:
1.3.2.3 Fronting
As mentioned in the discussion of (15), the position in front of the finite verb is normally
occupied by a single constituent. The possibility for a given word sequence to be placed
in front of the finite verb is sometimes even treated as a clear indicator of constituent status
and used in the definition of Satzglied.3 An example of this is taken from Bußmann
(1983), but is no longer present in Bußmann (1990):4
Satzglied test A procedure based on → topicalization used to analyze complex con-
stituents. Since topicalization only allows a single constituent to be moved to the
beginning of the sentence, complex sequences of constituents, for example adverb
phrases, can be shown to actually consist of one or more constituents. In the ex-
ample Ein Taxi quält sich im Schrittempo durch den Verkehr ‘A taxi was struggling
at walking speed through the traffic’, im Schrittempo ‘at walking speed’ and durch
den Verkehr ‘through the traffic’ are each constituents as both can be fronted inde-
pendently of each other. (Bußmann 1983: 446)
The preceding quote has the following implications:
• Some part of a piece of linguistic material can be fronted independently →
This material does not form a constituent.
3 Satzglied is a special term used in grammars of German, referring to a constituent on the clause level
(Eisenberg et al. 2005: 783).
4 The original formulation is: Satzgliedtest [Auch: Konstituententest]. Auf der → Topikalisierung beruhendes
Verfahren zur Analyse komplexer Konstituenten. Da bei Topikalisierung jeweils nur eine Konstituente
bzw. ein → Satzglied an den Anfang gerückt werden kann, lassen sich komplexe Abfolgen von Konstituenten
(z. B. Adverbialphrasen) als ein oder mehrere Satzglieder ausweisen; in Ein Taxi quält sich im
Schrittempo durch den Verkehr sind im Schrittempo und durch den Verkehr zwei Satzglieder, da sie beide
unabhängig voneinander in Anfangsposition gerückt werden können.
5 tagesschau, 15.10.2002, 20:00.
6 taz berlin, 10.07.1998, p. 22.
7 Zeitschrift für Dialektologie und Linguistik, LXIX, 3/2002, p. 339.
8 These data can be explained by assuming a silent verbal head preceding the finite verb, which ensures
that there is in fact just one constituent in initial position in front of the finite verb (Müller 2005c;
2017a). Nevertheless, data of this kind are problematic for constituent tests since these tests have been
specifically designed to tease apart whether strings such as trocken and durch die Stadt or wenig and mit
Sprachgeschichte in (30) form a constituent.
(30) a. Man kommt am Wochenende auch mit der BVG trocken durch die
one comes at.the weekend also with the BVG dry through the
Stadt.
city
‘With the BVG, you can be sure to get around town dry at the weekend.’
b. Der dritte Beitrag in dieser Rubrik hat wenig mit Sprachgeschichte zu
the third contribution in this section has little with language.history to
tun.
do
‘The third contribution in this section has little to do with language history.’
The possibility for a given sequence of words to be fronted is therefore not a sufficient
diagnostic for constituent status.
We have also seen that it makes sense to treat expletives as constituents despite the
fact that the accusative expletive cannot be fronted (cf. (20b)):
(31) a. Er bringt es bis zum Professor.
he brings EXPL until to.the professor
‘He makes it to professor.’
b. # Es bringt er bis zum Professor.
it brings he until to.the professor
There are other elements that cannot be fronted either. Inherent reflexives are a good
example of this:
(32) a. Karl hat sich nicht erholt.
Karl has REFL not recovered
‘Karl hasn’t recovered.’
b. * Sich hat Karl nicht erholt.
REFL has Karl not recovered
It follows from this that fronting is not a necessary criterion for constituent status. Therefore,
the possibility for a given word string to be fronted is neither a necessary nor a sufficient
condition for constituent status.
1.3.2.4 Coordination
Coordinated structures such as those in (33) also prove to be problematic:
(33) Deshalb kaufte der Mann einen Esel und die Frau ein Pferd.
therefore bought the man a donkey and the woman a horse
‘Therefore, the man bought a donkey and the woman a horse.’
At first glance, der Mann einen Esel ‘the man a donkey’ and die Frau ein Pferd ‘the woman
a horse’ in (33) seem to be coordinated. Does this mean that der Mann einen Esel and die
Frau ein Pferd each form a constituent?
As other constituent tests show, this assumption is not plausible. This sequence of
words cannot be moved together as a unit:9
(34) * Der Mann einen Esel kaufte deshalb.
the man a donkey bought therefore
Replacing the supposed constituent is also not possible without ellipsis:
(35) a. # Deshalb kaufte er.
therefore bought he
b. * Deshalb kaufte ihn.
therefore bought him
The pronouns do not stand in for the two logical arguments of kaufen ‘to buy’, which
are realized by der Mann ‘the man’ and einen Esel ‘a donkey’ in (33), but rather for one
each. Analyses have been proposed for examples such as (33), however, in which
two verbs kaufte ‘bought’ occur, only one of which is overt (Crysmann 2008). The
example in (33) would therefore correspond to:
(36) Deshalb kaufte der Mann einen Esel und kaufte die Frau ein Pferd.
therefore bought the man a donkey and bought the woman a horse
This means that although it seems as though der Mann einen Esel ‘the man a donkey’
and die Frau ein Pferd ‘the woman a horse’ are coordinated, it is actually kaufte der Mann
einen Esel ‘bought the man a donkey’ and (kaufte) die Frau ein Pferd ‘bought the woman a
horse’ which are conjoined.
We should take the following from the previous discussion: even when a given word
sequence passes certain constituent tests, this does not mean that one can automatically
infer from this that we are dealing with a constituent. That is, the tests we have seen are
not sufficient conditions for constituent status.
Summing up, it has been shown that these tests are neither sufficient nor necessary
for attributing constituent status to a given sequence of words. However, as long as one
keeps the problematic cases in mind, the previous discussion should be enough to get
an initial idea about what should be treated as a constituent.
1.4 Parts of speech
Each of the words is subject to certain restrictions when forming sentences. It is common
practice to group words into classes with other words which share certain salient prop-
erties. For example, der ‘the’ is an article, Biber ‘beaver’ is a noun, schwimmt ‘swims’ is
a verb and jetzt ‘now’ is an adverb. As can be seen in (38), it is possible to replace all the
words in (37) with words from the same word class.
(38) Die kleine Raupe frisst immer.
the small caterpillar eats always
‘The small caterpillar is always eating.’
This is not always the case, however. For example, it is not possible to use a verb such
as verschlingt ‘devours’ or the second-person form schwimmst in (38). This means that
the categorization of words into parts of speech is rather coarse and that we will have to
say a lot more about the properties of a given word. In this section, I will discuss various
word classes/parts of speech and in the following sections I will go into further detail
about the various properties which characterize a given word class.
The most important parts of speech are verbs, nouns, adjectives, prepositions and ad-
verbs. In earlier decades, it was common among researchers working on German (see
also Section 11.6.1 on Tesnière’s category system) to speak of action words, describing
words, and naming words. These descriptions prove problematic, however, as illustrated
by the following examples:
(39a) does not describe a concrete entity, (39b) describes a time interval and (39c) and
(39d) describe actions. It is clear that Idee ‘idea’, Stunde ‘hour’, Sprechen ‘speaking’ and
Erörterung ‘discussion’ differ greatly in terms of their meaning. Nevertheless, these
words still behave like Raupe ‘caterpillar’ and Biber ‘beaver’ in many respects and are
therefore classed as nouns.
The term action word is not used in scientific linguistic work as verbs do not always
need to denote actions:
(40) a. Ihm gefällt das Buch.
him pleases the book
‘He likes the book.’
of these forms constitute the inflectional paradigm of a verb. Tense (present, preterite,
future), mood (indicative, subjunctive, imperative), person (1st, 2nd, 3rd) and number
(singular, plural) all play a role in the inflectional paradigm. Certain forms can coincide
in a paradigm, as (43c) and (43e) and (43d) and (43f) show.
Parallel to verbs, nouns also have an inflectional paradigm:
(44) a. der Mann
the.NOM man
b. des Mannes
the.GEN man.GEN
c. dem Mann
the.DAT man
d. den Mann
the.ACC man
e. die Männer
the.NOM men
f. der Männer
the.GEN men
g. den Männern
the.DAT men.DAT
h. die Männer
the.ACC men
We can differentiate between nouns on the basis of gender (feminine, masculine, neuter).
The choice of gender is often purely formal in nature and is only partially influenced by
biological sex or the fact that we are describing a particular object:
(45) a. die Tüte
the.F bag(F)
‘the bag’
b. der Krampf
the.M cramp(M)
‘cramp’
c. das Kind
the.N child(N)
‘the child’
As well as gender, case (nominative, genitive, dative, accusative) and number are also
important for nominal paradigms.
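The paradigm in (44) can also be written down as a small table indexed by case and number. The following Python sketch is only an illustration: the dictionary layout and the names PARADIGM and syncretic_cells are assumptions made for this example, not part of any grammatical theory discussed here. It stores the forms from (44) and finds the forms that fill more than one cell of the paradigm.

```python
from collections import defaultdict

# Forms of "der Mann" as listed in (44): (case, number) -> article + noun.
# The dictionary layout is an assumption made for this illustration.
PARADIGM = {
    ("nom", "sg"): "der Mann",
    ("gen", "sg"): "des Mannes",
    ("dat", "sg"): "dem Mann",
    ("acc", "sg"): "den Mann",
    ("nom", "pl"): "die Männer",
    ("gen", "pl"): "der Männer",
    ("dat", "pl"): "den Männern",
    ("acc", "pl"): "die Männer",
}


def syncretic_cells(paradigm):
    """Return the forms that occupy more than one paradigm cell."""
    cells_by_form = defaultdict(list)
    for cell, form in paradigm.items():
        cells_by_form[form].append(cell)
    return {form: cells for form, cells in cells_by_form.items() if len(cells) > 1}


if __name__ == "__main__":
    for form, cells in syncretic_cells(PARADIGM).items():
        print(form, cells)
```

For the combination of article and noun in (44), only the nominative and accusative plural coincide. If the bare noun forms were stored instead, more cells would share a form.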
Like nouns, adjectives inflect for gender, case and number. They differ from nouns,
however, in that gender marking is variable. Adjectives can be used with all three gen-
ders:
The parts of speech discussed thus far can all be differentiated in terms of their inflec-
tional properties. For words which do not inflect, we have to use additional criteria. For
example, we can classify words by the syntactic context in which they occur (as we did
for the non-inflecting adjectives above). We can identify prepositions, adverbs, conjunc-
tions, interjections and sometimes also particles. Prepositions are words which occur
with a noun phrase whose case they determine:
(50) a. in diesen Raum
in this.ACC room
b. in diesem Raum
in this.DAT room
wegen ‘because’ is often classed as a preposition although it can also occur after the noun
and in these cases would technically be a postposition:
(51) des Geldes wegen
the money.GEN because
‘because of the money’
It is also possible to speak of adpositions if one wishes to remain neutral about the exact
position of the word.
Unlike prepositions, adverbs do not require a noun phrase.
(52) a. Er schläft in diesem Raum.
he sleeps in this room
b. Er schläft dort.
he sleeps there
Sometimes adverbs are simply treated as a special variant of prepositions (see page 94).
The explanation for this is that a prepositional phrase such as in diesem Raum ‘in this
room’ shows the same syntactic distribution as the corresponding adverbs. in differs
from dort ‘there’ in that it needs an additional noun phrase. These differences are parallel
to what we have seen with other parts of speech. For instance, the verb schlafen ‘sleep’
requires only a noun phrase, whereas erkennen ‘recognize’ requires two.
(53) a. Er schläft.
he sleeps
b. Peter erkennt ihn.
Peter recognizes him
Conjunctions can be subdivided into subordinating and coordinating conjunctions.
Coordinating conjunctions include und ‘and’ and oder ‘or’. In coordinate structures, two
units with the same syntactic properties are combined. They occur adjacent to one an-
other. dass ‘that’ and weil ‘because’ are subordinating conjunctions because the clauses
that they introduce can be part of a larger clause and depend on another element of this
larger clause.
Figure 1.2: Decision tree for determining parts of speech following Eisenberg et al. (2005:
133). Surviving labels: part of speech; comparative / no comparative; verb, noun, article word,
pronoun, adjective, adverb, conjunction, preposition, interjection.
10 The Duden is the official document for the German orthography. The Duden grammar does not have an
official status but is very influential and is used for educational purposes as well. I will refer to it several
times in this introductory chapter.
If a word inflects for tense, then it is a verb. If it displays different case forms, then
one has to check if it has a fixed gender. If this is indeed the case, then we know that
we are dealing with a noun. Words with variable gender have to be checked to see if
they have comparative forms. A positive result will be a clear indication of an adjective.
All other words are placed into a residual category, which the Duden refers to as
pronouns/article words. As in the class of non-inflecting elements, the elements in
this remnant category are subdivided according to their syntactic behavior. The Duden
grammar makes a distinction between pronouns and article words. According to this
classification, pronouns are words which can replace a noun phrase such as der Mann
‘the man’, whereas article words normally combine with a noun. In Latin grammars,
the notion of ‘pronoun’ includes both pronouns in the above sense and articles, since
the forms with and without the noun are identical. Over the past centuries, the forms
have undergone split development to the point where it is now common in contempo-
rary Romance languages to distinguish between words which replace a noun phrase and
those which must occur with a noun. Elements which belong to the latter class are also
referred to as determiners.
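The inflection-based part of this classification procedure can be sketched as a short function. This is a minimal Python illustration under stated assumptions: the class WordProfile, its boolean attributes and the function classify are hypothetical names introduced here, and the property values have to be supplied by hand. Nothing is computed from actual word forms.

```python
from dataclasses import dataclass


@dataclass
class WordProfile:
    """Hand-assigned morphological properties of a word (hypothetical names)."""
    inflects_for_tense: bool = False
    inflects_for_case: bool = False
    fixed_gender: bool = False
    has_comparative: bool = False


def classify(word: WordProfile) -> str:
    """Follow the inflectional tests in the order given in the text."""
    if word.inflects_for_tense:
        return "verb"
    if word.inflects_for_case:
        if word.fixed_gender:
            return "noun"
        if word.has_comparative:
            return "adjective"
        return "pronoun/article word"
    # Non-inflecting words need syntactic criteria, which are not modelled here.
    return "non-inflecting"


if __name__ == "__main__":
    print(classify(WordProfile(inflects_for_tense=True)))                       # verb
    print(classify(WordProfile(inflects_for_case=True, fixed_gender=True)))     # noun
    print(classify(WordProfile(inflects_for_case=True, has_comparative=True)))  # adjective
    print(classify(WordProfile(inflects_for_case=True)))                        # pronoun/article word
```

A word with none of these properties ends up in the non-inflecting group, where only syntactic criteria can decide between adverb, conjunction, preposition and interjection.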
If we follow the decision tree in Figure 1.2, the personal pronouns ich ‘I’, du ‘you’, er
‘he’, sie ‘she’, es ‘it’, wir ‘we’, ihr ‘you’, and sie ‘they’, for example, would be grouped
together with the possessive pronouns mein ‘my’, dein ‘your’, sein ‘his’/‘its’, ihr ‘her’/
‘their’, unser ‘our’, and euer ‘your’. The corresponding reflexive pronouns, mich ‘myself’,
dich ‘yourself’, sich ‘himself’/‘herself’/‘itself’/‘themselves’, uns ‘ourselves’, euch ‘yourselves’,
and the reciprocal pronoun einander ‘each other’ have to be viewed as a special
case in German as there are no differing gender forms of sich ‘himself’/‘herself’/‘itself’
and einander ‘each other’. Case is not expressed morphologically by reciprocal pronouns.
By replacing genitive, dative and accusative pronouns with einander, it is possible to see
that there must be variants of einander ‘each other’ in these cases, but these variants all
share the same form:
So-called pronominal adverbs such as darauf ‘on there’, darin ‘in there’, worauf ‘on
where’, worin ‘in where’ also prove problematic. These forms consist of a preposition
(e.g., auf ‘on’) and the elements da ‘there’ and wo ‘where’. As the name suggests,
pronominal adverbs contain something pronominal and this can only be da ‘there’ and
wo ‘where’. However, da ‘there’ and wo ‘where’ do not inflect and would therefore,
following the decision tree, not be classed as pronouns.
The same is true of relative pronouns such as wo ‘where’ in (56):
(56) a. Ich komme eben aus der Stadt, wo ich Zeuge eines Unglücks gewesen
I come PART from the city where I witness of.an accident been
bin.11
am
‘I come from the city where I was witness to an accident.’
b. Studien haben gezeigt, daß mehr Unfälle in Städten passieren, wo die
studies have shown that more accidents in cities happen where the
Zebrastreifen abgebaut werden, weil die Autofahrer unaufmerksam
zebra.crossings removed become because the drivers inattentive
werden.12
become
‘Studies have shown that there are more accidents in cities where they do away
with zebra crossings, because drivers become inattentive.’
c. Zufällig war ich in dem Augenblick zugegen, wo der Steppenwolf
coincidentally was I in the moment present where the Steppenwolf
zum erstenmal unser Haus betrat und bei meiner Tante sich einmietete.13
to.the first.time our house entered and by my aunt REFL took.lodgings
‘Coincidentally, I was present at the exact moment in which Steppenwolf en-
tered our house for the first time and took lodgings with my aunt.’
If they are uninflected, then they cannot belong to the class of pronouns according to
the decision tree above. Eisenberg (2004: 277) notes that wo ‘where’ is a kind of unin-
flected relative pronoun (he uses quotation marks) and remarks that this term runs con-
trary to the exclusive use of the term pronoun for nominal, that is, inflected, elements.
He therefore uses the term relative adverb for them (see also Eisenberg et al. (2005: §856,
§857)).
There are also usages of the relatives dessen ‘whose’ and wessen ‘whose’ in combina-
tion with a noun:
(57) a. der Mann, dessen Schwester ich kenne
the man whose sister I know
b. Ich möchte wissen, wessen Schwester du kennst.
I would.like know whose sister you know
‘I would like to know whose sister you know.’
According to the classification in the Duden, these should be covered by the terms Rel-
ativartikelwort ‘relative article word’ and Interrogativartikelwort ‘interrogative article
word’. They are mostly counted as part of the relative pronouns and question pronouns
(see for instance Eisenberg (2004: 229)). Using Eisenberg’s terminology, this is unprob-
lematic as he does not make a distinction between articles, pronouns and nouns, but
rather assigns them all to the class of nouns. But authors who do make a distinction
between articles and pronouns sometimes also speak of interrogative pronouns when
discussing words which can function as articles or indeed replace an entire noun phrase.
11 Drosdowski (1984: 672).
12 taz berlin, 03.11.1997, p. 23.
13 Hermann Hesse, Der Steppenwolf. Berlin und Weimar: Aufbau-Verlag. 1986, p. 6.
One should be prepared for the fact that the term pronoun is often simply used for
words which refer to other entities and, this is important, not in the way that nouns
such as book and John do, but rather dependent on context. The personal pronoun er
‘he’ can, for example, refer to either a table or a man. This usage of the term pronoun
runs contrary to the decision tree in Figure 1.2 and includes uninflected elements such
as da ‘there’ and wo ‘where’.
Expletive pronouns such as es ‘it’ and das ‘that’, as well as the sich ‘himself’/‘herself’/‘itself’
belonging to inherently reflexive verbs, do not make reference to actual objects. They
are considered pronouns because of the similarity in form. Even if we were to assume a
narrow definition of pronouns, we would still get the wrong results as expletive forms
do not vary with regard to case, gender and number. If one does everything by the book,
expletives would belong to the class of uninflected elements. If we assume that expletive es ‘it’,
like the personal pronouns, has a nominative and an accusative variant with the same
form, then it would be placed with the nominals. We would then have to admit
that the assumption that es has gender does not make sense. That is, we would have to
count es as a noun by assuming neuter gender, by analogy with the personal pronouns.
We have not yet discussed how we would deal with the italicized words in (58):
(58) a. das geliebte Spielzeug
the beloved toy
b. das schlafende Kind
the sleeping child
c. die Frage des Sprechens und Schreibens über Gefühle
the question of.the talking and writing about feelings
‘the question of talking and writing about feelings’
d. Auf dem Europa-Parteitag fordern die Grünen einen ökosozialen
on the Europe-party.conference demand the Greens a eco-social
Politikwechsel.
political.change
‘At the European party conference, the Greens demanded eco-social political
change.’
e. Max lacht laut.
Max laughs loudly
f. Max würde wahrscheinlich lachen.
Max would probably laugh
geliebte ‘beloved’ and schlafende ‘sleeping’ are participle forms of lieben ‘to love’ and
schlafen ‘to sleep’. These forms are traditionally treated as part of the verbal paradigm.
In this sense, geliebte and schlafende are verbs. This is referred to as lexical word class.
The term lexeme is relevant in this case. All forms in a given inflectional paradigm belong
to the relevant lexeme. In the classic sense, this term also includes the regularly derived
forms. That is, participle forms and nominalized infinitives also belong to a verbal lexeme.
Not all linguists share this view, however. Particularly problematic is the fact that
we are mixing verbal with nominal and adjectival paradigms. For example, Sprechens
‘speaking.GEN’ is in the genitive case and adjectival participles also inflect for case, number
and gender. Furthermore, it is unclear why schlafende ‘sleeping’ should be
assigned to a verbal lexeme while a noun such as Störung ‘disturbance’ is its own lexeme
and does not belong to the lexeme stören ‘to disturb’. I subscribe to the more modern
view of grammar and assume that processes in which a word class is changed result in
a new lexeme being created. Consequently, schlafende ‘sleeping’ does not belong to the
lexeme schlafen ‘to sleep’, but is a form of the lexeme schlafend. This lexeme belongs to
the word class ‘adjective’ and inflects accordingly.
As we have seen, it is still controversial where to draw the line between inflection
and derivation (creation of a new lexeme). Sag, Wasow & Bender (2003: 263–264) view
the formation of the present participle (standing) and the past participle (eaten) in English
as derivation since the corresponding forms inflect for gender and number in French.
Adjectives such as Grünen ‘the Greens’ in (58d) are nominalized adjectives and are
written with a capital like other nouns in German when there is no other noun that can
be inferred from the immediate context:
(59) A: Willst du den roten Ball haben?
want you the red ball have
‘Do you want the red ball?’
B: Nein, gib mir bitte den grünen.
no give me please the green
‘No, give me the green one, please.’
In B’s answer in (59), the noun Ball has been omitted. This kind of omission is not
present in (58d). One could also assume here that a word class change has taken place.
If a word changes its class without combination with a visible affix, we refer to this as
conversion. Conversion has been treated as a sub-case of derivation by some linguists.
The problem is, however, that Grüne ‘greens’ inflects just like an adjective and the gender
varies depending on the object it is referring to:
(60) a. Ein Grüner hat vorgeschlagen, …
a green.M has suggested
‘A (male) member of the Green Party suggested …’
b. Eine Grüne hat vorgeschlagen, …
a green.F has suggested
‘A (female) member of the Green Party suggested …’
We also have the situation where a word has two properties. We can make life easier
for ourselves by talking about nominalized adjectives. The lexical category of Grüne is
adjective and its syntactic category is noun.
The word in (58e) can inflect like an adjective and should therefore be classed as an
adjective following our tests. Sometimes, these kinds of adjectives are also classed as
adverbs. The reason for this is that the uninflected forms of these adjectives behave like
adverbs:
1.5 Heads
The head of a constituent/phrase is the element which determines the most important
properties of the constituent/phrase. At the same time, the head also determines the
composition of the phrase. That is, the head requires certain other elements to be present
in the phrase. The heads in the following examples have been marked in italics:
(64) a. Träumt dieser Mann?
dreams this.NOM man
‘Does this man dream?’
b. Erwartet er diesen Mann?
expects he.NOM this.ACC man
‘Is he expecting this man?’
c. Hilft er diesem Mann?
helps he.NOM this.DAT man
‘Is he helping this man?’
d. in diesem Haus
in this.DAT house
e. ein Mann
a.NOM man
Verbs determine the case of their arguments (subjects and objects). In (64d), the preposi-
tion determines which case the noun phrase diesem Haus ‘this house’ bears (dative) and
also determines the semantic contribution of the phrase (it describes a location). (64e)
is controversial: there are linguists who believe that the determiner is the head (Venne-
mann & Harlow 1977; Brame 1982; Hudson 1984: 90–92; Hellan 1986; Abney 1987; Netter
1994; 1998) while others assume that the noun is the head of the phrase (Van Langen-
donck 1994; Pollard & Sag 1994: 49; Demske 2001; Müller 2007a: Section 6.6.1; Hudson
2004; Bruening 2009).
The combination of a head with another constituent is called a projection of the head.
A projection which contains all the necessary parts to create a well-formed phrase of
that type is a maximal projection. A sentence is the maximal projection of a finite verb.
Figure 1.3 shows the structure of (65) in box representation.
(65) Der Mann liest einen Aufsatz.
the man reads an essay
‘The man is reading an essay.’
Unlike Figure 1.1, the boxes have been labelled here.
Figure 1.3 (box representation of (65), given here as labelled bracketing):
[VP [NP [Det der] [N Mann]] [V liest] [NP [Det einen] [N Aufsatz]]]
The annotation includes the category of the most important element in the box. VP
stands for verb phrase and NP for noun phrase. VP and NP are maximal projections of
their respective heads.
Anyone who has ever faced the hopeless task of trying to find particular photos of
their sister’s wedding in a jumbled, unsorted cupboard can vouch for the fact that it is
most definitely a good idea to mark the boxes based on their content and also mark the
albums based on the kinds of photos they contain.
An interesting point is that the exact content of the box with linguistic material does
not play a role when the box is put into a larger box. It is possible, for example, to replace
the noun phrase der Mann ‘the man’ with er ‘he’, or indeed the more complex der Mann
aus Stuttgart, der das Seminar zur Entwicklung der Zebrafinken besucht ‘the man from
Stuttgart who takes part in the seminar on the development of zebra finches’. However,
it is not possible to use die Männer ‘the men’ or des Mannes ‘of the man’ in this position:
(66) a. * Die Männer liest einen Aufsatz.
the men reads an essay
b. * Des Mannes liest einen Aufsatz.
of.the man.GEN reads an essay
The reason for this is that die Männer ‘the men’ is plural whereas the verb liest ‘reads’
is singular. The genitive noun phrase des Mannes cannot occur here either; only noun
phrases in the nominative case can. It is therefore important to mark all boxes with the
information that is important for placing these boxes into larger boxes. Figure 1.4 shows
our example with more detailed annotation.
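Figure 1.4: (65) in box representation with more detailed annotation; the outermost box is labelled VP, fin.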
The features of a head which are relevant for determining in which contexts a phrase
can occur are called head features. The features are said to be projected by the head.
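The role that such annotations play can be illustrated with a small sketch. The Python code below is a simplification under stated assumptions: the classes NP and FiniteVerb and the function can_be_subject are hypothetical names, only case and number are recorded, and person is ignored. It checks whether a noun phrase can fill the subject position of liest ‘reads’ as in (65) and (66).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NP:
    """A noun phrase annotated with two features (hypothetical class)."""
    form: str
    case: str    # "nom", "gen", "dat" or "acc"
    number: str  # "sg" or "pl"


@dataclass(frozen=True)
class FiniteVerb:
    """A finite verb annotated with its number (person is ignored here)."""
    form: str
    number: str


def can_be_subject(np: NP, verb: FiniteVerb) -> bool:
    """The subject must be nominative and agree with the verb in number."""
    return np.case == "nom" and np.number == verb.number


liest = FiniteVerb("liest", "sg")

if __name__ == "__main__":
    for np in (NP("der Mann", "nom", "sg"),
               NP("er", "nom", "sg"),
               NP("die Männer", "nom", "pl"),
               NP("des Mannes", "gen", "sg")):
        print(np.form, can_be_subject(np, liest))
```

The internal make-up of the noun phrase plays no role in the check: er and der Mann behave alike because they carry the same annotation.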
1.6 Arguments and adjuncts
an agent and the direct object is a beneficiary. Arguments which fulfil a semantic role
are also called actants. This term is also used for inanimate objects.
This kind of relation between a head and its arguments is covered by the terms selec-
tion and valence. Valence is a term borrowed from chemistry. Atoms can combine with
other atoms to form molecules with varying levels of stability. The way in which the
electron shells are occupied plays an important role for this stability. If an atom combines
with other atoms so that its electron shell is fully occupied, then this will lead to
a stable connection. Valence tells us something about the number of hydrogen atoms
which an atom of a certain element can be combined with. In forming H₂O, oxygen
has a valence of 2. We can divide elements into valence classes. Following Mendeleev,
elements with a particular valence are listed in the same column in the periodic table.
The concept of valence was applied to linguistics by Tesnière (1959): a head needs
certain arguments in order to form a stable compound. Words with the same valence –
that is, words which require the same number and type of arguments – are divided into valence
classes. Figure 1.5 shows examples from chemistry as well as linguistics.
Figure 1.5: Combination of hydrogen and oxygen and the combination of a verb with its
arguments (O combines with H and H; helps combines with Kim and Sandy)
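The idea of valence classes can be made concrete with a toy lexicon. The following Python sketch rests on assumptions made purely for illustration: the labels such as NP[nom], the dictionary VALENCE and the two functions are hypothetical, and the frames are simplified versions of the verbs used in the examples of this chapter. It groups verbs by the arguments they require and checks whether a given set of arguments saturates a verb.

```python
# Toy lexicon: verb -> tuple of required arguments (simplified, hypothetical labels).
VALENCE = {
    "schlafen": ("NP[nom]",),
    "erkennen": ("NP[nom]", "NP[acc]"),
    "erwarten": ("NP[nom]", "NP[acc]"),
    "helfen": ("NP[nom]", "NP[dat]"),
}


def is_saturated(verb, arguments):
    """True if the arguments are exactly those the verb requires, in any order."""
    return sorted(VALENCE[verb]) == sorted(arguments)


def valence_classes(lexicon):
    """Group verbs that require the same number and type of arguments."""
    classes = {}
    for verb, frame in lexicon.items():
        classes.setdefault(frame, []).append(verb)
    return classes


if __name__ == "__main__":
    print(is_saturated("erkennen", ["NP[acc]", "NP[nom]"]))
    print(is_saturated("erkennen", ["NP[nom]"]))
    for frame, verbs in valence_classes(VALENCE).items():
        print(frame, verbs)
```

In this toy lexicon, erkennen and erwarten fall into the same valence class, while schlafen and helfen each form a class of their own.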
We used (67) to explain logical valence. Logical valence can, however, sometimes differ
from syntactic valence. This is the case with verbs like rain, which require an expletive
pronoun as an argument. Inherently reflexive verbs such as sich erholen ‘to recover’ in
German are another example.
(68) a. Es regnet.
it rains
‘It is raining.’
b. Klaus erholt sich.
Klaus recovers REFL
‘Klaus is recovering.’
The expletive es ‘it’ with weather verbs and the sich of so-called inherent reflexives such
as erholen ‘to recover’ have to be present in the sentence. Germanic languages have
expletive elements that are used to fill the position preceding the finite verb. These
positional expletives are not realized in embedded clauses in German, since embedded
clauses have a structure that differs from canonical unembedded declarative clauses,
which have the finite verb in second position. (69a) shows that es cannot be omitted in
dass-clauses.
An interesting case is the verb sich befinden ‘to be located’, which expresses the lo-
cation of something. This cannot occur without some information about the location
pertaining to the verb:
(80) * Wir befinden uns.
we are.located REFL
The exact form of this information is not fixed – neither the syntactic category nor the
preposition inside of prepositional phrases is restricted:
(81) Wir befinden uns hier / unter der Brücke / neben dem Eingang / im Bett.
we are REFL here under the bridge next.to the entrance in bed
‘We are here/under the bridge/next to the entrance/in bed.’
Local modifiers such as hier ‘here’ or unter der Brücke ‘under the bridge’ are analyzed
with regard to other verbs (e.g., schlafen ‘sleep’) as adjuncts. For verbs such as sich
befinden ‘to be (located)’, we will most likely have to assume that information about
location forms an obligatory syntactic argument of the verb.
The verb selects a phrase with information about location, but does not place any syn-
tactic restrictions on its type. This specification of location behaves semantically like
the other adjuncts we have seen previously. If I just consider the semantic aspects of
the combination of a head and adjunct, then I also refer to the adjunct as a modifier.15
Arguments specifying location with verbs such as sich befinden ‘to be located’ are also
subsumed under the term modifier. Modifiers are normally adjuncts, and therefore op-
tional, whereas in the case of sich befinden they seem to be (obligatory) arguments.
In conclusion, we can say that constituents that are required to occur with a certain
head are arguments of that head. Furthermore, constituents which fulfil a semantic role
with regard to the head are also arguments. These kinds of arguments can, however,
sometimes be optional.
Arguments are normally divided into subjects and complements.16 Not all heads re-
quire a subject (see Müller 2007a: Section 3.2). The number of arguments of a head can
therefore also correspond to the number of complements of a head.
1.7 Grammatical functions
1.7.1 Subjects
Although I assume that the reader has a clear intuition about what a subject is, it is by
no means a trivial matter to arrive at a definition of the word subject which can be used
cross-linguistically. For German, Reis (1982) suggested the following syntactic properties
as definitional for subjects:
• agreement of the finite verb with it
• nominative case in non-copular clauses
• omitted in infinitival clauses (control)
• optional in imperatives
I have already discussed agreement in conjunction with the examples in (4). Reis (1982)
argues that the second bullet point is a suitable criterion for German. She formulates a
restriction to non-copular clauses because there can be more than one nominative argument
in sentences with predicate nominals such as (82):
(82) a. Er ist ein Lügner.
he.NOM is a liar.NOM
‘He is a liar.’
b. Er wurde ein Lügner genannt.
he.NOM was a liar.NOM called
‘He was called a liar.’
Following this criterion, arguments in the dative case such as den Männern ‘the men’
cannot be classed as subjects in German:
(83) a. Er hilft den Männern.
he helps the.DAT men.DAT
‘He is helping the men.’
b. Den Männern wurde geholfen.
the.DAT men.DAT were.3SG helped
‘The men were helped.’
Following the other criteria, datives should also not be classed as subjects – as Reis (1982)
has shown. In (83b), wurde, which is the 3rd person singular form, does not agree with
den Männern. The third of the aforementioned criteria deals with infinitive constructions
such as those in (84):
(84) a. Klaus behauptet, den Männern zu helfen.
Klaus claims the.DAT men.DAT to help
‘Klaus claims to be helping the men.’
It should be noted that there are different opinions on the question of whether clausal
arguments should be treated as subjects or not. As recent publications show, there is
still some discussion in Lexical Functional Grammar (see Chapter 7) (Dalrymple & Lødrup
2000; Berman 2003b; 2007; Alsina, Mohanan & Mohanan 2005; Forst 2006).
If we can be clear about what we want to view as a subject, then the definition of object
is no longer difficult: objects are all other arguments whose form is directly determined
by a given head. As well as clausal objects, German has genitive, dative, accusative and
prepositional objects:
As well as defining objects by their case, it is commonplace to talk of direct objects and
indirect objects. The direct object gets its name from the fact that – unlike the indirect
object – the referent of a direct object is directly affected by the action denoted by the
verb. With ditransitives such as the German geben ‘to give’, the accusative object is the
direct object and the dative is the indirect object.
(92) dass er dem Mann den Aufsatz gibt
that he.NOM the.DAT man.DAT the.ACC essay.ACC gives
‘that he gives the man the essay’
For trivalent verbs (verbs taking three arguments), we see that the verb can take either
an object in the genitive case (93a) or, for verbs with a direct object in the accusative, a
second accusative object (93b):
(93) a. dass er den Mann des Mordes bezichtigte
that he the.ACC man.ACC the.GEN murder.GEN accused
‘that he accused the man of murder’
b. dass er den Mann den Vers lehrte
that he the.ACC man.ACC the.ACC verse.ACC taught
‘that he taught the man the verse’
These kinds of objects are sometimes also referred to as indirect objects.
Normally, only those objects which are promoted to subject in passives with werden ‘to
be’ are classed as direct objects. This is important for theories such as LFG (see Chapter 7)
since passivization is defined with reference to grammatical function. With two-place
verbal predicates, the dative is not normally classed as a direct object (Cook 2006).
(94) dass er dem Mann hilft
that he the.DAT man.DAT helps
‘that he helps the man’
In many theories, grammatical function does not form a primitive component of the the-
ory, but rather corresponds to positions in a tree structure. The direct object in German
is therefore the object which is first combined with the verb in a configuration assumed
to be the underlying structure of German sentences. The indirect object is the second
object to be combined with the verb. On this view, the dative object of helfen ‘to help’
would have to be viewed as a direct object.
In the following, I will simply refer to the case of objects and avoid using the terms
direct object and indirect object.
In the same way as with subjects, we consider whether there are object clauses which
are equivalent to a certain case and can fill the respective grammatical function of a
direct or indirect object. If we assume that dass du sprichst ‘that you are speaking’ in
(95a) is a subject, then the subordinate clause must be a direct object in (95b):
(95) a. Dass du sprichst, wird erwähnt.
that you speak is mentioned
‘The fact that you’re speaking is being mentioned.’
b. Er erwähnt, dass du sprichst.
he mentions that you speak
‘He mentions that you are speaking.’
In this case, we cannot really view the subordinate clause as the accusative object since
it does not bear case. However, we can replace the sentence with an accusative-marked
noun phrase:
(96) Er erwähnt diesen Sachverhalt.
he mentions this.ACC matter
‘He mentions this matter.’
If we want to avoid this discussion, we can simply call these arguments clausal objects.
b. Er arbeitet vergleichend.
he works comparatively
‘He does comparative work.’
c. Er arbeitet in der Universität.
he works in the university
‘He works at the university.’
d. Er arbeitet den ganzen Tag.
he works the whole day.ACC
‘He works all day.’
e. Er arbeitet, weil es ihm Spaß macht.
he works because it him.DAT fun makes
‘He works because he enjoys it.’
Although the noun phrase in (97d) bears accusative case, it is not an accusative object.
den ganzen Tag ‘the whole day’ is a so-called temporal accusative. The occurrence of
accusative in this case has to do with the syntactic and semantic function of the noun
phrase; it is not determined by the verb. These kinds of accusatives can occur with a
variety of verbs, even with verbs that do not normally require an accusative object:
1.7.3 Predicatives
Adjectives like those in (100a,b) as well as noun phrases such as ein Lügner ‘a liar’ in
(100c) are counted as predicatives.
Only ihn ‘him’ can be described as an object in (101a). In (101b), ihn becomes the subject
and therefore bears nominative case. einen Lügner ‘a liar’ refers to ihn ‘him’ in (101a)
17
There is some dialectal variation with regard to copula constructions: in Standard German, the case of
the noun phrase with sein ‘to be’ is always nominative and does not change when embedded under lassen
‘to let’. According to Drosdowski (1995: § 1259), in Switzerland the accusative form is common which one
finds in examples such as (ii.a).
and to er ‘he’ in (101b) and agrees in case with the noun over which it predicates. This
is also referred to as agreement case.
For other predicative constructions see Eisenberg et al. (2005: § 1206) and Müller
(2002a: Chapter 4, Chapter 5) and Müller (2008).
1.8 A topological model of the German clause
• verb-final clauses
• verb-first (initial) clauses
• verb-second (V2) clauses
discontinuous. We can then divide the German clause into various sub-parts on the ba-
sis of these distinctions. In (104b) and (104c), the verb and the auxiliary form a “bracket”
around the clause. For this reason, we call this the sentence bracket (Satzklammer). The
finite verbs in (104b) and (104c) form the left bracket and the non-finite verbs form the
right bracket. Clauses with verb-final order are usually introduced by conjunctions such
as weil ‘because’, dass ‘that’ and ob ‘whether’. These conjunctions occupy the same po-
sition as the finite verb in verb-initial or verb-second clauses. We therefore also assume
that these conjunctions form the left bracket in these cases. Using the notion of the sen-
tence bracket, it is possible to divide the structure of the German clause into the prefield
(Vorfeld), middle field (Mittelfeld) and postfield (Nachfeld). The prefield describes every-
thing preceding the left sentence bracket, the middle field is the section between the left
and right bracket and the postfield describes the position after the right bracket. Tables 1.1
and 1.2 give some examples of this. The right bracket can contain multiple verbs
and is often referred to as a verbal complex or verb cluster. The assignment of question
words and relative pronouns to the prefield will be discussed in the following section.
Table 1.1: Examples of how topological fields can be occupied in declarative main clauses
Table 1.2: Examples of how topological fields can be occupied in yes/no questions, imperatives,
exclamatives and various verb-final sentences including adverbial clauses, interrogative and
relative clauses
19
Spiegel, 12/1999, p. 258.
20
Michail Bulgakow, Der Meister und Margarita. München: Deutscher Taschenbuch Verlag. 1997, p. 422.
How should we analyze relative clauses such as die er kennt ‘that he knows’? Do they
form part of the middle field or the postfield? This can be tested using a test developed
by Bech (1955: 72) (Rangprobe): first, we modify the example in (108) so that it is in
the perfect. Since non-finite verb forms occupy the right bracket, we can clearly see
the border between the middle field and postfield. The examples in (109) show that the
relative clause cannot occur in the middle field unless it is part of a complex constituent
with the head noun Frau ‘woman’.
(109) a. Er hat [der Frau] das Buch gegeben, [die er kennt].
he has the woman the book given that he knows
‘He has given the book to the woman that he knows.’
b. * Er hat [der Frau] das Buch, [die er kennt,] gegeben.
he has the woman the book that he knows given
c. Er hat [der Frau, die er kennt,] das Buch gegeben.
he has the woman that he knows the book given
This test does not help if the relative clause is realized together with its head noun at the
end of the sentence as in (110):
(110) Er gibt das Buch der Frau, die er kennt.
he gives the book the woman that he knows
‘He gives the book to the woman that he knows.’
If we put the example in (110) in the perfect, then we observe that the lexical verb can
occur before or after the relative clause:
(111) a. Er hat das Buch [der Frau] gegeben, [die er kennt].
he has the book the woman given that he knows
‘He has given the book to the woman he knows.’
b. Er hat das Buch [der Frau, die er kennt,] gegeben.
he has the book the woman that he knows given
In (111a), the relative clause has been extraposed. In (111b) it forms part of the noun phrase
der Frau, die er kennt ‘the woman that he knows’ and therefore occurs inside the NP in
the middle field. It is therefore not possible to rely on this test for (110). We assume that
the relative clause in (110) also belongs to the NP since this is the simplest structure. If
the relative clause were in the postfield, we would have to assume that it has undergone
extraposition from its position inside the NP. That is, we would have to assume the NP-
structure anyway and then extraposition in addition.
We have a similar problem with interrogative and relative pronouns. Depending on
the author, these are assumed to be in the left bracket (Kathol 2001; Dürscheid 2003:
94–95; Eisenberg 2004: 403; Pafel 2011: 57) or the prefield (Eisenberg et al. 2005: §1345;
Wöllstein 2010: 29–30, Section 3.1) or even in the middle field (Altmann & Hofman 2004:
75). In Standard German interrogative or relative clauses, both fields are never simulta-
neously occupied. For this reason, it is not immediately clear to which field an element
belongs. Nevertheless, we can draw parallels to main clauses: the pronouns in interrog-
ative and relative clauses can be contained inside complex phrases:
(112) a. der Mann, [mit dem] du gesprochen hast
the man with whom you spoken have
‘the man you spoke to’
b. Ich möchte wissen, [mit wem] du gesprochen hast.
I want.to know with whom you spoken have
‘I want to know who you spoke to.’
Normally, only individual words (conjunctions or verbs) can occupy the left bracket,23
whereas words and phrases can appear in the prefield. It therefore makes sense to assume
that interrogative and relative pronouns (and phrases containing them) also occur in this
position.
Furthermore, it can be observed that the dependency between the elements in the
Vorfeld of declarative clauses and the remaining sentence is of the same kind as the
dependency between the phrase that contains the relative pronoun and the remaining
sentence. For instance, über dieses Thema ‘about this topic’ in (113a) depends on Vortrag
‘talk’, which is deeply embedded in the sentence: einen Vortrag ‘a talk’ is an argument of
zu halten ‘to hold’, which in turn is an argument of gebeten ‘asked’.
(113) a. Über dieses Thema habe ich ihn gebeten, einen Vortrag zu halten.
about this topic have I him asked a talk to hold
‘I asked him to give a talk about this topic.’
b. das Thema, über das ich ihn gebeten habe, einen Vortrag zu halten
the topic about which I him asked have a talk to hold
‘the topic about which I asked him to give a talk’
The situation is similar in (113b): the relative phrase über das ‘about which’ is a dependent
of Vortrag ‘talk’ which is realized far away from it. Thus, if the relative phrase is assigned
to the Vorfeld, it is possible to say that such nonlocal frontings always target the Vorfeld.
Finally, the Duden grammar (Eisenberg et al. 2005: §1347) provides the following ex-
amples from non-standard German (mainly southern dialects):
(114) a. Kommt drauf an, mit wem dass sie zu tun haben.
comes there.upon PART with whom that you to do have
‘It depends on whom you are dealing with.’
23
Coordination is an exception to this:
(115) a. Lotti, die wo eine tolle Sekretärin ist, hat ein paar merkwürdige
Lotti who where a great secretary is has a few strange
Herren empfangen.
gentlemen welcomed
‘Lotti, who is a great secretary, welcomed a few strange gentlemen.’
b. Du bist der beste Sänger, den wo ich kenn.
you are the best singer who where I know
‘You are the best singer whom I know.’
These examples of interrogative and relative clauses show that the left sentence bracket
is filled with a conjunction (dass ‘that’ or wo ‘where’ in the respective dialects). So if one
wants to have a model that treats Standard German and the dialectal forms uniformly,
it is reasonable to assume that the relative phrases and interrogative phrases are located
in the Vorfeld.
1.8.4 Recursion
As already noted by Reis (1980: 82), when occupied by a complex constituent, the prefield
can be subdivided into further fields including a postfield, for example. The constituents
für lange lange Zeit ‘for a long, long time’ in (116b) and daß du kommst ‘that you are com-
ing’ in (116d) are inside the prefield but occur to the right of the right bracket verschüttet
‘buried’ / gewußt ‘knew’, that is they are in the postfield of the prefield.
(116) a. Die Möglichkeit, etwas zu verändern, ist damit verschüttet für lange
the possibility something to change is there.with buried for long
lange Zeit.
long time
‘The possibility to change something will now be gone for a long, long time.’
b. [Verschüttet für lange lange Zeit] ist damit die Möglichkeit, etwas
buried for long long time is there.with the possibility something
zu verändern.
to change
c. Wir haben schon seit langem gewußt, daß du kommst.
we have PART since long known that you come
‘We have known for a while that you are coming.’
d. [Gewußt, daß du kommst,] haben wir schon seit langem.
known that you come have we PART since long
Like constituents in the prefield, elements in the middle field and postfield can also have
an internal structure and be divided into subfields accordingly. For example, daß ‘that’
is the left bracket of the subordinate clause daß du kommst in (116c), whereas du ‘you’
occupies the middle field and kommst ‘come’ the right bracket.
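The recursive nature of the field model can be made concrete with a small data structure. The following is only a minimal sketch: the class name `Clause`, the helper `words` and the choice to store fields as lists are illustrative assumptions, not part of the topological model itself. It encodes (116c) with the subordinate clause sitting in the postfield of the main clause and having fields of its own.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Field names in linear order (hypothetical identifiers for this sketch).
FIELDS = ("prefield", "left_bracket", "middle_field", "right_bracket", "postfield")


@dataclass
class Clause:
    """A clause as five topological fields; a field may contain words or clauses."""
    prefield: List[Union[str, "Clause"]] = field(default_factory=list)
    left_bracket: List[Union[str, "Clause"]] = field(default_factory=list)
    middle_field: List[Union[str, "Clause"]] = field(default_factory=list)
    right_bracket: List[Union[str, "Clause"]] = field(default_factory=list)
    postfield: List[Union[str, "Clause"]] = field(default_factory=list)


def words(clause):
    """Read off the words of a clause field by field, descending into embedded clauses."""
    result = []
    for name in FIELDS:
        for item in getattr(clause, name):
            result.extend(words(item) if isinstance(item, Clause) else [item])
    return result


# (116c): the daß-clause occupies the postfield and is itself divided into fields.
embedded = Clause(left_bracket=["daß"], middle_field=["du"], right_bracket=["kommst"])
main = Clause(
    prefield=["Wir"],
    left_bracket=["haben"],
    middle_field=["schon", "seit", "langem"],
    right_bracket=["gewußt"],
    postfield=[embedded],
)

print(" ".join(words(main)))
```

Punctuation is ignored in this sketch; the point is only that a field can contain a constituent that is divided into fields again.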
Comprehension questions
(117) a. he
b. Go!
c. quick
5. How can we define the terms prefield (Vorfeld), middle field (Mittelfeld), post-
field (Nachfeld) and the left and right sentence brackets (Satzklammer)?
Exercises
1. Identify the sentence brackets, prefield, middle field and postfield in the fol-
lowing sentences. Do the same for the embedded clauses!
Further reading
Reis (1980) gives reasons why field theory is important for the description of
the position of constituents in German.
Höhle (1986) discusses fields to the left of the prefield, which are needed for
left-dislocation structures such as with der Mittwoch in (120), aber in (121a) and
denn in (121b):
(120) Der Mittwoch, der passt mir gut.
the Wednesday that fits me good
‘Wednesday, that suits me fine.’
(121) a. Aber würde denn jemand den Hund füttern morgen Abend?
but would PART anybody the dog feed tomorrow evening
‘But would anyone feed the dog tomorrow evening?’
b. Denn dass es regnet, damit rechnet keiner.
because that it rains there.with reckons nobody
‘Because no-one expects that it will rain.’
2 Phrase structure grammar
This chapter deals with phrase structure grammars (PSGs), which play an important role
in several of the theories we will encounter in later chapters.
We can analyze the sentence in (1) using the grammar in (2) in the following way:
first, we take the first word in the sentence and check if there is a rule in which this
word occurs on the right-hand side of the rule. If this is the case, then we replace the
word with the symbol on the left-hand side of the rule. This happens in lines 2–4, 6–7
and 9 of the derivation in (3). For instance, in line 2 er is replaced by NP. If there are two
or more symbols which occur together on the right-hand side of a rule, then all these
symbols are replaced with the symbol on the left. This happens in lines 5, 8 and 10. For
instance, in lines 5 and 8, Det and N are rewritten as NP.
(3)     words and symbols               rules that are applied
     1  er das Buch dem Kind gibt
     2  NP das Buch dem Kind gibt       NP → er
     3  NP Det Buch dem Kind gibt       Det → das
     4  NP Det N dem Kind gibt          N → Buch
     5  NP NP dem Kind gibt             NP → Det N
     6  NP NP Det Kind gibt             Det → dem
     7  NP NP Det N gibt                N → Kind
     8  NP NP NP gibt                   NP → Det N
     9  NP NP NP V                      V → gibt
    10  S                               S → NP NP NP V
In (3), we began with a string of words and it was shown that we can derive the structure
of a sentence by applying the rules of a given phrase structure grammar. We could have
applied the same steps in reverse order: starting with the sentence symbol S, we would
have applied the steps 9–1 and arrived at the string of words. Selecting different rules
from the grammar for rewriting symbols, we could use the grammar in (2) to get from
S to the string er dem Kind das Buch gibt ‘he the child the book gives’. We can say that
this grammar licenses (or generates) a set of sentences.
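The bottom-up procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the rule list is reconstructed from the rules applied in (3), the names `RULES`, `reduce_once` and `derive` are made up for illustration, and the strategy (always rewrite the leftmost position at which some right-hand side matches) is a simplification that works for this toy grammar but is not a general parsing algorithm, since it never backtracks.

```python
# Rules as (left-hand side, right-hand side) pairs, reconstructed from (3).
RULES = [
    ("NP", ("er",)),
    ("Det", ("das",)),
    ("Det", ("dem",)),
    ("N", ("Buch",)),
    ("N", ("Kind",)),
    ("V", ("gibt",)),
    ("NP", ("Det", "N")),
    ("S", ("NP", "NP", "NP", "V")),
]


def reduce_once(symbols):
    """Rewrite the leftmost span matching some right-hand side; None if nothing matches."""
    for start in range(len(symbols)):
        for lhs, rhs in RULES:
            end = start + len(rhs)
            if tuple(symbols[start:end]) == rhs:
                return symbols[:start] + [lhs] + symbols[end:]
    return None


def derive(words):
    """Return all lines of the derivation, starting with the string of words."""
    lines = [list(words)]
    while True:
        next_line = reduce_once(lines[-1])
        if next_line is None:
            return lines
        lines.append(next_line)


derivation = derive("er das Buch dem Kind gibt".split())
for number, line in enumerate(derivation, start=1):
    print(number, " ".join(line))
```

With this strategy the printed lines correspond to the ten lines of (3). The same function also reduces er dem Kind das Buch gibt to S, since the rules themselves say nothing about which noun phrase comes first.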
The derivation in (3) can also be represented as a tree. This is shown by Figure 2.1.
Figure 2.1: Analysis of er das Buch dem Kind gibt ‘he the book the child gives’
The symbols in the tree are called nodes. We say that S immediately dominates the NP nodes
and the V node. The other nodes in the tree are also dominated, but not immediately
dominated, by S. If we want to talk about the relationship between nodes, it is common
to use kinship terms. In Figure 2.1, S is the mother node of the three NP nodes and the
V node. The NP nodes and the V node are sisters since they have the same mother node. If a node
has two daughters, then we have a binary branching structure. If there is exactly one
daughter, then we have a unary branching structure. Two constituents are said to be
adjacent if they are directly next to each other.
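The relations between nodes can be illustrated with a small tree data structure. The sketch below is an assumption-laden illustration rather than a standard representation: the class `Node` and the helpers `preterminal`, `dominated`, `sisters` and `branching` are hypothetical names, and words are simply treated as nodes without daughters. The tree encodes the analysis of er das Buch dem Kind gibt.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    daughters: List["Node"] = field(default_factory=list)


def preterminal(category, word):
    """A category node whose only daughter is a word."""
    return Node(category, [Node(word)])


tree = Node("S", [
    preterminal("NP", "er"),
    Node("NP", [preterminal("Det", "das"), preterminal("N", "Buch")]),
    Node("NP", [preterminal("Det", "dem"), preterminal("N", "Kind")]),
    preterminal("V", "gibt"),
])


def dominated(node):
    """All nodes dominated by node, whether immediately or not."""
    result = []
    for daughter in node.daughters:
        result.append(daughter)
        result.extend(dominated(daughter))
    return result


def sisters(mother, node):
    """The other daughters of the same mother node."""
    return [d for d in mother.daughters if d is not node]


def branching(node):
    """Name the branching type by the number of daughters."""
    count = len(node.daughters)
    return {1: "unary", 2: "binary"}.get(count, f"{count}-ary")


print([d.label for d in tree.daughters])       # nodes immediately dominated by S
print(branching(tree.daughters[1]))            # the NP over Det and N
```

Immediate dominance corresponds to the `daughters` list, while `dominated` follows the daughters recursively.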
Phrase structure rules are often omitted in linguistic publications. Instead, authors
opt for tree diagrams or the compact equivalent bracket notation such as (4).
(4) [S [NP er] [NP [Det das] [N Buch]] [NP [Det dem] [N Kind]] [V gibt]]
he the book the child gives
Nevertheless, it is the grammatical rules which are actually important since these rep-
resent grammatical knowledge which is independent of specific structures. In this way,
we can use the grammar in (2) to parse or generate the sentence in (5), which differs
from (1) in the order of objects:
(5) [weil] er dem Kind das Buch gibt
because he.NOM the.DAT child the.ACC book gives
‘because he gives the child the book’
The rules for replacing determiners and nouns are simply applied in a different order
than in (1). Rather than replacing the first Det with das ‘the’ and the first noun with
Buch ‘book’, the first Det is replaced with dem ‘the’ and the first noun with Kind.
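To get a feeling for the set of strings that such a grammar licenses, one can enumerate them. The following is a rough sketch, assuming the grammar consists of exactly the rules that are applied in (3); the dictionary `GRAMMAR` and the function `expand` are illustrative names. Every symbol that has no rule of its own is treated as a word.

```python
from itertools import product

# Rules grouped by their left-hand side, reconstructed from the derivation in (3).
GRAMMAR = {
    "S": [["NP", "NP", "NP", "V"]],
    "NP": [["er"], ["Det", "N"]],
    "Det": [["das"], ["dem"]],
    "N": [["Buch"], ["Kind"]],
    "V": [["gibt"]],
}


def expand(symbol):
    """Return every word sequence that can be derived from symbol."""
    if symbol not in GRAMMAR:          # a word: nothing left to rewrite
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol with all others.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = {" ".join(words) for words in expand("S")}
print(len(sentences))
print("er dem Kind das Buch gibt" in sentences)
```

The enumeration contains both object orders discussed above, but also strings such as er er er gibt, because nothing in these rules restricts which noun phrases may be combined.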
At this juncture, I should point out that the grammar in (2) is not the only possible
grammar for the example sentence in (1). There is an infinite number of possible gram-
mars which could be used to analyze these kinds of sentences (see exercise 1). Another
possible grammar is given in (6):
(6) NP → Det N     NP → er       N → Buch
    V → NP V       Det → das     N → Kind
                   Det → dem     V → gibt
This grammar licenses binary branching structures as shown in Figure 2.2.
Both the grammar in (6) and (2) are too imprecise. If we adopt additional lexical entries
for ich ‘I’ and den ‘the’ (accusative) in our grammar, then we would incorrectly license
the ungrammatical sentences in (7b–d):3
3
With the grammar in (6), we also have the additional problem that we cannot determine when an utterance
is complete since the symbol V is used for all combinations of V and NP. Therefore, we can also analyze
the sentence in (i) with this grammar:
The number of arguments required by a verb must be somehow represented in the grammar. In the fol-
lowing chapters, we will see exactly how the selection of arguments by a verb (valence) can be captured
in various grammatical theories.
Figure 2.2: Analysis of er das Buch dem Kind gibt with a binary branching structure
2.1 Symbols and rewrite rules
Instead of the rule NP → Det N, we will have to use rules such as those in (13):5
(13) NP_3_sg_nom → Det_fem_sg_nom N_fem_sg_nom
NP_3_sg_nom → Det_mas_sg_nom N_mas_sg_nom
NP_3_sg_nom → Det_neu_sg_nom N_neu_sg_nom
NP_3_pl_nom → Det_fem_pl_nom N_fem_pl_nom
NP_3_pl_nom → Det_mas_pl_nom N_mas_pl_nom
NP_3_pl_nom → Det_neu_pl_nom N_neu_pl_nom
(13) shows the rules for nominative noun phrases. We would need analogous rules for
genitive, dative, and accusative. We would then require 24 symbols for determiners
(3 ∗ 2 ∗ 4), 24 symbols for nouns and 24 rules rather than one. If inflection class is taken
into account, the number of symbols and the number of rules doubles.
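The count can be checked by generating the rules mechanically. This is a small sketch, assuming the naming scheme of (13) carries over to the other cases; the constants `GENDERS`, `NUMBERS`, `CASES` and the function `np_rules` are illustrative names.

```python
from itertools import product

GENDERS = ("fem", "mas", "neu")
NUMBERS = ("sg", "pl")
CASES = ("nom", "gen", "dat", "acc")


def np_rules():
    """One NP rule per combination of case, number and gender, named as in (13)."""
    rules = []
    for case, number, gender in product(CASES, NUMBERS, GENDERS):
        features = f"{gender}_{number}_{case}"
        rules.append(f"NP_3_{number}_{case} → Det_{features} N_{features}")
    return rules


rules = np_rules()
determiner_symbols = {rule.split()[2] for rule in rules}
noun_symbols = {rule.split()[3] for rule in rules}

print(len(rules), len(determiner_symbols), len(noun_symbols))
for rule in rules[:6]:      # the six nominative rules
    print(rule)
```

Each additional feature multiplies these numbers again, which is why atomic category symbols of this kind quickly become unmanageable.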
4
These are inflectional classes for adjectives which are also relevant for some nouns such as Beamter ‘civil
servant’, Verwandter ‘relative’, Gesandter ‘envoy’. For more on adjective classes see page 21.
5
To keep things simple, these rules do not incorporate information regarding the inflection class.
2.2 Expanding PSG with features
2.3 Semantics
In the introductory chapter and the previous sections, we have been dealing with syntac-
tic aspects of language and the focus will remain very much on syntax for the remainder
of this book. It is, however, important to remember that we use language to commu-
nicate, that is, to transfer information about certain situations, topics or opinions. If
we want to accurately explain our capacity for language, then we also have to explain
the meanings that our utterances have. To this end, it is necessary to understand their
syntactic structure, but this alone is not enough. Furthermore, theories of language ac-
quisition that only concern themselves with the acquisition of syntactic constructions
are also inadequate. The syntax-semantics interface is therefore important and every
grammatical theory has to say something about how syntax and semantics interact. In
the following, I will show how we can combine phrase structure rules with semantic
information. To represent meanings, I will use first-order predicate logic and 𝜆-calculus.
Unfortunately, it is not possible to provide a detailed discussion of the basics of logic
so that even readers without prior knowledge can follow all the details, but the simple
examples discussed here should be enough to provide some initial insights into how syn-
tax and semantics interact and furthermore, how we can develop a linguistic theory to
account for this.
To show how the meaning of a sentence is derived from the meaning of its parts, we
will consider (18a). We assign the meaning in (18b) to the sentence in (18a).
(18) a. Max schläft.
Max sleeps
‘Max is sleeping.’
b. schlafen ′(max ′)
Here, we are assuming schlafen ′ to be the meaning of schläft ‘sleeps’. We use prime
symbols to indicate that we are dealing with word meanings and not actual words. At
first glance, it may not seem that we have really gained anything by using schlafen ′ to
represent the meaning of (18a), since it is just another form of the verb schläft ‘sleeps’.
It is, however, important to concentrate on a single verb form as inflection is irrelevant
when it comes to meaning. We can see this by comparing the examples in (19a) and (19b):
(19) a. Jeder Junge schläft.
every boy sleeps
‘Every boy sleeps.’
b. Alle Jungen schlafen.
all boys sleep
‘All boys sleep.’
To enhance readability I use English translations of the predicates in semantic representations
from now on.7 So the meaning of (18a) is represented as (20) rather than (18b):
(20) sleep ′(max ′)
7
Note that I do not claim that English is suited as representation language for semantic relations and con-
cepts that can be expressed in other languages.
When looking at the meaning in (20), we can consider which part of the meaning comes
from each word. It seems relatively intuitive that max ′ comes from Max, but the trickier
question is what exactly schläft ‘sleeps’ contributes in terms of meaning. If we think
about what characterizes a ‘sleeping’ event, we know that there is typically an individual
who is sleeping. This information is part of the meaning of the verb schlafen ‘to sleep’.
The verb meaning does not contain information about the sleeping individual, however,
as this verb can be used with various subjects:
(21) a. Paul schläft.
Paul sleeps
‘Paul is sleeping.’
b. Mio schläft.
Mio sleeps
‘Mio is sleeping.’
c. Xaver schläft.
Xaver sleeps
‘Xaver is sleeping.’
We can therefore abstract away from any specific use of sleep ′ and instead of, for exam-
ple, max ′ in (20), we use a variable (e.g., 𝑥). This 𝑥 can then be replaced by paul ′, mio ′
or xaver ′ in a given sentence. To allow us to access these variables in a given meaning,
we can write them with a 𝜆 in front. Accordingly, schläft ‘sleeps’ will have the following
meaning:
(22) 𝜆𝑥 sleep′ (𝑥)
The step from (20) to (22) is referred to as lambda abstraction. The combination of the
expression (22) with the meaning of its arguments happens in the following way: we
remove the 𝜆 and the corresponding variable and then replace all instances of the variable
with the meaning of the argument. If we combine (22) and max ′ as in (23), we arrive at
the meaning in (20), namely sleep ′(max ′).
(23) 𝜆𝑥 sleep ′ (𝑥) max ′
The process is called 𝛽-reduction or 𝜆-conversion. To show this further, let us consider
an example with a transitive verb. The sentence in (24a) has the meaning given in (24b):
(24) a. Max mag Lotte.
Max likes Lotte
‘Max likes Lotte.’
b. like ′(max ′, lotte ′)
The 𝜆-abstraction of mag ‘likes’ is shown in (25):
(25) 𝜆𝑦𝜆𝑥 like′ (𝑥, 𝑦)
Note that it is always the first 𝜆 that has to be used first. The variable 𝑦 corresponds to the
object of mögen ‘to like’. For languages like English it is assumed that the object forms a
verb phrase (VP) together with the verb and this VP is combined with the subject. Ger-
man differs from English in allowing more freedom in constituent order. The problems
that result for form-meaning mappings are solved in different ways by different theories.
The respective solutions will be addressed in the following chapters.
If we combine the representation in (25) with that of the object Lotte, we arrive at
(26a), and following 𝛽-reduction, (26b):
(26) a. 𝜆𝑦𝜆𝑥 like′ (𝑥, 𝑦) lotte′
b. 𝜆𝑥 like′ (𝑥, lotte′)
This meaning can in turn be combined with the subject and we then get (27a) and (27b)
after 𝛽-reduction:
(27) a. 𝜆𝑥 like′ (𝑥, lotte′) max′
b. like ′(max ′, lotte ′)
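Readers who know a programming language with anonymous functions may find it helpful to replay these reductions there. The following is an informal analogy only, not a definition of the 𝜆-calculus: meanings are modelled as nested tuples, the names `sleep`, `like` and `likes_lotte` are chosen for this sketch, and applying a Python function plays the role of 𝛽-reduction.

```python
# Meanings are modelled as tuples, e.g. ("sleep", "max") stands for sleep′(max′).
sleep = lambda x: ("sleep", x)                 # corresponds to (22)
like = lambda y: lambda x: ("like", x, y)      # corresponds to (25): y is bound first

# Applying a function to an argument corresponds to β-reduction.
max_sleeps = sleep("max")                      # (23) reduces to the meaning in (20)

likes_lotte = like("lotte")                    # (26a) reduces to (26b)
max_likes_lotte = likes_lotte("max")           # (27a) reduces to (27b)

print(max_sleeps)
print(max_likes_lotte)
```

Because the outer function of `like` binds 𝑦, the object has to be supplied before the subject, just as the first 𝜆 has to be used first.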
After introducing lambda calculus, integrating the composition of meaning into our
phrase structure rules is simple. A rule for the combination of a verb with its subject
has to be expanded to include positions for the semantic contribution of the verb, the
semantic contribution of the subject and then the meaning of the combination of these
two (the entire sentence). The complete meaning is the combination of the individual
meanings in the correct order. We can therefore take the simple rule in (28a) and turn it
into (28b):
(28) a. S → NP(nom) V
b. S(V′ NP′) → NP(nom, NP ′) V(V′)
V′ stands for the meaning of V and NP′ for the meaning of the NP(nom). V ′ NP′ stands for
the combination of V′ and NP′. When analyzing (18a), the meaning of V′ is 𝜆𝑥 sleep ′ (𝑥)
and the meaning of NP′ is max ′. The combination of V′ NP′ corresponds to (29a) or after
𝛽-reduction to (20) – repeated here as (29b):
(29) a. 𝜆𝑥 sleep′ (𝑥) max′
b. sleep ′(max ′)
For the example with a transitive verb in (24a), the rule in (30) can be proposed:
(30) S(V′ NP2′ NP1′) → NP(nom, NP1 ′) V(V ′) NP(acc, NP2 ′)
The meaning of the verb (V ′) is first combined with the meaning of the object (NP2′) and
then with the meaning of the subject (NP1′).
At this point, we can see that there are several distinct semantic rules for the phrase
structure rules above. The hypothesis that we should analyze language in this way is
called the rule-to-rule hypothesis (Bach 1976: 184). A more general process for deriving
the meaning of linguistic expressions will be presented in Section 5.1.4.
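The pairing of each phrase structure rule with its own semantic rule, as in (28b) and (30), can be sketched as follows. This is a toy illustration under several assumptions: the lexicon, the tuple encoding of meanings and the function `interpret` are invented for this sketch, case is not checked, and nothing prevents an intransitive verb from being used with two noun phrases, so the valence problem discussed for the grammar in (6) remains.

```python
# Each word is paired with a category and a meaning (hypothetical toy lexicon).
LEXICON = {
    "Max": ("NP", "max"),
    "Lotte": ("NP", "lotte"),
    "schläft": ("V", lambda x: ("sleep", x)),
    "mag": ("V", lambda y: lambda x: ("like", x, y)),
}


def interpret(sentence):
    """Apply the semantic rule that belongs to the matching phrase structure rule."""
    categories, meanings = zip(*(LEXICON[word] for word in sentence.split()))
    if categories == ("NP", "V"):              # (28b): S(V′ NP′) → NP V
        np1, verb = meanings
        return verb(np1)
    if categories == ("NP", "V", "NP"):        # (30): S(V′ NP2′ NP1′) → NP V NP
        np1, verb, np2 = meanings
        return verb(np2)(np1)                  # object first, then subject
    raise ValueError(f"no rule for {' '.join(categories)}")


print(interpret("Max schläft"))
print(interpret("Max mag Lotte"))
```

Every syntactic rule has its own branch with its own instruction for combining meanings, which is the idea behind the rule-to-rule hypothesis.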
2.4 Phrase structure rules for some aspects of German syntax
analyze the other noun phrases in (31). In addition to rule (32a), one could propose a rule
such as the one in (32b).8,9
(32) a. NP → Det N
b. NP → Det A N
However, this rule would still not allow us to analyze noun phrases such as (33):
(33) alle weiteren schlagkräftigen Argumente
all further strong arguments
‘all other strong arguments’
In order to be able to analyze (33), we require a rule such as (34):
(34) NP → Det A A N
It is always possible to increase the number of adjectives in a noun phrase and setting an
upper limit for adjectives would be entirely arbitrary. Even if we opt for the following
abbreviation, there are still problems:
(35) NP → Det A* N
The asterisk in (35) stands for any number of iterations. Therefore, (35) encompasses
rules with no adjectives as well as those with one, two or more.
The problem is that according to the rule in (35) adjectives and nouns do not form a
constituent and we can therefore not explain why coordination is still possible in (36):
(36) alle [[großen Seeelefanten] und [grauen Eichhörnchen]]
all big elephant.seals and grey squirrels
‘all the big elephant seals and grey squirrels’
If we assume that coordination involves the combination of two or more word strings
with the same syntactic properties, then we would have to assume that the adjective and
noun form a unit.
The following rules capture the noun phrases with adjectives discussed thus far:
(37) a. NP → Det N̄
b. N̄ → A N̄
c. N̄ → N
These rules state the following: a noun phrase consists of a determiner and a nominal
element (N̄). This nominal element can consist of an adjective and a nominal element
(37b), or just a noun (37c). Since N̄ is also on the right-hand side of the rule in (37b), we
can apply this rule multiple times and therefore account for noun phrases with multiple
adjectives such as (33). Figure 2.3 on the next page shows the structure of a noun phrase
without an adjective and that of a noun phrase with one or two adjectives.
[Figure 2.3: Structures for a noun phrase without an adjective and for noun phrases with one and two adjectives]
8 See Eisenberg (2004: 238) for the assumption of flat structures in noun phrases.
9 There are, of course, other features such as gender and number, which should be part of all the rules discussed in this section. I have omitted these in the following for ease of exposition.
The adjective grau ‘grey’ restricts the set of referents for the noun phrase. If we assume an additional
adjective such as groß ‘big’, then it only refers to those squirrels who are grey as well as
big. These kinds of noun phrases can be used in contexts such as the following:
We observe that this discourse can be continued with Aber alle kleinen grauen Eichhörnchen
sind krank ‘but all small grey squirrels are ill’ and a corresponding answer.
The possibility to have even more adjectives in noun phrases such as ein kleines graues
Eichhörnchen ‘a small grey squirrel’ is accounted for in our rule system in (37). In the
rule (37b), N̄ occurs on the left as well as the right-hand side of the rule. This kind of
rule is referred to as recursive.
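The effect of the recursive rule can be seen in a minimal recursive-descent sketch of the rules in (37). The intermediate nominal element is labelled `Nbar` in the code to keep it distinct from the word-level category N. The lexicon is a hypothetical one built from the words in (33), and the tuple format for trees is an assumption of this sketch.

```python
# Hypothetical lexicon for the words in (33)
LEXICON = {"alle": "Det", "weiteren": "A", "schlagkräftigen": "A", "Argumente": "N"}


def parse_nbar(words):
    """Nbar -> A Nbar | N; returns a nested tuple or None."""
    if not words:
        return None
    category = LEXICON.get(words[0])
    if category == "A":
        rest = parse_nbar(words[1:])      # the rule calls itself: recursion
        return ("Nbar", ("A", words[0]), rest) if rest else None
    if category == "N" and len(words) == 1:
        return ("Nbar", ("N", words[0]))
    return None


def parse_np(words):
    """NP -> Det Nbar."""
    if words and LEXICON.get(words[0]) == "Det":
        nbar = parse_nbar(words[1:])
        if nbar:
            return ("NP", ("Det", words[0]), nbar)
    return None


print(parse_np("alle weiteren schlagkräftigen Argumente".split()))
```

Because `parse_nbar` calls itself for every adjective, no upper limit on the number of adjectives has to be stated anywhere.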
We have now developed a nifty little grammar that can be used to analyze noun
phrases containing adjectival modifiers. As a result, the combination of an adjective
and noun is given constituent status. One may wonder at this point if it would not make
sense to also assume that determiners and adjectives form a constituent, as we also have
the following kind of noun phrases:
(39) diese schlauen und diese neugierigen Eichhörnchen
these smart and these curious squirrels
Here, we are dealing with a different structure, however. Two full NPs have been conjoined
and part of the first conjunct has been deleted.
(40) diese schlauen Eichhörnchen und diese neugierigen Eichhörnchen
these smart squirrels and these curious squirrels
One can find similar phenomena at the sentence and even word level:
(41) a. dass Peter dem Kind das Buch gibt und Maria der Frau die Schallplatte
that Peter the child the book gives and Maria the woman the record
gibt
gives
‘that Peter gives the book to the child and Maria the record to the woman’
b. be- und ent-laden
PRFX and PRFX-load
‘load and unload’
Thus far, we have discussed how we can ideally integrate adjectives into our rules for
the structure of noun phrases. Other adjuncts such as prepositional phrases or relative
clauses can be combined with N̄ in an analogous way to adjectives:
(42) a. N̄ → N̄ PP
b. N̄ → N̄ relative clause
With these rules and those in (37), it is possible – assuming the corresponding rules for
PPs and relative clauses – to analyze all the examples in (31).
(37c) states that it is possible for N̄ to consist of a single noun. A further important rule
has not yet been discussed: we need another rule to combine nouns such as Vater ‘father’,
Sohn ‘son’ or Bild ‘picture’, so-called relational nouns, with their arguments. Examples
of these can be found in (43a–b). (43c) is an example of a nominalization of a verb with
its argument:
(43) a. der Vater von Peter
the father of Peter
‘Peter’s father’
b. das Bild vom Gleimtunnel
the picture of.the Gleimtunnel
‘the picture of the Gleimtunnel’
c. das Kommen der Installateurin
the coming of.the plumber
‘the plumber’s visit’
The rule that we need to analyze (43a,b) is given in (44):
(44) N̄ → N PP
[Figure 2.4: Combination of a noun with PP complement vom Gleimtunnel; to the right, with an additional adjunct PP]
Figure 2.4 shows two structures with PP-arguments. The tree on the right also contains
an additional PP-adjunct, which is licensed by the rule in (42a).
In addition to the previously discussed NP structures, there are other structures where
the determiner or noun is missing. Nouns can be omitted via ellipsis. (45) gives
examples of noun phrases in which a noun that does not require a complement has been
omitted. The examples in (46) show NPs in which only a determiner and a complement
of the noun have been realized, but not the noun itself. The underscore marks the position
where the noun would normally occur.
(45) a. ein interessantes _
an interesting
‘an interesting one’
b. ein neues interessantes _
a new interesting
‘a new interesting one’
c. ein interessantes _ aus Japan
an interesting from Japan
‘an interesting one from Japan’
d. ein interessantes _, das wir kennen
an interesting that we know
‘an interesting one that we know’
(46) a. (Nein, nicht der Vater von Klaus), der _ von Peter war gemeint.
no not the father of Klaus the of Peter was meant
‘No, it wasn’t the father of Klaus, but rather the one of Peter that was meant.’
b. (Nein, nicht das Bild von der Stadtautobahn), das _ vom Gleimtunnel war
no not the picture of the motorway the of.the Gleimtunnel was
beeindruckend.
impressive
‘No, it wasn’t the picture of the motorway, but rather the one of the Gleimtunnel
that was impressive.’
c. (Nein, nicht das Kommen des Tischlers), das _ der Installateurin ist
no not the coming of.the carpenter the of.the plumber is
wichtig.
important
‘No, it isn’t the visit of the carpenter, but rather the visit of the plumber that
is important.’
In English, the pronoun one must often be used in the corresponding position,10 but in
German the noun is simply omitted. In phrase structure grammars, this can be described
by a so-called epsilon production. These rules replace a symbol with nothing (47a). The
rule in (47b) is an equivalent variant which is responsible for the term epsilon production:
(47) a. N →
b. N → 𝜖
The corresponding trees are shown in Figure 2.5.
[Figure 2.5: Trees for noun phrases with an empty noun]
Going back to boxes, the rules in (47)
correspond to empty boxes with the same labels as the boxes of ordinary nouns. As we
have considered previously, the actual content of the boxes is unimportant when considering
the question of where we can incorporate them. In this way, the noun phrases
in (31) can occur in the same sentences. The empty noun box also behaves like one with
a genuine noun. If we do not open the empty box, we will not be able to tell it apart
from a filled box.
10 See Fillmore et al. (2012: Section 4.12) for English examples without the pronoun one.
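A small recognizer illustrates how the epsilon production in (47) interacts with the noun phrase rules. It works on sequences of category labels and covers only determiners, adjectives and nouns. The intermediate nominal category is written `Nbar`, and the function names are assumptions of this sketch.

```python
def derives_nbar(categories):
    """Nbar -> A Nbar | N, where N -> 'N' | epsilon (rule (47))."""
    if not categories:
        return True                        # N -> epsilon: the noun position stays empty
    if categories[0] == "A":
        return derives_nbar(categories[1:])
    return categories == ["N"]             # an overt noun closes the phrase


def derives_np(categories):
    """NP -> Det Nbar."""
    return bool(categories) and categories[0] == "Det" and derives_nbar(categories[1:])


print(derives_np(["Det", "A"]))        # ein interessantes _
print(derives_np(["Det", "A", "A"]))   # ein neues interessantes _
print(derives_np(["Det", "A", "N"]))   # a noun phrase with an overt noun
```

The recognizer treats the empty noun exactly like a filled one: the same rules apply, and only the last step differs. Note that the rules as stated in this sketch also accept a bare determiner.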
It is not only possible to omit the noun from noun phrases, but the determiner can
also remain unrealized in certain contexts. (48) shows noun phrases in the plural:
(48) a. Bücher
books
b. Bücher, die wir kennen
books that we know
c. interessante Bücher
interesting books
d. interessante Bücher, die wir kennen
interesting books that we know
The determiner can also be omitted in the singular if the noun is a mass noun:
(49) a. Getreide
grain
b. Getreide, das gerade gemahlen wurde
grain that just ground was
‘grain that has just been ground’
c. frisches Getreide
fresh grain
d. frisches Getreide, das gerade gemahlen wurde
fresh grain that just ground was
‘fresh grain that has just been ground’
Finally, both the determiner and the noun can be omitted:
(50) a. Ich lese interessante.
I read interesting
‘I read interesting ones.’
b. Dort drüben steht frisches, das gerade gemahlen wurde.
there over stands fresh that just ground was
‘Over there is some fresh (grain) that has just been ground.’
Figure 2.6 on the next page shows the corresponding trees.
It is necessary to add two further comments to the rules that were developed up to
this point: up to now, I have always spoken of adjectives. However, it is possible to have
very complex adjective phrases in pre-nominal position. These can be adjectives with
complements (51a,b) or adjectival participles (51c,d):
(51) a. der seiner Frau treue Mann
the his.DAT wife faithful man
‘the man faithful to his wife’
[Figure 2.6: Noun phrases without an overt determiner: _ Bücher ‘books’ and _ interessante _ ‘interesting’]
Taking this into account, the rule (37b) has to be modified in the following way:
(52) N̄ → AP N̄
An adjective phrase (AP) can consist of an NP and an adjective, a PP and an adjective or
just an adjective:
(53) a. AP → NP A
b. AP → PP A
c. AP → A
There are two imperfections resulting from the rules that were developed thus far. These
are the rules for adjectives or nouns without complements in (53c) as well as (37c) –
repeated here as (54):
(54) N̄ → N
If we apply these rules, then we will generate unary branching subtrees, that is, trees
with a mother that only has one daughter. See Figure 2.6 for an example of this. If we
maintain the parallel to the boxes, this would mean that there is a box which contains
another box which is the one with the relevant content.
In principle, nothing stops us from placing this information directly into the larger
box. Instead of the rules in (55), we will simply use the rules in (56):
(55) a. A → kluge
b. N → Mann
(56) a. AP → kluge
b. N̄ → Mann
(56a) states that kluge ‘smart’ has the same properties as a full adjective phrase, in particular
that it cannot be combined with a complement. This is parallel to the categorization
of the pronoun er ‘he’ as an NP in the grammars (2) and (6).
Assigning N̄ to nouns which do not require a complement has the advantage that we
do not have to explain why the analysis in (57b) is possible as well as (57a) despite there
not being any difference in meaning.
(57) a. [NP einige [N̄ kluge [N̄ [N̄ [N Frauen ]] und [N̄ [N Männer ]]]]]
some smart women and men
b. [NP einige [N̄ kluge [N̄ [N [N Frauen ] und [N Männer ]]]]]
some smart women and men
In (57a), two nouns have projected to N̄ and have then been joined by coordination.
The result of coordination of two constituents of the same category is always a new
constituent with that category. In the case of (57a), this is also N̄. This constituent is
then combined with the adjective and the determiner. In (57b), the nouns themselves
have been coordinated. The result of this is always another constituent which has the
same category as its parts. In this case, this would be N. This N becomes N̄ and is then
combined with the adjective. If nouns which do not require complements were categorized
as N̄ rather than N, we would not have the problem of spurious ambiguities. The
structure in (58) shows the only possible analysis.
(58) [NP einige [N̄ kluge [N̄ [N̄ Frauen ] und [N̄ Männer ]]]]
some smart women and men
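The spurious ambiguity can be made visible by counting analyses. The sketch below counts derivation trees for a word string under two small grammars: one with the unary rule that projects N to the intermediate category (written `Nbar` in the code), and one in which nouns are entered in the lexicon directly as `Nbar`. The coordination rules, the grammar dictionaries and the function `count_parses` are assumptions made for this illustration.

```python
from functools import lru_cache


def count_parses(grammar, start, words):
    """Count derivation trees for `words` rooted in `start`.

    `grammar` maps a category to a list of right-hand sides (tuples).
    Symbols that are not keys of `grammar` are treated as words.
    Assumes no empty right-hand sides and no unary cycles.
    """
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        if symbol not in grammar:                       # a word
            return int(j == i + 1 and words[i] == symbol)
        return sum(count_seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        # the first daughter covers words[i:k]; each later daughter needs at least one word
        return sum(count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return count(start, 0, len(words))


WITH_UNARY_RULE = {
    "NP": [("Det", "Nbar")],
    "Nbar": [("A", "Nbar"), ("N",), ("Nbar", "Conj", "Nbar")],
    "N": [("N", "Conj", "N"), ("Frauen",), ("Männer",)],
    "Det": [("einige",)], "A": [("kluge",)], "Conj": [("und",)],
}
LEXICAL_NBAR = {
    "NP": [("Det", "Nbar")],
    "Nbar": [("A", "Nbar"), ("Nbar", "Conj", "Nbar"), ("Frauen",), ("Männer",)],
    "Det": [("einige",)], "A": [("kluge",)], "Conj": [("und",)],
}

coordination = "Frauen und Männer".split()
print(count_parses(WITH_UNARY_RULE, "Nbar", coordination))  # 2
print(count_parses(LEXICAL_NBAR, "Nbar", coordination))     # 1
```

With the unary rule, the coordination can be built either at the N level or at the Nbar level, without any difference in meaning. With lexical Nbar entries only one analysis remains. For the full noun phrase each grammar yields one additional analysis, in which the adjective combines with the first conjunct only; that ambiguity is a genuine one.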
The Duden grammar (Eisenberg et al. 2005: §1300) offers examples such as those in
(60), which show that prepositional phrases can contain an element that further specifies
the semantic contribution of the preposition, for example by indicating a measurement:
(60) a. [[Einen Schritt] vor dem Abgrund] blieb er stehen.
one step before the abyss remained he stand
‘He stopped one step in front of the abyss.’
b. [[Kurz] nach dem Start] fiel die Klimaanlage aus.
shortly after the take.off fell the air.conditioning out
‘Shortly after take off, the air conditioning stopped working.’
c. [[Schräg] hinter der Scheune] ist ein Weiher.
diagonally behind the barn is a pond
‘There is a pond diagonally across from the barn.’
d. [[Mitten] im Urwald] stießen die Forscher auf einen alten Tempel.
middle in.the jungle stumbled the researchers on an old temple
‘In the middle of the jungle, the researchers came across an old temple.’
To analyze the sentences in (60a,b), one could propose the following rules in (61):
(61) a. PP → NP PP
b. PP → AP PP
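[Figure 2.7: PP structures: on the left a PP consisting of P and NP, on the right a PP in which an AP precedes the combination of P and NP]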
At this point, the attentive reader is probably wondering why there is no empty measurement
phrase in the left figure of Figure 2.7, which one might expect in analogy to the
empty determiner in Figure 2.6. The reason for the empty determiner in Figure 2.6 is that
the entire noun phrase without the determiner has a meaning similar to those with a determiner.
The meaning normally contributed by the visible determiner has to somehow
be incorporated in the structure of the noun phrase. If we did not place this meaning in
the empty determiner, this would lead to more complicated assumptions about semantic
combination: we only really require the mechanisms presented in Section 2.3 and these
are very general in nature. The meaning is contributed by the words themselves and not
by any rules. If we were to assume a unary branching rule such as that in the left tree in
Figure 2.7 instead of the empty determiner, then this unary branching rule would have
to provide the semantics of the determiner. This kind of analysis has also been proposed
by some researchers. See Chapter 19 for more on empty elements.
Unlike determiner-less NPs, prepositional phrases without an indication of degree or
measurement do not lack any meaning component for composition. It is therefore not
necessary to assume an empty indication of measurement, which somehow contributes
to the meaning of the entire PP. Hence, the rule in (63c) states that a prepositional phrase
consists of P̄, that is, a combination of P and NP.
2.5 X̄ theory
If we look again at the rules formulated in the previous section, we see that heads are
always combined with their complements to form a new constituent (65a,b), which can
then be combined with further constituents (65c,d):
(65) a. N̄ → N PP
b. P̄ → P NP
c. NP → Det N̄
d. PP → NP P̄
Grammarians working on English noticed that parallel structures can be used for phrases
which have adjectives or verbs as their head. I discuss adjective phrases at this point and
postpone the discussion of verb phrases to Chapter 3. As in German, certain adjectives in
English can take complements with the important restriction that adjective phrases with
complements cannot realize these pre-nominally in English. (66) gives some examples
of adjective phrases:
(66) a. Kim and Sandy are proud.
b. Kim and Sandy are very proud.
c. Kim and Sandy are proud of their child.
d. Kim and Sandy are very proud of their child.
Unlike the complements of prepositions, complements of adjectives are normally optional:
proud can be used with or without a PP. The degree expression very is also optional.
The rules which we need for this analysis are given in (67), with the corresponding
structures in Figure 2.8.
(67) a. AP → Ā
b. AP → AdvP Ā
c. Ā → A PP
d. Ā → A
[Figure 2.8: Structures for the adjective phrases proud, very proud, proud of their child and very proud of their child]
As was shown in Section 2.2, it is possible to generalize over very specific phrase
structure rules and thereby arrive at more general rules. In this way, properties such as
person, number and gender are no longer encoded in the category symbols, but rather
only simple symbols such as NP, Det and N are used. It is only necessary to specify
something about the values of a feature if it is relevant in the context of a given rule. We
can take this abstraction a step further: instead of using explicit category symbols such
as N, V, P and A for lexical categories and NP, VP, PP and AP for phrasal categories, one
can simply use a variable for the word class in question and speak of X and XP.
This form of abstraction can be found in so-called X̄ theory (or X-bar theory; the term
bar refers to the line above the symbol), which was developed by Chomsky (1970) and
refined by Jackendoff (1977). This form of abstract rules plays an important role in many
different theories, for example Government & Binding (Chapter 3), Generalized Phrase
Structure Grammar (Chapter 5) and Lexical Functional Grammar (Chapter 7). In HPSG
(Chapter 9), X̄ theory also plays a role, but not all restrictions of the X̄ schema have been
adopted.
(68) shows a possible instantiation of X̄ rules, where N has been used in
place of the category X, as well as examples of word strings which can be derived by these rules:
(68) X̄ rule    with specific categories    example strings
[Figure 2.9: X̄ structures: XP consists of a specifier and X̄; X̄ can consist of an adjunct and X̄ or of a complement and X]
Some categories do not have a specifier or have the option of having one. Adjuncts are
optional and therefore not all structures have to contain an X̄ with an adjunct daughter.
In addition to the branching shown in the right-hand figure, adjuncts to XP and head-adjuncts
are sometimes possible. There is only a single rule in (68) for cases in which a
head precedes the complements; however, an order in which the complement precedes
the head is of course also possible. This is shown in Figure 2.9.
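The step from a category-neutral schema to category-specific rules can be sketched as a simple substitution. The schema below is a simplified stand-in with one rule each for specifier, adjunct and complement. The labels `Xbar`, `DetP` and the function `instantiate` are assumptions of this sketch rather than part of the theory's definition.

```python
# Simplified schema: one rule each for specifier, adjunct and complement
XBAR_SCHEMA = [
    ("XP", ("specifier", "Xbar")),
    ("Xbar", ("adjunct", "Xbar")),
    ("Xbar", ("X", "complement")),
]


def instantiate(schema, head, fillers):
    """Replace X by a concrete head category and role names by concrete categories."""
    def fill(symbol):
        if symbol in fillers:
            return fillers[symbol]
        return symbol.replace("X", head)   # XP -> NP, Xbar -> Nbar, X -> N
    return [(fill(lhs), tuple(fill(s) for s in rhs)) for lhs, rhs in schema]


noun_rules = instantiate(
    XBAR_SCHEMA, "N", {"specifier": "DetP", "adjunct": "AP", "complement": "PP"}
)
for lhs, rhs in noun_rules:
    print(lhs, "->", " ".join(rhs))
```

Choosing a different head category and different fillers yields the corresponding rules for prepositional or adjective phrases from the same three schematic rules.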
Figure 2.10 on the next page shows the analysis of the NP structures das Bild ‘the pic-
ture’ and das schöne Bild von Paris ‘the beautiful picture of Paris’. The NP structures in
Figure 2.10 and the tree for proud in Figure 2.8 show examples of minimally populated
structures. The left tree in Figure 2.10 is also an example of a structure without an adjunct.
The right-hand structure in Figure 2.10 is an example of the maximally populated
structure: specifier, adjunct, and complement are present.
The analysis given in Figure 2.10 assumes that all non-heads in a rule are phrases.
One therefore has to assume that there is a determiner phrase even if the determiner is
not combined with other elements. The unary branching of determiners is not elegant
but it is consistent.11 The unary branchings for the NP Paris in Figure 2.10 may also
seem somewhat odd, but they actually become more plausible when one considers more
complex noun phrases:
(69) a. das Paris der dreißiger Jahre
the Paris of.the thirty years
‘30’s Paris’
b. die Maria aus Hamburg
the Maria from Hamburg
‘Maria from Hamburg’
[Figure 2.10: X̄ analysis of das Bild ‘the picture’ and das schöne Bild von Paris ‘the beautiful picture of Paris’]
11 For an alternative version of X̄ theory which does not assume elaborate structure for determiners see Muysken (1982).
Unary projections are somewhat inelegant but this should not concern us too much
here, as we have already seen in the discussion of the lexical entries in (56) that unary
branching nodes can be avoided for the most part and that it is indeed desirable to avoid
such structures. Otherwise, one gets spurious ambiguities. In the following chapters, we
will discuss approaches such as Categorial Grammar and HPSG, which do not assume
unary rules for determiners, adjectives and nouns.
Furthermore, other X̄ theoretical assumptions will not be shared by several theories
discussed in this book. In particular, the assumption that non-heads always have to
be maximal projections will be disregarded. Pullum (1985) and Kornai & Pullum (1990)
have shown that the respective theories are not necessarily less restrictive than theories
which adopt a strict version of X̄ theory. See also the discussion in Section 13.1.2.
Comprehension questions
1. Why are phrase structure grammars that use only atomic categories inadequate
for the description of natural languages?
2. Assuming the grammar in (6), state which steps (replacing symbols) one has
to take to get to the symbol V in the sentence (70).
Exercises
(72) a. NP → Det N̄
b. N̄ → N
c. Det → 𝜖
d. N → 𝜖
The rule in (73) combines an unlimited number of modifiers with the noun
books followed by an unlimited number of modifiers. We can use this rule to
derive phrases such as those in (74):
(74) a. books
b. interesting books
c. interesting books from Stuttgart
Adj stands for something that can be a single word like poor or complex like
very poor.
Revisit the German data in (45) and (46) and explain why such an analysis
and even a more general one as in (77) would not extend to German.
7. Why can X theory not account for German adjective phrases without addi-
tional assumptions? (This task is for (native) speakers of German only.)
8. Come up with a phrase structure grammar that can be used to analyze the
sentence in (78), but also rules out the sentences in (79).
9. Consider which additional rules would have to be added to the grammar you
developed in the previous exercise in order to be able to analyze the following
sentences:
10. Use the online version of SWI-Prolog (https://swish.swi-prolog.org/, 2020-06-07) to test your grammar using a computer.
Details regarding the notation can be found in the English Wikipedia entry
for Definite Clause Grammar (DCG) (https://en.wikipedia.org/wiki/Definite_clause_grammar, 2020-06-07).
Further reading
3 Transformational Grammar –
Government & Binding
Transformational Grammar and its subsequent incarnations (such as Government and
Binding Theory and Minimalism) were developed by Noam Chomsky at MIT in Cambridge, Massachusetts
(Chomsky 1957; 1965; 1975; 1981a; 1986a; 1995b). Manfred Bierwisch (1963) was the first to
implement Chomsky’s ideas for German. In the 60s, the decisive impulse came from the
Arbeitsstelle Strukturelle Grammatik ‘Workgroup for Structural Grammar’, which was
part of the Academy of Science of the GDR. See Bierwisch 1992 and Vater 2010 for a
historical overview. In addition to Bierwisch’s work, the following books focusing on German
or the Chomskyan research program in general should also be mentioned: Fanselow
(1987), Fanselow & Felix (1987), von Stechow & Sternefeld (1988), Grewendorf (1988),
Haider (1993), Sternefeld (2006).
The different implementations of Chomskyan theories are often grouped under the
heading Generative Grammar. This term comes from the fact that phrase structure grammars
and the augmented frameworks that were suggested by Chomsky can generate sets
of well-formed expressions (see p. 54). It is such a set of sentences that constitutes a language
(in the formal sense) and one can test whether a sentence forms part of a language by
checking whether it is in the set of sentences generated by a given grammar.
In this sense, simple phrase structure grammars and, with corresponding formal assumptions,
GPSG, LFG, HPSG and Construction Grammar (CxG) are generative theories. In
recent years, a different view of the formal basis of theories such as LFG, HPSG and
CxG has emerged such that the aforementioned theories are now viewed as model theoretic theories
rather than generative-enumerative ones (see Chapter 14 for discussion).1 In 1965,
Chomsky defined the term Generative Grammar in the following way (see also Chomsky
1995b: 162):
A grammar of a language purports to be a description of the ideal speaker-hearer’s
intrinsic competence. If the grammar is, furthermore, perfectly explicit – in other
words, if it does not rely on the intelligence of the understanding reader but rather
provides an explicit analysis of his contribution – we may call it (somewhat redun-
dantly) a generative grammar. (Chomsky 1965: 4)
In this sense, all grammatical theories discussed in this book would be viewed as gen-
erative grammars. To differentiate further, sometimes the term Mainstream Generative
Grammar (MGG) is used (Culicover & Jackendoff 2005: 3) for Chomskyan models. In this
chapter, I will discuss a well-developed and very influential version of Chomskyan grammar,
GB theory. More recent developments following Chomsky’s Minimalist Program
are dealt with in Chapter 4.
1 Model theoretic approaches are always constraint-based and the terms model theoretic and constraint-based are sometimes used synonymously.
3.1.1 Transformations
In the previous chapter, I introduced simple phrase structure grammars. Chomsky (1957:
Chapter 5) criticized this kind of rewrite grammar since – in his opinion – it is not clear
how one can capture the relationship between active and passive sentences or the various
ordering possibilities of constituents in a sentence. While it is of course possible to
formulate different rules for active and passive sentences in a phrase structure grammar
(e.g., one pair of rules for intransitive (1), one for transitive (2) and one for ditransitive
verbs (3)), this would not adequately capture the fact that the same phenomenon occurs in
the example pairs in (1)–(3):
(1) a. weil dort noch jemand arbeitet
because there still somebody works
‘because somebody is still working there’
b. weil dort noch gearbeitet wurde
because there still worked was
‘because work was still being done there’
3.1 General remarks on the representational format
Chomsky (1957: 43) suggests a transformation that creates a connection between active
and passive sentences. The transformation that he suggests for English corresponds to
(4), which is taken from Klenk (2003: 74):
(4) NP V NP → 3 [AUX be] 2en [PP [P by] 1]
    1  2 3
This transformational rule maps a tree with the symbols on the left-hand side of the rule
onto a tree with the symbols on the right-hand side of the rule. Accordingly, 1, 2 and
3 on the right of the rule correspond to the symbols that carry these numbers on the
left-hand side. en stands for the morpheme which forms the participle (seen, been, …, but
also loved). Both trees for (5a,b) are shown in Figure 3.1.
[Figure 3.1: Trees for the active sentence and the passive sentence in (5a,b)]
The symbols on the left of transformational rules do not necessarily have to be in a local
tree, that is, they can be daughters of different mothers as in Figure 3.1.
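The mapping in (4) can be sketched as a function from one sequence of labelled elements to another. The sketch is deliberately simplified: it operates on a flat sequence instead of trees, ignores tense and agreement, and looks up the participle in a hypothetical mini-lexicon. The name `passivize` and the data format are assumptions of this illustration.

```python
# Hypothetical mini-lexicon for "2en": verb stem -> participle
PARTICIPLES = {"love": "loved", "see": "seen"}


def passivize(structure):
    """Apply (4) to a flat [(category, string), ...] sequence matching NP V NP."""
    if [category for category, _ in structure] != ["NP", "V", "NP"]:
        raise ValueError("structural description NP V NP is not met")
    (_, one), (_, two), (_, three) = structure
    return [
        ("NP", three),                        # 3
        ("AUX", "be"),                        # [AUX be]
        ("V", PARTICIPLES[two]),              # 2en
        ("PP", [("P", "by"), ("NP", one)]),   # [PP [P by] 1]
    ]


print(passivize([("NP", "John"), ("V", "love"), ("NP", "Mary")]))
```

The function states the active–passive relationship once, independently of the valence class of the verb, which is what separate phrase structure rules for actives and passives cannot do.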
Rewrite grammars were divided into four complexity classes based on the properties
they have. The simplest grammars are assigned to Type 3, whereas the most complex
are of Type 0. The so-called context-free grammars we have dealt with thus far
2 For more on the power of formal languages, see Chapter 17.
order to be able to set parameters. Chomsky (2000: 8) compares the setting of parameters
to flipping a switch. For a detailed discussion of the various assumptions about
language acquisition in the P&P-model, see Chapter 16. Speakers of English have to
learn that heads occur before their complements in their language, whereas speakers
of Japanese have to learn that heads follow their complements. (7) gives the respective
examples:
(7) a. be showing pictures of himself
b. zibun -no syasin-o mise-te iru
REFL from picture showing be
As one can see, the Japanese verb, noun and prepositional phrases are a mirror image of
the corresponding phrases in English. (8) provides a summary and shows the parametric
value for the position parameter:
(8) Language   Observation                      Parameter: head initial
    English    Heads occur before complements   +
    Japanese   Heads occur after complements    −
Investigating languages based on their differences with regard to certain assumed parameters
has proven to be a very fruitful line of research in the last few decades and has
resulted in an abundance of comparative cross-linguistic studies.
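The effect of the parameter in (8) can be illustrated with a toy linearization function. It assumes a nested (head, complement) representation built from the glosses in (7b); both this representation and the function name `linearize` are assumptions of the sketch.

```python
def linearize(tree, head_initial):
    """Order each head before or after its complement, depending on the parameter."""
    if isinstance(tree, str):
        return [tree]
    head, complement = tree
    complement_words = linearize(complement, head_initial)
    return [head] + complement_words if head_initial else complement_words + [head]


# (head, complement) pairs using the glosses of (7b)
phrase = ("be", ("showing", ("picture", ("from", "REFL"))))

print(linearize(phrase, head_initial=True))   # English-type order
print(linearize(phrase, head_initial=False))  # Japanese-type order
```

Flipping the single Boolean value yields the mirror-image order, which is the sense in which setting the parameter is compared to flipping a switch.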
After these introductory comments on language acquisition, the following sections
will discuss the basic assumptions of GB theory.
[Figure: the model with D-structure, move 𝛼, S-structure, Phonetic Form (PF) and Logical Form (LF)]
referred to as the T-model (or Y-model) because D-structure, S-structure, PF and LF form
an upside-down T (or Y). We will look at each of these individual components in more
detail.
Using phrase structure rules, one can describe the relationships between individual
elements (for instance words and phrases, sometimes also parts of words). The format
for these rules is X̄ syntax (see Section 2.5). The lexicon, together with the structure
licensed by X̄ syntax, forms the basis for D-structure. D-structure is then a syntactic
representation of the selectional grid (= valence classes) of individual word forms in the
lexicon.
The lexicon contains a lexical entry for every word which comprises information
about morphophonological structure, syntactic features and selectional properties. This
will be explained in more detail in Section 3.1.3.4. Depending on one’s exact theoreti-
cal assumptions, morphology is viewed as part of the lexicon. Inflectional morphology
is, however, mostly consigned to the realm of syntax. The lexicon is an interface for
semantic interpretation of individual word forms.
The surface position in which constituents are realized is not necessarily the posi-
tion they have in D-structure. For example, a sentence with a ditransitive verb has the
following ordering variants:
(10) a. [dass] der Mann der Frau das Buch gibt
that the.NOM man the.DAT woman the.ACC book gives
‘that the man gives the woman the book’
The symbol ∀ stands for a universal quantifier and ∃ stands for an existential quantifier.
The first formula corresponds to the reading that for every man, there is a woman whom
he loves, and these can in fact be different women. Under the second reading, there is
one woman such that all men love her. The question of when such an ambiguity
arises and which reading is possible when depends on the syntactic properties of the
given utterance. LF is the level which is important for the meaning of determiners such
as a and every.
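The difference between the two readings can be checked against a small model. The formulas in the docstrings are one standard way of writing the two scope orders and, like the individuals and the `loves` relation, are assumptions of this sketch.

```python
def every_man_loves_some_woman(men, women, loves):
    """∀x (man(x) → ∃y (woman(y) ∧ love(x, y))): the women may differ per man."""
    return all(any((m, w) in loves for w in women) for m in men)


def some_woman_is_loved_by_every_man(men, women, loves):
    """∃y (woman(y) ∧ ∀x (man(x) → love(x, y))): one woman for all men."""
    return any(all((m, w) in loves for m in men) for w in women)


men = {"m1", "m2"}
women = {"w1", "w2"}
loves = {("m1", "w1"), ("m2", "w2")}   # each man loves a different woman

print(every_man_loves_some_woman(men, women, loves))        # True
print(some_woman_is_loved_by_every_man(men, women, loves))  # False
```

In this model only the first reading is true. In any model in which the second reading is true, the first is true as well.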
Control Theory is also specified with reference to LF. Control Theory deals with the
question of how the semantic role of the infinitive subject in sentences such as (16) is
filled.
(16) a. Der Professor schlägt dem Studenten vor, die Klausur noch mal zu
the professor suggests the student PART the test once again to
schreiben.
write
‘The professor advises the student to take the test again.’
b. Der Professor schlägt dem Studenten vor, die Klausur nicht zu bewerten.
the professor suggests the student PART the test not to grade
‘The professor suggests to the student not to grade the test.’
c. Der Professor schlägt dem Studenten vor, gemeinsam ins Kino zu gehen.
the professor suggests the student PART together into cinema to go
‘The professor suggests to the student to go to the cinema together.’
3 The exact meaning of the terms is framework-dependent. Coming from an HPSG perspective, I use the first three terms referring to syntactic and semantic information; the latter two refer to the selection of semantic roles. GB researchers often refer to argument structure as containing semantic information, to valence frames as containing syntactic information and to subcategorization as a mix of syntactic and semantic information.
Principle 1 (Theta-Criterion)
• Each theta-role is assigned to exactly one argument position.
• Every phrase in an argument position receives exactly one theta-role.
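The two clauses of the Theta-Criterion amount to a one-to-one pairing of theta-roles and argument positions. The following sketch checks such a pairing. The role names, the example arguments and the function `satisfies_theta_criterion` are hypothetical and serve only as an illustration.

```python
from collections import Counter


def satisfies_theta_criterion(theta_grid, arguments, assignment):
    """Check a list of (role, argument) pairs against both clauses of the criterion.

    Clause 1: every role of the grid is assigned exactly once.
    Clause 2: every argument receives exactly one role.
    """
    assigned_roles = Counter(role for role, _ in assignment)
    role_bearers = Counter(argument for _, argument in assignment)
    return assigned_roles == Counter(theta_grid) and role_bearers == Counter(arguments)


grid = ["agent", "theme"]
args = ["Kim", "the book"]

print(satisfies_theta_criterion(grid, args, [("agent", "Kim"), ("theme", "the book")]))
print(satisfies_theta_criterion(grid, args, [("agent", "Kim"), ("theme", "Kim")]))
```

The second call fails on both counts: one argument would carry two roles and the other argument would receive none.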
The arguments of a head are ordered, that is, one can differentiate between higher- and
lower-ranked arguments. The highest-ranked argument of verbs and adjectives has a
special status. Since GB assumes that it is often (and always in some languages) realized
in a position outside of the verb or adjective phrase, it is often referred to as the external
argument. The remaining arguments occur in positions inside of the verb or adjective
phrase. These kinds of arguments are dubbed internal arguments or complements. For
simple sentences, this often means that the subject is the external argument.
When discussing types of arguments, one can identify three classes of theta-roles:
• Class 1: agent (acting individual), the cause of an action or feeling (stimulus),
holder of a certain property
If a verb has several theta-roles of this kind to assign, Class 1 normally has the highest
rank, whereas Class 3 has the lowest. Unfortunately, the assignment of semantic roles
to actual arguments of verbs has received a rather inconsistent treatment in the litera-
ture. This problem has been discussed by Dowty (1991), who suggests using proto-roles.
An argument is assigned the proto-agent role if it has sufficiently many of the proper-
ties that were identified by Dowty as prototypical properties of agents (e.g., animacy,
volitionality).
The mental lexicon contains lexical entries that list the specific properties of syntactic
words needed to use those words grammatically. Some of these properties are the following:
• form
• meaning (semantics)
• grammatical features: syntactic word class + morphosyntactic features
• theta-grid
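The properties listed above can be pictured as the fields of a record. The following sketch is only an illustration of that idea: the class name, the field names and the concrete values for helfen ‘help’ (in particular the theta-grid) are assumptions made here, not a format proposed in the GB literature.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    form: str                                      # phonological/orthographic form
    meaning: str                                   # semantics, here just a gloss
    word_class: str                                # syntactic word class
    features: dict = field(default_factory=dict)   # morphosyntactic features
    theta_grid: tuple = ()                         # roles to assign, highest-ranked first


# Hypothetical entry; the role labels are illustrative assumptions.
helfen = LexicalEntry(
    form="helfen",
    meaning="help",
    word_class="V",
    features={"finite": False},
    theta_grid=("agent", "beneficiary"),
)
```

Since the arguments of a head are ordered, the first element of `theta_grid` would correspond to the highest-ranked argument in this sketch.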
3.1 General remarks on the representational format
3.1.4 X̄ theory
In GB, it is assumed that all syntactic structures licensed by the core grammar5 correspond
to the X̄ schema (see Section 2.5).6 In the following sections, I will comment on the
syntactic categories assumed and the basic assumptions with regard to the interpretation
of grammatical rules.
• A = adjective
• P = preposition/postposition
• Adv = adverb
4
See Perlmutter (1978) for a discussion of unaccusative verbs. The term ergative verb is also common, albeit
a misnomer. See Burzio (1981; 1986) for the earliest work on unaccusatives in the Chomskyan framework
and Grewendorf (1989) for German. Also, see Pullum (1988) on the usage of these terms and for a historical
evaluation.
5
Chomsky (1981a: 7–8) distinguishes between a regular area of language that is determined by a grammar
that can be acquired using genetically determined language-specific knowledge and a periphery, to which
irregular parts of language such as idioms (e.g., to pull the wool over sb.’s eyes) belong. See Section 16.3.
6
Chomsky (1970: 210) allows for grammatical rules that deviate from the X̄ schema. It is, however, common
practice to assume that languages exclusively use X̄ structures.
Table 3.1: Representation of four lexical categories using two binary features
      −V                +V
−N    P = [ −N, −V ]    V = [ −N, +V ]
+N    N = [ +N, −V ]    A = [ +N, +V ]
Adverbs are viewed as intransitive prepositions and are therefore captured by the de-
composition in the table above.
Using this cross-classification, it is possible to formulate generalizations. One can, for
example, simply refer to adjectives and verbs: all lexical categories which are [ +V ] are
either adjectives or verbs. Furthermore, one can say of [ +N ] categories (nouns and
adjectives) that they can bear case.
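The way such generalizations fall out of the cross-classification can be made concrete with a small sketch. The dictionary below simply restates Table 3.1; the function name `natural_class` is an assumption introduced for illustration.

```python
# Feature decomposition of the four lexical categories as in Table 3.1.
CATEGORIES = {
    "P": {"N": False, "V": False},
    "V": {"N": False, "V": True},
    "N": {"N": True, "V": False},
    "A": {"N": True, "V": True},
}


def natural_class(**features):
    """Return all categories whose feature values match the given ones."""
    return sorted(
        category
        for category, values in CATEGORIES.items()
        if all(values[name] == value for name, value in features.items())
    )


verbal = natural_class(V=True)         # adjectives and verbs
case_bearing = natural_class(N=True)   # adjectives and nouns
```

A single feature value picks out a class of two categories, which is what makes statements such as "[ +N ] categories can bear case" possible without listing the categories one by one.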
Apart from this, some authors have tried to associate the head position with the fea-
ture values in Table 3.1 (see e.g., Grewendorf 1988: 52; Haftka 1996: 124; G. Müller 2011:
238). With prepositions and nouns, the head precedes the complement in German:
(19) a. für Marie
for Marie
b. Bild von Maria
picture of Maria
With adjectives and verbs, the head is final:
(20) a. dem König treu
the king loyal
‘Loyal to the king’
b. der [dem Kind helfende] Mann
the the child helping man
‘the man helping the child’
c. dem Mann helfen
the man help
‘help the man’
These data seem to suggest that the head is final with [ +V ] categories and initial with
[ −V ] categories. Unfortunately, this generalization runs into the problem that there are
also postpositions in German. These are, like prepositions, not verbal, but do occur after
the NP they require:
7
See Chomsky (1970: 199) for a cross-classification of N, A and V, and Jackendoff (1977: Section 3.2) for a
cross-classification that additionally includes P but has a different feature assignment.
TAG, HPSG, Construction Grammar, and Dependency Grammar which allow crossing
branches and therefore discontinuous constituents (Becker, Joshi & Rambow 1991; Reape
1994; Bergen & Chang 2005; Heringer 1996: 261; Eroms 2000: Section 9.6.2).
In X̄ theory, one normally assumes that there are at most two projection levels (X′
and X′′). However, there are some versions of Mainstream Generative Grammar and
other theories which allow three or more levels (Jackendoff 1977; Uszkoreit 1987). In this
chapter, I follow the standard assumption that there are two projection levels, that is,
phrases have at least three levels:
• X0 = head
• X′ = intermediate projection (X̄, read: X bar)
(22) a. S → NP VP
b. S → NP Infl VP
Infl stands for Inflection as inflectional affixes are inserted at this position in the structure.
The symbol AUX was also used instead of Infl in earlier work, since auxiliary verbs are
treated in the same way as inflectional affixes. Figure 3.3 on the next page shows a
sample analysis of a sentence with an auxiliary, which uses the rule in (22b).
Together with its complements, the verb forms a structural unit: the VP. The con-
stituent status of the VP is supported by several constituent tests and further differences
between subjects and objects regarding their positional restrictions.
Figure 3.3: Sentence with an auxiliary verb following Chomsky (1981a: 19). Tree given as a labeled
bracketing: [S [NP Ann] [INFL will] [VP [V′ [V read] [NP the newspaper]]]]
Figure 3.4: Sentence with auxiliary verb in the CP/IP system. Tree given as a labeled
bracketing: [IP [NP Ann] [I′ [I will] [VP [V′ [V read] [NP the newspaper]]]]]
The rules in (22) do not follow the X̄ template since there is no symbol on the right-hand
side of the rule with the same category as one on the left-hand side, that is, there is
no head. In order to integrate rules like (22) into the general theory, Chomsky (1986a: 3)
developed a rule system with two layers above the verb phrase (VP), namely the CP/IP
system. CP stands for Complementizer Phrase. The head of a CP can be a complementizer.
Before we look at CPs in more detail, I will discuss an example of an IP in this new system.
Figure 3.4 shows an IP with an auxiliary in the I0 position. As we can
see, this corresponds to the structure of the X̄ template: I0 is a head, which takes the VP
as its complement and thereby forms I′. The subject is the specifier of the IP. Another
way to phrase this is to say that the subject is in the specifier position of the IP. This
position is usually referred to as SpecIP.9
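The contrast between the headless rules in (22) and the IP structure can be sketched as a simple check on rewrite rules. In the sketch below, a category is encoded as a pair of a category name and a bar level (0, 1 or 2); this encoding, the function name and the decision to give S an arbitrary level are assumptions made for illustration.

```python
def is_headed(rule):
    """Return True if some daughter shares the mother's category.

    A rule is a pair (mother, daughters). Each category is a pair
    (name, bar_level). The head must not have a higher bar level
    than the mother.
    """
    (mother_name, mother_level), daughters = rule
    return any(
        name == mother_name and level <= mother_level
        for name, level in daughters
    )


# (22a) S → NP VP and (22b) S → NP Infl VP; the level 2 for S is a placeholder.
S_RULE = (("S", 2), [("N", 2), ("V", 2)])
S_INFL_RULE = (("S", 2), [("N", 2), ("Infl", 0), ("V", 2)])

# The corresponding rules of the CP/IP system: IP → NP I′ and I′ → I0 VP.
IP_RULE = (("I", 2), [("N", 2), ("I", 1)])
I_BAR_RULE = (("I", 1), [("I", 0), ("V", 2)])
```

Under this encoding, neither S rule has a daughter of the category S, whereas both rules of the IP system contain a projection of I on the right-hand side.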
The sentences in (23) are analyzed as complementizer phrases (CPs); the complementizer
is the head:
In sentences such as (23), the CPs do not have a specifier. Figure 3.5 on the next page
shows the analysis of (23a).
Yes/no-questions in English such as (24) are formed by moving the auxiliary
verb in front of the subject.
(24) Will Ann read the newspaper?
Let us assume that the structure of questions corresponds to the structure of sentences
with complementizers. This means that questions are also CPs. Unlike the sentences in
(23), however, there is no subordinating conjunction. In the D-structure of questions,
the C0 position is empty and the auxiliary verb is later moved to this position. Figure 3.6
shows an analysis of (24). The original position of the auxiliary is marked by the trace
_𝑘, which is coindexed with the moved auxiliary.
9
Sometimes SpecIP and similar labels are used in trees (for instance by Haegeman (1994), Meinunger (2000)
and Lohnstein (2014)). I avoid this in this book since SpecIP and SpecAdvP are not categories like NP or AP or
AdvP but positions that items of a certain category can take. See Chapter 2 on the phrase structure rules
that license trees.
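Figure 3.5: Tree for (23a), given as a labeled bracketing:
[CP [C′ [C that] [IP [NP Ann] [I′ [I will] [VP [V′ [V read] [NP the newspaper]]]]]]]
Figure 3.6: Tree for (24), given as a labeled bracketing:
[CP [C′ [C will𝑘] [IP [NP Ann] [I′ [I _𝑘] [VP [V′ [V read] [NP the newspaper]]]]]]]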
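Figure (caption and terminal words not recoverable): tree of a CP with a fronted NP𝑖 in its specifier
position, given as a labeled bracketing: [CP NP𝑖 [C′ C [IP NP [I′ I [VP [V′ V NP]]]]]]
Figure 3.8: Alternative ways of depicting movement: the moved constituent can be represented
by a trace or by an XP dominating a trace. Panels: [V′ V _𝑖], [V′ V NP], [V′ V NP𝑖]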
Figure 3.8c. This picture is a mix of the two other pictures. The index is associated with
the category and not with the empty phonology. In my opinion, this best depicts the fact
that trace and filler are related. However, I have never seen this way of depicting movement
in the GB literature, and hence I will stick to the more common notation in Figure 3.8b.
This way of depicting movement is also more similar to the representation that is used by all
authors for the movement of words (so-called head-movement). For example, the trace
_𝑘, which stands for a moved I0 in Figure 3.6, is never depicted as a daughter of I′ but
always as a daughter of I0.
So far, I have not discussed sentences without auxiliaries such as (23b). In order
to analyze this kind of sentence, it is usually assumed that the inflectional affix is present
in the I0 position. An example analysis is given in Figure 3.9.
Figure 3.9: Tree given as a labeled bracketing (terminal words not recoverable):
[IP NP [I′ I [VP [V′ V NP]]]]
Since the inflectional affix
precedes the verb, some kind of movement operation still needs to take place. There are
two suggestions in the literature: one is to assume lowering, that is, the affix moves down
to the verb (Pollock 1989: 394; Chomsky 1991; Haegeman 1994: 110, 601; Sportiche et al.
2013). The alternative is to assume that the verb moves up to the affix (Fanselow & Felix
1987: 258–259). Since theories with lowering of inflectional affixes are complicated for
languages in which the verb ultimately ends up in C (basically in all Germanic languages
except English), I follow Fanselow & Felix’s (1987: 258–259) suggestion for English and
Grewendorf’s (1993: 1289) suggestion for German and assume that the verb moves from
V to I in English and from V to I to C in German.10
Following this excursus on the analysis of English sentences, we can now turn to
German.
10
Sportiche, Koopman & Stabler (2013) argue for an affix lowering approach by pointing out that approaches
assuming that the verb stem moves to I (their T) predict that adverbs appear to the right of the verb rather
than to the left:
(i) a. John will carefully study Russian.
b. John carefully studies Russian.
c. * John studies carefully Russian.
If the affix -s is in the position of the auxiliary and the verb moves to the affix, one would expect (i.c) to be
grammatical rather than (i.b).
A third approach is to assume empty I (or more recently T) heads for present and past tense and have
these heads select a fully inflected verb. See Carnie (2013: 220–221) for such an approach to English.
For German, it has also been suggested not to distinguish between I and V at all and to treat auxiliaries like
normal verbs (see footnote 11 below). In such approaches, verbs are inflected as V and no I node is assumed
(Haider 1993; 1997a).
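Figure 3.10: Tree with the nodes CP, C′, IP, I′ and VP over the positions XP, C, XP, V and I,
given as a labeled bracketing: [CP XP [C′ C [IP XP [I′ [VP V] I]]]]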
Note that SpecCP and SpecIP are not category symbols. They do not occur in gram-
mars with rewrite rules. Instead, they simply describe positions in the tree.
As shown in Figure 3.10, it is assumed that the highest argument of the verb (the sub-
ject in simple sentences) has a special status. It is taken for granted that the subject
always occurs outside of the VP, which is why it is referred to as the external argument.
The VP itself does not have a specifier. In more recent work, however, the subject is
generated in the specifier of the VP (Fukui & Speas 1986; Koopman & Sportiche 1991).
In some languages, it is assumed that it moves to a position outside of the VP. In other
languages such as German, this is the case at least under certain conditions (e.g., definite-
ness, see Diesing 1992). I am presenting the classical GB analysis here, where the subject
11
For GB analyses without IP, see Bayer & Kornfilt (1989), Höhle (1991a: 157), Haider (1993; 1997a) and Sterne-
feld (2006: Section IV.3). Haider assumes that the function of I is integrated into the verb. In LFG, an
IP is assumed for English (Bresnan 2001: Section 6.2; Dalrymple 2001: Section 3.2.1), but not for German
(Berman 2003a: Section 3.2.3.2). In HPSG, no IP is assumed.
is outside the VP. All arguments other than the subject are complements of V that
are realized within the VP, that is, they are internal arguments. If the verb requires just
one complement, then this is the sister of the head V0 and the daughter of V′ according
to the X̄ schema. The accusative object is the prototypical complement.
Following the X̄ template, adjuncts branch off above the complements of V′. The
analysis of a VP with an adjunct is shown in Figure 3.11.
(26) weil der Mann morgen den Jungen trifft
because the man tomorrow the boy meets
‘because the man is meeting the boy tomorrow’
Figure 3.11: Tree of a VP with an adjunct, given as a labeled bracketing:
[VP [V′ AdvP [V′ NP V]]]
3.2 Verb position
Welsh and Arabic are VSO languages. Around 40 % of all languages are SOV
languages and around 35 % are SVO (Dryer 2013c).
The assumption of verb-final order as the base order is motivated by the following
observations:13
This unit can only be seen in verb-final structures, which speaks for the fact that
this structure reflects the base order.
Verbs which are derived from a noun by back-formation (e.g., uraufführen ‘to perform
something for the first time’) often cannot be divided into their component
parts, and V2 clauses are therefore ruled out (this was first mentioned by Höhle
(1991b) in unpublished work; the first published source is Haider (1993: 62)):
The examples show that there is only one possible position for this kind of verb.
This order is the one that is assumed to be the base order.
2. Verbs in non-finite clauses and in finite subordinate clauses with a conjunction are
always in final position (I am ignoring the possibility of extraposing constituents):
3. If one compares the position of the verb in German with Danish (Danish is an SVO
language like English), then one can clearly see that the verbs in German form a
cluster at the end of the sentence, whereas they occur before any objects in Danish
(Ørsnes 2009a: 146):
4. The scope relations of the adverbs in (31) depend on their order: the left-most ad-
verb has scope over the two following elements.14 This was explained by assuming
the following structure:
14
At this point, it should be mentioned that there seem to be exceptions to the rule that modifiers to the
left take scope over those to their right. Kasper (1994: 47) discusses examples such as (i), which go back to
Bartsch & Vennemann (1972: 137).
As Koster (1975: Section 6) and Reis (1980: 67) have shown, these are not particularly convincing counter-
examples as the right sentence bracket is not filled in these examples and therefore the examples are not
necessarily instances of normal reordering inside of the middle field, but could instead involve extraposi-
tion of the PP. As noted by Koster and Reis, these examples become ungrammatical if one fills the right
bracket and does not extrapose the causal adjunct:
However, the following example from Crysmann (2004: 383) shows that, even with the right bracket occu-
pied, one can still have an order where an adjunct to the right has scope over one to the left:
(iii) Da muß es schon erhebliche Probleme mit der Ausrüstung gegeben haben, da wegen
there must EXPL already serious problems with the equipment given have since because.of
schlechten Wetters ein Reinhold Messmer niemals aufgäbe.
bad weather a Reinhold Messmer never would.give.up
‘There really must have been some serious problems with the equipment because someone like
Reinhold Messmer would never give up just because of some bad weather.’
Nevertheless, this does not change anything regarding the fact that the corresponding cases in (31) and
(32) have the same scope relations regardless of the position of the verb. The general means of semantic
composition may well have to be implemented in the same way as in Crysmann’s analysis.
3.3 Long-distance dependencies
It is interesting to note that scope relations are not affected by verb position. If
one assumes that sentences with verb-second order have the underlying structure
in (31), then this fact requires no further explanation. (32) shows the derived S-
structure for (31):
After motivating and briefly sketching the analysis of verb-final order, I will now look
at the CP/IP analysis of German in more detail. C0 corresponds to the left sentence
bracket and can be filled in two different ways: in subordinate clauses introduced by
a conjunction, the subordinating conjunction (the complementizer) occupies C0 as in
English. The verb remains in the right sentence bracket, as illustrated by (33).
(33) dass jeder diesen Mann kennt
that everybody this man knows
‘that everybody knows this man’
Figure 3.12 on the following page gives an analysis of (33). In verb-first and verb-second
clauses, the finite verb is moved to C0 via the I0 position: V0 → I0 → C0 (Grewendorf
1993: 1289). Figure 3.13 on page 107 shows the analysis of (34):
(34) Kennt jeder diesen Mann?
knows everybody this man
‘Does everybody know this man?’
The C0 position is empty in the D-structure of (34). Since it is not occupied by a comple-
mentizer, the verb can move there.
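The complementary distribution of complementizer and fronted finite verb can be sketched as follows. The sketch is deliberately simplified: a clause is represented as a flat record with a C0 slot, the middle field and the clause-final verb position, and the intermediate I0 step of V0 → I0 → C0 is collapsed into a single move. The function names and the record layout are assumptions made for illustration.

```python
TRACE = "_k"  # marks the base position of the moved verb


def move_verb_to_c(clause):
    """Move the finite verb to C0 if C0 is empty; otherwise leave the clause as it is."""
    if clause["C"] is not None:
        # C0 is occupied by a complementizer, so the verb stays in final position.
        return dict(clause)
    moved = dict(clause)
    moved["C"] = clause["V"]
    moved["V"] = TRACE
    return moved


def linearize(clause):
    """Spell out the clause from left to right, skipping empty positions and traces."""
    positions = [clause["C"], *clause["middle"], clause["V"]]
    return " ".join(p for p in positions if p is not None and p != TRACE)


# D-structures for (33) and (34).
SUBORDINATE = {"C": "dass", "middle": ["jeder", "diesen Mann"], "V": "kennt"}
QUESTION = {"C": None, "middle": ["jeder", "diesen Mann"], "V": "kennt"}

subordinate_order = linearize(move_verb_to_c(SUBORDINATE))
question_order = linearize(move_verb_to_c(QUESTION))
```

In this sketch, the verb-final order of (33) and the verb-first order of (34) result from the same base order, and the only difference is whether C0 is filled in the D-structure.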
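Figure 3.12: Tree for (33), given as a labeled bracketing (terminal words not recoverable):
[CP [C′ C [IP NP [I′ [VP [V′ NP V]] I]]]]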
Since any constituent can be placed in front of the finite verb, German is treated typo-
logically as one of the verb-second languages (V2). Thus, it is a verb-second language
with SOV base order. English, on the other hand, is an SVO language without the V2
property, whereas Danish is a V2 language with SVO as its base order (see Ørsnes 2009a
for Danish).
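Figure 3.13: Tree for (34), given as a labeled bracketing (terminal words not recoverable):
[CP [C′ C [IP NP [I′ [VP [V′ NP V]] I]]]]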
Figure 3.14 on the following page shows the structure derived from Figure 3.13. The
crucial factor for deciding which phrase to move is the information structure of the sen-
tence. That is, material connected to previously mentioned or otherwise-known infor-
mation is placed further left (preferably in the prefield) and new information tends to
occur to the right. Fronting to the prefield in declarative clauses is often referred to as
topicalization. But this is rather a misnomer, since the focus (informally: the constituent
being asked for) can also occur in the prefield. Furthermore, expletive pronouns can
occur there and these are non-referential and as such cannot be linked to preceding or
known information, hence expletives can never be topics.
Transformation-based analyses also work for so-called long-distance dependencies, that
is, dependencies crossing several phrase boundaries:
(37) a. [Um zwei Millionen Mark]𝑖 soll er versucht haben, [eine
around two million Deutsche.Marks should he tried have an
Versicherung _𝑖 zu betrügen].15
insurance.company to deceive
‘He apparently tried to cheat an insurance company out of two million Deutsche Marks.’
15
taz, 04.05.2001, p. 20.
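Figure 3.14: Tree with a fronted NP𝑖 in the specifier of CP, given as a labeled bracketing
(terminal words not recoverable): [CP NP𝑖 [C′ C [IP NP [I′ [VP [V′ NP V]] I]]]]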
b. „Wer𝑖 , glaubt er, daß er _𝑖 ist?“ erregte sich ein Politiker vom Nil.16
who believes he that he is retort REFL a politician from.the Nile
‘ “Who does he think he is?”, a politician from the Nile exclaimed.’
c. Wen𝑖 glaubst du, daß ich _𝑖 gesehen habe?17
who believe you that I seen have
‘Who do you think I saw?’
d. [Gegen ihn]𝑖 falle es den Republikanern hingegen schwerer,
against him fall it the Republicans however more.difficult
[ [ Angriffe _𝑖 ] zu lancieren].18
attacks to launch
‘It is, however, more difficult for the Republicans to launch attacks against
him.’
The elements in the prefield in the examples in (37) all originate from more deeply em-
bedded phrases. In GB, it is assumed that long-distance dependencies across sentence
boundaries are derived in steps (Grewendorf 1988: 75–79), that is, in the analysis of
16
Spiegel, 8/1999, p. 18.
17
Scherpenisse (1986: 84).
18
taz, 08.02.2008, p. 9.
(37c), the interrogative pronoun is moved to the specifier position of the dass-clause and
is moved from there to the specifier of the matrix clause. The reason for this is that there
are certain restrictions on movement which must be checked locally.
3.4 Passive
Before I turn to the analysis of the passive in Section 3.4.2, the first subsection will
elaborate on the differences between structural and lexical case.
Figure 3.15 shows the Case Principle in action with the example in (42a).21
(42) a. [dass] der Mann der Frau den Jungen zeigt
that the man the.DAT woman the.ACC boy shows
‘that the man shows the boy to the woman’
b. [dass] der Junge der Frau gezeigt wird
that the boy.NOM the.DAT woman shown is
‘that the boy is shown to the woman’
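Figure 3.15: Tree for (42a), given as a labeled bracketing:
[IP [NP der Mann] [I′ [VP [V′ [NP der Frau] [V′ [NP den Jungen] [V zeig-]]]] [I -t]]]
(gloss: the man the woman the boy show- -s). The legend of the original figure distinguishes
three kinds of assignment: just case, just theta-role, and case and theta-role.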
The passive morphology blocks the subject and absorbs the structural accusative. The
object that would get accusative in the active receives only a semantic role in its base
position in the passive, but it does not get the absorbed case. Therefore, it has to move
to a position where case can be assigned to it (Chomsky 1981a: 124). Figure 3.16 shows
how this works for example (42b). This movement-based analysis works well for English
21
The figure does not correspond to X̄ theory in its classic form, since der Frau ‘the woman’ is a complement
which is combined with V′. In classical X̄ theory, all complements have to be combined with V0. This
leads to a problem in ditransitive structures since the structures have to be binary (see Larson (1988) for a
treatment of double object constructions). Furthermore, in the following figures the verb has been left in
V0 for reasons of clarity. In order to create a well-formed S-structure, the verb would have to move to its
affix in I0. Note also that the assignment of the subject theta-role by the verb crosses a phrase boundary.
This problem can be solved by assuming that the subject is generated within the VP, gets a theta-role there
and then moves to SpecIP. An alternative suggestion was to assume that the VP assigns a semantic role to
SpecIP.
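Figure 3.16: Tree for (42b), given as a labeled bracketing:
[IP [NP der Junge𝑖] [I′ [VP [V′ [NP der Frau] [V′ [NP _𝑖] [V gezeigt wir-]]]] [I -d]]]
(gloss: the boy the woman shown is). The legend of the original figure distinguishes
three kinds of assignment: just case, just theta-role, and case and theta-role.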
(43c) shows that filling the subject position with an expletive is not possible, so the object
really has to move. However, Lenerz (1977: Section 4.4.3) showed that such a movement
is not obligatory in German:
In comparison to (44c), (44b) is the unmarked order. der Ball ‘the ball’ in (44b) occurs
in the same position as den Ball in (44a), that is, no movement is necessary. Only the
case differs. (44c) is, however, somewhat marked in comparison to (44b). So, if one
assumed (44c) to be the normal order for passives and (44b) to be derived from this by
movement of dem Jungen ‘the boy’, (44b) should be more marked than (44c), contrary
to the facts. To solve this problem, an analysis involving abstract movement has been
proposed for cases such as (44b): the elements stay in their positions, but are connected
to the subject position and receive their case information from there. Grewendorf (1988:
155–157; 1993: 1311) assumes that there is an empty expletive pronoun in the subject
position of sentences such as (44b) as well as in the subject position of sentences with
an impersonal passive such as (45):22
(45) weil heute nicht gearbeitet wird
because today not worked is
‘because there will be no work done today’
A silent expletive pronoun is something that one cannot see or hear and that does not
carry any meaning. For discussion of this kind of empty element, see Section 13.1.3 and
Chapter 19.
In the following chapters, I describe alternative treatments of the passive that do with-
out mechanisms such as empty elements that are connected to argument positions and
that seek to describe the passive in a more general, cross-linguistically consistent man-
ner as the suppression of the most prominent argument.
A further question which needs to be answered is why the accusative object does not
receive case from the verb. This is captured by a constraint, which goes back to Burzio
(1986: 178–185) and is therefore referred to as Burzio’s Generalization.23
(46) Burzio’s Generalization (modified):
If V does not have an external argument, then it does not assign (structural) ac-
cusative case.
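The interaction of passive morphology and the modified generalization in (46) can be sketched as follows. The sketch only encodes the one direction stated in (46), namely that a verb without an external argument does not assign structural accusative. The class, its fields and the function names are assumptions made for illustration, and passivization is reduced here to the suppression of the external argument.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Verb:
    stem: str
    has_external_argument: bool
    has_structural_object: bool  # assumption: the verb selects an object with structural case


def assigns_structural_accusative(verb):
    """(46): without an external argument, no structural accusative is assigned."""
    return verb.has_external_argument and verb.has_structural_object


def passivize(verb):
    """Passive morphology blocks the subject, that is, the external argument."""
    return replace(verb, has_external_argument=False)


zeigen = Verb(stem="zeig-", has_external_argument=True, has_structural_object=True)
gezeigt = passivize(zeigen)
```

In the active, `zeigen` can assign accusative to its object. After passivization, the object is left without case in its base position, which is the situation that motivates movement to the subject position in the analysis above.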
Koster (1986: 12) has pointed out that the passive in English cannot be derived by Case
Theory since if one allowed empty expletive subjects for English as well as German and
22
See Koster (1986: 11–12) for a parallel analysis for Dutch as well as Lohnstein (2014) for a movement-based
account of the passive that also involves an empty expletive for the analysis of the impersonal passive.
23
Burzio’s original formulation was equivalent to the following: a verb assigns accusative if and only if it
assigns a semantic role to its subject. This claim is problematic in both directions. In (i), the verb does not
assign a semantic role to the subject; nevertheless, there is accusative case:
One therefore has to differentiate between structural and lexical accusative and modify Burzio’s General-
ization accordingly. The existence of verbs like begegnen ‘to bump into’ is problematic for the other side
of the implication. begegnen has a subject but still does not assign accusative but rather dative:
See Haider (1999) and Webelhuth (1995: 89) as well as the references cited there for further problems
with Burzio’s Generalization.
Dutch, then it would be possible to have analyses such as the following in (47) where np
is an empty expletive:
(47) np was read the book.
Koster rather assumes that subjects in English are either bound by other elements (that is,
non-expletive) or lexically filled, that is, filled by visible material. Therefore, the structure
in (47) would be ruled out and it would be ensured that the book would have to be placed
in front of the finite verb so that the subject position is filled.
3.5 Local reordering
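Figure (caption and terminal words not recoverable): tree with an accusative NP adjoined to IP,
given as a labeled bracketing: [IP NP[acc]𝑖 [IP NP[nom] [I′ [VP [V′ NP[dat] [V′ NP V]]] I]]]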
only one reading possible. If movement has taken place, however, then there are two
possible readings (Frey 1993: 185):
(49) a. Es ist nicht der Fall, daß er mindestens einem Verleger fast jedes Gedicht
it is not the case that he at.least one publisher almost every poem
anbot.
offered
‘It is not the case that he offered at least one publisher almost every poem.’
b. Es ist nicht der Fall, daß er fast jedes Gedicht𝑖 mindestens einem Verleger
it is not the case that he almost every poem at.least one publisher
_𝑖 anbot.
offered
‘It is not the case that he offered almost every poem to at least one publisher.’
It turns out that approaches assuming traces run into problems as they predict certain
readings for sentences with multiple traces which do not exist (see Kiss 2001: 146 and
Fanselow 2001: Section 2.6). For instance in an example such as (50), it should be possible
to interpret mindestens einem Verleger ‘at least one publisher’ at the position of _𝑖 , which
would lead to a reading where fast jedes Gedicht ‘almost every poem’ has scope over
mindestens einem Verleger ‘at least one publisher’. However, this reading does not exist.
(50) Ich glaube, dass mindestens einem Verleger𝑖 fast jedes Gedicht𝑗 nur dieser
I believe that at.least one publisher almost every poem only this
Dichter _𝑖 _𝑗 angeboten hat.
poet offered has
‘I think that only this poet offered almost every poem to at least one publisher.’
Sauerland & Elbourne (2002: 308) discuss analogous examples from Japanese, which
they credit to Kazuko Yatsushiro. They develop an analysis where the first step is to
move the accusative object in front of the subject. Then, the dative object is placed in
front of that and then, in a third movement, the accusative is moved once more. The
last movement can take place either to construct the S-structure24 or to
construct the Phonological Form. In the latter case, this movement will not have any
semantic effects. While this analysis can predict the correct available readings, it does
require a number of additional movement operations with intermediate steps.
The alternative to a movement analysis is so-called base generation: the starting struc-
ture generated by phrase structure rules is referred to as the base. One variant of base
generation assumes that the verb is combined with one argument at a time and each 𝜃-role
is assigned in the respective head-argument configuration. The order in which arguments
are combined with the verb is not specified, which means that all of the orders in
(48) can be generated directly without any transformations.25 Fanselow (2001) suggested
such an analysis within the framework of GB.26 Note that such a base-generation analy-
sis is incompatible with an IP approach that assumes that the subject is realized in the
specifier of IP. An IP approach with base-generation of different argument orders would
allow the complements to appear in any order within the VP but the subject would be
first since it is part of a different phrase. So the orders in (51a,b) could be analyzed, but
the ones in (51c–f) could not:
(51) a. dass der Mann der Frau ein Buch gibt
that the.NOM man the.DAT woman a.ACC book gives
b. dass der Mann ein Buch der Frau gibt
that the.NOM man a.ACC book the.DAT woman gives
c. dass der Frau der Mann ein Buch gibt
that the.DAT woman the.NOM man a.ACC book gives
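The difference between the two variants of base generation discussed above can be made explicit by enumerating the orders each of them licenses. The following sketch assumes, for illustration only, that arguments are represented as plain strings and that the two function names stand for the two analyses.

```python
from itertools import permutations

# Arguments of geben 'give' in (51): nominative, dative, accusative.
ARGUMENTS = ("der Mann", "der Frau", "ein Buch")
SUBJECT = "der Mann"


def free_base_generation(arguments):
    """The verb combines with one argument at a time, in any order."""
    return [list(order) for order in permutations(arguments)]


def ip_base_generation(arguments, subject):
    """The subject is in SpecIP; only the complements inside the VP are free."""
    complements = [argument for argument in arguments if argument != subject]
    return [[subject, *order] for order in permutations(complements)]


all_orders = free_base_generation(ARGUMENTS)
subject_first_orders = ip_base_generation(ARGUMENTS, SUBJECT)
```

Without an IP, all six orders of the three arguments are generated directly. With the subject fixed in SpecIP, only the two subject-initial orders corresponding to (51a,b) remain.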
24
The authors are working in the Minimalist framework. This means there is no longer S-structure strictly
speaking. I have simply translated the analysis into the terms used here.
25
Compare this to the grammar in (6) on page 55. This grammar combines a V and an NP to form a new V.
Since nothing is said about the case of the argument in the phrase structure rule, the NPs can be combined
with the verb in any order.
26
The base generation analysis is the natural analysis in the HPSG framework. It was already developed
by Gunji in 1986 for Japanese and will be discussed in more detail in Section 9.4. Sauerland & Elbourne
(2002: 313–314) claim that they show that syntax has to be derivational, that is, a sequence of syntactic
trees has to be derived. I am of the opinion that this cannot generally be shown to be the case. There
is, for example, an analysis by Kiss (2001) which shows that scope phenomena can be explained well by
constraint-based approaches.
3.6 Summary and classification
In the remainder of this section, I will critically discuss two points: the model of language
acquisition of the Principles & Parameters framework and the degree of formalization
inside Chomskyan linguistics (in particular in the last few decades, and the consequences
this has). Some of these points will be mentioned again in Part II.
3.6.2 Formalization
In his 1963 work on Transformational Grammar, Bierwisch writes the following:27
It is very possible that the rules that we formulated generate sentences which are
outside of the set of grammatical sentences in an unpredictable way, that is, they
27
Es ist also sehr wohl möglich, daß mit den formulierten Regeln Sätze erzeugt werden können, die auch
in einer nicht vorausgesehenen Weise aus der Menge der grammatisch richtigen Sätze herausfallen, die
also durch Eigenschaften gegen die Grammatikalität verstoßen, die wir nicht wissentlich aus der Unter-
suchung ausgeschlossen haben. Das ist der Sinn der Feststellung, daß eine Grammatik eine Hypothese
über die Struktur einer Sprache ist. Eine systematische Überprüfung der Implikationen einer für natürliche
Sprachen angemessenen Grammatik ist sicherlich eine mit Hand nicht mehr zu bewältigende Aufgabe. Sie
könnte vorgenommen werden, indem die Grammatik als Rechenprogramm in einem Elektronenrechner
realisiert wird, so daß überprüft werden kann, in welchem Maße das Resultat von der zu beschreibenden
Sprache abweicht.
I think that we are, in fact, beginning to approach a grasp of certain basic princi-
ples of grammar at what may be the appropriate level of abstraction. At the same
time, it is necessary to investigate them and determine their empirical adequacy
by developing quite specific mechanisms. We should, then, try to distinguish as
clearly as we can between discussion that bears on leading ideas and discussion
that bears on the choice of specific realizations of them. (Chomsky 1981a: 2–3)
This departure from rigid formalization has resulted in a large number of publications
inside Mainstream Generative Grammar with sometimes incompatible assumptions,
to the point where it is no longer clear how one can combine the insights of the
various publications. An example of this is the fact that the central notion of government
has several different definitions (see Aoun & Sportiche 1983 for an overview28).
This situation has been criticized repeatedly since the 1980s, sometimes very harshly,
by proponents of GPSG (Gazdar, Klein, Pullum & Sag 1985: 6; Pullum 1985; 1989a; Pullum
1991: 48; Kornai & Pullum 1990).
The lack of precision and working out of the details29 and the frequent modification
of basic assumptions30 have led to insights gained by Mainstream Generative Grammar
rarely being translated into computer implementations. There are some implementations
that are based on Transformational Grammar/GB/MP models or borrow ideas from
Mainstream Generative Grammar (Petrick 1965; Zwicky, Friedman, Hall & Walker 1965;
Kay 1967; Friedman 1969; Friedman, Bredt, Doran, Pollack & Martner 1971; Plath 1973;
Morin 1973; Marcus 1980; Abney & Cole 1986; Kuhns 1986; Correa 1987; Stabler 1987;
1992; 2001; Kolb & Thiersch 1991; Fong 1991; Crocker & Lewin 1992; Lohnstein 1993; Lin
1993; Fordham & Crocker 1994; Nordgård 1994; Veenstra 1998; Fong & Ginsburg 2012),31
but these implementations often do not use transformations or differ greatly from the
theoretical assumptions of the publications. For example, Marcus (1980: 102–104) and
Stabler (1987: 5) use special purpose rules for auxiliary inversion.32 These rules reverse
the order of John and has for the analysis of sentences such as (52a) so that we get the
order in (52b), which is then parsed with the rules for non-inverted structures.
28
A further definition can be found in Aoun & Lightfoot (1984). This is, however, equivalent to an earlier
version as shown by Postal & Pullum (1986: 104–106).
29
See e.g., Kuhns (1986: 550), Crocker & Lewin (1992: 508), Kolb & Thiersch (1991: 262), Kolb (1997: 3) and
Freidin (1997: 580), Veenstra (1998: 25, 47), Lappin et al. (2000a: 888) and Stabler (2011a: 397, 399, 400) for
the latter.
30
See e.g., Kolb (1997: 4), Fanselow (2009) and the quote from Stabler on page 177.
(52) a. Has John scheduled the meeting for Wednesday?
b. John has scheduled the meeting for Wednesday?
These rules for auxiliary inversion are very specific and explicitly reference the category
of the auxiliary. This does not correspond to the analyses proposed in GB in any way.
As we have seen in Section 3.1.5, there are no special transformational rules for auxiliary
inversion. Auxiliary inversion is carried out by the more general transformation Move-𝛼
and the associated restrictive principles. It is not unproblematic that the explicit formu-
lation of the rule refers to the category auxiliary as is clear when one views Stabler’s
GB-inspired phrase structure grammar:
(53) a. s → switch(aux_verb,np), vp.
b. s([First|L0],L,X0,X) :- aux_verb(First),
np(L0,L1,X0,X1),
vp([First|L1],L,X1,X).
The rule in (53a) is translated into the Prolog predicate in (53b). The expression [First|L0]
after the s corresponds to the string that is to be processed. The ‘|’-operator divides
the list into a beginning and a rest. First is the first word to be processed and L0 contains
all other words. In the analysis of (52a), First is has and L0 is John scheduled the meeting
for Wednesday. The Prolog clause first checks whether First is an auxiliary
(aux_verb(First)). If this is the case, an attempt is made to prove that the list L0
begins with a noun phrase. Since John is an NP, this is successful. L1 is the sublist of
L0 which remains after the NP has been analyzed, that is, scheduled the meeting for Wednesday.
This list is then combined with the auxiliary (First), and it is checked whether
the resulting list has scheduled the meeting for Wednesday begins with a VP. This is the
case and the remaining list L is empty. As a result, the sentence has been successfully
processed.
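The control flow of (53b) can be illustrated with a rough Python transliteration. This is only a sketch and not Stabler's code: the toy lexicon and the helpers parse_np and parse_vp are assumptions made for illustration, the additional arguments X0, X1 and X of the Prolog clause are ignored, and the sketch returns a single result instead of backtracking as Prolog would.

```python
# Toy lexicon; all entries are illustrative assumptions.
AUX = {"has", "could", "should", "will"}
NAMES = {"John", "we"}
PARTICIPLES = {"scheduled"}

def parse_np(words):
    """Consume a one-word NP; return the remaining words or None."""
    if words and words[0] in NAMES:
        return words[1:]
    return None

def parse_vp(words):
    """Consume auxiliary + participle + rest; return the remaining words or None.

    The material after the participle is swallowed wholesale to keep the sketch short.
    """
    if len(words) >= 2 and words[0] in AUX and words[1] in PARTICIPLES:
        return []
    return None

def parse_s_inverted(words):
    """Counterpart of (53b): aux_verb(First), np(L0, L1), vp([First|L1], L)."""
    if not words:
        return None
    first, l0 = words[0], words[1:]
    if first not in AUX:            # aux_verb(First): exactly one word is looked up
        return None
    l1 = parse_np(l0)               # np(L0, L1)
    if l1 is None:
        return None
    return parse_vp([first] + l1)   # vp([First|L1], L): auxiliary is put back in front

sentence = "has John scheduled the meeting for Wednesday".split()
# An empty list means that the whole string was consumed.
print(parse_s_inverted(sentence))
```

The step `[first] + l1` corresponds to the reordering described above: the auxiliary is placed in front of the words that remain after the NP, and the result is parsed as an ordinary non-inverted VP.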
The problem with this analysis is that exactly one word is checked in the lexicon.
Sentences such as (54) cannot be analyzed:33
(54) Could or should we pool our capital with that of other co-ops to address the needs
of a regional “neighborhood”?34
31
See Fordham & Crocker (1994) for a combination of a GB approach with statistical methods.
32
Nozohoor-Farshi (1986; 1987) has shown that Marcus’ parser can only parse context-free languages. Since
natural languages are of a greater complexity (see Chapter 17) and grammars of corresponding complexity
are allowed by current versions of Transformational Grammar, Marcus’ parser can be neither an adequate
implementation of the Chomskyan theory in question nor a piece of software for analyzing natural language
in general.
33
For a discussion that shows that the coordination of lexical elements has to be an option in linguistic
theories, see Abeillé (2006).
In this kind of sentence, two modal verbs have been coordinated. They then form an X0
and – following GB analyses – can be moved together. If one wanted to treat these cases
as Stabler does for the simplest case, then we would need to divide the list of words
to be processed into two unlimited sub-lists and check whether the first list contains
an auxiliary or several coordinated auxiliaries. We would require a recursive predicate
aux_verbs which somehow checks whether the sequence could or should is a well-formed
sequence of auxiliaries. This should not be done by a special predicate but rather by
syntactic rules responsible for the coordination of auxiliaries. The alternative to a rule
such as (53a) would be the one in (55), which is the one that is used in theories like GPSG
(Gazdar et al. 1985: 62), LFG (Falk 1984: 491), some HPSG analyses (Ginzburg & Sag 2000:
36), and Construction Grammar (Fillmore 1999):
(55) s → v(aux+), np, vp.
This rule would have no problems with coordination data like (54) as coordination of
multiple auxiliaries would produce an object with the category v(aux+) (for more on
coordination see Section 21.6.2). If inversion makes it necessary to stipulate a special
rule like (53a), then it is not clear why one could not simply use the transformation-less
rule in (55).
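How a rule like (55) copes with coordinated auxiliaries can be sketched in Python as follows. This is an illustration under simplifying assumptions, not an implementation from the literature: the lexicon, the right-recursive coordination rule in parse_v_aux and the toy parse_np and parse_vp are all hypothetical. The point is merely that the sentence rule asks for a v(aux+) and leaves it to a general rule to decide whether one word or a coordination fills that slot.

```python
# Toy lexicon; all entries are illustrative assumptions.
AUX = {"could", "should", "has", "will"}
CONJ = {"and", "or"}
NAMES = {"we", "John"}
BARE_VERBS = {"pool"}

def parse_v_aux(words):
    """v(aux+): a single auxiliary or a coordination of auxiliaries.

    Returns the remaining words or None.
    """
    if not words or words[0] not in AUX:
        return None
    rest = words[1:]
    if rest and rest[0] in CONJ:
        coordinated = parse_v_aux(rest[1:])   # aux conj aux (conj aux ...)
        if coordinated is not None:
            return coordinated
    return rest

def parse_np(words):
    """Consume a one-word NP; return the remaining words or None."""
    return words[1:] if words and words[0] in NAMES else None

def parse_vp(words):
    """Toy VP: a bare verb followed by arbitrary material."""
    return [] if words and words[0] in BARE_VERBS else None

def parse_s(words):
    """Counterpart of rule (55): s -> v(aux+), np, vp."""
    rest = parse_v_aux(words)
    if rest is None:
        return None
    rest = parse_np(rest)
    if rest is None:
        return None
    return parse_vp(rest)

print(parse_s("could or should we pool our capital".split()))
print(parse_s("could we pool our capital".split()))
```

Both calls consume the complete input. No word has to be moved back into the string, and the sentence rule itself never inspects the lexicon for exactly one word.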
In the MITRE system (Zwicky et al. 1965), there was a special grammar for the surface
structure, from which the deep structure was derived via reverse application of trans-
formations, that is, instead of using one grammar to create deep structures which are
then transformed into other structures, one required two grammars. The deep structures
that were determined by the parser were used as input to a transformational component
since this was the only way to ensure that the surface structures can actually be derived
from the base structure (Kay 2011: 10).
The REQUEST system by Plath (1973) also used a surface grammar and inverse trans-
formations to arrive at the deep structure, which was used for semantic interpretation.
There are other implementations discussed in this chapter that differ from transfor-
mation-based analyses. For example, Kolb & Thiersch (1991: 265, Section 4) arrive at
the conclusion that a declarative, constraint-based approach to GB is more appropriate
than a derivational one. Johnson (1989) suggests a Parsing as Deduction approach which
reformulates sub-theories of GB (X̄ theory, Theta-Theory, Case Theory, …) as logical
expressions.35 These can be used independently of each other in a logical proof. In
Johnson’s analysis, GB theory is understood as a constraint-based system. More general
restrictions are extracted from the restrictions on S- and D-structure which can then be
used directly for parsing. This means that transformations are not directly carried out
by the parser. As noted by Johnson, the language fragment he models is very small. It
contains no description of wh-movement, for example (p. 114).
34
http://www.cooperativegrocer.coop/articles/index.php?id=595. 2010-03-28.
35
See Crocker & Lewin (1992: 511) and Fordham & Crocker (1994: 38) for another constraint-based Parsing-
as-Deduction approach.
Lin (1993) implemented the parser PrinciParse. It is written in C++ and based on GB
and Barriers – the theoretical stage after GB (see Chomsky 1986a). The system contains
constraints like the Case Filter, the Theta-Criterion, Subjacency, the Empty Category
Principle and so on. The Theta-Criterion is implemented with binary features +/-theta;
there is no implementation of Logical Form (p. 119). The system organizes the grammar
in a network that makes use of the object-oriented organization of C++ programs, that
is, default inheritance is used to represent constraints in super- and subclasses (Lin 1993:
Section 5). This concept of inheritance is alien to GB theory: it does not play any role in
the main publications. The grammar networks license structures corresponding to X̄ theory,
but they code the possible relations directly in the network. The network contains
categories like IP, Ibar, I, CP, Cbar, C, VP, Vbar, V, PP, PSpec, Pbar, P and so on. This
corresponds to simple phrase structure grammars that fully specify the categories in the
rules (see Section 2.2) rather than working with abstract schemata like the ones assumed
in X̄ theory (see Section 2.5). Furthermore, Lin does not assume transformations but uses
a GPSG-like feature passing approach to nonlocal dependencies (p. 116, see Section 5.4
on the GPSG approach).
Probably the most detailed implementation in the tradition of GB and Barriers is Stabler’s
Prolog implementation (1992). Stabler’s achievement is certainly impressive, but
his book confirms what has been claimed thus far: Stabler has to simply stipulate many
things which are not explicitly mentioned in Barriers (e.g., using feature-value pairs
when formalizing X̄ theory, a practice that was borrowed from GPSG) and some assumptions
cannot be properly formalized and are simply ignored (see Briscoe 1997 for
details).
GB analyses which fulfill certain requirements can be reformulated so that they no
longer make use of transformations. These transformation-less approaches are also
called representational, whereas the transformation-based approaches are referred to as
derivational. For representational analyses, there are only surface structures augmented
by traces but none of these structures is connected to an underlying structure by means
of transformations (see e.g., Koster 1978; 1987: 235; Kolb & Thiersch 1991; Haider 1993:
Section 1.4; Frey 1993: 14; Lohnstein 1993: 87–88, 177–178; Fordham & Crocker 1994:
38; Veenstra 1998: 58). These analyses can be implemented in the same way as corre-
sponding HPSG analyses (see Chapter 9) as computer-processable fragments and this
has in fact been carried out for example for the analysis of verb position in German.36
However, such implemented analyses differ from GB analyses with regard to their ba-
sic architecture and in small, but important details such as how one deals with the in-
teraction of long-distance dependencies and coordination (Gazdar 1981b). For a critical
discussion and classification of movement analyses in Transformational Grammar, see
Borsley (2012).
Following this somewhat critical overview, I want to add a comment in order to avoid
being misunderstood: I do not demand that all linguistic work be completely formalized.
There is simply no space for this in a, say, thirty-page essay. Furthermore, I do
not believe that all linguists should carry out formal work and implement their analyses
as computational models. However, there has to be somebody who works out the formal
details and these basic theoretical assumptions should be accepted and adopted for a
sufficient amount of time by the research community in question.
36
This shows that ten Hacken’s contrasting of HPSG with GB and LFG (ten Hacken 2007: Section 4.3) and the
classification of these frameworks as belonging to different research paradigms is completely mistaken. In
his classification, ten Hacken refers mainly to the model-theoretic approach that HPSG assumes. However,
LFG also has a model-theoretic formalization (Kaplan 1989). Furthermore, there is also a model-theoretic
variant of GB (Rogers 1998). For further discussion, see Chapter 14.
Comprehension questions
Exercises
For the passive sentences, use the analysis where the subject noun phrase is
moved from the object position, that is, the analysis without an empty exple-
tive as the subject.
Further reading
For Sections 3.1–3.5, I used material from Peter Gallmann from 2003 (Gallmann
2003). This has been modified, however, at various points. I am solely responsible
for any mistakes or inadequacies. For current materials by Peter Gallmann, see
http://www.syntax-theorie.de.
In the book Syntaktische Analyseperspektiven, Lohnstein (2014) presents a vari-
ant of GB which more or less corresponds to what is discussed in this chapter
(CP/IP, movement-based analysis of the passive). The chapters in said book have
been written by proponents of various theories and all analyze the same newspa-
per article. This book is extremely interesting for all those who wish to compare
the various theories out there.
Haegeman (1994) is a comprehensive introduction to GB. Those who read
German may consider the textbooks by Fanselow & Felix (1987), von Stechow
& Sternefeld (1988) and Grewendorf (1988) since they also address the
phenomena that are covered in this book.
In many of his publications, Chomsky discusses alternative, transformation-less
approaches as “notational variants”. This is not appropriate, as analyses
without transformations can make different predictions from transformation-based
approaches (e.g., with respect to coordination and extraction; see Section 5.5 for
a discussion of GPSG in this respect). In Gazdar (1981a), one can find a comparison
of GB and GPSG as well as a discussion of the classification of GPSG as a
notational variant of Transformational Grammar with contributions from Noam
Chomsky, Gerald Gazdar and Henry Thompson.
Borsley (1999b) and Kim & Sells (2008) have parallel textbooks for GB and
HPSG in English. For the comparison of Transformational Grammar and LFG,
see Bresnan & Kaplan (1982). Kuhn (2007) offers a comparison of modern deriva-
tional analyses with constraint-based LFG and HPSG approaches. Borsley (2012)
contrasts analyses of long-distance dependencies in HPSG with movement-based
analyses as in GB/Minimalism. Borsley discusses four types of data which are
problematic for movement-based approaches: extraction without fillers, extrac-
tion with multiple gaps (see also the discussion of (57) on p. 171 and of (55) on
p. 201 of this book), extractions where fillers and gaps do not match and extrac-
tion without gaps.
4 Transformational Grammar – Minimalism
Like the Government & Binding framework that was introduced in the previous chapter,
the Minimalist framework was initiated by Noam Chomsky at MIT in Cambridge, Massachusetts.
Chomsky (1993; 1995b) argued that the problem of language evolution should be taken
seriously and that the question of how linguistic knowledge could become part of our
genetic endowment should be answered. To that end he suggested refocusing the theo-
retical developments towards models that have to make minimal assumptions regarding
the machinery that is needed for linguistic analyses and hence towards models that as-
sume less language specific innate knowledge.
Like GB, Minimalism is widespread: theoreticians all over the world are working in
this framework, so the following list of researchers and institutions is necessarily incom-
plete. Linguistic Inquiry and Syntax are journals that almost exclusively publish Mini-
malist work and the reader is referred to these journals to get an idea about who is active
in this framework. The most prominent researchers in Germany are Artemis Alexiadou,
Humboldt University Berlin; Günther Grewendorf (2002), Frankfurt am Main; Joseph
Bayer, Konstanz; and Gereon Müller, Leipzig.
While innovations like X̄ theory and the analysis of clause structure in GB are highly
influential and can be found in most of the other theories that are discussed in this book,
this is less so for the technical work done in the Minimalist framework. It is nevertheless
useful to familiarize oneself with the technicalities since Minimalism is a framework in which
a lot of work is done and understanding the basic machinery makes it possible to read
empirically interesting work in that framework.
While the GB literature of the 1980s and 1990s shared a lot of assumptions, there was
an explosion of various approaches in the Minimalist framework that is difficult to keep
track of. The presentation that follows is based on David Adger’s textbook (Adger 2003).
Strong features make syntactic objects move to higher positions. The reader is familiar
with this feature-driven movement already since it was a component of the movement-
based analysis of the passive in Section 3.4. In the GB analysis of passive, the object had
to move to the specifier position of IP in order to receive case. Such movements that are
due to missing feature values are a key component in Minimalist proposals.
4.1 General remarks on the representational format
Figure 4.1: Architecture assumed in Minimalist theories before the Phase model (left) and
in the Phase model (right) according to Richards (2015: 812, 830). Labels in the figure
include lexicon, numeration, Transfer, PHON and SEM.
syntax is called covert syntax. Like in GB’s LF, the covert syntax can be used to derive
certain scope readings.
This architecture was later modified to allow Spell-Out at several points in the deriva-
tion (right figure). It is now assumed that there are phases in a derivation and that a
completed phase is spelled out once it is used in a combination with a head (Chomsky
2008). For instance, a subordinated sentence like that Peter comes in (2) is one phase and
is sent to the interfaces before the whole sentence is completed.3
There are different proposals as to what categories form complete phases. Since the
concept of phases is not important for the following introduction, I will not consider it
further. See Section 15.1 on the psycholinguistic plausibility of phases in
particular and the Minimalist architecture in general.
3
Andreas Pankau (p. c. 2015) pointed out to me that there is a fundamental problem with such a conception
of phases: if only elements that are in a relation to a head are sent off to the interfaces,
then the topmost phrase in a derivation would never be sent to the interfaces, since it does not depend on
any head.
If this structure were used in a larger structure that is spelled out, the derivation
would crash since the conceptual system could not make sense of the D feature that is
still present at the P node.
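The idea that an unchecked selectional feature makes a derivation crash can be sketched as follows. This is a strongly simplified illustration and not a formalization found in Adger (2003): the class SyntacticObject and the functions merge and converges are hypothetical names, features are plain strings such as "uD", and only category selection is modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SyntacticObject:
    label: str                          # category of the projecting head, e.g. "P", "D", "N"
    pending: frozenset = frozenset()    # unchecked selectional features, e.g. {"uD"}

def merge(head, sister):
    """Combine head and sister; the head projects.

    A selectional feature uX on the head is checked if the sister has category X.
    Features of the sister that are still unchecked remain in the structure.
    """
    checked = {"u" + sister.label}
    return SyntacticObject(head.label, (head.pending - checked) | sister.pending)

def converges(obj):
    """Assumption: a structure can be spelled out only if no unchecked feature is left."""
    return not obj.pending

to = SyntacticObject("P", frozenset({"uD"}))
him = SyntacticObject("D")
letters = SyntacticObject("N", frozenset({"uP"}))

good = merge(letters, merge(to, him))   # letters to him: uD and uP are both checked
bad = merge(letters, to)                # the uD feature of P is never checked

print(converges(good), converges(bad))
```

In the second structure the D feature of the preposition survives inside the larger object, which corresponds to the crash described in the text.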
Selectional features are atomic, that is, the preposition cannot select a DP[acc] as in
GB and the other theories in this book unless DP[acc] is assumed to be atomic. Therefore,
an additional mechanism is assumed that can check other features in addition to
selectional features. This mechanism is called Agree.
(5) a. * letters to he
b. letters to him
The analysis of (5b) is shown in Figure 4.4. There is an interesting difference between the
checking of selectional features and the checking of features via Agree. The features that
are checked via Agree do not have to be at the top node of the object that is combined
with a head. This will play a role later in the analysis of the passive and local reordering.
Figure (node labels): XP, specifier, complement, X
proper names: a lot of unary branching structure had to be assumed (see the left picture in
Figure 2.9 on page 76). This is not necessary any longer in current Minimalist theories.4
4.1.4 Little v
In Section 3.4, I used X̄ structures in which a ditransitive verb was combined with its
accusative object to form a V̄, which was then combined with the dative object to form
a further V̄. Such binary branching structures and also flat structures in which both
objects are combined with the verb to form a V̄ are rejected by many practitioners of
GB and Minimalism since the branching does not correspond to branchings that would
be desired for phenomena like the binding of reflexives and negative polarity items. A
binding in which Benjamin binds himself in (6a) is impossible:
(6) a. * Emily showed himself Benjamin in the mirror.
b. Peter showed himself Benjamin in the mirror.
What is required for the analysis of Binding and NPI phenomena in theories that analyze
these phenomena in terms of tree configurations is that the reflexive pronoun is “higher”
in the tree than the proper name Benjamin. More precisely, the reflexive pronoun himself
has to c-command Benjamin. c-command is defined as follows (Adger 2003: 117):5
(7) A node A c-commands B if, and only if A’s sister either:
a. is B, or
b. contains B
In the trees to the left and in the middle of Figure 4.6 the c-command
relations are not as desired: in the left-most tree both DPs c-command each other and in
the middle one Benjamin c-commands himself rather than the other way round. Hence
it is assumed that the structures at the left and in the middle are inappropriate and that
there is some additional structure involving the category v, which is called little v (Adger
2003: Section 4.4). The sister of himself is V̄ and V̄ contains Benjamin, hence himself
c-commands Benjamin. Since the sister of Benjamin is V and V neither is nor contains
himself, Benjamin does not c-command himself.
4
For problems with this approach see Brosziewski (2003: Section 2.1).
5
c-command also plays a prominent role in GB. In fact, one part of Government & Binding is the Binding
Theory, which was not discussed in the previous chapter since binding phenomena do not play a role in
this book.
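The definition in (7) and the three configurations just discussed can be checked mechanically. The following Python sketch is an illustration only: the class Node and the three tree-building functions are hypothetical, the tree shapes are reconstructed from the description in the text rather than copied from the figure, and for the flat structure the definition is generalized from "A's sister" to "one of A's sisters".

```python
class Node:
    """A tree node that knows its parent."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    def sisters(self):
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

    def contains(self, other):
        """True if other is properly dominated by this node."""
        return any(c is other or c.contains(other) for c in self.children)

def c_commands(a, b):
    """(7): A c-commands B iff a sister of A is B or contains B."""
    return any(s is b or s.contains(b) for s in a.sisters())

def flat():
    # Flat structure: verb and both objects are sisters.
    h, b = Node("himself"), Node("Benjamin")
    Node("V'", Node("show"), h, b)
    return h, b

def binary():
    # Binary branching without little v: [[show himself] Benjamin]
    h, b = Node("himself"), Node("Benjamin")
    Node("V'", Node("V'", Node("show"), h), b)
    return h, b

def little_v():
    # [vP Peter [v' v [VP himself [V' show Benjamin]]]]
    h, b = Node("himself"), Node("Benjamin")
    Node("vP", Node("Peter"),
         Node("v'", Node("v"), Node("VP", h, Node("V'", Node("show"), b))))
    return h, b

for build in (flat, binary, little_v):
    h, b = build()
    print(build.__name__, c_commands(h, b), c_commands(b, h))
```

Only the structure with little v yields the asymmetric relation that is needed: himself c-commands Benjamin, but not the other way round.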
The analysis of ditransitives involving an additional verbal head goes back to Larson
(1988). Hale & Keyser (1993: 70) assume that this verbal head contributes a causative
semantics. The structure in Figure 4.7 is derived by assuming that the verb show starts
out in the V position and then moves to the v position. show is assumed to mean see and
in the position of little v it picks up the causative meaning, which results in a cause-see′
meaning (Adger 2003: 133).
Figure 4.7 (node labels): vP, Peter, v + show, VP, himself, V
While the verb shell analysis with an empty verbal head was originally invented by
Larson (1988) for the analysis of ditransitive verbs, it is now also used for the analysis of
strictly transitive and even intransitive verbs.
Adger (2003: Section 4.5) argues that semantic roles are assigned uniformly in certain
tree configurations:
Adger assumes that such uniformly assigned semantic roles help in the process of language
acquisition and from this, it follows that little v should also play a role in the analysis
of examples with strictly transitive and intransitive verbs. Figures 4.8 and 4.9
show the analysis of sentences containing the verbs burn and laugh, respectively.6
Figure 4.8 (node labels): vP, Agent, v [uD], v, VP
Figure 4.9 (node labels): vP, Agent, v [uD], v, laugh [V]
Adger (2003: 164) assumes that intransitive and transitive verbs move from V to little
v as well. This will be reflected in the following figures.
ideas of the CP/IP analysis have been transferred to the Minimalist analysis of English.
This subsection will first discuss special features that are assumed to trigger movement
(Subsection 4.1.5.1) and then case assignment (Subsection 4.1.5.2).
Figure 4.10: Analysis of Anna will read the book. involving a modal and movement of the
subject from v to T
The Determiner Phrase (DP) the book is the object of read and checks the D feature of
read. Little v selects for the subject Anna. Since T has a strong D feature (marked by an
asterisk ‘*’), Anna must not remain inside of the vP but moves on to the specifier position
of TP.
Full sentences are CPs. For the analysis of (9), an empty C head is assumed that is
combined with the TP. The empty C contributes a clause type feature Decl. The full
analysis of (9) is shown in Figure 4.11.
Figure 4.11: Analysis of Anna will read the book. as CP with an empty C with the clause-type
feature Decl
The analysis of the question in (10) involves an unvalued clause-type feature on T for
the sentence type question.
(10) What will Anna read?
The empty complementizer C has a Q feature that can value the clause-type feature on
T. Since clause-type features on T that have the value Q are stipulated to be strong, the
T element has to move to C to check the feature locally. In addition, the wh element is
moved. This movement is enforced by a strong wh feature on C. The analysis of (10) is
given in Figure 4.12 on the next page.
Figure 4.12: Analysis of What will Anna read? with an empty C with a strong wh feature
Figure 4.13: Case assignment by T and v in the TP for Anna reads the book.
4.1.6 Adjuncts
Adger (2003: Section 4.2.3) assumes that adjuncts attach to XP and form a new XP. He
calls this operation Adjoin. Since this operation does not consume any features it is dif-
ferent from External Merge and hence a new operation would be introduced into the
theory, contradicting Chomsky’s claim that human languages use only Merge as a struc-
ture building operation. There are proposals to treat adjuncts as elements in special
adverbial phrases with empty heads (see Section 4.6.1) that are also assumed to be part
of a hierarchy of functional projections. Personally, I prefer Adger’s solution that corre-
sponds to what is done in many other frameworks: there is a special rule or operation
for the combination of adjuncts and heads (see for instance Section 9.1.7 on the HPSG
schema for head adjunct combinations).
4.3 Long-distance dependencies
Figure 4.14: Analysis of Kennt jeder diesen Mann? ‘Does everybody know this man?’
following the analysis of Adger (2003)
Figure 4.15: Analysis of Diesen Mann kennt jeder. ‘This man, everybody knows.’ following
the analysis of Adger (2003: 331)
As in the verb-initial clause in Figure 4.14 a feature on C triggers verb movement. This
time it is a Decl feature since we are dealing with a declarative clause. The top feature
triggers movement of diesen Mann ‘this man’ to the specifier position of C.
4.4 Passive
Adger (2003) suggests an analysis for the passive in English, which I have adapted here to
German. As in the GB analysis that was discussed in Section 3.4, it is assumed that
the verb does not assign accusative to the object of schlagen ‘to beat’. In Minimalist
terms, this means that little v does not have an acc feature that has to be checked. This
special version of little v is assumed to play a role in the analysis of sentences of so-called
unaccusative verbs (Perlmutter 1978). Unaccusative verbs are a subclass of intransitive
verbs that have many interesting properties. For instance, they can be used as adjectival
participles although this is usually not possible with intransitive verbs:
Figure 4.16: Structure of vP with unaccusative verbs like fall, collapse, wilt according to
Adger (2003: 140)
passive. Unaccusative verbs are similar to passivized verbs in that they do have a subject
that somehow also has object properties. The special version of little v is selected by
the Passive head werden ‘be’, which forms a Passive Phrase (abbreviated as PassP). See
Figure 4.17 for the analysis of the example in (19):
(19) dass er geschlagen wurde
that he beaten was
‘that he was beaten’
Figure 4.17: Minimalist analysis of the passive without movement but with nonlocal case
assignment via Agree
The Pass head requires the Infl feature of little v to have the value Pass, which results
in participle morphology at spellout. Hence the form that is used is geschlagen ‘beaten’.
The auxiliary moves to T to check the strong Infl feature at T and since the Infl feature
is past, the past form of werden ‘be’, namely wurde ‘was’, is used at spellout. T has a nom
feature that has to be checked. Interestingly, the Minimalist approach does not require
the object of schlagen to move to the specifier position of T in order to assign case,
since case assignment is done via Agree. Hence in principle, the pronominal argument
of schlagen could stay in its object position and nevertheless get nominative from T.
This would solve the problem of the GB analysis that was pointed out by Lenerz (1977:
Section 4.4.3). See page 112 for Lenerz’ examples and discussion of the problem. However,
Adger (2003: 332) assumes that German has a strong EPP feature on T. If this assumption
is upheld, all problems of the GB account will carry over to the Minimalist analysis: all
objects have to move to T even when there is no reordering taking place. Furthermore,
impersonal passives of the kind in (20) would be problematic, since there is no noun
phrase that could be moved to T in order to check the EPP feature:
(20) weil getanzt wurde
because danced was
‘because there was dancing there’
4.6 New developments and theoretical variants
to move to. G. Müller (2014a: Section 3.5) offers a leaner solution, though. In his ap-
proach, the object simply moves to a second specifier position of little v. The analysis is
depicted in Figure 4.18.7
Figure 4.18: Analysis of dass diesen Mann jeder kennt ‘that everybody knows this man’
as movement of the object to a specifier position of v
essary. In the Minimalist Program, Chomsky gives the central motivations for the far-reaching
revisions of GB theory (Chomsky 1993; 1995b). Until the beginning of the 1990s, it
was assumed that Case Theory, the Theta-Criterion, X̄ theory, Subjacency, Binding Theory,
Control Theory etc. all belonged to the innate faculty for language (Richards 2015:
804). This, of course, raises the question of how this very specific linguistic knowledge
made its way into our genome. The Minimalist Program follows up on this point and attempts
to explain properties of language through more general cognitive principles and
to reduce the amount of innate language-specific knowledge postulated. The distinction
between deep structure and surface structure, for example, was abandoned. Move still
exists as an operation, but can be used directly to build sub-structures rather than after
a complete D-structure has been created. Languages differ with regard to whether this
movement is visible or not.
Although Chomsky’s Minimalist Program should be viewed as a successor to GB, ad-
vocates of Minimalism often emphasize the fact that Minimalism is not a theory as such,
but rather a research program (Chomsky 2007: 4; 2013: 6). The actual analyses sug-
gested by Chomsky (1995b) when introducing the research program have been reviewed
by theoreticians and have sometimes come in for serious criticism (Kolb 1997; Johnson
& Lappin 1997; 1999; Lappin, Levine & Johnson 2000a,b; 2001; Seuren 2004; Pinker &
Jackendoff 2005). However, one should say that some criticisms overshoot the mark.
There are various strains of Minimalism. In the following sections, I will discuss some
of the central ideas and explain which aspects are regarded as problematic.
and therefore has to be moved to a position where it can receive case. This kind of ap-
proach is also used in newer analyses for a range of other phenomena. For example, it
is assumed that there are phrases whose heads have the categories focus and topic. The
corresponding functional heads are always empty in languages like German and English.
Nevertheless, the assumption of these heads is motivated by the fact that other languages
possess markers which signal the topic or focus of a sentence morphologically. This ar-
gumentation is only possible if one also assumes that the inventory of categories is the
same for all languages. Then, the existence of a category in one language would suggest
the existence of the same category in all other languages. This assumption of a shared
universal component (Universal Grammar, UG) with detailed language-specific knowl-
edge is, however, controversial and is shared by few linguists outside of the Chomskyan
tradition. Even for those working in Chomskyan linguistics, there have been questions
raised about whether it is permissible to argue in this way since if it is only the ability to
create recursive structures that is responsible for the human-specific ability to use lan-
guage (faculty of language in the narrow sense) – as Hauser, Chomsky & Fitch (2002)
assume –, then the individual syntactic categories are not part of UG and data from other
languages cannot be used to motivate the assumption of invisible categories in another
language.
8
The assumption of such heads is not necessary since features can be “bundled” and then they can be
checked together. For an approach in this vein, which is in essence similar to what theories such as HPSG
assume, see Sternefeld (2006: Section II.3.3.4, Section II.4.2).
In so-called cartographic approaches, it is assumed that every morphosyntactic feature corresponds
to an independent syntactic head (Cinque & Rizzi 2010: 54, 61). For an explicitly formalized proposal in
which exactly one feature is consumed during a combination operation see Stabler (2001: 335). Stabler’s
Minimalist Grammars are discussed in more detail in Section 4.6.4.
9
There are differing opinions as to whether functional projections are optional or not. Some authors assume
that the complete hierarchy of functional projections is always present but functional heads can remain
empty (e.g., Cinque 1999: 106 and Cinque & Rizzi 2010: 55).
10
See Chomsky (1995b: Section 4.10.1), however.
[ForceP [Force′ Force⁰ [TopP* [Top′ Top⁰ [FocP [Foc′ Foc⁰ [TopP* [Top′ Top⁰ [FinP [Fin′ Fin⁰ IP]]]]]]]]]]
78), von Stechow (1996: 103) and Meinunger (2000: 100–101, 124) differentiate between
two agreement positions for direct and indirect objects (AgrO, AgrIO). As well as AgrS,
AgrO and Neg, Beghelli & Stowell (1997) assume the functional heads Share and Dist in
order to explain scope phenomena in English as feature-driven movements at LF. For a
treatment of scope phenomena without empty elements or movement, see Section 19.3.
Błaszczak & Gärtner (2005: 13) assume the categories −PolP, +PolP and %PolP for their
discussion of polarity.
Webelhuth (1995: 76) gives an overview of the functional projections that had been
proposed up to 1995 and offers references for AgrA, AgrN, AgrV, Aux, Clitic Voices,
Gender, Honorific, 𝜇, Number, Person, Predicate, Tense, Z.
In addition to AdvP, NegP, AgrP, FinP, TopP and ForceP, Wiklund, Hrafnbjargarson,
Bentzen & Hróarsdóttir (2007) postulate an OuterTopP. Poletto (2000: 31) suggests both
a HearerP and a SpeakerP for the position of clitics in Italian. Bosse & Bruening (2011:
75) assume a BenefactiveP.
Cinque (1999: 106) adopts the 32 functional heads in Table 4.1 in his work. He assumes
that all sentences contain a structure with all these functional heads. The specifier po-
sitions of these heads can be occupied by adverbs or remain empty. Cinque claims that
these functional heads and the corresponding structures form part of Universal Gram-
mar, that is, knowledge of these structures is innate (page 107).11 Laenzlinger (2004) fol-
lows Cinque in proposing this sequence of functional heads for German. He also follows
Kayne (1994), who assumes that all syntactic structures have the order specifier head
complement cross-linguistically, even if the surface order of the constituents seems to
contradict this.
The constituent orders that are visible in the end are derived by leftward-movement.12
Figure 4.20 on page 149 shows the analysis of a verb-final clause where the functional
11
Table 4.1 shows only the functional heads in the clausal domain. Cinque (1994: 96, 99) also accounts for the
order of adjectives with a cascade of projections: Quality, Size, Shape, Color, Nationality. These categories
and their ordering are also assumed to belong to UG (p. 100).
Cinque (1994: 96) claims that a maximum of seven attributive adjectives are possible and explains this
with the fact that there are a limited number of functional projections in the nominal domain. As was
shown on page 65, with a fitting context it is possible to use several adjectives of the same kind, which is
why some of Cinque’s functional projections would have to be subject to iteration.
12
This also holds for extraposition, that is, the movement of constituents into the postfield in German.
Whereas this would normally be analyzed as rightward-movement, Kayne (1994: Chapter 9) analyzes it as
movement of everything else to the left. Kayne assumes that (i.b) is derived from (i.a) by moving part of
the DP:
(i) a. just walked into the room [DP someone who we don’t know].
b. Someone𝑖 just walked into the room [DP _𝑖 who we don’t know].
(i.a) must be some kind of derived intermediate representation, otherwise English would not be
SV(O) underlyingly but rather V(O)S. (i.a) is therefore derived from (ii) by fronting the VP just walked into
the room.
(ii) Someone who we don’t know just walked into the room
Such analyses have the downside that they cannot be easily combined with performance models (see Chap-
ter 15).
adverbial heads have been omitted.13 Subjects and objects are generated as arguments
inside of vP and VP, respectively. The subject is moved to the specifier of the subject
phrase and the object is moved to the specifier of the object phrase. The verbal pro-
jection (VP𝑘 ) is moved in front of the auxiliary into the specifier position of the phrase
containing the auxiliary. The only function of SubjP and ObjP is to provide a landing site
for the respective movements. For a sentence in which the object precedes the subject,
Laenzlinger assumes that the object moves to the specifier of a topic phrase. Figure 4.20
contains only a ModP and an AspP, although Laenzlinger assumes that all the heads
proposed by Cinque are present in the structure of all German clauses. For ditransitive
verbs, Laenzlinger assumes multiple object phrases (page 230). A similar analysis with
movement of object and subject from verb-initial VPs to Agr positions was suggested by
Zwart (1994) for Dutch.
For general criticism of Kayne’s model, see Haider (2000). Haider shows that a Kayne-
like theory makes incorrect predictions for German (for instance regarding the position
of selected adverbials and secondary predicates and regarding verbal complex formation)
and therefore fails to live up to its billing as a theory which can explain all languages.
Haider (1997a: Section 4) has shown that the assumption of an empty Neg head, as as-
sumed by Pollock (1989), Haegeman (1995) and others, leads to problems. See Bobaljik
(1999) for problems with the argumentation for Cinque’s cascade of adverb-projections.
Furthermore, it has to be pointed out that SubjP and ObjP, TraP (Transitive Phrase) and
IntraP (Intransitive Phrase) (Karimi-Doostan 2005: 1745) and TopP (topic phrase), DistP
(quantifier phrase), AspP (aspect phrase) (Kiss 2003: 22; Karimi 2005: 35), PathP and Pla-
ceP (Svenonius 2004: 246) encode information about grammatical function, valence, in-
formation structure and semantics in the category symbols.14 In a sense, this is a misuse
of category symbols, but such a misuse of information structural and semantic categories
is necessary since syntax, semantics, and information structure are tightly connected
and since it is assumed that the semantics interprets the syntax, that is, it is assumed
that semantics comes after syntax (see Figure 3.2 and Figure 4.1). By using semantically
and pragmatically relevant categories in syntax, there is no longer a clean distinction be-
tween the levels of morphology, syntax, semantics and pragmatics: everything has been
‘syntactified’. Rizzi (2014) himself talks about syntactification. He points out that there
are fundamental problems with the T-model and its current variants in Minimalism and
concludes that a syntactification in terms of Rizzi-style functional heads or a prolifera-
tion of T heads with respective features (see also Borsley 2006; Borsley & Müller 2020)
is the only way to save this architecture. Felix Bildhauer (p. c. 2012) has pointed out to
me that approaches which assume a cascade of functional projections where the indi-
vidual aspects of meaning are represented by nodes are actually very close to phrasal
approaches in Construction Grammar (see also Adger 2013: 470 for a similar view). One
13
These structures do not correspond to X̄ theory as it was presented in Section 2.5. In some cases, heads
have been combined with complements to form an XP rather than an X′. For more on X̄ theory in the
Minimalist Program, see Section 4.6.3.
14
For further examples and references, see Newmeyer (2004a: 194; 2005: 82). Newmeyer also references
works which stipulate a projection for each semantic role, e.g., Agent, Reciprocal, Benefactive, Instrumen-
tal, Causative, Comitative, and Reversive Phrase.
[Figure 4.20: tree for a verb-final clause with the projections CP, TopP, SubjP, ModP, ObjP, NegP, AspP, MannP, AuxP, vP and VP; the adverb wahrscheinlich ‘probably’ is among the terminals.]
simply lists configurations and these are assigned a meaning (or features which are in-
terpreted post-syntactically, see Cinque & Rizzi (2010: 62) for the interpretation of TopP,
for example).
Figure 4.21: PP analysis following Radford with case assignment in specifier position and little p
only necessary in order to retain the assumption that feature checking takes place in
specifier-head relations. If one were to allow the preposition to determine the case of its
object locally, then all this theoretical apparatus would not be necessary and it would be
possible to retain the well-established structure in (23).
Sternefeld (2006: 549–550) is critical of this analysis and compares it to Swiss cheese
(being full of holes). The comparison to Swiss cheese is perhaps even too positive since,
unlike Swiss cheese, the ratio of substance to holes in the analysis is extreme (2 words vs.
5 empty elements). We have already seen an analysis of noun phrases on page 70, where
the structure of an NP, which only consisted of an adjective klugen ‘clever’, contained
more empty elements than overt ones. The difference from the PP analysis discussed here is
that empty elements are only postulated in positions where overt determiners and nouns
actually occur. The little p projection, on the other hand, is motivated entirely theory-
internally. There is no theory-external motivation for any of the additional assumptions
made for the analysis in Figure 4.21 (see Sternefeld 2006: 549–550).
A variant of this analysis has been proposed by Hornstein, Nunes & Grohmann (2005:
124). The authors do without little p, which makes the structure less complex. They
assume the structure in (24), which corresponds to the AgrOP-subtree in Figure 4.21.
(24) [AgrP DP𝑘 [Agr′ P𝑖 +Agr [PP t𝑖 t𝑘 ]]]
The authors assume that the movement of the DP to SpecAgrP happens invisibly, that
is, covertly. This solves Radford’s problem and makes the assumption of pP redundant.
The authors motivate this analysis by pointing out agreement phenomena in Hun-
garian: Hungarian postpositions agree with the preceding noun phrase in person and
number. That is, the authors argue that English prepositional and Hungarian postpositional
phrases have the same structure derived by movement, although the movement is
covert in English.
In this way, it is possible to reduce the number and complexity of basic operations
and, in this sense, the analysis is minimal. These structures are, however, still incredibly
complex. No other kind of theory discussed in this book needs this amount of inflated
structure to analyze the combination of a preposition with a noun phrase. The structure
in (24) cannot be motivated by reference to data from English and it is therefore
impossible to acquire it from the linguistic input. A theory which assumes this kind
of structure would have to postulate a Universal Grammar with the information that
features can only be checked in (certain) specifier positions (see Chapters 13 and 16 for
more on Universal Grammar and language acquisition). For general remarks on (covert)
movement see Haider (2014: Section 2.3).
The category system, selectional mechanisms and projection of features would therefore
have to be made considerably more complicated when compared to a system which
simply base generates the orders or a system in which a constituent is moved out of the
IP, thereby creating a new IP.
Proposals that follow Cinque (1999) are problematic for similar reasons: Cinque as-
sumes the category AdverbP for the combination of an adverb and a VP. There is an
empty functional head, which takes the verbal projection as its complement and the ad-
verb surfaces in the specifier of this projection. In these systems, adverb phrases have to
pass on inflectional properties of the verb since verbs with particular inflectional proper-
ties (finiteness, infinitives with zu, infinitives without zu, participles) have to be selected
by higher heads (see page 185 and Section 9.1.4). There is of course the alternative of
using Agree for this, but then all selection would be nonlocal and, after all, selection is not
agreement. For further, more serious problems with this analysis like modification of adverbs
by adverbs in connection with partial fronting and restrictions on non-phrasality
of preverbal adverbials in English, see Haider (1997a: Section 5).
A special case of the adverb problem is the negation problem: Ernst (1992) studied the
syntax of negation more carefully and pointed out that negation can attach to several
different verbal projections (26a,b), to adjectives (26c) and adverbs (26d).
(26) a. Ken could not have heard the news.
b. Ken could have not heard the news.
c. a [not unapproachable] figure
d. [Not always] has she seasoned the meat.
If all of these projections are simply NegPs without any further properties (about verb
form, adjective part of speech, adverb part of speech), it would be impossible to account
for their different syntactic distributions. Negation is clearly just a special case of the
more general problem, since adverbs may attach to adjectives forming adjectival phrases
in the traditional sense and not adverb phrases in Cinque’s sense. For instance, the
adverb oft ‘often’ in (27) modifies lachender ‘laughing’ forming the adjectival phrase oft
lachender, which behaves like the unmodified adjectival participle lachender: it modifies
Mann ‘man’ and it precedes it.
(27) a. ein lachender Mann
a laughing man
‘a laughing man’
In Spanish, partial focus can be achieved not by special intonation, but rather only by
altruistic movement in order to move the object out of the focus. See also Bildhauer &
Cook (2010: 72) for a discussion of “altruistic” multiple frontings in German.
It is therefore not possible to assume that elements are moved to a particular position
in the tree in order to check some feature motivated by information structural proper-
ties. Since feature checking is a prerequisite for movement in current minimalist theory,
one would have to postulate a special feature, which only has the function of triggering
altruistic movement. Fanselow (2003a: Section 4; 2006: 8) has also shown that the order-
ing constraints that one assumes for topic, focus and sentence adverbs can be adequately
described by a theory which assumes firstly, that arguments are combined (in minimalist
terminology: merged) with their head one after the other and secondly, that adjuncts can
be adjoined to any projection level. The position of sentence adverbs directly before the
focused portion of the sentence receives a semantic explanation: since sentence adverbs
behave like focus-sensitive operators, they have to directly precede elements that they
refer to. It follows from this that elements which do not belong to the focus of an utter-
ance (topics) have to occur in front of the sentence adverb. It is therefore not necessary
to assume a special topic position to explain local reorderings in the middle field. This
analysis is also pursued in LFG and HPSG. The respective analyses are discussed in more
detail in the corresponding chapters.
4.6.2 Labeling
In the Minimalist Program, Chomsky tries to keep combinatorial operations and mech-
anisms as simple as possible. He motivates this with the assumption that the existence
of a UG with less language-specific knowledge is more plausible from an evolutionary
point of view than a UG which contains a high degree of language-specific knowledge
(Chomsky 2008: 135).
For this reason, he removes the projection levels of X̄ theory, traces, indices and “similar
descriptive technology” (Chomsky 2008: 138). All that remains is Merge and Move,
that is, Internal and External Merge. Internal and External Merge combine two syntactic
objects 𝛼 and 𝛽 into a larger syntactic object which is represented as a set { 𝛼, 𝛽 }. 𝛼
and 𝛽 can be either lexical items or internally complex syntactic objects. Internal Merge
moves a part of an object to its periphery.15 The result of internally merging an element
is a set { 𝛼, 𝛽 } where 𝛼 was a part of 𝛽. External Merge also produces a set with two
elements. However, two independent objects are merged. The objects that are created
by Merge have a certain category (a set of features). For instance, if one combines the
elements 𝛼 and 𝛽, one gets { l, { 𝛼, 𝛽 } }, where l is the category of the resulting object.
15
To be more specific, part of a syntactic object is copied and the copy is placed at the edge of the entire
object. The original of this copy is no longer relevant for pronunciation (Copy Theory of Movement).
This category is also called a label. Since it is assumed that all constituents are headed,
the category that is assigned to { 𝛼, 𝛽 } has to be either the category of 𝛼 or the category
of 𝛽. Chomsky (2008: 145) discusses the following two rules for the determination of the
label of a set.
(29) a. In { H, 𝛼 }, H an LI, H is the label.
b. If 𝛼 is internally merged to 𝛽, forming { 𝛼, 𝛽 } then the label of 𝛽 is the label of
{ 𝛼, 𝛽 }.
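The two rules in (29) can be made concrete with a small sketch. The following Python fragment is an illustration under simplifying assumptions and not a formalization taken from Chomsky (2008): lexical items are modeled by a hypothetical class LI, complex syntactic objects by a hypothetical class SO, and the function names are invented here. The function candidate_labels collects every label that (29a) and (29b) license, so that cases without a unique label become visible.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class LI:
    """A lexical item; `cat` is its category (hypothetical encoding)."""
    form: str
    cat: str


@dataclass(frozen=True)
class SO:
    """A complex syntactic object {a, b} together with its label."""
    members: frozenset
    label: Optional[str]


Node = Union[LI, SO]


def category(x: Node) -> Optional[str]:
    """Category of a lexical item or label of a complex object."""
    return x.cat if isinstance(x, LI) else x.label


def candidate_labels(a: Node, b: Node, internal: bool = False) -> set:
    """Labels licensed for {a, b} by (29a) and (29b).

    With internal=True, `a` is the element internally merged to `b`.
    """
    labels = set()
    # (29a): in {H, alpha}, H an LI, H is the label.
    for x in (a, b):
        if isinstance(x, LI):
            labels.add(x.cat)
    # (29b): if a is internally merged to b, the label of b is the label.
    if internal and category(b) is not None:
        labels.add(category(b))
    return labels


def merge(a: Node, b: Node, internal: bool = False) -> SO:
    """Form {a, b}; the label is set only if it is uniquely determined."""
    labels = candidate_labels(a, b, internal)
    label = next(iter(labels)) if len(labels) == 1 else None
    return SO(frozenset({a, b}), label)
```

Under these assumptions, combining a lexical verb with a complex nominal object yields a unique verbal label, whereas candidate_labels(LI("read", "V"), LI("books", "N")) returns both V and N, and internally merging a lexical item of category D with an object labeled C returns both D and C.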
As Chomsky notes, these rules are not unproblematic since the label is not uniquely de-
termined in all cases. An example is the combination of two lexical elements. If both
H and 𝛼 in (29a) are lexical items (LI), then both H and 𝛼 can be the label of the result-
ing structure. Chomsky notices that this could result in deviant structures, but claims
that this concern is unproblematic and ignores it. Chomsky offered a treatment of the
combination of two lexical items in his 2013 paper. The solution to the problem is to
assume that all combinations of lexical elements consist of a functional element and a
root (Marantz 1997; Borer 2005). Roots are by definition not considered labels16 and
hence the category of the functional element determines the category of the combination
(Chomsky 2013: 47). Such an analysis can only be rejected: the goal of the Minimalist
Program is to simplify the theoretical proposals to such an extent that the models of lan-
guage acquisition and language evolution become plausible, but in order to simplify basic
concepts it is stipulated that a noun cannot simply be a noun but needs a functional ele-
ment to tell the noun what category it has. Given that the whole point of Chomsky’s Bare
Phrase Structure (Chomsky 1995a) was the elimination of the unary branching structures
in X̄ theory, it is unclear why they are now reintroduced through the back door, only
more complex with an additional empty element.17 Theories like Categorial Grammar
and HPSG can combine lexical items directly without assuming any auxiliary projections
or empty elements. See also Rauh (2016) for a comparison of the treatment of syntactic
categories in earlier versions of Transformational Grammar, HPSG, Construction Gram-
mar, Role and Reference Grammar and root-based Neo-Constructivist proposals like the
one assumed by Chomsky (2013). Rauh concludes that the direct connection of syntac-
tic and semantic information is needed and that the Neo-Constructivism of Marantz
and Borer has to be rejected. For further criticism of Neo-Constructivist approaches see
Wechsler (2008a) and Müller & Wechsler (2014a: Sections 6.1 and 7).
The combination of a pronoun with a verbal projection poses a problem that is related
to what has been said above. In the analysis of He left, the pronoun he is a lexical element
16
Another category that is excluded as a label by definition is Conj, which stands for conjunction (Chomsky
2013: 45–46). This is a stipulation that is needed to get coordination to work. See below.
17
The old X̄ rule in (i.a) corresponds to the binary combination in (i.b).
(i) a. N′ → N
b. N → N-func root
In (i.a) a lexical noun is projected to an N′ and in (i.b), a root is combined with a functional nominal head
into a nominal category.
and hence would be responsible for the label of He left, since left is an internally complex
verbal projection in Minimalist theories. The result would be a nominal label rather than
a verbal one. To circumvent this problem, Chomsky (2013: 46) assumes that he has a
complex internal structure: ‘perhaps D-pro’, that is, he is (perhaps) composed of an
invisible determiner and a pronoun.
The case in which two non-LIs are externally merged (for instance a nominal and a
verbal phrase) is not discussed in Chomsky (2008). Chomsky (2013: 43–44) suggests that
a phrase XP is irrelevant for the labeling of { XP, YP } if XP is moved (or rather copied
in the Copy Theory of Movement) in a further step. Chomsky assumes that one of two
phrases in an { XP, YP } combination has to move, since otherwise labeling would be
impossible (p. 12).18 The following coordination example will illustrate this: Chomsky
assumes that the expression Z and W is analyzed as follows: first, Z and W are merged.
This expression is combined with Conj (30a) and in the next step Z is raised (30b).
(30) a. [𝛼 Conj [𝛽 Z W]]
b. [𝛾 Z [𝛼 Conj [𝛽 Z W]]]
Since Z in 𝛽 is only a copy, it does not count for labeling and 𝛽 can get the label of W. It
is stipulated for the combination of Z and 𝛼 that Conj cannot be the label and hence the
label of the complete structure is Z.19
A special case that is discussed by Chomsky is the Internal Merge of an LI 𝛼 with a
non LI 𝛽. According to rule (29a) the label would be 𝛼. According to (29b), the label
would be 𝛽 (see also Donati 2006). Chomsky discusses the combination of the pronoun
what with you wrote as an example.
(31) what [ C [you wrote t]]
If the label is determined according to (29b), one then has a syntactic object that would
be called a CP in the GB framework; since this CP is, moreover, interrogative, it can
18
His explanation is contradictory: on p. 11 Chomsky assumes that a label of a combination of two entities
with the same category is this category. But in his treatment of coordination, he assumes that one of the
conjuncts has to be raised, since otherwise the complete structure could not be labeled.
19
As Bob Borsley (p.c. 2013) pointed out to me, this makes wrong predictions for coordinations of two singular
noun phrases with and, since the result of the coordination is a plural DP and not a singular one like the
first conjunct. Theories like HPSG can capture this by grouping features in bundles that can be shared in
coordinated structures (syntactic features and nonlocal features, see Pollard & Sag (1994: 202)).
Furthermore the whole account cannot explain why (i.b) is ruled out.
The information about the conjunction has to be part of the representation for or Lee in order to be able to
contrast it with and Lee.
A further problem is that the label of 𝛼 should be the label of W since Conj does not count for label
determination. This would lead to a situation in which we have to choose between Z and W to determine
the label of 𝛾 . Following Chomsky’s logic, either Z or W would have to move on to make it possible to
label 𝛾 . Chomsky (2013) mentions this problem in footnote 40, but does not provide a solution.
Since wessen Schuhe ‘whose shoes’ is not a lexical item, rule (29b) has to be applied,
provided no additional rules are assumed to deal with such cases. This means that the
whole free relative clause wessen Schuhe danach besprenkelt sind is labeled as CP. For
the free relatives in (33) and (34) the labeling as a CP is an unwanted result, since they
function as subjects or objects of the matrix predicates and hence should be labeled
20
Chomsky (2013: 47) admits that there are many open questions as far as the labeling in free relative clauses
is concerned and hence admits that there remain many open questions with labeling as such.
21
Bausewein (1990: 155).
22
Thomas Gsella, taz, 12.02.1997, p. 20.
23
taz, taz mag, 08./09.08.1998, p. XII.
DP. However, since wessen Schuhe is a complex phrase and not a lexical item, (29a) does
not apply and hence there is no analysis of the free relative clause as a DP. Therefore,
it seems one must return to something like the GB analysis proposed by Groos & van
Riemsdijk (1981), at least for the German examples. Groos and van Riemsdijk assume
that free relatives consist of an empty noun that is modified by the relative clause like
a normal noun. In such an approach, the complexity of the relative phrase is irrelevant.
It is only the empty head that is relevant for labeling the whole phrase.24 However,
once empty heads are countenanced in the analysis, the application of (29a) to (31) is
undesirable since the application would result in two analyses for (32b): one with the
empty nominal head and one in which (31) is labeled as DP directly. One might argue
that in the case of several possible derivations, the most economical one wins, but the
assumption of transderivational constraints leads to undesired consequences (Pullum
2013: Section 5).
Chomsky (2013) abandons the labeling condition in (29b) and replaces it with general
labeling rules that hold for both internal and external Merge of two phrases. He dis-
tinguishes two cases. In the first case, labeling becomes possible since one of the two
24
Assuming an empty head is problematic since it may be used as an argument only in those cases in which
it is modified by an adjunct, namely the relative clause (Müller 1999a: 97). See also Ott (2011: 187) for a
later rediscovery of this problem. It can be solved in HPSG by assuming a unary projection that projects
the appropriate category from a relative clause. I also use the unary projection to analyze so-called non-
matching free relative clauses (Müller 1999a). In constructions with nonmatching free relative clauses,
the relative clause fills an argument slot that does not correspond to the properties of the relative phrase
(Bausewein 1990). Bausewein discusses the following example, in which the relative phrase is a PP but the
free relative fills the accusative slot of kocht ‘cooks’.
phrases of the set { XP, YP } is moved away. This case was already discussed above.
Chomsky writes about the other case: “X and Y are identical in a relevant respect, providing
the same label, which can be taken as the label of the SO” (p. 11). He sketches an analysis
of interrogative clauses on p. 13 in which the interrogative phrase has a Q feature and
the remaining sentence from which the Q phrase was extracted has a Q feature as well.
Since the two constituents share this property, the label of the complete clause will be
Q. This kind of labeling will “perhaps” also be used for labeling normal sentences con-
sisting of a subject and a verb phrase agreeing in person and number. These features
would be responsible for the label of the sentence. The exact details are not worked out,
but almost certainly will be more complex than (29b).
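The two cases that Chomsky (2013) distinguishes for { XP, YP } can be sketched as follows. This is only an illustration: since the exact details are not worked out, the representation of phrases as dictionaries with a label, a feature set and a flag for phrases that move on is an assumption made here, as is the function name label_xp_yp.

```python
from typing import Optional


def label_xp_yp(xp: dict, yp: dict) -> Optional[str]:
    """Label a set {XP, YP} along the lines of the two cases in the text.

    Each phrase is a dict with the hypothetical keys 'label',
    'features' (a set of strings) and 'moved' (True if the phrase is
    moved on in a further step and hence ignored for labeling).
    """
    stays = [p for p in (xp, yp) if not p["moved"]]
    # Case 1: one phrase moves on; the remaining phrase labels the set.
    if len(stays) == 1:
        return stays[0]["label"]
    # Case 2: both phrases stay; a single shared feature provides the label.
    if len(stays) == 2:
        shared = xp["features"] & yp["features"]
        if len(shared) == 1:
            return next(iter(shared))
    # Otherwise no unique label can be determined in this sketch.
    return None
```

With an interrogative phrase and a clause that both carry a Q feature, the function returns Q. With two phrases that neither move nor share a feature, it returns None, which corresponds to the situation in which labeling is impossible.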
A property that is inherent in both Chomsky (2005) and Chomsky (2013) is that the
label is exclusively determined from one of the merged objects. As Bob Borsley pointed
out to me, this is problematic for interrogative/relative phrases like (35).
(35) with whom
The phrase in (35) is both a prepositional phrase (because the first word is a prepo-
sition) and an interrogative/relative phrase (because the second word is an interroga-
tive/relative word). So, what is needed for the correct labeling of PPs like the one in (35)
is a well-defined way of percolating different properties from daughters to the mother
node.25
For further problems concerning labeling and massive overgeneration by recent for-
mulations of Merge see Fabregas et al. (2016).
Summarizing, one can say that labeling, which was introduced to simplify the theory
and reduce the amount of language-specific innate knowledge that has to be assumed,
can only be made to function with a considerable number of stipulations. For instance,
the combination of lexical elements requires the assumption of empty functional heads,
whose only purpose is determining the syntactic category of a certain lexical element.
If this corresponded to linguistic reality, knowledge about labeling, the respective func-
tional categories, and information about those categories that have to be ignored for
the labeling would have to be part of innate language-specific knowledge and nothing
would be gained. One would be left with bizarre analyses with an enormous degree
25
HPSG solves this problem by distinguishing head features including part of speech information and non-
local features containing information about extraction and interrogative/relative elements. Head features
are projected from the head, the nonlocal features of a mother node are the union of the nonlocal features
of the daughters minus those that are bound off by certain heads or in certain configurations.
Citko (2008: 926) suggests an analysis in which both daughters can contribute to the mother node. The
result is a complex label like { P, { D, N } }. This is a highly complex data structure and Citko does not provide
any information on how the relevant information that it contains is accessed. Is an object with the label
{ P, { D, N } } a P, a D or an N? One could say that P has priority since it is in the least embedded set, but D
and N are in one set. What about conflicting features? How does a preposition that selects for a DP decide
whether { D, N } is a D or an N? In any case it is clear that a formalization will involve recursive relations
that dig out elements of subsets in order to access their features. This adds to the overall complexity of
the proposal and is clearly dispreferred over the HPSG solution, which uses one part of speech value per
linguistic object.
4 Transformational Grammar – Minimalism
26
For the Categorial Grammar approach to work, it is necessary to assign the category x/x to an adjunct,
where x stands for the category of the head to which the adjunct attaches. For instance, an adjective
combines with a nominal object to form a nominal object. Therefore its category is n/n rather than adj.
Similarly, Stabler’s approach does not extend to adjuncts unless he is willing to assign the category
noun to attributive adjectives. One way out of this problem is to assume a special combination operation
for adjuncts and their heads (see Frey & Gärtner 2002: Section 3.2). Such a combination operation is
equivalent to the Head-Adjunct Schema of HPSG.
27
Pauline Jacobson (p.c. 2013) pointed out that the problem with intransitive verbs could be solved by assum-
ing that the last-merged element is the specifier and all non-last-merged elements are complements. This
would solve the problems with intransitive verbs and with the coordination of verbs in (36) but it would
not solve the problem of coordination in head-final languages as in (39). Furthermore, current Minimalist
approaches make use of multiple specifiers and this would be incompatible with the Jacobsonian proposal
unless one would be willing to state more complicated restrictions on the status of non-first-merged ele-
ments.
4.6 New developments and theoretical variants
Apart from this, theories assuming that syntactic objects merged with word groups are
specifiers do not allow for analyses in which two lexical verbs are directly coordinated,
as in (36):28
(36) He [knows and loves] this record.
For example, in an analysis suggested by Steedman (1991: 264), and (being the head)
is first merged with loves and then the result is merged with knows. The result of this
combination is a complex object that has the same syntactic properties as the combined
parts: the result is a complex verb that needs a subject and an object. After the combi-
nation of the conjunction with the two verbs, the result has to be combined with this
record and he. this record behaves in all relevant respects like a complement. Follow-
ing Chomsky’s definition, however, it should be a specifier, since it is combined with
the third application of Merge. The consequences are unclear. Chomsky assumes that
Merge does not specify constituent order. According to him, the linearization happens
at the level of Phonological Form (PF). The restrictions that hold there are not described
in his recent papers. However, if the categorization as complement or specifier plays a
role for linearization as in Kayne’s work (2011: 2, 12) and in Stabler’s proposal (see Sec-
tion 4.6.4), this record would have to be serialized before knows and loves, contrary to the
facts. This means that a Categorial Grammar-like analysis of coordination is not viable
and the only remaining option would seem to assume that knows is combined with an
object and then two VPs are coordinated. Kayne (1994: 61, 67) follows Wexler & Culi-
cover (1980: 303) in suggesting such an analysis and assumes that the object in the first
VP is deleted. However, Borsley (2005: 471) shows that such an analysis makes wrong
predictions, since (37a) would be derived from (37b) although these sentences differ in
meaning.29
(37) a. Hobbs whistled and hummed the same tune.
b. Hobbs whistled the same tune and hummed the same tune.
28
Chomsky (2013: 46) suggests the coordination analysis in (30): according to this analysis, the verbs would
be merged directly and one of the verbs would be moved around the conjunction in a later step of the
derivation. As was mentioned in the previous section, such analyses do not contribute to the goal of
making minimal assumptions about innate language specific knowledge since it is absolutely unclear how
such an analysis of coordination would be acquired by language learners. Hence, I will not consider this
coordination analysis here.
Another innovation of Chomsky’s 2013 paper is that he eliminates the concept of specifier. He writes
in footnote 27 on page 43: “There is a large and instructive literature on problems with Specifiers, but if
the reasoning here is correct, they do not exist and the problems are unformulable.” This is correct, but this
also means that everything that was explained with reference to the notion of specifier in the Minimalist
framework until now does not have an explanation any longer. If one follows Chomsky’s suggestion, a
large part of the linguistic research of the past years becomes worthless and has to be redone.
Chomsky did not commit himself to a particular view on linearization in his earlier work, but somehow
one has to ensure that the entities that were called specifier are realized in a position in which constituents
are realized that used to be called specifier. This means that the following remarks will be relevant even
under current Chomskyan assumptions.
29
See also Bartsch & Vennemann (1972: 102), Jackendoff (1977: 192–193), Dowty (1979: 143), den Besten (1983:
104–105), Klein (1985: 8–9) and Eisenberg (1994b) for similar observations and criticism of similar proposals
in earlier versions of Transformational Grammar.
Since semantic interpretation cannot see processes such as deletion that happen at the
level of Phonological Form (Chomsky 1995b: Chapter 3), the differences in meaning can-
not be explained by an analysis that deletes material.
In a further variant of the VP coordination analysis, there is a trace that is related to
this record. This would be a Right-Node-Raising analysis. Borsley (2005) has shown that
such analyses are problematic. Among the problematic examples that he discusses is the
following pair (see also Bresnan 1974: 615).
(38) a. He tried to persuade and convince him.
b. * He tried to persuade, but couldn’t convince, him.
The second example is ungrammatical if him is not stressed. In contrast, (38a) is well-
formed even with unstressed him. So, if (38a) were an instance of Right-Node-Raising,
the contrast would be unexpected. Borsley therefore excludes a Right-Node-Raising
analysis.
The third possibility for analyzing sentences like (36) assumes discontinuous constituents
and uses material twice: the two VPs knows this record and loves this record are coor-
dinated with the first VP being discontinuous. (See Crysmann (2001) and Beavers &
Sag (2004) for such proposals in the framework of HPSG.) However, discontinuous con-
stituents are not usually assumed in the Minimalist framework (see for instance Kayne
(1994: 67)). Furthermore, Abeillé (2006) showed that there is evidence for structures in
which lexical elements are coordinated directly. This means that one needs analyses
like the CG analysis discussed above, which would result in the problems with the spec-
ifier/complement status just discussed.
Furthermore, Abeillé has pointed out that NP/DP coordinations in head-final lan-
guages like Korean and Japanese present difficulties for Merge-based analyses. (39)
shows a Japanese example.
(39) Robin-to Kim
Robin-and Kim
‘Kim and Robin’
In the first step Robin is merged with to. In a second step Kim is merged. Since Kim is
a specifier, one would expect that Kim is serialized before the head, as is the case for
other specifiers in head-final languages.
Chomsky tries to get rid of the unary branching structures of standard X̄ theory, which
were needed to project lexical items like pronouns and determiners into full phrases,
referring to work by Muysken (1982). Muysken used the binary features MIN and MAX to
classify syntactic objects as minimal (words or word-like complex objects) or maximal
(syntactic objects that stand for complete phrases). Such a feature system can be used
to describe pronouns and determiners as [+MIN, +MAX]. Verbs like give, however, are
classified as [+MIN, −MAX]. They have to project in order to reach the [+MAX]-level. If
specifiers and complements are required to be [+MAX], then determiners and pronouns
fulfill this requirement without having to project from X⁰ via X′ to the XP level.
In Chomsky’s system, the MIN/MAX distinction is captured with respect to the com-
pleteness of heads (complete = phrase) and to the property of being a lexical item. How-
ever, there is a small but important difference between Muysken’s and Chomsky’s pro-
posal: the predictions with regard to the coordination data that was discussed above.
Within the category system of X̄ theory, it is possible to combine two X⁰s to get a new,
complex X⁰. This new object has basically the same syntactic properties that simple X⁰s
have (see Jackendoff 1977: 51 and Gazdar, Klein, Pullum & Sag 1985). In Muysken’s system,
the coordination rule (or the lexical item for the conjunction) can be formulated
such that the coordination of two +MIN items is a +MIN item. In Chomsky’s system an
analogous rule cannot be defined, since the coordination of two lexical items is not a
lexical item any longer.
Like Chomsky in his recent Minimalist work, Categorial Grammar (Ajdukiewicz 1935)
and HPSG (Pollard & Sag 1987; 1994: 39–40) do not (strictly) adhere to X̄ theory. Both
theories assign the symbol NP to pronouns (for CG see Steedman & Baldridge (2006:
615), see Steedman (2000: Section 4.4) for the incorporation of lexical type raising in
order to accommodate quantification). The phrase likes Mary and the word sleeps have
the same category in Categorial Grammar (s\np). In both theories it is not necessary to
project a noun like tree from N⁰ to N̄ in order to be able to combine it with a determiner or
an adjunct. Determiners and monovalent verbs in controlled infinitives are not projected
from an X0 level to the XP level in many HPSG analyses, since the valence properties
of the respective linguistic objects (an empty SUBCAT or COMPS list) are sufficient to de-
termine their combinatoric potential and hence their distribution (Müller 1996d; Müller
1999b). If the property of being minimal is needed for the description of a phenomenon,
the binary feature LEX is used in HPSG (Pollard and Sag 1987: 172; 1994: 22). However,
this feature is not needed for the distinction between specifiers and complements. This
distinction is governed by principles that map elements of an argument structure list
(ARG-ST) onto valence lists that are the value of the SPECIFIER and the COMPLEMENTS fea-
ture (abbreviated as SPR and COMPS respectively).30 Roughly speaking, the specifier in a
verbal projection is the least oblique argument of the verb for configurational languages
like English. Since the argument structure list is ordered according to the obliqueness
hierarchy of Keenan & Comrie (1977), the first element of this list is the least oblique
argument of a verb and this argument is mapped to the SPR list. The element in the SPR
list is realized to the left of the verb in SVO languages like English. The elements in the
COMPS list are realized to the right of their head. Approaches like the one by Ginzburg &
Sag (2000: 34, 364) that assume that head-complement phrases combine a word with its
arguments have the same problem with coordinations like (36) since the head of the VP
is not a word.31 However, this restriction for the head can be replaced by one that refers
to the LEX feature rather than to the property of being a word or lexical item.
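The mapping from ARG-ST to the valence lists described above can be illustrated with a minimal Python sketch. It assumes the simple case of a configurational SVO language like English; the function names `map_arg_st` and `realize` are hypothetical and arguments are represented as plain strings rather than feature descriptions.

```python
def map_arg_st(arg_st):
    """Map an ARG-ST list (ordered by obliqueness) onto valence lists.

    Assumption for this sketch: the least oblique argument goes to SPR,
    all remaining arguments go to COMPS.
    """
    return {"SPR": arg_st[:1], "COMPS": arg_st[1:]}


def realize(verb, arg_st):
    """Serialize SPR elements to the left and COMPS elements to the right of the verb."""
    valence = map_arg_st(arg_st)
    return valence["SPR"] + [verb] + valence["COMPS"]


valence = map_arg_st(["he", "this record"])
print(valence)                                   # SPR holds 'he', COMPS holds 'this record'
print(realize("knows", ["he", "this record"]))   # he knows this record
```

The sketch says nothing about the word/phrase status of the head, which is exactly the property that the LEX feature is meant to regulate.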
30
Some authors assume a three-way distinction between subjects, specifiers, and complements.
31
As mentioned above, a multidomination approach with discontinuous constituents is a possible solution
for the analysis of (36) (see Crysmann 2001 and Beavers & Sag 2004). However, the coordination of lexical
items has to be possible in principle as Abeillé (2006) has argued. Note also that the HPSG approach
to coordination cannot be taken over to the MP. The reason is that the HPSG proposals involve special
grammar rules for coordination and MP comes with the claim that there is only Merge. Hence the additional
introduction of combinatorial rules is not an option within the MP.
Pollard & Sag as well as Sag & Ginzburg assume flat structures for English. Since one
of the daughters is marked as lexical, it follows that the rule does not combine a head
with a subset of its complements and then apply a second time to combine the result
with further complements. Therefore, a structure like (40a) is excluded, since gave John
is not a word and hence cannot be used as the head daughter in the rule.
(40) a. [[gave John] a book]
b. [gave John a book]
Instead of (40a), only analyses like (40b) are admitted; that is, the head is combined with
all its arguments in one go. The alternative is to assume binary branching structures
(Müller 2015a; Müller & Ørsnes 2015: Section 1.2.2). In such an approach, the head
complement schema does not restrict the word/phrase status of the head daughter. The
binary branching structures in HPSG correspond to External Merge in the MP.
In the previous two sections, certain shortcomings of Chomsky’s labeling definition
and problems with the coordination of lexical items were discussed. In the following
section, I discuss Stabler’s definition of Merge in Minimalist Grammar, which is explicit
about labeling and in one version does not have the problems discussed above. I will
show that his formalization corresponds rather directly to HPSG representations.
(41) [> 3 [< 1 2]]
1 is the head in (41), 2 is the complement and 3 the specifier. The pointer points to the
part of the structure that contains the head. The daughters in a tree are ordered, that is,
3 is serialized before 1 and 1 before 2.
Stabler (2011a: 402) defines External Merge as follows:
(42) $\mathit{em}(t_1[{=}f],\, t_2[f]) = \begin{cases} [_{<}\ t_1\ \ t_2] & \text{if } t_1 \text{ has exactly 1 node} \\ [_{>}\ t_2\ \ t_1] & \text{otherwise} \end{cases}$
=f is a selection feature and f the corresponding category. When t1 [=f] and t2 [f] are
combined, the result is a tree in which the selection feature of t1 and the respective
category feature of t2 are deleted. The upper tree in (42) represents the combination of
a (lexical) head with its complement. t1 is positioned before t2 . The condition that t1 has
to have exactly one node corresponds to Chomsky’s assumption that the first Merge is
a Merge with a complement and that all further applications of Merge are Merges with
specifiers (Chomsky 2008: 146).
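The definition in (42) can be made concrete with a small Python sketch. This is an illustration under simplifying assumptions, not Stabler's own implementation: trees are binary, features are plain strings, and the names `Leaf`, `Node`, `head`, `check_first`, `em` and `words` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    word: str
    features: Tuple[str, ...]  # unchecked features, e.g. ("=D", "=D", "V")


@dataclass(frozen=True)
class Node:
    pointer: str  # "<": the head is in the left daughter, ">": in the right one
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def head(t: Tree) -> Leaf:
    """Follow the pointers down to the lexical head."""
    while isinstance(t, Node):
        t = t.left if t.pointer == "<" else t.right
    return t


def check_first(t: Tree) -> Tree:
    """Return a copy of t in which the first feature of the head is removed."""
    if isinstance(t, Leaf):
        return Leaf(t.word, t.features[1:])
    if t.pointer == "<":
        return Node("<", check_first(t.left), t.right)
    return Node(">", t.left, check_first(t.right))


def em(t1: Tree, t2: Tree) -> Tree:
    """External Merge along the lines of (42): t1 selects (=f), t2 has category f."""
    selector, category = head(t1).features[0], head(t2).features[0]
    if selector != "=" + category:
        raise ValueError(f"{selector} cannot select {category}")
    if isinstance(t1, Leaf):  # t1 has exactly one node: t2 is a complement
        return Node("<", check_first(t1), check_first(t2))
    return Node(">", check_first(t2), check_first(t1))  # otherwise: t2 is a specifier


def words(t: Tree) -> list:
    """Left-to-right yield of the ordered tree."""
    if isinstance(t, Leaf):
        return [t.word]
    return words(t.left) + words(t.right)


praises = Leaf("praises", ("=D", "=D", "V"))
who = Leaf("who", ("D", "-wh"))
marie = Leaf("Marie", ("D",))

vp = em(em(praises, who), marie)
print(words(vp))          # ['Marie', 'praises', 'who']
print(head(vp).features)  # ('V',)
```

The first application of `em` places who to the right of the verb as a complement; the second places Marie to the left as a specifier, since the selecting tree is no longer a single node.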
Stabler defines Internal Merge as follows:32
(43) $\mathit{im}(t_1[{+}f]) = [_{>}\ t_2^{>}\ \ t_1\{t_2[{-}f]^{>} \mapsto \epsilon\}]$
t1 is a tree with a subtree t2 which has the feature f with the value ‘−’. This subtree is
deleted ($t_2[{-}f]^{>} \mapsto \epsilon$) and a copy of the deleted subtree without the −f feature ($t_2^{>}$) is
positioned in specifier position. The element in specifier position has to be a maximal
projection. This requirement is visualized by the raised ‘>’.
Stabler provides an example derivation for the sentence in (44).
(44) who Marie praises
32
In addition to what is shown in (43), Stabler’s definition contains a variant of the Shortest Move Constraint
(SMC), which is irrelevant for the discussion at hand and hence will be omitted.
praises is a two-place verb with two =D features. This encodes the selection of two de-
terminer phrases. who and Marie are two Ds and they fill the object and subject position
of the verb. The resulting verbal projection Marie praises who is embedded under an
empty complementizer which is specified as +WH and hence provides the position for
the movement of who, which is placed in the specifier position of CP by the application
of Internal Merge. The −WH feature of who is deleted and the result of the application
of Internal Merge is who Marie praises.
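The last step of this derivation, the application of Internal Merge, can be sketched in Python as follows. The sketch rests on several assumptions that are not part of Stabler's definition: the input tree is built by hand, the empty complementizer is a leaf with an empty string, a non-head daughter counts as a maximal projection, and the Shortest Move Constraint is ignored. All names (`Leaf`, `Node`, `extract`, `im`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    word: str
    features: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Node:
    pointer: str  # "<": the head is in the left daughter, ">": in the right one
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]
EMPTY = Leaf("")  # stands for the empty element that replaces the moved subtree


def head(t: Tree) -> Leaf:
    while isinstance(t, Node):
        t = t.left if t.pointer == "<" else t.right
    return t


def with_head(t: Tree, new_head: Leaf) -> Tree:
    """Copy of t in which the lexical head is replaced by new_head."""
    if isinstance(t, Leaf):
        return new_head
    if t.pointer == "<":
        return Node("<", with_head(t.left, new_head), t.right)
    return Node(">", t.left, with_head(t.right, new_head))


def extract(t: Tree, f: str) -> Tuple[Tree, Optional[Tree]]:
    """Replace the first non-head daughter whose head starts with -f by EMPTY."""
    if isinstance(t, Leaf):
        return t, None
    head_is_left = t.pointer == "<"
    for is_left, daughter in ((True, t.left), (False, t.right)):
        h = head(daughter)
        if is_left != head_is_left and h.features[:1] == ("-" + f,):
            mover = with_head(daughter, Leaf(h.word, h.features[1:]))
            new_daughter = EMPTY
        else:
            new_daughter, mover = extract(daughter, f)
        if mover is not None:
            if is_left:
                return Node(t.pointer, new_daughter, t.right), mover
            return Node(t.pointer, t.left, new_daughter), mover
    return t, None


def im(t1: Tree) -> Tree:
    """Internal Merge: the head of t1 bears +f, some subtree t2 bears -f."""
    h = head(t1)
    if not h.features or not h.features[0].startswith("+"):
        raise ValueError("head of t1 has no +f feature to check")
    f = h.features[0][1:]
    checked = with_head(t1, Leaf(h.word, h.features[1:]))
    rest, mover = extract(checked, f)
    if mover is None:
        raise ValueError(f"no subtree with -{f} found")
    return Node(">", mover, rest)  # the moved subtree becomes the specifier


def words(t: Tree) -> list:
    if isinstance(t, Leaf):
        return [t.word] if t.word else []
    return words(t.left) + words(t.right)


# Marie praises who, embedded under an empty complementizer with +wh
vp = Node(">", Leaf("Marie"), Node("<", Leaf("praises", ("V",)), Leaf("who", ("-wh",))))
cp = Node("<", Leaf("", ("+wh", "C")), vp)
result = im(cp)
print(words(result))  # ['who', 'Marie', 'praises']
```

Note that the sketch returns new trees instead of modifying `cp`; the deletion in the derivational rule is modelled by building a copy that contains `EMPTY` in the original position of who.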
This analysis has a problem that was pointed out by Stabler himself in unpublished
work cited by Veenstra (1998: 124): it makes incorrect predictions in the case of monovalent
verbs. If a verb is combined with a DP, the definition of External Merge in (42)
treats this DP as a complement33 and serializes it to the right of the head. Instead of
analyses of sentences like (45a) one gets analyses of strings like (45b).
(45) a. Max sleeps.
b. * Sleeps Max.
To solve this problem, Stabler assumes that monovalent verbs are combined with a
nonovert object (see Veenstra (1998: 61, 124) who, quoting Stabler’s unpublished work,
also adopts this solution). With such an empty object, the resulting structure contains
the empty object as a complement. The empty object is serialized to the right of the verb
and Max is the specifier and hence serialized to the left of the verb as in (46).
(46) Max sleeps _.
Of course, any analysis of this kind is both stipulative and entirely ad hoc, being moti-
vated only by the wish to have uniform structures. Moreover, it exemplifies precisely
one of the methodological deficiencies of Transformational Generative Grammar dis-
cussed at length by Culicover & Jackendoff (2005: Section 2.1.2): the excessive appeal to
uniformity.
An alternative is to assume an empty verbal head that takes sleeps as complement and
Max as subject. Such an analysis is often assumed for ditransitive verbs in Minimalist
theories which assume Larsonian verb shells (Larson 1988). Larsonian analyses usually
assume that there is an empty verbal head that is called little v and that contributes a
causative meaning. As was discussed in Section 4.1.4, Adger (2003) adopts a little v-based
analysis for intransitive verbs. Omitting the TP projection, his analysis is provided in
Figure 4.22 on the next page. Adger argues that the analysis of sentences with unergative
verbs involves a little v that selects an agent, while the analysis of unaccusative verbs
involves a little v that does not select an N head. For unaccusatives, he assumes that
the verb selects a theme. He states that little v does not necessarily have a causative
meaning but introduces the agent. But note that in the example at hand the subject of
sleep is neither causing an event, nor is it necessarily deliberately doing something. So
it is an undergoer rather than an agent. This means that the assumption of the empty v
head is made for purely theory-internal reasons without any semantic motivation in the
33
Compare also Chomsky’s definition of specifier and complement in Section 4.6.3.
Figure 4.22: [vP Max [v̄ v sleep]]
34
For extensions see Frey & Gärtner (2002: Section 3.2).
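(48) $\mathit{em}(t_1[\alpha],\, t_2[x]) = \begin{cases} [_{<}\ t_1\ \ t_2] & \text{if } \alpha \text{ is } {=}x \\ [_{>}\ t_2\ \ t_1] & \text{if } \alpha \text{ is } x{=} \end{cases}$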
The position of the equal sign specifies on which side of the head an argument has
to be realized. This corresponds to forward and backward Application in Categorial
Grammar (see Section 8.1.1). Stabler calls this form of grammar Directional MG (DMG).
This variant of MG avoids the problem with monovalent verbs and the coordination data
is unproblematic as well if one assumes that the conjunction is a head with a variable
category that selects for elements of the same category to the left and to the right of
itself. know and love would both select an object to the right and a subject to the left and
this requirement would be transferred to knows and loves.35 See Steedman (1991: 264) for
the details of the CG analysis and Bouma & van Noord (1998: 52) for an earlier HPSG
proposal involving directionality features along the lines suggested by Stabler for his
DMGs.
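The directional variant of External Merge in (48) can be sketched in Python as follows. As before, this is a minimal illustration under assumptions: features are plain strings, the lexical entries for sleeps and knows are made up for the example, and all names are hypothetical. The coordination analysis with a variable category is not modelled.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    word: str
    features: Tuple[str, ...]


@dataclass(frozen=True)
class Node:
    pointer: str  # "<": the head is in the left daughter, ">": in the right one
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def head(t: Tree) -> Leaf:
    while isinstance(t, Node):
        t = t.left if t.pointer == "<" else t.right
    return t


def check_first(t: Tree) -> Tree:
    """Copy of t in which the first feature of the head is removed."""
    if isinstance(t, Leaf):
        return Leaf(t.word, t.features[1:])
    if t.pointer == "<":
        return Node("<", check_first(t.left), t.right)
    return Node(">", t.left, check_first(t.right))


def em(t1: Tree, t2: Tree) -> Tree:
    """Directional External Merge: =x selects to the right, x= to the left."""
    alpha, x = head(t1).features[0], head(t2).features[0]
    if alpha == "=" + x:
        return Node("<", check_first(t1), check_first(t2))
    if alpha == x + "=":
        return Node(">", check_first(t2), check_first(t1))
    raise ValueError(f"{alpha} cannot select {x}")


def words(t: Tree) -> list:
    if isinstance(t, Leaf):
        return [t.word]
    return words(t.left) + words(t.right)


# A monovalent verb selects its only argument to the left (D=).
sleeps = Leaf("sleeps", ("D=", "V"))
max_ = Leaf("Max", ("D",))
print(words(em(sleeps, max_)))  # ['Max', 'sleeps']

# A bivalent verb selects an object to the right (=D) and a subject to the left (D=).
knows = Leaf("knows", ("=D", "D=", "V"))
record = Leaf("this record", ("D",))
he = Leaf("he", ("D",))
print(words(em(em(knows, record), he)))  # ['he', 'knows', 'this record']
```

In contrast to (42), the position of the argument no longer depends on whether the selecting tree consists of exactly one node, so no empty object is needed for monovalent verbs.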
Since we are dealing with syntactic aspects exclusively, only a subset of the used fea-
tures is relevant: valence information and information about part of speech and certain
morphosyntactic properties that are relevant for the external distribution of a phrase is
represented in a feature description under the path SYNSEM|LOC|CAT. The features that
are particularly interesting here are the so-called head features. Head features are shared
between a lexical head and its maximal projection. The head features are located inside
CAT and are grouped together under the path HEAD. Complex hierarchical structure is
also modelled with feature value pairs. The constituents of a complex linguistic object
are usually represented as parts of the representation of the complete object. For in-
stance, there is a feature HEAD-DAUGHTER the value of which is a feature structure that
models a linguistic object that contains the head of a phrase. The Head Feature Principle
(50) refers to this daughter and ensures that the head features of the head daughter are
identical with the head features of the mother node, that is, they are identical to the head
features of the complete object.
(50) $\textit{headed-phrase} \Rightarrow \begin{bmatrix} \text{SYNSEM|LOC|CAT|HEAD} & \boxed{1} \\ \text{HEAD-DTR|SYNSEM|LOC|CAT|HEAD} & \boxed{1} \end{bmatrix}$
Identity is represented by boxes with the same number.
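The effect of the Head Feature Principle can be illustrated with a minimal Python sketch. Feature structures are approximated by nested dictionaries, and structure sharing (the boxed 1) is approximated by object identity; both are assumptions of this sketch, as are the helper names `get_path`, `project` and `satisfies_hfp` and the feature values POS and VFORM.

```python
def get_path(fs, path):
    """Follow a path like 'SYNSEM|LOC|CAT|HEAD' through nested dictionaries."""
    for attribute in path.split("|"):
        fs = fs[attribute]
    return fs


def project(head_dtr, non_head_dtrs):
    """Build a headed phrase whose HEAD value is shared with the head daughter."""
    shared_head = get_path(head_dtr, "SYNSEM|LOC|CAT|HEAD")
    return {
        "SYNSEM": {"LOC": {"CAT": {"HEAD": shared_head}}},
        "HEAD-DTR": head_dtr,
        "NON-HEAD-DTRS": non_head_dtrs,
    }


def satisfies_hfp(phrase):
    """True if mother and head daughter share one and the same HEAD value."""
    mother_head = get_path(phrase, "SYNSEM|LOC|CAT|HEAD")
    daughter_head = get_path(phrase, "HEAD-DTR|SYNSEM|LOC|CAT|HEAD")
    return mother_head is daughter_head  # identity models the boxed 1


verb = {"SYNSEM": {"LOC": {"CAT": {"HEAD": {"POS": "verb", "VFORM": "fin"}}}}}
noun = {"SYNSEM": {"LOC": {"CAT": {"HEAD": {"POS": "noun"}}}}}

vp = project(verb, [noun])
print(satisfies_hfp(vp))  # True

# A phrase whose HEAD value does not come from the head daughter violates (50).
bad = dict(vp)
bad["SYNSEM"] = {"LOC": {"CAT": {"HEAD": {"POS": "noun"}}}}
print(satisfies_hfp(bad))  # False
```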
Ginzburg & Sag (2000: 30) represent all daughters of a linguistic object in a list that is
given as the value of the DAUGHTERS attribute. The value of the feature HEAD-DAUGHTER
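is identified with one of the elements of the DAUGHTERS list:
(51) a. $\begin{bmatrix} \text{HEAD-DTR} & \boxed{1} \\ \text{DTRS} & \langle\, \boxed{1}\,\alpha,\ \beta \,\rangle \end{bmatrix}$
b. $\begin{bmatrix} \text{HEAD-DTR} & \boxed{1} \\ \text{DTRS} & \langle\, \alpha,\ \boxed{1}\,\beta \,\rangle \end{bmatrix}$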
𝛼 and 𝛽 are shorthands for descriptions of linguistic objects. The important point about
the two descriptions in (51) is that the head daughter is identical to one of the two daugh-
ters, which is indicated by the 1 in front of 𝛼 and 𝛽, respectively. In the first feature
description, the first daughter is the head and in the second description, the second
daughter is the head. Because of the Head Feature Principle, the syntactic properties of
the whole phrase are determined by the head daughter. That is, the syntactic properties
of the head daughter correspond to the label in Chomsky’s definition. This notation cor-
responds exactly to the one that is used by Stabler: (51a) is equivalent to (52a) and (51b)
is equivalent to (52b).
(52) a. [< α β]
b. [> α β]
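The correspondence between the two notations can be spelled out in a short Python sketch. It assumes binary branching, represents daughters as dictionaries with a PHON value, and models the identity between HEAD-DTR and one element of DTRS by object identity; the function name `to_pointer_tree` is hypothetical.

```python
def to_pointer_tree(phrase):
    """Translate a HEAD-DTR/DTRS description into a Stabler-style pointer tree."""
    first, second = phrase["DTRS"]
    if phrase["HEAD-DTR"] is first:
        return ("<", first["PHON"], second["PHON"])   # the head is the first daughter
    if phrase["HEAD-DTR"] is second:
        return (">", first["PHON"], second["PHON"])   # the head is the second daughter
    raise ValueError("HEAD-DTR must be identical to one of the daughters")


alpha = {"PHON": "alpha"}
beta = {"PHON": "beta"}

print(to_pointer_tree({"HEAD-DTR": alpha, "DTRS": [alpha, beta]}))  # ('<', 'alpha', 'beta')
print(to_pointer_tree({"HEAD-DTR": beta, "DTRS": [alpha, beta]}))   # ('>', 'alpha', 'beta')
```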
An alternative structuring of this basic information, discussed by Pollard & Sag (1994:
Chapter 9), uses the two features HEAD-DAUGHTER and NON-HEAD-DAUGHTERS rather
than HEAD-DAUGHTER and DAUGHTERS. This gives rise to feature descriptions like (53a),
which corresponds directly to Chomsky’s set-based representations, discussed in Section
4.6.2 and repeated here as (53b).
(53) a. $\begin{bmatrix} \text{HEAD-DTR} & \alpha \\ \text{NON-HEAD-DTRS} & \langle\, \beta \,\rangle \end{bmatrix}$
b. { α, { α, β } }
The representation in (53a) does not contain information about linear precedence of 𝛼
and 𝛽. Linear precedence of constituents is constrained by linear precedence rules, which
are represented independently from constraints regarding (immediate) dominance.
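That a description like (53a) and a set like (53b) carry the same, order-free information can be seen in the following Python sketch. It assumes a single non-head daughter and uses strings as stand-ins for linguistic objects; `to_set` and `label` are hypothetical helper names.

```python
def to_set(phrase):
    """Translate HEAD-DTR/NON-HEAD-DTRS into a set of the form {a, {a, b}}."""
    head_dtr = phrase["HEAD-DTR"]
    (non_head_dtr,) = phrase["NON-HEAD-DTRS"]  # exactly one non-head daughter assumed
    return frozenset({head_dtr, frozenset({head_dtr, non_head_dtr})})


def label(labeled_set):
    """The label is the member that is not itself a set."""
    return next(member for member in labeled_set if not isinstance(member, frozenset))


phrase = {"HEAD-DTR": "alpha", "NON-HEAD-DTRS": ["beta"]}
representation = to_set(phrase)

print(label(representation))  # alpha
# Sets are unordered, so nothing is said about linear precedence:
print(representation == frozenset({frozenset({"beta", "alpha"}), "alpha"}))  # True
```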
The definition of Internal Merge in (43) corresponds to the Head-Filler Schema in
HPSG (Pollard & Sag 1994: 164). Stabler’s derivational rule deletes the subtree t2 [−f]> .
HPSG is monotonic, that is, nothing is deleted in structures that are licensed by a gram-
mar. Instead of deleting t2 inside of a larger structure, structures containing an empty
element (NB – not a tree) are licensed directly.36 Both in Stabler’s definition and in the
HPSG schema, t2 is realized as filler in the structure. In Stabler’s definition of Internal
Merge, the category of the head daughter is not mentioned, but Pollard & Sag (1994: 164)
restrict the head daughter to be a finite verbal projection. Chomsky (2007: 17) assumes
that all operations but External Merge operate on phase level. Chomsky assumes that
CP and v*P are phases. If this constraint is incorporated into the definition in (43), the re-
strictions on the label of t1 would have to be extended accordingly. In HPSG, sentences
like (54) have been treated as VPs, not as CPs and hence Pollard & Sag’s requirement
that the head daughter in the Head-Filler Schema be verbal corresponds to Chomsky’s
restriction.
(54) Bagels, I like.
Hence, despite minor presentational differences, we may conclude that the formalization
of Internal Merge and that of the Head-Filler Schema are very similar.
An important difference between HPSG and Stabler’s definition is that ‘movement’ is
not feature driven in HPSG. This is an important advantage since feature-driven move-
ment cannot deal with instances of so-called altruistic movement (Fanselow 2003a), that
is, movement of a constituent that happens in order to make room for another con-
stituent in a certain position (see Section 4.6.1.4).
A further difference between general X̄ theory and Stabler’s formalization of Internal
Merge on the one hand and HPSG on the other is that in the latter case there is no
restriction regarding the completeness (or valence ‘saturation’) of the filler daughter.
Whether the filler daughter has to be a maximal projection (English) or not (German),
36
See Bouma, Malouf & Sag (2001) for a traceless analysis of extraction in HPSG and Müller (2017a: Chapter 7)
and Chapter 19 of this book for a general discussion of empty elements.
follows from restrictions that are enforced locally when the trace is combined with its
head. This makes it possible to analyze sentences like (55) without remnant movement.37
(55) Gelesen𝑖 hat 𝑗 das Buch keiner _𝑖 _ 𝑗 .
read has the book nobody
In contrast, Stabler is forced to assume an analysis like the one in (56b) (see also G. Müller
(1998) for a remnant movement analysis). In a first step, das Buch is moved out of the VP
(56a) and in a second step, the emptied VP is fronted as in (56b).
(56) a. Hat [das Buch] 𝑗 [keiner [VP _ 𝑗 gelesen]].
b. [VP _ 𝑗 Gelesen]𝑖 hat [das Buch] 𝑗 [keiner _𝑖 ].
Haider (1993: 281), De Kuthy & Meurers (2001: Section 2) and Fanselow (2002) showed
that this kind of remnant movement analysis is problematic for German. The only phe-
nomenon that Fanselow identified as requiring a remnant movement analysis is the prob-
lem of multiple fronting (see Müller (2003a) for an extensive discussion of relevant data).
Müller (2005b,c; 2017a) develops an alternative analysis of these multiple frontings which
uses an empty verbal head in the Vorfeld, but does not assume that adjuncts or arguments
like das Buch in (56b) are extracted from the Vorfeld constituent. Instead of the remnant
movement analysis, the mechanism of argument composition from Categorial Grammar
(Geach 1970; Hinrichs & Nakazawa 1994a) is used to ensure the proper realization of ar-
guments in the sentence. Chomsky (2007: 20) already uses argument composition as part
of his analysis of TPs and CPs. Hence both remnant movement and argument composi-
tion are assumed in recent Minimalist proposals. The HPSG alternative, however, would
appear to need less theoretical apparatus and hence has to be preferred for reasons of
parsimony.
Finally, it should be mentioned that all transformational accounts have problems with
Across the Board extraction like (57a) and (57b) in which one element corresponds to
several gaps.
(57) a. Bagels, I like and Ellison hates.38
b. The man who𝑖 [Mary loves _𝑖 ] and [Sally hates _𝑖 ] computed my tax.
This problem was solved for GPSG by Gazdar (1981b) and the solution carries over to
HPSG. The Minimalist community tried to address these problems by introducing op-
erations like sideward movement (Nunes 2004) where constituents can be inserted into
sister trees. So in the example in (57a), Bagels is copied from the object position of hates
into the object position of like and then these two copies are related to the fronted el-
ement. Kobele criticized such solutions since they overgenerate massively and need
37
See also Müller & Ørsnes (2013b) for an analysis of object shift in Danish that can account for verb fronting
without remnant movement. The analysis does not have any of the problems that remnant movement
analyses have.
38
Pollard & Sag (1994: 205).
HPSG in Müller (2013c) and in the previous subsections. However, I overlooked one cru-
cial difference between the usual assumptions about selection in Minimalist proposals
on the one hand and Categorial Grammar, Dependency Grammar, LFG, HPSG, TAG,
and Construction Grammar on the other hand: what is selected in the former type of
theory is a single feature, while the latter theories select for feature bundles. This seems
to be a small difference, but the consequences are rather severe. Stabler’s definition of
External Merge that was given on page 165 removes the selection feature (=f) and the
corresponding feature of the selected element (f). In some publications and in the intro-
duction in this book, the selection features are called uninterpretable features and are
marked with a u. The uninterpretable features have to be checked and then they are
removed from the linguistic object as in Stabler’s definition. The fact that they have
been checked is represented by striking them out. It is said that all uninterpretable features
have to be checked before a syntactic object is sent to the interfaces (semantics
and pronunciation). If uninterpretable features are not checked, the derivation crashes.
Adger (2003: Section 3.6) explicitly discusses the consequences of these assumptions: a
selecting head checks a feature of the selected object. It is not possible to check features
of elements that are contained in the object that a head combines with. Only features
at the topmost node, the so-called root node, can be checked with external merge. The
only way features inside complex objects can be checked is by means of movement. This
means that a head may not combine with a partially saturated linguistic object, that is,
with a linguistic object that has an unchecked selection feature. I will discuss this design
decision with reference to an example provided by Adger (2003: 95). The noun letters
selects for a P and Ps select for an N. The analysis of (59a) is depicted left in Figure 4.23.
(59) a. letters to Peter
b. * letters to
Figure 4.23: The analysis of letters to Peter according to Adger (2003: 95)
The string in (59b) is ruled out since the uninterpretable N feature of the preposition to
is not checked. So this integrates the constraint that all dependent elements have to be
maximal into the core mechanism. This makes it impossible to analyze examples like
(60) in the most straightforward way, namely as involving a complex preposition and a
noun that is lacking a determiner:
(60) vom Bus
from.the bus
In theories in which complex descriptions can be used to describe dependents, the dependent
may be partly saturated. So for instance in HPSG, fused prepositions like vom
‘from.the’ can select an N̄, which is a nominal projection lacking a specifier:
(61) N[SPR ⟨ DET ⟩]
The description in (61) is an abbreviation for an internally structured set of feature-value
pairs (see Section 9.1.1). The example here is given for the illustration of the differences
only, since there may be ways of accounting for such cases in a single-feature-Merge
system. For instance, one could assume a DP analysis and have the complex preposition
select a complete NP (something of category N with no uninterpretable features). Al-
ternatively, one can assume that there is indeed a full PP with all the structure that is
usually assumed and the fusion of preposition and determiner happens during pronun-
ciation. The first suggestion eliminates the option of assuming an NP analysis as it was
suggested by Bruening (2009) in the Minimalist framework.
Apart from this illustrative example with a fused preposition, there are other cases
in which one may want to combine unsaturated linguistic objects. I already discussed
coordination examples above. Another example is the verbal complex in languages like
German, Dutch, and Japanese. Of course there are analyses of these languages that do
not assume a verbal complex (G. Müller 1998; Wurmbrand 2003a), but these are not
without problems. Some of the problems were discussed in the previous section as well.
Summing up this brief subsection, it has to be said that the feature checking mech-
anism that is built into the conception of Merge is more restrictive than the selection
that is used in Categorial Grammar, Lexical Functional Grammar, HPSG, Construction
Grammar, and TAG. In my opinion, it is too restrictive.
4.6.6 Summary
In sum, one can say that the computational mechanisms of the Minimalist Program (e.g.,
transderivational constraints and labeling), as well as the theory of feature-driven move-
ment are problematic and the assumption of empty functional categories is sometimes
ad hoc. If one does not wish to assume that these categories are shared by all languages,
then proposing two mechanisms (Merge and Move) does not represent a simplification
of grammar since every single functional category which must be stipulated constitutes
a complication of the entire system.
The labeling mechanism is not yet worked out in detail, does not account for the
phenomena it was claimed to provide accounts for, and hence should be replaced by the
head/functor-based labeling that is used in Categorial Grammar and HPSG.
4.7 Summary and classification
4.7.2 Formalization
Section 3.6.2 commented on the lack of formalization in transformational grammar up
until the 1990s. The general attitude towards formalization did not change in the mini-
malist era and hence there are very few formalizations and implementations of Minimal-
ist theories.
Stabler (2001) shows how it is possible to formalize and implement Kayne’s theory
of remnant movement. In Stabler’s implementation,³⁹ there are no transderivational
³⁹ His system is available at: http://linguistics.ucla.edu/people/stabler/coding.html. 2020-07-16.
constraints and no numerations,⁴⁰ and he does not assume Agree (see Fong 2014: 132), etc. The
following is also true of Stabler’s implementation of Minimalist Grammars and GB systems:
there are no large grammars. Stabler’s grammars are small, meant as a proof of
concept and purely syntactic. There is no morphology,⁴¹ no treatment of multiple agreement
(Stabler 2011b: Section 27.4.3) and above all no semantics. PF and LF processes are
not modelled.⁴² The grammars and the computational system developed by Sandiway
Fong are of similar size and faithfulness to the theory (Fong & Ginsburg 2012; Fong 2014):
the grammar fragments are small and encode syntactic aspects such as labeling directly in
the phrase structure (Fong & Ginsburg 2012: Section 4), and therefore fall behind X̄ theory.
Furthermore, they do not contain any morphology. Spell-Out is not implemented,
so in the end it is not possible to parse or generate any utterances.⁴³ Herring’s (2016)
dissertation is a promising beginning. Herring developed a system that can be used for
⁴⁰ There is a numeration lexicon in Veenstra (1998: Chapter 9). This lexicon consists of a set of numerations,
which contain functional heads, which can be used in sentences of a certain kind. For example, Veenstra
assumes numerations for sentences with bivalent verbs and subjects in initial position, for embedded sentences
with monovalent verbs, for wh-questions with monovalent verbs, and for polar interrogatives with
monovalent verbs. An element from this set of numerations corresponds to a particular configuration and
a phrasal construction in the spirit of Construction Grammar. Veenstra’s analysis is not a formalization
of the concept of the numeration that one finds in Minimalist works. Normally, it is assumed that a nu-
meration contains all the lexical entries which are needed for the derivation of a sentence. As (i) shows,
complex sentences can consist of combinations of sentences with various different sentence types:
(i) Der Mann, der behauptet hat, dass Maria gelacht hat, steht neben der Palme, die im letzten
the man who claimed has that Maria laughed has stands next.to the palm.tree which in last
Jahr gepflanzt wurde.
year planted was
‘The man who claimed Maria laughed is standing next to the palm tree that was planted last year.’
In (i), there are two relative clauses with verbs of differing valence, an embedded sentence with a monova-
lent verb and the matrix clause. Under a traditional understanding of numerations, Veenstra would have
to assume an infinite numeration lexicon containing all possible combinations of sentence types.
⁴¹ The test sentences have the form as in (i).
grammar development in the Minimalist Program. In the version described in his thesis
the system could generate but was unable to parse sentences (p. 138, 143). PF phenom-
ena were not modeled (p. 142–143) and the two example fragments are small and come
without a semantics (p. 143).
The benchmark here has been set by implementations of grammars in constraint-
based theories; for example, the HPSG grammars of German (Müller & Kasper 2000),
English (Flickinger, Copestake & Sag 2000) and Japanese (Siegel 2000) that were devel-
oped in the 90s as part of Verbmobil (Wahlster 2000) for the analysis of spoken language or
the LFG or CCG systems with large coverage. These grammars can analyze up to 83 % of
utterances in spoken language (for Verbmobil from the domains of appointment schedul-
ing and trip planning) or written language. Linguistic knowledge is used to generate
and analyze linguistic structures. In one direction, one arrives at a semantic representa-
tion of a string of words and in the other one can create a string of words from a given
semantic representation. A morphological analysis is indispensable for analyzing nat-
urally occurring data from languages with elaborated morphological marking systems.
In the remainder of this book, the grammars and computational systems developed in
other theories will be discussed at the beginning of the respective chapters.
The reason for the lack of larger fragments inside of GB/MP could have to do with the
fact that the basic assumptions of the Minimalist community change relatively quickly:
In Minimalism, the triggering head is often called a probe, the moving element is
called a goal, and there are various proposals about the relations among the features
that trigger syntactic effects. Chomsky (1995b: p. 229) begins with the assumption
that features represent requirements which are checked and deleted when the re-
quirement is met. The first assumption is modified almost immediately so that only
a proper subset of the features, namely the ‘formal’, ‘uninterpretable’ features are
deleted by checking operations in a successful derivation (Collins, 1997; Chomsky
1995b: §4.5). Another idea is that certain features, in particular the features of cer-
tain functional categories, may be initially unvalued, becoming valued by entering
into appropriate structural configurations with other elements (Chomsky 2008; Hi-
raiwa, 2005). And some recent work adopts the view that features are never deleted
(Chomsky 2007: p. 11). These issues remain unsolved. (Stabler 2011a: 397)
In order to fully develop a grammar fragment, one needs at least three years (compare
the time span between the publication of Barriers (1986) and Stabler’s implementation
(1992)). Particularly large grammars require the knowledge of several researchers work-
ing in international cooperation over the space of years or even decades. This process is
disrupted if fundamental assumptions are repeatedly changed at short intervals.
As far as large-scale coverage is concerned, the more recent work by John Torr is an
exception to what was said above.⁴⁴ Torr, Stanojevic, Steedman & Cohen (2019) state that
their parser is the first one to take up the Sproat & Lappin Challenge to the Minimalist
community (2005). The work of the authors is impressive and they really implemented
⁴⁴ It is not an exception as far as theory development is concerned. Torr’s system is based on Chomsky
(1995b), so he did not follow new trends but stayed within a certain setting.
⁴⁵ Torr explained in p.c. 2019 that these 45 rules can be folded into two Merge functions and two Move
functions. But in the end this is just a clever way of hiding complexity. It is like Chomsky (2005: 12) revising
the theory with Move and Merge into one with just one operation Merge but assuming two subcases of
Internal and External Merge.
the type described in Torr & Stabler (2016). Figure 4.24 shows the analysis that was suggested
by Torr & Stabler (2016) and Figure 4.25 the HPSG analog.
Figure 4.24: Derivation tree of who Jack likes in Directional Minimalist Grammar according to Torr & Stabler (2016)
Directional Minimalist Grammars use the ‘=’ sign to indicate the direction in which an argument is required. =d
means that a DP is required to the left of a head and d= encodes the requirement of a DP
to the right. This is like the ‘/’ notation of Categorial Grammar (see Chapter 8). likes has
the category d= =d v, which means that it is a verb requiring a d to its right (the object)
and a d to its left (the subject). who is of category d and has a −wh feature, something
that has to be checked for a derivation to be complete. Jack is the subject of likes and
fulfills the =d requirement of likes. Items like [pres] and [int] are empty elements. [pres]
has a +case feature and can make Jack move to its specifier. The movement consumes
the −case feature and puts Jack to the front of the string. This looks like a unary pro-
jection in the derivation tree. The empty interrogative head [int] selects for a t to its
right. The result is a C projection that has a +wh feature. In the final step, who, which is
−wh, moves to the left and the wh features are removed. The important thing is that the
information about the phonology of who and its wh feature is percolated up in the tree
until it is finally bound off in the last derivation step.
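The derivation just described can be replayed with a short Python sketch. It is a toy reconstruction under stated assumptions, not the system of Torr & Stabler (2016): the class `Expr` and the functions `merge` and `move` are hypothetical, the feature lists assumed for who (d -wh), Jack (d -case), [pres] (v= +case t) and [int] (t= +wh c) are guesses that go beyond what the prose above specifies, and ASCII `-` and `+` stand in for the feature signs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    string: str          # phonology placed so far
    feats: tuple         # features of the head that are still to be checked
    movers: tuple = ()   # (phonology, remaining features) of items waiting to move


def join(*parts):
    """Concatenate phonologies, skipping empty elements such as [pres]."""
    return " ".join(p for p in parts if p)


def merge(head, arg):
    sel = head.feats[0]
    if sel.endswith("="):        # "d=": argument is required to the right
        cat, right = sel[:-1], True
    elif sel.startswith("="):    # "=d": argument is required to the left
        cat, right = sel[1:], False
    else:
        raise ValueError(f"{sel} is not a selector")
    if arg.feats[0] != cat:
        raise ValueError(f"expected {cat}, found {arg.feats[0]}")
    rest = arg.feats[1:]
    movers = head.movers + arg.movers
    if rest:  # licensee features left: pass phonology and features up the tree
        return Expr(head.string, head.feats[1:], movers + ((arg.string, rest),))
    string = join(head.string, arg.string) if right else join(arg.string, head.string)
    return Expr(string, head.feats[1:], movers)


def move(expr):
    lic = expr.feats[0]
    if not lic.startswith("+"):
        raise ValueError(f"{lic} is not a licensor")
    hits = [m for m in expr.movers if m[1][0] == "-" + lic[1:]]
    if len(hits) != 1:
        raise ValueError("need exactly one matching mover")
    phon, feats = hits[0]
    others = tuple(m for m in expr.movers if m is not hits[0])
    if feats[1:]:  # further licensee features: keep waiting
        return Expr(expr.string, expr.feats[1:], others + ((phon, feats[1:]),))
    return Expr(join(phon, expr.string), expr.feats[1:], others)  # land on the left


likes = Expr("likes", ("d=", "=d", "v"))
who = Expr("who", ("d", "-wh"))
jack = Expr("Jack", ("d", "-case"))
pres = Expr("", ("v=", "+case", "t"))   # empty element [pres]
int_ = Expr("", ("t=", "+wh", "c"))     # empty element [int]

vp = merge(merge(likes, who), jack)
tp = move(merge(pres, vp))              # Jack checks -case and is fronted
cp = move(merge(int_, tp))              # who checks -wh and is fronted
```

In this sketch the `movers` component plays the role of the percolated information: the phonology of who travels with its −wh feature from the first merge step until `move` finally places it, which is what invites the comparison with SLASH percolation.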
Figure 4.25 shows the HPSG analog. The information about the local properties of the
wh word, including its phonology, is passed up in the tree until it is bound off in a
filler-head configuration. The Filler-Head Schema binds off the nonlocal dependency and
makes sure that the phonology of the filler is not realized twice (see Reape 1994, Müller
2020a: Section 6 on linearization domains and Abeillé & Chaves 2020: Section 7 on
multi-dominance approaches in HPSG). An alternative to a binary branching Filler-Head
Schema would be a unary branching rule that binds off the element in SLASH and adds
the stored phonology to the phonology of the daughter. This would then be completely
parallel to the unary branching assumed in Torr’s Directional Minimalist Grammar.
V[SLASH ⟨⟩]
NP V[SLASH ⟨ who ⟩]
NP V[SLASH ⟨ who ⟩]
V NP[SLASH ⟨ who ⟩]
Figure 4.25: Possible HPSG analysis of who Jack likes using discontinuous constituents
Concluding the discussion of Torr’s work, it can be said that it is truly impressive but
that it shows a convergence between Minimalism (or rather Minimalist Grammar) and
HPSG. Tools from GPSG/HPSG were adopted and the outcome differs in crucial aspects
from what is taught in Minimalist textbooks (just one or two instances of Merge vs. 45,
transformations vs. GPSG-style percolation of features).
Further reading
This chapter heavily draws on Adger (2003). Other textbooks on Minimalism are
Radford (1997), Grewendorf (2002), and Hornstein, Nunes & Grohmann (2005).
Kuhn (2007) offers a comparison of modern derivational analyses with con-
straint-based LFG and HPSG approaches. Borsley (2012) contrasts analyses of
long-distance dependencies in HPSG with movement-based analyses as in GB/
Minimalism. Borsley discusses four types of data which are problematic for
movement-based approaches: extraction without fillers, extraction with multi-
ple gaps, extractions where fillers and gaps do not match and extraction without
gaps. Borsley & Müller (2020) is another comparison of Minimalism and HPSG.
The authors discuss differences of approach and outlook of the two frameworks
(formalization and exhaustivity), empirical quality of the work, differences in as-
sumed syntactic structures, psycholinguistic issues and the assumptions made
in the frameworks regarding language acquisition.
particular framework but is also aware of the broad range of relevant descriptive
and theoretical literature.
As I have shown in Section 4.6.4 and in Müller (2013c) and will also show in
the following chapters and the discussion chapters in particular, there are most
certainly similarities between the various analyses on the market and they do
converge in certain respects. The way of getting out of the current crisis lies
with the empirically-grounded and theoretically broad education and training of
following generations.
In short: both teachers and students should read the medical record by Sterne-
feld and Richter. I implore the students not to abandon their studies straight after
reading it, but rather to postpone this decision at least until after they have read
the remaining chapters of this book.
ᵃ Vagueness: in this article, perhaps occurs 19 times and may 17 times, along with various ifs. Consistency:
the assumptions made are inconsistent. See footnote 18 on page 156 of this book. Argumentation
style: the term specifier is abolished and it is claimed that the problems associated with this
term can no longer be formulated. Therefore, they are now not of this world. See footnote 28
on page 161 of this book. Immunization: Chomsky writes the following regarding the Empty
Category Principle: “apparent exceptions do not call for abandoning the generalization as far as it
reaches, but for seeking deeper reasons to explain where and why it holds” (p. 9). This claim is most
certainly correct, but one wonders how much evidence one needs in a specific case in order
to disregard a given analysis. In particular regarding the essay Problems of Projection, one has
to wonder why this essay was even published only five years after On phases. The evidence
against the original approach is overwhelming and several points are taken up by Chomsky
(2013) himself. If Chomsky were to apply his own standards (for a quote of his from 1957, see
page 6) as well as general scientific methods (Occam’s Razor), the consequence would surely be
a return to head-based analyses of labeling.
For detailed comments on this essay, see Sections 4.6.2 and 4.6.3.
5 Generalized Phrase Structure
Grammar
Generalized Phrase Structure Grammar (GPSG) was developed as an answer to Trans-
formational Grammar at the end of the 1970s. The book by Gazdar, Klein, Pullum &
Sag (1985) is the main publication in this framework. Hans Uszkoreit has developed a
largish GPSG fragment for German (1987). Analyses in GPSG were so precise that it was
possible to use them as the basis for computational implementations. The following is a
possibly incomplete list of languages with implemented GPSG fragments:
• German (Weisweber 1987; Weisweber & Preuss 1992; Naumann 1987; 1988; Volk
1988)
• English (Evans 1985; Phillips & Thompson 1985; Phillips 1992; Grover, Carroll &
Briscoe 1993)
• French (Emirkanian, Da Sylva & Bouchard 1996)
• Persian (Bahrani, Sameti & Manshadi 2011)
As was discussed in Section 3.1.1, Chomsky (1957) argued that simple phrase structure
grammars are not well-suited to describe relations between linguistic structures and
claimed that one needs transformations to explain them. These assumptions remained
unchallenged for two decades (with the exception of publications by Harman (1963)
and Freidin (1975)) until alternative theories such as LFG and GPSG emerged, which
addressed Chomsky’s criticisms and developed non-transformational explanations of
phenomena for which there were previously only transformational analyses or simply
none at all. The analysis of local reordering of arguments, passives and long-distance
dependencies are some of the most important phenomena that have been discussed in
this framework. Following some introductory remarks on the representational format
of GPSG in Section 5.1, I will present the GPSG analyses of these phenomena in some
more detail.
5.1 General remarks on the representational format
used in the second. Furthermore, (3) contains information about the form of the verb
(inf stands for infinitives with zu ‘to’).
If we analyze the sentence in (4) with the second rule in (2) and the second rule in (3),
then we arrive at the structure in Figure 5.1.
(4) Karl hat versucht, [den Kuchen aufzuessen].
Karl has tried the cake to.eat.up
‘Karl tried to finish eating the cake.’
V2[VFORM inf ]
The rules in (2) say nothing about the order of the daughters, which is why the verb
(H[6]) can also be in final position. This aspect will be discussed in more detail in Sec-
tion 5.1.2. With regard to the HFC, it is important to bear in mind that information about
the infinitive verb form is also present on the mother node. Unlike simple phrase struc-
ture rules such as those discussed in Chapter 2, this follows automatically from the Head
Feature Convention in GPSG. In (3), the value of VFORM is given and the HFC ensures
that the corresponding information is represented on the mother node when the rules
in (2) are applied. For the phrase in (4), we arrive at the category V2[VFORM inf ] and
this ensures that this phrase only occurs in the contexts it is supposed to:
(5) a. [Den Kuchen aufzuessen] hat er nicht gewagt.
the cake to.eat.up has he not dared
‘He did not dare to finish eating the cake.’
b. * [Den Kuchen aufzuessen] darf er nicht.
the cake to.eat.up be.allowed.to he not
Intended: ‘He is not allowed to finish eating the cake.’
c. * [Den Kuchen aufessen] hat er nicht gewagt.
the cake eat.up has he not dared
Intended: ‘He did not dare to finish eating the cake.’
d. [Den Kuchen aufessen] darf er nicht.
the cake eat.up be.allowed.to he not
‘He is not allowed to finish eating the cake.’
gewagt ‘dared’ selects for a verb or verb phrase with an infinitive with zu ‘to’ but not a
bare infinitive, while darf ‘be allowed to’ takes a bare infinitive.
This works in an analogous way for noun phrases: there are rules for nouns which
do not take an argument as well as for nouns with certain arguments. Examples of rules
for nouns which either require no argument or two PPs are given in (6) (Gazdar, Klein,
Pullum & Sag 1985: 127):
(6) N1 → H[30] (Haus ‘house’, Blume ‘flower’)
N1 → H[31], PP[mit], PP[über] (Gespräch ‘talk’, Streit ‘argument’)
The following linearization rules serve to exclude orders such as those in (14):
(15) V[+MC] < X
X < V[−MC]
MC stands for main clause. The LP-rules ensure that in main clauses (+MC), the verb
precedes all other constituents and follows them in subordinate clauses (−MC). There is
a restriction that says that all verbs with the MC-value ‘+’ also have to be (+FIN). This
will rule out infinitive forms in initial position.
These LP rules do not permit orders with an occupied prefield or postfield in a local
tree. This is intended. We will see how fronting can be accounted for in Section 5.4.
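The effect of the two LP statements in (15) on a local tree can be illustrated with a small Python sketch. The encoding is an assumption made for illustration only: daughters are pairs of a category and one feature value, and the function names `lp_ok` and `admissible_orders` are hypothetical. The sketch enumerates all permutations of the daughters of a ditransitive rule and keeps those that satisfy both LP statements.

```python
from itertools import permutations


def verb(mc):
    """Predicate for a verb with the given MC value ('+' or '-')."""
    return lambda d: d == ("V", mc)


def anything(d):
    """The variable X in the LP statements: matches every daughter."""
    return True


# Each pair (a, b) encodes "a < b": a daughter matching a has to precede
# every other daughter matching b in the same local tree.
LP_RULES = [
    (verb("+"), anything),   # V[+MC] < X
    (anything, verb("-")),   # X < V[-MC]
]


def lp_ok(daughters):
    """Check one ordering of the daughters of a local tree against the LP rules."""
    for i, left in enumerate(daughters):
        for right in daughters[i + 1:]:
            for first, second in LP_RULES:
                # violation: something that has to come first stands to the right
                if first(right) and second(left):
                    return False
    return True


def admissible_orders(daughters):
    return [p for p in permutations(daughters) if lp_ok(p)]


args = [("N2", "nom"), ("N2", "dat"), ("N2", "acc")]
main = admissible_orders([("V", "+")] + args)   # verb-initial orders only
sub = admissible_orders([("V", "-")] + args)    # verb-final orders only
```

Since the LP statements only mention the verb, all six orders of the three nominal daughters survive in both clause types; nothing in this sketch places a constituent before a V[+MC].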
5.1.3 Metarules
We have previously encountered linearization rules for sentences with subjects; however,
our rules have the form in (16), that is, they do not include subjects:
(16) V2 → H[7], N2[CASE dat]
V2 → H[8], N2[CASE dat], N2[CASE acc]
These rules can be used to analyze the verb phrases dem Mann das Buch zu geben ‘to
give the man the book’ and das Buch dem Mann zu geben ‘to give the book to the man’
as they appear in (17), but we cannot analyze sentences like (9), since the subject does
not occur on the right-hand side of the rules in (16).
N2[nom] V2
N2[dat] N2[acc] V
Figure 5.3: VP analysis for German (not appropriate in the GPSG framework)
fact that only elements in the same local tree, that is, elements which occur on the right-
hand side of a rule, can be reordered. While we can reorder the parts of the VP and
thereby derive (9b), it is not possible to place the subject at a lower position between the
objects. Instead, a metarule can be used to analyze sentences where the subject occurs
between other arguments of the verb. This rule relates phrase structure rules to other
phrase structure rules. A metarule can be understood as a kind of instruction that creates
another rule for each rule with a certain form and these newly created rules will in turn
license local trees.
For the example at hand, we can formulate a metarule which says the following: if
there is a rule with the form “V2 consists of something” in the grammar, then there also
has to be another rule “V3 consists of whatever V2 consists of + an NP in the nominative”.
In formal terms, this looks as follows:
(19) V2 → W ↦
V3 → W, N2[CASE nom]
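Read as an instruction over rules, the metarule in (19) can be sketched in Python as follows. The representation of a rule as a pair of a left-hand side and a tuple of daughters is an assumption made for illustration, as is the function name `subject_metarule`; the order of the daughters in the tuple carries no meaning, since linear order is regulated separately by the LP statements.

```python
def subject_metarule(rule):
    """V2 -> W  |->  V3 -> W, N2[CASE nom]"""
    lhs, rhs = rule
    if lhs != "V2":
        return None              # the metarule only matches V2 rules
    return ("V3", rhs + ("N2[CASE nom]",))


# The two V2 rules in (16)
V2_RULES = [
    ("V2", ("H[7]", "N2[CASE dat]")),
    ("V2", ("H[8]", "N2[CASE dat]", "N2[CASE acc]")),
]

# One V3 rule is licensed for every V2 rule in the grammar.
V3_RULES = [r for r in map(subject_metarule, V2_RULES) if r is not None]
```

The variable W of the metarule corresponds to `rhs`: whatever the V2 rule contains is copied unchanged and the nominative N2 is added as a sister of the other daughters.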
5.1.4 Semantics
The semantics adopted by Gazdar, Klein, Pullum & Sag (1985: Chapters 9–10) goes back to
Richard Montague (1974). Unlike a semantic theory which stipulates the combinatorial
possibilities for each rule (see Section 2.3), GPSG uses more general rules. This is possible
due to the fact that the expressions to be combined each have a semantic type. It is
customary to distinguish between entities (e) and truth values (t). Entities refer to an
object in the world (or in a possible world), whereas entire sentences are either true or
false, that is, they have a truth value. It is possible to create more complex types from
the types e and t. Generally, the following holds: if a and b are types, then ⟨ a, b ⟩ is
also a type. Examples of complex types are ⟨ e, t ⟩ and ⟨ e, ⟨ e, t ⟩⟩. We can define the
following combinatorial rule for this kind of typed expressions:
(21) If 𝛼 is of type ⟨ b, a ⟩ and 𝛽 of type b, then 𝛼 (𝛽) is of type a.
This type of combination is also called functional application. With the rule in (21), it
is possible that the type ⟨ e, ⟨ e, t ⟩⟩ corresponds to an expression which still has to be
combined with two expressions of type e in order to result in an expression of type t. The first
combination step with e will yield ⟨ e, t ⟩ and the second step of combination with a further
e will give us t. This is similar to what we saw with λ-expressions on page 62: λyλx
like′(x, y) has to combine with a y and an x. The result in this example was like′(max′,
lotte′), that is, an expression that is either true or false in the relevant world.
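The rule in (21) can be stated as a small type checker. The following Python sketch is an illustration only: basic types are represented as the strings "e" and "t", a complex type ⟨ a, b ⟩ as the pair (a, b), and the function names `fn` and `apply_type` are hypothetical.

```python
E, T = "e", "t"


def fn(arg, result):
    """Build the complex type <arg, result>."""
    return (arg, result)


def apply_type(functor, argument):
    """Functional application as in (21): <b, a> applied to b yields a."""
    if isinstance(functor, tuple) and functor[0] == argument:
        return functor[1]
    raise TypeError(f"cannot apply {functor} to {argument}")


transitive = fn(E, fn(E, T))       # <e, <e, t>>
step1 = apply_type(transitive, E)  # first combination with e: <e, t>
step2 = apply_type(step1, E)       # second combination with e: t
```

An attempt to apply an expression of a basic type, for instance `apply_type(E, E)`, raises an error, since only complex types can act as functors.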
In Gazdar et al. (1985), an additional type is assumed for worlds in which an expression
is true or false. For reasons of simplicity, I will omit this here. The types that we need
for sentences, NPs and N′s, determiners and VPs are given in (22):
(22) a. TYP(S) = t
b. TYP(NP) = ⟨ ⟨ e, t ⟩, t ⟩
c. TYP(N′) = ⟨ e, t ⟩
d. TYP(Det) = ⟨ TYP(N′), TYP(NP) ⟩
e. TYP(VP) = ⟨ e, t ⟩
A sentence is of type t since it is either true or false. A VP needs an expression of type e to
yield a sentence of type t. The type of the NP may seem strange at first glance; however,
it is possible to understand it if one considers the meaning of NPs with quantifiers. For
sentences such as (23a), a representation such as (23b) is normally assumed:
(23) a. All children laugh.
b. ∀x child′(x) → laugh′(x)
The symbol ∀ stands for the universal quantifier. The formula can be read as follows.
For every object, for which it is the case that it has the property of being a child, it is also
the case that it is laughing. If we consider the contribution made by the NP, then we see
that the universal quantifier, the restriction to children and the logical implication come
from the NP:
(24) ∀x child′(x) → P(x)
This means that an NP is something that must be combined with an expression which
has exactly one open slot corresponding to the x in (24). This is formulated in (22b): an
NP corresponds to a semantic expression which needs something of type ⟨ e, t ⟩ to form
an expression which is either true or false (that is, of type t).
An N′ stands for a nominal expression of the kind λx child′(x). This means if there is a
specific individual which one can insert in place of the x, then we arrive at an expression
that is either true or false. For a given situation, it is the case that either John has the
property of being a child or he does not. An N′ has the same type as a VP.
TYP(N′) and TYP(NP) in (22d) stand for the types given in (22c) and (22b), that is, a
determiner is semantically something which has to be combined with the meaning of N′
to give the meaning of an NP.
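The division of labor between determiner, N′, NP and VP can be checked against a tiny model. The following Python sketch is an illustration under assumptions: the three individuals and the sets `CHILDREN` and `LAUGHERS` are invented, expressions of type ⟨ e, t ⟩ are modeled as functions from individuals to truth values, and `every` is a hypothetical name for the meaning of the determiner in (23a).

```python
# A hypothetical model with three individuals
ENTITIES = {"kim", "lee", "rex"}
CHILDREN = {"kim", "lee"}
LAUGHERS = {"kim", "lee", "rex"}


def child(x):
    """N' meaning, type <e, t>."""
    return x in CHILDREN


def laugh(x):
    """VP meaning, type <e, t>."""
    return x in LAUGHERS


def every(p):
    """Determiner meaning, type <<e, t>, <<e, t>, t>>.

    Applied to an N' meaning p, it returns the NP meaning in (24):
    a function that still needs a property q of type <e, t>."""
    return lambda q: all((not p(x)) or q(x) for x in ENTITIES)


all_children = every(child)        # NP, type <<e, t>, t>
sentence = all_children(laugh)     # S, type t: the formula in (23b)
```

The NP meaning `all_children` is the functor and the VP meaning its argument, which is why the NP has the type ⟨ ⟨ e, t ⟩, t ⟩ rather than e.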
Gazdar, Klein, Pullum & Sag (1985: 209) point out a redundancy in the semantic spec-
ification of grammars which follow the rule-to-rule hypothesis (see Section 2.3) since,
instead of giving rule-by-rule instructions with regard to combinations, it suffices in
many cases simply to say that the functor is applied to the argument. If we use types
such as those in (22), it is also clear which constituent is the functor and which is the
argument. In this way, a noun cannot be applied to a determiner, but rather only the
reverse is possible. The combination in (25a) yields a well-formed result, whereas (25b)
is ruled out.
(25) a. Det′(N′)
b. N′(Det′)
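That the types alone determine which daughter is the functor can be shown with a short Python sketch. The table `TYP` encodes the types in (22) with "e" and "t" as strings and ⟨ a, b ⟩ as the pair (a, b); the function names `result_type` and `combine` are hypothetical.

```python
E, T = "e", "t"
ET = (E, T)

# The types in (22)
TYP = {"S": T, "NP": (ET, T), "N'": ET, "VP": ET}
TYP["Det"] = (TYP["N'"], TYP["NP"])


def result_type(functor_cat, argument_cat):
    """Type of functor(argument), or None if the application is ill-typed."""
    f, a = TYP[functor_cat], TYP[argument_cat]
    if isinstance(f, tuple) and f[0] == a:
        return f[1]
    return None


def combine(cat1, cat2):
    """Try both directions; return (functor, argument, result type) or None."""
    for f, a in ((cat1, cat2), (cat2, cat1)):
        r = result_type(f, a)
        if r is not None:
            return f, a, r
    return None
```

With this table, `result_type("Det", "N'")` yields the NP type, corresponding to (25a), while `result_type("N'", "Det")` yields nothing, corresponding to (25b). No rule-specific instruction is needed to decide which daughter applies to which.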
5.1.5 Adjuncts
For nominal structures in English, Gazdar et al. (1985: 126) assume the X̄ analysis and,
as we have seen in Section 2.4.1, this analysis is applicable to nominal structures in Ger-
man. Nevertheless, there is a problem regarding the treatment of adjuncts in the verbal
domain if one assumes flat branching structures, since adjuncts can freely occur between
arguments:
(27) a. weil der Mann der Frau das Buch gestern gab
because the man the woman the book yesterday gave
‘because the man gave the book to the woman yesterday’
b. weil der Mann der Frau gestern das Buch gab
because the man the woman yesterday the book gave
c. weil der Mann gestern der Frau das Buch gab
because the man yesterday the woman the book gave
d. weil gestern der Mann der Frau das Buch gab
because yesterday the man the woman the book gave
For (27), one requires the following rule:
(28) V3 → H[8], N2[CASE dat], N2[CASE acc], N2[CASE nom], AdvP
Of course, adjuncts can also occur between the arguments of verbs from other valence
classes:
(29) weil (oft) die Frau (oft) dem Mann (oft) hilft
because often the woman often the man often helps
‘because the woman often helps the man’
Furthermore, adjuncts can occur between the arguments of a VP:
(30) Der Mann hat versucht, der Frau heimlich das Buch zu geben.
the man has tried the woman secretly the book to give
‘The man tried to secretly give the book to the woman.’
In order to analyze these sentences, we can use a metarule which adds an adjunct to the
right-hand side of a V2 (Uszkoreit 1987: 146).
(31) V2 → W ↦
V2 → W, AdvP
By means of the subject introducing metarule in (19), the V3-rule in (28) is derived from a
V2-rule. Since there can be several adjuncts in one sentence, a metarule such as (31) must
be allowed to apply multiple times. The recursive application of metarules is often ruled
out in the literature due to reasons of generative capacity (see Chapter 17) (Thompson
1982; Uszkoreit 1987: 146). If one uses the Kleene star, then it is possible to formulate the
adjunct metarule in such a way that it does not have to apply recursively (Uszkoreit
1987: 146):
(32) V2 → W ↦
V2 → W, AdvP*
5.2 Passive as a metarule
If one adopts the rule in (32), then it is not immediately clear how the semantic contribution
of the adjuncts can be determined.² For the rule in (31), one can combine the
semantic contribution of the AdvP with the semantic contribution of the V2 in the in-
put rule. This is of course also possible if the metarule is applied multiple times. If this
metarule is applied to (33a), for example, the V2-node in (33a) contains the semantic
contribution of the first adverb.
(33) a. V2 → V, NP, AdvP
b. V2 → V, NP, AdvP, AdvP
The V2-node in (33b) receives the semantic representation of the adverb applied to the
V2-node in (33a).
Weisweber & Preuss (1992) have shown that it is possible to use metarules such as
(31) if one does not use metarules to compute a set of phrase structure rules, but rather
directly applies the metarules during the analysis of a sentence. Since sentences are
always of finite length and the metarule introduces an additional AdvP to the right-
hand side of the newly licensed rule, the metarule can only be applied a finite number
of times.
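The termination argument can be illustrated with a Python sketch in which the metarule in (31) is applied on demand while a local tree is being checked. This is not the implementation of Weisweber & Preuss (1992), only a minimal illustration of the idea; the function names `adjunct_metarule` and `licenses` are hypothetical, and daughters are compared as multisets because ID rules do not fix linear order.

```python
from collections import Counter


def adjunct_metarule(rule):
    """V2 -> W  |->  V2 -> W, AdvP"""
    lhs, rhs = rule
    return (lhs, rhs + ("AdvP",)) if lhs == "V2" else None


def licenses(base_rule, daughters):
    """Apply the metarule lazily until the rule is as long as the local tree.

    Returns the number of metarule applications if the daughters are
    licensed, and None otherwise. The loop terminates because every
    application adds one daughter and the local tree is finite."""
    rule, steps = base_rule, 0
    while len(rule[1]) < len(daughters):
        rule = adjunct_metarule(rule)
        if rule is None:
            return None
        steps += 1
    return steps if Counter(rule[1]) == Counter(daughters) else None


base = ("V2", ("V", "NP"))
two_adverbs = licenses(base, ["NP", "AdvP", "V", "AdvP"])   # two applications
no_adverb = licenses(base, ["NP", "V"])                      # base rule suffices
unlicensed = licenses(base, ["NP", "NP", "V"])               # extra NP, no match
```

The number of applications is bounded by the number of daughters in the local tree under analysis, so no infinite rule set ever has to be computed in advance.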
This is true for all verb classes which can form the passive. It does not make a difference
whether the verb takes one, two or three arguments:
(34) a. weil er noch gearbeitet hat
because he.NOM still worked has
‘because he has still worked’
b. weil noch gearbeitet wurde
because still worked was
‘because there was still working there’
² In LFG, an adjunct is entered into a set in the functional structure (see Section 7.1.6). This also works
with the use of the Kleene Star notation. From the f-structure, it is possible to compute the semantic
denotation with corresponding scope by making reference to the c-structure. In HPSG, Kasper (1994) has
made a proposal which corresponds to the GPSG proposal with regard to flat branching structures and
an arbitrary number of adjuncts. In HPSG, however, one can make use of so-called relational constraints.
These are similar to small programs which can create relations between values inside complex structures.
Using such relational constraints, it is then possible to compute the meaning of an unrestricted number of
adjuncts in a flat branching structure.
³ This characterization does not hold for other languages. For instance, Icelandic allows for dative subjects.
See Zaenen, Maling & Thráinsson (1985).
In a simple phrase structure grammar, we would have to list two separate rules for each
pair of sentences making reference to the valence class of the verb in question. The
characteristics of the passive discussed above would therefore not be explicitly stated
in the set of rules. In GPSG, it is possible to explain the relation between active and
passive rules using a metarule: for each active rule, a corresponding passive rule with
suppressed subject is licensed. The link between active and passive clauses can therefore
be captured in this way.
An important difference to Transformational Grammar/GB is that we are not creating
a relation between two trees, but rather between active and passive rules. The two rules
license two unrelated structures, that is, the structure of (38b) is not derived from the
structure of (38a).
In what follows, I will discuss the analysis of the passive given in Gazdar, Klein, Pullum
& Sag (1985) in some more detail. The authors suggest the following metarule for English
(p. 59):4
(39) VP → W, NP ↦
     VP[PAS] → W, (PP[by])
This rule states that verbs which take an object can occur in a passive VP without this
object. Furthermore, a by-PP can be added. If we apply this metarule to the rules in (40),
then this will yield the rules listed in (41):
(40) VP → H[2], NP
VP → H[3], NP, PP[to]
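The effect of applying the metarule in (39) to the rules in (40) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: categories are plain strings, the names `Rule` and `passive_metarule` are invented for this sketch, and the intransitive rule `VP → H[1]` is a hypothetical addition that is not part of (40). It is included only to show that the metarule does not apply to rules without an NP.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """An immediate dominance rule: a mother and its daughters (strings)."""
    mother: str
    daughters: Tuple[str, ...]

    def __str__(self) -> str:
        return f"{self.mother} → {', '.join(self.daughters)}"


def passive_metarule(rule: Rule) -> Optional[Rule]:
    """Sketch of (39): VP → W, NP  ↦  VP[PAS] → W, (PP[by]).

    Returns None if the rule does not match the left-hand side of (39).
    """
    if rule.mother != "VP" or "NP" not in rule.daughters:
        return None
    rest = list(rule.daughters)
    rest.remove("NP")  # W: all daughters except the suppressed object
    return Rule("VP[PAS]", tuple(rest) + ("(PP[by])",))


active_rules = [
    Rule("VP", ("H[2]", "NP")),            # first rule in (40)
    Rule("VP", ("H[3]", "NP", "PP[to]")),  # second rule in (40)
    Rule("VP", ("H[1]",)),                 # hypothetical rule, not in (40)
]

passive_rules = [r for r in map(passive_metarule, active_rules) if r is not None]
for r in passive_rules:
    print(r)
```

The two rules that are printed are the passive counterparts of the two rules in (40). The rule without an NP daughter yields nothing, which is exactly the situation that becomes problematic for the impersonal passive.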
It is possible to use the rules in (40) to analyze verb phrases in active sentences:
(42) a. [S The man [VP devoured the carcass]].
b. [S The man [VP handed the sword to Tracy]].
The combination of a VP with the subject is licensed by an additional rule (S → NP, VP).
With the rules in (41), one can analyze the VPs in the corresponding passive sentences
in (43):
(43) a. [S The carcass was [VP[PAS] devoured (by the man)]].
b. [S The sword was [VP[PAS] handed to Tracy (by the man)]].
At first glance, this analysis may seem odd as an object is replaced inside the VP by a
PP which would be the subject in an active clause. Although this analysis makes correct
predictions with regard to the syntactic well-formedness of structures, it seems unclear
how one can account for the semantic relations. It is possible, however, to use a lexical
rule that licenses the passive participle and manipulates the semantics of the output
lexical item in such a way that the by-PP is correctly integrated semantically (Gazdar
et al. 1985: 219).
We arrive at a problem, however, if we try to apply this analysis to German since the
impersonal passive cannot be derived by simply suppressing an object. The V2-rules for
verbs such as arbeiten ‘work’ and denken ‘think’ as used for the analysis of (34a) and
(35a) have the following form:
(44) V2 → H[5]
V2 → H[13], PP[an]
4
See Weisweber & Preuss (1992: 1114) for a parallel rule for German which refers to accusative case on the
left-hand side of the metarule.
There is no NP on the right-hand side of these rules which could be turned into a von-PP.
If the passive is to be analyzed as suppressing an NP argument in a rule, then it should
follow from the existence of the impersonal passive that the passive metarule has to be
applied to rules which license finite clauses, since information about whether there is a
subject or not is only present in rules for finite clauses.5 In this kind of system, the rules
for finite sentences (V3) are the basic rules and the rules for V2 would be derived from
these.
It would only make sense to have a metarule which applies to V3 for German since
English does not have V3 rules which contain both the subject and its object on the right-
hand side of the rule.6 For English, it is assumed that a sentence consists of a subject
and a VP (see Gazdar et al. 1985: 139). This means that we arrive at two very different
analyses for the passive in English and German, which do not capture the descriptive
insight that the passive is the suppression of the subject and the subsequent promotion
of the object in the same way. The central difference between German and English seems
to be that English obligatorily requires a subject,7 which is why English does not have
an impersonal passive. This property is independent of the passive, but it nevertheless
affects the possibility of having a passive structure.
The problem with the GPSG analysis is the fact that valence is encoded in phrase structure
rules and that subjects are not present in the rules for verb phrases. In the following
chapters, we will encounter approaches from LFG, Categorial Grammar, HPSG, Construction
Grammar, and Dependency Grammar which encode valence separately from
phrase structure rules and therefore do not have a principled problem with the impersonal
passive.
See Jacobson (1987b: 394–396) for more problematic aspects of the passive analysis in
GPSG and for the insight that a lexical representation of valence – as assumed in Cate-
gorial Grammar, GB, LFG and HPSG – allows for a lexical analysis of the phenomenon,
which is however unformulable in GPSG for principled reasons having to do with the
fundamental assumptions regarding valence representations.
5
GPSG differs from GB in that infinitive verbal projections do not contain nodes for empty subjects. This is
also true for all other theories discussed in this book with the exception of Tree-Adjoining Grammar.
6
Gazdar et al. (1985: 62) suggest a metarule similar to our subject introduction metarule on page 189. The
rule that is licensed by their metarule is used to analyze the position of auxiliaries in English and only
licenses sequences of the form AUX NP VP. In such structures, subjects and objects are not in the same
local tree either.
7
Under certain conditions, the subject can also be omitted in English. For more on imperatives and other
subject-less examples, see page 536.
empty verb in final position and links this to the verb in initial position using technical
means which we will see in more detail in the following section.
5.4 Long-distance dependencies as the result of local dependencies
The rule in (50) connects a sentence with verb-initial order with a constituent which is
missing in the sentence:
(50) V3[+FIN] → X[+TOP], V3[+MC]/X
In (50), X stands for an arbitrary category which is marked as missing in V3 by the ‘/’.
X is referred to as a filler.
The interesting cases of values for X with regard to our examples are given in (51):
(51) V3[+FIN] → N2[+TOP, CASE nom], V3[+MC]/N2[CASE nom]
V3[+FIN] → N2[+TOP, CASE dat], V3[+MC]/N2[CASE dat]
V3[+FIN] → N2[+TOP, CASE acc], V3[+MC]/N2[CASE acc]
(51) does not show actual rules. Instead, (51) shows examples for insertions of specific
categories into the X-position, that is, different instantiations of the rule.
The following linearization rule ensures that a constituent marked by [+TOP] in (50)
precedes the rest of the sentence:
(52) [+TOP] < X
TOP stands for topicalized. As was mentioned on page 107, the prefield is not restricted
to topics. Focused elements and expletives can also occur in the prefield, which is why
the feature name is not ideal. However, it is possible to replace it with something else,
for instance prefield. This would not affect the analysis. X in (52) stands for an arbitrary
category. This is a new X and it is independent from the one in (50).
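The relation between the schema in (50), its instantiations in (51) and the linearization rule in (52) can be illustrated with a small Python sketch. The function names `instantiate` and `linearize` are invented here, and categories are treated as plain strings, so merging features is reduced to string manipulation. This is an assumption of the sketch and not part of the GPSG formalism.

```python
def instantiate(x: str) -> tuple:
    """Instantiate the schema (50), V3[+FIN] → X[+TOP], V3[+MC]/X, for a given X."""
    if x.endswith("]"):
        # X already carries features, e.g. N2[CASE dat]: add +TOP to them
        base, feats = x[:-1].split("[", 1)
        filler = f"{base}[+TOP, {feats}]"
    else:
        filler = f"{x}[+TOP]"
    return ("V3[+FIN]", [filler, f"V3[+MC]/{x}"])


def linearize(daughters: list) -> list:
    """Order daughters so that the LP rule (52), [+TOP] < X, is respected."""
    # sorted() is stable: daughters without +TOP keep their relative order
    return sorted(daughters, key=lambda cat: "+TOP" not in cat)


for case in ("nom", "dat", "acc"):
    mother, daughters = instantiate(f"N2[CASE {case}]")
    print(mother, "→", ", ".join(daughters))

print(linearize(["V3[+MC]/N2[CASE dat]", "N2[+TOP, CASE dat]"]))
```

The loop reproduces the three instantiations in (51). The last line shows that the daughter marked [+TOP] is placed first, whatever order the daughters are listed in.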
Figure 5.4 shows the interaction of the rules for the analysis of (53).9
(53) Dem Mann gibt er das Buch.
the.DAT man gives he.NOM the.ACC book
‘He gives the man the book.’
[Figure 5.4: local tree V3[+FIN, +MC] with the daughters N2[dat,+TOP] and V3[+MC]/N2[dat]]
9
The FIN feature has been omitted on some of the nodes since it is redundant: +MC-verbs always require
the FIN value ‘+’.
The metarule in (47) licenses a rule which adds a dative object into SLASH. This rule
now licenses the subtree for gibt er das Buch ‘gives he the book’. The linearization rule
V[+MC] < X orders the verb to the very left inside of the local tree for V3. In the next
step, the constituent following the slash is bound off. Following the LP-rule [+TOP] < X,
the bound constituent must be ordered to the left of the V3 node.
The analysis given in Figure 5.4 may seem too complex since the noun phrases in (53)
all depend on the same verb. It is possible to invent a system of linearization rules which
would allow one to analyze (53) with an entirely flat structure. One would nevertheless
still need an analysis for sentences such as those in (37) on page 107 – repeated here as
(54) for convenience:
(54) a. [Um zwei Millionen Mark]𝑖 soll er versucht haben, [eine
around two million Deutsche.Marks should he tried have an
Versicherung _𝑖 zu betrügen].10
insurance.company to deceive
‘He apparently tried to cheat an insurance company out of two million
Deutsche Marks.’
b. „Wer𝑖 , glaubt er, daß er _𝑖 ist?“ erregte sich ein Politiker vom Nil.11
who believes he that he is retort REFL a politician from.the Nile
‘ “Who does he think he is?”, a politician from the Nile exclaimed.’
c. Wen𝑖 glaubst du, daß ich _𝑖 gesehen habe?12
who believe you that I seen have
‘Who do you think I saw?’
d. [Gegen ihn]𝑖 falle es den Republikanern hingegen schwerer,
against him fall it the Republicans however more.difficult
[ [ Angriffe _𝑖 ] zu lancieren].13
attacks to launch
‘It is, however, more difficult for the Republicans to launch attacks against
him.’
The sentences in (54) cannot be explained by local reordering as the elements in the
prefield are not dependent on the highest verb, but instead originate in the lower clause.
Since only elements from the same local tree can be reordered, the sentences in (54)
cannot be analyzed without postulating some kind of additional mechanism for long-
distance dependencies.14
10
taz, 04.05.2001, p. 20.
11
Spiegel, 8/1999, p. 18.
12
Scherpenisse (1986: 84).
13
taz, 08.02.2008, p. 9.
14
One could imagine analyses that assume the special mechanism for nonlocal dependencies only for
sentences that really involve dependencies that are nonlocal. This was done in HPSG by Kathol (1995) and
Wetta (2011) and by Groß & Osborne (2009) in Dependency Grammar. I discuss the Dependency Grammar
analyses in detail in Section 11.7.1 and show that analyses that treat simple V2 sentences as ordering
variants of non-V2 sentences have problems with the scope of fronted adjuncts, with coordination of simple
sentences and sentences with nonlocal dependencies and with so-called multiple frontings.
Before I conclude this chapter, I will discuss yet another example of fronting, namely
one of the more complex examples in (54). The analysis of (54c) consists of several
steps: the introduction, percolation and finally binding off of information about the
long-distance dependency. This is shown in Figure 5.5.
[Figure 5.5: tree for (54c) with the nodes V3[+FIN,+MC], N2[acc,+TOP], V3[+MC]/N2[acc],
V3[−dass,−MC]/N2[acc], N2[nom] and V[6,−MC]]
Simplifying somewhat, I assume that gesehen habe ‘have seen’ behaves like a normal
transitive verb.15 A phrase structure rule licensed by the metarule in (47) licenses the
combination of ich ‘I’ and gesehen habe ‘have seen’ and represents the missing accusative
object on the V3 node. The complementizer dass ‘that’ is combined with ich gesehen habe
‘I have seen’ and the information about the fact that an accusative NP is missing is
percolated up the tree. This percolation is controlled by the so-called Foot Feature
Principle, which states that all foot features of all the daughters are also present on
the mother node. Since the SLASH feature is a foot feature, the categories following the
‘/’ percolate up the tree if they are not bound off in the local tree. In the final step,
the V3/N2[acc] is combined with the missing N2[acc]. The result is a complete finite
declarative clause of the highest projection level.
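The three steps (introduction, percolation and binding off) can be made concrete with a small Python sketch. It is not an implementation of GPSG: the class `Node`, its attributes `introduces` and `binds`, and the flat structure assumed for the matrix clause are all inventions of this sketch, and node labels are simplified strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    daughters: List["Node"] = field(default_factory=list)
    introduces: Optional[str] = None  # gap introduced by the rule licensing this node
    binds: Optional[str] = None       # gap bound off by a filler daughter of this node


def slash(node: Node) -> List[str]:
    """Compute the SLASH value of a node bottom-up.

    Gaps of all daughters are passed to the mother (the effect attributed to
    the Foot Feature Principle) unless they are bound off in the local tree.
    """
    gaps = [gap for d in node.daughters for gap in slash(d)]
    if node.introduces is not None:
        gaps.append(node.introduces)
    if node.binds is not None:
        gaps.remove(node.binds)  # raises ValueError if there is no matching gap
    return gaps


# Simplified structure for (54c): Wen glaubst du, dass ich gesehen habe?
embedded = Node("V3[-MC]", [Node("N2[nom] ich"), Node("V gesehen habe")],
                introduces="N2[acc]")
dass_clause = Node("V3[dass]", [Node("dass"), embedded])
matrix = Node("V3[+MC]", [Node("V glaubst"), Node("N2[nom] du"), dass_clause])
root = Node("V3[+FIN,+MC]", [Node("N2[acc,+TOP] wen"), matrix], binds="N2[acc]")

for node in (embedded, dass_clause, matrix, root):
    print(node.label, slash(node))
```

The missing accusative object is present on every node between the point of introduction and the point of binding, and the root node has an empty SLASH value.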
5.5 Summary and classification
(Across the Board Extraction, Ross 1967). The following examples from Gazdar (1981b:
173) show that gaps in conjuncts must be identical, that is, a filler of a certain category
must correspond to a gap in every conjunct:
(55) a. The kennel which Mary made and Fido sleeps in has been stolen.
(= S/NP & S/NP)
b. The kennel in which Mary keeps drugs and Fido sleeps has been stolen.
(= S/PP & S/PP)
c. * The kennel (in) which Mary made and Fido sleeps has been stolen.
(= S/NP & S/PP)
GPSG can plausibly handle this with mechanisms for the transmission of information
about gaps: in symmetric coordination, the SLASH elements in each conjunct have to
be identical. A transformational approach, by contrast, is not straightforwardly
possible, since one normally assumes in such analyses that there is a tree and that
something is moved to another position in the tree, thereby leaving a trace. In coordinate
structures, however, the filler would correspond to two or more traces, and it cannot be
explained how the filler could originate in more than one place.
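The identity requirement on SLASH values in coordination can be stated as a simple check. The following Python sketch assumes that categories are strings of the form `S/NP`, as in the annotations of (55). The function name `coordinate` is hypothetical.

```python
from typing import List, Optional


def coordinate(conjuncts: List[str]) -> Optional[str]:
    """Return the category of a symmetric coordination, or None if it is not licensed.

    All conjuncts must have the same category and the same SLASH value.
    """
    bases = {c.split("/", 1)[0] for c in conjuncts}
    gaps = {c.split("/", 1)[1] if "/" in c else None for c in conjuncts}
    if len(bases) != 1 or len(gaps) != 1:
        return None
    return conjuncts[0]


print(coordinate(["S/NP", "S/NP"]))  # as in (55a)
print(coordinate(["S/PP", "S/PP"]))  # as in (55b)
print(coordinate(["S/NP", "S/PP"]))  # as in (55c): not licensed
```

Because the gap information is part of the category of each conjunct, a single filler can correspond to a gap in every conjunct without anything having to originate in more than one place.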
While the analysis of Across the Board extraction is a true highlight of GPSG, there
are some problematic aspects that I want to address in the following: the interaction
between valence and morphology, the representation of valence and partial verb phrase
fronting, and the expressive power of the GPSG formalism.
the phenomenon. Furthermore, the valence of the resulting adjective also depends on
the valence of the verb. For example, a verb such as vergleichen ‘compare’ requires a
mit (with)-PP and vergleichbar ‘comparable’ does too (Riehemann 1993: 7, 54; 1998: 68).
In the following chapters, we will encounter models which assume that lexical entries
contain information as to whether a verb selects for an accusative object or not. In such
models, morphological rules which need to access the valence properties of linguistic
objects can be adequately formulated.
The interaction of valence and derivational morphology will be taken up again in
Section 21.2.2, where approaches in LFG and Construction Grammar are discussed
that share assumptions about the encoding of valence with GPSG.
Comprehension questions
Exercises
1. Write a small GPSG grammar which can analyze the following sentences:
Include all arguments in a single rule without using the metarule for intro-
ducing subjects.
Further reading
The main publication in GPSG is Gazdar, Klein, Pullum & Sag (1985). This book
has been critically discussed by Jacobson (1987b). Some problematic analyses are
contrasted with alternatives from Categorial Grammar and reference is made to
the heavily Categorial Grammar influenced work of Pollard (1984), which counts
as one of the predecessors of HPSG. Some of Jacobson’s suggestions can be found
in later works in HPSG.
Grammars of German can be found in Uszkoreit (1987) and Busemann (1992).
Gazdar (1981b) developed an analysis of long-distance dependencies, which is
still used today in theories such as HPSG.
A history of the genesis of GPSG can be found in Pullum (1989b).
6 Feature descriptions
In the previous chapter, we talked about sets of feature-value pairs, which can be used to
describe linguistic objects. In this chapter, we will introduce feature descriptions which
play a role in theories such as LFG, HPSG, Construction Grammar, versions of Catego-
rial Grammar and TAG (and even some formalizations of Minimalist theories (Veenstra
1998)). This chapter will therefore lay some of the groundwork for the chapters to follow.
Feature structures are complex entities which can model properties of a linguistic ob-
ject. Linguists mostly work with feature descriptions which describe only parts of a given
feature structure. The difference between models and descriptions will be explained in
more detail in Section 6.7.
Alternative terms for feature structures are:
• feature-value structure
• attribute-value structure
In what follows, I will restrict the discussion to the absolutely necessary details in order
to keep the formal part of the book as short as possible. I refer the interested reader to
Shieber (1986), Pollard & Sag (1987: Chapter 2), Johnson (1988), Carpenter (1992), King
(1994) and Richter (2004). Shieber’s book is an accessible introduction to Unification
Grammars. The works by King and Richter, which introduce important foundations
for HPSG, would most probably not be accessible to those without a good grounding
in mathematics. However, it is important to know that these works exist and that the
corresponding linguistic theory is built on a solid foundation.
6.1 Feature descriptions
(5) [ FIRSTNAME      max
      LASTNAME       meier
      DATE-OF-BIRTH  10.10.1985
      FATHER         …
      MOTHER         …
      DAUGHTER-1     …
      DAUGHTER-2     …
      DAUGHTER-3     … ]
How many features do we want to assume? Where is the limit? What would the value
of DAUGHTER-32 be?
For this case, it makes much more sense to use a list. Lists are indicated with angle
brackets. Any number of elements can occur between these brackets. A special case is
when no element occurs between the brackets. A list with no elements is also called the
empty list. In the following example, Max Meier has a daughter called Clara, who herself
has no daughter.
(6) [ FIRSTNAME      max
      LASTNAME       meier
      DATE-OF-BIRTH  10.10.1985
      FATHER         …
      MOTHER         …
      DAUGHTER       ⟨ [ FIRSTNAME      clara
                         LASTNAME       meier
                         DATE-OF-BIRTH  10.10.2004
                         FATHER         …
                         MOTHER         …
                         DAUGHTER       ⟨⟩ ] ⟩ ]
Now, we are left with the question of sons. Should we add another list for sons? Do
we want to differentiate between sons and daughters? It is certainly the case that the
gender of the children is an important property, but these are properties of the objects
themselves, since every person has a gender. The description in (7) therefore offers a
more adequate representation.
At this point, one could ask why the parents are not included in a list as well. In fact,
we find similar questions also in linguistic works: how is information best organized for
the job at hand? One could argue for the representation of descriptions of the parents
under separate features, by pointing out that with such a representation it is possible to
make certain claims about a mother or father without having to necessarily search for
the respective descriptions in a list.
If the order of the elements is irrelevant, then we could use sets rather than lists. Sets
are written inside curly brackets.1
1
The definition of a set requires many technicalities. In this book, I would use sets only for collecting
semantic information. This can be done equally well using lists, which is why I do not introduce sets here
and instead use lists.
(7) [ FIRSTNAME      max
      LASTNAME       meier
      DATE-OF-BIRTH  10.10.1985
      GENDER         male
      FATHER         …
      MOTHER         …
      CHILDREN       ⟨ [ FIRSTNAME      clara
                         LASTNAME       meier
                         DATE-OF-BIRTH  10.10.2004
                         GENDER         female
                         FATHER         …
                         MOTHER         …
                         CHILDREN       ⟨⟩ ] ⟩ ]
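For readers who are used to programming languages, the description in (7) can be compared to a nested data structure. The following Python sketch is only an analogy: features become dictionary keys, complex values become nested dictionaries and list values become Python lists. The values of FATHER and MOTHER are left out since they are not spelled out in (7), and the helper `daughters` is a hypothetical name introduced here.

```python
max_meier = {
    "FIRSTNAME": "max",
    "LASTNAME": "meier",
    "DATE-OF-BIRTH": "10.10.1985",
    "GENDER": "male",
    # FATHER and MOTHER are not spelled out in (7) and are omitted here
    "CHILDREN": [
        {
            "FIRSTNAME": "clara",
            "LASTNAME": "meier",
            "DATE-OF-BIRTH": "10.10.2004",
            "GENDER": "female",
            "CHILDREN": [],  # the empty list
        }
    ],
}


def daughters(description: dict) -> list:
    """Select the female elements of the CHILDREN list."""
    return [c for c in description.get("CHILDREN", []) if c.get("GENDER") == "female"]


print([d["FIRSTNAME"] for d in daughters(max_meier)])
```

Since gender is recorded as a property of each child, a separate list for daughters is not needed. The relevant elements can be picked out of the single CHILDREN list.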
6.2 Types
In the previous section, we introduced feature descriptions consisting of feature-value
pairs and showed that it makes sense to allow for complex values for features. In this sec-
tion, feature descriptions will be augmented to include types. Feature descriptions which
are assigned a type are also called typed feature descriptions. Types say something about
which features can or must belong to a particular structure. The description previously
discussed describes an object of the type person.
(8) [ person
      FIRSTNAME      max
      LASTNAME       meier
      DATE-OF-BIRTH  10.10.1985
      GENDER         male
      FATHER         …
      MOTHER         …
      CHILDREN       ⟨ …, … ⟩ ]
Types are written in italics.
The specification of a type determines which properties a modelled object has. It is
then only possible for a theory to say something about these properties. Properties such
as OPERATING VOLTAGE are not relevant for objects of the type person. If we know the
type of a given object, then we also know that this object must have certain properties
even if we do not yet know their exact values. In this way, (9) is still a description of
Max Meier even though it does not contain any information about Max’ date of birth:
(9) [ person
      FIRSTNAME  max
      LASTNAME   meier
      GENDER     male ]
We know, however, that Max Meier must have been born on some day since this is a
description of the type person. The question What is Max’ date of birth? makes sense for
a structure such as (9) in a way that the question Which operating voltage does Max have?
does not. If we know that an object is of the type person, then we have the following
basic structure:
(10) [ person
       FIRSTNAME      firstname
       LASTNAME       lastname
       DATE-OF-BIRTH  date
       GENDER         gender
       FATHER         person
       MOTHER         person
       CHILDREN       list of person ]
In (10) and (9), the values of features such as FIRSTNAME are in italics. These values are
also types. They are different from types such as person, however, as no features belong
to them. These kinds of types are called atomic.
Types are organized into hierarchies. It is possible to define the subtypes woman and
man for person. These would determine the gender of a given object. (11) shows the
feature structure for the type woman, which is analogous to that of man.
(11) [ female person
       FIRSTNAME      firstname
       LASTNAME       lastname
       DATE-OF-BIRTH  date
       GENDER         female
       FATHER         person
       MOTHER         person
       CHILDREN       list of person ]
At this point, we could ask ourselves if we really need the feature GENDER. The necessary
information is already represented in the type woman. The question of whether specific
information is represented by special features or stored in a type without
a corresponding individual feature will surface again in the discussion of linguistic
analyses. The two alternatives differ mainly in that the information which is modelled
by types is not immediately accessible for structure sharing, which is discussed in
Section 6.4.
Type hierarchies play an important role in capturing linguistic generalizations, which
is why type hierarchies and the inheritance of constraints and information will be ex-
plained with reference to a further example in what follows. One can think of type
hierarchies as an effective way of organizing information. In an encyclopedia, the in-
dividual entries are linked in such a way that the entries for monkey and mouse will
each contain a pointer to mammal. The description found under mammal does therefore
not have to be repeated for the subordinate concepts. In the same way, if one wishes to
describe various electric appliances, one can use the hierarchy in Figure 6.1. The most
general type electrical device is the highest in Figure 6.1. Electrical devices have certain
properties, e.g., a power supply with a certain power consumption. All subtypes of
electrical device “inherit” this property. In this way, printing device and scanning device also
have a power supply with a specific power consumption. A printing device can produce
information and a scanning device can read in information. A photocopier can both
produce information and read it. Photocopiers have both the properties of scanning and
printing devices. This is expressed by the connection between the two superordinate
types and photocopier in Figure 6.1. If a type is at the same time the subtype of several
superordinate types, then we speak of multiple inheritance. If devices can print, but not
scan, they are of type printer. This type can have further, more specific subtypes, which
in turn may have particular properties, e.g., laser printer. New features can be added to
subtypes, but it is also possible to make values of inherited features more specific. For
example, the material that can be scanned with a negative scanner is far more restricted
than that of the supertype scanner, since negative scanners can only scan negatives.
[Figure 6.1: type hierarchy with the root node electric appliance]
The objects that are modeled always have a maximally specific type. In the example
above, this means that we can have objects of the type laser printer and negative scanner
but not of the type printing device. This is due to the fact that printing device is not
maximally specific since this type has two subtypes.
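The hierarchy described above can be written down as a table of supertypes, from which inherited features and maximal specificity can be computed. The following Python sketch contains only the types that are mentioned in the text, so it may be smaller than the hierarchy in Figure 6.1. The feature names POWER-CONSUMPTION, OUTPUT and INPUT are hypothetical labels for the properties mentioned in the text.

```python
SUPERTYPES = {
    "electrical device": [],
    "printing device": ["electrical device"],
    "scanning device": ["electrical device"],
    "printer": ["printing device"],
    "photocopier": ["printing device", "scanning device"],  # multiple inheritance
    "scanner": ["scanning device"],
    "laser printer": ["printer"],
    "negative scanner": ["scanner"],
}

# Features introduced by a type (hypothetical names)
INTRODUCED = {
    "electrical device": {"POWER-CONSUMPTION"},
    "printing device": {"OUTPUT"},
    "scanning device": {"INPUT"},
}


def ancestors(t: str) -> set:
    """The type itself together with all of its direct and indirect supertypes."""
    result = {t}
    for s in SUPERTYPES[t]:
        result |= ancestors(s)
    return result


def features(t: str) -> set:
    """All features of a type, including the inherited ones."""
    return set().union(*(INTRODUCED.get(a, set()) for a in ancestors(t)))


def is_maximally_specific(t: str) -> bool:
    """A type is maximally specific if no other type lists it as a supertype."""
    return not any(t in supers for supers in SUPERTYPES.values())


print(sorted(features("photocopier")))
print(is_maximally_specific("printing device"), is_maximally_specific("laser printer"))
```

The type photocopier ends up with the features of both printing device and scanning device, and printing device is not maximally specific since other types are listed as its subtypes.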
Type hierarchies with multiple inheritance are an important means for expressing
linguistic generalizations (Flickinger, Pollard & Wasow 1985; Flickinger 1987; Sag 1997).
Types of words or phrases which occur at the very top of these hierarchies correspond
to constraints on linguistic objects, which are valid for linguistic objects in all languages.
Subtypes of such general types can be specific to certain languages or language classes.
6.3 Disjunction
Disjunctions can be used if one wishes to express the fact that a particular object can
have various different properties. If one were to organize a class reunion twenty years
after leaving school and could not recall the exact names of some former classmates,
it would be possible to search the web for “Julia (Warbanow or Barbanow)”. In feature
descriptions, this “or” is expressed by a ‘∨’.
(12) [ person
       FIRSTNAME  julia
       LASTNAME   warbanow ∨ barbanow ]
Some internet search engines do not allow for searches with ‘or’. In these cases, one has
to carry out two distinct search operations: one for “Julia Warbanow” and then another
for “Julia Barbanow”. This corresponds to the two following disjunctively connected
descriptions:
(13) [ person                      [ person
       FIRSTNAME  julia       ∨      FIRSTNAME  julia
       LASTNAME   warbanow ]         LASTNAME   barbanow ]
6.4 Structure sharing
The identity of values is indicated by boxes containing numbers. The boxes can also be
viewed as variables.
When describing objects we can make claims about equal values or claims about iden-
tical values. A claim about the identity of values is stronger. Let us take the follow-
ing feature description containing information about the children that Max’s father and
mother have as an example:
(15) [ person
       FIRSTNAME      max
       LASTNAME       meier
       DATE-OF-BIRTH  10.10.1985
       FATHER  [ person
                 FIRSTNAME  peter
                 LASTNAME   meier
                 CHILDREN   ⟨ [ person
                                FIRSTNAME  klaus ], … ⟩ ]
       MOTHER  [ person
                 FIRSTNAME  anna
                 LASTNAME   meier
                 CHILDREN   ⟨ [ person
                                FIRSTNAME  klaus ], … ⟩ ] ]
Notice that under the paths FATHER|CHILDREN and MOTHER|CHILDREN, we find a list con-
taining a description of a person with the first name Klaus. The question of whether the
feature description is of one or two children of Peter and Anna cannot be answered.
It is certainly possible that we are dealing with two different children from previous
partnerships who both happen to be called Klaus.
By using structure sharing, it is possible to specify the identity of the two values as
in (16). In (16), Klaus is a single child that belongs to both parents. Everything inside
the brackets which immediately follow 1 is equally present in both positions. One can
think of 1 as a pointer or reference to a structure which has only been described once.
(16) [ person
       FIRSTNAME      max
       LASTNAME       meier
       DATE-OF-BIRTH  10.10.1985
       FATHER  [ person
                 FIRSTNAME  peter
                 LASTNAME   meier
                 CHILDREN   ⟨ 1 [ person
                                  FIRSTNAME  klaus ], … ⟩ ]
       MOTHER  [ person
                 FIRSTNAME  anna
                 LASTNAME   meier
                 CHILDREN   ⟨ 1, … ⟩ ] ]
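The difference between equal values and identical values has a close analogue in programming languages. The following Python sketch is an analogy only, since Python objects are not feature descriptions: the comparison `==` corresponds to the weaker claim that two values look the same, as in (15), and `is` corresponds to structure sharing, as in (16). All variable names are invented for this sketch.

```python
# As in (15): two separate objects with the same content
father_15 = {"FIRSTNAME": "peter", "CHILDREN": [{"FIRSTNAME": "klaus"}]}
mother_15 = {"FIRSTNAME": "anna", "CHILDREN": [{"FIRSTNAME": "klaus"}]}

# As in (16): one object that is reachable via two paths (the tag 1)
klaus = {"FIRSTNAME": "klaus"}
father_16 = {"FIRSTNAME": "peter", "CHILDREN": [klaus]}
mother_16 = {"FIRSTNAME": "anna", "CHILDREN": [klaus]}

equal_15 = father_15["CHILDREN"][0] == mother_15["CHILDREN"][0]
identical_15 = father_15["CHILDREN"][0] is mother_15["CHILDREN"][0]
identical_16 = father_16["CHILDREN"][0] is mother_16["CHILDREN"][0]
print(equal_15, identical_15, identical_16)

# Information added via one path is present via the other path only in (16)
father_15["CHILDREN"][0]["LASTNAME"] = "meier"
father_16["CHILDREN"][0]["LASTNAME"] = "meier"
print("LASTNAME" in mother_15["CHILDREN"][0], "LASTNAME" in mother_16["CHILDREN"][0])
```

In the first configuration, the two Klaus objects are equal but distinct, so nothing follows about whether Peter and Anna have a child in common. In the second configuration there is only one Klaus, and everything that is said about him under one path holds under the other path as well.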
One question still remains open: what about Max? Max is also a child of his parents and
should therefore also occur in a list of the children of his parents. There are two points
in (16) where there are three dots. These ellipsis marks stand for information about the
other children of Peter and Anna Meier. Our world knowledge tells us that both of them
must have the same child, namely Max Meier himself. In the following section, we will
see how this can be expressed in formal terms.
6.5 Cyclic structures
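The situation just described, in which Max occurs in the lists of children of his own parents, results in a structure that contains itself. The following Python sketch shows such a cyclic configuration with object references. It is an analogy rather than the formal notation, the other children are left out, and the variable names are invented for this sketch.

```python
max_meier = {"FIRSTNAME": "max", "LASTNAME": "meier"}
peter = {"FIRSTNAME": "peter", "LASTNAME": "meier", "CHILDREN": [max_meier]}
anna = {"FIRSTNAME": "anna", "LASTNAME": "meier", "CHILDREN": [max_meier]}

# Closing the cycle: the parents are values of features of Max himself
max_meier["FATHER"] = peter
max_meier["MOTHER"] = anna

# Following the path FATHER|CHILDREN leads back to the starting point
print(max_meier["FATHER"]["CHILDREN"][0] is max_meier)
print(max_meier["MOTHER"]["CHILDREN"][0] is max_meier)
```

Such a structure cannot be written down as a finite nested description without some device for referring back to a part that has already been described, which is the role played by the numbered boxes.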
6.6 Unification
In HPSG and Construction Grammar, grammatical rules are written in exactly the same way as
lexical entries, namely with the help of feature descriptions. For a word or a larger
2
The dots here stand for the path to 2 in the list which is the value of CHILDREN. See Exercise 3.
3
The term unification should be used with care. It is only appropriate if certain assumptions with regard to
the formal basis of linguistic theories are made. Informally, the term is often used in formalisms where uni-
fication is not technically defined. In HPSG, it mostly means that the constraints of two descriptions lead
to a single description. What one wants to say here, intuitively, is that the objects described have to satisfy
the constraints of both descriptions at the same time (constraint satisfaction). Since the term unification is
so broadly used, it will also be used in this section. The term will not play a role in the remaining
discussions of theories with the exception of explicitly unification-based approaches. In contrast, the concept of
constraint satisfaction presented here is very important for the comprehension of the following chapters.
6.7 Phenomena, models and formal theories
Katharina Meier could also have other properties unknown to the detective. The impor-
tant thing is that the properties known to the detective match those that the client is
looking for. Furthermore, it is important that the detective uses reliable information and
does not make up any information about the sought object. The unification of the search
in (18a) and the information accessible to the detective in (19) is in fact (19) and not (20),
for example:
(20) [ person
       FIRSTNAME      katharina
       LASTNAME       meier
       GENDER         female
       DATE-OF-BIRTH  15.10.1965
       HAIRCOLOR      blond
       CHILDREN       ⟨⟩ ]
(20) contains information about children, which is neither contained in (18a) nor in (19).
It could indeed be the case that Katharina Meier has no children, but there are perhaps
several people called Katharina Meier with otherwise identical properties. With this
invented information, we might exclude one or more possible candidates.
It is possible that our detective Max Müller does not have any information about hair
color in his files. His files could contain the following information:
(21) [ person
       FIRSTNAME      katharina
       LASTNAME       meier
       GENDER         female
       DATE-OF-BIRTH  15.10.1965 ]
These data are compatible with the search criteria. If we were to unify the descriptions
in (18a) and (21), we would get (19). If we assume that the detective has done a good job,
then Bettina Kant now knows that the person she is looking for has the properties of
her original search plus the newly discovered properties.
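The detective example can be replayed with a small Python sketch of unification for flat descriptions with atomic values. The function `unify` is a minimal illustration and not a general unification algorithm: it does not handle types, embedded structures or structure sharing. The content of the search description is an assumption of this sketch. It is chosen so that it is compatible with (21) and contributes the hair color.

```python
from typing import Optional


def unify(d1: dict, d2: dict) -> Optional[dict]:
    """Unify two flat descriptions; return None if they are incompatible."""
    result = dict(d1)
    for feature, value in d2.items():
        if feature in result and result[feature] != value:
            return None  # conflicting values: no object satisfies both
        result[feature] = value
    return result


# Assumed content of the search description
search = {
    "FIRSTNAME": "katharina",
    "LASTNAME": "meier",
    "GENDER": "female",
    "HAIRCOLOR": "blond",
}

# The detective's files as in (21)
files = {
    "FIRSTNAME": "katharina",
    "LASTNAME": "meier",
    "GENDER": "female",
    "DATE-OF-BIRTH": "15.10.1965",
}

result = unify(search, files)
print(result)
print(unify(search, {"HAIRCOLOR": "red"}))
```

The result contains exactly the information of the two descriptions taken together. In particular, it says nothing about children: unification never adds information that is in neither input, which is the difference between the correct result and (20).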
In a given model, there are only fully specified representations, that is, the model con-
tains four forms of Frau, each with a different case. For masculine nouns such as Mann
‘man’, one would have to say something about case in the description since the genitive-
singular form Mann-es differs from other singular forms, which can be seen by adding
Mann into the examples in (22). (23) shows the feature descriptions for Frau ‘woman’
and Mann ‘man’:
(23) a. Frau ‘woman’:
        [ GENDER  fem ]
     b. Mann ‘man’:
        [ GENDER  mas
          CASE    nominative ∨ dative ∨ accusative ]
Unlike (23b), (23a) does not contain a case feature since we do not need to say anything
special about case in the description of Frau. Since all nominal objects require a case fea-
ture, it becomes clear that the structures for Frau must actually also have a case feature.
The value of the case feature is of the type case. case is a general type which subsumes
the subtypes nominative, genitive, dative and accusative. Concrete linguistic objects al-
ways have exactly one of these maximally specified types as their case value. The feature
structures belonging to (23) are given in Figure 6.2 and Figure 6.3.
Figure 6.2: Feature structures for the description of Frau ‘woman’ in (23a)
In these representations, each node has a certain type (noun, fem, nominative, …) and
the types in feature structures are always maximally specific, that is, they do not have
any further subtypes. There is always an entry node (noun in the example above) and
the other nodes are connected with arrows that are annotated with the feature labels
(GENDER, CASE).
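The relation between an underspecified description and the fully specified feature structures it describes can be made explicit by enumeration. The following Python sketch assumes a tiny signature with the two features GENDER and CASE. The gender value `neu` is an assumption added for completeness, and the function name `models` is hypothetical.

```python
from itertools import product

# Maximally specific types for each feature ("neu" is an assumption of this sketch)
MAXIMAL_TYPES = {
    "GENDER": ["fem", "mas", "neu"],
    "CASE": ["nominative", "genitive", "dative", "accusative"],
}


def models(description: dict) -> list:
    """Enumerate all fully specified structures that satisfy a description.

    A description maps a feature to the set of values it admits. Features
    that are not mentioned are unconstrained, but still present in every model.
    """
    feats = list(MAXIMAL_TYPES)
    choices = [
        [v for v in MAXIMAL_TYPES[f] if v in description.get(f, MAXIMAL_TYPES[f])]
        for f in feats
    ]
    return [dict(zip(feats, combo)) for combo in product(*choices)]


frau = {"GENDER": {"fem"}}                                              # (23a)
mann = {"GENDER": {"mas"}, "CASE": {"nominative", "dative", "accusative"}}  # (23b)

print(len(models(frau)), len(models(mann)))
```

The description of Frau does not mention case, yet each of the four structures it describes has a case value. The description of Mann excludes the genitive and therefore describes three structures.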
If we return to the example with people from the previous sections, we can capture
the difference between a model and a description as follows: if we have a model of
people that includes first name, last name, date of birth, gender and hair color, then it
follows that every object we model also has a birthday. We can, however, decide to omit
these details from our descriptions if they do not play a role for stating constraints or
formulating searches.
The connection between linguistic phenomena, the model and the formal theory is
shown in Figure 6.4, which is adapted from Pollard & Sag (1994: 7). The model is designed
[Figure 6.4: diagram with the labels phenomenon (linguistic objects), model (feature structures) and models]
Figure 6.4: Phenomenon, model and formal theory according to Netter (1998: 26)
Comprehension questions
Exercises
1. Think about how one could describe musical instruments using feature de-
scriptions.
2. Come up with a type hierarchy for the word classes (det, comp, noun, verb,
adj, prep). Think about the ways in which one can organize the type hierarchy
so that one can express the generalizations that were captured by the binary
features in Table 3.1 on page 94.
3. In this chapter, we introduced lists. This may look like an extension of the
formalism, but it is not as it is possible to convert the list notation into a
notation which only requires feature-value pairs. Think about how one could
do this.
4. (Additional exercise) The relation append will play a role in Chapter 9. This
relation serves to combine two lists to form a third. Relational constraints
such as append do in fact constitute an expansion of the formalism. Using
relational constraints, it is possible to relate any number of feature values
to other values, that is, one can write programs which compute a particular
value depending on other values. This poses the question as to whether one
needs such powerful descriptive tools in a linguistic theory and if we do allow
them, what kind of complexity we afford them. A theory which can do with-
out relational constraints should be preferred over one that uses relational
constraints (see Müller 2007a: Chapter 20 for a comparison of theories).
For the concatenation of lists, there is a possible implementation in feature
structures without recourse to relational constraints. Find out how this can
be done. Give your sources and document how you went about finding the
solution.
Further reading
7 Lexical Functional Grammar
Lexical Functional Grammar (LFG) was developed in the 80s by Joan Bresnan and Ron
Kaplan (Bresnan & Kaplan 1982). LFG forms part of so-called West-Coast linguistics:
unlike MIT, where Chomsky works and teaches, the institutes of researchers such as
Joan Bresnan and Ron Kaplan are on the west coast of the USA (Joan Bresnan in Stanford
and Ron Kaplan at Xerox in Palo Alto and now at the language technology firm Nuance
Communications in the Bay Area in California).
Bresnan & Kaplan (1982) view LFG explicitly as a psycholinguistically plausible alternative
to transformation-based approaches. For a discussion of the requirements regarding
the psycholinguistic plausibility of linguistic theories, see Chapter 15.
The more in-depth works on German are Berman (1996; 2003a) and Cook (2001).
LFG has well-designed formal foundations (Kaplan & Bresnan 1982; Kaplan 1989), and
hence first implementations were available rather quickly (Frey & Reyle 1983b,a; Ya-
sukawa 1984; Block & Hunze 1986; Eisele & Dörre 1986; Wada & Asher 1986; Delmonte
1990; Her, Higinbotham & Pentheroudakis 1991; Kohl 1992; Kohl, Gardent, Plainfossé,
Reape & Momma 1992; Kaplan & Maxwell III 1996; Mayo 1997; 1999; Boullier & Sagot
2005a,b; Clément 2009; Clément & Kinyon 2001).
The following is a list of languages with implemented LFG fragments, probably in-
complete:
• Arabic (Attia 2008),
• Arrernte (Dras, Lareau, Börschinger, Dale, Motazedi, Rambow, Turpin & Ulinski
2012),
• Bengali (Sengupta & Chaudhuri 1997),
• Danish (Ørsnes 2002; Ørsnes & Wedekind 2003; 2004),
• English (Her, Higinbotham & Pentheroudakis 1991; Butt, Dipper, Frank & King
1999; Riezler, King, Kaplan, Crouch, Maxwell III & Johnson 2002; King & Maxwell
III 2007),
• French (Zweigenbaum 1991; Frank 1996; Frank & Zaenen 2002; Butt, Dipper, Frank
& King 1999; Clément & Kinyon 2001; Boullier, Sagot & Clément 2005; Schwarze
& de Alencar 2016; de Alencar 2017),
• Georgian (Meurer 2009),
• German (Rohrer 1996; Berman 1996; Kuhn & Rohrer 1997; Butt, Dipper, et al. 1999;
Dipper 2003; Rohrer & Forst 2006; Forst 2006; Frank 2006; Forst & Rohrer 2009),
7.1 General remarks on the representational format
All lexical items that have a meaning (e.g., nouns, verbs, adjectives) contribute a PRED
feature with a corresponding value. The grammatical functions governed by a head
(government = subcategorization) are determined in the specification of PRED.3 The
corresponding functions are called governable grammatical functions. Examples of this are shown
in Table 7.1 on the next page (Dalrymple 2006). The PRED specification corresponds to
the theta grid in GB theory. The valence of a head is specified by the PRED value.
2
The English examples and their analyses discussed in this section are taken from Dalrymple (2001) and
Dalrymple (2006).
3
In the structure in (1b), the SUBJ and OBJ in the list following devour are identical to the values of SUBJ
and OBJ in the structure. For reasons of presentation, this will not be explicitly indicated in this structure
and following structures.
Table 7.1: Governable grammatical functions
SUBJ: subject
OBJ: object
COMP: sentential complement or closed (non-predicative) infinitival complement
XCOMP: open (predicative) complement, often infinitival, the SUBJ function is externally controlled
OBJ_𝜃: secondary OBJ functions that are related to a special, language-specific set of grammatical roles; English has OBJ_THEME only
OBL_𝜃: a group of thematically restricted oblique functions, as for instance OBL_GOAL or OBL_AGENT; these often correspond to adpositional phrases in c-structure
Table 7.2: Non-governable grammatical functions
ADJ: adjuncts
TOPIC: the topic of an utterance
FOCUS: the focus of an utterance
The non-governable grammatical functions are given in Table 7.2. Topic and focus are
information-structural terms. There are a number of works on their exact definition, which
differ to varying degrees (Kruijff-Korbayová & Steedman 2003: 253–254), but broadly
speaking, one can say that the focus of an utterance constitutes new information and that
the topic is old or given information. Bresnan (2001: 97) uses the following question tests
in order to determine topic and focus:
f-structures are characterized using functional descriptions. For example, one can refer to
the value of the feature TENSE in the functional structure 𝑓 using the following expression:
(4) (𝑓 TENSE)
It is possible to say something about the value which this feature should have in the
feature description. The following descriptions express the fact that in the structure 𝑓 ,
the feature TENSE must have the value PAST.
The value of a feature may also be a specific f-structure. The expression in (6) ensures
that the SUBJ feature in 𝑓 is the f-structure 𝑔:
(6) (𝑓 SUBJ) = 𝑔
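Functional descriptions of this kind can be mimicked with nested dictionaries. The following is a minimal sketch, assuming that f-structures are encoded as Python dictionaries; the helper `value` and the variable names are hypothetical.

```python
def value(f, *path):
    """Follow a path of attribute names through an f-structure,
    e.g. value(f, "TENSE") corresponds to the expression (f TENSE)."""
    for attribute in path:
        f = f[attribute]
    return f

# Two f-structures; g is the value of SUBJ in f.
g = {"PRED": "DAVID"}
f = {"PRED": "SNEEZE⟨SUBJ⟩", "TENSE": "PAST", "SUBJ": g}

tense_is_past = value(f, "TENSE") == "PAST"  # (f TENSE) = PAST
subj_is_g = value(f, "SUBJ") is g            # (f SUBJ) = g, identity rather than mere equality

print(tense_is_past, subj_is_g)
```

The identity test with `is` reflects the fact that the description in (6) requires the value of SUBJ to be the structure 𝑔 itself, not merely a structure that looks the same.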
[c-structure tree for David sneezed, in which the NP node is linked to its f-structure]
The function 𝜙 from the NP-node to the f-structure corresponding to the NP is depicted
with an arrow marked 𝜙.
A phrase and its head always correspond to the same f-structure:
(10) c-structure: [V′ [V sneezed]]
     f-structure (𝜙 maps both V′ and V onto it): [ PRED ‘SNEEZE⟨SUBJ⟩’
                                                   TENSE PAST ]
In LFG grammars of English, the CP/IP system is assumed as in GB theory (see Sec-
tion 3.1.5). IP, I ′ and I (and also VP) are mapped onto the same f-structure.
b. c-structure: [IP [NP [N′ [N David]]] [I′ [I is] [VP [V′ [V yawning]]]]]
   f-structure: [ PRED ‘YAWN⟨SUBJ⟩’
                  TENSE PRES
                  SUBJ [ PRED ‘DAVID’ ] ]
f-structures have to fulfill two well-formedness conditions: they have to be both complete
and coherent. Both these conditions will be discussed in the following sections.
7.1.2 Completeness
Every head adds a constraint on the PRED value of the corresponding f-structure. In
determining completeness, one has to check that the elements required in the PRED value
are actually realized. In (12b), OBJ is missing a value, which is why (12a) is ruled out by
the theory.
(12) a. * David devoured.
     b. [ PRED ‘DEVOUR⟨SUBJ,OBJ⟩’
          SUBJ [ PRED ‘DAVID’ ] ]
7.1.3 Coherence
The Coherence Condition requires that all argument functions in a given f-structure
be selected in the value of the local PRED attribute. (13a) is ruled out because
COMP does not appear among the arguments of devour.
(13) a. * David devoured a sandwich that Peter sleeps.
     b. [ PRED ‘DEVOUR⟨SUBJ,OBJ⟩’
          SUBJ [ PRED ‘DAVID’ ]
          OBJ  [ SPEC A
                 PRED ‘SANDWICH’ ]
          COMP [ PRED ‘SLEEP⟨SUBJ⟩’
                 SUBJ [ PRED ‘PETER’ ] ] ]
The constraints on completeness and coherence together ensure that all and only those
arguments required in the PRED specification are actually realized. Taken together, the two
constraints correspond to the Theta-Criterion in GB theory (see page 92).4
4
For the differences between predicate-argument structures in LFG and the deep structure oriented Theta
Criterion, see Bresnan & Kaplan (1982: xxvi–xxviii).
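The two well-formedness conditions can be illustrated with a small Python check. This is a rough sketch under several assumptions: f-structures are encoded as dictionaries, the selected functions are read off the PRED string, the theta-indexed functions are spelled `OBJ_THETA` and `OBL_THETA`, and only the local f-structure is inspected (embedded f-structures would need a recursive check). None of the function names below come from the LFG literature.

```python
import re

# Governable grammatical functions as in Table 7.1 (spelling of the
# theta-indexed functions is an assumption of this sketch).
GOVERNABLE = {"SUBJ", "OBJ", "COMP", "XCOMP", "OBJ_THETA", "OBL_THETA"}

def selected(fs):
    """Return the functions listed between ⟨ and ⟩ in the PRED value."""
    match = re.search(r"⟨(.*)⟩", fs.get("PRED", ""))
    if match is None:
        return set()
    return {gf.strip() for gf in match.group(1).split(",") if gf.strip()}

def is_complete(fs):
    """Every function required by PRED has a value in the f-structure."""
    return selected(fs).issubset(fs)

def is_coherent(fs):
    """Every governable function present in the f-structure is selected by PRED."""
    return GOVERNABLE.intersection(fs).issubset(selected(fs))

# (12b): OBJ is missing, so the structure is incomplete.
ex12 = {"PRED": "DEVOUR⟨SUBJ,OBJ⟩", "SUBJ": {"PRED": "DAVID"}}

# (13b): COMP is present but not selected, so the structure is incoherent.
ex13 = {
    "PRED": "DEVOUR⟨SUBJ,OBJ⟩",
    "SUBJ": {"PRED": "DAVID"},
    "OBJ": {"SPEC": "A", "PRED": "SANDWICH"},
    "COMP": {"PRED": "SLEEP⟨SUBJ⟩", "SUBJ": {"PRED": "PETER"}},
}

print(is_complete(ex12), is_coherent(ex12))
print(is_complete(ex13), is_coherent(ex13))
```

Non-governable functions such as ADJ, TOPIC and FOCUS are deliberately left out of `GOVERNABLE`, so their presence does not affect the coherence check.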
(14) V′ →   V
           ↑=↓
     (↑=↓: f-structure of the mother = own f-structure)
(16) shows a V′ rule with an object:
(16) V′ → V NP
↑=↓ (↑ OBJ) = ↓
The annotation on the NP signals that the OBJ value in the f-structure of the mother
(↑ OBJ) is identical to the f-structure of the NP node, that is, to everything that is con-
tributed from the material below the NP node (↓). This is shown in the figure in (17):
(17) c-structure: [V′ V NP]
     f-structure of V′: [ OBJ [ ] ], where the value of OBJ is the f-structure of the NP
In the equation (↑ OBJ) = ↓, the arrows ‘↑’ and ‘↓’ correspond to feature structures. ‘↑’
and ‘↓’ stand for the 𝑓 and 𝑔 in equations such as (6).
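The effect of the two kinds of annotation can be sketched as follows. The encoding is an assumption made for illustration: a node is a pair of lexical information and a list of annotated daughters, the string "head" stands for ↑=↓, and any other string such as "OBJ" stands for (↑ OBJ) = ↓. The merge step is deliberately naive and does not check for conflicting values, which a real unification procedure would have to do. The example words are chosen freely.

```python
def build(node):
    """Compute the f-structure of a c-structure node.

    node        = (lexical_info, daughters)
    daughters   = list of (annotation, daughter_node)
    annotation  = "head" for ↑=↓, or a function name like "OBJ" for (↑ OBJ) = ↓
    """
    lexical_info, daughters = node
    fs = dict(lexical_info)
    for annotation, daughter in daughters:
        daughter_fs = build(daughter)
        if annotation == "head":
            # ↑=↓: mother and daughter share one f-structure (naive merge).
            fs.update(daughter_fs)
        else:
            # (↑ GF) = ↓: the daughter's f-structure becomes the value of GF.
            fs[annotation] = daughter_fs
    return fs

# V′ → V NP with the annotations of rule (16).
v = ({"PRED": "DEVOUR⟨SUBJ,OBJ⟩", "TENSE": "PAST"}, [])
np = ({"PRED": "SANDWICH"}, [])
v_bar = ({}, [("head", v), ("OBJ", np)])

print(build(v_bar))
```

The information contributed by V ends up directly in the f-structure of V′, while the information from the NP is embedded under OBJ.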
(18) is an example with an intransitive verb and (19) is the corresponding visualization:
7.1.5 Semantics
According to Dalrymple (2006: 90–92), glue semantics is the dominant approach to semantic
interpretation in LFG (Dalrymple, Lamping & Saraswat 1993; Dalrymple 2001: Chapter 8).
There are, however, other variants in which Kamp’s discourse representation structures
(Kamp & Reyle 1993) are used (Frey & Reyle 1983b,a).
In the following, glue semantics will be presented in more detail.5 Under a glue-based
approach, it is assumed that f-structure is the level of syntactic representation which is
crucial for the semantic interpretation of a phrase, that is, unlike in GB theory, it is not the
position of arguments in the tree which plays a role in the composition of meaning, but
rather functional relations such as SUBJ and OBJ. Glue semantics assumes that each
substructure of the f-structure corresponds to a semantic resource connected to a meaning
and, furthermore, that the meaning of a given f-structure comes from the sum of these
parts. The way the meaning is assembled is regulated by certain instructions for the
combination of semantic resources. These instructions are given as a set of logical premises
written in linear logic, which serves as the glue language. The computation of the meaning
of an utterance corresponds to a logical conclusion.
This conclusion is reached on the basis of logical premises contributed by the words
in an expression or possibly even by a syntactic construction itself. The requirements on
how the meaning of the parts can be combined to yield the full meaning are expressed
in linear logic, a resource-based logic. Linear logic differs from classical logic in that
premises may neither remain unused nor be used more than once in
a derivation. Hence, in linear logic, premises are resources which have to be used. This
corresponds directly to the use of words in an expression: words contribute to the entire
meaning exactly once. It is not possible to ignore them or to use their meaning more
than once. A sentence such as Peter knocked twice. does not mean the same as Peter
knocked. The meaning of twice must be included in the full meaning of the sentence.
Similarly, the sentence cannot mean the same as Peter knocked twice twice., since the
semantic contribution of a given word cannot be used twice.
The syntactic structure for the sentence in (20a) together with its semantic represen-
tation is given in (20b):
(20) a. David yawned.
     b. c-structure: [IP [NP [N David]] [I′ [VP [V yawned]]]]
        f-structure (via 𝜙): [ PRED ‘YAWN⟨SUBJ⟩’
                               SUBJ [ PRED ‘DAVID’ ] ]
        semantic structure (via 𝜎): yawn′(david′) : [ ]
The semantic structure of this sentence is connected to the f-structure via the correspon-
dence function 𝜎 (depicted here as a dashed line). The semantic representation is derived
from the lexical information for the verb yawned, which is given in (21).
(21) 𝜆𝑥 .yawn′ (𝑥) : (↑ SUBJ)𝜎 −◦ ↑𝜎
This formula is referred to as the meaning constructor. Its job is to combine the meaning
of yawned – a one-place predicate 𝜆𝑥.yawn′(𝑥) – with the formula (↑ SUBJ)𝜎 −◦ ↑𝜎 in
linear logic. Here, the connective −◦ is the linear implication symbol of linear logic. It
expresses that if a semantic resource (↑ SUBJ)𝜎 for the meaning
of the subject is available, then a semantic resource for ↑𝜎 must be created, which will
stand for the entire meaning of the sentence. Unlike the implication operator of classical
logic, the linear implication must consume and produce semantic resources: the formula
(↑ SUBJ)𝜎 −◦ ↑𝜎 states that if a semantic resource (↑ SUBJ)𝜎 is found, it is consumed and
the semantic resource ↑𝜎 is produced.
5
The following discussion draws heavily on the corresponding section of Dalrymple (2006). (It is a
translation of my translation of the original material into German.)
Furthermore, it is assumed that a proper name such as David contributes its own
semantic structure as a semantic resource. In an utterance such as David yawned, this
resource is consumed by the verb yawned, which requires a resource for its SUBJ in order
to produce the resource for the entire sentence. This corresponds to the intuition that a
verb in any given sentence requires the meaning of its arguments in order for the entire
sentence to be understood.
The f-structure of David yawned with the instantiated meaning constructors
contributed by David and yawned is given in (22):
(22) 𝑦 : [ PRED ‘YAWN⟨SUBJ⟩’
           SUBJ 𝑑 : [ PRED ‘DAVID’ ] ]
     [David] david′ : 𝑑𝜎
     [yawn] 𝜆𝑥.yawn′(𝑥) : 𝑑𝜎 −◦ 𝑦𝜎
The left side of the meaning constructor marked by [David] is the meaning of the proper
name David, david ′ to be precise. The left-hand side of the meaning constructor [yawn]
is the meaning of the intransitive verb – a one-place predicate 𝜆𝑥 .yawn′ (𝑥).
In addition, further rules are needed to determine the exact relation
between the right-hand side (the glue) of the meaning constructors in (22) and the
left-hand side (the meaning). For simple, non-implicational meaning constructors such as
[David] in (22), the meaning on the left is the same as the meaning of the semantic
structure on the right. Meaning constructors such as [yawn] have a 𝜆-expression on the
left, which has to be combined with another expression via functional application (see
Section 2.3). The linear implication on the right-hand side must be applied in parallel.
This combined process is shown in (23).
(23) 𝑥 : 𝑓𝜎
     𝑃 : 𝑓𝜎 −◦ 𝑔𝜎
     ─────────────
     𝑃(𝑥) : 𝑔𝜎
The right-hand side of the rule corresponds to a logical conclusion following the modus
ponens rule. With these correspondences between expressions in linear logic and the
meanings themselves, we can proceed as shown in (24), which is based on Dalrymple
(2006: 92). After combining the respective meanings of yawned and David and then
carrying out 𝛽-reduction, we arrive at the desired result of yawn′ (david′) as the meaning
of David yawned.
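The derivation step in (23) can be imitated in Python to make the resource-sensitivity explicit. This is a sketch under stated assumptions: meanings are encoded as strings or functions over strings, atomic glue formulas as strings such as "d_sigma", linear implications as pairs (antecedent, consequent), and the function name `linear_modus_ponens` is hypothetical.

```python
def linear_modus_ponens(premises, argument, functor):
    """Apply rule (23) once.

    premises maps a label to a pair (meaning, glue). Both premises that
    take part in the step are removed from the pool (they are consumed),
    and the conclusion is added as a new resource.
    """
    pool = dict(premises)
    arg_meaning, arg_glue = pool.pop(argument)
    fun_meaning, fun_glue = pool.pop(functor)
    if not isinstance(fun_glue, tuple):
        raise ValueError("functor premise is not a linear implication")
    antecedent, consequent = fun_glue
    if antecedent != arg_glue:
        raise ValueError("argument resource does not match the antecedent")
    # Functional application on the meaning side, implication elimination on the glue side.
    pool[f"{functor}({argument})"] = (fun_meaning(arg_meaning), consequent)
    return pool

# The instantiated meaning constructors from (22).
premises = {
    "David": ("david'", "d_sigma"),
    "yawn": (lambda x: f"yawn'({x})", ("d_sigma", "y_sigma")),
}

result = linear_modus_ponens(premises, "David", "yawn")
print(result)
```

After the step, the pool contains exactly one resource, the meaning of the whole sentence paired with 𝑦𝜎. Neither of the two original premises is left over, and neither can be used a second time.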
Glue analyses of quantification, modification and other phenomena have been investi-
gated in a volume on glue semantics (Dalrymple 1999). Particularly problematic for these
approaches are cases where there appear to be too many or too few resources for the
production of utterances. These kinds of cases have been discussed by Asudeh (2004).
7.1.6 Adjuncts
Adjuncts are not selected by their head. The grammatical function ADJ is a non-governable
grammatical function. Unlike arguments, where every grammatical function can
only be realized once, a sentence can contain multiple adjuncts. The value of ADJ in the
f-structure is therefore not a simple structure as with the other grammatical functions,
but rather a set. For example, the f-structure for the sentence in (25a) contains an ADJ set
with two elements: one for yesterday and one for at noon.
(25) a. David devoured a sandwich at noon yesterday.
     b. [ PRED ‘DEVOUR⟨SUBJ,OBJ⟩’
          SUBJ [ PRED ‘DAVID’ ]
          OBJ  [ SPEC A
                 PRED ‘SANDWICH’ ]
          ADJ  { [ PRED ‘YESTERDAY’ ],
                 [ PRED ‘AT⟨OBJ⟩’
                   OBJ [ PRED ‘NOON’ ] ] } ]
The annotation on the c-structure rule for adjuncts requires that the f-structure of the
adjuncts be part of the ADJ set of the mother’s f-structure:
(26) V′ → V′ PP
↑=↓ ↓ ∈ (↑ ADJ)
7.2 Passive
Bresnan & Mchombo (1995) argue that one should view words as the “atoms” from which
syntactic structure is built (lexical integrity).6
Syntactic rules cannot create new words or make reference to the internal structure
of words. Every terminal node (each “leaf” of the tree) is a word. It follows from this that
analyses such as the GB analysis of Pollock (1989) in Figure 7.1 on the next page for the
French example in (27) are ruled out (the figure is taken from Kuhn 2007: 617):
In Pollock’s analysis, the various morphemes are in specific positions in the tree and are
combined only after certain movements have been carried out.
The assumption of lexical integrity is made by all theories discussed in this book with
the exception of GB and Minimalism. However, formally, this is not a must as it is also
possible to connect morphemes to complex syntactic structures in theories such as Cat-
egorial Grammar, GPSG, HPSG, CxG, DG and TAG (Müller 2018b: Section 4). As far as I
know, this kind of analysis has never been proposed.
Bresnan noticed that, as well as passivized verbs, there are passivized adjectives which
show the same morphological idiosyncrasies as the corresponding participles (Bresnan
1982b: 21; Bresnan 2001: 31). Some examples are given in (28):
If one assumes lexical integrity, then adjectives would have to be derived in the lexicon.
If the verbal passive were not a lexical process, but rather a phrase-structural one, then
the form identity would remain unexplained.
In LFG, grammatical functions are primitives, that is, they are not derived from a posi-
tion in the tree (e.g., Subject = SpecIP). Words (fully inflected word-forms) determine the
6
See Anderson (1992: 84) for more on lexical integrity.
Figure 7.1: Pollock’s analysis of Marie ne parlerait pas ‘Marie would not speak.’ according
to Kuhn (2007: 617)
[Tree with the projections AgrP, NegP, TP and VP; pas and ne are located in NegP, -er- in T, Marie in Spec-VP and parl- in V]
in a universally valid hierarchy (Bresnan & Kanerva 1989; Bresnan 2001: 307): agent >
beneficiary > experiencer/goal > instrument > patient/theme > locative. Patient-like
roles are marked as unrestricted ([−r]) in a corresponding representation, the so-called
a-structure. Secondary patient-like roles are marked as objective ([+o]) and all other roles
are marked as non-objective ([−o]). For the transitive verb schlagen ‘to beat’, we have
the following:
(30) Agent Patient
a-structure schlagen ‘beat’ ⟨x y ⟩
[−o] [−r]
The mapping of a-structure to f-structure is governed by the following restrictions:
(31) a. Subject-Mapping-Principle: The most prominent role marked with [−o] is
mapped to SUBJ if it is initial in the a-structure. Otherwise, the role marked
with [−r] is mapped to SUBJ.
b. The argument roles are connected to grammatical functions as shown in the
following table. Non-specified values for o and r are to be understood as ‘+’:
[−r] [+r]
[−o] SUBJ OBL𝜃
[+o] OBJ OBJ𝜃
c. Function-Argument Biuniqueness: Every a-structure role must be associated
with exactly one function and vice versa.
For the argument structure in (30), the principle in (31a) ensures that the agent x receives
the grammatical function SUBJ. (31b) adds an o-feature with the value ‘+’ so that the
patient y is associated with OBJ:
(32) Agent Patient
a-structure schlagen ‘beat’ ⟨x y ⟩
[−o] [−r]
SUBJ OBJ
Under passivization, the most prominent role is suppressed so that only the [−r] marked
patient role remains. Following (31a), this role will then be mapped to the subject.
(33) Agent Patient
a-structure schlagen ‘beat’ ⟨x y ⟩
[−o] [−r]
∅ SUBJ
Unlike the objects of transitive verbs, the objects of verbs such as helfen ‘help’ are marked
as [+o] (Berman 1999). The lexical case of the objects is given in the a-structure, since
this case (dative) is linked to a semantic role (Zaenen, Maling & Thráinsson 1985: 465).
The corresponding semantic roles are obligatorily mapped to the grammatical function
OBJ𝜃 .
Since there is neither a [−o] nor a [−r] argument, no argument is connected to the sub-
ject function. The result is an association of arguments and grammatical functions that
corresponds to the one found in impersonal passives.
These mapping principles may seem complex at first glance, but they play a role in
analyzing an entire range of phenomena, e.g., the analysis of unaccusative verbs
(Bresnan & Zaenen 1990). For the analysis of the passive, we can now say that the passive
suppresses the highest [−o] role. It is no longer necessary to mention a possible object in
the passive rule.
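The mapping restrictions in (31a,b) and the passive as suppression of the highest [−o] role can be sketched in Python. The encoding is an assumption made for illustration: an a-structure is a list of (role, features) pairs ordered by prominence, and features are given as a dictionary with the optional keys "o" and "r". The role label "beneficiary" for the object of helfen is also an assumption of this sketch, and Function-Argument Biuniqueness (31c) is not checked.

```python
# (31b): pairs of (o, r) values and the corresponding grammatical functions.
FUNCTION_TABLE = {
    ("-", "-"): "SUBJ",
    ("-", "+"): "OBL_theta",
    ("+", "-"): "OBJ",
    ("+", "+"): "OBJ_theta",
}

def map_functions(a_structure):
    """Map every role of an a-structure to a grammatical function."""
    subject = None
    # (31a): an initial [-o] role is mapped to SUBJ; otherwise a [-r] role is.
    if a_structure and a_structure[0][1].get("o") == "-":
        subject = a_structure[0][0]
    else:
        for role, features in a_structure:
            if features.get("r") == "-":
                subject = role
                break
    mapping = {}
    for role, features in a_structure:
        if role == subject:
            mapping[role] = "SUBJ"
        else:
            # (31b): non-specified values for o and r are understood as '+'.
            key = (features.get("o", "+"), features.get("r", "+"))
            mapping[role] = FUNCTION_TABLE[key]
    return mapping

def passivize(a_structure):
    """Suppress the most prominent [-o] role."""
    for index, (_, features) in enumerate(a_structure):
        if features.get("o") == "-":
            return a_structure[:index] + a_structure[index + 1:]
    return a_structure

schlagen = [("agent", {"o": "-"}), ("patient", {"r": "-"})]
helfen = [("agent", {"o": "-"}), ("beneficiary", {"o": "+"})]

print(map_functions(schlagen))             # active, compare (32)
print(map_functions(passivize(schlagen)))  # passive, compare (33)
print(map_functions(passivize(helfen)))    # impersonal passive: no SUBJ
```

For helfen ‘help’, the passivized a-structure contains neither a [−o] nor a [−r] role, so no role is mapped to SUBJ and the remaining role is linked to OBJ𝜃.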
• a trace in verb-final position (as in GB) (see Choi 1999, Berman 1996: Section 2.1.4)
and
• so-called extended head domains (see Berman 2003a).
In the analysis of extended head domains, the verb is simply omitted from the verb
phrase. The following preliminary variant of the VP rule is used:7
(36) VP → NP* (V)
All components of the VP are optional as indicated by the brackets and by the Kleene
star. The Kleene star stands for arbitrarily many occurrences of a symbol. This also in-
cludes zero occurrences. As in GB analyses, the verb in verb-first clauses is in C. No I
projection is assumed – as in a number of GB works (Haider 1993; 1995; 1997a; Sterne-
feld 2006: Section IV.3), since it is difficult to motivate its existence for German (Berman
2003a: Section 3.2.2). The verb contributes its f-structure information from the C posi-
tion. Figure 7.2 on the facing page contains a simplified version of the analysis proposed
by Berman (2003a: 41).
7
See Bresnan (2001: 110), Zaenen & Kaplan (2002: 413) and Dalrymple (2006: 84) for corresponding rules
with optional constituents on the right-hand side of the rule. Zaenen & Kaplan (2002: 413) suggest a rule
that is similar to (36) for German.
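The optionality expressed by the brackets and the Kleene star in rule (36) can be illustrated with a regular expression over category symbols. This is merely a sketch: treating the right-hand side of a phrase structure rule as a regular expression over space-terminated category names is an assumption of this illustration, and the function name is hypothetical.

```python
import re

# Right-hand side of (36), VP → NP* (V): any number of NPs followed by an optional V.
VP_RULE = re.compile(r"(?:NP )*(?:V )?")

def licensed_by_vp_rule(categories):
    """Check whether a sequence of daughter categories matches NP* (V)."""
    text = "".join(category + " " for category in categories)
    return VP_RULE.fullmatch(text) is not None

print(licensed_by_vp_rule(["NP", "NP", "V"]))  # verb-final VP
print(licensed_by_vp_rule(["NP", "NP"]))       # VP without V (verb in C)
print(licensed_by_vp_rule([]))                 # zero occurrences of everything
print(licensed_by_vp_rule(["V", "NP"]))        # wrong order
```

The rule itself therefore admits VPs without a verb and even empty VPs. That such structures do not lead to wrong predictions has to follow from other parts of the grammar.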
7.4 Local reordering
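Figure 7.2: simplified version of the analysis proposed by Berman (2003a: 41)
c-structure: [CP [C′ (↑=↓) [C (↑=↓)] [VP (↑=↓) [NP (↑ subj) = ↓] [NP (↑ obj) = ↓]]]]
f-structure: [ pred ‘VERSCHLINGEN⟨SUBJ,OBJ⟩’
               subj [ pred ‘DAVID’ ]
               tense PRES
               obj [ pred ‘APFEL’ ] ]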
After what we learned about phrase structure rules in Chapters 2 and 5, it may seem
strange to allow VPs without V. This is not a problem in LFG, however, since for the
analysis of a given sentence, it only has to be ensured that all the necessary parts (and
only these) are present. This is ensured by the constraints on completeness and coher-
ence. Where exactly the information comes from is not important. In Figure 7.2, the verb
information does not come from the VP, but rather from the C node. C′ is licensed by a
special rule:
(37) C′ → C VP
↑=↓ ↑=↓
In LFG rules, there is normally only one element annotated with ‘↑ = ↓’, namely the
head. In (37), there are two such elements, which is why both contribute equally to the
f-structure of the mother. The head domain of V has been extended to C. The information
about SUBJ and OBJ comes from the VP and the information about PRED from C.
If one assumes that traces are relevant for the semantic interpretation of a given struc-
ture, then the first option has the same problems as movement-based GB analyses. These
have already been discussed in Section 3.5.
In what follows, I will present the analysis proposed by Berman (1996: Section 2.1.3)
in a somewhat simplified form. Case and grammatical functions of verbal arguments are
determined in the lexicon (Berman 1996: 22). (38) shows the lexical entry for the verb
verschlingen ‘devour’:9,10
(38) verschlingt V (↑ PRED) = ‘VERSCHLINGEN⟨SUBJ, OBJ⟩’
(↑ SUBJ AGR CAS) = NOM
(↑ OBJ AGR CAS) = ACC
(↑ TENSE) = PRES
9
The four cases in German can be represented using two binary features (GOV, OBL) (Berman 1996: 22).
Nominative corresponds to GOV− and OBL− and accusative to GOV+ and OBL−. This kind of encoding
allows one to leave case partially underspecified. If one does not provide a value for GOV, then an
element with OBL− is compatible with both nominative and accusative. Since this underspecification is
not needed in the following discussion, I will omit this feature decomposition and insert the case values
directly.
10
Alternative analyses derive the grammatical function of an NP from its case (Berman 2003a: 37 for German;
Bresnan 2001: 187, 201 for German and Russian).
Karttunen (1989: Section 2.1) makes a similar suggestion for Finnish in the framework of Categorial Gram-
mar. Such analyses are not entirely unproblematic as case cannot always be reliably paired with grammat-
ical functions. In German, as well as temporal accusatives (ii.a), there are also verbs with two accusative
objects (ii.b–c) and predicative accusatives (ii.d).
All of these accusatives can occur in long-distance dependencies (see Section 7.5):
wen is not the object of glauben ‘believe’ and as such cannot be included in the f-structure of glauben
‘believe’. One would have to reformulate the implication in (i) as a disjunction of all possible grammatical
functions of the accusative and in addition account for the fact that accusatives can come from a more
deeply embedded f-structure.
Bresnan (2001: 202) assumes that nonlocal dependencies crossing a clause involve a gap in German.
With such a gap one can assume that case is only assigned locally within the verbal projection. In
any case, one would have to distinguish several types of frontings in German, and the specification of
the interaction between case and grammatical function would be much more complicated than (i).
Berman proposes an analysis that does not combine the verb with all its arguments
and adjuncts at the same time, as was the case in GPSG. Instead, she chooses the other
extreme and assumes that the verb is not combined with an adjunct or an argument, but
rather forms a VP directly. The rule for this is shown in (39):
(39) VP → (V)
↑=↓
At first sight, this may seem odd since a V such as verschlingen ‘devour’ does not have
the same distribution as a verb with its arguments. However, one should recall that the
constraints pertaining to coherence and completeness of f-structures play an important
role so that the theory does not make incorrect predictions.
Since the verb can occur in initial position, it is marked as optional in the rule in (39)
(see Section 7.3).
The following rule can be used additionally to combine the verb with its subject or
object.
(40) VP → NP VP
(↑ SUBJ |OBJ |OBJ𝜃 ) = ↓ ↑=↓
The ‘|’ here stands for a disjunction, that is, the NP can either be the subject or the object
of the superordinate f-structure. Since VP occurs both on the left and right-hand side
of the rule in (40), it can be applied multiple times. The rule is not complete, however.
For instance, one has to account for prepositional objects, for clausal arguments, for
adjectival arguments and for adjuncts. See footnote 12 on page 241.
Figure 7.3 on the next page shows the analysis for (41a).
(41) a. [dass] David den Apfel verschlingt
that David the apple devours
‘that David is devouring the apple’
b. [dass] den Apfel David verschlingt
that the apple David devours
The analysis of (41b) is shown in Figure 7.4 on the following page. The analysis of (41b)
differs from the one of (41a) only in the order of the replacement of the NP node by the
subject or object.
One further fact must be discussed: in the rule (39), the verb is optional. If it is omitted,
the VP is empty. In this way, the VP rule in (40) can have an empty VP on the right-hand
side of the rule. This VP is also simply omitted even though the VP symbol on the
right-hand side of rule (40) is not marked as optional. That is, the corresponding symbol
also becomes optional once the rest of the grammar and possible interactions with other
rules are taken into consideration.
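Figure 7.3: analysis of (41a)
c-structure: [VP [NP (↑ subj) = ↓] [VP [NP (↑ obj) = ↓] [VP V]]]
f-structure: [ pred ‘VERSCHLINGEN⟨SUBJ,OBJ⟩’
               subj [ pred ‘DAVID’ ]
               tense PRES
               obj [ pred ‘APFEL’ ] ]
Figure 7.4: analysis of (41b)
c-structure: [VP [NP (↑ obj) = ↓] [VP [NP (↑ subj) = ↓] [VP V]]]
f-structure: [ pred ‘VERSCHLINGEN⟨SUBJ,OBJ⟩’
               subj [ pred ‘DAVID’ ]
               tense PRES
               obj [ pred ‘APFEL’ ] ]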
7.5 Long-distance dependencies and functional uncertainty
those that are created by a fixed syntactic mechanism and that interact with the rest of
the syntax.
Unlike argument functions, the discourse functions TOPIC and FOCUS are not lexically
subcategorized and are therefore not subject to the completeness and coherence condi-
tions. The values of discourse function features like TOPIC and FOCUS are identified with
an f-structure that bears an argument function. (43) gives the f-structure for the sentence
in (42):
(43) [ PRED ‘THINK⟨SUBJ, COMP⟩’
       TOPIC [ PRED ‘CHRIS’ ]
       SUBJ [ PRED ‘pro’ ]
       COMP [ PRED ‘SEE⟨SUBJ, OBJ⟩’
              SUBJ [ PRED ‘DAVID’ ]
              OBJ (connected by a line to the value of TOPIC) ] ]
The connecting line means that the value of TOPIC is identical to the value of COMP|OBJ. In
Chapter 6 on feature descriptions, I used boxes for structure sharing rather than connect-
ing lines, since boxes are more common across frameworks. It is possible to formulate
the structure sharing in (43) as an f-structure constraint as in (44):
(44) (↑ TOPIC) = (↑ COMP OBJ)
Fronting operations such as (42) are possible from various levels of embedding: for in-
stance, (45a) shows an example with less embedding. The object is located in the same f-
structure as the topic. However, the object in (42) comes from a clause embedded under
think.
The f-structure corresponding to (45a) is given in (45b):
(45) a. Chris, we saw.
     b. [ PRED ‘SEE⟨SUBJ, OBJ⟩’
          TOPIC [ PRED ‘CHRIS’ ]
          SUBJ [ PRED ‘pro’ ]
          OBJ (connected by a line to the value of TOPIC) ]
The identity restriction for TOPIC and object can be formulated in this case as in (46):
(46) (↑ TOPIC) = (↑ OBJ)
Example (47a) shows a case of even deeper embedding than in (42) and (47b,c) show the
corresponding f-structure and the respective restriction.
The restrictions in (44), (46) and (47c) are c-structure constraints. The combination of a
c-structure with (44) is given in (48):
(48) CP → XP C′
(↑ TOPIC) = ↓ ↑=↓
(↑ TOPIC) = (↑ COMP OBJ)
(48) states that the first constituent contributes to the TOPIC value in the f-structure of
the mother and furthermore that this topic value has to be identical to that of the object
in the complement clause. We have also seen examples of other embeddings of various
depths. We therefore need restrictions of the following kind as in (49):
(49) a. (↑ TOPIC) = (↑ OBJ)
b. (↑ TOPIC) = (↑ COMP OBJ)
c. (↑ TOPIC) = (↑ COMP COMP OBJ)
d. …
The generalization emerging from these equations is given in (50):
(50) (↑ TOPIC) = (↑ COMP* OBJ)
Here, ‘*’ stands for an unrestricted number of occurrences of COMP. This way of
leaving the identification of discourse function and grammatical function open is known
as functional uncertainty (see Kaplan & Zaenen 1989).
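How a functional uncertainty equation such as (50) is resolved against a concrete f-structure can be sketched as follows. The sketch assumes a dictionary encoding of f-structures in which structure sharing is modeled by object identity; the function name `resolve_uncertainty` and its parameters are hypothetical. It also assumes that the f-structure contains no cycles along COMP.

```python
def resolve_uncertainty(fs, body="COMP", final="OBJ"):
    """Enumerate all instantiations of the path body* final in fs.

    Yields pairs (path, value), starting with zero occurrences of body.
    """
    path = []
    current = fs
    while isinstance(current, dict):
        if final in current:
            yield tuple(path + [final]), current[final]
        path.append(body)
        current = current.get(body)

# f-structure (43): the value of TOPIC is shared with the embedded OBJ.
chris = {"PRED": "CHRIS"}
fs43 = {
    "PRED": "THINK⟨SUBJ, COMP⟩",
    "TOPIC": chris,
    "SUBJ": {"PRED": "pro"},
    "COMP": {
        "PRED": "SEE⟨SUBJ, OBJ⟩",
        "SUBJ": {"PRED": "DAVID"},
        "OBJ": chris,
    },
}

# (↑ TOPIC) = (↑ COMP* OBJ) holds if some instantiation of the path
# leads to the very same structure as TOPIC.
solutions = [path for path, value in resolve_uncertainty(fs43)
             if value is fs43["TOPIC"]]
print(solutions)
```

For (43), the only solution is the path COMP OBJ, which corresponds to the specific equation in (44). A structure like (45b) would instead be resolved by the path OBJ with zero occurrences of COMP.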
As was shown in the discussion of examples (2) and (3) on page 224, it is not the case
that only a TOPIC can be placed in the specifier position of CP in English as FOCUS can oc-
cur there too. One can use disjunctions in LFG equations and express the corresponding
condition as follows:
One can introduce a special symbol for TOPIC|FOCUS, which stands for a disjunction of
discourse functions: DF. (51) can then be abbreviated as in (52):
The final version of the c-structure rule for fronting in English will therefore have the
form of (53):11
(53) CP → XP C′
(↑ DF) = ↓ ↑=↓
(↑ DF) = (↑ COMP* OBJ)
In German, as well as objects, nearly any other constituent (e.g., subjects, sentential
complements, adjuncts) can be fronted. The c-structure rule for this is shown in (54):12
(54) CP → XP C′
(↑ DF) = ↓ ↑=↓
(↑ DF) = (↑ COMP* GF)
Here, GF is an abbreviation for a disjunction of grammatical functions which can occur
in the prefield. Figure 7.5 shows the analysis of the sentence in (55):
(55) Den Apfel verschlingt David.
the.ACC apple devours David.NOM
‘David is devouring the apple.’
Neither the finite verb nor the object den Apfel ‘the apple’ is realized within the VP. The
finite verb is realized in C and, as a co-head of the VP, contributes its f-structure
information to the f-structure it shares with the VP. The NP in the prefield adds information
to the f-structure of the sentence under TOPIC, which is one of the options to resolve DF,
and the TOPIC value is identified with the OBJ grammatical function via the functional
uncertainty equation (↑ DF) = (↑ COMP* GF).
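[Figure 7.5: c-structure and f-structure of (55). The NP den Apfel is the daughter of CP annotated with (↑ DF) = ↓ and (↑ DF) = (↑ COMP* GF); the finite verb is in C; the VP contains only the subject NP, annotated with (↑ SUBJ) = ↓. F-structure: PRED ‘VERSCHLINGEN⟨SUBJ,OBJ⟩’; SUBJ [PRED ‘DAVID’, CASE nom]; TENSE PRES; TOPIC = OBJ [PRED ‘APFEL’, CASE acc].]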
7.6 Summary and classification
Criterion and Case Theory in GB. LFG can explicitly differentiate between subjects and
non-subjects. In GB, on the other hand, a clear distinction is made between external and
internal arguments (see Williams 1984: Section 1.2). In some variants of GB, as well as
in HPSG and CxG, the argument with subject properties (if there is one) is marked ex-
plicitly (Haider 1986a; Heinz & Matiasek 1994; Müller 2003b; Michaelis & Ruppenhofer
2001). This special argument is referred to as the designated argument. In infinitival con-
structions, subjects are often not expressed inside the infinitival phrase. Nevertheless,
the unexpressed subject is usually coreferential with an argument of the matrix verb:
This is a fact that every theory needs to be able to capture, that is, every theory must be
able to differentiate between subjects and non-subjects.
For a comparison of GB/Minimalism and LFG/HPSG, see Kuhn (2007).
Comprehension questions
Exercises
Further reading
Section 7.1 was based extensively on the textbook and introductory article of
Dalrymple (2001; 2006). Additionally, I have drawn from teaching materials of
Jonas Kuhn from 2007. Bresnan (2001) is a comprehensive textbook in English for
the advanced reader. Some of the more in-depth analyses of German in LFG are
Berman (1996; 2003a). Schwarze & de Alencar (2016) is an introduction to LFG
that uses French examples. The authors demonstrate how the XLE system can be
used for the development of a French LFG grammar. The textbook also discusses
the Finite State Morphology component that comes with the XLE system.
Levelt (1989) developed a model of language production based on LFG. Pinker
(1984) – one of the best-known researchers on language acquisition – used LFG
as the model for his theory of acquisition. For another theory on first and second
language acquisition that uses LFG, see Pienemann (2005).
Wechsler & Asudeh (2020) compare LFG with HPSG.
8 Categorial Grammar
Categorial Grammar is the second oldest of the approaches discussed in this book. It was
developed in the 1930s by the Polish logician Kazimierz Ajdukiewicz (Ajdukiewicz 1935).
Since syntactic and semantic descriptions are tightly connected and all syntactic combi-
nations correspond to semantic ones, Categorial Grammar is popular amongst logicians
and semanticists. Some stellar works in the field of semantics making use of Catego-
rial Grammar are those of Richard Montague (1974). Other important works come from
David Dowty in Columbus, Ohio (1979), Michael Moortgat in Utrecht (1989), Glyn Morrill
in Barcelona (1994), Bob Carpenter in New York (1998) and Mark Steedman in Edinburgh
(1991; 1997; 2000). A large fragment for German using Montague Grammar has been de-
veloped by von Stechow (1979). The 2569-page grammar of the Institut für Deutsche
Sprache in Mannheim (Eroms, Stickel & Zifonun 1997) contains Categorial Grammar
analyses in the relevant chapters. Fanselow (1981) worked on morphology in the frame-
work of Montague Grammar. Uszkoreit (1986a), Karttunen (1986; 1989) and Calder, Klein
& Zeevat (1988) developed combinations of unification-based approaches and Categorial
Grammar.
The basic operations for combining linguistic objects are rather simple and well-under-
stood so that it is no surprise that there are many systems for the development and pro-
cessing of Categorial Grammars (Yampol & Karttunen 1990; Carpenter 1994; Bouma &
van Noord 1994; Lloré 1995; König 1999; Moot 2002; White & Baldridge 2003; Baldridge,
Chatterjee, Palmer & Wing 2007; Morrill 2012; 2017). An important contribution has
been made by Mark Steedman’s group (see for instance Clark, Hockenmaier & Steed-
man 2002; Clark & Curran 2007).
Implemented fragments exist for the following languages:
• German (Uszkoreit 1986a; König 1999; Vierhuff, Hildebrandt & Eikmeyer 2003;
Vancoppenolle, Tabbert, Bouma & Stede 2011)
In addition, Baldridge, Chatterjee, Palmer & Wing (2007: 15) mention an implementation
for Classical Arabic.
Some of the systems for the processing of Categorial Grammars have been augmented
by probabilistic components so that the processing is robust (Osborne & Briscoe 1997;
Clark, Hockenmaier & Steedman 2002). Some systems can derive lexical items from
corpora, and Briscoe (2000) and Villavicencio (2002) use statistical information in their
UG-based language acquisition models.
8.1 General remarks on the representational format
chased    Mary
vp/np     np
-------------- >
vp

        vp
   vp/np    np
  chased   Mary
One usually assumes left associativity for ‘/’; that is, (vp/pp)/np = vp/pp/np.
If we look at the lexical entries in (1), it becomes apparent that the category v does
not appear. The lexicon only determines what the product of combination of a lexical
entry with its arguments is. The symbol for vp can also be eliminated: an (English) vp is
something that requires an NP to its left in order to form a complete sentence. This can
be represented as s\np. Using the rule for backward application, it is possible to compute
derivations such as the one in Figure 8.3.
(3) Backward application:
Y ∗ X\Y = X
In Categorial Grammar, there is no explicit difference made between phrases and words:
an intransitive verb is described in the same way as a verb phrase with an object: s\np.
Equally, proper nouns are complete noun phrases, which are assigned the symbol np.
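The two application rules can be made concrete with a minimal sketch. The encoding is an assumption chosen for this illustration only: atomic categories are strings, and a complex category is a triple of result, slash and argument.

```python
# A category is either an atomic symbol (a string) or a triple
# (result, slash, argument); "/" looks to the right, "\\" to the left.

def forward(functor, argument):
    """Forward application: X/Y * Y = X. Returns None if the rule does not apply."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def backward(argument, functor):
    """Backward application: Y * X\\Y = X. Returns None if the rule does not apply."""
    if isinstance(functor, tuple) and functor[1] == "\\" and functor[2] == argument:
        return functor[0]
    return None

VP = ("s", "\\", "np")         # s\np: needs an NP to its left
CHASED = (VP, "/", "np")       # (s\np)/np: transitive verb

vp = forward(CHASED, "np")     # chased Mary
sentence = backward("np", vp)  # a subject NP followed by [chased Mary]
print(vp, sentence)
```

Since an intransitive verb and a verb with its object both have the category s\np, the last step does not need to know whether vp is a word or a phrase.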
8.1.2 Semantics
As already mentioned, Categorial Grammar is particularly popular among semanticists
as syntactic combinations always result in parallel semantic combinations and even for
complex combinations such as those we will discuss in more detail in the following sec-
tions, there is a precise definition of meaning composition. In the following, we will take
a closer look at the representational format discussed in Steedman (1997: Section 2.1.2).
Steedman proposes the following lexical entry for the verb eats:1
(4) eats := (s: eat′(x, y)\np3S: x)/np: y
In (4), the meaning of each category is given after the colon. Since nothing is known
about the meaning of the arguments in the lexical entry of eat, the meaning is repre-
sented by the variables 𝑥 and 𝑦. When the verb combines with an NP, the denotation of
the NP is inserted. An example is given in (5):2
(5) (s: eat′(x, y)\np3S: x)/np: y    np: apples′
    ------------------------------------------- >
    s: eat′(x, apples′)\np3S: x
When combining a functor with an argument, it must be ensured that the argument fits
the functor, that is, it must be unifiable with it (for more on unification see Section 6.6).
The unification of np: y with np: apples′ results in np: apples′ since apples′ is more specific
than the variable y. Apart from its occurrence in the term np: y, y occurs in the description
of the verb in another position (s: eat′(x, y)\np3S: x) and therefore also receives the value
apples′ there. Thus, the result of this combination is s: eat′(x, apples′)\np3S: x as shown
in (5).
Steedman notes that this notation becomes less readable with more complex deriva-
tions and instead uses the more standard 𝜆-notation:
(6) eats := (s\np3S)/np: 𝜆y.𝜆x.eat′(x, y)
Lambdas are used to allow access to open positions in complex semantic representations
(see Section 2.3). A semantic representation such as 𝜆y.𝜆x.eat′(x, y) can be combined
with the representation of apples by removing the first lambda expression and inserting
the denotation of apples in all the positions where the corresponding variable (in this
case, y) appears (see Section 2.3 for more on this point):
(7) 𝜆y.𝜆x.eat′(x, y) apples′
    𝜆x.eat′(x, apples′)
This removal of lambda expressions is called 𝛽-reduction.
If we use the notation in (6), the combinatorial rules must be modified as follows:
(8) X/Y:f * Y:a = X: f a
Y:a * X\Y:f = X: f a
In such rules, the semantic contribution of the argument (a) is written after the seman-
tic denotation of the functor (f). The open positions in the denotation of the functor
are represented using lambdas. The argument can be combined with the first lambda
expression using 𝛽-reduction.
Figure 8.4 on the facing page shows the derivation of a simple sentence with a transi-
tive verb. After forward and backward application, 𝛽-reduction is immediately applied.
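The rules in (8) can be sketched as follows. This is an illustration under simplifying assumptions: a sign is a pair of category and meaning, functor meanings are Python functions, the agreement information on np is omitted, and the subject Harry is chosen freely for the example. Calling the function on its argument plays the role of 𝛽-reduction.

```python
# A sign pairs a category with a meaning. Categories are atoms (strings) or
# triples (result, slash, argument); meanings of functors are Python functions.

def forward(functor, argument):
    """X/Y:f * Y:a = X: f a"""
    (cat, f), (arg_cat, a) = functor, argument
    if isinstance(cat, tuple) and cat[1] == "/" and cat[2] == arg_cat:
        return (cat[0], f(a))  # applying f to a performs the beta-reduction
    raise ValueError("forward application does not apply")

def backward(argument, functor):
    """Y:a * X\\Y:f = X: f a"""
    (arg_cat, a), (cat, f) = argument, functor
    if isinstance(cat, tuple) and cat[1] == "\\" and cat[2] == arg_cat:
        return (cat[0], f(a))
    raise ValueError("backward application does not apply")

# eats := (s\np)/np : λy.λx.eat'(x, y)   (agreement features omitted)
eats = ((("s", "\\", "np"), "/", "np"), lambda y: lambda x: f"eat'({x}, {y})")
apples = ("np", "apples'")
harry = ("np", "harry'")        # hypothetical subject

vp = forward(eats, apples)      # s\np : λx.eat'(x, apples')
sentence = backward(harry, vp)  # s : eat'(harry', apples')
print(sentence)
```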
1
I have adapted his notation to correspond to the one used in this book.
2
The assumption that apples means apples′ and not apples′ (z) minus the quantifier contribution is a simpli-
fication here.
8.1.3 Adjuncts
As noted in Section 1.6, adjuncts are optional. In phrase structure grammars, this can be
captured, for example, by rules that have a certain element (for instance a VP) on the
left-hand side of the rule and the same element and an adjunct on the right-hand side of
the rule. Since the symbol on the left is the same as the one on the right, this rule can be
applied arbitrarily many times. (9) shows some examples of this:
(9) a. VP → VP PP
b. Noun → Noun PP
One can analyze an arbitrary number of PPs following a VP or noun using these rules.
In Categorial Grammar, adjuncts have the following general form: X\X or X/X. Ad-
jectives are modifiers, which must occur before the noun. They have the category n/n.
Modifiers occurring after nouns (prepositional phrases and relative clauses) have the
category n\n instead.3 For VP-modifiers, X is replaced by the symbol for the VP (s\np)
and this yields the relatively complex expression (s\np)\(s\np). Adverbials in English are
VP-modifiers and have this category. Prepositions that can be used in a PP modifying
a verb require an NP in order to form a complete PP and therefore have the category
((s\np)\(s\np))/np. Figure 8.5 on the next page gives an example of an adverb (quickly)
and a preposition (round). Note that the result of the combination of round and the gar-
den corresponds to the category of the adverb ((s\np)\(s\np)). In GB theory, adverbs and
prepositions were also placed into a single class (see page 94). This overarching class
was then divided into subclasses based on the valence of the elements in question.
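That modifiers of the form X\X can be iterated follows directly from backward application, as the following sketch shows. The tuple encoding of categories and the choice of an intransitive verb as the starting point are assumptions made for the illustration.

```python
def forward(functor, argument):
    """Forward application: X/Y * Y = X."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def backward(argument, functor):
    """Backward application: Y * X\\Y = X."""
    if isinstance(functor, tuple) and functor[1] == "\\" and functor[2] == argument:
        return functor[0]
    return None

VP = ("s", "\\", "np")        # s\np
ADV = (VP, "\\", VP)          # (s\np)\(s\np): VP-modifier such as quickly
PREP = (ADV, "/", "np")       # ((s\np)\(s\np))/np: preposition such as round

pp = forward(PREP, "np")      # round + [the garden]
vp = VP                       # an intransitive verb (hypothetical starting point)
for modifier in [ADV, pp, pp]:
    # Each modifier maps s\np to s\np again, so the loop can run any number of times.
    vp = backward(vp, modifier)
print(pp == ADV, vp == VP)
```

The comparison pp == ADV mirrors the observation that the combination of round and the garden has the same category as the adverb.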
8.2 Passive
In Categorial Grammar, the passive is analyzed by means of a lexical rule (Dowty 1978:
412; Dowty 2003: Section 3.4). (10) shows the rule in Dowty (2003: 49).
(10) Syntax: 𝛼 ∈ (s\np)/np → PST-PART(𝛼) ∈ PstP/np𝑏𝑦
     Semantics: 𝛼′ → 𝜆y𝜆x.𝛼′(y)(x)
3
In Categorial Grammar, there is no category symbol like X̄ for intermediate projections of X̄ theory. So
rather than assuming N̄/N̄, CG uses n/n. See Exercise 2.
Here, PstP stands for past participle and np𝑏𝑦 is an abbreviation for a verb phrase mod-
ifier of the form vp\vp or rather (s\np)\(s\np). The rule says the following: if a word
belongs to the set of words with the category (s\np)/np, then the word with past partici-
ple morphology also belongs in the set of words with the category PstP/np𝑏𝑦 .
(11a) shows the lexical entry for the transitive verb touch and (11b) the result of rule
application:
(11) a. touch: (s\np)/np
b. touched: PstP/np𝑏𝑦
The auxiliary was has the category (s\np)/PstP and the preposition by has the category
np𝑏𝑦 /np, or its unabbreviated form ((s\np)\(s\np))/np. In this way, (12) can be analyzed
as in Figure 8.6.
(12) John was touched by Mary.
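The syntactic side of the rule in (10) and the derivation of (12) can be traced step by step in a small sketch. The encoding of categories as tuples and the dictionary used as a lexicon are assumptions for this illustration; the semantic part of the rule is left out.

```python
def forward(functor, argument):
    """Forward application: X/Y * Y = X."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def backward(argument, functor):
    """Backward application: Y * X\\Y = X."""
    if isinstance(functor, tuple) and functor[1] == "\\" and functor[2] == argument:
        return functor[0]
    return None

VP = ("s", "\\", "np")
NP_BY = (VP, "\\", VP)     # np_by abbreviates (s\np)\(s\np)
TV = (VP, "/", "np")       # (s\np)/np

def passive(category):
    """Lexical rule (10), syntax only: (s\\np)/np -> PstP/np_by."""
    if category != TV:
        raise ValueError("the rule applies to (s\\np)/np only")
    return ("PstP", "/", NP_BY)

lexicon = {
    "John": "np",
    "Mary": "np",
    "was": (VP, "/", "PstP"),
    "by": (NP_BY, "/", "np"),
    "touch": TV,
}
lexicon["touched"] = passive(lexicon["touch"])

by_mary = forward(lexicon["by"], lexicon["Mary"])   # np_by
participle = forward(lexicon["touched"], by_mary)   # PstP
vp = forward(lexicon["was"], participle)            # s\np
sentence = backward(lexicon["John"], vp)            # s
print(sentence)
```

Applying passive to the category of gave, ((s\np)/pp)/np, raises an error in this sketch, which corresponds to the problem with (13) discussed in the following paragraphs.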
The question as to how to analyze the pair of sentences in (13) still remains unanswered.4
(13) a. He gave the book to Mary.
b. The book was given to Mary.
gave has the category ((s\np)/pp)/np, that is, the verb must first combine with an NP
(the book) and a PP (to Mary) before it can be combined with the subject. The problem
is that the rule in (10) cannot be applied to gave with a to-PP since the pp argument is
sandwiched between both np arguments in ((s\np)/pp)/np. One would have to generalize
the rule in (10) somehow by introducing new technical means5 or assume additional rules
for cases such as (13b).
4
Thanks to Roland Schäfer (p. c., 2009) for pointing out these data to me.
8.3 Verb position
Steedman uses the feature SUB to differentiate between subordinate and non-subordinate
sentences. Both lexical items are related via lexical rules.
One should note here that the NPs are combined with the verb in different orders. The
normal order is:
The corresponding derivations for German sentences with a bivalent verb are shown in
Figures 8.7 and 8.8.
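er         ihn        isst
np[nom]    np[acc]    (s+SUB\np[nom])\np[acc]
           ---------------------------------- <
           s+SUB\np[nom]
--------------------------------------------- <
s+SUB
Figure 8.7

isst                       er         ihn
(s−SUB/np[acc])/np[nom]    np[nom]    np[acc]
------------------------------------- >
s−SUB/np[acc]
------------------------------------------------ >
s−SUB
Figure 8.8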
In Figure 8.7, the verb is first combined with an accusative object, whereas in Fig-
ure 8.8, the verb is first combined with the subject. For criticism of these kinds of analy-
ses with variable branching, see Netter (1992) and Müller (2005b; 2017a).
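The difference in branching between the two derivations can be reproduced with a short sketch. The tuple encoding and the string notation for features (s[+SUB], np[nom] and so on) are assumptions made for the illustration and are not Steedman's notation.

```python
def forward(functor, argument):
    """Forward application: X/Y * Y = X."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def backward(argument, functor):
    """Backward application: Y * X\\Y = X."""
    if isinstance(functor, tuple) and functor[1] == "\\" and functor[2] == argument:
        return functor[0]
    return None

# Two lexical items for isst, as in Figures 8.7 and 8.8.
FINAL = (("s[+SUB]", "\\", "np[nom]"), "\\", "np[acc]")
INITIAL = (("s[-SUB]", "/", "np[acc]"), "/", "np[nom]")

# er ihn isst: the verb first takes the accusative object to its left.
after_object = backward("np[acc]", FINAL)
final_clause = backward("np[nom]", after_object)

# isst er ihn: the verb first takes the subject to its right.
after_subject = forward(INITIAL, "np[nom]")
initial_clause = forward(after_subject, "np[acc]")
print(final_clause, initial_clause)
```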
5
Baldridge (p. c. 2010) suggests using regular expressions in a general lexical rule for passive.
Jacobs (1991) developed an analysis which corresponds to the verb movement analysis
in GB. He assumes verb-final structures, that is, there is a lexical entry for verbs where
arguments are selected to the left of the verb. A transitive verb would therefore have
the entry in (16a). Additionally, there is a trace in verb-final position that requires the
arguments of the verb and the verb itself in initial position. (16b) shows what the verb
trace looks like for a transitive verb in initial position:
(16) a. Verb in final position:
(s\np[nom])\np[acc]
b. Verb trace for the analysis of verb-first:
((s\((s\np[nom])\np[acc]))\np[nom])\np[acc]
The entry for the verb trace is very complex. It is probably simpler to examine the
analysis in Figure 8.9.
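isst                   er         ihn        _
(s\np[nom])\np[acc]    np[nom]    np[acc]    ((s\((s\np[nom])\np[acc]))\np[nom])\np[acc]
                                  ------------------------------------------------------- <
                                  (s\((s\np[nom])\np[acc]))\np[nom]
                       ------------------------------------------------------------------ <
                       s\((s\np[nom])\np[acc])
----------------------------------------------------------------------------------------- <
s
Figure 8.9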
The trace is the head in the entire analysis: it is first combined with the accusative
object and then with the subject. In a final step, it is combined with the transitive verb in
initial position.6 A problem with this kind of analysis is that the verb isst ‘eats’, as well
as er ‘he’ and ihn ‘him’/‘it’, are arguments of the verb trace in (17).
(17) Morgen [isst [er [ihn _]]]
tomorrow eats he him
‘He will eat it/him tomorrow.’
Since adjuncts can occur before, after or between arguments of the verb in German,
one would expect that morgen ‘tomorrow’ can occur before the verb isst, since isst is
just a normal argument of the verbal trace in final position. As adjuncts do not change
the categorial status of a projection, the phrase morgen isst er ihn ‘tomorrow he eats
him’ should be able to occur in the same positions as isst er ihn. This is not the case,
however. If we replace isst er ihn by morgen isst er ihn in (18a), the result is (18b), which
is ungrammatical.
(18) a. Deshalb isst er ihn.
therefore eats he him
‘Therefore he eats it/him.’
b. * Deshalb morgen isst er ihn.
therefore tomorrow eats he him
6
See Netter (1992) for a similar analysis in HPSG.
An approach which avoids this problem comes from Kiss & Wesche (1991) (see Sec-
tion 9.3). Here, the authors assume that there is a verb in initial position which selects a
projection of the verb trace. If adverbials are only combined with verbs in final position,
then a direct combination of morgen ‘tomorrow’ and isst er ihn ‘eats he it’ is ruled out.
If one assumes that the verb in first position is the functor, then it is possible to capture
the parallels between complementizers and verbs in initial position (Höhle 1997): finite
verbs in initial position differ from complementizers only in requiring a projection of a
verb trace, whereas complementizers require projections of overt verbs:
(19) a. dass [er ihn isst]
that he it eats
b. Isst [er ihn _ ]
eats he it
This description of verb position in German captures the central insights of the GB analy-
sis in Section 3.2.
8.4 Local reordering
8.5 Long-distance dependencies
These rules will be explained using forward composition as an example. (23a) can be
understood as follows: X/Y more or less means: if I find a Y, then I am a complete X.
In the combinatorial rule, X/Y is combined with Y/Z. Y/Z stands for a Y that is not yet
complete and is still missing a Z. The requirement that Y must find a Z in order to be
complete is postponed: we pretend that Y is complete and use it anyway, but we still
bear in mind that something is actually still missing. Hence, if we combine X/Y with
Y/Z, we get something which becomes an X when combined with a Z.
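The idea of postponing a requirement can be stated in a few lines of code. The following sketch assumes the tuple encoding of categories used for illustration purposes and the usual form of forward type raising, X ⇒ T/(T\X); neither is taken from a particular implementation.

```python
def forward(functor, argument):
    """Forward application: X/Y * Y = X."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def compose(left, right):
    """Forward composition: X/Y * Y/Z = X/Z."""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

def type_raise(x, target="s"):
    """Forward type raising (assumed standard form): X => T/(T\\X)."""
    return (target, "/", (target, "\\", x))

eats = (("s", "\\", "np"), "/", "np")   # (s\np)/np
harry = type_raise("np")                 # s/(s\np)

# Harry needs an s\np, and eats would be an s\np if it had found its object.
# Composition postpones the object requirement: the result is s/np.
harry_eats = compose(harry, eats)
complete = forward(harry_eats, "np")     # supplying the missing NP yields s
print(harry_eats, complete)
```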
infinitive form, have requires a participle and been must combine with a present partici-
ple. In the above figure, the arrow with a small ‘T’ stands for type raising, whereas the
arrows with a ‘B’ indicate composition. The direction of composition is shown by the
direction of the arrow.
For the analysis of (21a), we are still missing one small detail, a rule that turns the
NP at the beginning of the sentence into a functor which can be combined with s/np.
Normal type raising cannot handle this because it would produce s/(s\np) when s/(s/np)
is required.
Steedman (1989a: 217) suggests the rule in (24):
(24) Topicalization (↑):
X ⇒ st/(s/X)
where X ∈ { np, pp, vp, ap, s′ }
st stands for a particular type of sentence (s), namely one with topicalization (t). The ⇒
expresses that one can type raise any X into an st/(s/X).
If we replace X with np, we can turn these apples into st/(s/np) and complete the
analysis of (21a) as shown in Figure 8.11 on the next page. The mechanism presented here
will of course also work for dependencies that cross sentence boundaries. Figure 8.12 on
the following page shows the analysis for (25):
(25) Apples, I believe that Harry eats.
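A derivation for (25) can be checked mechanically. In the following sketch, the categories for believe and that, the tuple encoding and the form of type raising are assumptions made for the illustration; only the topicalization rule is taken from (24).

```python
from functools import reduce

def forward(functor, argument):
    """Forward application: X/Y * Y = X."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def compose(left, right):
    """Forward composition: X/Y * Y/Z = X/Z."""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

def type_raise(x, target="s"):
    """Forward type raising (assumed standard form): X => T/(T\\X)."""
    return (target, "/", (target, "\\", x))

def topicalize(x):
    """Rule (24): X => st/(s/X) for X in {np, pp, vp, ap, s'}."""
    if x not in {"np", "pp", "vp", "ap", "s'"}:
        raise ValueError("category cannot be topicalized")
    return ("st", "/", ("s", "/", x))

VP = ("s", "\\", "np")
words = [
    type_raise("np"),   # I
    (VP, "/", "s'"),    # believe (assumed category)
    ("s'", "/", "s"),   # that (assumed category)
    type_raise("np"),   # Harry
    (VP, "/", "np"),    # eats
]

# Composing from left to right yields a sentence that lacks an NP at its right edge.
remainder = reduce(compose, words)
result = forward(topicalize("np"), remainder)   # Apples + remainder
print(remainder, result)
```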
Using the previously described tools, it is, however, only possible to describe extractions
where the fronted element in the sentence would have occurred at the right edge of the
phrase without fronting. This means it is not possible to analyze sentences where the
middle argument of a ditransitive verb has been extracted (Steedman 1985: 532). Pollard
(1988: 406) provides the derivation in Figure 8.13 for (26).
(26) Fido we put downstairs.
In this analysis, it is not possible to combine we and put using the rule in (23a) since
(s\np) is not directly accessible: breaking down ((s\np)/pp)/np into functor and argument
gives us ((s\np)/pp) and np. In order to deal with such cases, we need another variant of
composition:
If we assume that verbs can have up to four arguments (e.g., buy: buyer, seller, goods,
price), then it would be necessary to assume a further rule for composition as well as
another topicalization rule. Furthermore, one requires a topicalization rule for subject
extraction (Pollard 1988: 405). Steedman has developed a notation which provides a
compact representation of the previously discussed rules, but if one considers what exactly
these representations stand for, one still arrives at the same number of rules that have
been discussed here.
8.6 Summary and classification
(30) (n\n)/(s/np)
This means the following: if there is a sentence missing an NP to the right of a relative
pronoun, then the relative pronoun can form an N-modifier (n\n) with this sentence.
The relative pronoun is the head (functor) in this analysis.
Figure 8.14: Categorial Grammar analysis of a relative clause with long-distance dependency
Utilizing both additional operations of type raising and composition, the examples
with relative clauses can be analyzed as shown in Figure 8.14. The lexical entries for the
verbs correspond to what was discussed in the preceding sections: married is a normal
transitive verb and says is a verb that requires a sentential complement and forms a VP
(s\np) with it. This VP yields a sentence when combined with an NP. The noun phrases in
Figure 8.14 have been type raised. Using forward composition, it is possible to combine
Anna and married to yield s/np. This is the desired result: a sentence missing an NP to its
right. Manny and says and then Manny says and Anna married can also be combined via
forward composition and we then have the category s/np for Manny says Anna married.
This category can be combined with the relative pronoun using forward application and
we then arrive at n\n, which is exactly the category for postnominal modifiers.
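The steps just described can be replayed in a short sketch. The tuple encoding of categories and the form of type raising are assumptions for the illustration; the category of the relative pronoun is the one in (30).

```python
def forward(functor, argument):
    """Forward application: X/Y * Y = X."""
    if isinstance(functor, tuple) and functor[1] == "/" and functor[2] == argument:
        return functor[0]
    return None

def backward(argument, functor):
    """Backward application: Y * X\\Y = X."""
    if isinstance(functor, tuple) and functor[1] == "\\" and functor[2] == argument:
        return functor[0]
    return None

def compose(left, right):
    """Forward composition: X/Y * Y/Z = X/Z."""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return (left[0], "/", right[2])
    return None

def type_raise(x, target="s"):
    """Forward type raising (assumed standard form): X => T/(T\\X)."""
    return (target, "/", (target, "\\", x))

VP = ("s", "\\", "np")
REL_PRONOUN = (("n", "\\", "n"), "/", ("s", "/", "np"))   # (30)
married = (VP, "/", "np")    # transitive verb
says = (VP, "/", "s")        # verb with a sentential complement

anna_married = compose(type_raise("np"), married)   # s/np
manny_says = compose(type_raise("np"), says)        # s/s
clause = compose(manny_says, anna_married)          # s/np
modifier = forward(REL_PRONOUN, clause)             # n\n
noun = backward("n", modifier)                      # noun + relative clause
print(clause, modifier, noun)
```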
However, the assumption that the relative pronoun constitutes the head is problematic
since one has to then go to some lengths to explain pied-piping constructions such as
those in (31).
(31) a. Here’s the minister [[in [the middle [of [whose sermon]]]] the dog barked].7
b. Reports [the height of the lettering on the covers of which] the government
prescribes should be abolished.8
In (31), the relative pronoun is embedded in a phrase that has been extracted from the
rest of the relative clause. The relative pronoun in (31a) is the determiner of sermon.
Depending on the analysis, whose is the head of the phrase whose sermon. The NP is
embedded under of and the phrase of whose sermon depends on middle. The entire NP
the middle of whose sermon is a complement of the preposition in. It would be quite a stretch
to claim that whose is the head of the relative clause in (31a). The relative pronoun in
(31b) is even more deeply embedded. Steedman (1997: 50) gives the following lexical
entries for who, whom and which:
7
Pollard & Sag (1994: 212).
8
Ross (1967: 109).
9
(Bech 1955: 79). See Haider (1985a) and Müller (1999b: Section 10.7) for a discussion of pied-piping in relative
clauses with fronted verbal projections.
sentences. And furthermore it must be possible to build a phrase of the respective cate-
gory containing a relative pronoun. In the case of adverbial relative phrases, the relative
phrase has to be the adverbial element. Since how does not function as relative element,
it cannot appear as relative phrase. German has wie as relative element and hence we
have examples like (39d).
Comprehension questions
Exercises
Compare the resulting analysis with the structure given in Figure 2.4 on
page 67 and think about which categories of X̄ syntax the categories in Cate-
gorial Grammar correspond to.
Further reading
9 Head-Driven Phrase Structure Grammar
Head-Driven Phrase Structure Grammar (HPSG) was developed by Carl Pollard and Ivan
Sag in the mid-1980s at Stanford and in the Hewlett-Packard research laboratories in Palo
Alto (Pollard & Sag 1987; 1994; see Flickinger et al. 2020 for more on the history of
HPSG). Like LFG, HPSG is part of so-called West Coast linguistics. Another similarity
to LFG is that HPSG aims to provide a theory of competence which is compatible with
performance (Sag & Wasow 2011; 2015; Wasow 2020, see also Chapter 15).
The formal properties of the description language for HPSG grammars are well-un-
derstood and there are many systems for processing such grammars (Dörre & Seiffert
1991; Dörre & Dorna 1993; Popowich & Vogel 1991; Uszkoreit, Backofen, Busemann, Di-
agne, Hinkelman, Kasper, Kiefer, Krieger, Netter, Neumann, Oepen & Spackman 1994;
Erbach 1995; Schütz 1996; Schmidt, Theofilidis, Rieder & Declerck 1996; Schmidt, Rieder
& Theofilidis 1996; Uszkoreit, Backofen, Calder, Capstick, Dini, Dörre, Erbach, Estival,
Manandhar, Mineur & Oepen 1996; Müller 1996c; 2004d; Carpenter & Penn 1996; Penn
& Carpenter 1999; Götz, Meurers & Gerdemann 1997; Copestake 2002; Callmeier 2000;
Dahllöf 2003; Meurers, Penn & Richter 2002; Penn 2004; Müller 2007d; Sato 2008; Kauf-
mann 2009; Slayden 2012; Packard 2015).1 Currently, the LKB system by Ann Copestake
and the TRALE system, which was developed by Gerald Penn (Meurers, Penn & Richter
2002; Penn 2004), have the most users. The DELPH-IN consortium – whose grammar
fragments are based on the LKB – and various TRALE users have developed many small
and some large grammar fragments of various languages. The following is a list of im-
plementations in different systems:
• Arabic (Haddar, Boukedi & Zalila 2010; Hahn 2011; Masum, Islam, Rahman &
Ahmed 2012; Boukedi & Haddar 2014; Loukam, Balla & Laskri 2015; Arad Greshler,
Herzig Sheinfux, Melnik & Wintner 2015),
• Bengali (Paul 2004; Islam, Hasan & Rahman 2012),
• Bulgarian (Simov, Osenova, Simov & Kouylekov 2004; Osenova 2010a,b; 2011),
• Cantonese (Fan, Song & Bond 2015),
• Danish (Ørsnes 1995; 2009b; Neville & Paggio 2004; Müller 2009c; Müller & Ørsnes
2011; Müller 2012b; Müller & Ørsnes 2015),
1
Uszkoreit et al. (1996) and Bolc et al. (1996) compare systems that were available or were developed at the
beginning of the 1990s. Melnik (2007) compares LKB and TRALE. See also Müller (2015c: Section 5.1).
• Dutch (van Noord & Bouma 1994; Bouma, van Noord & Malouf 2001; Fokkens
2011),
• German (Kiss 1991; Netter 1993; 1996; Meurers 1994; Hinrichs et al. 1997; Kordoni
1999; Tseng 2000; Geißler & Kiss 1994; Keller 1994; Müller 1996c; 1999b; Müller &
Kasper 2000; Crysmann 2003; 2005b,c; Müller 2007a; Kaufmann & Pfister 2007;
2008; Kaufmann 2009; Fokkens 2011),
• English (Copestake & Flickinger 2000; Flickinger, Copestake & Sag 2000; Flickin-
ger 2000; Dahllöf 2002; 2003; De Kuthy & Meurers 2003a; Meurers, De Kuthy &
Metcalf 2003; De Kuthy, Metcalf & Meurers 2004; Müller & Machicao y Priemer
2019; Müller 2018a),
• Esperanto (Li 1996),
• French (Tseng 2003),
• Ga (Kropp Dakubu, Hellan & Beermann 2007; Hellan 2007),
• Georgian (Abzianidze 2011),
• Greek (Kordoni & Neu 2005),
• Hausa (Crysmann 2005a; 2009; 2011; 2012; 2016),
• Hebrew (Melnik 2007; Haugereid, Melnik & Wintner 2013; Arad Greshler, Herzig
Sheinfux, Melnik & Wintner 2015),
• Indonesian (Moeljadi, Bond & Song 2015),
• Japanese (Siegel 2000; Siegel & Bender 2002; Bender & Siegel 2005; Siegel, Bender
& Bond 2016),
• Korean (Kim & Yang 2003; 2004; 2006; 2009; Kim, Sells & Yang 2007; Song, Kim,
Bond & Yang 2010; Kim, Yang, Song & Bond 2011),
• Maltese (Müller 2009b),
• Mandarin Chinese (Liu 1997; Ng 1997; Müller & Lipenkova 2009; 2013; Yang &
Flickinger 2014; Fan, Song & Bond 2015),
• Norwegian (Hellan & Haugereid 2003; Beermann & Hellan 2004; Hellan & Beer-
mann 2006; Haugereid 2017),
• Persian (Müller 2010b; Müller & Ghayoomi 2010),
• Polish (Przepiórkowski, Kupść, Marciniak & Mykowiecka 2002; Mykowiecka, Mar-
ciniak, Przepiórkowski & Kupść 2003),
• Portuguese (Branco & Costa 2008a,b; 2010),
• Russian (Avgustinova & Zhang 2009),
• Sahaptin (Drellishak 2009),
• Spanish (Pineda & Meza 2005a,b; Bildhauer 2008; Marimon 2013),
• Sign Language (German, French, British, Greek!) (Sáfár & Marshall 2002; Marshall
& Sáfár 2004; Sáfár & Glauert 2010),
• South African Sign Language (Bungeroth 2002),
• Turkish (Fokkens, Poulson & Bender 2009),
• Wambaya (Bender 2008a,c; 2010),
• Yiddish (Müller & Ørsnes 2011).
The first implemented HPSG grammar was a grammar of English developed in the Hew-
lett-Packard labs in Palo Alto (Flickinger, Pollard & Wasow 1985; Flickinger 1987). Gram-
mars for German were developed in Heidelberg, Stuttgart and Saarbrücken in the LILOG
project. Subsequently, grammars for German, English and Japanese were developed in
Heidelberg, Saarbrücken and Stanford in the Verbmobil project. Verbmobil was the largest
ever AI project in Germany. It was a machine translation project for spoken language
in the domains of trip planning and appointment scheduling (Wahlster 2000).
Currently there are two larger groups that are working on the development of gram-
mars: the DELPH-IN consortium (Deep Linguistic Processing with HPSG)2 and the group
that developed out of the network CoGETI (Constraintbasierte Grammatik: Empirie,
Theorie und Implementierung). Many of the grammar fragments that are listed above
were developed by members of DELPH-IN and some were derived from the Grammar
Matrix which was developed for the LKB to provide grammar writers with a typologi-
cally motivated initial grammar that corresponds to the properties of the language un-
der development (Bender, Flickinger & Oepen 2002). The CoreGram project3 is a similar
project that was started at the Freie Universität Berlin and which is now being run at the
Humboldt-Universität zu Berlin. It is developing grammars for German, Danish, Persian,
Maltese, Mandarin Chinese, Spanish, French, Welsh, and Yiddish that share a common
core. Constraints that hold for all languages are represented in one place and used by all
grammars. Furthermore there are constraints that hold for certain language classes and
again they are represented together and used by the respective grammars. So while the
Grammar Matrix is used to derive grammars that individual grammar writers can use,
adapt and modify to suit their needs, CoreGram really develops grammars for various
languages that are used simultaneously and have to stay in sync. A description of the
CoreGram can be found in Müller (2013b; 2015c).
There are systems that combine linguistically motivated analyses with statistics com-
ponents (Brew 1995; Miyao et al. 2005; Miyao & Tsujii 2008) or learn grammars or lexica
from corpora (Fouvry 2003; Cramer & Zhang 2009).
2
http://www.delph-in.net/. 2018-02-20.
3
https://hpsg.hu-berlin.de/Projects/CoreGram.html. 2020-09-02.
For further information on the interaction between HPSG and computational linguistics
see Bender & Emerson (2020).
9.1 General remarks on the representational format
information about metrical grids and weak or strong accents. See Bird & Klein (1994),
Orgun (1996), Höhle (1999), Walther (1999), Crysmann (2002: Chapter 6), and Bildhauer
(2008) for phonology in the framework of HPSG. The details of the description in (1) will
be explained in the following sections.
HPSG has adopted various insights from other theories and newer analyses have been
influenced by developments in other theoretical frameworks. Functor-argument struc-
tures, the treatment of valence information and function composition have been adopted
from Categorial Grammar. Function composition plays an important role in the analy-
sis of verbal complexes in languages like German and Korean. The Immediate Domi-
nance/Linear Precedence format (ID/LP format, see Section 5.1.2) as well as the Slash
mechanism for long-distance dependencies (see Section 5.4) both come from GPSG. The
analysis assumed here for verb position in German is inspired by the one that was de-
veloped in the framework of Government & Binding (see Section 3.2). Starting in 1995,
HPSG also incorporated insights from Construction Grammar (Sag 1997, see also Sec-
tion 10.6.2 on Sign-Based Construction Grammar, which is an HPSG variant).
valence class the verb belongs to. In Section 5.5, it was pointed out that morpholog-
ical processes need to refer to valence information. Hence, it is desirable to remove
redundant valence information from grammatical rules. For this reason, HPSG – like
Categorial Grammar – includes descriptions of the arguments of a head in the lexical
entry of that head. There are the features SPECIFIER (SPR) and COMPLEMENTS (COMPS),
whose values are lists containing descriptions of the elements that must combine with a
head in order to yield a complete phrase. (5) gives some examples for the verbs in (2):
(5) verb                      COMPS
    schlafen ‘to sleep’       ⟨ NP[nom] ⟩
    erwarten ‘to expect’      ⟨ NP[nom], NP[acc] ⟩
    sprechen ‘to speak’       ⟨ NP[nom], PP[über] ⟩
    geben ‘to give’           ⟨ NP[nom], NP[dat], NP[acc] ⟩
    dienen ‘to serve’         ⟨ NP[nom], NP[dat], PP[mit] ⟩
The table and the following figures use COMPS as a valence feature. Former versions of
HPSG (Pollard & Sag 1987) used the feature SUBCAT instead, which stands for subcatego-
rization. It is often said that a head subcategorizes for certain arguments. See page 91 for
more on the term subcategorization. Depending on the language, subjects are treated dif-
ferently from other arguments (see for example Chomsky (1981a: 26–28), Hoekstra (1987:
33)). The subject in SVO languages like English has properties that differ from those of
objects. For example, the subject is said to be an extraction island. This is not the case
for SOV languages like German and hence it is usually assumed that all arguments of
finite verbs are treated alike (Pollard 1996; Eisenberg 1994a: 376). Therefore the subject
is included in the lists above. I will return to English shortly.
Figure 9.1 shows the analysis for (6a) and the analysis for (6b) is in Figure 9.2 on the
next page:
V[COMPS ⟨⟩]
    1 NP[nom]: Peter ‘Peter’
    V[COMPS ⟨ 1 ⟩]: schläft ‘sleeps’
Figure 9.1: Analysis of Peter schläft ‘Peter sleeps’ in dass Peter schläft ‘that Peter sleeps’
9.1 General remarks on the representational format
In Figures 9.1 and 9.2, one element of the COMPS list is combined with its head in each local
tree. The elements that are combined with the selecting head are then no longer present
in the COMPS list of the mother node. V[COMPS ⟨ ⟩] corresponds to a complete phrase (VP
or S). The boxes with numbers show the structure sharing (see Section 6.4). Structure
sharing is the most important means of expression in HPSG. It plays a central role for
phenomena such as valence, agreement and long-distance dependencies. In the examples
above, 1 indicates that the description in the COMPS list is identical to another daughter in
the tree. The descriptions contained in valence lists are usually partial descriptions, that
is, not all properties of the argument are exhaustively described. Therefore, it is possible
that a verb such as schläft ‘sleeps’ can be combined with various kinds of linguistic
objects: the subject can be a pronoun, a proper name or a complex noun phrase; it only
matters that the linguistic object in question is complete (has an empty SPR list and an
empty COMPS list) and bears the correct case.5
V[COMPS ⟨⟩]
    1 NP[nom]
    V[COMPS ⟨ 1 ⟩]
        2 NP[acc]
        V[COMPS ⟨ 1 , 2 ⟩]
Figure 9.2: Analysis of (6b)
As mentioned above, researchers working on German usually assume that subjects and
objects should be treated similarly since they do not differ in fundamental ways as they
do in SVO languages like English. Hence, for German, all arguments are represented in
the same list. However, for SVO languages it proved useful to assume a special valence
feature for preverbal dependents (Borsley 1987; Pollard & Sag 1994: Chapter 9). The arguments
can be split into subjects, which are represented in the SPR list, and other arguments
(complements), which are represented in the COMPS list. The equivalent of our table for
German is given as (7):
(7) verb SPR COMPS ARG-ST
sleep ⟨ NP[nom] ⟩ ⟨⟩ ⟨ NP[nom] ⟩
expect ⟨ NP[nom] ⟩ ⟨ NP[acc] ⟩ ⟨ NP[nom], NP[acc] ⟩
speak ⟨ NP[nom] ⟩ ⟨ PP[about] ⟩ ⟨ NP[nom], PP[about] ⟩
give ⟨ NP[nom] ⟩ ⟨ NP[acc], NP[acc] ⟩ ⟨ NP[nom], NP[acc], NP[acc] ⟩
serve ⟨ NP[nom] ⟩ ⟨ NP[acc], PP[with] ⟩ ⟨ NP[nom], NP[acc], PP[with] ⟩
5
Furthermore, it must agree with the verb. This is not shown here.
V[SPR ⟨⟩, COMPS ⟨⟩]
    1 NP: Kim
    V[SPR ⟨ 1 ⟩, COMPS ⟨⟩]
        V[SPR ⟨ 1 ⟩, COMPS ⟨ 2 ⟩]: talks
        2 P[SPR ⟨⟩, COMPS ⟨⟩]
            P[SPR ⟨⟩, COMPS ⟨ 3 ⟩]: about
            3 N[SPR ⟨⟩, COMPS ⟨⟩]
                4 Det: the
                N[SPR ⟨ 4 ⟩, COMPS ⟨⟩]: summer
Figure 9.3: Analysis of Kim talks about the summer
A head is combined with all its complements first and then with its specifier.6 So, talks
is combined with about the summer and the resulting VP is combined with its subject
Kim. The SPR list works like the COMPS list: if an element is combined with its head, it is
not contained in the SPR list of the mother. Figure 9.3 also shows the analysis of an NP:
nouns select a determiner via SPR. The combination of the and summer is complete as
far as specifiers are concerned and hence the SPR list at the node for the summer is the
empty list. Nominal elements with empty SPR list and COMPS list will be abbreviated as
NP. Similarly, fully saturated linguistic objects with Ps as heads are PPs.
(7) provides the SPR and COMPS values of some example verbs. In addition it also pro-
vides the ARG-ST value. ARG-ST stands for argument structure and is a list of all arguments
of a head. This argument structure list plays a crucial role in establishing the connection
between syntax (valence) and semantics. The term for this is linking. We will deal with
linking in more detail in Section 9.1.6.
6
I present the analyses in a bottom-up way here but it is very important that HPSG does not make any
statements about the order in which linguistic objects are combined. This is crucial when it comes to
psycholinguistic plausibility of linguistic theories. See Chapter 15 for discussion.
After this brief discussion of English constituent structure, I will turn to German again
and ignore the SPR feature. The value of SPR in all the verbal structures that are discussed
in the following is the empty list.
NP
    Det: dem ‘the’
    N: Mann ‘man’
Figure 9.4: Analysis of dem Mann ‘the man’
(9) [ PHON ⟨ dem Mann ⟩
      HEAD-DTR [ PHON ⟨ Mann ⟩ ]
      NON-HEAD-DTRS ⟨ [ PHON ⟨ dem ⟩ ] ⟩ ]
In (9), there is exactly one head daughter (HEAD-DTR). The head daughter is always the
daughter containing the head. In a structure with the daughters das ‘the’ and Bild von
Maria ‘picture of Maria’, the latter would be the head daughter. In principle, there can be
multiple non-head daughters. If we were to assume a flat structure for a sentence with a
ditransitive verb, as in Figure 2.1 on page 54, we would have three non-head daughters.
It also makes sense to assume binary branching structures without heads (see Müller
2007a: Chapter 11 for an analysis of relative clauses). In such structures we would also
have more than one non-head daughter, namely exactly two.
Before it is shown how it is ensured that only those head-complement structures are
licensed in which the argument matches the requirements of the head, I will present the
7
However, phrase structure rules are used in some computer implementations of HPSG in order to improve
the efficiency of processing.
general structure of feature descriptions in HPSG. The structure presented at the start
of this chapter is repeated in (10) with all the details relevant to the present discussion:
(10) [ word
       PHON ⟨ Grammatik ⟩
       SYNSEM [ LOC [ local
                      CAT [ category
                            HEAD [ noun
                                   CASE 1 ]
                            SPR ⟨ DET[CASE 1 ] ⟩
                            COMPS ⟨⟩ ]
                      CONT [ mrs
                             IND 2 [ PER third
                                     NUM sg
                                     GEN fem ]
                             RELS ⟨ [ grammatik
                                      INST 2 ] ⟩ ] ]
                NONLOC [ INHER|SLASH ⟨⟩
                         TO-BIND|SLASH ⟨⟩ ] ] ]
In the outer layer, there are the features PHON and SYNSEM. As previously mentioned,
PHON contains the phonological representation of a linguistic object. The value of SYN-
SEM is a feature structure which contains syntactic and semantic information that can
be selected by other heads. The daughters of phrasal signs are represented outside of
SYNSEM. This ensures that there is a certain degree of locality involved in selection: a
head cannot access the internal structure of the elements which it selects (Pollard & Sag
1987: 143–145; 1994: 23). See also Sections 10.6.2.1 and 18.2 for a discussion of locality.
Inside SYNSEM, there is information relevant in local contexts (LOCAL, abbreviated to LOC) as
well as information important for long-distance dependencies (NONLOCAL or NONLOC for
short). Locally relevant information includes syntactic (CATEGORY or CAT), and semantic
(CONTENT or CONT) information. Syntactic information encompasses information that
determines the central characteristics of a phrase, that is, the head information. This is
represented under HEAD. Further details of this will be discussed in Section 9.1.4. Among
other things, the part of speech of a linguistic object belongs to the head properties of
a phrase. In addition to HEAD, SPR and COMPS belong to the information contained inside
CAT. The semantic content of a sign is present under CONT. The type of the CONT value
is mrs, which stands for Minimal Recursion Semantics (Copestake, Flickinger, Pollard &
Sag 2005). An MRS structure consists of an index and a list of relations which restrict
this index. Of the NONLOCAL features, only SLASH is given here. There are further
features for dealing with relative and interrogative clauses (Pollard & Sag 1994; Sag 1997;
Ginzburg & Sag 2000; Holler 2005), which will not be discussed here.
As can be seen, the description of the word Grammatik ‘grammar’ becomes relatively
complicated. In theory, it would be possible to list all properties of a given object directly
in a single list of feature-value pairs. This would, however, have the disadvantage that
the identity of groups of feature-value pairs could not be expressed as easily. Using the
feature geometry in (10), one can express the fact that the CAT values of both conjuncts
in symmetric coordinations such as those in (11) are identical.
(11b) should be compared with the examples in (12). In (12a), the verbs select for an
accusative and a dative object, respectively and in (12b), the verbs select for an accusative
and a prepositional object:
(12) a. * Er kennt und hilft dieser Frau / diese Frau.
he.NOM knows and helps this.DAT woman this.ACC woman
Intended: ‘He knows and helps this woman.’
b. * weil er auf Maria kennt und wartet
because he for Maria knows and waits
Intended: ‘because he knows Maria and waits for her’
While the English translation of (12a) is fine, since both knows and helps take an ac-
cusative, (12a) is out, since kennt ‘knows’ takes an accusative and hilft ‘helps’ a dative
object. Similarly, (12b) is out since kennt ‘knows’ selects an accusative object and wartet
‘waits’ selects for a prepositional phrase containing the preposition auf ‘for’.
If valence and the part of speech information were not represented in one common
sub-structure, we would have to state separately that utterances such as (11) require that
both conjuncts have the same valence and part of speech.
After this general introduction of the feature geometry that is assumed here, we can
now turn to the Head-Complement Schema:
(13) ⟨ x, y ⟩ = ⟨ x ⟩ ⊕ ⟨ y ⟩ or
⟨⟩ ⊕ ⟨ x, y ⟩ or
⟨ x, y ⟩ ⊕ ⟨⟩
The list ⟨ x, y ⟩ can be subdivided into two lists each containing one element, or alterna-
tively into the empty list and ⟨ x, y ⟩.
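The ways of dividing a list that are enumerated in (13) can be sketched in a few lines of Python. This is only an illustration: the function name splits is an assumption introduced here, and Python list concatenation (+) stands in for ⊕.

```python
def splits(items):
    """Return all pairs (left, right) such that left + right == items."""
    return [(items[:i], items[i:]) for i in range(len(items) + 1)]


# The three ways of dividing the two-element list from (13):
for left, right in splits(["x", "y"]):
    print(left, "⊕", right)
```

A list with n elements can be divided in n + 1 ways. A schema that requires the second sublist to contain exactly one element therefore picks out exactly one of these divisions.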
Schema 1 can be read as follows: if an object is of the type head-complement-phrase
then it must have the properties on the right-hand side of the implication. In concrete
terms, this means that these objects always have a valence list which corresponds to 1 ,
that they have a head daughter with a valence list that can be divided into two sublists
1 and ⟨ 2 ⟩ and also that they have a non-head daughter whose syntactic and semantic
properties (SYNSEM value) are compatible with the last element of the COMPS list of the
head daughter ( 2 ). (14) provides the corresponding feature description for the example
in (6a).
(14) [ head-complement-phrase
       PHON ⟨ Peter schläft ⟩
       SYNSEM|LOC|CAT|COMPS ⟨⟩
       HEAD-DTR [ PHON ⟨ schläft ⟩
                  SYNSEM|LOC|CAT|COMPS ⟨ 1 NP[nom] ⟩ ]
       NON-HEAD-DTRS ⟨ [ PHON ⟨ Peter ⟩
                         SYNSEM 1 ] ⟩ ]
NP[nom] is an abbreviation for a complex feature description. Schema 1 divides the
COMPS list of the head daughter into a single-element list and what is left. Since schläft
‘sleeps’ only has one element in its COMPS list, what remains is the empty list. This
remainder is also the COMPS value of the mother.
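The effect of Schema 1 on the COMPS list can be sketched procedurally. The following is a simplified illustration, not an implementation of HPSG. Signs are modeled as plain Python dictionaries, the keys phon, cat and comps are names chosen here, and a simple equality test stands in for structure sharing with a partial description. The order of the PHON values is fixed to non-head before head, which only fits verb-final structures like (14).

```python
def head_complement(head_dtr, non_head_dtr):
    """Combine a head with the last element of its COMPS list.

    'phon' is a list of words, 'cat' a description such as 'NP[nom]',
    and 'comps' a list of such descriptions (illustrative encoding).
    """
    comps = head_dtr["comps"]
    if not comps:
        raise ValueError("head is already saturated")
    rest, last = comps[:-1], comps[-1]    # COMPS = rest ⊕ ⟨ last ⟩
    if non_head_dtr["cat"] != last:       # stands in for structure sharing
        raise ValueError("non-head daughter does not match " + last)
    return {
        # Non-head before head: an assumption that fits verb-final clauses.
        "phon": non_head_dtr["phon"] + head_dtr["phon"],
        "cat": head_dtr["cat"],
        "comps": rest,                    # the remainder is the mother's COMPS
    }


schlaeft = {"phon": ["schläft"], "cat": "V", "comps": ["NP[nom]"]}
peter = {"phon": ["Peter"], "cat": "NP[nom]", "comps": []}
print(head_complement(schlaeft, peter))
```

Since HPSG descriptions are declarative constraints, the function should be read as one way of checking that a mother and its daughters satisfy the schema, not as a claim about processing order.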
Prepositions have an INITIAL value ‘+’ and therefore have to precede arguments. Verbs
in final position bear the value ‘−’ and have to follow their arguments.
(16) a. [in [den Schrank]]
in the cupboard
b. * [[den Schrank] in]
the cupboard in
c. dass [er [ihn umfüllt]]
that he it decants
d. * dass [er [umfüllt ihn]]
that he decants it
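The contrast in (16) can be illustrated with a minimal sketch. It assumes a dictionary encoding of signs in which the key initial holds a Boolean that corresponds to the INITIAL value; the function name linearize is likewise an assumption made for this example.

```python
def linearize(head, complement):
    """Order head and complement according to the head's INITIAL value."""
    if head["initial"]:                        # INITIAL +: head comes first
        return head["phon"] + complement["phon"]
    return complement["phon"] + head["phon"]   # INITIAL −: head comes last


in_ = {"phon": ["in"], "initial": True}            # preposition
umfuellt = {"phon": ["umfüllt"], "initial": False}  # verb in final position

print(linearize(in_, {"phon": ["den", "Schrank"]}))
print(linearize(umfuellt, {"phon": ["ihn"]}))
```

Only the orders in (16a) and (16c) are produced; the starred orders in (16b) and (16d) would require the opposite INITIAL value.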
GPSG has the Head Feature Convention that ensures that head features on the mother
node are identical to those on the node of the head daughter. In HPSG, there is a similar
principle. Unlike in GPSG, head features are explicitly contained as a group of features in
the feature structures. They are listed under the path SYNSEM|LOC|CAT|HEAD. (19) shows
the lexical item for gibt ‘gives’:
(19) gibt ‘gives’:
     [ word
       PHON ⟨ gibt ⟩
       SYNSEM|LOC|CAT [ HEAD [ verb
                               VFORM fin ]
                        COMPS ⟨ NP[nom], NP[dat], NP[acc] ⟩ ] ]
[HEAD 1 , COMPS ⟨⟩]
    2 NP[nom]
    [HEAD 1 , COMPS ⟨ 2 ⟩]
        3 NP[dat]
        [HEAD 1 , COMPS ⟨ 2 , 3 ⟩]
            4 NP[acc]
            [HEAD 1 [verb, VFORM fin], COMPS ⟨ 2 , 3 , 4 ⟩]
Figure 9.6
sign
    word
    phrase
        non-headed-phrase
        headed-phrase
            head-complement-phrase
Figure 9.7: Type hierarchy for sign: all subtypes of headed-phrase inherit constraints
The arrow corresponds to a logical implication, as mentioned above. Therefore, (20) can
be read as follows: if a structure is of type headed-phrase, then it must hold that the value
of SYNSEM|LOC|CAT|HEAD is identical to the value of HEAD-DTR|SYNSEM|LOC|CAT|HEAD.
An extract from the type hierarchy under sign is given in Figure 9.7. word and phrase
are subclasses of linguistic signs. Phrases can be divided into phrases with heads (headed-
phrase) and those without (non-headed-phrase). There are also subtypes for phrases of
type non-headed-phrase and headed-phrase. We have already discussed head-complement-
phrase, and other subtypes of headed-phrase will be discussed in the later sections. As
well as word and phrase, there are the types root and stem, which play an important
role for the structure of the lexicon and the morphological component. Due to space
considerations, it is not possible to further discuss these types here, but see Chapter 23.
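Inheritance of constraints in the hierarchy of Figure 9.7 can be sketched as follows. The sketch is an illustration under simplifying assumptions: each type has a single immediate supertype, constraints are represented only by their names, and the names SUPERTYPE, CONSTRAINTS and inherited_constraints are introduced here.

```python
# Immediate supertypes, following Figure 9.7.
SUPERTYPE = {
    "word": "sign",
    "phrase": "sign",
    "non-headed-phrase": "phrase",
    "headed-phrase": "phrase",
    "head-complement-phrase": "headed-phrase",
}

# Constraints stated on individual types (labels only, for illustration).
CONSTRAINTS = {
    "headed-phrase": ["Head Feature Principle"],
    "head-complement-phrase": ["Head-Complement Schema"],
}


def inherited_constraints(type_name):
    """Collect the constraints of a type and of all its supertypes."""
    result = []
    while type_name is not None:
        result = CONSTRAINTS.get(type_name, []) + result
        type_name = SUPERTYPE.get(type_name)
    return result


print(inherited_constraints("head-complement-phrase"))
```

A structure of type head-complement-phrase thus has to satisfy both the constraint stated on its own type and the one stated on headed-phrase, whereas word inherits neither.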
The description in (21) shows the Head-Complement Schema from page 275 together
with the restrictions that the type head-complement-phrase inherits from headed-phrase.
(21) Head-Complement Schema + Head Feature Principle:
     [ head-complement-phrase
       SYNSEM|LOC|CAT [ HEAD 1
                        COMPS 2 ]
       HEAD-DTR|SYNSEM|LOC|CAT [ HEAD 1
                                 COMPS 2 ⊕ ⟨ 3 ⟩ ]
       NON-HEAD-DTRS ⟨ [ SYNSEM 3 ] ⟩ ]
(22) [ head-complement-phrase
       PHON ⟨ das Buch gibt ⟩
       SYNSEM|LOC|CAT [ HEAD 1
                        COMPS 2 ⟨ NP[nom], NP[dat] ⟩ ]
       HEAD-DTR [ word
                  PHON ⟨ gibt ⟩
                  SYNSEM|LOC|CAT [ HEAD 1 [ verb
                                            VFORM fin ]
                                   COMPS 2 ⊕ ⟨ 3 ⟩ ] ]
       NON-HEAD-DTRS ⟨ [ PHON ⟨ das Buch ⟩
                         SYNSEM 3 [ LOC|CAT [ HEAD [ noun
                                                     CAS acc ]
                                              COMPS ⟨⟩ ] ]
                         HEAD-DTR …
                         NON-HEAD-DTRS … ] ⟩ ]
For the entire clause er das Buch dem Mann gibt ‘he the book to the man gives’, we arrive
at a structure (already shown in Figure 9.6) described by (23):
(23) [ SYNSEM|LOC|CAT [ HEAD [ verb
                               VFORM fin ]
                        SPR ⟨⟩
                        COMPS ⟨⟩ ] ]
This description corresponds to the sentence symbol S in the phrase structure grammar
on page 53; however, (23) additionally contains information about the form of the verb.
Using dominance schemata as an example, we have shown how generalizations about
linguistic objects can be captured. However, we also want to be able to capture generalizations
in other areas of the theory: like Categorial Grammar, the HPSG lexicon contains a
very large amount of information. Lexical entries (roots and words) can also be divided
into classes, which can then be assigned types. In this way, it is possible to capture what
all verbs, intransitive verbs and transitive verbs, have in common. See Figure 23.1 on
page 690.
Now that some fundamental concepts of HPSG have been introduced, the following
section will show how the semantic contribution of words is represented and how the
meaning of a phrase can be determined compositionally.
9.1.6 Semantics
An important difference between theories such as GB, LFG and TAG, on the one hand,
and HPSG and CxG on the other is that in the latter the semantic content of a linguistic object is
modeled in a feature structure just like all its other properties. As previously mentioned,
semantic information is found under the path SYNSEM|LOC|CONT. (24) gives an example
of the CONT value for Buch ‘book’. The representation is based on Minimal Recursion
Semantics (MRS):9
(24) [ mrs
       IND 1 [ PER 3
               NUM sg
               GEN neu ]
       RELS ⟨ [ buch
                INST 1 ] ⟩ ]
IND stands for index and RELS is a list of relations. Features such as person, number and
gender are part of a nominal index. These are important in determining reference or
coreference. For example, sie ‘she’ in (25) can refer to Frau ‘woman’ but not to Buch
‘book’. On the other hand, es ‘it’ cannot refer to Frau ‘woman’.
(25) Die Frau𝑖 kauft ein Buch𝑗. Sie𝑖 liest es𝑗.
the woman buys a book she reads it
‘The woman buys a book. She reads it.’
In general, pronouns have to agree in person, number and gender with the element
they refer to. Indices are then identified accordingly. In HPSG, this is done by means
of structure sharing. It is also common to speak of coindexation. (26) provides some
examples of coindexation of reflexive pronouns:
(26) a. Ich𝑖 sehe mich𝑖 .
I see myself
b. Du𝑖 siehst dich𝑖 .
you see yourself
c. Er𝑖 sieht sich𝑖 .
he sees himself
d. Wir𝑖 sehen uns𝑖 .
we see ourselves
e. Ihr𝑖 seht euch𝑖 .
you see yourselves
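The agreement requirement on coindexation can be sketched as a comparison of index features. The encoding below is an assumption made for illustration: indices are dictionaries with the keys per, num and gen, and the check covers the third person singular forms from (25).

```python
def can_corefer(pronoun_index, antecedent_index):
    """True if the two indices agree in person, number and gender."""
    return all(
        pronoun_index[f] == antecedent_index[f] for f in ("per", "num", "gen")
    )


frau = {"per": 3, "num": "sg", "gen": "fem"}   # Frau 'woman'
buch = {"per": 3, "num": "sg", "gen": "neu"}   # Buch 'book'
sie = {"per": 3, "num": "sg", "gen": "fem"}    # sie 'she'
es = {"per": 3, "num": "sg", "gen": "neu"}     # es 'it'

print(can_corefer(sie, frau), can_corefer(sie, buch))
print(can_corefer(es, buch), can_corefer(es, frau))
```

In HPSG itself, coindexation is not a check performed after the fact: the two indices are identified by structure sharing, so incompatible feature values simply cannot be shared.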
9
Pollard & Sag (1994) and Ginzburg & Sag (2000) make use of Situation Semantics (Barwise & Perry 1983;
Cooper, Mukai & Perry 1990; Devlin 1992). An alternative approach which has already been used in HPSG
is Lexical Resource Semantics (Richter & Sailer 2004). For an early underspecification analysis in HPSG,
see Nerbonne (1993).
The list that contains all the arguments of a head is called the argument structure list and
it is represented as the value of the ARG-ST feature. This list plays a very important role in
HPSG grammars: case is assigned there, the Binding Theory operates on ARG-ST and the
linking between syntax and semantics takes place on ARG-ST as well.
For finite verbs, the value of ARG-ST is identical to the value of COMPS in German.
As was explained in Section 9.1.1, the first element of the ARG-ST list is the subject in
languages like English and it is represented in the SPR list (Sag, Wasow & Bender 2003:
Section 4.3, 7.3.1; Müller 2019c). All other elements from ARG-ST are contained in the
COMPS list. So there are language-specific ways to represent the valence but there is one
common representation that is the same for all argument structure representations. This
makes it possible to capture cross-linguistic generalizations.
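The two mappings from ARG-ST to the valence features described above can be sketched as follows. The function name map_valence and the Boolean parameter subject_in_spr are assumptions introduced for this illustration; arguments are represented by their abbreviations as in (5) and (7).

```python
def map_valence(arg_st, subject_in_spr):
    """Distribute the elements of ARG-ST over the SPR and COMPS lists."""
    if subject_in_spr:                          # languages like English
        return {"spr": arg_st[:1], "comps": arg_st[1:]}
    return {"spr": [], "comps": list(arg_st)}   # finite verbs in German


# English give, as in table (7):
print(map_valence(["NP[nom]", "NP[acc]", "NP[acc]"], subject_in_spr=True))
# German geben, as in table (5):
print(map_valence(["NP[nom]", "NP[dat]", "NP[acc]"], subject_in_spr=False))
```

The ARG-ST list is the same kind of object in both cases; only the distribution over SPR and COMPS differs between the languages.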
Since we use general terms such as AGENT and PATIENT for argument roles, it is possi-
ble to state generalizations about valence classes and the realization of argument roles.
For example, one can divide verbs into verbs taking an agent, verbs with an agent and
theme, verbs with agent and patient etc. These various valence/linking patterns can be
represented in type hierarchies and these classes can be assigned to the specific lexical
entries, that is, one can have them inherit constraints from the respective types. A type
constraint for verbs with agent, theme and goal takes the form of (29):
(29) [ CONT [ mrs
              IND 4 event
              RELS ⟨ [ agent-goal-theme-rel
                       EVENT 4
                       AGENT 1
                       GOAL 2
                       THEME 3 ] ⟩ ]
       ARG-ST ⟨ [] 1 , [] 2 , [] 3 ⟩ ]
[] 1 stands for an object of unspecified syntactic category with the index 1 . The type for
the relation geben′ is a subtype of agent-goal-theme-rel. The lexical entry for the word
geben ‘give’ or rather the root geb- has the linking pattern in (29). For more on theories of
linking in HPSG, see Davis (1996), Wechsler (1995) and Davis & Koenig (2000). Wechsler
et al. (2020) provide an overview of approaches to linking within HPSG.
Up to now, we have only seen how the meaning of lexical entries can be represented.
The Semantics Principle determines the computation of the semantic contribution of
phrases: the index of the entire expression corresponds to the index of the head daugh-
ter, and the RELS value of the entire sign corresponds to the concatenation of the RELS
values of the daughters plus any relations introduced by the dominance schema. The last
point is important because the assumption that schemata can add something to meaning
can capture the fact that there are some cases where the entire meaning of a phrase is
more than simply the sum of its parts. Pertinent examples are often discussed as part of
Construction Grammar. Semantic composition in HPSG is organized such that meaning
components that are due to certain patterns can be integrated into the complete meaning
of an utterance. For examples, see Section 21.10.
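The two parts of the Semantics Principle stated above can be sketched as a small function. This is a simplified illustration: CONT values are dictionaries with the keys ind and rels, relations are written as strings, and the relation peter(x) for the proper name is a hypothetical placeholder. The order in which the RELS values of the daughters are concatenated (head daughter first) is an assumption of this sketch.

```python
def semantics_principle(head_dtr, non_head_dtrs, schema_rels=()):
    """Compute the IND and RELS values of a mother from its daughters."""
    rels = list(head_dtr["rels"])         # copy, so the daughter is unchanged
    for dtr in non_head_dtrs:
        rels.extend(dtr["rels"])          # concatenate the daughters' RELS
    rels.extend(schema_rels)              # relations added by the schema
    return {"ind": head_dtr["ind"], "rels": rels}


schlaeft = {"ind": "e", "rels": ["schlafen(e, x)"]}
peter = {"ind": "x", "rels": ["peter(x)"]}
mother = semantics_principle(schlaeft, [peter])
print(mother)
```

The optional schema_rels argument corresponds to the relations contributed by the dominance schema; it is empty for ordinary head-complement structures.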
The connection between the semantic contribution of the verb and its arguments is
established in the lexical entry. As such, we ensure that the argument roles of the verb
are assigned to the correct argument in the sentence. This is, however, not the only thing
that the semantics is responsible for. It has to be able to generate the various readings
associated with quantifier scope ambiguities (see page 90) as well as deal with semantic
embedding of predicates under other predicates. All these requirements are fulfilled by
MRS. Due to space considerations, we cannot go into detail here. The reader is referred
to the article by Copestake, Flickinger, Pollard & Sag (2005) and to Section 19.3 in the
discussion chapter.
9.1.7 Adjuncts
Analogous to the selection of arguments by heads via COMPS, adjuncts can also select
their heads using a feature (MODIFIED). Adjectives, prepositional phrases that modify
nouns, and relative clauses select an almost complete nominal projection, that is, a noun
that only still needs to be combined with a determiner to yield a complete NP. (30) shows
a description of the respective synsem object. The symbol N̄, which is familiar from X̄
theory (see Section 2.5), is used as an abbreviation for this feature description.
interessantes is an adjective that does not take any arguments and therefore has an empty
COMPS list. Adjectives such as treu ‘loyal’ have a dative NP in their COMPS list.
(32) ein dem König treues Mädchen
a the.DAT king loyal girl
‘a girl loyal to the king’
The CAT value is given in (33):
(33) CAT value for treues ‘loyal’:
     [ HEAD [ adj
              MOD N̄ ]
       COMPS ⟨ NP[dat] ⟩ ]
dem König treues ‘loyal to the king’ forms an adjective phrase, which modifies Mädchen.
Unlike the selectional feature COMPS that belongs to the features under CAT, MOD is
a head feature. The reason for this is that the feature that selects the modifying head
has to be present on the maximal projection of the adjunct. The N̄-modifying property
of the adjective phrase dem König treues ‘loyal to the king’ has to be included in the
representation of the entire AP just as it is present in the lexical entry for adjectives in
(31) at the lexical level. The adjectival phrase dem König treues has the same syntactic
properties as the basic adjective interessantes ‘interesting’:
10
In what follows, I am also omitting the SPR feature, whose value would be the empty list.
(34) CAT value for dem König treues ‘loyal to the king’:
     [ HEAD [ adj
              MOD N̄ ]
       COMPS ⟨⟩ ]
Since MOD is a head feature, the Head Feature Principle (see page 278) ensures that the
MOD value of the entire projection is identical to the MOD value of the lexical entry for
treues ‘loyal’.
As an alternative to the selection of the head by the modifier, one could assume a
description of all possible adjuncts on the head itself. This was suggested by Pollard &
Sag (1987: 161). Pollard & Sag (1994: Section 1.9) revised the earlier analysis since the
semantics of modification could not be captured.11
Figure 9.8 demonstrates selection in head-adjunct structures.
AP[HEAD|MOD 1 ]    1 N̄
interessantes      Buch
interesting        book
9.2 Passive
HPSG follows Bresnan’s argumentation (see Section 7.2) and takes care of the passive
in the lexicon.12 A lexical rule takes the verb stem as its input and licenses the par-
ticiple form and the most prominent argument (the so-called designated argument) is
suppressed.13 Since grammatical functions are not part of the theory in HPSG, we do not
12
Some exceptions to this are analyses influenced by Construction Grammar such as Tseng (2007) and Hau-
gereid (2007). These approaches are problematic, however, as they cannot account for Bresnan’s adjectival
passives. For other problems with Haugereid’s analysis, see Müller (2007b) and Section 21.3.6.
13
For more on the designated argument, see Haider (1986a). HPSG analyses of the passive in German have
been considerably influenced by Haider. Haider uses the designated argument to model the difference
between so-called unaccusative and unergative verbs (Perlmutter 1978): unaccusative verbs differ from
unergatives and transitives in that they do not have a designated argument. We cannot go into the literature
on unaccusativity here. The reader is referred to the original works by Haider and the chapter on the
passive in Müller (2007a).
require any mapping principles that map objects to subjects. Nevertheless, one still has
to explain the change of case under passivization. If one fully specifies the case of a par-
ticular argument in the lexical entries, one has to ensure that the accusative argument of
a transitive verb is realized as nominative in the passive. (38) shows what the respective
lexical rule would look like:
(38) Lexical rule for personal passives adapted from Kiss (1992):
     [ stem
       PHON 1
       SYNSEM|LOC|CAT|HEAD verb
       ARG-ST ⟨ NP[nom], NP[acc] 2 ⟩ ⊕ 3 ]
     ↦
     [ word
       PHON 𝑓( 1 )
       SYNSEM|LOC|CAT|HEAD|VFORM passive-part
       ARG-ST ⟨ NP[nom] 2 ⟩ ⊕ 3 ]
14
This lexical rule takes a verb stem as its input, which requires a nominative argument,
an accusative argument and possibly further arguments (if 3 is not the empty list) and
licenses a lexical entry that requires a nominative argument and possibly the arguments
in 3 .15 The output of the lexical rule specifies the VFORM value of the output word. This
is important as the auxiliary and the main verb must go together. For example, it is not
possible to use the perfect participle instead of the passive participle since these differ
in their valence in Kiss’ approach:
(39) a. Der Mann hat den Weltmeister geschlagen.
the man has the world.champion beaten
‘The man has beaten the world champion.’
b. * Der Mann wird den Weltmeister geschlagen.
the man is the world.champion beaten
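The effect of the rule in (38) on the argument structure can be sketched as a function from stems to words. The sketch makes several assumptions for the sake of illustration: lexical items are dictionaries, arguments are reduced to a case value and an index, and the morphological function 𝑓 is passed in as the parameter participle. The toy function used below, which adds ge- and -en, only covers the example stem and is not a model of German participle formation.

```python
def personal_passive(stem, participle):
    """Apply a rule in the spirit of (38) to a verb stem.

    `participle` plays the role of f in (38): it maps the PHON value of
    the stem to the participle form and is supplied by the caller.
    """
    args = stem["arg_st"]
    if len(args) < 2 or args[0]["case"] != "nom" or args[1]["case"] != "acc":
        raise ValueError("rule requires ARG-ST ⟨ NP[nom], NP[acc] ⟩ ⊕ rest")
    # The former accusative argument keeps its index but bears nominative.
    promoted = {"case": "nom", "index": args[1]["index"]}
    return {
        "type": "word",
        "phon": participle(stem["phon"]),
        "vform": "passive-part",
        "arg_st": [promoted] + args[2:],   # ⟨ NP[nom] ⟩ ⊕ remaining arguments
    }


schlag = {
    "type": "stem",
    "phon": "schlag",
    "arg_st": [{"case": "nom", "index": "i"}, {"case": "acc", "index": "j"}],
}
geschlagen = personal_passive(schlag, lambda phon: "ge" + phon + "en")
print(geschlagen)
```

A stem with only a nominative argument does not match the input description, so the rule does not apply to it. This is the reason why impersonal passives are not covered by a rule of this form.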
14
The term stem includes roots (helf - ‘help-’), products of derivation (besing- ‘to sing about’) and compounds.
The lexical rule can therefore also be applied to stems like helf - and derived forms such as besing-.
15
This rule assumes that arguments of ditransitive verbs are in the order nominative, accusative, dative.
Throughout this chapter, I assume a nominative, dative, accusative order, which corresponds to the un-
marked order of arguments in the German clause. Kiss (2001) argued that a representation of the unmarked
order is needed to account for scope facts in German. Furthermore, the order of the arguments corresponds
to the order one would assume for English, which has the advantage that cross-linguistic generalizations
can be captured. In earlier work I assumed that the order is nominative, accusative, dative since this order
encodes a prominence hierarchy that is relevant in a lot of areas in German grammar. Examples are: ellipsis
(Klein 1985), Topic Drop (Fries 1988), free relatives (Bausewein 1990; Pittner 1995; Müller 1999a), depictive
secondary predicates (Müller 2004b; 2002a; 2008), Binding Theory (Grewendorf 1985; Pollard & Sag 1992;
1994: Chapter 6). This order also corresponds to the Obliqueness Hierarchy suggested by Keenan & Com-
rie (1977) and Pullum (1977). In order to capture this hierarchy, a special list with nominative, accusative,
dative order would have to be assumed.
The version of the passive lexical rule that will be suggested below is compatible with both orders of
arguments.
(41) [ acc-passive-lexical-rule
       PHON 𝑓( 1 )
       SYNSEM|LOC|CAT|HEAD|VFORM passive-part
       ARG-ST ⟨ NP[nom] 2 ⟩ ⊕ 3
       LEX-DTR [ stem
                 PHON 1
                 SYNSEM|LOC|CAT|HEAD verb
                 ARG-ST ⟨ NP[nom], NP[acc] 2 ⟩ ⊕ 3 ] ]
What is on the left-hand side of the rule in (38) is contained in the value of LEX-DTR in
(41). Since this kind of lexical rule is fully integrated into the formalism, feature struc-
tures corresponding to these lexical rules also have their own type. If the result of the
application of a given rule is an inflected word, then the type of the lexical rule (acc-
passive-lexical-rule in our example) is a subtype of word. Since lexical rules have a type,
it is possible to state generalizations over lexical rules.
The lexical rules discussed thus far work well for the personal passive. For the imper-
sonal passive, however, we would require a second lexical rule. Furthermore, we would
have two different lexical items for the passive and the perfect, although the forms are
always identical in German. In the following, I will discuss the basic assumptions that
are needed for a theory of the passive that can sufficiently explain both personal and
impersonal passives and thereby only require one lexical item for the participle form.
This lexical rule does exactly what we expect it to do from a pretheoretical perspective
on the passive: it suppresses the most prominent argument with structural case, that is,
the argument that corresponds to the subject in the active clause.
(44) a. geschlafen ‘slept’: ARG-ST ⟨ ⟩
b. unterstützt ‘supported’: ARG-ST ⟨ NP[str]𝑘 ⟩
c. geholfen ‘helped’: ARG-ST ⟨ NP[ldat]𝑘 ⟩
d. geschenkt ‘given’: ARG-ST ⟨ NP[ldat]𝑘 , NP[str]𝑙 ⟩
The standard analysis of verb-auxiliary constructions in German assumes that the main
verb and the auxiliary form a verbal complex (Hinrichs & Nakazawa 1994a; Pollard
1994; Müller 1999b; 2002a; Meurers 2000; Kathol 2000). The arguments of the embedded
verb are taken over by the auxiliary. For the analysis of the passive this means that the
auxiliary has an ARG-ST that starts with the elements shown in (44). (44) differs from
(42) in that a different NP is in first position. If this NP has structural case, it will receive
nominative case. If there is no NP with structural case, as in (44c), the case remains as it
was, that is, lexically specified.
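How case is resolved on the lists in (44) can be sketched as follows. The sketch goes slightly beyond what is stated in this passage and should be read under two explicit assumptions: the first argument with structural case in the list receives nominative even if a lexically case-marked argument precedes it, as in (44d), and any further structural arguments receive accusative. Arguments are reduced to their case values; str and ldat are the labels used in (44).

```python
def assign_case(arg_st):
    """Resolve structural case: first 'str' gets nom, later ones acc."""
    resolved, seen_structural = [], False
    for case in arg_st:
        if case == "str":
            resolved.append("acc" if seen_structural else "nom")
            seen_structural = True
        else:
            resolved.append(case)     # lexical case remains as it was
    return resolved


print(assign_case([]))                # geschlafen, (44a)
print(assign_case(["str"]))           # unterstützt, (44b)
print(assign_case(["ldat"]))          # geholfen, (44c)
print(assign_case(["ldat", "str"]))   # geschenkt, (44d)
```

With an unsuppressed subject in first position, as in the active, the same function yields nominative for the subject and accusative for a structural object.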
We cannot go into the analysis of the perfect here. It should be noted, however, that
the same lexical item for the participle is used for (45).
(45) a. Er hat den Weltmeister geschlagen.
he has the world.champion beaten
‘He has beaten the world champion.’
b. Der Weltmeister wurde geschlagen.
the world.champion was beaten
‘The world champion was beaten.’
It is the auxiliary that determines which arguments are realized (Haider 1986a; Müller
2007a: Chapter 17). The lexical rule in (43) licenses a form that can be used both in
passive and perfect. Therefore, the VFORM value is ppp, which stands for perfect passive
participle.
One should note that this analysis of the passive works without movement of con-
stituents. The problems with the GB analysis do not arise here. Reordering of arguments
(see Section 9.4) is independent of passivization. The accusative object is not mentioned
at all unlike in GPSG, Categorial Grammar or Bresnan’s LFG analysis from before the
introduction of Lexical Mapping Theory (see page 232). The passive can be analyzed
directly as the suppression of the subject. Everything else follows from interaction with
other principles of grammar.
Figure 9.9 on the next page gives an overview of this. The verb trace in final position
behaves just like the verb both syntactically and semantically. The information about the
missing word is represented as the value of the feature DOUBLE SLASH (abbreviated: DSL).
This is a head feature and is therefore passed up to the maximal projection (VP). The
verb in initial position has a VP in its COMPS list which is missing a verb (VP//V). This is
the same verb that was the input for the lexical rule and that would normally occur in
final position. In Figure 9.9, there are two maximal verb projections: jeder diesen Mann
_𝑘 with the trace as the head and kennt jeder diesen Mann _𝑘 with kennt as the head.
This analysis will be explained in more detail in what follows. For the trace in Fig-
ure 9.9, one could assume the lexical entry in (47).
9.3 Verb position
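[Figure 9.9, tree diagram; caption not recoverable from the extraction. Node labels, top down: VP → V ⟨ VP//V ⟩ + VP//V; V ⟨ VP//V ⟩ → V; VP//V → NP + V′//V; V′//V → NP + V//V]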
This lexical entry differs from the normal verb kennt only in its PHON value. The syntactic
aspects of an analysis with this trace are represented in Figure 9.10 on the following page.
The combination of the trace with diesen Mann ‘this man’ and jeder ‘everybody’ follows
the rules and principles that we have encountered thus far. This raises the immediate
question as to what licenses the verb kennt in Figure 9.10 and what status it has.
If we want to capture the fact that the finite verb in initial position behaves like a
complementizer (Höhle 1997), then it makes sense to give head status to kennt in Fig-
ure 9.10 and have kennt select a saturated, verb-final verbal projection. Finite verbs in
initial position differ from complementizers in that they require a projection of a verb
trace, whereas complementizers need projections of overt verbs:
(48) a. dass [jeder diesen Mann kennt]
that everybody this man knows
‘that everybody knows this man’
[Tree diagram. Node labels, top down: V[COMPS ⟨⟩] → V + V[COMPS ⟨⟩]; V[COMPS ⟨⟩] → 3 NP[nom] + V[COMPS ⟨ 3 ⟩]; V[COMPS ⟨ 3 ⟩] → 4 NP[acc] + V[COMPS ⟨ 3 , 4 ⟩]]
Figure 9.10: Analysis of Kennt jeder diesen Mann? ‘Does everyone know this man?’
by introducing a head feature whose value is identical to the LOCAL value of the trace.
This feature is referred to as DSL. As was already mentioned above, DSL stands for double
slash. It is called so because it has a similar function to the SLASH feature, which we will
encounter in the following section.17 (50) shows the modified entry for the verb trace:
Through sharing of the LOCAL value and the DSL value in (50), the syntactic and semantic
information of the verb trace is present at its maximal projection, and the verb in initial
position can check whether the projection of the trace is compatible.18
The special lexical item for verb-initial position is licensed by the following lexical
rule:19
17
The feature DSL was proposed by Jacobson (1987a) in the framework of Categorial Grammar to describe
head movement in English inversions. Borsley (1989) adopted this idea and translated it into HPSG terms,
thereby showing how head movement in an HPSG variant of the CP/IP system can be modeled using DSL.
The introduction of the DSL feature to describe head movement processes in HPSG is motivated by the fact
that, unlike long-distance dependencies as will be discussed in Section 9.5, this kind of movement is local.
The suggestion to percolate information about the verb trace as part of the head information comes
from Oliva (1992).
18
Note that the description in (50) is cyclic since the tag 1 is used inside itself. See Section 6.5 on cyclic
feature descriptions. This cyclic description is the most direct way to express that a linguistic object with
certain local properties is missing and to pass this information on along the head path as the value of the
DSL feature. This will be even clearer when we look at the final version of the verb trace in (52) on page 297.
19
The lexical rule analysis cannot explain sentences such as (i):
This has to do with the fact that the lexical rule cannot be applied to the result of coordination, which
constitutes a complex syntactic object. If we apply the lexical rule individually to each verb, then we arrive
at variants of the verbs which would each select verb traces for kennen ‘to know’ and lieben ‘to love’. Since
the CAT values of the conjuncts are identified with each other in coordinations, coordinations involving
the V1 variants of kennt and liebt would be ruled out since the DSL values of the selected VPs contain the
meaning of the respective verbs and are hence not compatible (Müller 2005b: 13). Instead of a lexical rule,
one must assume a unary syntactic rule that applies to the phrase kennt und liebt ‘knows and loves’. As
we have seen, lexical rules in the HPSG formalization assumed here correspond to unary rules such that
the difference between (51) and a corresponding syntactic rule is mostly a difference in representation.
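[Tree diagram, partly lost in extraction. Recoverable node labels: V[COMPS ⟨⟩] → V[COMPS ⟨ 1 ⟩] + 1 V[DSL|CAT|COMPS 2 , COMPS ⟨⟩]; V[COMPS ⟨ 1 ⟩] is licensed by V1-LR; lowest branching: 4 NP[acc] + V[DSL|CAT|COMPS 2 , COMPS 2 ⟨ 3 , 4 ⟩]]
Figure 9.11: Visualization of the analysis of Kennt jeder diesen Mann? ‘Does everyone
know this man?’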
a verb that selects a VP ( 1 in Figure 9.11). The DSL value of this VP corresponds to the
LOCAL value of the verb that is the input of the lexical rule. Part of the DSL value is also
9.4 Local reordering
the valence information represented in Figure 9.11 ( 2 ). Since DSL is a head feature, the
DSL value of the VP is identical to that of the verb trace and since the LOCAL value of
the verb trace is identified with the DSL value, the COMPS information of the verb kennen
is also available at the trace. The combination of the trace with its arguments proceeds
exactly as with an ordinary verb.
It would be unsatisfactory if we had to assume a special trace for every verb. Fortu-
nately, this is not necessary as a general trace as in (52) will suffice for the analysis of
sentences with verb movement.
(52) General verb trace following Meurers (2000: 206–208):
     [ PHON ⟨⟩
       SYNSEM|LOC 1 [ CAT|HEAD|DSL 1 ] ]
This may seem surprising at first glance, but if we look closer at the interaction of the
lexical rule (51) and the percolation of the DSL feature in the tree, then it becomes clear
that the DSL value of the verb projection and therefore the LOCAL value of the verb trace
is determined by the LOCAL value of the input verb. In Figure 9.11, kennt is the input
for the verb movement lexical rule. The relevant structure sharing ensures that, in the
analysis of (46), the LOCAL value of the verb trace corresponds exactly to what is given
in (50).
The most important points of the analysis of verb position are summarized below:
• A lexical rule licenses a special lexical item for each finite verb.
• This lexical item occupies the initial position and requires as its argument a com-
plete projection of a verb trace.
• The projection of the verb trace must have a DSL value corresponding to the LOCAL
value of the input verb of the lexical rule.
• Since DSL is a head feature, the selected DSL value is also present on the trace.
• As the DSL value of the trace is identical to its LOCAL value, the LOCAL value of the
trace is identical to the LOCAL value of the input verb in the lexical rule.
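The chain of identities in this summary can be traced in a small Python sketch. It is a simplification under stated assumptions, not the formalization used in this chapter: all class and function names are hypothetical, DSL is stored beside LOCAL rather than inside it (which avoids the cyclic structure of the trace), and structure sharing is approximated by equality of immutable values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Local:
    cat: str                     # e.g. "V" or "NP[nom]"
    comps: Tuple[str, ...] = ()  # categories that still have to be combined


@dataclass(frozen=True)
class Sign:
    phon: Tuple[str, ...]
    local: Local
    dsl: Optional[Local] = None  # head feature: LOCAL value of a missing verb


def verb_trace(missing: Local) -> Sign:
    """General trace: empty PHON, LOCAL value identical to the DSL value."""
    return Sign(phon=(), local=missing, dsl=missing)


def head_complement(arg: Sign, head: Sign) -> Sign:
    """Discharge the last COMPS element; DSL is passed along the head path."""
    assert head.local.comps and head.local.comps[-1] == arg.local.cat
    mother = Local(head.local.cat, head.local.comps[:-1])
    return Sign(arg.phon + head.phon, mother, head.dsl)


def verb_initial(verb: Sign, vp: Sign) -> Sign:
    """Stand-in for the output of the verb-initial lexical rule: it selects a
    saturated projection whose DSL value is the LOCAL value of the input verb."""
    assert vp.local.comps == () and vp.dsl == verb.local
    return Sign(verb.phon + vp.phon, Local("V"), None)


kennt = Sign(("kennt",), Local("V", ("NP[nom]", "NP[acc]")))
jeder = Sign(("jeder",), Local("NP[nom]"))
diesen_mann = Sign(("diesen", "Mann"), Local("NP[acc]"))

trace = verb_trace(kennt.local)
vp = head_complement(jeder, head_complement(diesen_mann, trace))
clause = verb_initial(kennt, vp)
print(" ".join(clause.phon))
```

The sketch instantiates the trace with the LOCAL value of kennt right away. In the declarative analysis, the same identification follows from the constraints alone, independently of any order of processing.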
After discussing the analysis of verb-first sentences, we will now turn to local reordering.
to be freely ordered inside such lists. See Reape (1994) and Section 11.7.2.2 of this book
for the formal details of these approaches. Both the completely flat analysis and the
compromise have proved to be on the wrong track (see Müller 2005b; 2014c and Müller
2007a: Section 9.5.1) and therefore, I will only discuss the analysis with binary branching
structures.
Figure 9.12 shows the analysis of (53a).
[Figure 9.12, tree diagram; caption not recoverable from the extraction. Node labels: V[COMPS ⟨⟩] → 1 NP[nom] + V[COMPS ⟨ 1 ⟩]; V[COMPS ⟨ 1 ⟩] → 2 NP[acc] + V[COMPS ⟨ 1 , 2 ⟩]]
The arguments of the verb are combined with the verb starting with the last element of
the COMPS list, as explained in Section 9.1.2. The analysis of the marked order is shown
in Figure 9.13. Both trees differ only in the order in which the elements are taken off
from the COMPS list: in Figure 9.12, the last element of the COMPS list is discharged first
and in Figure 9.13 the first one is.
[Figure 9.13, tree diagram; caption not recoverable from the extraction. Node labels: V[COMPS ⟨⟩] → 2 NP[acc] + V[COMPS ⟨ 2 ⟩]; V[COMPS ⟨ 2 ⟩] → 1 NP[nom] + V[COMPS ⟨ 1 , 2 ⟩]]
The following schema is a revised version of the Head-Complement Schema:
Schema 3 (Head-Complement Schema (binary branching))
head-complement-phrase ⇒
[ SYNSEM|LOC|CAT|COMPS 1 ⊕ 3
  HEAD-DTR|SYNSEM|LOC|CAT|COMPS 1 ⊕ ⟨ 2 ⟩ ⊕ 3
  NON-HEAD-DTRS ⟨ [ SYNSEM 2 ] ⟩ ]
Whereas in the first version of the Head-Complement Schema it was always the last
element from the COMPS list that was combined with the head, the COMPS list is now divided
into three parts using append: a list of arbitrary length ( 1 ), a list consisting of exactly
one element (⟨ 2 ⟩) and a further list of arbitrary length ( 3 ). The lists 1 and 3 are
combined and the result is the COMPS value of the mother node.
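The division of the COMPS list by append can be made concrete with a short Python sketch. The function name `head_complement_splits` and the category strings are assumptions made for this illustration only. For a given COMPS list of the head daughter, the function enumerates every way of writing it as 1 ⊕ ⟨ 2 ⟩ ⊕ 3 and returns the selected element together with the COMPS value of the mother ( 1 ⊕ 3 ).

```python
from typing import List, Tuple


def head_complement_splits(comps: List[str]) -> List[Tuple[str, List[str]]]:
    """Return (selected element, COMPS of the mother) for every possible split."""
    results = []
    for i in range(len(comps)):
        first = comps[:i]      # list 1, arbitrary length
        picked = comps[i]      # the single element in <2>
        rest = comps[i + 1:]   # list 3, arbitrary length
        results.append((picked, first + rest))
    return results


splits = head_complement_splits(["NP[nom]", "NP[acc]"])
for picked, mother_comps in splits:
    print(picked, mother_comps)
```

For a two-element list there are two splits: one realizes the nominative first and leaves the accusative on the mother's COMPS list, the other does the reverse.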
Languages with fixed constituent order (such as English) differ from languages such
as German in that they discharge the arguments starting from one side (for more on
the subject in English, see Section 9.1.1), whereas languages with free constituent order
can combine arguments with the verb in any order. In languages with fixed constituent
order, either 1 or 3 is always the empty list. Since German structures are not restricted
with regard to 1 or 3 , that is, 1 and 3 can either be the empty list or contain elements, the
intuition is captured that there are fewer restrictions in languages with free constituent
order than in languages with fixed order. We can compare this to the Kayneian analysis
from Section 4.6.1, where it was assumed that all languages are derived from the base or-
der [specifier [head complement]] (see Figure 4.20 on page 149 for Laenzlinger’s analysis
of German as an SVO-language (Laenzlinger 2004)). In these kinds of analyses, languages
such as English constitute the most basic case and languages with free ordering require
some considerable theoretical effort to get the order right. In comparison to that, the
analysis proposed here requires more theoretical restrictions if the language has more
restrictions on permutations of its constituents. The complexity of the licensed struc-
tures does not differ considerably from language to language under an HPSG approach.
Languages differ only in the type of branching they have.20,21
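The claim that fixed order amounts to an additional restriction on the same schema can be checked with a small Python sketch. The function `discharge_orders` and its parameter `empty` are hypothetical names for this illustration. The function enumerates the orders in which arguments can be combined with the head, counted from the head outwards (not the surface order), either without restriction or with one of the two lists required to be empty.

```python
from typing import List, Optional


def discharge_orders(comps: List[str], empty: Optional[str] = None) -> List[List[str]]:
    """Enumerate the orders in which the elements of comps can be discharged.

    empty=None: no restriction on the split L1 + [x] + L3 (free order).
    empty="L1": L1 must be empty, so only the first element can be picked.
    empty="L3": L3 must be empty, so only the last element can be picked.
    """
    if not comps:
        return [[]]
    if empty == "L1":
        positions = [0]
    elif empty == "L3":
        positions = [len(comps) - 1]
    else:
        positions = list(range(len(comps)))
    orders = []
    for i in positions:
        remaining = comps[:i] + comps[i + 1:]
        for tail in discharge_orders(remaining, empty):
            orders.append([comps[i]] + tail)
    return orders


args = ["NP[nom]", "NP[dat]", "NP[acc]"]
free = discharge_orders(args)
fixed = discharge_orders(args, empty="L3")
print(len(free), fixed)
```

With three arguments, the unrestricted version licenses all six orders, while the restricted version licenses exactly one. The restricted grammar is the one that needs the extra statement.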
The analysis presented here utilizing the combination of arguments in any order is
similar to that of Fanselow (2001) in the framework of GB/MP as well as the Categorial
Grammar analyses of Hoffman (1995: Section 3.1) and Steedman & Baldridge (2006).
Gunji proposed similar HPSG analyses for Japanese as early as 1986. See also Kim (2016:
16) for such an analysis of Korean.
20
This does not exclude that the structures in question have different properties as far as their processability
by humans is concerned. See Gibson (1998); Hawkins (1999) and Chapter 15.
21
Haider (1997b: 18) has pointed out that the branching type of VX languages differs from that of XV languages
in analyses of the kind that is proposed here. This affects the c-command relations and therefore
has implications for Binding Theory in GB/MP. However, the direction of branching is irrelevant for HPSG
analyses as Binding Principles are defined using o-command (Pollard & Sag 1994: Chapter 6) and o-com-
mand makes reference to the Obliqueness Hierarchy, that is, the order of elements in the COMPS list rather
than the order in which these elements are combined with the head.
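[Figure 9.14, tree diagram; caption not recoverable from the extraction. Node labels, top down: VP → NP + VP/NP; VP/NP → V + VP/NP; V → V; VP/NP → NP/NP + V′; V′ → NP + V]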
In principle, one could also assume that the object is extracted from its unmarked
position (see Section 3.5 on the unmarked position). The extraction trace would then
follow the subject:
(55) [Diesen Mann] 𝑗 kennt𝑖 jeder _ 𝑗 _𝑖 .
this man knows everyone
‘Everyone knows this man.’
22
In HPSG, nothing is actually ‘passed up’ in a literal sense in feature structures or trees. This could be
seen as one of the most important differences between deterministic (e.g., HPSG) and derivational theories
like transformational grammars (see Section 15.1). Nevertheless, it makes sense for expository purposes to
explain the analysis as if the structure were built bottom-up, but linguistic knowledge is independent of
the direction of processing. In recent computer implementations, structure building is mostly carried out
bottom-up but there were other systems which worked top-down. The only thing that is important in the
analysis of nonlocal dependencies is that the information about the missing element on all intermediate
nodes is identical to the information in the filler and the gap.
9.5 Long-distance dependencies
Fanselow (2004c) argues that certain phrases can be placed in the Vorfeld without having
a special pragmatic function. For instance, (expletive) subjects in active sentences (56a),
temporal adverbials (56b), sentence adverbials (56c), dative objects of psychological verbs
(56d) and objects in passives (56e) can be placed in the Vorfeld, even though they are
neither topic nor focus.
(56) a. Es regnet.
it rains
‘It rains.’
b. Am Sonntag hat ein Eisbär einen Mann gefressen.
on Sunday has a polar.bear a man eaten
‘On Sunday, a polar bear ate a man.’
c. Vielleicht hat der Schauspieler seinen Text vergessen.
perhaps has the actor his text forgotten
‘Perhaps, the actor has forgotten his text.’
d. Einem Schauspieler ist der Text entfallen.
a.DAT actor is the.NOM text forgotten
‘An actor forgot the text.’
e. Einem Kind wurde das Fahrrad gestohlen.
a.DAT child was the.NOM bike stolen
‘A bike was stolen from a child.’
Fanselow argues that information structural effects can be due to reordering in the
Mittelfeld. So by ordering the accusative object as in (57), one can achieve certain effects:
(57) Kennt diesen Mann jeder?
knows this man everybody
‘Does everybody know this man?’
If one assumes that there are frontings to the Vorfeld that do not have information struc-
tural constraints attached to them and that information structural constraints are asso-
ciated with reorderings in the Mittelfeld, then the assumption that the initial element
in the Mittelfeld is fronted explains why the examples in (56) are not information struc-
turally marked. The elements in the Vorfeld are unmarked in the initial position in the
Mittelfeld as well:
(58) a. Regnet es?
rains it
‘Does it rain?’
b. Hat am Sonntag ein Eisbär einen Mann gefressen?
has on Sunday a polar.bear a man eaten
‘Did a polar bear eat a man on Sunday?’
Information about whether there has been combination with a trace and not with a gen-
uine argument is represented inside the complex sign and passed upward in the tree.
The long-distance dependency can then be resolved by an element in the prefield higher
in the tree.
Long-distance dependencies are introduced by the trace, which has a feature corre-
sponding to the LOCAL value of the required argument in its SLASH list. (61) shows the
description of the trace as is required for the analysis of (54):
Since traces do not have internal structure (no daughters), they are of type word. The
trace has the same properties as the accusative object. The fact that the accusative object
is not present at the position occupied by the trace is represented by the value of SLASH.
The following principle is responsible for ensuring that NONLOC information is passed
up the tree.
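[Tree diagram, largely lost in extraction. Recoverable labels: root node V[COMPS ⟨⟩, INHER|SLASH ⟨⟩]; the verb-initial item is licensed by V1-LR]
Figure 9.15: Analysis of Diesen Mann kennt jeder. ‘Everyone knows this man.’ combined
with the verb movement analysis for verb-initial order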
9.6 New developments and theoretical variants
for non-fronted arguments. The SLASH value of the extraction trace is passed up the tree
and bound off by the Head-Filler Schema.
(61) provides the lexical entry for a trace that can function as the accusative object
of kennen ‘to know’. As with the analysis of verb movement, it is not necessary to have
numerous extraction traces with differing properties listed in the lexicon. A more general
entry such as the one in (62) will suffice:
This has to do with the fact that the head can satisfactorily determine the LOCAL proper-
ties of its arguments and therefore also the local properties of the traces that it combines
with. The identification of the object in the COMPS list of the head with the SYNSEM value
of the trace coupled with the identification of the information in SLASH with information
about the fronted element serves to ensure that the only elements that can be realized
in the prefield are those that fit the description in the COMPS list of the head. The same
holds for fronted adjuncts: since the LOCAL value of the constituent in the prefield is
identified with the LOCAL value of the trace via the SLASH feature, there is then sufficient
information available about the properties of the trace.
The central points of the preceding analysis can be summarized as follows: informa-
tion about the local properties of a trace is contained in the trace itself and then present
on all nodes dominating it until one reaches the filler. This analysis can offer an expla-
nation for so-called extraction path marking languages where certain elements show
inflection depending on whether they are combined with a constituent out of which
something has been extracted in a long-distance dependency. Bouma, Malouf & Sag
(2001) cite Irish, Chamorro, Palauan, Icelandic, Kikuyu, Ewe, Thompson Salish, Moore,
French, Spanish, and Yiddish as examples of such languages and provide correspond-
ing references. Since information is passed on step-by-step in HPSG analyses, all nodes
intervening in a long-distance dependency can access the elements in that dependency.
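The step-by-step availability of SLASH information on every intervening node can be illustrated with a minimal Python sketch. Several things are assumptions made for the illustration: the names `Node`, `extraction_trace`, `combine` and `head_filler` are hypothetical, LOCAL values are reduced to category strings, the verb trace of the verb movement analysis is left out, and the mother's SLASH value is simply the concatenation of the SLASH values of its daughters.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    label: str
    phon: Tuple[str, ...]
    local: str                   # stand-in for the LOCAL value
    slash: Tuple[str, ...] = ()  # LOCAL values of missing elements


def extraction_trace(local: str) -> Node:
    """Trace: empty PHON, its own LOCAL value as the single SLASH element."""
    return Node("trace", (), local, (local,))


def combine(label: str, local: str, left: Node, right: Node) -> Node:
    """The mother collects the SLASH values of both daughters."""
    return Node(label, left.phon + right.phon, local, left.slash + right.slash)


def head_filler(filler: Node, head: Node) -> Node:
    """Bind off the dependency: the filler must match the missing element."""
    if head.slash != (filler.local,):
        raise ValueError("filler does not match the missing element")
    return Node("S", filler.phon + head.phon, head.local, ())


kennt = Node("V", ("kennt",), "V")
jeder = Node("NP", ("jeder",), "NP[nom]")
diesen_mann = Node("NP", ("diesen", "Mann"), "NP[acc]")

gap = extraction_trace("NP[acc]")
lower = combine("V'/NP", "V", jeder, gap)
upper = combine("VP/NP", "V", kennt, lower)
sentence = head_filler(diesen_mann, upper)

path = [n.slash for n in (gap, lower, upper, sentence)]
print(path)
print(" ".join(sentence.phon))
```

Every node between the trace and the filler carries the same SLASH element. This is the information that extraction path marking could be sensitive to under an analysis of this kind.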
(2004); Sato (2006); Wetta (2011). I also suggested linearization-based analyses (Müller
1999b; 2002a) and implemented a large-scale grammar fragment based on Reape’s ideas
(Müller 1996c). Linearization-based approaches to the German sentence structure are
similar to the GPSG approach in that it is assumed that verb and arguments and adjuncts
are members of the same linearization domain and hence may be realized in any order.
For instance, the verb may precede arguments and adjuncts or follow them. Hence, no
empty element for the verb in final position is necessary. While this allows for gram-
mars without empty elements for the analysis of the verb position, it is unclear how
examples with apparent multiple frontings can be accounted for, while such data can be
captured directly in the proposal suggested in this chapter. The whole issue is discussed
in more detail in Müller (2017a). I will not explain Reape’s formalization here, but defer
its discussion until Section 11.7.2.2, where the discontinuous, non-projective structures
of some Dependency Grammars are compared to linearization-based HPSG approaches.
Apparent multiple frontings and the problems they pose for simple linearization-based
approaches are discussed in Section 11.7.1.
9.7 Summary and classification
the information about the relative pronoun is contained in the representation of the
phrase an den. This information is bound off when the relative clause is put together
(Pollard & Sag 1994: Chapter 5; Sag 1997). It is possible to use the same lexical entry for
den in the analyses of both (63) and (64) as – unlike in Categorial Grammar – the relative
pronoun does not have to know anything about the contexts in which it can be used.
(64) der Mann, [RS [NP den] [S/NP wir kennen]]
the man that we know
‘the man that we know’
Any theory that wants to maintain the analysis sketched here will have to have some
mechanism to make information available about the relative pronoun in a complex
phrase. If we have such a mechanism in our theory – as is the case in LFG and HPSG –
then we can also use it for the analysis of long-distance dependencies. Theories such as
LFG and HPSG are therefore more parsimonious with their descriptive tools than other
theories when it comes to the analysis of relative phrases.
In the first decade of HPSG history (Pollard & Sag 1987; 1994; Nerbonne, Netter &
Pollard 1994), despite the differences already mentioned here, HPSG was still very sim-
ilar to Categorial Grammar in that it was a strongly lexicalized theory. The syntactic
make-up and semantic content of a phrase was determined by the head (hence the term
head-driven). In cases where head-driven analyses were not straightforwardly possible
because no head could be identified in the phrase in question, it was commonplace
to assume empty heads. An example of this is the analysis of relative clauses in Pollard
& Sag (1994: Chapter 5). Since an empty head can be assigned any syntactic valence and
an arbitrary semantics (for discussion of this point, see Chapter 19), assuming one does
not really explain anything unless there are very good reasons for it, for example that
the empty position can be realized in other contexts. This is, however, not
the case for empty heads that are only proposed in order to save theoretical assumptions.
Therefore, Sag (1997) developed an analysis of relative clauses without any empty ele-
ments. As in the analyses sketched for (63) and (64), the relative phrases are combined
directly with the partial clause in order to form the relative clause. For the various ob-
servable types of relative clauses in English, Sag proposes different dominance rules. His
analysis constitutes a departure from strong lexicalism: in Pollard & Sag (1994), there are
six dominance schemata, whereas there are 23 in Ginzburg & Sag (2000).
The tendency towards a differentiation of phrasal schemata can also be observed in the
proceedings of recent conferences. The proposals range from the elimination of empty
elements to radically phrasal analyses (Haugereid 2007; 2009).24
Even if this tendency towards phrasal analyses may result in some problematic analy-
ses, it is indeed the case that there are areas of grammar where phrasal analyses are re-
quired (see Section 21.10). For HPSG, this means that it is no longer entirely head-driven
and is therefore neither Head-Driven nor Phrase Structure Grammar.
HPSG makes use of typed feature descriptions to describe linguistic objects. General-
izations can be expressed by means of hierarchies with multiple inheritance. Inheritance
24
For discussion, see Müller (2007b) and Section 21.3.6.
also plays an important role in Construction Grammar. In theories such as GPSG, Cat-
egorial Grammar and TAG, it does not form part of theoretical explanations. In imple-
mentations, macros (abbreviations) are often used for co-occurring feature-value pairs
(Dalrymple, Kaplan & King 2004). Depending on the architecture assumed, such macros
are not suitable for the description of phrases since, in theories such as GPSG and LFG,
phrase structure rules are represented differently from other feature-value pairs (how-
ever, see Asudeh, Dalrymple & Toivonen (2008; 2013) for macros and inheritance used
for c-structure annotations). There are also further differences between types
and macros, which are of a more formal nature: in a typed system, it is possible under
certain conditions to infer the type of a particular structure from the presence of certain
features and of certain values. With macros, this is not the case as they are only
abbreviations. The consequences of these differences for linguistic analyses are, however,
minimal.
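The formal difference between types and macros can be illustrated with a toy example in Python. Everything in it is an assumption made for the illustration: the small type hierarchy, the table stating which type introduces which feature, the macro name `@HEADED`, and the functions `infer_type` and `expand`. The sketch infers the most general type compatible with the features found on a structure, whereas the expansion of a macro leaves only plain feature-value pairs behind.

```python
from typing import Dict, List, Optional, Set

# Toy type hierarchy: each type is listed with its immediate supertypes.
SUPERTYPES: Dict[str, List[str]] = {
    "sign": [],
    "word": ["sign"],
    "phrase": ["sign"],
    "headed-phrase": ["phrase"],
}

# Toy feature declarations: the type at which each feature is introduced.
INTRODUCES: Dict[str, str] = {
    "PHON": "sign",
    "SYNSEM": "sign",
    "HEAD-DTR": "headed-phrase",
}


def ancestors(t: str) -> Set[str]:
    """The type itself plus all of its supertypes."""
    result = {t}
    for s in SUPERTYPES[t]:
        result |= ancestors(s)
    return result


def infer_type(features: Set[str]) -> Optional[str]:
    """Most general type for which all given features are appropriate."""
    required = {INTRODUCES[f] for f in features}
    candidates = [t for t in SUPERTYPES if required <= ancestors(t)]
    for t in candidates:
        if all(t in ancestors(c) for c in candidates):
            return t
    return None


# A macro is only an abbreviation: expanding it yields plain feature-value pairs.
MACROS: Dict[str, Dict[str, str]] = {"@HEADED": {"HEAD-DTR": "sign"}}


def expand(macro: str) -> Dict[str, str]:
    return dict(MACROS[macro])


print(infer_type({"PHON", "HEAD-DTR"}))
print(infer_type({"PHON"}))
print(expand("@HEADED"))
```

In the typed version, the presence of HEAD-DTR is enough to conclude that the structure is a headed phrase. After macro expansion, nothing in the resulting pairs records which abbreviation was used.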
HPSG differs from GB theory and later variants in that it does not assume transforma-
tions. In the 80s, representational variants of GB were proposed, that is, it was assumed
that there was no D-structure from which an S-structure is created by simultaneous
marking of the original position of moved elements. Instead, one assumed the S-struc-
ture with traces straight away and the assumption that there were further movements
in the mapping of S-structure to Logical Form was also abandoned (Koster 1978; Haider
1993: Section 1.4; Frey 1993: 14). This view corresponds to the view in HPSG and many
of the analyses in one framework can be translated into the other.
In GB theory, the terms subject and object do not play a direct role: one can use
these terms descriptively, but subjects and objects are not marked by features or similar
devices. Nevertheless it is possible to make the distinction since subjects and objects are
usually realized in different positions in the trees (the subject in specifier position of IP
and the object as the complement of the verb). In HPSG, subject and object are also not
primitives of the theory. Since valence lists (or ARG-ST lists) are ordered, however, it
is possible to associate the ARG-ST elements with grammatical functions: if
there is a subject, this occurs in the first position of the valence list and objects follow.25
For the analysis of (65b) in a transformation-based grammar, the aim is to connect the
base order in (65a) and the derived order in (65b). Once one has recreated the base order,
then it is clear what is the subject and what is the object. Therefore, transformations
applied to the base structure in (65a) have to be reversed.
(65) a. [weil] jeder diesen Mann kennt
because everyone this man knows
‘because everyone knows this man’
b. [weil] diesen Mann jeder kennt
because this man everyone knows
In HPSG and also in other transformation-less models, the aim is to assign arguments in
the order in (65b) to descriptions in the valence list. The valence list (or ARG-ST in newer
approaches) corresponds in a sense to Deep Structure in GB. The difference is that the
25
When forming complex predicates, an object can occur in first position. See Müller (2002a: 157) for the
long passive with verbs such as erlauben ‘allow’. In general, the following holds: the subject is the first
argument with structural case.
head itself is not included in the argument structure, whereas this is the case with D-
structure.
Bender (2008c) has shown how one can analyze phenomena from non-configura-
tional languages such as Wambaya by referring to the argument structure of a head.
In Wambaya, words that would normally be counted as constituents in English or Ger-
man can occur discontinuously, that is an adjective that semantically belongs to a noun
phrase and shares the same case, number and gender values with other parts of the
noun phrase can occur in a position in the sentence that is not adjacent to the remaining
noun phrase. Nordlinger (1998) has analyzed the relevant data in LFG. In her analysis,
the various parts of the constituent refer to the f-structure of the sentence and thus in-
directly ensure that all parts of the noun phrase have the same case. Bender adopts a
variant of HPSG where valence information is not removed from the valence list after an
argument has been combined with its head, but rather this information remains in the
valence list and is passed up towards the maximal projection of the head (Meurers 1999c;
Przepiórkowski 1999b; Müller 2007a: Section 17.4). Similar proposals were made in GB
by Higginbotham (1985: 560) and Winkler (1997). By projecting the complete valence
information, it remains accessible in the entire sentence and discontinuous constituents
can refer to it (e.g., via MOD) and the respective constraints can be formulated.26 In this
analysis, the argument structure in HPSG corresponds to f-structure in LFG. The ex-
tended head domains of LFG, where multiple heads can share the same f-structure, can
also be modeled in HPSG. To this end, one can utilize function composition as it was
presented in the chapter on Categorial Grammar (see Section 8.5.2). The exact way in
which this is translated into HPSG cannot be explained here due to space restrictions.
The reader is referred to the original works by Hinrichs & Nakazawa (1994a) and the
explanation in Müller (2007a: Chapter 15).
Valence information plays an important role in HPSG. The lexical item of a verb in
principle predetermines the set of structures in which the item can occur. Using lexical
rules, it is possible to relate one lexical item to other lexical items. These can be used in
other sets of structures. The function of lexical rules can thus be seen as establishing a
relation between sets of possible structures. Lexical rules correspond to transformations
in Transformational Grammar. This point is discussed in more detail in Section 19.5. The
effect of lexical rules can also be achieved with empty elements. This is also
discussed in Section 19.5.
In GPSG, metarules were used to license rules that created additional valence patterns
for lexical heads. In principle, metarules could also be applied to rules without a lexical
head. This is explicitly ruled out by Flickinger (1983) and Gazdar et al. (1985: 59) using a
special constraint. Flickinger, Pollard & Wasow (1985: 265) pointed out that this kind of
constraint is unnecessary if one uses lexical rules rather than metarules since the former
can only be applied to lexical heads.
For a comparison of HPSG and Stabler’s Minimalist Grammars, see Section 4.6.4.
Torr’s implementation of Minimalist Grammars is discussed in Section 4.7.2 on pages
177–180.
26
See also Müller (2008) for an analysis of depictive predicates in German and English that makes reference
to the list of realized or unrealized arguments of a head, respectively. This analysis is also explained in
Section 18.2.
Comprehension questions
Exercises
Further reading
Here, the presentation of the individual parts of the theory was – as with other
theories – kept relatively short. For a more comprehensive introduction to HPSG,
including motivation of the feature geometry, see Müller (2007a). In particular,
the analysis of the passive was sketched in brief here. The entire story including
the analysis of unaccusative verbs, adjectival participles, modal infinitives as well
as diverse passive variants and the long passive can be found in Müller (2002a:
Chapter 3) and Müller (2007a: Chapter 17).
Overviews of HPSG can be found in Levine & Meurers (2006), Przepiórkow-
ski & Kupść (2006), Bildhauer (2014) and Müller (2015a). Language Science Press
will publish a large handbook on HPSG (Müller et al. 2020) containing chapters
on foundational assumptions, the history of the framework, various syntactic
phenomena, non-syntactic levels of description like morphology, semantics, in-
formation structure, dialog and the comparison with other frameworks (Minimal-
ism, Categorial Grammar, Construction Grammar, Lexical Functional Grammar,
Dependency Grammar).
Müller (2014b) and Müller & Machicao y Priemer (2019) are two papers in
collections comparing frameworks. The first one is in German and contains an
analysis of a newspaper text.a The second one is in English and contains a gen-
eral description of the framework and a detailed analysis of the sentence in (69):b
(69) After Mary introduced herself to the audience, she turned to a man that
she had met before.
The books are similar to this one in that the respective authors describe a shared
set of phenomena within their favorite theories but the difference is that the
descriptions come straight from the horse’s mouth. The newspaper text is especially
interesting since, for some theories, it was the first time they were
applied to real-life data. As a result, one sees phenomena covered that are
rarely treated in the rest of the literature.
a See https://hpsg.hu-berlin.de/~stefan/Pub/artenvielfalt.html for the example sentences and some
interactive analyses of the examples.
b See https://hpsg.hu-berlin.de/~stefan/Pub/current-approaches-hpsg.html for an interactive
analysis of the example.
10 Construction Grammar
Like LFG and HPSG, Construction Grammar (CxG) forms part of West Coast linguistics.
It has been influenced considerably by Charles Fillmore, Paul Kay and George Lakoff (all
three at Berkeley) and Adele Goldberg (who completed her PhD at Berkeley and is now
at Princeton) (Fillmore 1988; Fillmore, Kay & O’Connor 1988; Kay & Fillmore 1999; Kay
2002; 2005; Goldberg 1995; 2006).
Fillmore, Kay, Jackendoff and others have pointed out the fact that, to a large extent,
languages consist of complex units that cannot straightforwardly be described with the
tools that we have seen thus far. In frameworks such as GB, an explicit distinction is
made between core grammar and the periphery (Chomsky 1981a: 8), whereby the pe-
riphery is mostly disregarded as uninteresting when formulating a theory of Universal
Grammar. The criticism leveled at such practices by CxG is justified since what counts
as the ‘periphery’ sometimes seems completely arbitrary (Müller 2014d) and no progress
is made by excluding large parts of the language from the theory just because they are
irregular to a certain extent.
In Construction Grammar, idiomatic expressions are often discussed with regard to
their interaction with regular areas of grammar. Kay & Fillmore (1999) studied the What’s
X doing Y?-construction in their classic essay. (1) contains some examples of this con-
struction:
(1) a. What is this scratch doing on the table?
b. What do you think your name is doing in my book?
The examples show that we are clearly not dealing with the normal meaning of the
verb do. As well as the semantic bleaching here, there are particular morphosyntactic
properties that have to be satisfied in this construction. The verb do must always be
present and also in the form of the present participle. Kay and Fillmore develop an
analysis explaining this construction and also capturing some of the similarities between
the WXDY-construction and the rest of the grammar.
There are a number of variants of Construction Grammar:
• Berkeley Construction Grammar (Fillmore 1988; Kay & Fillmore 1999; Fried 2015)
• Goldbergian/Lakovian Construction Grammar (Lakoff 1987; Goldberg 1995; 2006)
• Cognitive Grammar (Langacker 1987; 2000; 2008; Dąbrowska 2004)
• Radical Construction Grammar (Croft 2001)
• Embodied Construction Grammar (Bergen & Chang 2005)
The aim of Construction Grammar is to both describe and theoretically explore language
in its entirety. In practice, however, irregularities in language are often given far more
importance than the phenomena described as ‘core grammar’ in GB. Analyses in Construction
Grammar usually treat phenomena as phrasal patterns. These phrasal patterns are
represented in inheritance hierarchies (e.g., Croft 2001; Goldberg 2003b). An example
of the assumption of a phrasal construction is Goldberg’s analysis of resultative constructions.
Goldberg (1995) and Goldberg & Jackendoff (2004) argue for the construction
status of resultatives. In their view, there is no head in (2) that determines the number
of arguments.
(2) Willy watered the plants flat.
The number of arguments is determined by the construction instead, that is, by a rule
or schema saying that the subject, verb, object and a predicative element must occur to-
gether and that the entire complex has a particular meaning. This view is fundamentally
different from analyses in GB, Categorial Grammar, LFG1 and HPSG. In the aforemen-
tioned theories, it is commonly assumed that arguments are always selected by lexical
heads and not independently licensed by phrasal rules. See Simpson (1983), Neeleman
(1994), Wunderlich (1997), Wechsler (1997), and Müller (2002a) for corresponding work
in LFG, GB, Wunderlich’s Lexical Decomposition Grammar and HPSG.
Like the theories discussed in Chapters 5–9, CxG is also a non-transformational the-
ory. Furthermore, no empty elements are assumed in most variants of the theory and the
assumption of lexical integrity is maintained as in LFG and HPSG. It can be shown that
these assumptions are incompatible with phrasal analyses of resultative constructions
(see Section 21.2.2 and Müller 2006; 2007b). This point will not be explained further
here. Instead, I will discuss the work of Fillmore and Kay to prepare the reader to be
able to read the original articles and subsequent publications. Although the literature
on Construction Grammar is now relatively vast, there is very little work on the basic
formal assumptions or analyses that have been formalized precisely. Examples of more
formal works are Kay & Fillmore (1999), Kay (2002), Michaelis & Ruppenhofer (2001),
and Goldberg (2003b). Another formal proposal was developed by Jean-Pierre Koenig
(1999) (formerly Berkeley). This work is couched in the framework of HPSG, but it has
been heavily influenced by CxG. Fillmore and Kay’s revisions of their earlier work took
place in close collaboration with Ivan Sag. The result was a variant of HPSG known
as Sign-Based Construction Grammar (SBCG) (Sag 2010; 2012). See Section 10.6.2 for
further discussion.
John Bryant, Nancy Chang and Eva Mok have developed a system for the implementation
of Embodied Construction Grammar (Bryant 2003). Luc Steels is working on the
simulation of language evolution and language acquisition (Steels 2003). Steels works
experimentally, modeling virtual communities of interacting agents. Apart from this he
1 See Alsina (1996) and Asudeh, Dalrymple & Toivonen (2008; 2013), however. For more discussion of this
point, see Sections 21.1.3 and 21.2.2.
10.1 General remarks on the representational format
uses robots that interact in language games (Steels 2015). Steels (p. c. 2007) stated
that there is still a long way to go before robots will finally be able to learn
to speak, but that the current state of the art is already impressive. Steels can use robots that
have a visual system (camera and image processing) and use visual information paired
with audio information in simulations of language acquisition. The implementation of
Fluid Construction Grammar is documented in Steels (2011) and Steels (2012). The second
book contains parts about German, in which the implementation of German declarative
clauses and w-interrogative clauses is explained with respect to topological fields (Micelli
2012). The FCG system, various publications and example analyses are available
at: http://www.fcg-net.org/. Jurafsky (1996) developed a Construction Grammar for En-
glish that was paired with a probabilistic component. He showed that many performance
phenomena discussed in the literature (see Chapter 15 on the Competence/Performance
Distinction) can be explained with recourse to probabilities of phrasal constructions and
valence properties of words. Bannard, Lieven & Tomasello (2009) use a probabilistic context-free
grammar to model the grammatical knowledge of two- and three-year-old children.
Lichte & Kallmeyer (2017) show that their version of Tree-Adjoining Grammar can be
seen as a formalization of several tenets of Construction Grammar. See for example the
analysis of idioms explained in Figure 18.6 on page 564.
A head is combined with at least one complement (the ‘+’ following the box stands for
at least one sign that fits the description in that box). LOC+ means that this element must
be realized locally. The value of ROLE tells us something about the role that a particular
element plays in a construction. Unfortunately, here the term filler is used somewhat
differently than in GPSG and HPSG. Fillers are not necessarily elements that stand in a
long-distance dependency to a gap. Instead, a filler is a term for a constituent that fills
the argument slot of a head.
The verb phrase construction is a sub-construction of the head-complement construc-
tion:
(6) Verb phrase Construction:
cat v
The syntactic category of the entire construction is V. Its complements cannot have the
grammatical function subject.
The VP construction is a particular type of head-complement construction. The fact
that it has much in common with the more general head-complement construction is
represented as follows:
(7) Verb phrase Construction with inheritance statement:
    INHERIT HC
    cat v
    [ gf ¬subj ]+
This representation differs from the one in HPSG, aside from the box notation, only in the
fact that feature descriptions are not typed and as such it must be explicitly stated in the
representation from which superordinate construction inheritance takes place. HPSG –
in addition to the schemata – has separate type hierarchies specifying the inheritance
relation between types.
10.1.3 Semantics
Semantics in CxG is handled exactly the same way as in HPSG: semantic information is
contained in the same feature structure as syntactic information. The relation between
syntax and semantics is captured by using the same variable in the syntactic and seman-
tic description. (8) contains a feature description for the verb arrive:
(8) Lexical entry for arrive following Kay & Fillmore (1999: 11):
    [ cat v
      sem { [ I
              FRAME ARRIVE
              ARGS { A } ] }
      val [ SEM { A } ] ]
Kay & Fillmore (1999: 9) refer to their semantic representations as a notational variant of
the Minimal Recursion Semantics of Copestake, Flickinger, Pollard & Sag (2005). In later
works, Kay (2005) explicitly uses MRS. As the fundamentals of MRS have already been
discussed in Section 9.1.6, I will not repeat them here. For more on MRS, see Section 19.3.
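The effect of using the same variable in the syntactic and the semantic part of a description can be illustrated with a minimal sketch. The class Var, the encoding of sets as Python lists and the value "the-train" are assumptions made for this example only; nothing here is taken from Kay & Fillmore's formalization.

```python
class Var:
    """A variable shared between several places in a description."""

    def __init__(self, name):
        self.name = name
        self.value = None  # unbound until some constraint supplies a value

    def bind(self, value):
        if self.value is not None and self.value != value:
            raise ValueError(f"{self.name} is already bound to {self.value!r}")
        self.value = value


A = Var("A")

# Rough counterpart of (8); sets are encoded as lists (an assumption of this sketch).
arrive = {
    "cat": "v",
    "sem": [{"frame": "ARRIVE", "args": [A]}],
    "val": [{"sem": [A]}],
}

# Combining the verb with an argument fixes the semantics of the valence element ...
arrive["val"][0]["sem"][0].bind("the-train")

# ... and the argument slot of the ARRIVE frame sees the same value,
# because both positions hold the identical object.
frame_argument = arrive["sem"][0]["args"][0]
```

The point of the sketch is that no information is copied: the valence element and the argument slot of the frame point to one and the same object, which is what the shared variable A in (8) expresses.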
10.1.4 Adjuncts
For the combination of heads and modifiers, Kay and Fillmore assume further phrasal
constructions that are similar to the verb phrase constructions discussed above and cre-
ate a relation between a head and a modifier. Kay and Fillmore assume that adjuncts also
contribute something to the VAL value of the mother node. In principle, VAL is nothing
more than the set of all non-head daughters in a tree.
10.2 Passive
The passive has been described in CxG by means of so-called linking constructions,
which are combined with lexical entries in inheritance hierarchies. In the base lexicon, it
is only listed which semantic roles a verb fulfils and the way in which these are realized
is determined by the respective linking constructions with which the basic lexical entry
2 Sets in BCG work differently from those used in HPSG. A discussion of this is deferred to Section 10.6.1.
The structure in (9a) says that the valence set of a linguistic object that is described by
the transitive construction has to contain an element that has the grammatical function
object and whose DA value is ‘−’. The DA value of the argument that would be the subject
in an active clause is ‘+’ and ‘−’ for all other arguments. The subject construction states
that an element of the valence set must have the grammatical function subject. In the
passive construction, there has to be an element with the grammatical function oblique
that also has the DA value ‘+’. In the passive construction the element with the DA value
‘+’ is realized either as a by-PP or not at all (zero).
The interaction of the constructions in (9) will be explained on the basis of the verb
schlagen ‘to beat’:
(10) Lexical entry for schlag- ‘beat’:
     [ SYN [ CAT v ]
       VAL { [ ROLE [ θ agent ], DA + ],
             [ ROLE [ θ patient ] ] } ]
If we combine this lexical entry with the transitive and subject constructions, we arrive
at (11a) following Fillmore, Kay, Michaelis, and Ruppenhofer, whereas combining it with
the subject and passive construction yields (11b):5
(11) a. schlag- + Subject and Transitive Construction:
        [ SYN [ CAT v, VOICE active ]
          VAL { [ ROLE [ θ agent, GF subj ], DA + ],
                [ ROLE [ θ patient, GF obj ], DA − ] } ]
     b. schlag- + Subject and Passive Construction:
        [ SYN [ CAT v, FORM PastPart ]
          VAL { [ ROLE [ θ agent, GF obl ], DA +, SYN P[von]/zero ],
                [ ROLE [ θ patient, GF subj ] ] } ]
Using the entries in (11), it is possible to analyze the sentences in (12):
(12) a. Er schlägt den Weltmeister.
he beats the world.champion
‘He is beating the world champion.’
5 This assumes a particular understanding of set unification. For criticism of this, see Section 10.6.1.
(13) The Subject Construction with Pollard & Moshier’s definition of sets:
     [ SYN|CAT V
       VAL 1 ]  ∧  { [ ROLE [ GF subj ] ] } ⊂ 1
The restriction in (13) states that the valence set of a head has to contain an element that
has the grammatical function subj. By these means, it is possible to suppress arguments
(by specifying SYN as zero), but it is not possible to add any additional arguments to the
fixed set of arguments of schlagen ‘to beat’.6 For the analysis of Middle Constructions
such as (14), inheritance-based approaches do not work as there is no satisfactory way
to add the reflexive pronoun to the valence set:7
(14) Das Buch liest sich gut.
the book reads REFL good
‘The book reads well / is easy to read.’
If we want to introduce additional arguments, we require auxiliary features. An analy-
sis using auxiliary features has been suggested by Koenig (1999). Since there are many
argument structure changing processes that interact in various ways and are linked to
particular semantic side-effects, it is inevitable that one ends up assuming a large num-
ber of syntactic and semantic auxiliary features. The interaction between the various
linking constructions becomes so complex that this analysis also becomes cognitively
implausible and has to be viewed as technically unusable. For a more detailed discussion
of this point, see Müller (2007a: Section 7.5.2).
6 Rather than requiring that schlagen ‘to beat’ has exactly two arguments as in HPSG, one could also assume
that the constraint on the main lexical item would be of the kind in (11a). One would then require that
schlagen has at least the two members in its valence set. This would complicate everything considerably
and furthermore it would not be clear that the subject referred to in (13) would be one of the arguments
that are referred to in the description of the lexical item for schlagen in (11a).
7 One technically possible solution would be the following: one could assume that verbs that occur in middle
constructions always have a description of a reflexive pronoun in their valence set. The Transitive Con-
struction would then have to specify the SYN value of the reflexive pronoun as zero so that the additional
reflexive pronoun is not realized in the Transitive Construction. The middle construction would suppress
the subject, but realize the object and the reflexive.
This solution cannot be applied to the recursive processes we will encounter in a moment such as
causativization in Turkish, unless one wishes to assume infinite valence sets.
The following empirical problem is much more serious: some processes like passiviza-
tion, impersonalization and causativization can be applied in combination or even al-
low for multiple application, but if the grammatical function of a particular argument
is determined once and for all by unification, additional unifications cannot change the
initial assignment. We will first look at languages which allow for a combination of
passivization and impersonalization, such as Lithuanian (Timberlake 1982: Section 5),
Irish (Noonan 1994), and Turkish (Özkaragöz 1986; Knecht 1985: Section 2.3.3). I will use
Özkaragöz’s Turkish examples in (15) for illustration (1986: 77):
(15) a. Bu şato-da boğ-ul-un-ur.
this château-LOC strangle-PASS-PASS-AOR
‘One is strangled (by one) in this château.’
b. Bu oda-da döv-ül-ün-ür.
this room-LOC hit-PASS-PASS-AOR
‘One is beaten (by one) in this room.’
c. Harp-te vur-ul-un-ur.
war-LOC shoot-PASS-PASS-AOR
‘One is shot (by one) in war.’
-In, -n, and -Il are allomorphs of the passive/impersonal morpheme.8
Approaches that assume that the personal passive is the unification of some general
structure with some passive-specific structure will not be able to capture double pas-
sivization or passivization + impersonalization since they have committed themselves
to a certain structure too early. The problem for nontransformational approaches that
state syntactic structure for the passive is that such a structure, once stated, cannot be
modified. That is, we said that the underlying object is the subject in the passive sen-
tence. But in order to get the double passivization/passivization + impersonalization,
we have to suppress this argument as well. What is needed is some sort of process (or
description) that takes a representation and relates it to another representation with a
suppressed subject. This representation is related to a third representation which again
suppresses the subject resulting in an impersonal sentence. In order to do this one needs
different strata as in Relational Grammar (Timberlake 1982; Özkaragöz 1986), metarules
(Gazdar, Klein, Pullum & Sag 1985), lexical rules (Dowty 1978: 412; 2003: Section 3.4;
Bresnan 1982b; Pollard & Sag 1987; Blevins 2003; Müller 2003b), transformations (Chom-
sky 1957), or just a morpheme-based morphological analysis that results in items with
different valence properties when the passivization morpheme is combined with a head
(Chomsky 1981a).
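The idea of a process that takes a representation and relates it to another one with a suppressed subject can be sketched as a function from lexical entries to lexical entries. The following Python fragment is a minimal illustration, not an analysis of Turkish: the names Entry and suppress_subject are hypothetical, the choice of allomorph is simply passed in by hand, and the aorist suffix of (15a) is left out.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    stem: str
    args: Tuple[str, ...]        # roles still to be realized, subject first
    suppressed: Tuple[str, ...]  # roles that have been suppressed so far


def suppress_subject(entry: Entry, suffix: str) -> Entry:
    """Hypothetical lexical rule: relate an entry to a NEW entry in which
    the first (subject) argument is suppressed. The input is not modified."""
    if not entry.args:
        raise ValueError("no argument left to suppress")
    return Entry(
        stem=f"{entry.stem}-{suffix}",
        args=entry.args[1:],
        suppressed=entry.suppressed + entry.args[:1],
    )


strangle = Entry("boğ", ("agent", "patient"), ())

# First application: personal passive, the patient is now the first argument.
passive = suppress_subject(strangle, "ul")

# Second application: the derived subject is suppressed as well.
impersonal = suppress_subject(passive, "un")
```

Since every application yields a new representation, the output of one application can serve as the input of the next. A single description to which further constraints are added by unification does not have this property, since the grammatical function assigned in the first step cannot be revised later.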
The second set of problematic data that will be discussed comes from causativization
in Turkish (Lewis 1967: 146):
8 According to Özkaragöz, the data is best captured by an analysis that assumes that the passive applies
to a passivized transitive verb and hence results in an impersonal passive. The cited authors discussed
their data as instances of double passivization, but it was argued by Blevins (2003) that these and similar
examples from other languages are impersonal constructions that can be combined with personal passives.
(16) öl-dür-t-tür-t-
‘to cause someone to cause someone to cause someone to kill someone’
(kill = cause someone to die)
The causative morpheme -t is combined four times with the verb (tür is an allomorph of
the causative morpheme). This argument structure-changing process cannot be modeled
in an inheritance hierarchy since if we were to say that a word can inherit from the
causative construction three times, we would still not have anything different to what
we would have if the inheritance via the causative construction had applied only once.
For this kind of phenomenon, we would require rules that relate a linguistic object to
another, more complex object, that is, lexical rules (unary branching rules which change
the phonology of a linguistic sign) or binary rules that combine a particular sign with a
derivational morpheme. These rules can semantically embed the original sign (that is,
add cause to kill).
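The contrast between inheritance and rules can be illustrated with a small sketch. It is deliberately simplified: feature structures are flat dictionaries, the feature causative is invented for the example, and the affix is written as the abstract label CAUS rather than as one of the actual allomorphs.

```python
def inherit(description: dict, construction: dict) -> dict:
    """Inheritance as monotonic addition of constraints (flat features only)."""
    for key, value in construction.items():
        if description.get(key, value) != value:
            raise ValueError(f"conflict on {key!r}")
    return {**description, **construction}


CAUSATIVE_CX = {"causative": "+"}  # hypothetical constraint of a causative construction


def causative_rule(sign: dict) -> dict:
    """Lexical rule: relate a sign to a new, more complex sign that
    semantically embeds the meaning of the input sign."""
    return {
        "stem": sign["stem"] + "-CAUS",
        "sem": ("cause", sign["sem"]),
    }


die = {"stem": "öl", "sem": "die"}

# Inheriting once or three times gives the same description.
once = inherit(die, CAUSATIVE_CX)
thrice = inherit(inherit(once, CAUSATIVE_CX), CAUSATIVE_CX)
inheritance_is_idempotent = once == thrice

# Applying the rule repeatedly gives a different result each time.
kill = causative_rule(die)
deeper = causative_rule(causative_rule(kill))
```

In this sketch, adding the same constraints again does not change anything, whereas each application of the rule adds one further layer of embedding in the semantics and one further affix to the stem.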
The problem of repeated combination with causativization affixes is an instance of
a more general problem: derivational morphology cannot be handled by inheritance
as was already pointed out by Krieger & Nerbonne (1993) with respect to cases like
preprepreversion.
If we assume that argument alternations such as passive, causativization and the Middle
Construction should be described with the same means across languages, then evidence
from Lithuanian and Turkish forms an argument against inheritance-based analyses
of the passive (Müller 2006; 2007b; Müller & Wechsler 2014a). See also Section 21.2.2
for the discussion of an inheritance-based approach to passive in LFG and Section 21.4.2
for the discussion of an inheritance-based approach in Simpler Syntax.
Welke states that arguments of a verb come in a fixed order in what he calls the pri-
mary perspectivization. So in (17), nominative comes before accusative. In addition to
9 Note that none of the constituent tests that were discussed in Section 1.3 justifies such an analysis and that
no other theory in this book assumes the Mittelfeld to be a constituent.
10.3 Verb position
(18a) is the schema for verb-initial clauses, (18b) the schema for verb-second clauses and
(18c) the schema for verb-final clauses. These schemata stand for argument structure
constructions with exactly three arguments, but of course there are others with fewer
or more arguments. Welke (2019: Section 7.4) notes that modifiers can be placed any-
where between the arguments in German. He assumes that there is a fusion process that
can insert a modifier into argument structure constructions. Welke notes that inserting
modifiers into constructions like (18) causes problems with the verb-second property of
German since a pattern Modifier Arg Verb Arg Arg is V3 rather than V2. He hence con-
cludes that a generalized construction consisting of verb(s), arguments and modifiers is
needed: a construction that corresponds to the topological fields model of the German
clause (Welke 2019: Section 7.5). Details about the integration of the semantics of the ad-
juncts, about the interaction of the topological fields schemata with argument structure
constructions and about the analysis of nonlocal dependencies are not provided.
So, it must be said that neither Micelli’s analysis nor Welke’s provides fully worked
out proposals of the phenomena discussed in this book and hence I will not discuss
them any further, but instead explore some of the possibilities for analyzing German
sentence structure that are at least possible in principle in a CxG framework. Since
there are neither empty elements nor transformations, the GB and HPSG analyses from
Chapters 3 and 9 as well as their variants in Categorial Grammar are ruled out. The
following options remain:
Different variants of CxG make different assumptions about how abstract constructions
can be. In Categorial Grammar, we have very general combinatorial rules which com-
bine possibly complex signs without adding any meaning of their own (see rule (2) on
page 246 for example). (19) shows an example in which the abstract rule of forward
application was used:
10.4 Local reordering
constituent with an open valence slot.12 This approach corresponds to the LFG analysis
of Kaplan & Zaenen (1989) based on functional uncertainty.
12 Note again that there are problems with the formalization of this proposal in Kay & Fillmore’s paper. The
formalization of VAL, which was provided by Andreas Kathol, seems to presuppose a formalization of sets
like the one that is used in HPSG, but the rest of Fillmore & Kay’s paper assumes a different formalization,
which is inconsistent with it. See Section 10.6.1.
10.6 New developments and theoretical variants
The first NP in (24) is underspecified with respect to its case. The case of the NP in the
second set is specified as nominative. NP[nom] does not unify with NP[acc] but with
NP.
This particular conception of unification has consequences. Unification is usually de-
fined as follows:
(25) The unification of two structures FS1 and FS2 is the structure FS3 that is subsumed
by both FS1 and FS2 where there is no other structure that is subsumed by both FS1
and FS2 and subsumes FS3.
A structure FS1 is said to subsume FS3 iff FS3 contains all feature value pairs and structure
sharings from FS1 . FS3 may contain additional feature value pairs or structure sharings.
The consequence is that the subsumption relations in (26b,c) have to hold if unification
of valence sets works as in (26a):
(26) Properties of the set unification according to Kay & Fillmore (1999):
a. { NP[nom] } ∧ { NP[acc] } = { NP[nom], NP[acc] }
b. { NP[nom] } ⪰ { NP[nom], NP[acc] }
c. { NP[acc] } ⪰ { NP[nom], NP[acc] }
(26b) means that a feature structure with a valence set that contains just one NP[nom]
is more general than a feature structure that contains both an NP[nom] and an NP[acc].
Therefore the set of transitive verbs is a subset of the intransitive verbs. This is rather
unintuitive, but compatible with Fillmore & Kay’s system for the licensing of arguments.
However, there are problems with the interaction of valence specifications and linking
constructions, which we turn to now.
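The definitions in (25) and (26) can be tried out with a small sketch. It rests on several assumptions that are not part of Kay & Fillmore's proposal: feature structures are flat Python dictionaries without structure sharing, sets are lists, kf_set_unify is a greedy approximation of the set unification in (26a), and set_subsumes is one possible rendering of the relations in (26b,c).

```python
def subsumes(general: dict, specific: dict) -> bool:
    """general subsumes specific iff specific contains all feature-value
    pairs of general (specific may contain more)."""
    return all(k in specific and specific[k] == v for k, v in general.items())


def unify(fs1: dict, fs2: dict):
    """Most general structure subsumed by both inputs, or None on conflict."""
    if any(fs1[k] != fs2[k] for k in fs1.keys() & fs2.keys()):
        return None
    return {**fs1, **fs2}


def kf_set_unify(set1: list, set2: list) -> list:
    """Greedy sketch of (26a): elements that unify are merged,
    elements that do not unify with anything are added to the set."""
    result = [dict(e) for e in set1]
    for element in set2:
        for i, candidate in enumerate(result):
            merged = unify(candidate, element)
            if merged is not None:
                result[i] = merged
                break
        else:
            result.append(dict(element))
    return result


def set_subsumes(general: list, specific: list) -> bool:
    """Every element of general subsumes some element of specific."""
    return all(any(subsumes(g, s) for s in specific) for g in general)


NP = {"cat": "NP"}
NP_NOM = {"cat": "NP", "case": "nom"}
NP_ACC = {"cat": "NP", "case": "acc"}

extended = kf_set_unify([NP_NOM], [NP_ACC])  # incompatible: the set grows
merged = kf_set_unify([NP], [NP_NOM])        # compatible: the element is specified
```

Under these assumptions, extended has two elements, and both [NP_NOM] and [NP_ACC] subsume it in the sense of set_subsumes, which corresponds to (26b,c). The underspecified NP, in contrast, is simply merged with NP[nom].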
We have seen the result of combining lexical items with linking constructions in (11a)
and (11b), but the question of how these results are derived has not been addressed so
far. Kay (2002) suggests an automatic computation of all compatible combinations of
maximally specific constructions. Such a procedure could be used to compute the lexical
representations we saw in Section 10.2 and these could be then used to analyze the well-
formed sentences in (12).
However, problems would result for ungrammatical sentences like (27b). grauen ‘to
dread’ is a subjectless verb. If one simply combined all compatible linking constructions
with grauen, the Kay & Fillmoreian conception of set unification would cause the
introduction of a subject into the valence set of grauen. (27b) would be licensed by the
grammar:
(27) a. Dem Student graut vor der Prüfung.
the.DAT student dreads before the exam
‘The student dreads the exam.’
b. * Ich graue dem Student vor der Prüfung.
I dread the.DAT student before the exam
One could solve this problem by specifying an element with the grammatical function
subject in the lexical entry of grauen ‘to dread’. In addition, it would have to be stipulated
that this subject can only be realized as an overt or covert expletive (the covert expletive
would be SYN zero). For the covert expletive, this means it has neither a form nor a
meaning. Such expletive pronouns without phonological realization are usually frowned
upon in Construction Grammar and analyses that can do without such abstract entities
are to be preferred.
Kay & Fillmore (1999) represent the semantic contribution of signs as sets as well. This
excludes the possibility of preventing the unwanted unification of linking constructions
by referring to semantic constraints since we have the same effect as we have with va-
lence sets: if the semantic descriptions are incompatible, the set is extended. This means
that in an automatic unification computation all verbs are compatible with the Transi-
tive Construction in (9a) and this would license analyses for (28) in addition to those of
(27b).
(28) a. * Der Mann schläft das Buch.
the man sleeps the book
b. * Der Mann denkt an die Frau das Buch.
the man thinks at the woman the book
An intransitive verb was unified with the Transitive Construction in the analysis of (28a)
and in (28b) a verb that takes a prepositional object was combined with the Transitive
Construction. This means that representations like (11) cannot be computed automati-
cally as was intended by Kay (2002). Therefore one would have to specify subconstruc-
tions for all argument structure possibilities for every verb (active, passive, middle, …).
This does not capture the fact that speakers can form passives after acquiring new verbs
without having to learn about the fact that the newly learned verb forms one.
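Why an automatic computation overgenerates under this conception of set unification can be seen in a short sketch. The encoding is hypothetical: linking constructions are reduced to the valence element they require, feature structures are flat dictionaries, the role label sleeper is invented, and add_to_valence is a greedy simplification of the set unification discussed above.

```python
def unify(a: dict, b: dict):
    """Unify two flat feature structures; None signals a conflict."""
    if any(a[k] != b[k] for k in a.keys() & b.keys()):
        return None
    return {**a, **b}


def add_to_valence(valence: list, description: dict) -> list:
    """Merge the description into the first compatible element;
    if there is none, extend the valence set with a new element."""
    for i, element in enumerate(valence):
        merged = unify(element, description)
        if merged is not None:
            return valence[:i] + [merged] + valence[i + 1:]
    return valence + [description]


# Linking constructions reduced to the element they require (hypothetical encoding).
SUBJECT_CX = {"gf": "subj"}
TRANSITIVE_CX = {"gf": "obj", "da": "-"}

# schlafen 'to sleep' lists a single role in the base lexicon.
schlafen = [{"theta": "sleeper", "da": "+"}]

result = schlafen
for construction in (SUBJECT_CX, TRANSITIVE_CX):
    result = add_to_valence(result, construction)
```

The Subject Construction is unified with the only argument of schlafen. The Transitive Construction is incompatible with this element, but instead of causing a failure it adds an object to the valence set, so that a string like (28a) would be licensed.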
Michaelis & Ruppenhofer (2001) do not use sets for the representation of semantic
information. Therefore they could use constraints regarding the meaning of verbs in
the Transitive Construction. To this end, one needs to represent semantic relations with
feature descriptions as it was done in Section 9.1.6. Adopting such a representation, it
is possible to talk about two-place relations in an abstract way. See for instance the
discussion of (29) on page 284. However, the unification with the Subject Construction
cannot be blocked with reference to semantics since there exist so-called raising verbs
that take a subject without assigning a semantic role to it. As is evidenced by subject
verb agreement, du ‘you’ is the subject in (29), but the subject does not get a semantic
role. The referent of du is not the one who seems.
(29) Du scheinst gleich einzuschlafen.
you seem.2SG soon in.to.sleep
‘You seem like you will fall asleep soon.’
This means that one is forced to either assume an empty expletive subject for verbs like
grauen or to specify explicitly which verbs may inherit from the subject construction
and which may not.
In addition to (29), there exist object raising constructions with accusative objects that
can be promoted to subject in passives. The subject in the passive construction does not
get a semantic role from the finite verb:
(30) a. Richard lacht ihn an.
Richard laughs him towards
‘Richard smiles at him.’
b. Richard fischt den Teich leer.
Richard fishes the pond empty
The objects in (30) are semantic arguments of an ‘towards’ and leer ‘empty’, respectively,
but not semantic arguments of the verbs lacht ‘laughs’ and fischt ‘fishes’, respectively.
If one wants to explain these active forms and the corresponding passive forms via the
linking constructions in (9), one cannot refer to semantic properties of the verb. There-
fore, one is forced to postulate specific lexical entries for all possible verb forms in active
and passive sentences.
(31) Head-Complement Construction following Sag, Wasow & Bender (2003: 481):
     head-comp-cx ⇒
     [ MOTHER|SYN|VAL|COMPS ⟨⟩
       HEAD-DTR 0 [ word
                    SYN|VAL|COMPS 𝐴 ]
       DTRS ⟨ 0 ⟩ ⊕ 𝐴 nelist ]
The value of COMPS is then a list of the complements of a head (see Section 9.1.1). Unlike
in standard HPSG, it is not synsem objects that are selected with valence lists, but rather
signs. The analysis of the phrase ate a pizza takes the form in (32).13
(32) [ head-comp-cx
       MOTHER   [ phrase
                  FORM ⟨ ate, a, pizza ⟩
                  SYN [ HEAD verb
                        SPR ⟨ NP[nom] ⟩
                        COMPS ⟨⟩ ]
                  SEM … ]
       HEAD-DTR 1 [ word
                    FORM ⟨ ate ⟩
                    SYN [ HEAD verb
                          SPR ⟨ NP[nom] ⟩
                          COMPS ⟨ 2 NP[acc] ⟩ ]
                    SEM … ]
       DTRS ⟨ 1, 2 ⟩ ]
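The way in which (31) licenses the construct in (32) can be sketched as follows. The fragment is a simplification under explicit assumptions: the class Sign and the function head_comp_cx are hypothetical, complements are matched via category labels rather than via full signs, and the identity of the HEAD and SPR values of mother and head daughter is copied from (32), since it does not follow from (31) alone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sign:
    kind: str          # "word" or "phrase"
    form: List[str]
    head: str
    spr: List[str]
    comps: List[str]
    label: str = ""    # category under which a head selects this sign


def head_comp_cx(head_dtr: Sign, complements: List[Sign]) -> Optional[Sign]:
    """Return the MOTHER sign if head daughter and complements form a
    head-complement construct in the sense of (31), otherwise None."""
    # HEAD-DTR must be a word, and the complement list must be non-empty.
    if head_dtr.kind != "word" or not complements:
        return None
    # DTRS is the head daughter followed by exactly the elements of its COMPS list.
    if [c.label for c in complements] != head_dtr.comps:
        return None
    form = head_dtr.form + [w for c in complements for w in c.form]
    # The mother has an empty COMPS list and no attribute for daughters.
    return Sign("phrase", form, head_dtr.head, list(head_dtr.spr), [])


ate = Sign("word", ["ate"], "verb", ["NP[nom]"], ["NP[acc]"])
a_pizza = Sign("phrase", ["a", "pizza"], "noun", [], [], label="NP[acc]")

mother = head_comp_cx(ate, [a_pizza])
unlicensed = head_comp_cx(ate, [])
```

The daughters are arguments of the function that stands for the construction, but they are not stored in the resulting sign. This mirrors the division between constructs, which have a MOTHER and DTRS, and signs, which do not contain daughters.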
The difference to HPSG in the version of Pollard & Sag (1994) is that for Sag, Wasow &
Bender, signs do not have daughters and this makes the selection of daughters impos-
sible. As a result, the SYNSEM feature becomes superfluous (selection of the PHON value
and of the value of the newly introduced FORM feature is allowed in Sag, Wasow & Ben-
der (2003) and Sag (2012)). The information about the linguistic objects that contribute
to a complex sign is only represented in the very outside of the structure. The sign repre-
sented under MOTHER is of the type phrase but does not contain any information about
the daughters. The object described in (32) is of course also of another type than the
phrasal or lexical signs that can occur as its daughters. We therefore need the following
extension so that the grammar will work (Sag, Wasow & Bender 2003: 478):14
13 SBCG uses a FORM feature in addition to the PHON feature, which is used for phonological information as
in earlier versions of HPSG (Sag 2012: Section 3.1, Section 3.6). The FORM feature is usually provided in
example analyses.
14 A less formal version of this constraint is given as the Sign Principle by Sag (2012: 105): “Every sign must
be listemically or constructionally licensed, where: a sign is listemically licensed only if it satisfies some
listeme, and a sign is constructionally licensed if it is the mother of some well-formed construct.”
which is why Klaus is also accessible to wissen. However, the purpose of the new, more
restrictive feature geometry was to rule out such nonlocal access to arguments.
An alternative to projecting the complete argument structure was suggested by Kay
et al. (2015: Section 6): instead of assuming that the subject is the XARG in idiomatic con-
structions like those in (35), they assume that the accusative or dative argument is the
XARG. This is an interesting proposal that could be used to fix the cases under discussion,
but the question is whether it scales up if interactions with other phenomena are
considered. For instance, Bender & Flickinger (1999) use XARG in their account of question
tags in English. So, if English idioms can be found that require a non-subject XARG
in embedded sentences while also admitting the idiom parts in the embedded sentence
to occur as a full clause with a question tag, we would have conflicting demands and would
have to assume different XARGs for root and embedded clauses, which would make this
version of the lexical theory rather unattractive, since we would need two lexical items
for the respective verb.
(35d) is especially interesting, since here the X that refers to material outside the idiom
is in an adjunct. If such cases existed, the XARG mechanism would be clearly insufficient
since XARG is not projected from adjuncts. However, as Kay et al. (2015) point out, the X
does not necessarily have to be a pronoun that is coreferent with an element in a matrix
clause. They provide the following example:
(38) Justin Bieber—Once upon a time ∅ butter wouldn’t melt in little Justin’s mouth.
Now internationally famous for being a weapons-grade petulant brat …
So, whether examples of the respective kind can be found is an open question.
Returning to our horse examples, Richter & Sailer (2009: 313) argue that the idiomatic
reading is only available if the accusative pronoun is fronted and the embedded clause
is V2. The examples in (39) do not have the idiomatic reading:
(39) a. Ich glaube, dass mich ein Pferd tritt.
I believe that me a horse kicks
‘I believe that a horse kicks me.’
b. Ich glaube, ein Pferd tritt mich.
I believe a horse kicks me
‘I believe that a horse kicks me.’
Richter & Sailer assume a structure for X_Acc tritt ein Pferd in (35b) that contains, among
others, the constraints in (40).
The feature geometry in (40) differs somewhat from what was presented in Chapter 9
but that is not of interest here. It is only of importance that the semantic contribution
of the entire phrase is surprised′(x₂). The following is said about the internal structure
of the phrase: it consists of a filler-daughter (an extracted element) and also of a head
daughter corresponding to a sentence from which something has been extracted. The
head daughter means ‘a horse kicks x₂’ and has an internal head somewhere whose
argument structure list contains an indefinite NP with the word Pferd ‘horse’ as its head.
The second element in the argument structure is a pronominal NP in the accusative
(40) [ phrase
       SYNSEM|LOC [ CAT|LISTEME very-surprised
                    CONT|MAIN surprised′(x₂) ]
       DTRS [ FILLER-DTR [ word
                           SYNSEM|LOC 1 ]
              H-DTR [ LF|EXC ‘a horse kicks x₂’
                      (DTRS|H-DTR)+ [ word
                                      SYNSEM|LOC|CAT [ HEAD [ TENSE pres ]
                                                       LISTEME treten ]
                                      ARG-ST ⟨ NP[LISTEME pferd, DEF −, sg],
                                               [ LOC 1 [ CAT|HEAD|CASE acc
                                                         CONT [ ppro
                                                                INDEX 2 ] ] ] ⟩ ] ] ] ]
whose LOCAL value is identical to that of the filler (the tag 1). The entire meaning of this part
of the sentence is surprised′(x₂), whereby the tag 2 is identical to the referential index of the
pronoun. In addition to the constraints in (40), there are additional ones that ensure that
the partial clause appears with the relevant form of glauben ‘to believe’ or denken ‘to
think’. The exact details are not that important here. What is important is that one can
specify constraints on complex syntactic elements, that is, it must be possible to refer to
daughters of daughters. This is possible with the classical HPSG feature geometry, but
not with the feature geometry of SBCG. For a more general discussion of locality, see
Section 18.2.
The restrictions on Pferd clauses in (40) are too strict, however, since there are variants
of the idiom that do not have the accusative pronoun in the Vorfeld:
(41) a. ich glaub es tritt mich ein Pferd wenn ich einen derartigen Unsinn
I believe EXPL kicks me a horse when I a such nonsense
lese.16
read
‘I am utterly surprised when I read such nonsense.’
b. omg dieser xBluuR der nn ist wieder da ey nein ich glaub es tritt
omg this XBluuR he is again there no I believe EXPL kicks
mich ein Pferd!!17
me a horse
‘OMG, this xBluuR, the nn, he is here again, no, I am utterly surprised.’
16
http://www.welt.de/wirtschaft/article116297208/Die-verlogene-Kritik-an-den-Steuerparadiesen.html,
commentary section, 2018-02-20.
17
http://forum.gta-life.de/index.php?user/3501-malcolm/, 10.12.2015.
18
http://www.castingshow-news.de/menowin-frhlich-soll-er-zum-islam-konvertieren-7228/, 2018-02-20.
19
A note of caution is necessary since there were misunderstandings in the past regarding the degree of
formalization of SBCG: in comparison to most other theories discussed in this book, SBCG is well-formalized.
For instance, it is easy to come up with a computer implementation of SBCG fragments. I implemented one
in the TRALE system myself. The reader is referred to Richter (2004) to get an idea of what kind of deeper
formalization is meant here.
generative power). However, the locality restrictions of SBCG can be circumvented easily
by structure sharing (Müller 2013a: Section 9.6.1). To see this, consider a construction
with the following form:20
(42) [ MOTHER [ sign
                PHON  phonological-object
                FORM  morphological-object
                SYN   syntactic information
                SEM   semantic information
                NASTY 1 ]
       DTRS 1 list of signs ]
The feature NASTY in the MOTHER sign refers to the value of DTRS and hence all the
internal structure of the sign that is licensed by the constructional schema in (42) is
available. Of course one could rule out such things by stipulation – if one considered it to
be empirically adequate, but then one could as well continue to use the feature geometry
of Constructional HPSG (Sag 1997) and stipulate constraints like “Do not look into the
daughters.” An example of such a constraint given in prose is the Locality Principle of
Pollard & Sag (1987: 143–144).
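The effect of the structure sharing in (42) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: feature structures are modeled as Python dictionaries, structure sharing as object identity, and the function make_construct as well as the toy signs for Kim and sleeps are hypothetical names introduced here, not part of SBCG.

```python
# Feature structures as nested dicts; structure sharing is modeled as two
# paths leading to the very same Python object (token identity).

def make_construct(daughters):
    """Build a construct shaped like (42): NASTY re-exposes DTRS inside MOTHER."""
    return {
        "MOTHER": {
            "PHON": [w for d in daughters for w in d["PHON"]],
            "SYN": {"CAT": "S"},
            "SEM": {},
            "NASTY": daughters,  # same list object as DTRS (tag 1 in (42))
        },
        "DTRS": daughters,
    }

# Hypothetical toy signs, for illustration only.
kim = {"PHON": ["Kim"], "SYN": {"CAT": "NP"}, "SEM": {}}
sleeps = {"PHON": ["sleeps"], "SYN": {"CAT": "V"}, "SEM": {}}
construct = make_construct([kim, sleeps])

# A selecting head only gets to see the MOTHER sign ...
visible = construct["MOTHER"]
# ... yet the daughters remain reachable through the shared value.
inner_cats = [d["SYN"]["CAT"] for d in visible["NASTY"]]
print(inner_cats)
```

Anything that has access to MOTHER thereby has access to the complete internal structure, which is the point made in the text: the feature geometry alone does not enforce locality.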
Note also that the treatment of raising in SBCG admits nonlocal selection of phonol-
ogy values, since the analysis of raising in SBCG assumes that the element on the valence
list of the embedded verb is identical to an element in the ARG-ST list of the matrix verb
(Sag 2012: 159). Hence, both verbs in (44) can see the phonology of the subject:
(44) Kim can eat apples.
In principle, there could be languages in which the form of the downstairs verb depends
on the presence of an initial consonant in the phonology of the subject. English allows for
long chains of raising verbs and one could imagine languages in which all the verbs on
the way are sensitive to the phonology of the subject. Such languages probably do not
exist.
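The following sketch illustrates why every verb in a chain of raising verbs can see the phonology of the subject. It is a toy model under stated assumptions: signs are dictionaries, identity of list elements stands in for structure sharing, the helper names are hypothetical, and the first letter of the PHON value is used as a crude stand-in for real phonology. The outer verb only serves to lengthen the chain.

```python
def subject_of(sign):
    """First element of ARG-ST (raising verb) or, failing that, of VAL."""
    args = sign["ARG-ST"] if "ARG-ST" in sign else sign["VAL"]
    return args[0]

def raising_verb(form, embedded):
    # Raising: the first argument of the matrix verb IS the subject
    # of the embedded verb (one shared object, not a copy).
    return {"PHON": [form], "ARG-ST": [subject_of(embedded), embedded]}

kim = {"PHON": ["Kim"]}
eat = {"PHON": ["eat"], "VAL": [kim]}
can = raising_verb("can", eat)
outer = raising_verb("seems", can)  # toy chain; the form only labels the verb

def sees_consonant_initial_subject(verb):
    # Crude stand-in for phonology: inspect the first letter of the subject.
    return subject_of(verb)["PHON"][0][0].lower() not in "aeiou"

for verb in (eat, can, outer):
    print(verb["PHON"][0], sees_consonant_initial_subject(verb))
```

Since the subject is one and the same object at every level, a constraint on its PHON value could be stated on any verb in the chain.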
Now, is this a problem? Not for me, but if one develops a general setup in such a way as to
exclude everything that is not attested in the languages of the world (as for instance the
selection of arguments of arguments of arguments), then it is a problem that heads can
see the phonology of elements that are far away.
There are two possible conclusions for practitioners of SBCG. Either the MOTHER feature
is given up, since one agrees that theories that do not make wrong predictions
are sufficiently constrained and that one does not have to state explicitly what cannot occur
in languages. Or one reacts to the problem of nonlocally selected phonology
values and assumes a SYNSEM or LOCAL feature that bundles the information
that is relevant in raising and does not include the phonology. This supports the argument
I made on MOTHER in the previous subsection.
HPSG was motivated by the wish to structure-share information. Everything within LOCAL
was shared between filler and extraction site. This kind of motivation is given up
in SBCG.
Note also that not sharing the complete filler with the gap means that the FORM value
of the element in the ARG-ST list at the extraction site is not constrained. Without
any constraint the theory would be compatible with infinitely many models, since the
FORM value could be anything. For example, the FORM value of an extracted adjective
could be ⟨ Donald Duck ⟩ or ⟨ Dunald Dock ⟩ or any arbitrary chaotic sequence of letters/
phonemes. To exclude this, one can stipulate the FORM values of extracted elements to
be the empty list, but this leaves one with the unintuitive situation that the element in
GAP has an empty FORM list while the corresponding filler has a different, filled one.
The problem with such an approach is that VPs differ from other phrasal projections
in having an element on their VALENCE list. APs, NPs, and (some) PPs have an empty
VALENCE list. In other versions of HPSG the complements are represented on the COMPS
list and generalizations about phrases with fully saturated COMPS lists can be expressed
directly. One such generalization is that projections with an empty COMPS list (NPs, PPs,
VPs, adverbs, CPs) can be extraposed in German (Müller 1999b: Section 13.1.2).
Note also that reducing the number of valence features does not necessarily reduce
the number of constructions in the grammar. While classical HPSG has a Specifier-Head
Schema and a Head-Complement Schema, SBCG has two Head-Complement Construc-
tions: one with a mother with a singleton COMPS list and one with a mother with an
empty COMPS list (Sag 2012: 152). That two separate constructions are needed is due to
the assumption of flat structures.
2012: 125). While the use of letters instead of numbers is just a presentational variant, the
exclamation mark is a non-trivial extension. (47) provides an example: the constraints
on the type pred-hd-comp-cxt:
(47) Predicational Head-Complement Construction following Sag (2012: 152):
     pred-hd-comp-cxt ⇒
     [ MOTHER|SYN X ! [ VAL ⟨ Y ⟩ ]
       HEAD-DTR Z: [ word
                     SYN X: [ VAL ⟨ Y ⟩ ⊕ L ] ]
       DTRS ⟨ Z ⟩ ⊕ L:nelist ]
The X stands for all syntactic properties of the head daughter. These are identified with
the value of SYN of the mother with the exception of the VAL value, which is specified
to be a list with the element Y. It is interesting to note that the !-notation is not without
problems: Sag (2012: 145) states that the version of SBCG that he presents is “purely
monotonic (non-default)”, but if the SYN value of the mother is not identical because
of overwriting of VAL, it is unclear how the type of SYN can be constrained. ! can be
understood as explicitly sharing all features that are not mentioned after the !. Note,
though, that the type has to be shared as well. This is not trivial: structure sharing
cannot be applied here, since structure sharing the type would also identify all features
belonging to the respective value. So one would need a relation that singles out a type
of a structure and identifies this type with the value of another structure. Note also that
information from features behind the ! can make the type of the complete structure more
specific. Does this affect the shared structure (e.g., HEAD-DTR|SYN in (47))? What if the
type of the complete structure is incompatible with the features in this structure? What
seems to be a harmless notational device in fact involves some non-trivial machinery in
the background. Keeping the Head Feature Principle makes this additional machinery
unnecessary.
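The machinery hidden behind the !-notation can be sketched as follows. This is one possible reading, not Sag's definition: the function bang and the dictionary encoding are assumptions made for illustration, and the type is treated as if it were an ordinary entry called TYPE, which is exactly the kind of extra relation the text says would be needed.

```python
def bang(shared, overrides):
    """One possible reading of `X ! [...]`: share every feature of X that is
    not mentioned after the `!` and take the mentioned ones from `overrides`."""
    result = {f: v for f, v in shared.items() if f not in overrides}
    result.update(overrides)
    return result

# Hypothetical SYN value of the head daughter in (47).
head_dtr_syn = {"TYPE": "syn-obj", "CAT": {"HEAD": "verb"}, "VAL": ["Y", "Z"]}

# MOTHER|SYN  X ! [VAL <Y>]
mother_syn = bang(head_dtr_syn, {"VAL": ["Y"]})

print(mother_syn is head_dtr_syn)                # the two SYN values are distinct objects
print(mother_syn["CAT"] is head_dtr_syn["CAT"])  # but unmentioned features are shared
```

The sketch shows that the two SYN values cannot simply be identified: they are different objects whose unmentioned features are shared one by one, and the type has to be carried over by a separate step.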
10.6.2.6 Conclusion
Due to the conceptual problems with meta-statements and the relatively simple ways
of getting around locality restrictions, the reorganization of features (MOTHER vs. SYN-
SEM) does not bring with it any advantages. Since the grammar becomes more complex
due to the meta-constraint, we should reject this change.22 Other changes in the feature
geometry (elimination of the LOCAL feature and use of a single valence feature) are
problematic as well. However, if we do reject the revised feature geometry and revert to
the feature geometry that was used before, then Sign-Based Construction Grammar and
Constructional HPSG (Sag 1997) are (almost) indistinguishable.
22
In Müller (2013a: 253) I claimed that SBCG uses a higher number of features in comparison to other variants
of HPSG because of the assumption of the MOTHER feature. As Van Eynde (2015) points out, this is not true
for more recent variants of HPSG since they have the SYNSEM feature, which is not needed if MOTHER
is assumed. (Van Eynde refers to the LOCAL feature, but the LOCAL feature was eliminated because it was
considered superfluous because of the lexical analysis of extraction.) If one simply omits the MOTHER feature
from SBCG, one is back to the 1987 version of HPSG (Pollard & Sag 1987), which also used a SYN and a SEM
feature. What would be missing would be the locality of selection (Sag 2012: 149) that was enforced to
some extent by the SYNSEM feature. Note that the locality of selection that is enforced by SYNSEM can be
circumvented by the use of relational constraints as well (see Frank Richter and Manfred Sailer’s work on
collocations (Richter & Sailer 1999a; Soehn & Sailer 2008)). So in principle, we end up with style guides in
this area of grammar as well.
23
For a similar construction, see Bergen & Chang (2005: 162).
(49) [ DetNoun
       F|ORTH 1 ⊕ 2
       CASE   3
       NUMBER 4
       M      5
       DTRS ⟨ [ Determiner
                F|ORTH 1
                CASE   3
                NUMBER 4
                GENDER 6 ],
              [ CommonNoun
                F|ORTH 2
                CASE   3
                NUMBER 4
                GENDER 6
                M      5 ] ⟩ ]
tion where the determiner directly precedes the noun because the form contribution of
the determiner has been combined with that of the noun. This strict adjacency constraint
makes sense as the claim that the determiner must precede the noun is not restrictive
enough since sequences such as (50b) would be allowed:
(50) a. [dass] die Frauen Türen öffnen
that the women doors open
‘that the women open doors’
b. * die Türen öffnen
Frauen
If discontinuous phrases are permitted, die Türen ‘the doors’ can be analyzed with the
DetNoun Construction although another noun phrase intervenes between the deter-
miner and the noun (Müller 1999b: 424; 1999d). The order in (50b) can be ruled out
by linearization constraints or constraints on the continuity of arguments. If we want
the construction to require that the determiner and noun be adjacent, then we would
simply use meets instead of before in the specification of the construction.
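The difference between before and meets can be illustrated with word positions. The sketch below is a simplification under stated assumptions: form contributions are modeled as spans of word indices, and Span, before and meets are hypothetical Python names, not the ECG implementation.

```python
from typing import NamedTuple

class Span(NamedTuple):
    start: int  # index of the first word
    end: int    # index one past the last word

def before(a: Span, b: Span) -> bool:
    """a ends somewhere before b starts; material may intervene."""
    return a.end <= b.start

def meets(a: Span, b: Span) -> bool:
    """a ends exactly where b starts; nothing may intervene."""
    return a.end == b.start

words = "die Frauen Türen öffnen".split()
die, frauen, tueren = Span(0, 1), Span(1, 2), Span(2, 3)

print(before(die, tueren))  # True: the determiner precedes the noun
print(meets(die, tueren))   # False: Frauen intervenes
print(meets(die, frauen))   # True: die and Frauen are adjacent
```

With before, combining die with Türen across the intervening noun is not excluded by the construction itself. With meets, it is.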
This discussion has shown that (49) is more restrictive than (48). There are, how-
ever, contexts in which one could imagine using discontinuous constituents such as the
deviant one in (50b). For example, discontinuous constituents have been proposed for
verbal complexes, particle verbs and certain coordination data (Wells 1947). Examples of
analyses with discontinuous constituents in the framework of HPSG are Reape (1994),
Kathol (1995), Kathol (2000), Crysmann (2008), and Beavers & Sag (2004).24 These analy-
ses, which are discussed in Section 11.7.2.2 in more detail, differ from those previously
presented in that they use a feature DOMAIN instead of or in addition to the daughters
features. The value of the DOMAIN feature is a list containing the head and the elements
dependent on it. The elements do not necessarily have to be adjacent in the utterance,
that is, discontinuous constituents are permitted. Which elements are entered into this
24
Crysmann, Beavers and Sag deal with coordination phenomena. For an analysis of coordination in TAG
that also makes use of discontinuous constituents, see Sarkar & Joshi (1996) and Section 21.6.2.
list in which way is governed by the constraints that are part of the linguistic theory.
This differs from the simple before statement in ECG in that it is much more flexible
and in that one can also restrict the area in which a given element can be ordered since
elements can be freely ordered inside their domain only.
There is a further difference between the representation in (48) and the general HPSG
schemata: in the ECG variant, linearization requirements are linked to constructions.
In HPSG and GPSG, it is assumed that linearization rules hold generally, that is, if we
were to assume the rules in (51), we would not have to state for each rule explicitly that
shorter NPs tend to precede longer ones and that animate nouns tend to occur before
inanimate ones.
(51) a. S → NP[nom], NP[acc], V
b. S → NP[nom], NP[dat], V
c. S → NP[nom], NP[dat], NP[acc], V
d. S → NP[nom], NP[acc], PP, V
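The idea that linearization rules hold generally rather than per rule can be sketched as follows. The right-hand sides of (51) are treated as unordered, and a single set of precedence statements is applied to all of them. The concrete statements used here (NPs and PPs precede V) are a hypothetical hard constraint chosen for illustration; the tendencies mentioned above (shorter before longer, animate before inanimate) are soft preferences and are not modeled.

```python
from itertools import permutations

# Dominance rules in the style of (51): the right-hand sides are unordered.
ID_RULES = {
    "51a": ("NP[nom]", "NP[acc]", "V"),
    "51b": ("NP[nom]", "NP[dat]", "V"),
    "51c": ("NP[nom]", "NP[dat]", "NP[acc]", "V"),
    "51d": ("NP[nom]", "NP[acc]", "PP", "V"),
}

# Precedence statements (A, B): every A precedes every B. Hypothetical example.
LP = [("NP", "V"), ("PP", "V")]

def respects_lp(order):
    for i, left in enumerate(order):
        for right in order[i + 1:]:
            # Violation: `left` comes first although LP requires `right` first.
            if any(right.startswith(a) and left.startswith(b) for a, b in LP):
                return False
    return True

def admissible_orders(rhs):
    return [o for o in permutations(rhs) if respects_lp(o)]

for name, rhs in ID_RULES.items():
    print(name, len(admissible_orders(rhs)))
```

The precedence statements are written once and constrain every rule, so nothing has to be repeated for (51a) to (51d).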
It is possible to capture these generalizations in ECG if one specifies linearization con-
straints for more general constructions and more specific constructions inherit them
from these. As an example, consider the Active-Ditransitive Construction discussed by
Bergen & Chang (2005: 170):
These restrictions allow the sentences in (53a,b) and rule out those in (53c):
(53) a. Mary tossed me a drink.
b. Mary happily tossed me a drink.
c. * Mary tossed happily me a drink.
The restriction agent.f before action.f forces an order where the subject occurs before
the verb but also allows for adverbs to occur between the subject and the verb. The other
constraints on form determine the order of the verb and its objects: the recipient must
be adjacent to the verb and the theme must be adjacent to the recipient. The require-
ment that an agent in the active must occur before the verb is not specific to ditransitive
constructions. This restriction could therefore be factored out as follows:
(54) Construction Active-Agent-Verb
       subcase of Pred-Expr
       constructional
         agent:Ref-Expr
         action:Verb
       form
         agent.f before action.f
The Active-Ditransitive Construction in (52) would then inherit the relevant information
from (54).
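Inheritance of form constraints can be sketched with ordinary class inheritance. The sketch is an illustration under stated assumptions: the two adjacency constraints of the ditransitive construction are reconstructed from the prose description above (recipient adjacent to the verb, theme adjacent to the recipient), word spans are (start, end) pairs, and all Python names are hypothetical.

```python
class ActiveAgentVerb:
    # agent.f before action.f, as in (54)
    form = [("before", "agent", "action")]

class ActiveDitransitive(ActiveAgentVerb):
    # Own constraints only; the agent/verb order is inherited.
    form = [("meets", "action", "recipient"), ("meets", "recipient", "theme")]

def form_constraints(cxn):
    """Collect own and inherited form constraints along the class hierarchy."""
    return [c for cls in cxn.__mro__ for c in vars(cls).get("form", [])]

CHECKS = {
    "before": lambda a, b: a[1] <= b[0],
    "meets": lambda a, b: a[1] == b[0],
}

def satisfies(cxn, spans):
    return all(CHECKS[rel](spans[x], spans[y]) for rel, x, y in form_constraints(cxn))

# (53a) Mary tossed me a drink.
s53a = {"agent": (0, 1), "action": (1, 2), "recipient": (2, 3), "theme": (3, 5)}
# (53b) Mary happily tossed me a drink.
s53b = {"agent": (0, 1), "action": (2, 3), "recipient": (3, 4), "theme": (4, 6)}
# (53c) * Mary tossed happily me a drink.
s53c = {"agent": (0, 1), "action": (1, 2), "recipient": (3, 4), "theme": (4, 6)}

for spans in (s53a, s53b, s53c):
    print(satisfies(ActiveDitransitive, spans))
```

The ditransitive class states only what is specific to it. The restriction on agent and verb is stated once in the more general construction.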
In addition to the descriptive means used in (48), there is the evokes operator (Bergen
& Chang 2005: 151–152). An interesting example is the representation of the term hy-
potenuse: this concept can only be explained by making reference to a right-angled tri-
angle (Langacker 1987: Chapter 5). Chang (2008: 67) gives the following formalization:
(55) Schema hypotenuse
       subcase of line-segment
       evokes right-triangle as rt
       constraints
         self ↔ rt.long-side
This states that a hypotenuse is a particular line segment, namely the longest side of a
right-angled triangle. The concept of a right-angled triangle is activated by means of
the evokes operator. Evokes creates an instance of an object of a certain type (in the
example, rt of type right-triangle). It is then possible to refer to the properties of this
object in a schema or in a construction.
The feature description in (56) is provided in the notation from Chapter 6. It is
equivalent to (55).
(56) 1 [ hypotenuse
         EVOKES ⟨ [ right-triangle
                    LONG-SIDE 1 ] ⟩ ]
The type hypotenuse is a subtype of line-segment. The value of EVOKES is a list since a
schema or construction can evoke more than one concept. The only element in this list
in (56) is an object of type right-triangle. The value of the feature LONG-SIDE is identified
with the entire structure. This essentially means the following: I, as a hypotenuse, am
the long side of a right-angled triangle.
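The cyclic structure in (56) can be mirrored directly with object references. This is a minimal sketch, assuming that evoking a concept amounts to creating an instance and that the identity constraint is object identity; the class and attribute names are hypothetical renderings of the names in (55).

```python
class LineSegment:
    pass

class RightTriangle:
    def __init__(self):
        self.long_side = None

class Hypotenuse(LineSegment):          # subcase of line-segment
    def __init__(self):
        # evokes right-triangle as rt
        self.evokes = [RightTriangle()]
        rt = self.evokes[0]
        # self <-> rt.long-side: the long side is this very object
        rt.long_side = self

h = Hypotenuse()
print(h.evokes[0].long_side is h)
```

The value of long_side inside the evoked triangle points back to the hypotenuse itself, which corresponds to the tag 1 appearing both outside and inside the structure in (56).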
Before turning to FCG in the next subsection, we can conclude that ECG and HPSG
are notational variants.
25
A reply to van Trijp based on the discussion in this section is published as Müller (2017b).
26
Steels (2013: 153) emphasizes the point that FCG is a technical tool for implementing constructionist ideas
rather than a theoretical framework of its own. However, authors working with the FCG system publish
linguistic papers that share a certain formal background and certain linguistic assumptions. So this section
addresses some of the key assumptions made and some of the mechanisms used.
[Figure: two panels, “Production” (left) and “Parsing” (right). Caption within the reproduced figure:
“Figure 2. FCG constructions can be applied in both production (shown on the left) and
parsing (shown on the right) without requiring a separate parsing or production algorithm.
In production, the conditional units of the semantic pole are matched against the
semantic pole of the transient structure. If matching is successful, the remaining units are
merged with the transient structure in two phases. In parsing, constructional knowledge
is simply applied in the opposite direction.”]
Figure 10.2: Generation and parsing in FCG (van Trijp 2013: 99)
10.6.4.2 Argument Structure Constructions
Fluid Construction Grammar assumes a phrasal approach to argument structure, that
is, it is assumed that lexical items enter into phrasal configurations that contribute in-
dependent meaning (van Trijp 2011). The FCG approach is one version of implementing
Goldberg’s plugging approach to argument structure constructions (Goldberg 1995). Van
Trijp suggests that every lexical item comes with a representation of potential argument
roles like Agent, Patient, Recipient, and Goal. Phrasal argument structure constructions
are combined with the respective lexical items and realize a subset of the argument roles,
that is, they assign them to grammatical functions. Figure 10.3 shows an example: the
verb sent has the semantic roles Agent, Patient, Recipient, and Goal (upper left of the
figure). Depending on the argument structure construction that is chosen, a subset of
these roles is selected for realization.27 The figures show the relation between sender,
27
It is interesting to note here that van Trijp (2011: 141) actually suggests a lexical account since every lex-
ical item is connected to various phrasal constructions via coapplication links. So every such pair of a
lexical item and a phrasal construction corresponds to a lexical item in Lexicalized Tree Adjoining Gram-
mar (LTAG). See also Müller & Wechsler (2014a: 25) on Goldberg’s assumption that every lexical item is
associated with phrasal constructions.
Note that such coapplication links are needed since without them the approach cannot account for cases
in which two or more argument roles can only be realized together but not in isolation or in any other
combination with other listed roles.
sent, and sendee and the more abstract semantic roles, and the relation between
these roles and grammatical functions for the sentences in (59):
(59) a. He sent her the letter.
b. He sent the letter.
c. The letter was sent to her.
While in (59a) the agent, the patient and the recipient are mapped to grammatical func-
tions, only the agent and the patient are mapped to grammatical functions in (59b). The
recipient is left out. (59c) shows an argument realization in which the sendee is realized
as a to phrase. According to van Trijp this semantic role is not a recipient but a goal.
[Figure: panels pairing a semantic pole with a syntactic pole. Recoverable panel titles:
“Semantic and syntactic potential of linkage introduced by "sent".” and “He sent her the letter.”
(Active ditransitive construction). Recoverable links: sender – Agent – subject;
sendee – Recipient – indirect object; sendee – Goal – oblique.]
Figure 10.3: Lexical items and phrasal constructions. Figure from van Trijp (2011: 122)
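The selection of a subset of potential roles by an argument structure construction can be sketched as follows. This is not FCG code. The dictionaries and the function link are hypothetical, the links for Agent, Recipient and Goal follow the description above, and the realization of the Patient (direct object in the active, subject in the passive) is an assumption added to complete the example.

```python
# Potential roles that "sent" offers for each of its participants.
SENT = {
    "sender": {"Agent"},
    "sent": {"Patient"},
    "sendee": {"Recipient", "Goal"},
}

# Phrasal constructions: which roles they realize, and as what.
ACTIVE_DITRANSITIVE = {"Agent": "subject", "Recipient": "indirect object",
                       "Patient": "direct object"}
ACTIVE_TRANSITIVE = {"Agent": "subject", "Patient": "direct object"}
PASSIVE_WITH_TO = {"Patient": "subject", "Goal": "oblique"}

def link(potential, construction):
    """Realize those potential roles that the construction provides a slot for."""
    linking = {}
    for participant, roles in potential.items():
        for role in sorted(roles):
            if role in construction:
                linking[participant] = (role, construction[role])
    return linking

print(link(SENT, ACTIVE_DITRANSITIVE))  # (59a) He sent her the letter.
print(link(SENT, ACTIVE_TRANSITIVE))    # (59b) He sent the letter.
print(link(SENT, PASSIVE_WITH_TO))      # (59c) The letter was sent to her.
```

The lexical item stays the same in all three cases; only the construction decides which roles surface and which grammatical function they receive.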
Note that under such an approach, it is necessary to have a passive variant of every
active construction. For languages that allow for the combination of passive and imper-
sonal constructions, one would be forced to assume a transitive-passive-impersonal con-
struction. As was argued in Müller (2006: Section 2.6) free datives (commodi/incommodi)
in German can be added to almost any construction. They interact with the dative pas-
sive and hence should be treated as arguments. So, for the resultative construction one
would need an active variant, a passive variant, a variant with dative argument, a variant
with dative argument and dative passive, and a middle variant. While it is technically
possible to list all these patterns and it is imaginable that we store all this information
in our brains, the question is whether such listings really reflect our linguistic knowledge.
If a new construction comes into existence, let’s say an active sentence pattern
with a nominative and two datives in German, wouldn’t we expect that this pattern
can be used in the passive? While proposals that establish relations between active and
passive constructions would predict this, alternative proposals that just list the attested
possibilities do not.
The issue of how such generalizations should be captured was discussed in connection
with the organization of the lexicon in HPSG (Flickinger 1987; Meurers 2001). In the lex-
ical world, one could simply categorize all verbs according to their valence and say that
loves is a transitive verb and the passive variant loved is an intransitive verb. Similarly
gives would be categorized as a ditransitive verb and given as a two-place verb. Obvi-
ously this misses the point that loved and given share something: they both are related
to their active form in a systematic way. This kind of generalization is captured by lex-
ical rules that relate two lexical items. The respective generalizations that are captured
by lexical rules are called horizontal generalizations as compared to vertical generalizations,
which describe relations between subtypes and supertypes in an inheritance
hierarchy.
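The contrast between listing and relating lexical items can be sketched in a few lines. This is a toy model under stated assumptions: lexical entries are reduced to a participle form and an argument list, and the passive lexical rule simply suppresses the first argument. The names LEXICON, passive_lexical_rule and VALENCE_CLASS are hypothetical.

```python
LEXICON = {
    "loves": {"participle": "loved", "args": ["NP", "NP"]},
    "gives": {"participle": "given", "args": ["NP", "NP", "NP"]},
}

def passive_lexical_rule(entry):
    """Horizontal generalization: relate an active entry to its passive
    counterpart by suppressing the first argument (the subject)."""
    return {"form": entry["participle"], "args": entry["args"][1:]}

# Vertical generalization: classification by valence.
VALENCE_CLASS = {1: "one-place", 2: "two-place", 3: "three-place"}

passives = {}
for entry in LEXICON.values():
    passive = passive_lexical_rule(entry)
    passives[passive["form"]] = passive

for form, entry in passives.items():
    print(form, VALENCE_CLASS[len(entry["args"])])
```

A new verb added to LEXICON receives its passive variant automatically, which is what a mere listing of loved as one-place and given as two-place does not express.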
The issue is independent of the lexical organization of knowledge; it applies to
phrasal representations as well. Phrasal constructions can be organized in hierarchies
(vertical), but the relation between certain variants is not covered by this. The analogs
of the lexical rules in a lexical approach are GPSG-like metarules in a phrasal approach.
So what seems to be missing in FCG is something that relates phrasal patterns, e.g.,
allostructions (Cappelle 2006; Goldberg 2014: 116, see also footnote 11).
So, Steels & van Trijp (2011: 319–320) suggest that only if regular constructions cannot
apply, merging is allowed. The problem with this is that human language is highly am-
biguous and in the case at hand this could result in situations in which there is a reading
for an utterance, so that the repair strategy would never kick in. Consider (61):29
(61) Schlag den Mann tot!
beat the man dead
‘Beat the man to death!’ or ‘Beat the dead man!’
(61) has two readings: the resultative reading in which tot ‘dead’ expresses the result of
the beating and another reading in which tot is a depictive predicate. The second reading
is dispreferred, since the activity of beating dead people is uncommon, but the structure
is parallel to other sentences with depictive predicates:
(62) Iss den Fisch roh!
eat the fish raw
The depictive reading can be forced by coordinating tot with a predicate that is not a
plausible result predicate:
(63) Schlag ihn tot oder lebendig!
beat him dead or alive
‘Beat him when he is dead or while he is alive!’
So, the problem is that (61) has a reading which does not require the invocation of the
repair mechanism: schlag ‘beat’ is used with the transitive construction and tot is an
adjunct (see Winkler 1997). However, the more likely analysis of (61) is the one with
the resultative analysis, in which the valence frame is extended by an oblique element.
So this means that one has to allow the application of merging independent of other
analyses that might be possible. As Steels & van Trijp (2011: 320) note, if merging is
allowed to apply freely, utterances like (64a) will be allowed and of course (64b) as well.
(64) a. * She sneezed her boyfriend.
b. * She dined a steak.
In (64) sneezed and dined are used in the transitive construction.
The way out of this dilemma is to establish information in lexical items that specifies
in which syntactic environments a verb can be used. This information can be weighted
and, for instance, the probability of dine being used transitively would be extremely low.
Steels and van Trijp would connect their lexical items to phrasal constructions via so-
called coapplication links and the strength of the respective link would be very low for
29
I apologize for these examples …. An English example that shows that there may be ambiguity between
the depictive and the resultative construction is the following one that is due to Haider (2016):
I use the German example below since the resultative reading is strongly preferred over the depictive one.
dine and the transitive construction and reasonably high for sneeze and the Caused-
Motion Construction. This would explain the phenomena (and in a usage-based way),
but it would be a lexical approach, as is common in CG, HPSG, SBCG, and DG.
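Weighted coapplication links can be pictured as a table from pairs of lexical item and construction to a strength. The sketch below is purely illustrative: the numbers and the threshold are invented, and the names COAPPLICATION and licensed are hypothetical. Only the qualitative pattern (very low for dine with the transitive construction, reasonably high for sneeze with the Caused-Motion Construction) follows the text.

```python
# Invented link strengths; only their relative order matters here.
COAPPLICATION = {
    ("dine", "intransitive"): 0.9,
    ("dine", "transitive"): 0.001,
    ("sneeze", "intransitive"): 0.9,
    ("sneeze", "transitive"): 0.001,
    ("sneeze", "caused-motion"): 0.4,
}

def licensed(verb, construction, threshold=0.05):
    """A verb may enter a construction if the link is strong enough."""
    return COAPPLICATION.get((verb, construction), 0.0) >= threshold

print(licensed("sneeze", "caused-motion"))  # True
print(licensed("dine", "transitive"))       # False, cf. (64b)
print(licensed("sneeze", "transitive"))     # False, cf. (64a)
```

Since the information about admissible environments is stored with the individual verb, the table makes the lexical character of the solution explicit.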
Figure 10.4: The analysis of What did the boy hit? according to van Trijp (2014: 265)
Van Trijp claims that sentences with nonlocal dependency constructions in English
start with a topic.30 Bresnan’s sentences in (2) and (3) were discussed on page 224 (Bres-
nan 2001: 97) and are repeated below for convenience:
(67) Q: What did you name your cat?
A: Rosie I named her. (Rosie = FOCUS)
These sentences show that the pre-subject position is not unambiguously a topic or a
focus position. So, a statement saying that the fronted element is a topic is empirically
not correct. If this position is to be associated with an information structural function,
this association has to be a disjunction admitting both topics and focused constituents.
A further problematic aspect of van Trijp’s analysis is that he assumes that the aux-
iliary do is an object marker (p. 10, 22) or a non-subject marker (p. 23). It is true that do
support is not necessary in subject questions like (69a) and is required only in questions
like (69b), but this does not imply that all items that are followed by do are objects.
(69) a. Who saw the man?
b. Who did John see?
First, do can be used to emphasize the verb:
(70) Who did see the man?
Second, all types of other grammatical functions can precede the verb:
30
Van Trijp (2014: 256) uses the following definitions for topic and focus: “Topicality is defined in terms of
aboutness: the topic of an utterance is what the utterance is ‘about’. Focality is defined in terms of salience:
focus is used for highlighting the most important information given the current communicative setting.”
263) and that the evidence from one language does not necessarily mean that the analysis
for that language is also appropriate for another language. While I agree with this view
in principle (see Section 13.1), I do think that extraction is a rather fundamental property
of languages and that nonlocal dependencies should be analyzed in parallel for those
languages that have it.
10.6.4.5 Coordination
One of the success stories of non-transformational grammar is the SLASH-based analy-
sis of nonlocal dependencies by Gazdar (1981b). This analysis made it possible for the
first time to explain Ross’s Across the Board Extraction (Ross 1967). The examples were
already discussed on page 201 and are repeated here for convenience:
(74) a. The kennel which Mary made and Fido sleeps in has been stolen.
(= S/NP & S/NP)
b. The kennel in which Mary keeps drugs and Fido sleeps has been stolen.
(= S/PP & S/PP)
c. * The kennel (in) which Mary made and Fido sleeps has been stolen.
(= S/NP & S/PP)
The generalization is that two (or more) constituents can be coordinated if they have
identical syntactic categories and identical SLASH values. This explains why which and
in which in (74a,b) can fill two positions in the respective clauses. Now, theories that
do not use a SLASH feature for the percolation of information about missing elements
have to find different ways to make sure that all argument slots are filled and that the
correct correspondence between extracted elements and the respective argument role is
established. Note that this is not straightforward in models like the one suggested by
van Trijp, since he has to allow the preposition in to be combined with some material to
the left of it that is simultaneously also the object of made. Usually an NP cannot simply
be used by two different heads as their argument. As an example consider (75a):
(75) a. * John said about the cheese that I like.
b. John said about the cheese that I like it.
If it were possible to use material several times, a structure for (75a) would be possible
in which the cheese is the object of the preposition about and of the verb like. This
sentence, however, is totally out: the pronoun it has to be used to fill the object slot.
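The generalization behind (74), stated above as identity of syntactic category and SLASH value, can be sketched in a few lines. This is a deliberately reduced model: categories are pairs of a label and the category of the missing element, and Category and coordinate are hypothetical names, not part of any existing grammar implementation.

```python
from typing import NamedTuple, Optional

class Category(NamedTuple):
    cat: str
    slash: Optional[str]  # category of the missing element, None if nothing is missing

def coordinate(left: Category, right: Category) -> Optional[Category]:
    """Return the category of the coordination, or None if the conjuncts
    differ in category or in SLASH value."""
    return left if left == right else None

S_NP = Category("S", "NP")  # clause with an NP gap
S_PP = Category("S", "PP")  # clause with a PP gap

print(coordinate(S_NP, S_NP))  # (74a): S/NP & S/NP
print(coordinate(S_PP, S_PP))  # (74b): S/PP & S/PP
print(coordinate(S_NP, S_PP))  # (74c): S/NP & S/PP, no result
```

Because the result carries a single SLASH value, one filler binds the gaps in both conjuncts at once, without any element being used twice as an argument in the sense of (75a).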
lexico-phrasal processing in FCG). So the parser will find similar constituents for
all four utterances, as shown in examples (21–24). Since auxiliary-do in example
(24) falls outside the immediate domain of the VP, it is not yet recognized as a
member of the VP.
All of these phrases are disconnected, which means that the grammar still has to
identify the relations between the phrases. (van Trijp 2014: 252)
In his (21)–(24), van Trijp provides several tree fragments that contain NPs for sub-
ject and object and states that these have to be combined in order to analyze the sen-
tences he discusses. This is empirically inadequate: if FCG does not make the compe-
tence/performance distinction, then the way utterances are analyzed should reflect the
way humans process language (and this is what is usually claimed about FCG). However,
all we know about human language processing points towards an incremental process-
ing, that is, we process information as soon as it is available. We start to process the first
word taking into account all of the relevant aspects (phonology, stress, part of speech,
semantics, information structure) and come up with a hypothesis about how the utterance
could proceed. As soon as we have two words processed (in fact even earlier:
integration already happens during the processing of words) we integrate the second
word into what we know already and continue to follow our hypothesis, or revise it, or
simply fail. See Section 15.2 for details on processing and the discussion of experiments
that show that processing is incremental. So, we have to say that van Trijp’s analysis
fails on empirical grounds: his modeling of performance aspects is not adequate.
The parsing scheme that van Trijp describes is quite similar to those of computational
HPSG parsers, but these usually come without any claims about human performance.
Modeling human performance is rather complex since a lot of factors play a
role. It is therefore reasonable to separate competence and performance and continue to
work the way it is done in HPSG and FCG. This does not mean that performance aspects
should not be modeled; in fact, psycholinguistic models using HPSG have been developed
in the past (Konieczny 1996), but developing both a grammar with large coverage and
the performance model that combines with it demands a lot of resources.
10 Construction Grammar
But if such constructions can be discontinuous one has to make sure that (77b) cannot
be an instantiation of the subject-verb construction:
(77) a. The boy I think left.
b. * I the boy think left.
Here it is required to have some adjacency between the subject and the verb it belongs to,
modulo some intervening adverbials. This is modelled quite nicely in phrase structure
grammars that have a VP node. Whatever the internal structure of such a VP node
may be, it has to be adjacent to the subject in sentences like (76) and (77a) above. The
dislocated element has to be adjacent to the complex consisting of subject and VP. This
is what the Filler-Head Schema does in HPSG and SBCG. Van Trijp criticizes SBCG for
having to stipulate such a schema, but I cannot see how his grammar can be complete
without a statement that ensures the right order of elements in sentences with fronted
elements.
Van Trijp stated that FCG differs from what he calls generative approaches in that it
does not want to characterize only the well-formed utterances of a language. According
to him, the parsing direction is much more liberal in accepting input than other the-
ories. So it could well be that he is happy to find a structure for (77b). Note though
that this is incompatible with other claims made by van Trijp: he argued that FCG is
superior to other theories in that it comes with a performance model (or rather in not
separating competence from performance at all). But then (77b) should be rejected both
on competence and performance grounds. It is just unacceptable and speakers reject it
for whatever reasons. Any sufficiently worked out theory of language has to account
for this.
(79) Dieses Buch hat der Mann mir versprochen, seiner Frau zu geben, der gestern
this book has the man me promised his wife to give who yesterday
hier aufgetreten ist.
here performed is
‘The man who performed here yesterday promised me to give this book to his
wife.’
We see that material that refers to der Mann ‘the man’, namely the relative clause der
gestern hier aufgetreten ist ‘who performed here yesterday’, appears to the right. And
the object of geben ‘to give’, which would normally be part of the phrase dieses Buch
seiner Frau zu geben ‘this book his wife to give’ appears to the left. So, in general it
is possible to mix parts of phrases, but this is possible in a very restricted way only.
Some dependencies extend all the way to the left of certain units (fronting) and others
all the way to the right (extraposition). Extraposition is clause-bound, while extraction
is not. In approaches like GPSG, HPSG and SBCG, the facts are covered by assuming
that constituents for a complete clause are continuous apart from constituents that are
fronted or extraposed. The fronted and extraposed constituents are represented in SLASH
and EXTRA (Keller 1995; Müller 1999b: Section 13.2; Crysmann 2013), respectively, rather
than in valence features, so that it is possible to require of constituents that have all their
valents saturated to be continuous (Müller 1999c: 294).
Summing up the discussion of parsimony, it has to be said that van Trijp has to provide
the details on how continuity is ensured. The formalization of this is not trivial and only
after this is done can FCG be compared with the SLASH-based approach.
In addition to all the points discussed so far, there is a logical flaw in van Trijp’s
argumentation. He states that:
whereas the filler-gap analysis cannot explain WHY do-support does not occur in
wh-questions where the subject is assigned questioning focus, this follows natu-
rally from the interaction of different linguistic perspectives in this paper’s ap-
proach. (van Trijp 2014: 263)
The issue here is whether a filler-gap analysis or an analysis with discontinuous con-
stituents is suited better for explaining the data. A correct argumentation against the
filler-gap analysis would require a proof that information structural or other functional
constraints cannot be combined with this analysis. This proof was not provided and in
fact I think it cannot be provided since there are approaches that integrate information
structure. Simply pointing out that a theory is incomplete does not falsify a theory. This
point was already made in my review of Boas (2003) and in a reply to Boas (2014). See
Müller (2005a: 655–656), Müller (2007a: Chapter 20), and Müller & Wechsler (2014b:
Footnote 15).
The conclusion about the FCG analysis of nonlocal dependencies is that there are
some empirical flaws that can be easily fixed or assumptions that can simply be dropped
(role of do as object marker, claim that the initial position in English fronting construction
is the topic), some empirical shortcomings (coordination, admittance of ill-formed
utterances with discontinuous constituents), some empirical problems when the analy-
sis is extended to other languages (scope of adjuncts in German), and the parsimony of
the analyses is not really comparable since the restrictions on continuity are not really
worked out (or at least not published). If the formalization of restrictions on continuity
in FCG turns out to be even half as complex as the formalization that is necessary for
accounts of nonlocal dependencies (extraction and extraposition) in linearization-based
HPSG that Reape (2000) suggested,32 the SLASH-based analysis would be favorable.
In any case, I do not see how nonlocal dependencies could be used to drive a wedge
between SBCG and FCG. If there are functional considerations that have to be taken
into account, they should be modeled in both frameworks. In general, FCG should be
more restrictive than SBCG since FCG claims to integrate a performance model, so both
competence and performance constraints should be operative. I will come back to the
competence/performance distinction in the following section, which is a more general
comparison of SBCG and FCG.
Table 10.1: Differences between SBCG and FCG according to van Trijp (2013: 112)
competence and performance. We will deal with both the generative-enumerative vs.
constraint-based view and with the competence/performance distinction in more detail
in Chapters 14 and 15, respectively. Concerning the cognitive-functional approach,
van Trijp writes:
The goal of a cognitive-functional grammar, on the other hand, is to explain how
speakers express their conceptualizations of the world through language (= produc-
tion) and how listeners analyze utterances into meanings (= parsing). Cognitive-
functional grammars therefore implement both a competence and a processing
model. (van Trijp 2013: 90)
It is true that HPSG and SBCG make a competence/performance distinction (Sag & Wa-
sow 2011). HPSG theories are theories about the structure of utterances that are moti-
vated by distributional evidence. These theories do not contain any hypothesis regarding
brain activation, planning of utterances, processing of utterances (garden path effects)
and similar things. In fact, none of the theories that are discussed in this book contains an
explicit theory that explains all these things. I think that it is perfectly legitimate to work
in this way: it is legitimate to study the structure of words without studying their seman-
tics and pragmatics, it is legitimate to study phonology without caring about syntax, it
is legitimate to deal with specific semantic problems without caring about phonology
and so on, provided there are ways to integrate the results of such research into a bigger
picture. In comparison, it is wrong to develop models like those developed in current
versions of Minimalism (called Biolinguistics), where it is assumed that utterances are
derived in phases (NPs, CPs, depending on the variant of the theory) and then shipped
to the interfaces (spell out and semantic interpretation). This is not what humans do
(see Chapter 15).33 But if we are neutral with respect to such issues, we are fine. In
fact, there is psycholinguistic work that couples HPSG grammars to performance mod-
els (Konieczny 1996) and similar work exists for TAG (Shieber & Johnson 1993; Demberg
& Keller 2008).
Finally, there is also work in Construction Grammar that abstracts away from perfor-
mance considerations. For instance, Adele Goldberg’s book from 1995 does not contain
a worked out theory of performance facts. It contains boxes in which grammatical func-
tions are related to semantic roles. So this basically is a competence theory as well. Of
course there are statements about how this is connected to psycholinguistic findings,
but this is also true for theories like HPSG, SBCG and Simpler Syntax (Jackendoff 2011:
600) that explicitly make the competence/performance distinction.
This can be translated into phrase structure grammar rules in a straightforward way:
(81) a. [ head-complement-phrase
          SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ]
          HEAD-DTR 4 |SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ⊕ ⟨ 3 ⟩ ]
          NON-HEAD-DTRS ⟨ 5 [ SYNSEM 3 ] ⟩ ]  →  4 , 5
     b. [ head-complement-phrase
          SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ]
          HEAD-DTR 4 |SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ⊕ ⟨ 3 ⟩ ]
          NON-HEAD-DTRS ⟨ 5 [ SYNSEM 3 ] ⟩ ]  →  5 , 4
The left hand side of the rule is the mother node of the tree, that is, the sign that is
licensed by the schema provided that the daughters are present. The right hand side in
(81a) consists of the head daughter 4 followed by the non-head daughter 5 . We have
the opposite order in (81b), that is, the head daughter follows the non-head daughter.
The two orders correspond to the two orders that are permitted by LP-rules: the head
precedes its argument if it is marked INITIAL+ and it follows it if it is marked INITIAL−.
The following code shows how (81b) is implemented in TRALE:
arg_h rule (head_complement_phrase,
            synsem:loc:cat:head:initial:minus,
            head_dtr:HeadDtr,
            non_head_dtrs:[NonHeadDtr])
  ===>
  cat> NonHeadDtr,
  cat> HeadDtr.
A rule starts with an identifier that is needed for technical reasons like displaying inter-
mediate structures in the parsing process in debugging tools. A description of the mother
node follows and after the arrow we find a list of daughters, each introduced by the op-
erator cat>.34 Structure sharing is indicated by values with capital letters. The above
TRALE rule is a computer-readable variant of (81b) additionally including the explicit
specification of the value of INITIAL.
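To make the interplay of the schema and the INITIAL value more tangible, here is a minimal sketch in Python. It is not TRALE and not part of any existing system: the class Sign, the function head_complement and the toy lexical entries are assumptions made for illustration, categories are reduced to plain strings, and the identity of SYNSEM values is approximated by comparing category labels.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Sign:
    phon: Tuple[str, ...]   # words in surface order
    head: str               # stands in for the HEAD value (here just a category)
    initial: bool           # True = INITIAL+, False = INITIAL−
    comps: Tuple[str, ...]  # categories of the complements still missing


def head_complement(head_dtr: Sign, non_head_dtr: Sign) -> Sign:
    """License a mother from a head daughter and one complement daughter."""
    if not head_dtr.comps:
        raise ValueError("the head daughter does not select a complement")
    # COMPS of the head daughter is 2 ⊕ ⟨3⟩: split off the last element.
    rest, selected = head_dtr.comps[:-1], head_dtr.comps[-1]
    if non_head_dtr.head != selected:
        raise ValueError("the non-head daughter is not the selected complement")
    if head_dtr.initial:
        phon = head_dtr.phon + non_head_dtr.phon   # order of (81a)
    else:
        phon = non_head_dtr.phon + head_dtr.phon   # order of (81b)
    # The mother shares HEAD (including INITIAL) with the head daughter.
    return Sign(phon, head_dtr.head, head_dtr.initial, rest)


# Hypothetical toy entries; the complements are treated as unanalyzed units
# and their own INITIAL value plays no role here.
kennel = Sign(("the", "kennel"), "N", False, ())
in_ = Sign(("in",), "P", True, ("N",))
buch = Sign(("das", "Buch"), "N", False, ())
liest = Sign(("liest",), "V", False, ("N",))

print(head_complement(in_, kennel).phon)   # head-initial order
print(head_complement(liest, buch).phon)   # head-final order
```

The single function covers both (81a) and (81b) because the order is read off the INITIAL value of the head, which mirrors the role of the LP rules mentioned above.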
Now, the translation of a parallel schema using a MOTHER feature like (82a) into a
phrase structure rule is almost as simple:
(82) a. [ head-complement-cx
          MOTHER|SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ]
          HEAD-DTR|SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ⊕ ⟨ 3 ⟩ ]
          NON-HEAD-DTRS ⟨ [ SYNSEM 3 ] ⟩ ]
     b. 6 → 4 , 5 where
        [ head-complement-cx
          MOTHER 6 |SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ]
          HEAD-DTR 4 |SYNSEM|LOC|CAT [ HEAD 1, COMPS 2 ⊕ ⟨ 3 ⟩ ]
          NON-HEAD-DTRS ⟨ 5 [ SYNSEM 3 ] ⟩ ]
(82b) is only one of the two phrase structure rules that correspond to (82a), but since the
other one only differs from (82b) in the ordering of 4 and 5 , it is not given here.
For grammars in which the order of the elements corresponds to the observable order
of the daughters in a DTRS list, the connection to phrase structure rules is even simpler:
(83) 1 → 2 where [ construction
                   MOTHER 1
                   DTRS 2 ]
The value of DTRS is a list and hence 2 stands for the list of daughters on the right
hand side of the phrase structure rule as well. The type construction is a supertype of all
constructions and hence (83) can be used to analyze all phrases that are licensed by the
grammar. In fact, (83) is one way to put the meta constraint in (33).
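The reading of (83) as a rule schema can be sketched in a few lines of Python. The dictionary layout, the function names and the category labels VP, V and NP are assumptions made for this illustration only; they are not taken from any SBCG implementation.

```python
from typing import Any, Dict, List, Tuple


def to_phrase_structure_rule(construction: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Read a construction with MOTHER and DTRS as the rule MOTHER → DTRS."""
    return construction["MOTHER"], list(construction["DTRS"])


def show(rule: Tuple[str, List[str]]) -> str:
    """Render a rule in the usual arrow notation."""
    mother, daughters = rule
    return f"{mother} -> {' '.join(daughters)}"


# A hypothetical construction in which the order of DTRS is the surface order.
head_complement_cx = {
    "TYPE": "head-complement-cx",
    "MOTHER": "VP",
    "DTRS": ["V", "NP"],
}

print(show(to_phrase_structure_rule(head_complement_cx)))
```

Nothing in the function refers to a specific construction type, which corresponds to the observation that (83) applies to every subtype of construction.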
So, this shows that the version of SBCG that has been developed by Sag (2012) has
a straightforward implementation in TRALE.35 The question remains whether “SBCG’s
performance-independence hypothesis remains conjecture until proven otherwise” as
34 Other operators are possible in TRALE. For instance, sem_head can be used to guide the generator. This is
control information that has nothing to do with linguistic theory and not necessarily with the way humans
process natural language. There is also a cats operator, which precedes lists of daughters. This can be used
to implement phrase structures with more than one non-head daughter.
35 A toy fragment of English using a MOTHER feature and phrase structure rules with specifications of the
kind given above can be downloaded at https://hpsg.hu-berlin.de/Fragments/SBCG-TRALE/.
van Trijp sees it. The answer is: it is not a conjecture since any of the old constraint-
solving systems of the nineties could be used to process SBCG. The question of whether
this is efficient is an engineering problem that is entirely irrelevant for theoretical linguis-
tics. Theoretical linguistics is concerned with human languages and how they are pro-
cessed by humans. So whether some processing system that does not make any claims
about human language processing is efficient or not is absolutely irrelevant. Phrase
structure-based backbones are therefore irrelevant as well, provided they refer to the
grammar as described in theoretical work.
Now, this raises the question of whether there is a contradiction in my claims. On page 335
I pointed out that SBCG is lacking a formalization in Richter’s framework (Richter 2004).
Richter and also Levine & Meurers (2006) pointed out that there are problems with cer-
tain theoretically possible expressions and it is these expressions that mathematical lin-
guists care about. So the goal is to be sure that any HPSG grammar has a meaning and
that it is clear what it is. Therefore, this goal is much more foundational than writing a
single grammar for a particular fragment of a language. There is no such foundational
work for FCG since FCG is a specific toolkit that has been used to implement a set of
grammars.
10.6.4.9.3 Static constraints vs. dynamic mappings and signature + grammar vs. open-endedness
One very interesting feature of Fluid Construction Grammar is its fluidity, that is, there are
certain constraints that can be adapted if there is pressure, and the inventory of the theory
is open-ended, so categories and features can be added if need be.
Again, this is not a fundamental difference between HPSG/SBCG and FCG. An HPSG
grammar fragment of a specific language is a declarative representation of linguistic
knowledge and as such it of course just represents a certain fragment and does not contain
any information about how this set of constraints evolved or how it is acquired by speakers.
For this we need specific theories about language evolution/language change/language
acquisition. This is parallel to what was said about the competence/performance distinction:
in order to account for language evolution we would have to have several HPSG
grammars and say something about how one developed from the other. This will involve
weighted constraints, it will involve recategorization of linguistic items and lots more.36
So basically HPSG has to be extended, has to be paired with a model about language
evolution in the very same way as FCG is.
36 There are systems that use weighted constraints. We already had a simple version of this in the German HPSG
grammar that was developed in the Verbmobil project (Müller & Kasper 2000). Further theoretical approaches
to integrating weighted constraints are Brew (1995) and more recently Guzmán Naranjo (2015).
Usually such weighted constraints are not part of theoretical papers, but there are exceptions, for
instance the paper by Briscoe and Copestake about lexical rules (Briscoe & Copestake 1999).
whatever, rather languages are treated on their own as is common in the Construction
Grammar community. This does not imply that there is no interest in generalizations
and universals or near universals or tendencies, but again the style of working and the
rhetoric in HPSG/SBCG is usually different from the ones in Mainstream Generative
Grammar. Therefore, I think that the purported difference between SBCG and FCG does
not exist.
37 This is not a problem if all FCG papers are read as papers documenting the FCG-system (see Footnote 26
on page 344) since then it would be necessary to include these technical details. If the FCG papers are to
be read as theoretical linguistics papers that document a certain Construction Grammar analysis, the Lisp
statements and the implementational details are simply an obstacle.
separation is given up in FCG, it will remain an engineering project without much appeal
to the general linguist.
Exercises
10.7 Summary and classification
Further reading
There are two volumes on Construction Grammar in German: Fischer & Ste-
fanowitsch (2006) and Stefanowitsch & Fischer (2008). Deppermann (2006) dis-
cusses Construction Grammar from the point of view of conversational analysis.
Issue 37(3) of the Zeitschrift für germanistische Linguistik from 2009 was
also devoted to Construction Grammar. Goldberg (2003a) and Michaelis (2006)
are overview articles in English. Goldberg’s books constitute important contri-
butions to Construction Grammar (1995; 2006; 2009). Goldberg (1995) has argued
against lexical analyses such as those common in GB, LFG, CG, HPSG, and DG.
These arguments can be invalidated, however, as will be shown in Section 21.7.1.
Sag (1997), Borsley (2006), Jacobs (2008) and Müller & Lipenkova (2009) give ex-
amples of constructions that require a phrasal analysis if one wishes to avoid
postulating empty elements. Jackendoff (2008) discusses the noun-preposition-
noun construction that can only be properly analyzed as a phrasal construction
(see Section 21.10). The discussion on whether argument structure construc-
tions should be analyzed phrasally or lexically (Goldberg 1995; 2006; Müller
2006) culminated in a series of papers (Goldberg 2013a) and a target article by
Müller & Wechsler (2014a) with several responses in the same volume. Müller
(2018a) discusses phrasal LFG approaches to benefactives and resultatives and
compares them with lexical HPSG proposals showing one more time that phrasal
approaches face problems. Müller (2020b) compares HPSG and Construction
Grammar.
Tomasello’s publications on language acquisition (Tomasello 2000; 2003; 2005;
2006c) constitute a Construction Grammar alternative to the Principles & Parameters
theory of acquisition, one that does not have many of the problems that P&P
analyses have (for more on language acquisition, see Chapter 16). For more on
language acquisition and Construction Grammar, see Behrens (2009).
Dąbrowska (2004) looks at psycholinguistic constraints for possible grammat-
ical theories.
11 Dependency Grammar
Dependency Grammar (DG) is the oldest framework described in this book. According
to Hudson (2020), the basic assumptions made today in Dependency Grammar were
already present in the work of the Hungarian Sámuel Brassai in 1873 (see Imrényi 2013),
the Russian Aleksej Dmitrievsky in 1877 and the German Franz Kern (1884). The most
influential version of DG was developed by the French linguist Lucien Tesnière (1893–
1954). His foundational work Eléments de syntaxe structurale ‘Elements of structural
syntax’ was basically finished in 1938, only three years after Ajdukiewicz’s paper on
Categorial Grammar (1935), but the publication was delayed until 1959, five years after
his death. Since valence is central in Dependency Grammar, it is sometimes also referred
to as Valence Grammar. Tesnière’s ideas are widespread nowadays. The conceptions of
valence and dependency are present in almost all of the current theories (Ágel & Fischer
2010: 262–263, 284).
Although there is some work on English (Anderson 1971; Hudson 1984), Dependency
Grammar is most popular in central Europe and especially so in Germany (Engel 1996:
56–57). Ágel & Fischer (2010: 250) identified a possible reason for this: Tesnière’s original
work was not available in English until very recently (Tesnière 2015), but there has been
a German translation for more than 35 years now (Tesnière 1980). Since Dependency
Grammar focuses on dependency relations rather than linearization of constituents, it
is often felt to be more appropriate for languages with freer constituent order, which
is one reason for its popularity among researchers working on Slavic languages: the
New Prague School represented by Sgall, Hajičová and Panevová developed Dependency
Grammar further, beginning in the 1960s (see Hajičová & Sgall 2003 for an overview).
Igor A. Meľčuk and A. K. Žolkovskij started to work on a model called Meaning–Text
Theory in the 1960s in the Soviet Union; the model was also used in machine translation
projects (Mel’čuk 1964; 1981; 1988; Kahane 2003). Mel’čuk left the Soviet Union for
Canada in the 1970s and now works in Montréal.
Dependency Grammar is very widespread in Germany and among scholars of German
linguistics worldwide. It is used very successfully for teaching German as a foreign
language (Helbig & Buscha 1969; 1998). Helbig and Buscha, who worked in Leipzig, East
Germany, started to compile valence dictionaries (Helbig & Schenkel 1969) and later re-
searchers working at the Institut für Deutsche Sprache (Institute for German Language)
in Mannheim began similar lexicographic projects (Schumacher et al. 2004).
The following enumeration is a probably incomplete list of linguists who are/were
based in Germany: Vilmos Ágel (2000), Kassel; Klaus Baumgärtner (1965; 1970), Leipzig,
later Stuttgart; Ulrich Engel (1977; 2014), IDS Mannheim; Hans-Werner Eroms (1985;
1987; 2000), Passau; Heinz Happ, Tübingen; Peter Hellwig (1978; 2003), Heidelberg;
Jürgen Heringer (1996), Augsburg; Jürgen Kunze (1968; 1975), Berlin; Henning Lobin (1993),
Gießen; Klaus Schubert (1987), Hildesheim; Heinz Josef Weber (1997), Trier; Klaus Welke
(1988; 2011), Humboldt University Berlin; Edeltraud Werner (1993), Halle-Wittenberg.
Although work has been done in many countries and continuously over the decades
since 1959, a regular international conference was not established until 2011.1,2
From early on, Dependency Grammar was used in computational projects. Meľčuk
worked on machine translation in the Soviet Union (Mel’čuk 1964) and David G. Hays
worked on machine translation in the United States (Hays & Ziehe 1960). Jürgen Kunze,
based in East Berlin at the German Academy of Sciences, where he had a chair for compu-
tational linguistics, also started to work on machine translation in the 1960s. A book that
describes the formal background of the linguistic work was published as Kunze (1975).
Various researchers worked in the Collaborative Research Center 100 Electronic linguistic
research (SFB 100, Elektronische Sprachforschung) from 1973–1986 in Saarbrücken. The
main topic of this SFB was machine translation as well. There were projects on Russian
to German, French to German, English to German, and Esperanto to German transla-
tion. For work from Saarbrücken in this context see Klein (1971), Rothkegel (1976), and
Weissgerber (1983). Muraki et al. (1985) used Dependency Grammar in a project that ana-
lyzed Japanese and generated English. Richard Hudson started to work in a dependency
grammar-based framework called Word Grammar in the 1980s (Hudson 1984; 2007) and
Sleator and Temperley have been working on Link Grammar since the 1990s (Sleator &
Temperley 1991; Grinberg et al. 1995). Fred Karlsson’s Constraint Grammars (1990) have been
developed for many languages (bigger fragments are available for Danish, Portuguese,
Spanish, English, Swedish, Norwegian, French, German, Esperanto, Italian, and Dutch)
and are used for school teaching, corpus annotation and machine translation. An online
demo is available at the project website.3
In recent years, Dependency Grammar has become more and more popular among
computational linguists. The reason for this is that there are many annotated corpora
(treebanks) that contain dependency information.4 Statistical parsers are trained on such
treebanks (Yamada & Matsumoto 2003; Attardi 2006; Nivre 2003; Kübler et al. 2009; Bohnet
2010). Many of the parsers work for multiple languages since the general approach is
language independent. It is easier to annotate dependencies consistently since there are
fewer possibilities to do so. While syntacticians working in constituency-based models
may assume binary branching or flat models, high or low attachment of adjuncts, empty
elements or no empty elements and argue fiercely about this, it is fairly clear what the
dependencies in an utterance are. Therefore it is easy to annotate consistently and train
statistical parsers on such annotated data.
Apart from statistical modeling, there are also so-called deep processing systems, that
is, systems that rely on a hand-crafted, linguistically motivated grammar. I already men-
tioned Meľčuk’s work in the context of machine translation; Hays & Ziehe (1960) had
1 http://depling.org/. 2018-02-20.
2 A conference on Meaning–Text Theory has taken place biennially since 2003.
3 http://beta.visl.sdu.dk/constraint_grammar. 2018-02-20.
4 According to Kay (2000), the first treebank ever was developed by Hays and annotated dependencies.
a parser for Russian; Starosta & Nomura (1986) developed a parser that was used with
an English grammar, Jäppinen, Lehtola & Valkonen (1986) developed a parser that was
demoed with Finnish, Hellwig (1986; 2003; 2006) implemented grammars of German in
the framework of Dependency Unification Grammar, Hudson (1989) developed a Word
Grammar for English, Covington (1990) developed a parser for Russian and Latin, which
can parse discontinuous constituents, and Menzel (1998) implemented a robust parser of
a Dependency Grammar of German. Other work on computational parsing that should be
mentioned is Kettunen (1986), Lehtola (1986), and Menzel & Schröder (1998b). The following is a
list of languages for which Dependency Grammar fragments exist:
• German (Hellwig 1986; Coch 1996; Heinecke et al. 1998; Menzel & Schröder 1998a;
Hellwig 2003; 2006; Gerdes & Kahane 2001)
• Irish (Dhonnchadha & van Genabith 2006)
• Japanese (Muraki, Ichiyama & Fukumochi 1985)
The Constraint Grammar webpage5 additionally lists grammars for Basque, Catalan, En-
glish, Finnish, German, Italian, Sami, and Swedish.
is depicted by the dependency links between the node representing the verb and the
nodes representing the respective nouns. The nouns themselves require a determiner,
which again is shown by the dependency links to the and a respectively. Note that the
analysis presented here corresponds to the NP analysis that is assumed in HPSG for
instance, that is, the noun selects its specifier (see Section 9.1.1). It should be noted,
though, that the discussion whether an NP or a DP analysis is appropriate also took place
within the Dependency Grammar community (Hudson 1984: 90; Van Langendonck 1994;
Hudson 2004). See Engel (1977) for an analysis with the N as head and Welke (2011: 31)
for an analysis with the determiner as head.
The verb is the head of the clause and the nouns are called dependents. Alternative
terms for head and dependent are nucleus and satellite, respectively.
5 http://beta.visl.sdu.dk/constraint_grammar_languages.html, 2018-02-20.
11.1 General remarks on the representational format
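[Figure: dependency arcs over the words of The child reads a book, labeled ROOT (reads), SBJ (child), OBJ (book) and DET (the, a)]
Figure 11.2: Alternative presentation of the analysis of The child reads a book.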
A third form of representing the same dependencies provided in Figure 11.3 has the tree
format again. This tree results if we pull the root node in Figure 11.2 upwards.
            reads
        SBJ /   \ OBJ
       child     book
     DET |         | DET
        the        a
Figure 11.3: Alternative presentation of the analysis of The child reads a book.
Since we have a clear visualization of the dependency relation that represents the nucleus above
the dependents, we do not need to use arrows to encode this information. However,
some variants of Dependency Grammar – for instance Word Grammar – use mutual
dependencies. So for instance, some theories assume that his depends on child and child
depends on his in the analysis of his child. If mutual dependencies have to be depicted,
either arrows have to be used for all dependencies or some dependencies are represented
by downward lines in hierarchical trees and other dependencies by arrows.
Of course part of speech information can be added to Figures 11.2 and 11.3, grammatical
function labels could be added to Figure 11.1, and word order can be added to
Figure 11.3.
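The equivalence of the presentations in Figures 11.2 and 11.3 can be illustrated with a small sketch: the labeled arcs over the word string are stored once and the hierarchical view is derived from them. The encoding by word positions (with 0 as an artificial root) and the names WORDS, ARCS, dependents and render are assumptions made for this illustration; the labels are the ones used in the figures.

```python
from typing import Dict, List, Tuple

WORDS = ["The", "child", "reads", "a", "book"]

# For each word position (1-based): (position of its head, relation label).
# Position 0 is an artificial root above the verb.
ARCS: Dict[int, Tuple[int, str]] = {
    1: (2, "DET"),
    2: (3, "SBJ"),
    3: (0, "ROOT"),
    4: (5, "DET"),
    5: (3, "OBJ"),
}


def dependents(head: int) -> List[int]:
    """Positions of all words that depend directly on the given head."""
    return [dep for dep, (h, _) in sorted(ARCS.items()) if h == head]


def render(node: int, depth: int = 0) -> List[str]:
    """Indented tree view: every head is printed above its dependents."""
    label = ARCS[node][1]
    lines = ["  " * depth + f"{WORDS[node - 1]} ({label})"]
    for dep in dependents(node):
        lines.extend(render(dep, depth + 1))
    return lines


root = dependents(0)[0]
print("\n".join(render(root)))
```

Since the arcs are stated over positions in the word string, word order is retained in the flat encoding and abstracted away from in the rendered tree, just as in the two figures.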
The above figures depict the dependency relation that holds between a head and the
respective dependents. This can be written down more formally as an 𝑛-ary rule that is
similar to phrase structure rules that were discussed in Chapter 2 (Gaifman 1965: 305;
Hays 1964: 513; Baumgärtner 1970: 61; Heringer 1996: Section 4.1). For instance Baumgärt-
ner suggests the general rule format in (1):
11.1.2 Adjuncts
Another metaphor that was used by Tesnière is the drama metaphor. The core par-
ticipants of an event are the actants and apart from this there is the background, the
stage, the general setting. The actants are the arguments in other theories and the stage-
describing entities are called circumstants. These circumstants are modifiers and usually
analyzed as adjuncts in the other theories described in this book. As far as the representa-
tion of dependencies is concerned, there is not much of a difference between arguments
and adjuncts in Dependency Grammar. Figure 11.4 shows the analysis of (3):
(3) The child often reads the book slowly.
Figure 11.4: Analysis of The child often reads the book slowly.
The dependency annotation uses a technical device suggested by Engel (1977) to depict
different dependency relations: adjuncts are marked with an additional line upwards
from the adjunct node (see also Eroms 2000). An alternative way to specify the argu-
ment/adjunct, or rather the actant/circumstant distinction, is of course an explicit speci-
fication of the status as argument or adjunct. So one can use explicit labels for adjuncts
11.1 General remarks on the representational format
and arguments as it was done for grammatical functions in the preceding. German gram-
mars and valence dictionaries often use the labels E and A for Ergänzung and Angabe,
respectively.
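The labelling option can be made concrete with a small sketch. The following Python fragment is only an illustration: the class name `Dependency`, the helper `dependents`, and the decision to leave out the determiners are assumptions made for this example and are not part of Engel's or Tesnière's notation. It records the dependencies of (3) and marks each dependent as E (Ergänzung) or A (Angabe).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    head: str
    dependent: str
    status: str  # "E" (Ergaenzung, actant) or "A" (Angabe, circumstant)


# Dependencies for (3) "The child often reads the book slowly."
# Determiners are omitted to keep the sketch short.
DEPENDENCIES = [
    Dependency("reads", "child", "E"),
    Dependency("reads", "book", "E"),
    Dependency("reads", "often", "A"),
    Dependency("reads", "slowly", "A"),
]


def dependents(head, status, deps=DEPENDENCIES):
    """Return the dependents of `head` that carry the given status label."""
    return [d.dependent for d in deps if d.head == head and d.status == status]


print(dependents("reads", "E"))  # the actants
print(dependents("reads", "A"))  # the circumstants
```

The point of the sketch is that arguments and adjuncts are attached in the same way and differ only in the label.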
11.1.3 Linearization
So far we have seen dependency graphs that had connections to words that were lin-
earized in a certain order. The order of the dependents, however, is in principle not
determined by the dependency and therefore a Dependency Grammar has to contain ad-
ditional statements that take care of the proper linearization of linguistic objects (stems,
morphemes, words). Engel (2014: 50) assumes the dependency graph in Figure 11.5 for
the sentences in (4).6
(4) a. Gestern war ich bei Tom.
yesterday was I with Tom
‘I was with Tom yesterday.’
b. Ich war gestern bei Tom.
I was yesterday with Tom
c. Bei Tom war ich gestern.
with Tom was I yesterday
d. Ich war bei Tom gestern.
I was with Tom yesterday
Figure 11.5: Dependency graph for several orders of ich, war, bei Tom, and gestern ‘I was
with Tom yesterday.’ according to Engel (2014: 50)
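The idea that one unordered dependency graph underlies several surface orders can be sketched in a few lines of Python. This is a hedged illustration, not Engel's formalization: the function name `verb_second_orders` is hypothetical, and the only ordering constraint built in is the assumption that exactly one dependent precedes the finite verb.

```python
from itertools import permutations

FINITE_VERB = "war"
# Unordered dependents of the verb in (4); "bei Tom" is treated as one unit.
DEPENDENTS = ["ich", "bei Tom", "gestern"]


def verb_second_orders(verb, deps):
    """Put one dependent before the finite verb and the rest, in any order, after it."""
    orders = []
    for perm in permutations(deps):
        orders.append(" ".join([perm[0], verb, *perm[1:]]))
    return orders


for order in verb_second_orders(FINITE_VERB, DEPENDENTS):
    print(order)
```

The sketch produces six orders: the four given in (4) and two further ones that are not listed there. Any additional ordering preferences would have to be stated on top of this constraint.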
According to Engel (2014: 50), the correct order is enforced by surface syntactic rules,
for instance the rule stating that there is always exactly one element in the Vorfeld
in declarative main clauses and that the finite verb is in second position.7,8 Furthermore,
6 Engel uses Esub for the subject and Eacc, Edat, and Egen for the objects with respective cases.
7 “Die korrekte Stellung ergibt sich dann zum Teil aus oberflächensyntaktischen Regeln (zum Beispiel: im
Vorfeld des Konstativsatzes steht immer genau ein Element; das finite Verb steht an zweiter Stelle) […]”
8 Engel (1970: 81) provides counterexamples to the claim that there is exactly one element in the Vorfeld.
Related examples will be discussed in Section 11.7.1.
there are linearization rules that concern pragmatic properties, as for instance given in-
formation before new information. Another rule ensures that weak pronouns are placed
into the Vorfeld or at the beginning of the Mittelfeld. This conception of linear order is
problematic both for empirical and conceptual reasons and we will turn to it again in
Section 11.7.1. It should be noted here that approaches that deal with dependency alone
admit discontinuous realizations of heads and their dependents. Without any further
constraints, Dependency Grammars would share a problem that was already discussed
on page 341 in Section 10.6.3 on Embodied Construction Grammar and in Section 10.6.4.4
with respect to Fluid Construction Grammar: one argument could interrupt another ar-
gument as in Figure 11.6. In order to exclude such linearizations in languages in which
Figure 11.6: Unwanted analysis of dass die Frauen Türen öffnen ‘that the women open
doors’
they are impossible, it is sometimes assumed that analyses have to be projective, that is,
crossing branches like those in Figure 11.6 are not allowed. This basically reintroduces
the concept of constituency into the framework, since this means that all dependents of
a head have to be realized close to the head unless special mechanisms for liberation are
used (see for instance Section 11.5 on nonlocal dependencies).9 Some authors explicitly
use a phrase structure component to be able to formulate restrictions on serializations
of constituents (Gerdes & Kahane 2001; Hellwig 2003).
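The projectivity requirement can be stated as a simple check. The following Python sketch uses one common formulation: a graph is projective if, for every dependency, all words between the head and the dependent are dominated by that head. The encoding (a list of head positions with -1 for the root) and the two example graphs are assumptions made for illustration. In particular, the sketch leaves out dass and assumes that the unwanted analysis attaches die to Türen rather than to Frauen.

```python
def is_projective(heads):
    """heads[i] is the position of the head of word i, or -1 for the root."""

    def dominated_by(node, ancestor):
        # Follow head links upwards until the ancestor or the root is reached.
        while node != -1:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    for dep, head in enumerate(heads):
        if head == -1:
            continue
        low, high = sorted((dep, head))
        for between in range(low + 1, high):
            if not dominated_by(between, head):
                return False
    return True


# Word positions: 0 die, 1 Frauen, 2 Türen, 3 öffnen
WANTED = [1, 3, 3, -1]    # die depends on Frauen
UNWANTED = [2, 3, 3, -1]  # die depends on Türen, so Frauen interrupts "die ... Türen"

print(is_projective(WANTED))
print(is_projective(UNWANTED))
```

Under these assumptions the first graph passes the check and the second does not, since Frauen stands between Türen and its determiner without being dominated by Türen.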
11.1.4 Semantics
Tesnière already distinguished the participants of a verb in a way that was later common
in theories of semantic roles. He suggested that the first actant is the agent, the second
one a patient and the third a benefactive (Tesnière 2015: Chapter 106). Given that Depen-
dency Grammar is a lexical framework, all lexical approaches to argument linking can
9
While this results in units that are also assumed in phrase structure grammars, there is a difference: the
units have category labels in phrase structure grammars (for instance NP), which is not the case in Depen-
dency Grammars. In Dependency Grammars, one just refers to the label of the head (for instance the N
that belongs to child in Figure 11.4) or one refers to the head word directly (for instance, the word child in
Figure 11.3). So there are fewer nodes in Dependency Grammar representations (but see the discussion in
Section 11.7.2.3).
be adopted. However, argument linking and semantic role assignment are just a small
part of the problem that has to be solved when natural language expressions have to be
assigned a meaning. Issues regarding the scope of adjuncts and quantifiers have to be
solved and it is clear that dependency graphs representing dependencies without taking
into account linear order are not sufficient. An unordered dependency graph assigns
grammatical functions to a dependent of a head and hence it is similar in many respects
to an LFG f-structure.10 For a sentence like (25a) on page 230, repeated here as (5), one
gets the f-structure in (25b) on page 230. This f-structure contains a subject (David), an
object (a sandwich), and an adjunct set with two elements (at noon and yesterday).
(5) David devoured a sandwich at noon yesterday.
This is exactly what is encoded in an unordered dependency graph. Because of this
parallel it comes as no surprise that Bröker (2003: 308) suggested using glue semantics
(Dalrymple, Lamping & Saraswat 1993; Dalrymple 2001: Chapter 8) for Dependency
Grammar as well. Glue semantics was already introduced in Section 7.1.5.
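The parallel between an unordered dependency graph and an f-structure can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name `to_feature_structure` is hypothetical, at noon is treated as a single unit, the PRED values are plain word forms rather than semantic forms with argument lists, and the adjunct set is represented as a list.

```python
# (head, grammatical function, dependent) triples for (5); no linear order is stored.
DEPENDENCIES = [
    ("devoured", "SUBJ", "David"),
    ("devoured", "OBJ", "sandwich"),
    ("devoured", "ADJ", "at noon"),
    ("devoured", "ADJ", "yesterday"),
    ("sandwich", "SPEC", "a"),
]

SET_VALUED = {"ADJ"}  # functions whose value is a collection of structures


def to_feature_structure(head, deps=DEPENDENCIES):
    """Turn the dependents of `head` into a nested attribute-value structure."""
    fs = {"PRED": head}
    for h, function, dependent in deps:
        if h != head:
            continue
        value = to_feature_structure(dependent, deps)
        if function in SET_VALUED:
            fs.setdefault(function, []).append(value)
        else:
            fs[function] = value
    return fs


print(to_feature_structure("devoured"))
```

The resulting structure contains a subject, an object and two adjuncts, which is the information described above for the f-structure of (5).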
There are some variants of Dependency Grammar that have explicit treatments of
semantics. One example is Meaning–Text Theory (Mel’čuk 1988). Word Grammar is
another one (Hudson 1991: Chapter 7; 2007: Chapter 5). The notations of these theories
cannot be introduced here. It should be noted though that theories like Hudson’s Word
Grammar are rather rigid about linear order and do not assume that all the sentences
in (4) have the same dependency structure (see Section 11.5). Word Grammar is closer
to phrase structure grammar and therefore can have a semantics that interacts with
constituent order in the way it is known from constituent-based theories.
11.2 Passive
Dependency Grammar is a lexical theory and valence is the central concept. For this
reason, it is not surprising that the analysis of the passive is a lexical one. That is, it is
assumed that there is a passive participle that has a different valence requirement than
the active verb (Hudson 1990: Chapter 12; Eroms 2000: Section 10.3; Engel 2014: 53–54).
Our standard example in (6) is analyzed as shown in Figure 11.7 on the next page.
(6) [dass] der Weltmeister geschlagen wird
that the world.champion beaten is
‘that the world champion is (being) beaten’
This figure is an intuitive depiction of what is going on in passive constructions. A
formalization would probably amount to a lexical rule for the personal passive. See
Hellwig (2003: 629–630) for an explicit suggestion of a lexical rule for the analysis of the
passive in English.
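What such a lexical rule might look like can be sketched as follows. This is not Hellwig's rule and not a worked-out proposal. It is a minimal Python illustration that assumes valence frames are tuples of case labels, that the personal passive suppresses the nominative argument and realizes the accusative argument as nominative, and that the participle form is supplied by hand. The names `LexicalEntry` and `personal_passive` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str
    category: str
    valence: tuple  # case labels of the selected nominal dependents


def personal_passive(active, participle_form):
    """Map an active verb entry to a passive participle entry."""
    if "nom" not in active.valence or "acc" not in active.valence:
        raise ValueError("personal passive needs a subject and an accusative object")
    # Drop the subject; the accusative object becomes the new nominative.
    valence = tuple(
        "nom" if case == "acc" else case
        for case in active.valence
        if case != "nom"
    )
    return LexicalEntry(participle_form, "Vprt", valence)


schlagen = LexicalEntry("schlagen", "V", ("nom", "acc"))
geschlagen = personal_passive(schlagen, "geschlagen")
print(geschlagen)
```

For schlagen ‘to beat’ with a nominative and an accusative argument, the sketch yields a participle that selects a single nominative dependent, which corresponds to der Weltmeister in (6).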
Note that der Weltmeister ‘the world champion’ is not an argument of the passive aux-
iliary wird ‘is’ in Engel’s analysis. This means that subject–verb agreement cannot be
10
Tim Osborne (p. c. 2015) reminds me that this is not true in all cases: for instance non-predicative preposi-
tions are not reflected in f-structures, but of course they are present in dependency graphs.
Figure 11.7: Analysis of [dass] der Weltmeister geschlagen wird ‘that the world champion
is (being) beaten’ parallel to the analyses provided by Engel (2014: 53–54)
determined locally and some elaborate mechanism has to be developed for ensuring
agreement.11 Hudson (1990), Eroms (2000: Section 5.3) and Groß & Osborne (2009) as-
sume that subjects depend on auxiliaries rather than on the main verb. This requires
some argument transfer as it is common in Categorial Grammar (see Section 8.5.2) and
HPSG (Hinrichs & Nakazawa 1994a). The adapted analysis that treats the subject of the
participle as a subject of the auxiliary is given in Figure 11.8 on the facing page.
11
This problem would get even more pressing for cases of the so-called remote passive:
Here the object of zu reparieren, which is the object of a verb which is embedded two levels deep, agrees
with the auxiliaries wurde ‘was’ and wurden ‘were’. However, the question how to analyze these remote
passives is open in Engel’s system anyway and the solution of this problem would probably involve the
mechanism applied in HPSG: the arguments of zu reparieren are raised to the governing verb versucht,
passive applies to this verb and turns the object into a subject which is then raised by the auxiliary. This
explains the agreement between the underlying object of zu reparieren ‘to repair’ and wurde ‘was’. Hudson
(1997), working in the framework of Word Grammar, suggests an analysis of verbal complementation in
German that involves what he calls generalized raising. He assumes that both subjects and complements
may be raised to the governing head. Note that such an analysis involving generalized raising would make
an analysis of sentences like (i) straightforward, since the object would depend on the same head as the
subject, namely on hat ‘has’ and hence can be placed before the subject.
For a discussion of Groß & Osborne’s account of (ii) see page 596.
11.3 Verb position
Figure 11.8: Analysis of [dass] der Weltmeister geschlagen wird ‘that the world champion
is (being) beaten’ with the subject as dependent of the auxiliary
Figure 11.9: Analysis of [dass] jeder diesen Mann kennt ‘that everybody knows this man’
Figure 11.10: Analysis of Kennt jeder diesen Mann? ‘Does everybody know this man?’
11.5 Long-distance dependencies
Figure 11.11: Analysis of [dass] diesen Mann jeder kennt ‘that everybody knows this man’
Figure 11.12: Analysis of Diesen Mann kennt jeder. ‘This man, everybody knows.’ without
special treatment of fronting
Figure 11.13: Non-projective analysis of Wen glaubst du, dass ich gesehen habe? ‘Who do
you think I saw?’
graphs we have seen before in not being projective. This means that there are crossing
lines: the connection between Vprt and the N for wen ‘who’ crosses the lines connecting
glaubst ‘believe’ and du ‘you’ with their category symbols. Depending on the version
of Dependency Grammar assumed, this is seen as a problem or it is not. Let us explore
the two options: if discontinuity of the type shown in Figure 11.13 is allowed for as in
Heringer’s and Eroms’ grammars (Heringer 1996: 261; Eroms 2000: Section 9.6.2),14 there
has to be something in the grammar that excludes discontinuities that are ungrammati-
cal. For instance, an analysis of (11) as in Figure 11.14 on the next page should be excluded.
(11) * Wen glaubst ich du, dass gesehen habe?
who.ACC believe.2SG I.NOM you.NOM that seen have
Intended: ‘Who do you think I saw?’
Note that the order of elements in (11) is compatible with statements that refer to topolog-
ical fields as suggested by Engel (2014: 50): there is a Vorfeld filled by wen ‘who’, there is
a left sentence bracket filled by glaubst ‘believe’, and there is a Mittelfeld filled by ich ‘I’,
du ‘you’ and the clausal argument. Having pronouns like ich and du in the Mittelfeld is
13
Scherpenisse (1986: 84).
14
However, the authors mention the possibility of raising an extracted element to a higher node. See for
instance Eroms & Heringer (2003: 260).
Figure 11.14: Unwanted dependency graph of * Wen glaubst ich du, dass gesehen habe?
‘Who do you think I saw?’
perfectly normal. The problem is that these two pronouns come from different clauses:
du belongs to the matrix verb glaubst ‘believe’ while ich ‘I’ depends on gesehen ‘seen’
habe ‘have’. What has to be covered by a theory is the fact that fronting and extraposi-
tion target the left-most and right-most positions of a clause, respectively. This can be
modeled in constituency-based approaches in a straightforward way, as has been shown
in the previous chapters.
As an alternative to discontinuous constituents, one could assume additional mecha-
nisms that promote the dependency of an embedded head to a higher head in the struc-
ture. Such an analysis was suggested by Kunze (1968), Hudson (1997; 2000), Kahane
(1997), Kahane et al. (1998), and Groß & Osborne (2009). In what follows, I use the analysis
by Groß & Osborne (2009) as an example of such analyses. Groß & Osborne depict
the reorganized dependencies with a dashed line as in Figure 11.15.15,16 The origin of the
dependency (Vprt) is marked with a g and the dependent is connected to the node to
which it has risen (the topmost V) by a dashed line. Instead of realizing the accusative
dependent of gesehen ‘seen’ locally, information about the missing element is transferred
to a higher node and realized there.
The analysis of Groß & Osborne (2009) is not very precise. There is a g and there
is a dashed line, but sentences may involve multiple nonlocal dependencies. In (12) for
instance, there is a nonlocal dependency in the relative clauses den wir alle begrüßt haben
15
Eroms & Heringer (2003: 260) make a similar suggestion but do not provide any formal details.
16
Note that Groß & Osborne (2009) do not assume a uniform analysis of simple and complex V2 sentences.
That is, for cases that can be explained as local reordering they assume an analysis without rising. Their
analysis of (9) is the one depicted in Figure 11.12. This leads to problems which will be discussed in Sec-
tion 11.7.1.
Figure 11.15: Projective analysis of Wen glaubst du, dass ich gesehen habe? ‘Who do you
think I saw?’ involving rising
‘who we all greeted have’ and die noch niemand hier gesehen hat ‘who yet nobody here
seen has’: the relative pronouns are fronted inside the relative clauses. The phrase dem
Mann, den wir alle begrüßt haben ‘the man who we all greeted have’ is the fronted dative object of
gegeben ‘given’ and die noch niemand hier gesehen hat ‘who yet nobody here seen has’
is extraposed from the NP headed by Frau ‘woman’.
(12) Dem Mann, den wir alle begrüßt haben, hat die Frau das Buch gegeben, die
the man who we all greeted have has the woman the book given who
noch niemand hier gesehen hat.
yet nobody here seen has
‘The woman who nobody ever saw here gave the book to the man, who all of us
greeted.’
So this means that the connections (dependencies) between the head and the dislocated
element have to be made explicit. This is what Hudson (1997; 2000) does in his Word
Grammar analysis of nonlocal dependencies: in addition to dependencies that relate a
word to its subject, object and so on, he assumes further dependencies for extracted
elements. For example, wen ‘who’ in (10) – repeated here as (13) for convenience – is the
object of gesehen ‘seen’ and the extractee of glaubst ‘believe’ and dass ‘that’:
(13) Wen glaubst du, dass ich gesehen habe?
who believe you that I seen have
‘Who do you believe that I saw?’
Hudson states that the use of multiple dependencies in Word Grammar corresponds
to structure sharing in HPSG (Hudson 1997: 15). Nonlocal dependencies are modeled
as a series of local dependencies as it is done in GPSG and HPSG. This is important
since it allows one to capture extraction path marking effects (Bouma, Malouf & Sag
2001: 1–2, Section 3.2): for instance, there are languages that use a special form of the
complementizer for sentences from which an element is extracted. Figure 11.16 shows the
analysis of (13) in Word Grammar. The links above the words are the usual dependency
Figure 11.16: Projective analysis of Wen glaubst du, dass ich gesehen habe? ‘Who do you
think I saw?’ in Word Grammar involving multiple dependencies
links for subjects (s) and objects (o) and other arguments (r is an abbreviation for sharer,
which refers to verbal complements, l stands for clausal complement) and the links below
the words are links for extractees (x<). The link from gesehen ‘seen’ to wen ‘who’ is
special since it is both an object link and an extraction link (x<o). This link is an explicit
statement which corresponds to both the little g and the N that is marked by the dashed
line in Figure 11.15. In addition to what is there in Figure 11.15, Figure 11.16 also has an
extraction link from dass ‘that’ to wen ‘who’. One could use the graphic representation
of Engel, Eroms, and Groß & Osborne to display the Word Grammar dependencies: one
would simply add dashed lines from the Vprt node and from the Subjunction node to the
N node dominating wen ‘who’.
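The use of multiple dependencies can be illustrated with a small data structure in which one word has several heads. The Python sketch below is an illustration only: it lists just those links of (13) that are mentioned explicitly in the text (the subject link of du and the three links to wen), and the triple format and the function name `heads_of` are assumptions of this sketch rather than Word Grammar notation.

```python
from collections import defaultdict

# (head, label, dependent); "s" = subject, "x<" = extractee, "x<o" = extractee and object
LINKS = [
    ("glaubst", "s", "du"),
    ("glaubst", "x<", "wen"),
    ("dass", "x<", "wen"),
    ("gesehen", "x<o", "wen"),
]


def heads_of(word, links=LINKS):
    """Collect all heads of `word` together with the labels of the links."""
    result = defaultdict(list)
    for head, label, dependent in links:
        if dependent == word:
            result[head].append(label)
    return dict(result)


print(heads_of("wen"))
print(heads_of("du"))
```

In this encoding wen depends on three words at once, while du has a single head. The nonlocal dependency is thus represented as several local links, one of which is also the object link.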
While this looks simple, I want to add that Word Grammar employs further principles
that have to be fulfilled by well-formed structures. In the following I explain the No-
tangling Principle, the No-dangling Principle and the Sentence-root Principle.
the non-projective analysis. This principle also rules out (14b), where green depends on
peas but is not adjacent to peas. Since on selects peas the arrow from on to peas would
cross the one from peas to green.
(14) a. He lives on green peas.
b. * He lives green on peas.
The No-dangling Principle makes sure that there are no isolated word groups that are
not connected to the main part of the structure. Without this principle (14b) could be
analyzed with the isolated word green (Hudson 2000: 23).
The Sentence-root Principle is needed to rule out structures with more than one high-
est element. glaubst ‘believe’ is the root in Figure 11.16. There is no other word that
dominates it and selects for it. The principle makes sure that there is no other root.
So the principle rules out situations in which all elements in a phrase are roots, since
otherwise the No-dangling Principle would lose its force as it could be fulfilled trivially
(Hudson 2000: 25).
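The effect of the No-dangling Principle and the Sentence-root Principle can be approximated by two simple graph checks. The Python sketch below is a loose approximation and not Hudson's formulation: it assumes that links are (head, dependent) pairs, that a root is a word without a head, and that a word dangles if it cannot be reached from the root. The attachment of He and on to lives is an assumption made for the example.

```python
def find_roots(words, links):
    """Words that are not the dependent of any link."""
    dependents = {dep for _, dep in links}
    return [w for w in words if w not in dependents]


def dangling_words(words, links, root):
    """Words that cannot be reached from the root by following links downwards."""
    reached, agenda = {root}, [root]
    while agenda:
        head = agenda.pop()
        for h, dep in links:
            if h == head and dep not in reached:
                reached.add(dep)
                agenda.append(dep)
    return [w for w in words if w not in reached]


WORDS = ["He", "lives", "on", "green", "peas"]
# (14a): green is attached to peas
ATTACHED = [("lives", "He"), ("lives", "on"), ("on", "peas"), ("peas", "green")]
# Hypothetical analysis of (14b) in which green is left unattached
ISOLATED = [("lives", "He"), ("lives", "on"), ("on", "peas")]

print(find_roots(WORDS, ATTACHED), dangling_words(WORDS, ATTACHED, "lives"))
print(find_roots(WORDS, ISOLATED), dangling_words(WORDS, ISOLATED, "lives"))
```

In the second graph green is unconnected to the rest of the structure and, having no head, also counts as a second root, so an analysis with an isolated green fails both checks in this approximation.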
I added this rather complicated set of principles here in order to get a fair comparison
with phrase structure-based proposals. If continuity is assumed for phrases in general,
the three principles do not have to be stipulated. So, for example, LFG and HPSG do not
need these three principles.
Note that Hudson (1997: 16) assumes that the element in the Vorfeld is extracted even
for simple sentences like (9). I will show in Section 11.7.1 why I think that this analysis
has to be preferred over analyses assuming that simple sentences like (9) are just order
variants of corresponding verb-initial or verb-final sentences.
11.6 New developments and theoretical variants
            substance    process
concrete    noun         verb
abstract    adjective    adverb
specified in Table 11.1.17 Tesnière assumed these categories to be universal and suggested
that there are constraints on the way in which these categories may depend on others.
According to Tesnière, nouns and adverbs may depend on verbs, adjectives may de-
pend on nouns, and adverbs may depend on adjectives or adverbs. This situation is
depicted in the general dependency graph in Figure 11.17. The ‘*’ means that there can
be an arbitrary number of dependencies between Es. It is of course easy to find examples
in which adjectives depend on verbs and sentences (verbs) depend on nouns. Such cases
are handled via so-called transfers in Tesnière’s system. Furthermore, conjunctions, de-
terminers, and prepositions are missing from this set of categories. For the combination
of these elements with their dependents Tesnière used special combinatoric relations:
junction and transfer. We will deal with these in the following subsection.
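The constraints on which category may depend on which can be written down as a small lookup table. The following Python sketch encodes only the restrictions stated above. The dictionary `ALLOWED_HEADS` and the function `may_depend_on` are hypothetical names, and the empty entry for verbs reflects the assumption that verbs do not depend on other categories unless a transfer applies.

```python
# For each category: the categories it may depend on without a transfer.
ALLOWED_HEADS = {
    "noun": {"verb"},
    "adjective": {"noun"},
    "adverb": {"verb", "adjective", "adverb"},
    "verb": set(),
}


def may_depend_on(dependent, head):
    """True if `dependent` may depend directly on `head`."""
    return head in ALLOWED_HEADS[dependent]


print(may_depend_on("noun", "verb"))       # a noun depending on a verb
print(may_depend_on("adjective", "verb"))  # needs a transfer in this system
```

Configurations for which the check fails, such as an adjective depending on a verb, are exactly the ones for which a transfer is needed.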
17
As Weber (1997: 77) points out this categorization is not without problems: in what sense is Angst ‘fear’ a
substance? Why should glauben ‘believe’ be a concrete process? See also Klein (1971: Section 3.4) for the
discussion of schlagen ‘to beat’ and Schlag ‘the beat’ and similar cases. Even if one assumes that Schlag is
derived from the concrete process schlag- by a transfer into the category O, the assumption that such Os
stand for concrete substances is questionable.
11.6.2.1 Junction
Figure 11.18 illustrates the junction relation: the two conjuncts John and Mary are con-
nected with the conjunction and. It is interesting to note that both of the conjuncts are
connected to the head laugh.
In the case of two coordinated nouns we get dependency graphs like the one in Fig-
ure 11.19. Both nouns are connected to the dominating verb and both nouns dominate
the same determiner.
verb does not select a Conj, but an N. The trick that could be applied here is basically the
same trick as in Categorial Grammar (see Section 21.6.2): the category of the conjunction
in Categorial Grammar is (X\X)/X. We have a functor that takes two arguments of the
same category and the result of the combination is an object that has the same category
as the two arguments. Translating this approach to Dependency Grammar, one would
get an analysis as the one depicted in Figure 11.20 rather than the ones in Figure 11.18
and Figure 11.19. The figure for all girls and boys looks rather strange since both the
Figure 11.20: Analysis of coordination without junction and the conjunction as head
determiner and the two conjuncts depend on the conjunction, but since the two Ns are
selecting a Det, the same is true for the result of the coordination. In Categorial Grammar
notation, the category of the conjunction would be ((NP\Det)\(NP\Det))/(NP\Det) since
X is instantiated by the nouns which would have the category (NP\Det) in an analysis
in which the noun is the head and the determiner is the dependent.
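How the category (X\X)/X combines with its arguments can be traced step by step. The Python sketch below is a minimal illustration: it assumes the notation used above, in which the result stands to the left of the slash and the argument to the right, so that X/Y looks for Y to its right and X\Y looks for Y to its left. Categories are encoded as nested tuples, and all function names are hypothetical.

```python
def slash(result, arg):
    """X/Y: yields X when combined with a Y to its right."""
    return ("/", result, arg)


def backslash(result, arg):
    """X\\Y: yields X when combined with a Y to its left."""
    return ("\\", result, arg)


def forward(functor, argument):
    """Combine a functor with the argument on its right."""
    if isinstance(functor, tuple) and functor[0] == "/" and functor[2] == argument:
        return functor[1]
    return None


def backward(argument, functor):
    """Combine a functor with the argument on its left."""
    if isinstance(functor, tuple) and functor[0] == "\\" and functor[2] == argument:
        return functor[1]
    return None


def conjunction(x):
    """The category (X\\X)/X for a given X."""
    return slash(backslash(x, x), x)


NOUN = backslash("NP", "Det")                     # noun as head selecting a determiner
and_boys = forward(conjunction(NOUN), NOUN)       # and + boys
girls_and_boys = backward(NOUN, and_boys)         # girls + [and boys]
all_girls_and_boys = backward("Det", girls_and_boys)  # all + [girls and boys]

print(girls_and_boys == NOUN, all_girls_and_boys)
```

The coordination girls and boys ends up with the same category as a single noun, and combining it with the determiner yields an NP.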
Note that both approaches have to come up with an explanation of subject–verb agree-
ment. Tesnière’s original analysis assumes two dependencies between the verb and the
individual conjuncts.19 As the conjuncts are singular and the verb is plural, agreement
cannot be modeled in tandem with dependency relations in this approach. If the second
analysis finds ways of specifying the agreement properties of the coordination in the
conjunction, the agreement facts can be accounted for without problems.
The alternative to a headed approach as depicted in Figure 11.20 is an unheaded one.
Several authors working in phrase structure-based frameworks suggested analyses of
coordination without a head. Such analyses are also assumed in Dependency Grammar
(Hudson 1988; Kahane 1997). Hudson (1988) and others who make similar assumptions
assume a phrase structure component for coordination: the two nouns and the conjunc-
tion are combined to form a larger object which has properties which do not correspond
to the properties of any of the combined words.
Similarly, the junction-based analysis of coordination poses problems for the interpre-
tation of the representations. If semantic role assignment happens in parallel to depen-
dency relations, there would be a problem with graphs like the one in Figure 11.18, since
19
Eroms (2000: 467) notes the agreement problem and describes the facts. In his analysis, he connects the
first conjunct to the governing head, although it seems to be more appropriate to assume an internally
structured coordination structure and then connect the highest conjunction.
the semantic role of laugh cannot be filled by John and Mary simultaneously. Rather it
is filled by one entity, namely the one that refers to the set containing John and Mary.
This semantic representation would belong to the phrase John and Mary and the natu-
ral candidate for being the topmost entity in this coordination is and, as it embeds the
meaning of John and the meaning of Mary: and′(John′, Mary′).
Such junctions are also assumed for the coordination of verbs. This is, however, not
without problems, since adjuncts can have scope over the conjunct that is closest to
them or over the whole coordination. An example is the following sentence from Levine
(2003: 217):
(15) Robin came in, found a chair, sat down, and whipped off her logging boots in
exactly thirty seconds flat.
The adjunct in exactly thirty seconds flat can refer either to whipped off her logging boots
as in (16a) or scope over all three conjuncts together as in (16b):
(16) a. Robin came in, found a chair, sat down, and [[whipped off her logging boots] in
exactly thirty seconds flat].
b. Robin [[came in, found a chair, sat down, and whipped off her logging boots] in
exactly thirty seconds flat].
The Tesnièreian analysis in Figure 11.21 corresponds to (17), while an analysis that treats
the conjunction as the head as in Figure 11.22 on the next page corresponds to (16b).
(17) Robin came in in exactly thirty seconds flat and Robin found a chair in exactly
thirty seconds flat and Robin whipped off her logging boots in exactly thirty seconds
flat.
The reading in (17) results when an adjunct refers to each conjunct individually rather
than referring to a cumulative event that is expressed by a verb phrase as in (16b).
Levine (2003: 217) discusses these sentences in connection with the HPSG analysis of
extraction by Bouma, Malouf & Sag (2001). Bouma, Malouf & Sag suggest an analysis in
which adjuncts are introduced lexically as dependents of a certain head. Since adjuncts
are introduced lexically, the coordination structures basically have the same structure as
the ones assumed in a Tesnièreian analysis. It may be possible to come up with a way to
get the semantic composition right even though the syntax does not correspond to the
semantic dependencies (see Chaves 2009 for suggestions), but it is clear that it is simpler
to derive the semantics from a syntactic structure which corresponds to what is going
on in semantics.
11.6.2.2 Transfer
Transfers are used in Tesnière’s system for the combination of words or phrases with a
head of one of the major categories (for instance nouns) with words in minor categories
(for instance prepositions). In addition, transfers can transfer a word or phrase into
another category without any other word participating.
Figure 11.23 shows an example of a transfer. The preposition in causes a category
change: while Traumboot ‘dream boat’ is an O (noun), the combination of the preposi-
tion and the noun is an E. The example shows that Tesnière used the grammatical cate-
gory to encode grammatical functions. In theories like HPSG there is a clear distinction:
there is information about part of speech on the one hand and the function of elements
as modifiers and predicates on the other hand. The modifier function is encoded by the
selectional feature MOD, which is independent of the part of speech. It is therefore pos-
sible to have modifying and non-modifying adjectives, modifying and non-modifying
prepositional phrases, modifying and non-modifying noun phrases and so on. For the
example at hand, one would assume a preposition with directional semantics that selects
for an NP. The preposition is the head of a PP with a filled MOD value.
steigt (I) ‘enter’; er (O) ‘he’; E: in ‘in’, Traumboot (O) ‘dream.boat’; der ‘the’
Figure 11.23: Transfer with an example adapted from Weber (1997: 83)
Another area in which transfer is used is morphology. For instance, the derivation of
French frappant ‘striking’ by suffixation of -ant to the verb stem frapp is shown in Fig-
ure 11.24. Such transfers can be subsumed under the general connection relation if the
affix is treated as the head. Morphologists working in realizational morphology and con-
struction morphology argue against such morpheme-based analyses since they involve
a lot of empty elements for conversions as for instance the conversion of the verb play
into the noun play (see Figure 11.25). Consequently, lexical rules are assumed for deriva-
tions and conversions in theories like HPSG. HPSG lexical rules are basically equivalent
to unary branching rules (see the discussion of (41) on page 289 and Section 19.5). The
affixes are integrated into the lexical rules or into realization functions that specify the
morphological form of the item that is licensed by the lexical rule.
• unary phrase structure rules or binary branching phrase structure rules together
with an empty head if a phrase is converted to another category without any ad-
ditional element present or
• a (unary) lexical rule if a word or stem is mapped to a word or a stem.
For further discussion of the relation between Tesnière’s transfer rules and constituency
rules see Kahane & Osborne (2015: Section 4.9.1–4.9.2). Kahane & Osborne point out
that transfer rules can be used to model exocentric constructions, that is, constructions
in which there is no single part that could be identified as the head. For more on headless
constructions see Section 11.7.2.4.
11.6.3 Scope
As Kahane & Osborne (2015: lix) point out, Tesnière uses so-called polygraphs to rep-
resent scopal relations. So, since that you saw yesterday in (18) refers to red cars rather
than cars alone, this is represented by a line that starts at the connection between red
and cars rather than on one of the individual elements (Tesnière 2015: 150, Stemma 149).
(18) red cars that you saw yesterday
Tesnière’s analysis is depicted in the left representation in Figure 11.26. It is worth not-
ing that this representation corresponds to the phrase structure tree on the right of Fig-
ure 11.26. The combination B between red and cars corresponds to the B node in the right-
hand figure and the combination A of red cars and that you saw yesterday corresponds to
the A node. So, what is made explicit and is assigned a name in phrase structure gram-
mars remains nameless in Tesnière’s analysis, but due to the assumption of polygraphs,
it is possible to refer to the combinations. See also the discussion of Figure 11.46, which
shows additional nodes that Hudson assumes in order to model semantic relations.
Figure 11.26: Tesnière’s way of representing scope and the comparison with phrase
structure-based analyses by Kahane & Osborne (2015: lix)
11.7.1 Linearization
We have seen several approaches to linearization in this chapter. Many just assume a
dependency graph and some linearization according to the topological fields model. As
has been argued in Section 11.5, allowing discontinuous serialization of a head and its
dependents opens up Pandora’s box. I have discussed the analysis of nonlocal dependen-
cies by Kunze (1968), Hudson (1997; 2000), Kahane, Nasr & Rambow (1998), and Groß &
Osborne (2009). With the exception of Hudson those authors assume that dependents
of a head rise to a dominating head only in those cases in which a discontinuity would
arise otherwise. However, there seems to be a reason to assume that fronting should be
treated by special mechanisms even in cases that allow for continuous serialization. For
instance, the ambiguity or lack of ambiguity of the examples in (19) cannot be explained
in a straightforward way:
11.7 Summary and classification
for (19b) and (19c) do not differ. The graphs would be the same, only differing in serial-
ization. Therefore, differences in scope could not be derived from the dependencies and
complicated statements like (20) would be necessary:
(20) If a dependent is linearized in the Vorfeld it can both scope over and under all
other adjuncts of the head it is a dependent of.
Eroms (1985: 320) proposes an analysis of negation in which the negation is treated as
the head; that is, the sentence in (21) has the structure in Figure 11.28.20
(21) Er kommt nicht.
he comes not
‘He does not come.’
20
But see Eroms (2000: Section 11.2.3).
Figure 11.28: Analysis of negation according to Eroms (1985: 320)
This analysis is equivalent to analyses in the Minimalist Program assuming a NegP and
it has the same problem: the category of the whole object is Adv, but it should be V. This
is a problem since higher predicates may select for a V rather than an Adv.21
The same is true for constituent negation or other scope bearing elements. For exam-
ple, the analysis of (22) would have to be the one in Figure 11.29.
(22) der angebliche Mörder
the alleged murderer
Figure 11.29: Analysis that would result if one considered all scope-bearing adjuncts to
be heads
This structure would have the additional problem of being non-projective. Eroms does
treat the determiner differently from what is assumed here, so this type of non-projectiv-
ity may not be a problem for him. However, the head analysis of negation would result
in non-projectivity in so-called coherent constructions in German. The sentence in (23)
has two readings: in the first reading, the negation scopes over singen ‘sing’ and in the
second one over singen darf ‘sing may’.
21
See for instance the analysis of embedded sentences like (23) below.
Figure 11.30: Analysis that results if one assumes the negation to be a head
analysis in which the negation is a word part (‘Wortteiläquivalent’). This does not help
here, however, since the negation and the verb are not adjacent in V2 contexts like
(19a) and not even in verb-final contexts like (23). Eroms would have to assume that the
object to which the negation attaches is the whole verbal complex singen darf ‘sing
may’, that is, a complex object consisting of two words.
This leaves us with the analysis provided in Figure 11.27 and hence with a problem
since we have one structure with two possible adjunct realizations that correspond to
different readings. This is not predicted by an analysis that treats the two possible lin-
earizations simply as alternative orderings.
Thomas Groß (p. c. 2013) suggested an analysis in which oft does not depend on the
verb but on the negation. This corresponds to constituent negation in phrase structure
approaches. The dependency graph is shown on the left-hand side in Figure 11.31. The
figure on the right-hand side shows the graph for the corresponding verb-final sentence.
The reading corresponding to constituent negation can be illustrated with contrastive
expressions. While in (24a) it is only oft ‘often’ which is negated, it is oft gelesen ‘often
read’ that is in the scope of negation in (24b).
(24) a. Er hat das Buch nicht oft gelesen, sondern selten.
he has the book not often read but seldom
‘He did not read the book often, but seldom.’
b. Er hat das Buch nicht oft gelesen, sondern selten gekauft.
he has the book not often read but seldom bought
‘He did not read the book often but rather bought it seldom.’
Figure 11.31: Dependency graph for Oft liest er das Buch nicht. ‘He does not read the book
often.’ according to Groß and verb-final variant
Figure 11.32: Possible syntactic analyses for er das Buch nicht oft liest ‘he does not read
the book often’
These two readings correspond to the two phrase structure trees in Figure 11.32. Note
that in an HPSG analysis, the adverb oft would be the head of the phrase nicht oft ‘not
often’. This is different from the Dependency Grammar analysis suggested by Groß.
Furthermore, the Dependency Grammar analysis has two structures: a flat one with all
adverbs depending on the same verb and one in which oft depends on the negation. The
phrase structure-based analysis has three structures: one with the order oft before nicht,
one with the order nicht before oft and the one with direct combination of nicht and oft.
The point about the example in (19a) is that one of the first two structures is missing in
the Dependency Grammar representations. This probably does not make it impossible
to derive the semantics, but it is more difficult than it is in constituent-based approaches.
Figure 11.33: Dependency graph for Dem Saft eine kräftige Farbe geben Blutorangen.
‘Blood oranges give the juice a strong color.’
Furthermore, note that models that directly relate dependency graphs to topological
fields will not be able to account for sentences like (25).
(25) Dem Saft eine kräftige Farbe geben Blutorangen.22
the juice a strong color give blood.oranges
‘Blood oranges give a strong color to the juice.’
The dependency graph of this sentence is given in Figure 11.33.
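The problem that (25) poses for a direct mapping from dependency graphs to topological fields can be made concrete with a small Python sketch. The encoding of the graph as a dictionary from word positions to head positions and the function name preverbal_dependents are my own assumptions for illustration; the head assignments follow the usual analysis in which the three nouns depend on the verb and the determiners and the adjective depend on their nouns.

```python
# Hypothetical encoding of the dependency graph for (25): each word position
# maps to the position of its head; the finite verb is the root (head None).
words = ["Dem", "Saft", "eine", "kräftige", "Farbe", "geben", "Blutorangen"]
heads = {0: 1, 1: 5, 2: 4, 3: 4, 4: 5, 5: None, 6: 5}


def preverbal_dependents(heads, verb):
    """Return the direct dependents of `verb` that precede it in the string."""
    return [dep for dep, head in heads.items() if head == verb and dep < verb]


finite_verb = 5
fronted = preverbal_dependents(heads, finite_verb)

# Heads of the phrases that are serialized before the finite verb.
print([words[i] for i in fronted])
# A naive V2 check: at most one dependent of the finite verb may precede it.
print(len(fronted) <= 1)
```

Under these assumptions two dependents of geben precede it, so a check that allows only one dependent of the finite verb in front of it rejects the sentence.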
Such apparent multiple frontings are not restricted to NPs. Various types of depen-
dents can be placed in the Vorfeld. An extensive discussion of the data is provided in
Müller (2003a). Additional data have been collected in a research project on multiple
frontings and information structure (Bildhauer 2011). Any theory based on dependencies
alone and not allowing for empty elements is forced to give up the restriction commonly
assumed in the analysis of V2 languages, namely that the verb is in second position. In
comparison, analyses like GB and those HPSG variants that assume an empty verbal head
can assume that a projection of such a verbal head occupies the Vorfeld. This explains
why the material in the Vorfeld behaves like a verbal projection containing a visible
verb: such Vorfelds are internally structured topologically. They may have a filled Nach-
feld and even a particle that fills the right sentence bracket. See Müller (2005c; 2017a)
for further data, discussion, and a detailed analysis. The equivalent of the analysis in
Groß & Osborne’s framework (2009) would be something like the graph that is shown
in Figure 11.34, but note that Groß & Osborne (2009: 73) explicitly reject empty elements,
and in any case an empty element which is stipulated just to get the multiple fronting
cases right would be entirely ad hoc.23 It is important to note that the issue is not solved
by simply dropping the V2 constraint and allowing dependents of the finite verb to be
22
Bildhauer & Cook (2010) found this example in the Deutsches Referenzkorpus (DeReKo), hosted at Institut
für Deutsche Sprache, Mannheim: http://www.ids-mannheim.de/kl/projekte/korpora, 2018-02-20.
23
I stipulated such an empty element in a linearization-based variant of HPSG allowing for discontinuous
constituents (Müller 2002b), but later modified this analysis so that only continuous constituents are al-
lowed, verb position is treated as head-movement and multiple frontings involve the same empty verbal
head as is used in the verb movement analysis (Müller 2005c; 2017a).
Figure 11.34: Dependency graph for Dem Saft eine kräftige Farbe geben Blutorangen.
‘Blood oranges give the juice a strong color.’ with an empty verbal head
for the Vorfeld
realized to its left, since the fronted constituents do not necessarily depend on the finite
verb as the examples in (26) show:
(26) a. [Gezielt] [Mitglieder] [im Seniorenbereich] wollen die Kendoka
specifically members in.the senior.citizens.sector want.to the Kendoka
allerdings nicht werben.24
however not recruit
‘However, the Kendoka do not intend to target the senior citizens sector with
their member recruitment strategy.’
b. [Kurz] [die Bestzeit] hatte der Berliner Andreas Klöden […] gehalten.25
briefly the best.time had the Berliner Andreas Klöden held
‘Andreas Klöden from Berlin had briefly held the record time.’
And although the respective structures are marked, such multiple frontings can even
cross clause boundaries:
(27) Der Maria einen Ring glaube ich nicht, daß er je schenken wird.26
the.DAT Maria a.ACC ring believe I not that he ever give will
‘I don’t think that he would ever give Maria a ring.’
If such dependencies are permitted, it is really difficult to constrain them. The details
cannot be discussed here, but the reader is referred to Müller (2005c; 2017a).
Note also that Engel’s statement regarding the linear order in German sentences (2014:
50) referring to one element in front of the finite verb (see footnote 7) is very imprecise.
24
taz, 07.07.1999, p. 18. Quoted from Müller (2002b).
25
Märkische Oderzeitung, 28./29.07.2001, p. 28.
26
Fanselow (1993: 67).
One can only guess what is intended by the word element. One interpretation is that it
is a continuous constituent in the classical sense of constituency-based grammars. An
alternative would be that there is a continuous realization of a head and some but not
necessarily all of its dependents. This alternative would allow an analysis of the extraposition
in (28) with discontinuous constituents, as depicted in Figure 11.35.
(28) Ein junger Kerl stand da, mit langen blonden Haaren, die sein Gesicht
a young guy stood there with long blond hair that his face
einrahmten, […]27
framed
‘A young guy was standing there with long blond hair that framed his face’
Figure 11.35: Dependency graph for Ein junger Kerl stand da, mit langen blonden Haaren.
‘A young guy was standing there with long blond hair.’ with a discontinuous
constituent in the Vorfeld
A formalization of such an analysis is not trivial, since one has to be precise about what
exactly can be realized discontinuously and which parts of a dependency must be real-
ized continuously. Kathol & Pollard (1995) developed such an analysis of extraposition
in the framework of HPSG. See also Müller (1999b: Section 13.3). I discuss the basic
mechanisms for such linearization analyses in HPSG in the following section.
27
Charles Bukowski, Der Mann mit der Ledertasche. München: Deutscher Taschenbuch Verlag, 1994, p. 201,
translation by Hans Hermann.
structure grammars directly since some involve more elaborate structure. For instance,
the rule S → NP, VP cannot be translated into a dependency rule, since NP and VP are
both complex categories.
In what follows, I want to show how the dependency graph in Figure 11.1 can be recast
as headed phrase structure rules that license a similar tree, namely the one in Figure 11.37.
I did not use the labels NP and VP to keep the two figures maximally similar. The P part
of NP and VP refers to the saturation of a projection and is often ignored in figures. See
Chapter 9 on HPSG, for example. The grammar that licenses the tree is given in (29),
again ignoring valence information.
(29) N → D N      N → child   D → the   D → a
     V → N V N    N → book    V → reads
If one replaces the N and V in the right-hand side of the two left-most rules in (29) with
the respective lexical items and then removes the rules that license the words, one arrives
at the lexicalized variant of the grammar given in (30):
(30) N → D book    D → the   D → a
     N → D child   V → N reads N
Figure 11.37: Analysis of The child reads a book. in a phrase structure with flat rules
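The lexicalization step that leads from (29) to (30) can be sketched in Python. The rule encoding and the names RULES, CATEGORIES and lexicalize are my own and serve illustration only; in particular, the sketch assumes that the head daughter of a rule is the daughter that bears the same category as the mother.

```python
# Grammar (29) as (mother, daughters) pairs.
RULES = [
    ("N", ["D", "N"]), ("V", ["N", "V", "N"]),
    ("N", ["child"]), ("N", ["book"]), ("V", ["reads"]),
    ("D", ["the"]), ("D", ["a"]),
]
CATEGORIES = {"N", "V", "D"}


def lexicalize(rules):
    """Insert lexical heads into phrasal rules and drop the heads' word rules."""
    phrasal = [(lhs, rhs) for lhs, rhs in rules
               if any(symbol in CATEGORIES for symbol in rhs)]
    lexical = [(lhs, rhs[0]) for lhs, rhs in rules if (lhs, rhs) not in phrasal]
    head_categories = {lhs for lhs, _ in phrasal}
    result = []
    for lhs, rhs in phrasal:
        # Assumption: the head is the daughter with the mother's category.
        head_position = rhs.index(lhs)
        for category, word in lexical:
            if category == lhs:
                result.append(
                    (lhs, rhs[:head_position] + [word] + rhs[head_position + 1:]))
    # Word rules for categories that never head a phrasal rule (here D) stay.
    result += [(category, [word]) for category, word in lexical
               if category not in head_categories]
    return result


for lhs, rhs in lexicalize(RULES):
    print(lhs, "→", " ".join(rhs))
```

The sketch yields one rule per lexical head (N → D child, N → D book, V → N reads N) and keeps the two determiner rules, since D never functions as the head of a phrasal rule in (29).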
28
As mentioned on page 371, Gaifman (1965: 305), Hays (1964: 513), Baumgärtner (1970: 57) and Heringer
(1996: 37) suggest a general rule format for dependency rules that has a special marker (‘*’ and ‘~’, respec-
tively) in place of the lexical words in (30). Heringer’s rules have the form in (31):
X is the category of the head, Y1, Y2, and Y3 are dependents of the head and ‘~’ is the position into which
the head is inserted.
29
See page 192 for a similar rule in GPSG and see Kasper (1994) for an HPSG analysis of German that assumes
entirely flat structures and integrates an arbitrary number of adjuncts. Dahl (1980) argues that one needs
“higher nodes” (N̄ nodes and VP nodes in other terminology) for adjunct attachment for semantic reasons.
I think this is not correct since – as Kasper showed – relational constraints could be used to determine
complex semantic representations. I agree though that assuming these nodes makes things a lot easier. See
also footnote 36.
Such generalized phrase structures would give us the equivalent of projective Depen-
dency Grammars.30 However, as we have seen, some researchers allow for crossing
edges, that is, for discontinuous constituents. In what follows, I show how such Depen-
dency Grammars can be formalized in HPSG.
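What projectivity amounts to can be stated as a short check over dependency graphs. The following Python sketch is a hedged illustration rather than a formalization from the literature: the list encoding (position i holds the position of the head of word i) and the name is_projective are my assumptions. The check uses the common definition according to which every word between a head and one of its dependents must itself be dominated by that head. The second example encodes the analysis of (22) with the adjective as head, assuming that the determiner depends on the noun.

```python
def is_projective(heads):
    """heads[i] is the position of the head of word i, or None for the root."""

    def dominated_by(node, ancestor):
        # Walk up the graph from `node` and look for `ancestor`.
        while node is not None:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    for dependent, head in enumerate(heads):
        if head is None:
            continue
        left, right = sorted((dependent, head))
        # Every word between head and dependent must be dominated by the head.
        if not all(dominated_by(k, head) for k in range(left + 1, right)):
            return False
    return True


# The child reads a book: the -> child -> reads <- book <- a
print(is_projective([1, 2, None, 4, 2]))
# der angebliche Mörder with angebliche as root, Mörder below it, der below Mörder
print(is_projective([2, None, 1]))
```

The first graph is projective. The second one is not, since the head angebliche intervenes between Mörder and its dependent der without being dominated by Mörder.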
(33) ⟨ a, b ⟩ ⃝ ⟨ c, d ⟩ = ⟨ a, b, c, d ⟩ ∨
⟨ a, c, b, d ⟩ ∨
⟨ a, c, d, b ⟩ ∨
⟨ c, a, b, d ⟩ ∨
⟨ c, a, d, b ⟩ ∨
⟨ c, d, a, b ⟩
The result is a disjunction of six lists. a is ordered before b and c before d in all of
these lists, since this is also the case in the two lists ⟨ a, b ⟩ and ⟨ c, d ⟩ that have been
30
Sylvain Kahane (p. c. 2015) states that binarity is important for Dependency Grammars, since there is one
rule for the subject, one for the object and so on (as for instance in Kahane 2009, which is an implementation
of Dependency Grammar in the HPSG formalism). However, I do not see any reason to disallow for flat
structures. For instance, Ginzburg & Sag (2000: 364) assumed a flat rule for subject auxiliary inversion in
HPSG. In such a flat rule the specifier/subject and the other complements are combined with the verb in
one go. This would also work for more than two valence features that correspond to grammatical functions
like subject, direct object, indirect object. See also footnote 28 on flat rules.
combined. But apart from this, b can be placed before, between or after c and d. Every
word comes with a domain value that is a list that contains the word itself:
(34) Domain contribution of single words, here gibt ‘gives’:
     1 [ PHON   gibt
         SYNSEM …
         DOM    ⟨ 1 ⟩ ]
The description in (34) may seem strange at first glance, since it is cyclic, but it can be
understood as a statement saying that gibt contributes itself to the items that occur in
linearization domains.
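The shuffle operation illustrated in (33) can be sketched as a small recursive Python function. The function name shuffle and the use of Python lists for the lists in (33) are my choices for illustration; the disjunction in (33) corresponds to the list of results returned by the function.

```python
def shuffle(xs, ys):
    """All interleavings of xs and ys that keep the internal order of each list."""
    if not xs:
        return [list(ys)]
    if not ys:
        return [list(xs)]
    # Either the first element of xs or the first element of ys comes first.
    return ([[xs[0]] + rest for rest in shuffle(xs[1:], ys)] +
            [[ys[0]] + rest for rest in shuffle(xs, ys[1:])])


for result in shuffle(["a", "b"], ["c", "d"]):
    print(result)
```

For the two lists in (33), the function returns exactly the six lists given there: a precedes b and c precedes d in each of them.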
The constraint in (35) is responsible for the determination of the PHON values of
phrases:
(35) phrase ⇒ [ PHON 1 ⊕ … ⊕ 𝑛
                DOM  ⟨ sign[ PHON 1 ], …, sign[ PHON 𝑛 ] ⟩ ]
It states that the PHON value of a sign is the concatenation of the PHON values of its
DOMAIN elements. Since the order of the DOMAIN elements corresponds to their surface
order, this is the obvious way to determine the PHON value of the whole linguistic object.
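The interplay of (34) and (35) can be illustrated with a minimal Python sketch. The class Sign and the helper functions word and phrase are hypothetical names introduced here; SYNSEM is omitted, and PHON values are modeled as lists of strings.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Sign:
    phon: list
    dom: list = field(default_factory=list)


def word(form):
    """A word contributes itself to its own domain, as in (34)."""
    sign = Sign(phon=[form])
    sign.dom = [sign]  # the cyclic structure mentioned in the text
    return sign


def phrase(dom):
    """PHON is the concatenation of the PHON values of the DOM elements, as in (35)."""
    return Sign(phon=[p for element in dom for p in element.phon], dom=list(dom))


der_frau = phrase([word("der"), word("Frau")])
ein_mann = phrase([word("ein"), word("Mann")])
das_buch = phrase([word("das"), word("Buch")])
gibt = word("gibt")

# The domain elements are listed in their surface order.
clause = phrase([der_frau, ein_mann, das_buch, gibt])
print(" ".join(clause.phon))
```

Since PHON is read off the DOM list alone, the order in which the daughters were combined plays no role for the resulting string.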
Figure 11.38 shows how this machinery can be used to license binary branching struc-
tures with discontinuous constituents. Words or word sequences that are separated by
commas stand for separate domain objects, that is, ⟨ das, Buch ⟩ contains the two objects
Figure 11.38: Analysis of dass der Frau ein Mann das Buch gibt ‘that a man gives the
woman the book’ with binary branching structures and discontinuous con-
stituents
Figure 11.39: Analysis of dass der Frau ein Mann das Buch gibt ‘that a man gives the
woman the book’ with binary branching structures and discontinuous con-
stituents showing the discontinuity
das and Buch and ⟨ das Buch, gibt ⟩ contains the two objects das Buch and gibt. The im-
portant point to note here is that the arguments are combined with the head in the order
accusative, dative, nominative, although the elements in the constituent order domain
are realized in the order dative, nominative, accusative rather than nominative, dative,
accusative, as one would expect. This is possible since the formulation of the computa-
tion of the DOM value using the shuffle operator allows for discontinuous constituents.
The node for der Frau das Buch gibt ‘the woman the book gives’ is discontinuous: ein
Mann ‘a man’ is inserted into the domain between der Frau ‘the woman’ and das Buch
‘the book’. This is more obvious in Figure 11.39, which has a serialization of NPs that
corresponds to their order.
Such binary branching structures were assumed for the analysis of German by Kathol
(1995; 2000) and Müller (1995; 1996c; 1999b; 2002a), but as we have seen throughout
this chapter, Dependency Grammar assumes flat representations (but see Footnote 30
on page 402). Schema 1 licenses structures in which all arguments of a head are realized
in one go.31
31
I assume here that all arguments are contained in the COMPS list of a lexical head, but nothing hinges on
that. One could also assume several valence features and nevertheless get a flat structure. For instance,
Borsley (1989: 339) suggests a schema for auxiliary inversion in English and verb-initial sentences in Welsh
that refers to both the valence feature for subjects and for complements and realizes all elements in a flat
structure.
To keep the presentation simple, I assume that the COMPS list contains descriptions of
complete signs. Therefore the whole list can be identified with the list of non-head
daughters.32 The computation of the DOM value can be constrained in the following
way:
(36) headed-phrase ⇒ [ HEAD-DTR      1
                       NON-HEAD-DTRS ⟨ 2, …, 𝑛 ⟩
                       DOM           ⟨ 1 ⟩ ⃝ ⟨ 2 ⟩ ⃝ … ⃝ ⟨ 𝑛 ⟩ ]
This constraint says that the value of DOM is a list which is the result of shuffling singleton
lists each containing one daughter as elements. The result of such a shuffle operation is
a disjunction of all possible permutations of the daughters. This seems to be overkill for
something that GPSG already gained by abstracting away from the order of the elements
on the right-hand side of a phrase structure rule. Note, however, that this machinery can
be used to reach even freer orders: by referring to the DOM values of the daughters rather
than the daughters themselves, it is possible to insert individual words into the DOM list.
(37) headed-phrase ⇒ [ HEAD-DTR|DOM  1
                       NON-HEAD-DTRS ⟨ [ DOM 2 ], …, [ DOM 𝑛 ] ⟩
                       DOM           1 ⃝ 2 ⃝ … ⃝ 𝑛 ]
Using this constraint we have DOM values that basically contain all the words in an
utterance in any permutation. What we are left with is a pure Dependency Grammar
without any constraints on projectivity. With such a grammar we could analyze the
non-projective structure of Figure 11.6 on page 374 and much more. The analysis in
terms of domain union is shown in Figure 11.40. It is clear that such discontinuity is
unwanted and hence one has to have restrictions that enforce continuity. One possible
restriction is to require projectivity and hence equivalence to phrase structure grammars
in the sense that was discussed above.
Figure 11.40: Unwanted analysis of dass die Frauen Türen öffnen ‘that the women open
doors’ using Reape-style constituent order domains
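The difference between shuffling the daughters as wholes, as in (36), and shuffling their DOM values, as in (37), can be made concrete with a Python sketch. The function names shuffle and shuffle_all are my own. The example assumes, for illustration only, a head öffnen, a subject Frauen and an object NP die Türen, that is, the unwanted reading in which die belongs to Türen.

```python
def shuffle(xs, ys):
    """All interleavings of xs and ys that keep the internal order of each list."""
    if not xs:
        return [list(ys)]
    if not ys:
        return [list(xs)]
    return ([[xs[0]] + rest for rest in shuffle(xs[1:], ys)] +
            [[ys[0]] + rest for rest in shuffle(xs, ys[1:])])


def shuffle_all(lists):
    """Shuffle an arbitrary number of lists, one after the other."""
    results = [[]]
    for current in lists:
        results = [merged for partial in results
                   for merged in shuffle(partial, current)]
    return results


head_dom = ["öffnen"]
subj_dom = ["Frauen"]
obj_dom = ["die", "Türen"]  # assumed object NP, see the lead-in

# (36): singleton lists containing whole daughters are shuffled.
whole_daughters = shuffle_all(
    [[tuple(head_dom)], [tuple(subj_dom)], [tuple(obj_dom)]])
flat_36 = [[w for daughter in order for w in daughter]
           for order in whole_daughters]

# (37): the DOM lists of the daughters are shuffled directly.
orders_37 = shuffle_all([head_dom, subj_dom, obj_dom])

unwanted = ["die", "Frauen", "Türen", "öffnen"]
print(len(flat_36), unwanted in flat_36)
print(len(orders_37), unwanted in orders_37)
```

With whole daughters there are six orders and die Türen always stays continuous. With shuffled DOM values there are twelve orders, among them the one in which Frauen intervenes between die and Türen.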
32
Without this assumption one would need a relational constraint that maps a list with descriptions of type
synsem onto a list with descriptions of type sign. See Meurers (1999c: 198) for details.
Figure 11.41: The dependency graph of Dass Peter kommt, klärt nicht, ob Klaus spielt. ‘That
Peter comes does not resolve the question of whether Klaus plays.’ can be
derived from the semantic representation.
does not hold in the general case. Take for instance the example in (40):
(40) Dass Peter kommt, klärt nicht, ob Klaus kommt.
that Peter comes resolves not whether Klaus comes
‘That Peter comes does not resolve the question of whether Klaus comes.’
Here the word kommt appears twice. Without any notion of constituency or restrictions
regarding adjacency, linear order and continuity, we cannot assign a dependency graph
unambiguously. For instance, the graph in Figure 11.42 is perfectly compatible with the
meaning of this sentence: dass dominates kommt and kommt dominates Peter, while ob
dominates kommt and kommt dominates Klaus. I used the wrong kommt in the depen-
Figure 11.43: The dependency graph of the word salad Deshalb klärt dass ob Peter Klaus
kommt spielt. ‘Therefore resolves that whether Peter Klaus comes plays’
which is admitted by non-projective Dependency Grammars that do not
restrict discontinuity
are fronted with it (p. 184). This “item with all its dependents” is the constituent in con-
stituent-based grammars. The difference is that this object is not given an explicit name
and is not assumed to be a separate entity containing the head and its dependents in
most Dependency Grammars.33
Summing up what has been covered in this section so far, I have shown what a phrase
structure grammar that corresponds to a certain Dependency Grammar looks like. I
have also shown how discontinuous constituents can be allowed for. However, there are
issues that have remained unaddressed so far: not all properties of a phrase are
identical to those of its lexical head and the differences have to be represented somewhere. I will
discuss this in the following subsection.
11.7.2.3 Features that are not identical between heads and projections
As Oliva (2003) points out, the equivalence of Dependency Grammar and HPSG only
holds up as far as HEAD values are concerned. That is, the node labels in dependency
graphs correspond to the HEAD values in an HPSG. There are, however, additional fea-
tures like CONT for the semantics and SLASH for nonlocal dependencies. These values
usually differ between a lexical head and its phrasal projections. For illustration, let us
have a look at the phrase a book. The semantics of the lexical material and the complete
phrase is given in (42):34
33
See however Hellwig (2003) for an explicit proposal that assumes that there is a linguistic object that
represents the whole constituent rather than just the lexical head.
34
For lambda expressions see Section 2.3.
With this kind of representation one could maintain analyses in which the semantic con-
tribution of a head together with its dependents is a function of the semantic contribution
of the parts.
Now, there are probably further features in which lexical heads differ from their pro-
jections. One such feature would be SLASH, which is used for nonlocal dependencies
in HPSG and could be used to establish the relation between the risen element and the
head in an approach à la Groß & Osborne (2009). Of course we can apply the same trick
again. We would then have a feature LEXICAL-SLASH. But this could be improved and
the features of the lexical item could be grouped under one path. The general skeleton
would then be (44):
(44) [ CONT
       SLASH
       LEXICAL [ CONT
                 SLASH ] ]
But if we rename LEXICAL to HEAD-DTR, we basically get the HPSG representation.
Hellwig (2003: 602) states that his special version of Dependency Grammar, which he
calls Dependency Unification Grammar, assumes that governing heads select complete
nodes with all their daughters. These nodes may differ in their properties from their
head (p. 604). They are in fact constituents. So this very explicit and formalized variant
of Dependency Grammar is very close to HPSG, as Hellwig states himself (p. 603).
35
Hudson (2003: 391–392) is explicit about this: “In dependency analysis, the dependents modify the head
word’s meaning, so the latter carries the meaning of the whole phrase. For example, in long books about
linguistics, the word books means ‘long books about linguistics’ thanks to the modifying effect of the de-
pendents.” For a concrete implementation of this idea see Figure 11.44.
An alternative is to assume different representational levels as in Meaning–Text Theory (Mel’čuk 1981).
In fact the CONT value in HPSG is also a different representational level. However, this representational
level is in sync with the other structure that is built.
Hudson’s Word Grammar (2018) is also explicitly worked out and, as will be shown,
it is rather similar to HPSG. The representation in Figure 11.44 is a detailed description
of what the abbreviated version in Figure 11.45 stands for. What is shown in the first
diagram is that a combination of two nodes results in a new node.36 For instance, the
combination of playing and outside yields playing′, the combination of small and children
yields children′, and the combination of children′ and playing′ yields playing′′. The
combination of were and playing′′ results in were′ and the combination of children′′ and
were′ yields were′′. The only thing left to explain is why there is a node for children that
is not the result of the combination of two nodes, namely children′′. The line with the
triangle at the bottom stands for default inheritance. That is, the upper node inherits all
properties from the lower node by default. Defaults can be overridden, that is, information
at the upper node may differ from information at the dominated node. This makes
it possible to handle semantics compositionally: nodes that are the result of the combination
of two nodes have a semantics that is the combination of the meaning of the
two combined nodes. Turning to children again, children′ has the property that it must
be adjacent to playing, but since the structure is a raising structure in which children is
raised to the subject of were, this property is overridden in a new instance of children,
namely children′′.
Figure 11.44: Analysis of Small children were playing outside. according to Hudson (2018:
105)
Figure 11.45: Abbreviated analysis of Small children were playing outside. according to
Hudson (2018: 105)
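The role of default inheritance in this analysis can be illustrated with a small Python sketch. It is not Hudson's formalization: the class Node, the property names category and adjacent_to, and in particular the overriding value were are assumptions that I introduce only to show how an upper node can inherit by default and still differ in one property.

```python
class Node:
    """A node that inherits all properties of the node it is an instance of."""

    def __init__(self, name, isa=None, **props):
        self.name = name
        self.isa = isa      # the lower node from which defaults are inherited
        self.props = props  # locally specified properties override defaults

    def get(self, key):
        node = self
        while node is not None:
            if key in node.props:
                return node.props[key]
            node = node.isa
        raise KeyError(key)


# children': must be adjacent to playing (property taken from the text above)
children1 = Node("children'", category="N", adjacent_to="playing")
# children'': a new instance that overrides the adjacency requirement;
# the value "were" is an illustrative assumption
children2 = Node("children''", isa=children1, adjacent_to="were")

print(children2.get("category"))     # inherited by default
print(children2.get("adjacent_to"))  # overridden locally
print(children1.get("adjacent_to"))  # the lower node is unchanged
```

The lookup walks from the upper node to the lower one, so a locally specified value wins, and everything that is not specified locally is inherited.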
36
By assuming these additional nodes Hudson addresses earlier criticism by Dahl (1980), who pointed out
that ordinary in ordinary French house does not refer to house but to French house. So there has to be a
representation for French house somewhere. At least at the semantic level. Hudson’s additional nodes and
classical N̄ nodes solve this problem as well.
The interesting point now is that we get almost a normal phrase structure tree if we
replace the words in the diagram in Figure 11.44 by syntactic categories. The result of
the replacement is shown in Figure 11.46. The only thing unusual in this graph (marked
by dashed lines) is that N′ is combined with V[ing]′ and the mother of N′, namely N′′,
is combined with V[fin]′. As explained above, this is due to the analysis of raising in
Word Grammar, which involves multiple dependencies between a raised item and its
heads. There are two N nodes (N′ and N′′) in Figure 11.46 and two instances of children
in Figure 11.44. Apart from this, the structure corresponds to what an HPSG grammar
would license. The nodes in Hudson’s diagram which are connected with lines with
triangles at the bottom are related to the nodes they dominate by default inheritance. This too
is rather similar to those versions of HPSG that use default inheritance. For instance,
Ginzburg & Sag (2000: 33) use a Generalized Head Feature Principle that projects all
properties of the head daughter to the mother by default.
Figure 11.46: Analysis of Small children were playing outside. with category symbols
The conclusion of this section is that the only principled difference between phrase
structure grammars and Dependency Grammar is the question of how much intermedi-
ate structure is assumed: is there a VP without the subject? Are there intermediate nodes
for adjunct attachment? It is difficult to decide these questions in the absence of fully
worked out proposals that include semantic representations. Those proposals that are
worked out – like Hudson’s and Hellwig’s – assume intermediate representations, which
makes these approaches rather similar to phrase structure-based approaches. If one com-
pares the structures of these fully worked out variants of Dependency Grammar with
phrase structure grammars, it becomes clear that the claim that Dependency Grammars
are simpler is unwarranted. This claim holds for compacted schematic representations
like Figure 11.45 but it does not hold for fully worked out analyses.
The simplicity claim is repeatedly made in Timothy Osborne’s work (for example in
Osborne & Groß 2016: 132; Osborne 2018b: 2). In a reply to Osborne (2018b), I mentioned
some of the phenomena discussed above and pointed out that they are not captured
by simple dependency structures (Müller 2019b) and argued that additional structure is
needed in order to account for these phenomena. Somewhat ironically, Osborne (2019)
worked out analyses in his reply that introduced a new concept (the Colocant) and addi-
tional structure to capture semantic groupings. By doing so, he proved the point made
above and in Müller (2019b).
(i) die Frau, von deren Schwester ich ein Bild gesehen habe
the woman of whose sister I a picture seen have
‘the woman of whose sister I saw a picture’
39
See Chapter 19 on empty elements in general and Subsection 21.10.3 on relative clauses in particular.
Sag (1997), working on relative clauses in English, suggested a phrasal analysis of relative
clauses in which the relative phrase and the clause from which it is extracted form a new
phrase. A similar analysis was assumed by Müller (1996c) and is documented in Müller
(1999b: Chapter 10). As was discussed in Section 8.6 it is neither plausible to assume the
relative pronoun or some other element in the relative phrase to be the head of the entire
relative clause, nor is it plausible to assume the verb to be the head of the entire relative
clause (pace Sag), since relative clauses modify N̄s, something that projections of (finite)
verbs usually do not do. So assuming an empty head or a phrasal schema seems to be
the only option.
Chapter 21 is devoted to the discussion of whether certain phenomena should be ana-
lyzed as involving phrase structural configurations or whether lexical analyses are better
suited in general or for modeling some phenomena. I argue there that all phenomena in-
teracting with valence should be treated lexically. But there are other phenomena as well
and Dependency Grammar is forced to assume lexical analyses for all linguistic phenom-
ena. There always has to be some element on which others depend. It has been argued
by Jackendoff (2008) that it does not make sense to assume that one of the elements in
N-P-N constructions like those in (46) is the head.
(46) a. day by day, paragraph by paragraph, country by country
b. dollar for dollar, student for student, point for point
c. face to face, bumper to bumper
d. term paper after term paper, picture after picture
e. book upon book, argument upon argument
Of course there is a way to model all the phenomena that would be modeled by a phrasal
construction in frameworks like GPSG, CxG, HPSG, or Simpler Syntax: an empty head.
Figure 11.47 shows the analysis of student after student. The lexical item for the empty N
would be very special, since there are no similar non-empty lexical nouns, that is, there
is no noun that selects for two bare Ns and a P.
Figure 11.47: Dependency Grammar analysis of the N-P-N Construction with empty head
Bargmann (2015) pointed out an additional aspect of the N-P-N construction, which
makes things more complicated. The pattern is not restricted to two nouns. There can
be arbitrarily many of them:
(47) Day after day after day went by, but I never found the courage to talk to her.
So rather than an N-P-N pattern Bargmann suggests the pattern in (48), where ‘+’ stands
for at least one repetition of a sequence.
(48) N (P N)+
Now, such patterns would be really difficult to model in selection-based approaches,
since one would have to assume that an empty head or a noun selects for an arbitrary
number of pairs of the same preposition and noun or nominal phrase. Of course one
could assume that P and N form some sort of constituent, but still one would have to
make sure that the right preposition is used and that the noun or nominal projection
has the right phonology. Another possibility would be to assume that the second N in
N-P-N can be an N-P-N and thereby allow recursion in the pattern. But if one follows
this approach it becomes really difficult to check the constraint that the involved Ns
should have the same or at least similar phonologies.
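As a purely string-level illustration (not a Dependency Grammar analysis), the pattern in (48) and the identity requirements on the preposition and the noun can be stated with backreferences. The sketch assumes single-word nouns and uses only the prepositions that occur in (46); the names `N_P_N` and `is_n_p_n` are made up for this example.

```python
import re

# N (P N)+ where every N repeats the first noun and every P repeats the first preposition.
# Assumptions: nouns are single words; the preposition list is just the one from (46).
N_P_N = re.compile(
    r"(?P<n>\w+) (?P<p>after|by|for|to|upon) (?P=n)(?: (?P=p) (?P=n))*"
)


def is_n_p_n(phrase: str) -> bool:
    """True if the whole phrase instantiates N (P N)+ with identical Ns and Ps."""
    return N_P_N.fullmatch(phrase.lower()) is not None


print(is_n_p_n("day after day after day"))
print(is_n_p_n("day after day by day"))
```

The backreferences compare whole strings directly. This is the part that is hard to express if a head has to select each preposition and noun separately.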
One way out of these problems would of course be to assume that there are special
combinatorial mechanisms that assign a new category to one or several elements. This
would basically be an unheaded phrase structure rule and this is what Tesnière sug-
gested: transfer rules (see Section 11.6.2.2). But this is of course an extension of pure
Dependency Grammar towards a mixed model. See also Hudson (2020) for a treatment
of the N-P-N construction in Word Grammar involving a complex network of nodes,
that is, something that leaves the normal descriptive devices of Dependency Grammars.
See Section 21.10 for the discussion of further cases which are probably problematic
for purely selection-based grammars.
Exercises
Provide the dependency graphs for the following three sentences:
(49) a. Ich habe einen Mann getroffen, der blonde Haare hat.
I have a man met who blond hair has
‘I have met a man who has blond hair.’
b. Einen Mann getroffen, der blonde Haare hat, habe ich noch nie.
a man met who blond hair has have I yet never
‘I have never met a man who has blond hair.’
c. Dass er morgen kommen wird, freut uns.
that he tomorrow come will pleases us
‘That he will come tomorrow pleases us.’
You may use non-projective dependencies. For the analysis of relative clauses
authors usually propose an abstract entity that functions as a dependent of the
modified noun and as a head of the verb in the relative clause.
Further reading
In the section on further reading in Chapter 3, I referred to the book called Syn-
taktische Analyseperspektiven ‘Syntactic perspectives on analyses’. The chapters
in this book have been written by proponents of various theories and all analyze
the same newspaper article. The book also contains a chapter by Engel (2014),
assuming his version of Dependency Grammar, namely Dependent Verb Gram-
mar.
Ágel, Eichinger, Eroms, Hellwig, Heringer & Lobin (2003; 2006) published a
handbook on dependency and valence that discusses all aspects related to Depen-
dency Grammar in any imaginable way. Many of the papers have been cited in
this chapter. Papers comparing Dependency Grammar with other theories are es-
pecially relevant in the context of this book: Lobin (2003) compares Dependency
Grammar and Categorial Grammar, Oliva (2003) deals with the representation of
valence and dependency in HPSG, Hudson (2020) discusses Dependency Gram-
mar in general and compares his version of the theory, namely Word Grammar,
with HPSG, and Bangalore, Joshi & Rambow (2003) describe how valence and
dependency are covered in TAG. Hellwig (2006) compares rule-based grammars
with Dependency Grammars with special consideration given to parsing by com-
puter programs.
Osborne & Groß (2012) compare Dependency Grammar with Construction
Grammar and Osborne, Putnam & Groß (2011) argue that certain variants of Min-
imalism are in fact reinventions of dependency-based analyses.
The original work on Dependency Grammar by Tesnière (1959) is also avail-
able in parts in German (Tesnière 1980) and in full in English (Tesnière 2015).
12 Tree Adjoining Grammar
Tree Adjoining Grammar (TAG) was developed by Aravind Joshi at the University of
Pennsylvania in the USA (Joshi, Levy & Takahashi 1975). Several important disserta-
tions in TAG have been supervised by Aravind Joshi and Anthony Kroch at the Univer-
sity of Pennsylvania (e.g., Rambow 1994). Other research centers with a focus on TAG
are Paris 7 (Anne Abeillé), Columbia University in the USA (Owen Rambow) and Düssel-
dorf, Germany (Laura Kallmeyer). Rambow (1994) and Gerdes (2002b) are more detailed
studies of German.1
TAG and its variants with relevant extensions are of interest because it is assumed that
this grammatical formalism can – with regard to its expressive power – relatively accu-
rately represent what humans do when they produce or comprehend natural language.
The expressive power of Generalized Phrase Structure Grammar was deliberately con-
strained so that it corresponds to context-free phrase structure grammars (Type-2 lan-
guages) and it has in fact been demonstrated that this is not enough (Shieber 1985; Culy
1985).2 Grammatical theories such as HPSG and CxG can generate/describe so-called
Type-0 languages and are thereby far above the level of complexity presently assumed
for natural languages. The assumption is that this complexity lies somewhere between
context-free and context-sensitive (Type-1) languages. This class is thus referred to as
mildly context-sensitive. Certain TAG variants fall within this language class and it is
assumed that they can produce exactly those structures that occur in natural languages.
For more on complexity, see Section 12.6.3 and Chapter 17.
There are various systems for the processing of TAG grammars (Doran, Hockey, Sar-
kar, Srinivas & Xia 2000; Parmentier, Kallmeyer, Maier, Lichte & Dellert 2008; Kall-
meyer, Lichte, Maier, Parmentier, Dellert & Evang 2008; Koller 2017). Smaller and larger
TAG fragments have been developed for the following languages:
• Arabic (Fraj, Zribi & Ahmed 2008),
• German (Rambow 1994; Gerdes 2002a; Kallmeyer & Yoon 2004; Lichte 2007),
• English (XTAG Research Group 2001; Frank 2002; Kroch & Joshi 1987),
• French (Abeillé 1988; Candito 1996; 1998; 1999; Crabbé 2005),
• Italian (Candito 1998; 1999),
• Korean (Han, Yoon, Kim & Palmer 2000; Kallmeyer & Yoon 2004),
• Vietnamese (Le, Nguyen & Roussanaly 2008)
1 Since my knowledge of French leaves something to be desired, I just refer to the literature in French here
without being able to comment on the content.
2 See Pullum (1986) for a historical overview of the complexity debate and G. Müller (2011) for argumentation
for the non-context-free nature of German, which follows parallel to Culy with regard to the N-P-N
construction (see Section 21.10.4).
Candito (1996) has developed a system for the representation of meta grammars which
allows the uniform specification of crosslinguistic generalizations. This system was used
by some of the projects mentioned above for the derivation of grammars for specific
languages. For instance, Kinyon, Rambow, Scheffler, Yoon & Joshi (2006) derive grammars
for the verb-second languages from a common meta grammar. Among those grammars is a
grammar of Yiddish, for which there was no TAG grammar until 2006.
Resnik (1992) combines TAG with a statistics component.
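[Figure: elementary trees for John (NP), laughs (S with a substitution node NP↓ and a VP dominating V) and always (VP dominating ADV and the foot node VP*)]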
marked (NP↓ in the tree for laughs). Nodes for the insertion of adjuncts into a tree are
also marked (VP∗ in the tree for always). Grammars where elementary trees always
contain at least one word are referred to as Lexicalized Tree Adjoining Grammar (LTAG,
Schabes, Abeillé & Joshi 1988).
12.1.2 Substitution
Figure 12.2 on the next page shows the substitution of nodes. Other subtrees have to be
inserted into substitution nodes such as the NP node in the tree for laughs. The tree for
John is inserted there in the example derivation.
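[Figure 12.2: substitution of the tree for John into the NP↓ node of the tree for laughs, yielding the tree for John laughs]
[Figure 12.3: adjunction of the tree for always at the VP node of the tree for John laughs, yielding the tree for John always laughs]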
12.1.3 Adjunction
Figure 12.3 shows an example of how the adjunction tree for always can be used.
Adjunction trees can be inserted into other trees. Upon insertion, the target node
(bearing the same category as the node marked with ‘*’) is replaced by the adjunction
tree.
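The two operations can be sketched on a small tree data structure. This is a minimal illustration under simplifying assumptions, not an implementation of any existing TAG system: the class `Node` and the functions `substitute` and `adjoin` are hypothetical names, trees are modified in place, operations apply to the first matching node, and adjunction at the root node is not covered.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False  # substitution node (written NP↓ in the figures)
    foot: bool = False   # foot node of an adjunction tree (written VP*)


def words(node: Node) -> List[str]:
    """Words at the leaves; open substitution and foot nodes contribute nothing."""
    if not node.children:
        return [] if (node.subst or node.foot) else [node.label]
    return [w for child in node.children for w in words(child)]


def substitute(tree: Node, initial: Node) -> bool:
    """Insert `initial` into the first substitution node with the same category."""
    for i, child in enumerate(tree.children):
        if child.subst and child.label == initial.label:
            tree.children[i] = initial
            return True
        if substitute(child, initial):
            return True
    return False


def find_foot(node: Node) -> Optional[Node]:
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(tree: Node, aux: Node) -> bool:
    """Replace the first matching inner node by `aux`; its old daughters go below the foot node."""
    foot = find_foot(aux)
    if foot is None:
        return False
    for i, child in enumerate(tree.children):
        if child.label == aux.label and child.children and not child.subst:
            foot.children, foot.foot = child.children, False
            tree.children[i] = aux
            return True
        if adjoin(child, aux):
            return True
    return False


john = Node("NP", [Node("John")])
laughs = Node("S", [Node("NP", subst=True), Node("VP", [Node("V", [Node("laughs")])])])
always = Node("VP", [Node("ADV", [Node("always")]), Node("VP", foot=True)])

substitute(laughs, john)
after_substitution = " ".join(words(laughs))
adjoin(laughs, always)
after_adjunction = " ".join(words(laughs))
print(after_substitution, "|", after_adjunction)
```

Substitution fills an open leaf. Adjunction splits an inner node, which is how material can end up between constituents that were adjacent before.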
TAG differs considerably from the simple phrase structure grammars we encountered
in Chapter 2 in that the trees extend over a larger domain: for example, there is an NP
node in the tree for laughs that is not a sister of the verb. In a phrase structure grammar
(and of course in GB and GPSG since these theories are more or less directly built on
phrase structure grammars), it is only ever possible to describe subtrees with a depth of
one level. For the tree for laughs, the relevant rules would be those in (1):
(1) S → NP VP
VP → V
V → laughs
In this context, it is common to speak of locality domains. The extension of the locality
domain is of particular importance for the analysis of idioms (see Section 18.2).
TAG differs from other grammatical theories in that it is possible for structures to be
broken up again. In this way, it is possible to use adjunction to insert any amount of ma-
terial into a given tree and thereby cause originally adjacent constituents to end up being
arbitrarily far away from each other in the final tree. As we will see in Section 12.5, this
property is important for the analysis of long-distance dependencies without movement.
12.1.4 Semantics
There are different approaches to the syntax-semantics interface in TAG. One possibil-
ity is to assign a semantic representation to every node in the tree. The alternative is
to assign each elementary tree exactly one semantic representation. The semantics con-
struction does not make reference to syntactic structure but rather the way the structure
is combined. This kind of approach has been proposed by Candito & Kahane (1998) and
then by Kallmeyer & Joshi (2003), who build on it. The basic mechanisms will be briefly
presented in what follows.
In the literature on TAG, a distinction is made between derived trees and derivation
trees. Derived trees correspond to constituent structure (the trees for John laughs and
John always laughs in Figures 12.2 and 12.3). The derivation tree contains the deriva-
tional history, that is, information about how the elementary trees were combined. The
elements in a derivation tree represent predicate-argument dependencies, which is why
it is possible to derive a semantic derivation tree from them. This will be shown on the
basis of the sentence in (2):
(2) Max likes Anouk.
The elementary trees for (2) and the derived tree are given in Figure 12.4 on the next page.
The nodes in trees are numbered from top to bottom and from left to right. The result of
this numbering of nodes for likes is shown in Figure 12.5 on the facing page. The topmost
node in the tree for likes is S and has the position 0. Beneath S, there is an NP and a VP
node. These nodes are again numbered starting at 1. NP has the position 1 and VP the
position 2. The VP node has in turn two daughters: V and the object NP. V receives
number 1 and the object NP 2. This makes it possible to combine these numbers and
then it is possible to unambiguously access individual elements in the tree. The position
for the subject NP is 1 since this is a daughter of S and occurs in first position. The object
NP has the numeric sequence 2.2 since it is below the VP (the second daughter of S = 2)
and occurs in second position (the second daughter of VP = 2).
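The numbering scheme just described can be sketched as a short recursive function. The representation of trees as `(label, children)` pairs and the name `positions` are assumptions made for this illustration only.

```python
def positions(tree, address=""):
    """Map every node position to its label: the root is 0, daughters are numbered from 1."""
    label, children = tree
    result = {address or "0": label}
    for i, child in enumerate(children, start=1):
        child_address = f"{address}.{i}" if address else str(i)
        result.update(positions(child, child_address))
    return result


# Elementary tree for likes; NP↓ marks the two substitution nodes.
likes = ("S", [("NP↓", []), ("VP", [("V", [("likes", [])]), ("NP↓", [])])])

for address, label in positions(likes).items():
    print(address, label)
```

Each position spells out the path from the root, so 2.2 can be read as "second daughter of the second daughter of S".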
With these tree positions, the derivation tree for (2) can be represented as in Figure 12.6
on the next page. The derivation tree expresses the fact that the elementary tree for likes
was combined with two arguments that were inserted into the substitution positions 1
and 2.2. The derivation tree also contains information about what exactly was placed
into these nodes.
Figure 12.4: Elementary trees and derived tree for Max likes Anouk.
[Figure 12.5: Node positions in the elementary tree for likes: S (0), NP (1), VP (2), V (2.1), NP (2.2)]
[Figure 12.6: Derivation tree for Max likes Anouk: likes, with Max at position 1 and Anouk at position 2.2]
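The relation between the derivation tree and the derived tree can be made concrete by replaying the derivation. The sketch below stores the derivation tree as the name of an elementary tree plus a list of (position, inserted tree) pairs. This encoding, the `(label, children)` tree format and the function names are assumptions made for illustration.

```python
def substitute_at(tree, address, subtree):
    """Return a copy of `tree` in which the node at `address` is replaced by `subtree`."""
    if not address:
        return subtree
    label, children = tree
    head, rest = address[0], address[1:]
    new_children = list(children)
    new_children[head - 1] = substitute_at(children[head - 1], rest, subtree)
    return (label, new_children)


def words(tree):
    """Words at the leaves; unfilled substitution nodes (NP↓) contribute nothing."""
    label, children = tree
    if not children:
        return [] if label.endswith("↓") else [label]
    return [w for child in children for w in words(child)]


likes = ("S", [("NP↓", []), ("VP", [("V", [("likes", [])]), ("NP↓", [])])])
max_tree = ("NP", [("Max", [])])
anouk_tree = ("NP", [("Anouk", [])])

# Derivation tree: likes, with Max at position 1 and Anouk at position 2.2.
derivation = ("likes", [((1,), max_tree), ((2, 2), anouk_tree)])

derived = likes
for address, inserted in derivation[1]:
    derived = substitute_at(derived, address, inserted)

print(" ".join(words(derived)))
```

The derivation tree records only which tree went where. The constituent structure is recovered by carrying out the recorded operations.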
Kallmeyer & Joshi (2003) use a variant of Minimal Recursion Semantics as their se-
mantic representational formalism (Copestake, Flickinger, Pollard & Sag 2005). I will
use a considerably simplified representation here, as I did in Section 9.1.6 on semantics
in HPSG. For the elementary trees Max, likes and Anouk, we can assume the semantic
representations in (3).
(3) Semantic representations for elementary trees:
like(x, y)
max(x)
anouk(y)
arg: −
Kallmeyer & Joshi (2003) show how an extension of TAG, Multi-Component LTAG, can
handle quantifier scope and discuss complex cases with embedded verbs. Interested read-
ers are referred to the original article.
12.2 Local reordering
𝛼 = [S(0) NP(1) [VP(2) V(2.1) NP(2.2)]]
This means that it is possible to derive all orders that were derived in GPSG with flat
sentence rules despite the fact that there is a constituent in the tree that consists of NP
and VP. Since the dominance rules include a larger locality domain, such grammars are
called LD/LP grammars (local dominance/linear precedence) rather than ID/LP gram-
mars (immediate dominance/linear precedence) (Joshi, Vijay-Shanker & Weir 1990).
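The effect of stating precedence constraints over a whole elementary tree can be illustrated by enumerating the orders of its leaf positions. This is a simplified sketch: the constraints are stated over the leaf positions 1, 2.1 and 2.2 only, and the particular constraint sets are hypothetical examples, not those of Joshi, Vijay-Shanker & Weir (1990).

```python
from itertools import permutations


def linearizations(leaves, lp):
    """All orders of `leaves` that respect every pair (a, b) in `lp`, read as 'a precedes b'."""
    return [
        order
        for order in permutations(leaves)
        if all(order.index(a) < order.index(b) for a, b in lp)
    ]


leaves = ("1", "2.1", "2.2")  # subject NP, V and object NP

free_order = linearizations(leaves, set())
fixed_order = linearizations(leaves, {("1", "2.1"), ("2.1", "2.2")})
verb_final = linearizations(leaves, {("1", "2.1"), ("2.2", "2.1")})

print(len(free_order), fixed_order, verb_final)
```

With an empty constraint set, all six orders of the three positions are admitted although the tree itself contains a VP node. Adding constraints narrows the set down.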
Simple variants of TAG such as those presented in Section 12.1 cannot deal with re-
ordering if the arguments of different verbs are scrambled as in (8).
(8) weil ihm das Buch jemand zu lesen versprochen hat3
because him.DAT the.ACC book somebody.NOM to read promised has
‘because somebody promised him to read the book’
3 For more examples of this kind, see Bech (1955).
In (8), das Buch ‘the book’ is the object of zu lesen ‘to read’, and ihm ‘him’ and jemand
‘somebody’ are dependent on versprochen and hat, respectively. These cases can be analyzed
by LD/LP-TAG developed by Joshi (1987b) and Free Order TAG (FO-TAG) (Becker,
Joshi & Rambow 1991: 21) since both of these TAG variants allow for crossing edges.
Since certain restrictions cannot be expressed in FO-TAG (Rambow 1994: 48–50),
so-called Multi-Component TAG was developed. Joshi, Becker & Rambow (2000) illustrate
the problem that simple LTAG grammars have with sentences such as (8) using examples
such as (9):4
(9) a. … daß der Detektiv dem Klienten [den Verdächtigen des
that the.NOM detective the.DAT client the.ACC suspect the.GEN
Verbrechens zu überführen] versprach
crime to indict promised
‘that the detective promised the client to indict the suspect of the crime’
b. … daß des Verbrechens_k der Detektiv den Verdächtigen_j
that the.GEN crime the.NOM detective the.ACC suspect
dem Klienten [_j _k zu überführen] versprach
the.DAT client to indict promised
In LTAG, the elementary trees for the relevant verbs look as shown in Figure 12.8. The
verbs are numbered according to their level of embedding. The NP arguments of a verb
bear the same index as that verb and each has a superscript number that distinguishes
it from the other arguments. The trees are very similar to those in GB. In particular, it
is assumed that the subject occurs outside the VP. For non-finite verbs, it is assumed
that the subject is realized by PRO. PRO is, like e, a phonologically empty pronominal
category that also comes from GB. The left tree in Figure 12.8 contains traces in the
normal position of the arguments and the relevant NP slots in higher tree positions. An
interesting difference to other theories is that these traces only exist in the tree. They are
not represented as individual entries in the lexicon as the lexicon only contains words
and the corresponding trees.
[Figure 12.8: Elementary trees of an infinitive and a control verb: the tree for zu überführen ‘to indict’ with the substitution nodes NP²₂↓ and NP¹₂↓, PRO and two traces e, and the tree for versprach ‘promised’ with NP¹₁↓, NP²₁↓ and the foot node S*]
4 The authors use versprochen hat ‘has promised’ rather than versprach ‘promised’, which sounds better but
does not correspond to the trees they use.
The tree for versprach ‘promised’ can be inserted into any S node in the tree for zu
überführen ‘to indict’ and results in trees such as those in Figures 12.9 and 12.10.
Figure 12.9: Analysis of the order NP²₂ NP¹₂ NP¹₁ NP²₁ V₂ V₁: adjunction to the lowest S node
In Figure 12.9, the tree for versprach is inserted directly above the PRO NP and in
Figure 12.10 above NP¹₂.
It is clear that it is not possible to derive a tree in this way where an argument of über-
führen ‘to indict’ occurs between the arguments of versprach ‘promised’. Joshi, Becker
& Rambow (2000) therefore suggest an extension of the LTAG formalism. In MC-TAG,
the grammar does not consist of elementary trees but rather finite sets of elementary
trees. In every derivational step, a set is selected and the elements of that set are simulta-
neously added to the tree. Figure 12.11 on the following page shows an elementary tree
for versprach ‘promised’ consisting of multiple components. This tree contains a trace
of NP¹₁ that was moved to the left. The bottom-left S node and the top-right S node are
connected by a dashed line that indicates the dominance relation. However, immediate
dominance is not required. Therefore, it is possible to insert the two subtrees into an-
other tree separately from each other and thereby analyze the order in Figure 12.12 on
page 427, for example.
Other variants of TAG that allow for other constituent orders are V-TAG (Rambow
1994) and TT-MC-TAG (Lichte 2007).
Figure 12.10: Analysis of the order NP²₂ NP¹₁ NP²₁ NP¹₂ V₂ V₁: adjunction to the S node between NP²₂ and NP¹₂
Figure 12.11: Elementary tree set for versprach consisting of multiple components
Figure 12.12: Analysis of the order NP¹₁ NP²₂ NP²₁ NP¹₂ V₂ V₁: adjunction to the S node between NP²₂ and NP¹₂
12.3 Verb position
12.4 Passive
There is a possible analysis for the passive that is analogous to the transformations in
Transformational Grammar: one assumes lexical rules that license a lexical item with a
passive tree for every lexical item with an active tree (Kroch & Joshi 1985: 50–51).
Kroch & Joshi (1985: 55) propose an alternative to this transformation-like approach
that more adequately handles so-called raising constructions. Their analysis assumes
that arguments of verbs are represented in subcategorization lists. Verbs are entered
into trees that match their subcategorization list. Kroch and Joshi formulate a lexical
rule that corresponds to the HPSG lexical rule that was discussed on page 288, that is, an
accusative object is explicitly mentioned in the input of the lexical rule. Kroch and Joshi
then suggest a complex analysis of the impersonal passive which uses a semantic null
role for a non-realized object of intransitive verbs (p. 56). Such an analysis with abstract
auxiliary entities can be avoided easily: one can instead use the HPSG analysis going
back to Haider (1986a), which was presented in Section 9.2.
There are also proposals in TAG that use inheritance to deal with valence changing
processes in general and the passive in particular (Candito 1996 and Kinyon, Rambow,
Scheffler, Yoon & Joshi 2006 following Candito). As we saw in Section 10.2 of the chapter
on Construction Grammar, inheritance is not a suitable descriptive tool for valence
changing processes. This is because these kinds of processes interact syntactically and
semantically in a number of ways and can also be applied multiple times (Müller 2006;
2007b; 2007a: Section 7.5.2; 2013c; 2014a). See also Section 21.4 of this book.
12.6 New developments and theoretical variants
[Figure 12.13: Analysis of long-distance dependencies in TAG: the tree for did John tell Sam, with the foot node S*, is adjoined into the tree for who that Bill likes _i, yielding who did John tell Sam that Bill likes _i]
variants existing in 1994. In the following, I will discuss two interesting variants of TAG:
Feature Structure-Based TAG (FTAG, Vijay-Shanker & Joshi 1988) and Vector-TAG (V-
TAG, Rambow 1994).
12.6.1 FTAG
In FTAG, nodes are not atomic (N, NP, VP or S), but instead consist of feature descrip-
tions. With the exception of substitution nodes, each node has a top structure and a
bottom structure. The top structure says something about what kind of properties a
given tree has inside a larger structure, and the bottom structure says something about
the properties of the structure below the node. Substitution nodes only have a top structure.
Figure 12.14 on the next page shows an example tree for laughs. A noun phrase can
be combined with the tree for laughs in Figure 12.14. Its top structure is identified with
the NP node in the tree for laughs. The result of this combination is shown in Figure 12.15
on the facing page.
[Figure 12.14: Trees for John and laughs in FTAG; the nodes are feature structures with CAT values and AGR values (PER 3, NUM sing)]
In a complete tree, all top structures are identified with the corresponding bottom
structures. This way, only sentences where the subject is in third person singular can
be analyzed with the given tree for laughs, that is, those in which the verb’s agreement
features match those of the subject.
For adjunction, the top structure of the root node of the tree that is being inserted must be
unifiable with the top structure of the adjunction site, and the bottom structure of the node
marked ‘*’ in the inserted tree (the so-called foot node) must be unifiable with the bottom
structure of the adjunction site.
The elementary trees discussed so far only consisted of nodes where the top part
matched the bottom part. FTAG allows for an interesting variant of specifying nodes
that makes adjunction obligatory in order for the entire derivation to be well-formed.
Figure 12.16 on the next page shows a tree for laughing that contains two VP nodes with
incompatible MODE values. In order for this subtree to be used in a complete structure,
another tree has to be added so that the two parts of the VP node are separated. This
happens by means of an auxiliary tree as shown in Figure 12.16. The highest VP node
of the auxiliary tree is unified with the upper VP node of laughing. The node of the
auxiliary tree marked with ‘*’ is unified with the lower VP node of laughing. The result
of this is given in Figure 12.17 on page 432.
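The role of top and bottom structures can be sketched with a very small unification function over nested dictionaries. This is an illustration under strong simplifications: structure sharing (the boxed numbers in the figures) is not modeled, and the concrete feature values below are only loosely based on the description in the text, not copied from the figures.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts; return None on a clash."""
    result = dict(a)
    for feature, value in b.items():
        if feature not in result:
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            sub = unify(result[feature], value)
            if sub is None:
                return None
            result[feature] = sub
        elif result[feature] != value:
            return None
    return result


# Agreement: the NP slot in the tree for laughs requires third person singular.
np_slot = {"cat": "NP", "agr": {"per": 3, "num": "sing"}}
john = {"cat": "NP", "agr": {"per": 3, "num": "sing"}}
plural_np = {"cat": "NP", "agr": {"per": 3, "num": "plur"}}

# Obligatory adjunction: top and bottom of the VP node of laughing clash.
vp_top, vp_bottom = {"cat": "VP", "mode": "ind"}, {"cat": "VP", "mode": "ger"}

# Auxiliary tree (assumed values): root node and foot node, each with top and bottom.
aux_root_top, aux_root_bottom = {"cat": "VP"}, {"cat": "VP", "mode": "ind"}
aux_foot_top, aux_foot_bottom = {"cat": "VP", "mode": "ger"}, {}

# Adjunction splits the VP node: its top goes to the root, its bottom to the foot node.
root_top = unify(vp_top, aux_root_top)
foot_bottom = unify(aux_foot_bottom, vp_bottom)

print(unify(np_slot, plural_np), unify(vp_top, vp_bottom))
print(unify(root_top, aux_root_bottom), unify(aux_foot_top, foot_bottom))
```

Before adjunction, identifying top and bottom of the VP node fails. After adjunction, both resulting nodes can be identified without a clash.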
Figure 12.15: Combination of the trees for John and laughs in FTAG
Figure 12.16: Obligatory adjunction in FTAG
[Figure 12.17: Result of adjoining the auxiliary tree for is into the tree for laughing]
If a tree is used as a final derivation, the top structures are identified with the bottom
structures. Thus, the AGR value of the highest VP node is identified with that of the lower
one in the tree in Figure 12.17. As such, only NPs that have the same AGR value as the
auxiliary can be inserted into the NP slot.
This example shows that, instead of the marking for obligatory adjunction that we
saw in the section on long-distance dependencies, the same effect can be achieved by
using incompatible feature specifications on the top and bottom structures. If a tree
contains incompatible top and bottom structures, it cannot be the final tree of a derivation,
so at least one adjunction operation must still take place in order to yield a well-formed tree.
12.6.2 V-TAG
V-TAG is a variant of TAG proposed by Owen Rambow (1994) that also assumes feature
structures on nodes. In addition, like MC-TAG, it assumes that elementary trees consist
of multiple components. Figure 12.18 on the next page shows the elementary lexical
set for the ditransitive verb geben ‘give’. The lexicon set consists of a tree for the verb,
an empty element of the category VP and three trees where a VP has been combined
with an NP. As in MC-TAG, dominance relations are also indicated. The dominance
constraints in Figure 12.18 ensure that all lower VP nodes dominate the highest VP node
of the tree further to the right. The order of the arguments of the verb as well as the
position of the verb is not given. The only thing required is that lower VP in the NP
trees and lower VP in the geben tree dominate the empty VP node. With this lexicon
set, it is possible to derive all permutations of the arguments. Rambow also shows how
such lexical entries can be used to analyze sentences with verbal complexes. Figure 12.19
shows a verbal complex formed from zu reparieren ‘to repair’ and versprochen ‘promised’
and the relevant dominance constraints. Both of the first NP trees have to dominate
versprochen and the third and fourth NP tree have to dominate zu reparieren. The order
of the NP trees is not restricted and thus all permutations of NPs can be derived.
Figure 12.18: Lexicon set for geben ‘to give’ in V-TAG according to Rambow (1994: 6)
[Figure 12.19: Lexicon sets for the verbal complex zu reparieren versprochen with dominance constraints]
The interesting thing here is that this approach is similar to the one proposed by
Berman (1996: Section 2.1.3) in LFG (see Section 7.4): in Berman’s analysis, the verb
projects directly to form a VP and the arguments are then adjoined.
A difference to other analyses discussed in this book is that there is always an empty
element in the derived trees regardless of verb position.
Joshi et al. (2000) discuss verbal complexes with reordered arguments. The general
pattern that they discuss has the form shown in (15):
(15) 𝜎(NP₁ NP₂ … NPₙ) Vₙ Vₙ₋₁ … V₁
Here, 𝜎 stands for any permutation of noun phrases and V₁ is the finite verb. The authors
investigate the properties of Lexicalized Tree Adjoining Grammar (LTAG) with regard
to this pattern and notice that LTAG cannot analyze the order in (16) if the semantics is
supposed to come out correctly.
(16) NP₂ NP₃ NP₁ V₃ V₂ V₁
Since (17) is possible in German, LTAG is not sufficient to analyze all languages.
(17) dass ihm₂ das Buch₃ niemand₁ zu lesen₃ versprechen₂ darf₁
that him the book nobody to read promise be.allowed.to
‘that nobody is allowed to promise him to read the book’
Therefore, they propose the extension of TAG discussed in Section 12.2, so-called
tree-local multi-component LTAG (tree-local MC-LTAG or TL-MCTAG). They show that
TL-MCTAG can analyze (17) but not (18) with the correct semantics. They claim that these
orders are not possible in German and argue that in this case, unlike the relative clause
examples, one has both options, that is, the unavailability of such patterns can be ex-
plained as a performance phenomenon or as a competence phenomenon.
(18) NP₂ NP₄ NP₃ NP₁ V₄ V₃ V₂ V₁
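To make the pattern in (15) concrete, the orders it describes can be enumerated for a given number of verbs. The sketch only generates the strings. It says nothing about which of them a particular TAG variant can derive, and the function name `orders` is made up for this illustration.

```python
from itertools import permutations


def orders(n):
    """All strings of the form sigma(NP1 ... NPn) Vn ... V1 for n verbs."""
    nps = [f"NP{i}" for i in range(1, n + 1)]
    verbs = [f"V{i}" for i in range(n, 0, -1)]
    return [" ".join(list(sigma) + verbs) for sigma in permutations(nps)]


three_verbs = orders(3)
four_verbs = orders(4)

print(len(three_verbs), "NP2 NP3 NP1 V3 V2 V1" in three_verbs)
print(len(four_verbs), "NP2 NP4 NP3 NP1 V4 V3 V2 V1" in four_verbs)
```

The order in (16) is one of the six orders for three verbs, and the order in (18) is one of the 24 orders for four verbs.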
If we treat this as a performance phenomenon, then we are making reference to the
complexity of the construction and the resulting processing problems for the hearer.
The fact that these orders do not occur in corpora can be explained with reference to the
principle of cooperativeness. Speakers normally want to be understood and therefore
formulate their sentences in such a way that the hearer can understand them. Verbal
complexes in German with more than four verbs are hardly ever found since it is possible
to simplify very complex sentences with multiple verbs in the right sentence bracket by
extraposing material and therefore avoiding ambiguity (see Netter 1991: 5 and Müller
2007a: 262).
The alternative to a performance explanation would involve using a grammatical for-
malism which is just powerful enough to allow embedding of two verbs and reordering
of their arguments, but rules out embedding of three verbs and reordering of the argu-
ments. Joshi et al. (2000) opt for this solution and therefore attribute the impossibility
of the order of arguments in (18) to competence.
In HPSG (and also in Categorial Grammar and in some GB analyses), verbal complexes
are analyzed by means of argument composition (Hinrichs & Nakazawa 1989a; 1994a).
Under this approach, a verbal complex behaves exactly like a simplex verb and the argu-
ments of the verbs involved can be placed in any order. The grammar does not contain
any restriction on the number of verbs that can be combined, nor any constraints that
ban embedding below a certain level. In the following, I will show that many reorder-
ings are ruled out by communication rules that apply even with cases of simple two-
place verbs. The conclusion is that the impossibility of embedding four or more verbs
should in fact be explained as a performance issue.
Before I present arguments against a competence-based exclusion of (18), I will make
a more general comment: corpora cannot help us here since one does not find any in-
stances of verbs with four or more embeddings. Bech (1955) provides an extensive collec-
tion of material, but had to construct the examples with four embedded verbs. Meurers
(1999b: 94–95) gives constructed examples with five verbs that contain multiple auxil-
iaries or modal verbs. These examples are barely processable and are not relevant for
the discussion here since the verbs in (18) have to select their own arguments. There
are therefore not that many verbs left when constructing examples. It is possible to only
use subject control verbs with an additional object (e.g., versprechen ‘to promise’), object
control verbs (e.g., zwingen ‘to force’) or AcI verbs (e.g., sehen ‘to see’ or lassen ‘to let’) to
construct examples. When constructing examples, it is important to make sure that all the
nouns involved differ as much as possible with regard to their case and their selectional
restrictions (e.g., animate/inanimate) since these are features that a hearer/reader could
use to possibly assign reordered arguments to their heads. If we want to have patterns
such as (18) with four NPs each with a different case, then we have to choose a verb that
governs the genitive. There are only a very small number of such verbs in German. Al-
though the example constructed by Joshi et al. (2000) in (9b) fulfills these requirements,
it is still very marked. It therefore becomes clear that the possibility of finding a corre-
sponding example in a newspaper article is extremely small. This is due to the fact that
there are very few situations in which such an utterance would be imaginable. Addition-
ally, all control verbs (with the exception of helfen ‘to help’) require an infinitive with zu
‘to’ and can also be realized incoherently, that is, with an extraposed infinitival comple-
ment without verbal complex formation. As mentioned above, a cooperative speaker/
author would use a less complex construction and this reduces the probability that these
kinds of sentences arise even further.
Notice that tree-local MC-LTAG does not constrain the number of verbs in a sentence.
The formalism allows for an arbitrary number of verbs. It is therefore necessary to as-
sume, as in other grammatical theories, that performance constraints are responsible for
the fact that we never find examples of verbal complexes with five or more verbs. Tree-
local MC-LTAG makes predictions about the possibility of arguments to be reordered. I
consider it wrong to make constraints regarding mobility of arguments dependent on
the power of the grammatical formalism since the restrictions that one finds are inde-
pendent of verbal complexes and can be found with simplex verbs taking just two argu-
ments. The problem with reordering is that it still has to be possible to assign the noun
phrases to the verbs they belong to. If this assignment leads to ambiguity that cannot
be resolved by case, selectional restrictions, contextual knowledge or intonation, then
the unmarked constituent order is chosen. Hoberg (1981: 68) shows this very nicely with
examples similar to the following:5
5
Instead of das ‘the’, Hoberg uses the possessive pronoun ihr ‘her’. This makes the sentences more seman-
tically plausible, but one then gets interference from the linearization requirements for bound pronouns. I
have therefore replaced the pronouns with the definite article.
12.6 New developments and theoretical variants
(19) a. Hanna hat immer schon gewußt, daß das Kind sie verlassen will.
Hanna has always already known that the child she leave wants
‘Hanna has always known that the child wants to leave her.’
b. # Hanna hat immer schon gewußt, daß sie das Kind verlassen will.
Hanna has always already known that she the child leave wants
Preferred reading: ‘Hanna has always known that she wants to leave the
child.’
c. Hanna hat immer schon gewußt, daß sie der Mann verlassen
Hanna has always already known that she the.NOM man leave
will.
wants.to
‘Hanna has always known that the man wants to leave her.’
It is not possible to reorder (19a) to (19b) without creating a strong preference for another
reading. This is due to the fact that neither sie ‘she’ nor das Kind ‘the child’ is unam-
biguously marked as nominative or accusative. (19b) therefore has to be interpreted as
Hanna being the one that wants something, namely to leave the child. This reordering
is possible, however, if at least one of the arguments is unambiguously marked for case
as in (19c).
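The reasoning behind (19) can be sketched as a small computation. The following Python fragment is only an illustration under simplifying assumptions and not an analysis from the literature: the table CASE_FORMS and the function names are hypothetical, and only the nominative and accusative readings relevant for verlassen ‘to leave’ are listed. Reordering counts as unproblematic here if exactly one assignment of the two cases to the two noun phrases is compatible with their forms.

```python
from itertools import permutations

# Hypothetical mini-lexicon: the cases each form is compatible with
# (restricted to nominative and accusative, the cases needed for (19)).
CASE_FORMS = {
    "sie": {"nom", "acc"},
    "das Kind": {"nom", "acc"},
    "der Mann": {"nom"},
}

def case_assignments(nps, required=("nom", "acc")):
    """Return every assignment of the required cases that fits the forms."""
    result = []
    for cases in permutations(required):
        if all(case in CASE_FORMS[np] for np, case in zip(nps, cases)):
            result.append(dict(zip(nps, cases)))
    return result

def freely_reorderable(nps):
    """True if case marking alone identifies which NP is which argument."""
    return len(case_assignments(nps)) == 1

print(freely_reorderable(["sie", "das Kind"]))  # False: both NPs are syncretic
print(freely_reorderable(["sie", "der Mann"]))  # True: der Mann is nominative only
```

Selectional restrictions, context and intonation, which the text also mentions as disambiguating factors, are deliberately left out of this sketch.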
For noun phrases with feminine count nouns, the forms for nominative and accusative
as well as genitive and dative are the same. For mass nouns, it is even worse. If they are
used without an article, all cases are the same for feminine nouns (e.g., Milch ‘milk’) and
also for masculines and neuters with the exception of the genitive. In the following example
from Wegener (1985: 45) it is hardly possible to switch the dative and accusative object,
whereas this is possible if the nouns are used with articles as in (20c,d):
The two nouns can only be switched if the meaning of the sentence is clear from the
context (e.g., through explicit negation of the opposite) and if the sentence carries a
certain intonation.
12 Tree Adjoining Grammar
The problem with verbal complexes is now that with four noun phrases, two of them
almost always have the same case if one does not wish to resort to the few verbs gov-
erning the genitive. A not particularly nice-sounding example of morphologically un-
ambiguously marked case is (21):
(21) weil er den Mann dem Jungen des Freundes gedenken
because he.NOM the.ACC man the.DAT boy of.the.GEN friend remember
helfen lassen will
help let wants
‘because he wants to let the man help the boy remember his friend’
Another strategy is to choose verbs that select animate and inanimate objects so that
animacy of the arguments can aid interpretation. I have constructed such an example
where the most deeply embedded predicate is not a verb but rather an adjective. The
predicate leer fischen ‘to fish empty’ is a resultative construction that should be analyzed
parallel to verbal complexes (Müller 2002a: Chapter 5).
(22) weil niemand1 [den Mann]2 [der Frau]3 [diesen Teich]4 leer4
because nobody.NOM the.ACC man the.DAT woman this.ACC pond empty
fischen3 helfen2 sah1
fish help saw
‘because nobody saw the man help the woman fish the pond empty’
If one reads the sentence with the relevant pauses, it is comprehensible. Case is unam-
biguously marked on the animate noun phrases and our world knowledge helps us to
interpret diesen Teich ‘this pond’ as the argument of leer ‘empty’.
The sentence in (22) would correctly be analyzed by an appropriately written tree-
local MC-LTAG and also by argument composition analyses for verbal complexes and
resultative constructions. The sentence in (23) is a variant of (22) that corresponds ex-
actly to the pattern of (18):
(23) weil [der Frau]2 [diesen Teich]4 [den Mann]3 niemand1 leer4
because the.DAT woman this.ACC pond the.ACC man nobody.NOM empty
fischen3 helfen2 sah1
fish help saw
‘because nobody saw the man help the woman fish the pond empty’
(23) is more marked than (22), but this is always the case with local reordering (Gis-
bert Fanselow, p. c. 2006). This sentence should not be ruled out by the grammar. Its
markedness is instead due to the same factors that are responsible for the markedness
of reordering of arguments of simplex verbs. Tree-local MC-LTAG cannot correctly
analyze sentences such as (23), which shows that this TAG variant is not sufficient for
analyzing natural language.
There are varying opinions among TAG researchers as to what should be counted as
competence and what should be counted as performance. For instance, Rambow (1994:
15) argues that one should not exclude reorderings that cannot be processed by means of
12.7 Summary and classification
6
See Rambow (1994) and Kallmeyer (2005: 194), however, for TAG analyses with an empty element in the
lexicon.
Kasper, Kiefer, Netter & Vijay-Shanker (1995) show that it is possible to transfer HPSG
grammars that fulfill certain requirements into TAG grammars. This is interesting as
in this way one arrives at a grammar whose complexity behavior is known. Whereas
HPSG grammars are generally in the Type-0 area, TAG grammars can, depending on
the variant, fall into the realm of Type-2 languages (context-free) or even in the larger
set of the mildly context-sensitive grammars (Joshi 1985). Yoshinaga, Miyao, Torisawa &
Tsujii (2001) have developed a procedure for translating FB-LTAG grammars into HPSG
grammars.
Comprehension questions
1. How are long-distance dependencies analyzed in TAG? Does one need empty
elements for this?
Exercises
Further reading
Some important articles are Joshi, Levy & Takahashi (1975), Joshi (1987a), and
Joshi & Schabes (1997). Many works discuss formal properties of TAG and are
therefore not particularly accessible for linguistically interested readers. Kroch
& Joshi (1985) give a good overview of linguistic analyses. An overview of lin-
guistic and computational linguistic works in TAG can be found in the volume
edited by Abeillé and Rambow from 2000. Rambow (1994) compares his TAG
variant (V-TAG) to Karttunen’s Radical Lexicalism approach, Uszkoreit’s GPSG,
Combinatory Categorial Grammar, HPSG and Dependency Grammar.
Shieber & Johnson (1993) discuss psycholinguistically plausible processing
models and show that it is possible to do incremental parsing with TAG. They
also present a further variant of TAG: synchronous TAG. In this TAG variant,
there is a syntactic tree and a semantic tree connected to it. When building syn-
tactic structure, the semantic structure is always built in parallel. This structure
built in parallel corresponds to the level of Logical Form derived from S-structure
using transformations in GB.
Rambow (1994: Chapter 6) presents an automaton-based performance theory.
He applies it to German and shows that the processing difficulties that arise when
reordering arguments of multiple verbs can be explained.
Kallmeyer & Romero (2008) show how it is possible to derive MRS represen-
tations directly via a derivation tree using FTAG. In each top node, there is a
reference to the semantic content of the entire structure and each bottom node
makes reference to the semantic content below the node. In this way, it becomes
possible to insert an adjective (e.g., mutmaßlichen ‘suspected’) into an NP tree
alle Mörder ‘all murderers’ so that the adjective has scope over the nominal part
of the NP (Mörder ‘murderers’): for adjunction of the adjective to the N node,
the adjective can access the semantic content of the noun. The top node of mut-
maßlichen is then the top node of the combination mutmaßlichen Mörder ‘sus-
pected murderers’ and this ensures that the meaning of mutmaßlichen Mörder is
correctly embedded under the universal quantifier.
Part II
General discussion
Preface for Part II
This book is very long. For technical reasons it was split into two parts in the print
version of earlier editions. As of 2020 the book can be published as one volume, but
there are still the two parts and I think it is helpful for the readers to keep this preface
for Part II.
The first part contains the introduction to all the theories and the second part is a col-
lection of topics that are relevant for more than one theory, so it would be inappropriate
to discuss them within one of the chapters of Part I. While Part I has a more introductory
character and can be used for teaching BA and MA students, the material in Part II is for
more advanced readers. I never used it for teaching; it may be a good resource for classes
on these topics nevertheless. In what follows, I give a brief overview of the chapters of
Part II.
Chapter 13 concerns the assumption of innate domain-specific knowledge, some sort
of Universal Grammar. This is probably the hottest debate in linguistics and the side one
takes in this debate has severe consequences for the theories that one considers accept-
able. In Mainstream Generative Grammar (MGG), lots of invisible elements are postu-
lated and in some versions of MGG, it is claimed that these are present in the grammars of
all languages of the world, even though there is no direct evidence for these categories
in some of the languages. Usually the motivation for assuming an empty element is
that there is another language with visible material in the respective position. Whether
one considers such an argumentation as legitimate crucially depends on whether one
believes in innate domain-specific knowledge. Chapter 13 tries to summarize the discus-
sion and to show that for all claims regarding the existence of Universal Grammar there
are counterclaims. Hauser, Chomsky & Fitch (2002) greatly revised Chomsky’s assump-
tions regarding UG. According to them, UG contains rather few and abstract constraints,
but nevertheless UG lives on in the theories developed today and in the way arguments
for them are made. Hence, a chapter like Chapter 13 is important to understand the
discussions and the alternatives to the theories developed in MGG.
Chapter 14 deals with the difference between generative-enumerative and model-theoretic
approaches. Generative-enumerative approaches (basically phrase-structure grammars
and MGG variants of it) enumerate a set of strings considered to be well-formed with
respect to a grammar and possibly a set of transformations. The model-theoretic view
does not say anything about sets but rather deals with formulating well-formedness
conditions for utterances. While this seems to be very similar at first glance, there are
interesting differences in various respects. Chapter 14 deals with utterance fragments
and graded acceptability and discusses alleged problems for model-theoretic approaches.
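The contrast can be illustrated with a toy example. The following Python sketch is not taken from Chapter 14; the language (strings of n a’s followed by n b’s), the constraint names and the function names are all assumptions chosen for illustration. The first function enumerates a set of strings in the generative-enumerative spirit, so a string is either a member or not. The second part states independent well-formedness constraints, so that an ill-formed string can be characterized by the constraints it violates.

```python
# Generative-enumerative view: the toy grammar S -> a S b | a b is used to
# enumerate a set of strings; membership in this set is all-or-nothing.
def enumerate_language(max_depth):
    return {"a" * n + "b" * n for n in range(1, max_depth + 1)}

# Model-theoretic view: well-formedness is stated as separate constraints
# that a candidate string either satisfies or violates.
CONSTRAINTS = {
    "nonempty": lambda s: len(s) > 0,
    "only_a_and_b": lambda s: set(s) <= {"a", "b"},
    "no_a_after_b": lambda s: "ba" not in s,
    "same_number_of_a_and_b": lambda s: s.count("a") == s.count("b"),
}

def violations(s):
    """Return the names of all constraints that the string violates."""
    return [name for name, holds in CONSTRAINTS.items() if not holds(s)]

print("aab" in enumerate_language(5))  # False: simply not in the set
print(violations("aab"))               # one violated constraint
print(violations("bba"))               # two violated constraints
```

In this toy setting the number of violated constraints gives a crude ranking of ill-formed strings, whereas set membership alone does not distinguish between them.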
Chapter 15 introduces the competence/performance distinction. Some researchers re-
ject this distinction completely (most of the researchers working within CxG), others
assume it and try to develop models that are performance-compatible (HPSG, LFG, CG,
TAG) and others develop models that are highly implausible from a performance point of
view (lots of Minimalist work). Chapter 15 introduces the concepts, discusses whether it
makes sense to distinguish competence and performance (I think it does) and examines
what is required for performance-compatible competence models.
It is often argued that theory X must be wrong since it does not explain how language
can be acquired. Interestingly, these accusations go both ways: Construction Grammari-
ans criticize Minimalists for assuming half of their theory to be hard coded in our genetic
material and Minimalists claim that constructionist theories do not have an explanation
for there being recursive structures. Chapter 16 discusses the major approaches to acqui-
sition and explains where they differ and what shortcomings exist.
Chapter 17 deals with the generative capacity of grammar formalisms. The generative
capacity played an important role in the history of generative grammar. Early versions
turned out to be too powerful, resulting in quite radical changes of the framework in
the 1970s and 1980s. One key advantage of GPSG was that it was much more restrictive
than the transformational approaches that were around at the time. As it turned out, it
was too restrictive to model language as such since there were languages that could be
shown to require more powerful machinery. This led to the development of HPSG. The
HPSG formalism has Turing power, which is the worst complexity a formalism can have.
Somewhat ironically, most proponents of HPSG do not care about this at all. Chapter 17
explains why.
Chapter 18 is a brief chapter including the discussion of three topics that come up again
and again: binary branching vs. flat structures, locality and recursion. Some researchers
argue (without proof) that all theories should assume binary branching structures be-
cause otherwise the grammars are not acquirable, while others argue for the opposite
view, again often with acquisition arguments (Section 18.1). Locality is an issue that
is important both in Minimalism and in other theories like HPSG and LFG. However,
there are differences in what is understood by locality. Section 18.2 deals with these is-
sues. Languages usually have recursive structures. So frameworks have to have ways
to account for this. All the frameworks discussed in this book do account for recursion
despite claims to the contrary. This is the topic of Section 18.3.
Chapter 19 deals with empty elements. While some theories have more invisible units
in their trees than visible ones (see Figure 4.21 on page 150), there are frameworks that
do not assume any empty elements, referring to acquisition again. The chapter shows
that certain grammars with empty elements can be converted into grammars without
empty elements. I show that empty elements are not required for semantic reasons; un-
derspecification can be used instead. The chapter shows how important the assumptions
about UG (Chapter 13) are: if empty elements correspond to visible material that in cer-
tain situations occupies the place of the empty element, then there are chances for them
to be acquired from data. If the empty elements are stipulated with reference to material
in other languages, there is a real acquisition problem. A final section shows that (some)
empty elements can be replaced by lexical rules (or templates), which correspond to a
certain type of transformation. This relativizes the debate that arose around stipulating
such theoretical entities.
When it comes to mechanisms that are used in different frameworks, there is a fur-
ther difference between (some variants of) GB/Minimalism and the other theories. Some
theories in the transformational frameworks assume that extraction, scrambling and pas-
sive are dealt with using the same descriptive tool: movement. Other theories use lexical
rules, different phrasal schemata and SLASH propagation techniques. Chapter 20 shows
that phenomena like the so-called remote passive that seem to require movement can
be dealt with without movement. The chapter repeats an example from the GB chapter
that showed that the movement-based analysis of the German passive is problematic,
since nothing moves in German sentences. Passive is a phenomenon that is independent
of movement; it is just English that is SVO and requires a subject before the verb. Since
German does not require subjects, nothing has to be reordered. Dependency Grammar
proposals assuming the same descriptive tool for the three phenomena are discussed as
well, their shortcomings are pointed out and it is concluded that the three phenomena
are independent or at least have to be distinguishable in terms of their treatment in a
theory.
Chapter 21 discusses a further highly controversial issue: the question of whether lan-
guage consists of phrasal patterns or whether there are abstract combinatorial rules that
combine lexical items that contain rich information. Again, questions of language acqui-
sition play a role here. This chapter is rather long but it reflects the complexity of the
discussion in the literature. Many arguments for and against phrasal constructions are
evaluated and it is shown that grammars have to be able to account for phrasal patterns,
but that so-called argument structure constructions are better treated lexically. This dis-
cussion connects nicely to the GPSG–HPSG transition in the 1980s where researchers
switched from a phrasal model to a lexical one in the spirit of Categorial Grammar.
The brief Chapter 22 is related to the topic of Chapter 21 but it compares the TAG,
LFG and HPSG approaches to complex predicates and points out that there are problems
with a certain treatment of complex predicates in TAG. TAG is very similar in spirit to
phrasal Construction Grammar approaches: the elementary trees of TAG are phrasal pat-
terns. It is shown that lexical models like HPSG have an advantage over the phrasal TAG
approach since they specify lexical potential of items, which is not necessarily the same
as actual realization of dependents. It is pointed out that the LFG analysis of complex
predicates allowing for overrides of lexical information at phrasal nodes lies somewhere
in the middle between TAG and HPSG.
Chapter 23 showcases a way to develop linguistic theories that capture cross-linguistic
generalizations. It differs from the general top-down approach in Minimalism, where it is
assumed that certain constraints hold for all languages. Instead, constraints that hold for
known languages are collected in sets in a bottom-up way. The most general set contains
all constraints that hold for all known languages. This approach is independent of the
assumption of an innate UG and hence compatible with both CxG and Minimalism. The
chapter also contains some speculations on how to integrate universal constraints in a
Cinque-like UG way in case there turns out to be an empirical basis for this.
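The bottom-up collection of constraints can be pictured as set intersection. The following Python sketch rests on invented placeholders: the language labels L1 to L3, the constraint labels c1 to c4 and the function name are hypothetical and do not stand for any actual language or constraint discussed in Chapter 23.

```python
# Placeholder data: L1-L3 and c1-c4 are invented labels, not real findings.
CONSTRAINTS_BY_LANGUAGE = {
    "L1": {"c1", "c2", "c3"},
    "L2": {"c1", "c2", "c4"},
    "L3": {"c1", "c3", "c4"},
}

def shared_constraints(languages):
    """Constraints that hold for every language in a non-empty list."""
    sets = [CONSTRAINTS_BY_LANGUAGE[lang] for lang in languages]
    return set.intersection(*sets)

print(shared_constraints(["L1", "L2"]))        # shared by two languages
print(shared_constraints(["L1", "L2", "L3"]))  # the most general set
```

Adding a further language can only shrink or preserve the most general set, which is why nothing has to be assumed in advance about which constraints are universal.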
Chapter 24 draws some conclusions as to what an appropriate framework for describ-
ing languages should look like.
13 The innateness of linguistic
knowledge
If we try and compare the theories presented in this book, we notice that there are a
number of similarities.1 In all of the frameworks, there are variants of theories that
use feature-value pairs to describe linguistic objects. The syntactic structures assumed
are sometimes similar. Nevertheless, there are some differences that have often led to
fierce debates between members of the various schools. Theories differ with regard to
whether they assume transformations, empty elements, phrasal or lexical analyses, bi-
nary branching or flat structures.
Every theory has to not only describe natural language, but also explain it. It is pos-
sible to formulate an infinite number of grammars that license structures for a given
language (see Exercise 1 on page 78). These grammars are observationally adequate. A
grammar achieves descriptive adequacy if it corresponds to observations and the intu-
itions of native speakers.2 A linguistic theory is descriptively adequate if it can be used
to formulate a descriptively adequate grammar for every natural language. However,
grammars achieving descriptive adequacy do not necessarily reach explanatory
adequacy. Grammars that achieve explanatory adequacy are those that are compati-
ble with acquisition data, that is, grammars that could plausibly be acquired by human
speakers (Chomsky 1965: 24–25).
Chomsky (1965: 25) assumes that children already have domain-specific knowledge
about what grammars could, in principle, look like and then extract information about
what a given grammar actually looks like from the linguistic input. The most prominent
1
The terms theory and framework may require clarification. A framework is a common set of assumptions
and tools that is used when theories are formulated. In this book, I discussed theories of German. These
theories were developed in certain frameworks (GB, GPSG, HPSG, LFG, …) and of course there are other
theories of other languages that share the same fundamental assumptions. These theories differ from
the theories of German presented here but are formulated in the same framework. Haspelmath (2010b)
argues for framework-free grammatical theory. If grammatical theories used incompatible tools, it would
be difficult to compare languages. So assuming transformations for English nonlocal dependencies and
a SLASH mechanism for German would make comparison impossible. I agree with Haspelmath that the
availability of formal tools may lead to biases, but in the end the facts have to be described somehow. If
nothing is shared between theories, we end up with isolated theories formulated in one-man frameworks.
If there is shared vocabulary and if there are standards for doing framework-free grammatical theory, then
the framework is framework-free grammatical theory. See Müller (2015c) and Chapter 23 of this book for
further discussion.
2
This term is not particularly useful as subjective factors play a role. Not everybody finds grammatical
theories intuitively correct where it is assumed that every observed order in the languages of the world
has to be derived from a common Specifier-Head-Complement configuration, and also only by movement
to the left (see Section 4.6.1 for the discussion of such proposals).
By attributing arbitrary assumptions to UG, it is possible to keep the rest of the analysis
very simple.
The following section will briefly review some of the arguments for language-specific
innate knowledge. We will see that none of these arguments are uncontroversial. In
the following chapters, I will discuss fundamental questions about the architecture of
grammar, the distinction between competence and performance and how to model per-
formance phenomena, the theory of language acquisition as well as other controversial
questions, e.g., whether it is desirable to postulate empty elements in linguistic repre-
sentations and whether language should be explained primarily based on the properties
of words or rather phrasal patterns.
Before we turn to these hotly debated topics, I want to discuss the one that is most
fiercely debated, namely the question of innate linguistic knowledge. In the literature,
one finds the following arguments for innate knowledge:
13.1 Syntactic universals
• the fact that all children learn a language, but primates do not,
• the fact that children spontaneously regularize pidgin languages,
• the localization of language processing in particular parts of the brain,
• the alleged dissociation of language and general cognition:
– Williams Syndrome,
– the KE family with FoxP2 mutation and
• the Poverty of the Stimulus Argument.
Pinker (1994) offers a nice overview of these arguments. Tomasello (1995) provides a
critical review of this book. The individual points will be discussed in what follows.
introduce special categories for both prepositions and postpositions, then a four-way
division of parts of speech like the one on page 94 would no longer be possible. One
would instead require an additional binary feature and one would thereby automatically
predict eight categories although only five (the four commonly assumed plus an extra
one) are actually needed.
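The arithmetic behind this overgeneration argument is easy to check. In the following Python sketch, the feature names N and V for the four-way division and the name F for the additional feature are assumptions made for illustration; the point is only that each additional binary feature doubles the number of predicted categories.

```python
from itertools import product

def feature_combinations(features):
    """All value assignments for a list of binary features."""
    return [dict(zip(features, values))
            for values in product("+-", repeat=len(features))]

two = feature_combinations(["N", "V"])         # assumed four-way division
three = feature_combinations(["N", "V", "F"])  # F: hypothetical extra feature

print(len(two))         # 4
print(len(three))       # 8
print(len(three) - 5)   # 3 predicted categories that are not needed
```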
One can see that the generalization about direction of government that Pinker formulated
as a universal claim is in fact correct, but as a tendency rather than as a strict rule; that is,
there are many languages where there is a correlation between the use of prepositions
or postpositions and the position of the verb (Dryer 1992: 83).6
In many languages, adpositions have evolved from verbs. In Chinese grammar, it is
commonplace to refer to a particular class of words as coverbs. These are words that can
be used both as prepositions and as verbs. If we view languages historically, then we
can find explanations for these tendencies that do not have to make reference to innate
linguistic knowledge (see Evans & Levinson 2009a: 445).
Furthermore, it is possible to explain the correlations with reference to processing pref-
erences: in languages with the same direction of government, the distance between the
verb and the pre-/postposition is less (Figure 13.1a–b) than in languages with differing
directions of government (Figure 13.1c–d). From the point of view of processing, lan-
guages with the same direction of government should be preferred since they allow the
hearer to better identify the parts of the verb phrase (Newmeyer (2004a: 219–221) cites
Hawkins (2004: 32) with a relevant general processing preference, see also Dryer (1992:
131)). This tendency can thus be explained as the grammaticalization of a performance
preference (see Chapter 15 for the distinction between competence and performance)
and recourse to innate language-specific knowledge is not necessary.
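The distance argument can be made concrete with a small count. The following Python sketch assumes the VP-internal orders shown in Figure 13.1 (a verb, an object NP and a PP consisting of an adposition and an NP) and treats every NP as a single unit, which is a simplification. The dictionary VP_ORDERS and the function name are hypothetical.

```python
# Linear order inside the VP for the four configurations of Figure 13.1.
VP_ORDERS = {
    "SVO with prepositions": ["V", "NP", "P", "NP"],
    "SOV with postpositions": ["NP", "P", "NP", "V"],
    "SVO with postpositions": ["V", "NP", "NP", "P"],
    "SOV with prepositions": ["P", "NP", "NP", "V"],
}

def intervening(order):
    """Number of units between the verb and the adposition."""
    return abs(order.index("V") - order.index("P")) - 1

for name, order in VP_ORDERS.items():
    print(name, intervening(order))
```

Under these assumptions, the two harmonic orders have one NP between verb and adposition and the two disharmonic orders have two, so the difference grows with the length of the NPs involved.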
13.1.2 X̄ structures
It is often assumed that all languages have syntactic structures that correspond to the X̄
schema (see Section 2.5) (Pinker 1994: 238; Meisel 1995: 11, 14; Pinker & Jackendoff 2005:
216). There are, however, languages such as Dyirbal (Australia) where it does not seem
to make sense to assume hierarchical structure for sentences. Thus, Bresnan (2001: 110)
assumes that Tagalog, Hungarian, Malayalam, Warlpiri, Jiwarli, Wambaya, Jakaltek and
other corresponding languages do not have a VP node, but rather a rule taking the form
of (3):
(3) S → C∗
Here, C∗ stands for an arbitrary number of constituents and there is no head in the
structure. Other examples for structures without heads will be discussed in Section 21.10.
6
Pinker (1994: 234) uses the word usually in his formulation. He thereby implies that there are exceptions
and that the correlation between the ordering of adpositions and the direction of government of verbs
is actually a tendency rather than a universally applicable rule. However, in the pages that follow, he
argues that the Head Directionality Parameter forms part of innate linguistic knowledge. Travis (1984: 55)
discusses data from Mandarin Chinese that do not correspond to the correlations she assumes. She then
proposes treating the Head Directionality Parameter as a kind of Default Parameter that can be overridden
by other constraints in the language.
(a) SVO with prepositions (common): [IP NP [VP V NP [PP P NP]]]
(b) SOV with postpositions (common): [IP NP [VP [PP NP P] NP V]]
(c) SVO with postpositions (rare): [IP NP [VP V NP [PP NP P]]]
(d) SOV with prepositions (rare): [IP NP [VP [PP P NP] NP V]]
Figure 13.1: Distance between verb and preposition for various head orders according to
Newmeyer (2004a: 221)
X̄ structure was introduced to restrict the form of possible rules. The assumption
was that these restrictions reduce the class of grammars one can formulate and thus –
according to the assumption – make the grammars easier to acquire. But as Kornai &
Pullum (1990) have shown, the assumption of X̄ structures does not lead to a restriction
with regard to the number of possible grammars if one allows for empty heads. In GB,
a number of null heads were used and in the Minimalist Program, there has been a
significant increase of these. For example, the rule in (3) can be reformulated as follows:
(4) V′ → V⁰ C∗
Here, V⁰ is an empty head. Since specifiers are optional, V′ can be projected to VP and
we arrive at a structure corresponding to the X̄ schema.
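The effect of the reformulation can be sketched in code. The following Python fragment is an illustration under stated assumptions: trees are encoded as (label, children) pairs, the words w1 to w3 are placeholders, and all function names are hypothetical. It builds a flat structure according to (3) and a headed structure according to (4) and shows that both have the same terminal string, since the empty head contributes nothing to it.

```python
def terminals(tree):
    """Return the list of non-empty words at the leaves of a tree."""
    label, children = tree
    if isinstance(children, str):       # leaf: children is the word itself
        return [children] if children else []
    return [word for child in children for word in terminals(child)]

def flat_clause(constituents):
    """Rule (3): S -> C*, a structure without a head."""
    return ("S", list(constituents))

def headed_clause(constituents):
    """Rule (4): V' -> V0 C* with an empty V0; V' is projected to VP."""
    empty_head = ("V0", "")
    return ("VP", [("V'", [empty_head] + list(constituents))])

cs = [("C", "w1"), ("C", "w2"), ("C", "w3")]   # placeholder constituents
print(terminals(flat_clause(cs)))
print(terminals(headed_clause(cs)))
```

Because any headless rule can be wrapped in this way, the requirement that every structure have a head does not by itself exclude any string set.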
Apart from the problem with languages with very free constituent order, there are
further problems with adjunction structures: Chomsky’s analysis of adjective structure
in X̄ theory (Chomsky 1970: 210; see also Section 2.5 of this book, in particular Figure 2.8
on page 74) is not straightforwardly applicable to German since, unlike English, adjec-
tive phrases in German are head-final and degree modifiers must directly precede the
adjective:
(5) a. der auf seinen Sohn sehr stolze Mann
the of his son very proud man
‘the man very proud of his son’
Following the X̄ schema, auf seinen Sohn has to be combined with stolze and only then
can the resulting A projection be combined with its specifier (see Figure 2.8 on page 74
for the structure of adjective phrases in English). It is therefore only possible to derive
orders such as (5b) or (5c). Neither of these is possible in German. It is only possible to
rescue the X̄ schema if one assumes that German is exactly like English and, for some
reason, the complements of adjectives must be moved to the left. If we allow this kind
of repair approach, then of course any language can be described using the X̄ schema.
The result would be that one would have to postulate a vast number of movement rules
for many languages and this would be extremely complex and difficult to motivate from
a psycholinguistic perspective. See Chapter 15 for grammars compatible with perfor-
mance.
A further problem for X̄ theory in its strictest form as presented in Section 2.5 is posed
by so-called hydra clauses (Perlmutter & Ross 1970; Link 1984; Kiss 2005):
(6) a. [[der Kater] und [die Katze]], die einander lieben
the tomcat and the cat that each.other love
‘the tomcat and the (female) cat that love each other’
b. [[The boy] and [the girl]] who dated each other are friends of mine.
Since the relative clauses in (6) refer to a group of referents, they can only attach to
the result of the coordination. The entire coordination is an NP, however, and adjuncts
should actually be attached at the X̄ level. The reverse case of relative clauses in German
and English is posed by adjectives in Persian: Samvelian (2007) argues for an analysis
where adjectives are combined with nouns directly, and only the combination of nouns
and adjectives is then combined with a PP argument.
The discussion of German and English shows that the introduction of specifiers and
adjuncts cannot be restricted to particular projection levels, and the preceding discussion
of non-configurational languages has shown that the assumption of intermediate levels
does not make sense for every language.
It should also be noted that Chomsky himself assumed in 1970 that languages can
deviate from the X̄ schema (1970: 210).
If one is willing to encode all information about combination in the lexicon, then
one could get by with very abstract combinatorial rules that would hold universally.
Examples of this kind of combinatorial rule are the multiplication rules of Categorial
Grammar (see Chapter 8) as well as Merge in the Minimalist Program (see Chapter 4). The
rules in question simply state that two linguistic objects are combined. These kinds of
combination of course exist in every language. With completely lexicalized grammars,
however, it is only possible to describe languages if one allows for null heads and makes
certain ad hoc assumptions. This will be discussed in Section 21.10.
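How little such abstract rules say can be seen in a sketch of the application rules of Categorial Grammar. The encoding is an assumption made for illustration: atomic categories are strings, complex categories are (result, slash, argument) triples with the result written first, and the function name combine is hypothetical. All information about what combines with what resides in the lexical categories; the rule itself only combines two adjacent objects.

```python
def combine(left, right):
    """Try forward, then backward application; return None if neither fits."""
    # Forward application: X/Y followed by Y yields X.
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    # Backward application: Y followed by X\Y yields X.
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

NP, S = "np", "s"
IV = (S, "\\", NP)   # s\np: needs an np to its left
TV = (IV, "/", NP)   # (s\np)/np: needs an np to its right first

verb_phrase = combine(TV, NP)         # transitive verb plus object
sentence = combine(NP, verb_phrase)   # subject plus verb phrase
print(verb_phrase == IV, sentence)
```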
7
However, Chomsky (1981a: 27) allows for languages not to have a subject. He assumes that this is handled
by a parameter. Bresnan (2001: 311) formulates the Subject Condition, but mentions in a footnote that it
might be necessary to parameterize this condition so that it only holds for certain languages.
8
For further discussion of subjectless verbs in German, see Haider (1993: Sections 6.2.1, 6.5), Fanselow
(2000b), Nerbonne (1986b: 912) and Müller (2007a: Section 3.2).
13.1 Syntactic universals
Nevertheless, the applicability of the EPP and the Subject Condition is sometimes also
assumed for German. Grewendorf (1993: 1311) assumes that there is an empty expletive
that fills the subject position of subjectless constructions.
Berman (1999: 11; 2003a: Chapter 4), working in LFG, assumes that verbal morphology
can fulfill the subject role in German and therefore even in sentences where no subject
is overtly present, the position for the subject is filled in the f-structure. A constraint
stating that all f-structures without a PRED value must be third person singular applies
to the f-structure of the unexpressed subject. The agreement information in the finite
9
Haider (1986a: 18).
verb has to match the information in the f-structure of the unexpressed subject and hence
the verbal inflection in subjectless constructions is restricted to be 3rd person singular
(Berman 1999).
As we saw on page 166, some researchers working in the Minimalist Program even
assume that there is an object in every sentence (Stabler quoted in Veenstra (1998: 61,
124)). Objects of monovalent verbs are assumed to be empty elements.
If we allow these kinds of tools, then it is of course easy to maintain the existence
of many universals: we claim that a language X has the property Y and then assume
that the structural items are invisible and have no meaning. These analyses can only be
justified theory-internally with the goal of uniformity (see Culicover & Jackendoff 2005:
Section 2.1.2).10
This is possible in (12b), however, as there is no such c-command relation. For er ‘he’, it
must only be the case that it does not refer to another argument of the verb getrunken
‘drunk’ and this is indeed the case in (12b). Similarly, there is no c-command relation
between er ‘he’ and Max in (12c) since the pronoun er is inside a complex structure. er
‘he’ and Max can therefore refer to the same or different individuals in (12b) and
(12c).
Crain, Thornton & Khlentzos (2009: 147) point out that (12b,c) and the corresponding
English examples are ambiguous, whereas (12a) is not, due to Principle C. This means
that one reading is not available. In order to acquire the correct binding principles, the
learner would need information about which meanings expressions do not have. The
authors note that children already master Principle C at age three and they conclude
from this that Principle C is a plausible candidate for innate linguistic knowledge. (This is
a classic kind of argumentation. For Poverty of the Stimulus arguments, see Section 13.8
and for more on negative evidence, see Section 13.8.4).
Evans & Levinson (2009b: 483) note that Principle C is a strong cross-linguistic tendency
but that it nevertheless has some exceptions. As examples, they mention reciprocal
expressions in Abaza, where affixes that correspond to each other occur in subject
position rather than object position, and Guugu Yimidhirr, where pronouns in a
superordinate clause can be coreferent with full NPs in a subordinate clause.
Furthermore, Fanselow (1992b: 351) refers to the examples in (13) that show that Prin-
ciple C is a poor candidate for a syntactic principle.
(13) a. Mord ist ein Verbrechen.
murder is a crime
b. Ein gutes Gespräch hilft Probleme überwinden.
a good conversation helps problems overcome
‘A good conversation helps to overcome problems.’
(13a) expresses that it is a crime when somebody kills someone else, and (13b) refers to
conversations with another person rather than talking to oneself. In these sentences,
the nominalizations Mord ‘murder’ and Gespräch ‘conversation’ are used without any
arguments of the original verbs. So there aren’t any arguments that stand in a syntactic
command relation to one another. Nevertheless the arguments of the nominalized verbs
cannot be coreferential. Therefore it seems that there is a principle at work that says
that the argument slots of a predicate must be interpreted as non-coreferential as long
as the identity of the arguments is not explicitly expressed by linguistic means.
In sum, one can say that there are still a number of unsolved problems with Binding
Theory. The HPSG variants of Principles A–C in English cannot even be applied to Ger-
man (Müller 1999b: Chapter 20). Working in LFG, Dalrymple (1993) proposes a variant
of Binding Theory where the binding properties of pronominal expressions are deter-
mined in the lexicon. In this way, the language-specific properties of pronouns can be
accounted for.
13.1.5.1 Extraposition
Baltin (1981) and Chomsky (1986a: 40) claim that the extraposed relative clauses in (14)
have to be interpreted with reference to the embedding NP, that is, the sentences are not
11
Newmeyer (2004b: 539–540) points out a conceptual problem following from the language-specific deter-
mination of bounding nodes: it is argued that subjacency is an innate language-specific principle since
it is so abstract that it is impossible for speakers to learn it. However, if parameterization requires that a
speaker chooses from a set of categories in the linguistic input, then the corresponding constraints must be
derivable from the input at least to the degree that it is possible to determine the categories involved. This
raises the question as to whether the original claim of the impossibility of acquisition is actually justified.
See Section 13.8 on the Poverty of the Stimulus and Section 16.1 on parameter-based theories of language
acquisition.
Note also that a parameter that has as the value a part of speech requires the respective part of speech
values to be part of UG.
12
However, see Baltin (2004: 552).
equivalent to those where the relative clause would occur in the position marked with
t, but rather they correspond to examples where it would occur in the position of the t′.
(14) a. [NP Many books [PP with [stories t]] t′] were sold [that I wanted to read].
b. [NP Many proofs [PP of [the theorem t]] t′] appeared
[that I wanted to think about].
Here, it is assumed that NP, PP, VP and AP are bounding nodes for rightward movement
(at least in English) and the interpretation in question here is thereby ruled out by the
Subjacency Principle (Baltin 1981: 262).
If we construct a German example parallel to (14a) and replace the embedding noun
so that it is ruled out or dispreferred as a referent, then we arrive at (15):
(15) weil viele Schallplatten mit Geschichten verkauft wurden, die ich noch
because many records with stories sold were that I still
lesen wollte
read wanted
‘because many records with stories were sold that I wanted to read’
This sentence can be uttered in a situation where somebody in a record store sees partic-
ular records and remembers that he had wanted to read the fairy tales on those records.
Since one does not read records, adjunction to the superordinate noun is implausible and
thus adjunction to Geschichten ‘stories’ is preferred. By carefully choosing the nouns, it
is possible to construct examples such as (16) that show that extraposition can take place
across multiple NP nodes:13
(16) a. Karl hat mir [ein Bild [einer Frau _𝑖 ]] gegeben, [die schon lange tot
Karl has me a picture a woman given that PART long dead
ist]𝑖 .
is
‘Karl gave me a picture of a woman that has been dead for some time.’
b. Karl hat mir [eine Fälschung [des Bildes [einer Frau _𝑖 ]]] gegeben, [die
Karl has me a forgery of.the picture of.a woman given that
schon lange tot ist]𝑖 .
PART long dead is
‘Karl gave me a forgery of the picture of a woman that has been dead for some
time.’
c. Karl hat mir [eine Kopie [einer Fälschung [des Bildes [einer Frau _𝑖 ]]]]
Karl has me a copy of.a forgery of.the picture of.a woman
gegeben, [die schon lange tot ist]𝑖 .
given that PART long dead is
‘Karl gave me a copy of a forgery of the picture of a woman that has been dead
for some time.’
13
See Müller (1999b: 211) and Müller (2004c; 2007c: Section 3). For parallel examples from Dutch, see Koster
(1978: 52).
This kind of embedding could continue further if one did not eventually run out
of nouns that allow for semantically plausible embedding. NP is viewed as a bounding
node in German (Grewendorf 1988: 81; 2002: 17–18; Haider 2001: 285). These examples
show that it is possible for rightward extraposed relative clauses to cross any number of
bounding nodes.
Koster (1978: 52–54) discusses some possible explanations for the data in (16), where it
is assumed that relative clauses move to the NP/PP border and are then moved on further
from there (this movement requires so-called escape hatches or escape routes). He argues
that these approaches will also work for the very sentences that should be ruled out by
subjacency, that is, for examples such as (14). This means that either data such as (14)
can be explained by subjacency and the sentences in (16) are counterexamples, or there
are escape hatches and the examples in (14) are irrelevant, deviant sentences that cannot
be explained by subjacency.
In the examples in (16), a relative clause was extraposed in each case. These rela-
tive clauses are treated as adjuncts and there are analyses that assume that extraposed
adjuncts are not moved but rather base-generated in their position, and coreference/
coindexation is achieved by special mechanisms (Kiss 2005). For proponents of these
kinds of analyses, the examples in (16) would be irrelevant to the subjacency discussion
as the Subjacency Principle only constrains movement. However, extraposition across
phrase boundaries is not limited to relative clauses; sentential complements can also be
extraposed:
(17) a. Ich habe [von [der Vermutung _𝑖 ]] gehört, [dass es Zahlen gibt, die
I have from the conjecture heard that EXPL numbers gives that
die folgenden Bedingungen erfüllen]𝑖 .
the following requirements fulfill
‘I have heard of the conjecture that there are numbers that fulfill the following
requirements.’
b. Ich habe [von [einem Beweis [der Vermutung _𝑖 ]]] gehört, [dass es
I have from a proof of.the conjecture heard that EXPL
Zahlen gibt, die die folgenden Bedingungen erfüllen]𝑖 .
numbers gives that the following requirements fulfill
‘I have heard of a proof of the conjecture that there are numbers that fulfill
the following requirements.’
c. Ich habe [von [dem Versuch [eines Beweises [der Vermutung _𝑖 ]]]] gehört,
I have from the attempt of.a proof of.the conjecture heard
[dass es Zahlen gibt, die die folgenden Bedingungen erfüllen]𝑖 .
that EXPL numbers gives that the following requirements fulfill
‘I have heard of the attempt to prove the conjecture that there are numbers
that fulfill the following requirements.’
Since there are nouns that select zu infinitives or prepositional phrases and since these
can be extraposed like the sentences above, it must be ensured that the syntactic category
of the postposed element corresponds to the category required by the noun. This
means that there has to be some kind of relation between the governing noun and the ex-
traposed element. For this reason, the examples in (17) have to be analyzed as instances
of extraposition and provide counterevidence to the claims discussed above.
If one wishes to discuss the possibility of recursive embedding, then one is forced to
refer to constructed examples as the likelihood of stumbling across groups of sentences
such as those in (16) and (17) is very remote. It is, however, possible to find some individ-
ual cases of deep embedding: (18) gives some examples of relative clause extraposition
and complement extraposition taken from the Tiger corpus14 (Müller 2007c: 78–79; Meu-
rers & Müller 2009: Section 2.1).
(18) a. Der 43jährige will nach eigener Darstellung damit [NP den Weg [PP für
the 43.year.old wants after own depiction there.with the way for
[NP eine Diskussion [PP über [NP den künftigen Kurs [NP der stärksten
a discussion about the future course of.the strongest
Oppositionsgruppierung]]]]]] freimachen, [die aber mit 10,4 Prozent
opposition.group free.make that however with 10.4 percent
der Stimmen bei der Wahl im Oktober weit hinter den Erwartungen
of.the votes at the election in October far behind the expectations
zurückgeblieben war]. (s27639)
stayed.back was
‘In his own words, the 43-year-old wanted to clear the way for a discussion
about the future course of the strongest opposition group that had, however,
performed well below expectations gaining only 10.4 percent of the votes at
the election in October.’
b. […] die Erfindung der Guillotine könnte [NP die Folge [NP eines
the invention of.the guillotine could the result of.a
verzweifelten Versuches des gleichnamigen Doktors] gewesen sein, [seine
desperate attempt the same.name doctor have been his
Patienten ein für allemal von Kopfschmerzen infolge schlechter Kissen
patients once for all.time of headaches because.of bad pillows
zu befreien]. (s16977)
to free
‘The invention of the guillotine could have been the result of a desperate at-
tempt of the eponymous doctor to rid his patients once and for all of headaches
from bad pillows.’
It is also possible to construct sentences for English that violate the Subjacency Condi-
tion. Uszkoreit (1990: 2333) provides the following example:
(19) [NP Only letters [PP from [NP those people _𝑖 ]]] remained unanswered [that had
received our earlier reply]𝑖 .
14
See Brants et al. (2004) for more information on the Tiger corpus.
Jan Strunk (p. c. 2008) has found examples for extraposition of both restrictive and non-
restrictive relative clauses across multiple phrase boundaries:
(20) a. For example, we understand that Ariva buses have won [NP a number [PP of
[NP contracts [PP for [NP routes in London _𝑖 ]]]]] recently, [which will not be
run by low floor accessible buses]𝑖 .15
b. I picked up [NP a copy of [NP a book _𝑖 ]] today, by a law professor, about law,
[that is not assigned or in any way required to read]𝑖 .16
c. We drafted [NP a list of [NP basic demands _𝑖 ]] that night [that had to be
unconditionally met or we would stop making and delivering pizza and go on
strike]𝑖 .17
(20a) is also published in Strunk & Snider (2013: 111). Further attested examples from
German and English can be found in this paper.
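To make the talk of crossing bounding nodes concrete, the following Python sketch counts the bounding nodes that dominate the gap in a labeled bracketing such as (19) or (20a). It is an illustration only: the function name, the ASCII gap marker "_i" and the tokenization are assumptions made for this example, and the set of bounding nodes is the one assumed above for rightward movement in English. On the usual formulation of subjacency, a single movement step may cross at most one bounding node.

```python
import re

# Bounding nodes assumed for rightward movement in English (see the discussion of (14)).
BOUNDING_NODES = {"NP", "PP", "VP", "AP"}

# A token is a labeled opening bracket, a bare bracket, or a word.
TOKEN = re.compile(r"\[[A-Z]+(?=\s)|\[|\]|[^\s\[\]]+")


def bounding_nodes_above_gap(bracketing, gap="_i"):
    """Return the labels of all bounding nodes that dominate the gap, outermost first."""
    open_labels = []
    for token in TOKEN.findall(bracketing):
        if token.startswith("["):
            open_labels.append(token[1:])  # empty string for an unlabeled bracket
        elif token == "]":
            open_labels.pop()
        elif token == gap:
            return [label for label in open_labels if label in BOUNDING_NODES]
    raise ValueError("no gap marker found in the bracketing")


example_19 = ("[NP Only letters [PP from [NP those people _i ]]] remained "
              "unanswered [that had received our earlier reply]")
example_20a = ("we understand that Ariva buses have won [NP a number [PP of "
               "[NP contracts [PP for [NP routes in London _i ]]]]] recently")

for name, example in [("(19)", example_19), ("(20a)", example_20a)]:
    crossed = bounding_nodes_above_gap(example)
    print(name, crossed, "more than one bounding node:", len(crossed) > 1)
```

For (19) the sketch reports three bounding nodes above the gap and for (20a) five, which is the sense in which these attested sentences are problematic for a subjacency-based account of extraposition.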
The preceding discussion has shown that subjacency constraints on rightward move-
ment do not hold for English or German and thus cannot be viewed as universal. One
could simply claim that NP and PP are not bounding nodes in English or German. Then,
these extraposition data would no longer be problematic for theories assuming subja-
cency. However, subjacency constraints are also assumed for leftward movement. This
is discussed in more detail in the following section.
13.1.5.2 Extraction
Under certain conditions, leftward movement is not possible from certain constituents
(Ross 1967). These constituents are referred to as islands for extraction. Ross (1967: Sec-
tion 4.1) formulated the Complex NP Constraint (CNPC) that states that extraction is not
possible from complex noun phrases. An example of extraction from a relative clause
inside a noun phrase is the following:
(21) * Who𝑖 did he just read [NP the report [S that was about _𝑖 ]]?
Although (21) would be a semantically plausible question, the sentence is still ungram-
matical. This is explained by the fact that the question pronoun has been extracted
across the sentence boundary of a relative clause and then across the NP boundary and
has therefore crossed two bounding nodes. It is assumed that the CNPC holds for all
languages. This is not the case, however, as the corresponding structures are possible in
Danish (Erteschik-Shir & Lappin 1979: 55), Norwegian, Swedish, Japanese, Korean, Tamil
and Akan (see Hawkins (1999: 245, 262) and references therein). Since the restrictions
of the CNPC are integrated into the Subjacency Principle, it follows that the Subjacency
Principle cannot be universally applicable unless one claims that NP is not a bounding
node in the problematic languages. However, it seems indeed to be the case that the
majority of languages do not allow extraction from complex noun phrases. Hawkins
15
http://www.publications.parliament.uk/pa/cm199899/cmselect/cmenvtra/32ii/32115.htm, 2018-02-20.
16
http://greyhame.org/archives/date/2005/09/, 2008-09-27.
17
http://portland.indymedia.org/en/2005/07/321809.shtml, 2018-02-20.
explains this on the basis of the processing difficulties associated with the structures
in question (Section 4.1). He explains the difference between languages that allow this
kind of extraction and languages that do not with reference to the differing processing
load for structures that stem from the interaction of extraction with other grammatical
properties such as verb position and other conventionalized grammatical structures in
the respective languages (Section 4.2).
Unlike extraction from complex noun phrases, extraction across a single sentence
boundary (22) is not ruled out by the Subjacency Principle.
(22) Who𝑖 did she think that he saw _𝑖 ?
Movement across multiple sentence boundaries, as discussed in previous chapters, is ex-
plained by so-called cyclic movement in transformational theories: a question pronoun
is moved to a specifier position and can then be moved further to the next highest speci-
fier. Each of these movement steps is subject to the Subjacency Principle. The Subjacency
Principle rules out long-distance movement in one fell swoop.
The Subjacency Principle cannot explain why extraction from sentences embedded
under verbs that specify the kind of utterance (23a) or factive verbs (23b) is deviant
(Erteschik-Shir & Lappin 1979: 68–69).
(23) a. ?? Who𝑖 did she mumble that he saw _𝑖 ?
b. ?? Who𝑖 did she realize that he saw _𝑖 ?
The structure of these sentences seems to be the same as that of (22). Purely syntactic
approaches have nevertheless attempted to explain these differences as subjacency violations
or as violations of Ross’ constraints. It has therefore been assumed (Stowell 1981: 401–402)
that the sentences in (23) have a structure different from that of (22). Stowell
treats these sentential arguments of manner of speaking verbs as adjuncts. Since adjunct
clauses are islands for extraction by assumption, this would explain why (23a) is marked.
The adjunct analysis is compatible with the fact that these sentential arguments can be
omitted:
(24) a. She shouted that he left.
b. She shouted.
Ambridge & Goldberg (2008: 352) have pointed out that treating such clauses as adjuncts
is not justified as they are only possible with a very restricted class of verbs, namely verbs
of saying and thinking. This property is a property of arguments and not of adjuncts.
Adjuncts such as place modifiers are possible with a wide range of verb classes. Furthermore,
the meaning changes if the sentential argument is omitted as in (24b): whereas
(24a) requires that some information is communicated, this does not have to be the case
with (24b). It is also possible to replace the sentential argument with an NP as in (25),
which one would certainly not want to treat as an adjunct.
(25) She shouted the remark/the question/something I could not understand.
Kiparsky & Kiparsky (1970) suggest an analysis of factive verbs that assumes a complex
noun phrase with a nominal head. An optional fact Deletion-Transformation removes
the head noun and the determiner of the NP in sentences such as (27a) to derive sentences
such as (27b) (page 159).
(27) a. She realized [NP the fact [S that he left]].
b. She realized [NP [S that he left]].
The impossibility of extraction out of such sentences can be explained by assuming that
two bounding nodes were crossed, which was assumed to be impossible (on the island
status of this construction, see Kiparsky & Kiparsky 1970: Section 4). This analysis pre-
dicts that extraction from complement clauses of factive verbs should be just as bad as
extraction from overt NP arguments since the structure for both is the same. According
to Ambridge & Goldberg (2008: 353), this is, however, not the case:
(28) a. * Who did she realize the fact that he saw _𝑖 ?
b. ?? Who did she realize that he saw _𝑖 ?
Together with Erteschik-Shir (1981), Erteschik-Shir & Lappin (1979), Takami (1988) and
Van Valin (1998), Goldberg (2006: Section 7.2) assumes that the gap must be in a part of
the utterance that can potentially form the focus of an utterance (see Cook (2001), De
Kuthy (2002) and Fanselow (2003c) for German). This means that this part must not be
presupposed.18 If one considers what this means for the data from the subjacency discus-
sion, then one notices that in each case extraction has taken place out of presupposed
material:
(29) a. Complex NP
She didn’t see the report that was about him. → The report was about him.
b. Complement of a verb of thinking or saying
She didn’t whisper that he left. → He left.
c. Factive verb
She didn’t realize that he left. → He left.
18
Information is presupposed if it is true regardless of whether the utterance is negated or not. Thus, it
follows from both (i.a) and (i.b) that there is a king of France.
Goldberg assumes that constituents that belong to backgrounded information are islands
(Backgrounded constructions are islands (BCI)). Ambridge & Goldberg (2008) have tested
this semantic/pragmatic analysis experimentally and compared it to a purely syntac-
tic approach. They were able to confirm that information structural properties play a
significant role for the extractability of elements. Along with Erteschik-Shir (1973: Sec-
tion 3.H), Ambridge & Goldberg (2008: 375) assume that languages differ with regard to
the degree to which constituents have to belong to background knowledge in order to rule out
extraction. In any case, we should not rule out extraction from adjuncts for all languages, as
there are languages such as Danish where it is possible to extract from relative clauses.19
Erteschik-Shir (1973: 61) provides the following examples, among others:
(30) a. Det𝑖 er der mange [der kan lide _𝑖 ].
that are there many that can like
‘There are many who like that.’ (lit.: ‘That, there are many who like.’)
b. Det hus𝑖 kender jeg en mand [som har købt _𝑖 ].
that house know I a man that has bought
‘I know a man that has bought that house.’ (lit.: ‘This house, I know a man that
has bought.’)
And as the following example from McCawley (1981: 108) shows, extraction out of rela-
tive clauses is possible even in English:
(31) Then you look at what happens in languages you know and languages that𝑖 you
have a friend [who knows _𝑖 ]. (Charles Ferguson, lecture at the University of Chicago,
1971)
Rizzi’s parameterization of the subjacency restriction has been abandoned in many
works, and the relevant effects have been ascribed to differences in other areas of gram-
mar (Adams 1984; Chung & McCloskey 1983; Grimshaw 1986; Kluender 1992).
We have seen in this subsection that there are reasons other than syntactic properties
of structure as to why leftward movement might be blocked. In addition to information
structural properties, processing considerations also play a role (Grosu 1973; Ellefson &
Christiansen 2000; Gibson 1998; Kluender & Kutas 1993; Hawkins 1999; Sag, Hofmeis-
ter & Snider 2007). The length of constituents involved, the distance between filler and
gap, definiteness, complexity of syntactic structure and interference effects between sim-
ilar discourse referents in the space between the filler and gap are all important factors
for the acceptability of utterances. Since languages differ with regard to their syntactic
structure, varying effects of performance, such as the ones found for extraposition and
extraction, are to be expected.
19
Discussing the question of whether UG-based approaches are falsifiable, Crain, Khlentzos & Thornton
(2010: 2669) claim that it is not possible to extract from relative clauses and the existence of such languages
would call into question the very concept of UG. (“If a child acquiring any language could learn to extract
linguistic expressions from a relative clause, then this would seriously cast doubt on one of the basic tenets
of UG.”) They thereby contradict Evans and Levinson as well as Tomasello, who claim that UG approaches
are not falsifiable. If the argumentation of Crain, Khlentzos and Thornton were correct, then (30) and (31)
would falsify UG and that would be the end of the discussion.
In sum, we can say that subjacency constraints do not hold for extraposition in either
German or English and furthermore that one can better explain constraints on extrac-
tion with reference to information structure and processing phenomena than with the
Subjacency Principle. Assuming subjacency as a syntactic constraint in a universal com-
petence grammar is therefore unnecessary to explain the facts.
20
The question of whether these categories form part of UG is left open.
additional word classes ideophone, positional, coverb, classifier for the analysis of
non-Indo-European languages on top of the four or five normally used.21 This situation is
not a problem for UG-based theories if one assumes that languages can choose from an
inventory of possibilities (a toolkit) but do not have to exhaust it (Jackendoff 2002: 263;
Newmeyer 2005: 11; Fitch, Hauser & Chomsky 2005: 204; Chomsky 2007: 6–7; Cinque &
Rizzi 2010: 55, 58, 65). However, if we condone this view, then there is a certain arbitrari-
ness. It is possible to assume any parts of speech that one requires for the analysis of at
least one language, attribute them to UG and then claim that most (or maybe even all)
languages do not make use of the entire set of parts of speech. This is what is suggested
by Villavicencio (2002: 157), working in the framework of Categorial Grammar, for the
categories S, NP, N, PP and PRT. This kind of assumption is not falsifiable (see Riemsdijk
1978: 148; Evans & Levinson 2009a: 436; Tomasello 2009: 471 for a discussion of similar
cases and a more general discussion).
Whereas Evans and Levinson assume that one needs additional categories, Haspel-
math (2009: 458) and Croft (2009: 453) go so far as to deny the existence of cross-linguis-
tic parts of speech. I consider this to be too extreme and believe that a better research
strategy is to try and find commonalities between languages.22 One should, however,
expect to find languages that do not fit into our Indo-European-biased conceptions of
grammar.
Das Verfahren der Sprache ist aber nicht bloß ein solches, wodurch eine einzelne Er-
scheinung zustande kommt; es muss derselben zugleich die Möglichkeit eröffnen,
eine unbestimmbare Menge solcher Erscheinungen und unter allen, ihr von dem
Gedanken gestellten Bedingungen hervorzubringen. Denn sie steht ganz eigentlich
einem unendlichen und wahrhaft grenzenlosen Gebiete, dem Inbegriff alles Denk-
baren gegenüber. Sie muss daher von endlichen Mitteln einen unendlichen Ge-
brauch machen, und vermag dies durch die Identität der gedanken- und sprache-
erzeugenden Kraft. (Humboldt 1988: 108)
If we just look at the data, we can see that there is an upper bound for the length of
utterances. This has to do with the fact that extremely long instances cannot be pro-
cessed and that speakers have to sleep or will eventually die at some point. If we set a
generous maximal sentence length at 100,000 morphemes and assume a morpheme
inventory of size X, then one can form fewer than X^{100,000} utterances. We arrive at the number
X^{100,000} if we use each of the morphemes at each of the 100,000 positions. Since not all
of these sequences will be well-formed, there are actually fewer than X^{100,000} possible
utterances (see also Weydt 1972 for a similar but more elaborate argument). This
number is incredibly large, but still finite. The same is true of thought: we do not have
infinitely many possible thoughts (if infinitely is used in the mathematical sense of the
word), despite claims by Humboldt and Chomsky (2008: 137) to the contrary.25
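The counting argument can be replayed on a small scale. The following sketch is an illustration under stated assumptions: the three-morpheme inventory, the sequence length and the stand-in well-formedness test (no morpheme may directly follow itself) are invented for this example and are not a claim about any actual language. It enumerates every sequence of a fixed length n and shows that the well-formed ones are a finite subset of the X^n conceivable sequences.

```python
import math
from itertools import product


def count_sequences(inventory, length, is_well_formed):
    """Count all sequences of exactly `length` morphemes and the well-formed ones among them."""
    total = 0
    accepted = 0
    for sequence in product(inventory, repeat=length):
        total += 1
        if is_well_formed(sequence):
            accepted += 1
    return total, accepted


def no_immediate_repetition(sequence):
    """Hypothetical stand-in for well-formedness: no morpheme directly follows itself."""
    return all(a != b for a, b in zip(sequence, sequence[1:]))


def decimal_digits_of_power(base, exponent):
    """Decimal digits of base ** exponent via logarithms (a floating-point estimate)."""
    return math.floor(exponent * math.log10(base)) + 1


inventory = ["the", "cat", "sleeps"]   # toy inventory, X = 3
total, accepted = count_sequences(inventory, 4, no_immediate_repetition)
print(total, accepted)                  # 81 conceivable sequences, 24 accepted

# For a hypothetical inventory of 20,000 morphemes and 100,000 positions the
# bound is far too large to enumerate, but its number of digits is easy to estimate.
print(decimal_digits_of_power(20_000, 100_000))
```

The bound has hundreds of thousands of digits under these assumed values, which is astronomically large and yet finite.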
In the literature, one sometimes finds the claim that it is possible to produce infinitely
long sentences (see for instance Nowak, Komarova & Niyogi (2001: 117) and Kim & Sells
25
Weydt (1972) discusses Chomsky’s statements regarding the existence of infinitely many sentences and
whether it is legitimate for Chomsky to refer to Humboldt. Chomsky’s quote in Current Issues in Linguistic
Theory (Chomsky 1964a: 17) leaves out the sentence Denn sie steht ganz eigentlich einem unendlichen und
wahrhaft grenzenlosen Gebiete, dem Inbegriff alles Denkbaren gegenüber. Weydt (1972: 266) argues that
Humboldt, Bühler and Martinet claimed that there are infinitely many thoughts that can be expressed.
Weydt claims that it does not follow that sentences may be arbitrarily long. Instead he suggests that there
is no upper bound on the length of texts. This claim is interesting, but I guess texts are just the next bigger
unit and the argument that Weydt put forward against languages without an upper bound for sentence
length also applies to texts. A text can be generated by the rather simplified rule in (i) that combines an
utterance U with a text T resulting in a larger text T:
(i) T → T U
U can be a sentence or another phrase that can be part of a text. If one is ready to admit that there is
no upper bound on the length of texts, it follows that there cannot be an upper bound on the length of
sentences either, since one can construct long sentences by joining all phrases of a text with and. Such
long sentences that are the product of conjoining short sentences are different in nature from very long
sentences that are admitted under the Chomskyan view in that they do not include center self-embeddings
of an arbitrary depth (see Chapter 15), but nevertheless the number of sentences that can be produced from
arbitrarily long texts is infinite.
As for arbitrarily long texts there is an interesting problem: Let us assume that a person produces
sentences and keeps adding them to an existing text. This enterprise will be interrupted when the human
being dies. One could say that another person could take up the text extension until this one dies and so on.
Again the question is whether one can understand the meaning and the structure of a text that is several
million pages long. 42. If this is not enough of a problem, one may ask oneself whether the language of the
person who keeps adding to the text in the year 2731 is still the same as the one that the person who started the text
spoke in 2015. If the answer to this question is no, then the text is not a document containing sentences
from one language L but a mix from several languages and hence irrelevant for the debate.
(2008: 3) and Dan Everett in O’Neill & Wood (2012) at 25:19). This is most certainly not
the case. It is also not the case that the rewrite grammars we encountered in Chapter 2
allow for the creation of infinite sentences as the set of symbols of the right-hand side
of the rule has to be finite by definition. While it is possible to derive an infinite number
of sentences, the sentences themselves cannot be infinite, since it is always one symbol
that is replaced by finitely many other symbols and hence no infinite symbol sequence
may result.
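This reasoning can be checked mechanically on a toy rewrite grammar. The grammar below (S → a S b | a b) is a hypothetical example, not one of the grammars from Chapter 2, and the function name is likewise an assumption. The sketch always chooses the recursive rule, so the derivation never has to stop, yet after k steps the symbol string has at most 1 + k(m − 1) symbols, where m is the length of the longest right-hand side.

```python
# Hypothetical toy grammar: each nonterminal maps to a list of right-hand sides.
GRAMMAR = {"S": [["a", "S", "b"], ["a", "b"]]}
LONGEST_RHS = max(len(rhs) for options in GRAMMAR.values() for rhs in options)


def derive(steps):
    """Rewrite the leftmost nonterminal `steps` times, always using its first rule.

    Returns the symbol string after each step, starting with the start symbol.
    """
    string = ["S"]
    history = [list(string)]
    for _ in range(steps):
        position = next((i for i, s in enumerate(string) if s in GRAMMAR), None)
        if position is None:          # only terminals left, derivation is complete
            break
        string[position:position + 1] = GRAMMAR[string[position]][0]
        history.append(list(string))
    return history


history = derive(1000)
for k, string in enumerate(history):
    # One symbol is replaced by at most LONGEST_RHS symbols per step.
    assert len(string) <= 1 + k * (LONGEST_RHS - 1)
print(len(history[-1]))   # 2001 symbols after 1000 steps: long, but finite
```

Raising the number of steps makes the string longer without limit, but any particular string produced after finitely many steps is itself finite.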
Chomsky (1965: Section I.1) follows de Saussure (1916b) and draws a distinction be-
tween competence and performance: competence is the knowledge about what kind of
linguistic structures are well-formed, and performance is the application of this knowl-
edge (see Section 12.6.3 and Chapter 15). Our restricted brain capacity as well as other
constraints are responsible for the fact that we cannot deal with an arbitrary amount of
embedding and that we cannot produce utterances longer than 100,000 morphemes. The
separation between competence and performance makes sense and allows us to formu-
late rules for the analysis of sentences such as (32):
(32) a. Richard is sleeping.
b. Karl suspects that Richard is sleeping.
c. Otto claims that Karl suspects that Richard is sleeping.
d. Julius believes that Otto claims that Karl suspects that Richard is sleeping.
e. Max knows that Julius believes that Otto claims that Karl suspects that Richard
is sleeping.
The rule takes the following form: combine a noun phrase with a verb of a certain class
and a clause. By applying this rule successively, it is possible to form strings of arbitrary
length. Pullum & Scholz (2010) point out that one has to keep two things apart: the
question of whether language is a recursive system and whether it is just the case that the
best models that we can devise for a particular language happen to be recursive. For more
on this point and on processing in the brain, see Luuk & Luuk (2011). When constructing
strings of words using the system above, it cannot be shown that (a particular) language
is infinite, even if this is often claimed to be the case (Bierwisch 1966: 105–106; Pinker
1994: 86; Hauser, Chomsky & Fitch 2002: 1571; Müller 2007a: 1; Hornstein, Nunes &
Grohmann 2005: 7; Kim & Sells 2008: 3).
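The rule described above (combine a noun phrase with a verb of a certain class and a clause) can be sketched in a few lines of Python. This is an illustration only: the names EMBEDDERS and sentence are invented for the example, and cycling through the four subject-verb pairs of (32) is an assumption made so that any depth can be requested. The sketch shows that each string produced is finite, while the rule itself sets no maximal depth.

```python
# Hypothetical mini-lexicon built from the words in (32).
# The rule is "S -> NP V that S", with the base case "Richard is sleeping".
EMBEDDERS = [
    ("Karl", "suspects"),
    ("Otto", "claims"),
    ("Julius", "believes"),
    ("Max", "knows"),
]


def sentence(depth: int) -> str:
    """Build the sentence with `depth` levels of clausal embedding."""
    if depth == 0:
        return "Richard is sleeping"
    # Cycle through the attitude verbs so that any depth can be requested.
    subject, verb = EMBEDDERS[(depth - 1) % len(EMBEDDERS)]
    return f"{subject} {verb} that {sentence(depth - 1)}"


# Reproduces (32a-e).
for d in range(5):
    print(sentence(d) + ".")
```

Each application of the rule adds three words, so the length grows with the depth, but every individual output remains a finite string.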
The “proof” of this infinitude of language proceeds as an indirect proof parallel to the proof
that shows that there is no largest natural number (Bierwisch 1966: 105–106; Pinker 1994:
86). In the domain of natural numbers, this works as follows: assume 𝑥 is the largest
natural number. Then form 𝑥 + 1 and, since this is by definition a natural number, we
have now found a natural number that is greater than 𝑥. We have therefore shown that
the assumption that 𝑥 is the largest natural number leads to a contradiction and thus that there
cannot be such a thing as the largest natural number.
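The steps of this indirect proof can be written out as follows. The layout is a sketch added for illustration; the only premise beyond the assumption to be refuted is that the natural numbers are closed under adding 1.

```latex
\begin{align*}
&\text{Assume there is a largest natural number } x \in \mathbb{N}.\\
&\text{Since } \mathbb{N} \text{ is closed under successor, } x + 1 \in \mathbb{N}.\\
&\text{But } x + 1 > x, \text{ which contradicts the assumption.}\\
&\text{Hence there is no largest natural number.}
\end{align*}
```

The second line is the step that carries the argument: without closure under the extension operation, no contradiction arises.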
When transferring this proof into the domain of natural language, the question arises
as to whether one would still want to class a string of 1,000,000,000 words as part of the
language we want to describe. If we do not want this, then this proof will not work.
13 The innateness of linguistic knowledge
If we view language as a biological construct, then one has to accept the fact that it
is finite. Otherwise, one is forced to assume that it is infinite, but that an infinitely large
part of the biologically real object is not biologically real (Postal 2009: 111). Luuk & Luuk
(2011) refer to languages as physically uncountable but finite sets of strings. They point
out that a distinction must be made between the ability to imagine extending a sentence
indefinitely and the ability to take a sentence from a non-countable set of strings and
really extend it. We possess the first ability but not the second.
One possibility to provide arguments for the infinitude of languages is to claim that
only generative grammars, which create sets of well-formed utterances, are suited to
modeling language and that we need recursive rules to capture the data, which is why
mental representations have a recursive procedure that generates infinite numbers of
expressions (Chomsky, 1956: 115; 2002: 86–87), which then implies that languages consist
of infinitely many expressions. There are two mistakes in this argument that have been
pointed out by Pullum & Scholz (2010): first, even if one assumes generative grammars, it can
still be the case that a context-sensitive grammar generates only a finite set even
with recursive rules. Pullum & Scholz (2010: 120–121) give an interesting example from
András Kornai.
The more important mistake is that it is not necessary to assume that grammars gen-
erate sets. There are three explicitly formalized alternatives of which only the third is
mentioned here, namely the model-theoretic and therefore constraint-based approaches
(see Chapter 14). Johnson & Postal’s Arc Pair Grammar (1980), LFG in the formaliza-
tion of Kaplan (1989), GPSG in the reformalization of Rogers (1997) and HPSG with the
assumptions of King (1999), Pollard (1999) and Richter (2007) are examples of model-
theoretic approaches. In constraint-based theories, one would analyze an example like
(32) saying that certain attitude verbs select a nominative NP and a that clause and that
these can only occur in a certain local configuration where a particular relation holds
between the elements involved. One of these relations is subject-verb agreement. In
this way, one can represent expressions such as (32) and does not have to say anything
about how many sentences can be embedded. This means that constraint-based theories
are compatible with both answers to the question of whether there is a finite or infinite
number of structures. Using competence grammars formulated in the relevant way, it is
possible to develop performance models that explain why certain strings – for instance
very long ones – are unacceptable (see Chapter 15).
with reference to Hale (1976), is Warlpiri. Hale’s rules for the combination of a sentence
with a relative clause are recursive, however (page 85). This recursion is made explicit
on page 98.26 Pullum & Scholz (2010: 131) discuss Hixkaryána, an Amazonian language
from the Cariban language family that is not related to Pirahã. This language does
have embedding, but the embedded material has a different form to that of the matrix
clause. It could be the case that these embeddings cannot be carried out indefinitely.
In Hixkaryána, there is also no possibility to coordinate phrases or clauses (Derbyshire
(1979: 45) cited by Pullum & Scholz (2010: 131)), which is why this possibility of forming
recursive sentence embedding does not exist in this language either. Other languages
without self-embedding seem to be Akkadian, Dyirbal and Proto-Uralic.
There is of course a trivial sense in which all languages are recursive: they follow
a rule that says that a particular number of symbols can be combined to form another
symbol.27
(33) X → X … X
In this sense, all natural languages are recursive and the combination of simple symbols
to more complex ones is a basic property of language (Hockett 1960: 6). The fact that the
debate about Pirahã is so fierce could go to show that this is not the kind of recursion
that is meant. Also, see Fitch (2010).
It is also assumed that the combinatorial rules of Categorial Grammar hold universally.
It is possible to use these rules to combine a functor with its arguments (X/Y ∗ Y = X).
These rules are almost as abstract as the rules in (33). The difference is that one of the
elements has to be the functor. There are also corresponding constraints in the Mini-
malist Program such as selectional features (see Section 4.6.4) and restrictions on the
assignment of semantic roles. However, whether or not a Categorial Grammar licenses
recursive structures does not depend on the very general combinatorial schemata, but
rather on the lexical entries. Using the lexical entries in (34), it is only possible to analyze
the four sentences in (35) and certainly not to build recursive structures. (See Chomsky
(2014) for a similar discussion of a hypothetical “truncated English”.)
26
However, he does note on page 78 that relative clauses are separated from the sentence containing the
head noun by a pause. Relative clauses in Warlpiri are always peripheral, that is, they occur to the left or
right of a sentence with the noun they refer to. Similar constructions can be found in German:
It could be the case that we are dealing with linking of sentences at text level and not recursion at sentence
level.
27
Chomsky (2005: 11) assumes that Merge combines n objects. A special instance of this is binary Merge.
If we expand the lexicon to include modifiers of the category n/n or conjunctions of the
category (X\X)/X, then we arrive at a recursive grammar. For example, if we add the two
lexical items in (36), the grammar licenses sentences like (37):
(36) a. ugly: n/n
b. fat: n/n
The grammar allows for the combination of arbitrarily many instances of fat (or any
other adjectives in the lexicon) with nouns, since the result of combining an n/n with an
n is an n. There is no upper limit on such combinations.
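The effect of adding n/n modifiers can be sketched in Python. The encoding is an assumption made for illustration: an atomic category is a string, a forward functor X/Y is the tuple (X, Y), and only forward application (X/Y ∗ Y = X) is implemented. The entries for ugly and fat follow (36); the noun cat is a hypothetical addition, since the lexicon in (34) is not reproduced here.

```python
from typing import List, Optional, Union

# A category is an atom such as "n" or a forward functor X/Y,
# encoded here as the tuple (X, Y).
Category = Union[str, tuple]

N_MOD: Category = ("n", "n")  # n/n

LEXICON = {
    "ugly": N_MOD,  # (36a)
    "fat": N_MOD,   # (36b)
    "cat": "n",     # hypothetical noun, added for illustration
}


def forward_apply(functor: Category, arg: Category) -> Optional[Category]:
    """X/Y combined with Y yields X; any other pair is not licensed."""
    if isinstance(functor, tuple) and functor[1] == arg:
        return functor[0]
    return None


def analyze(words: List[str]) -> Optional[Category]:
    """Combine the words right to left using forward application only."""
    result = LEXICON[words[-1]]
    for word in reversed(words[:-1]):
        result = forward_apply(LEXICON[word], result)
        if result is None:
            return None
    return result


print(analyze(["fat", "fat", "ugly", "cat"]))  # n
print(analyze(["fat"] * 50 + ["cat"]))         # n
```

Because the result of applying n/n to n is again n, the same lexical item can apply to its own output, which is where the recursion comes from.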
Concerning Pirahã, Everett (2012: 560) stated: “The upper limit of a Piraha sentence
is a lexical frame with modifiers—the verb, its arguments, and one modifier for each of
these. And up to two (one at each edge of the sentence) additional sentence-level or
verb-level prepositional adjuncts” and “there can at most be one modifier per word. You
cannot say in Pirahã ‘many big dirty Brazil-nuts’. You would need to say ‘There are big
Brazil-nuts. There are many. They are dirty’.” The restriction to just one modifier
per noun can be incorporated easily into the lexical items for nominal modifiers. One has
to assume a feature that distinguishes words from non-words. The result of combining
two linguistic objects would be WORD− and words would be WORD+. The lexical item
for modifiers like big would be as in (38):
(38) big with Pirahã-like restriction on word modification:
n/n[WORD+]
The combination of big and Brazil-nuts is n[WORD−]. Since this is incompatible with
n[WORD+] and since all noun modifiers require an n[WORD+], no further modification is
possible. So although the combination rule of Categorial Grammar would allow the cre-
ation of structures of unbounded complexity, the Pirahã lexicon rules out the respective
combinations. Under such a view the issue could be regarded as settled: all languages are
assumed to combine linguistic objects. The combinations are licensed by the combinatorial
rules of Categorial Grammar, by abstract rules like in HPSG or by their respective
equivalents in Minimalism (Merge). Since these rules can apply to their own output they
are recursive.
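The lexical restriction in (38) can be sketched as follows. The encoding is hypothetical: Cat pairs a category label with a boolean for the WORD feature, Modifier records what a modifier selects, and the entries for dirty and Brazil-nuts are added for illustration in parallel to big. The combination rule itself is as general as before; only the lexical entries block a second modifier.

```python
from typing import List, NamedTuple, Optional, Union


class Cat(NamedTuple):
    label: str  # e.g. "n"
    word: bool  # True for WORD+, False for WORD-


class Modifier(NamedTuple):
    result: str  # label of the resulting category
    arg: Cat     # category the modifier selects


# (38): big is n/n[WORD+]; the other two entries are illustrative assumptions.
LEXICON = {
    "big": Modifier("n", Cat("n", True)),
    "dirty": Modifier("n", Cat("n", True)),
    "Brazil-nuts": Cat("n", True),
}


def combine(mod: Modifier, arg: Cat) -> Optional[Cat]:
    """Apply a modifier; the output of any combination is WORD-."""
    if mod.arg != arg:
        return None
    return Cat(mod.result, False)


def analyze(words: List[str]) -> Optional[Union[Cat, Modifier]]:
    """Combine the words right to left; None means 'not licensed'."""
    result = LEXICON[words[-1]]
    for word in reversed(words[:-1]):
        result = combine(LEXICON[word], result)
        if result is None:
            return None
    return result


print(analyze(["big", "Brazil-nuts"]))           # Cat(label='n', word=False)
print(analyze(["dirty", "big", "Brazil-nuts"]))  # None
```

The second call fails because big Brazil-nuts is n[WORD−], while dirty selects n[WORD+].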
Concluding this subsection, it can be said that the existence of languages like Pirahã
is not problematic for the assumption that all languages use rules to combine functors
with arguments. However, it is problematic for claims stating that all languages allow
for the creation of sentences with unbounded length and that recursive structures (NPs
containing NPs, Ss containing Ss, …) can be found in all languages.
Fitch, Hauser & Chomsky (2005: 203) note that the existence of languages that do not
license recursive structures is not a problem for UG-based theories as not all the possi-
bilities in UG have to be utilized by an individual language. Similar claims were made
with respect to part of speech and other morpho-syntactic properties. It was argued that
UG is a toolbox and languages can choose which building blocks they use. As Evans &
Levinson (2009a: 436, 443) and Tomasello (2009: 471) noticed, the toolbox approach is
problematic as one can posit any number of properties belonging to UG and then decide
on a language by language basis whether they play a role or not. An extreme variant
of this approach would be that grammars of all languages become part of UG (perhaps
with different symbols such as NPSpanish , NPGerman ). This variant of a UG-based theory
of the human capacity for language would be truly unfalsifiable.
In the first edition of this book (Müller 2016), I followed the view of Evans & Levinson
and Tomasello, but I want to revise my view here. The criticism applies to things like
part of speech (Section 13.1.7) since it is true that the claim that 400 and more parts of
speech are part of our genetic endowment (Cinque & Rizzi 2010: 55, 57) cannot really be
falsified, but the situation is different here. All grammar formalisms that were covered
in this book are capable of analyzing recursive structures (see Section 18.3). If Pirahã
lacks recursive structures one could use a grammar formalism with lower generative ca-
pacity (see Chapter 17 on generative capacity) to model the grammar of Pirahã. Parts of
the development of theoretical linguistics were driven by the desire to find formalisms
of the right computational complexity to describe human languages. GPSG (with certain
assumptions) was equivalent to context-free grammars. When Shieber (1985)
and Culy (1985) discovered data from Swiss German and Bambara it was clear that context-free
grammars are not sufficiently powerful to describe all known languages. Hence,
GPSG was not powerful enough and researchers moved on to more powerful formalisms
like HPSG. Other frameworks like CCG, TAG, and Minimalist Grammars were shown
to be powerful enough to handle so-called mildly context-sensitive grammars, which are
needed for Swiss German and Bambara. Now, we are working on languages like English
using grammar formalisms that can deal with mildly context-sensitive grammars even
though English may be context free (Pullum & Rawlins 2007). This is similar to the situa-
tion with Pirahã: even though English does not have cross-serial dependencies like Swiss
German, linguists use tools that could license them. Even though Pirahã does not have
recursive structures, linguists use tools that could license them. In general there are two
possibilities: one can have very general combinatory rules and rich specifications in the
lexicon or one can have very specific combinatory rules and less information in the lexi-
con.28 If one assumes that the basic combinatory rules are abstract (Minimalism, HPSG,
TAG, CG), the difference between Pirahã and English is represented in the lexicon only.
Pirahã uses the combinatory potential differently. In this sense, Chomsky (2014) is right
in saying that the existence of Pirahã is irrelevant to the discussion of what languages
can do. Chomsky also notes that Pirahã people are able to learn other languages that
have recursive structures. So in principle, they can understand and produce more complex
structures in much the same way as children of English parents are able to learn
Swiss German.
So the claim that all languages are infinite and make use of self-embedding recursive
structures is probably falsified by languages like Pirahã, but using recursive rules for
the description of all languages is probably a good decision. But even if we assume that
recursive rules play a role in the analysis of all natural languages, would this mean that
the respective rules and capacities are part of our genetic, language-specific endowment?
This question is dealt with in the next subsection.
28
Personally, I argue for a middle way: Structures like Jackendoff’s N-P-N constructions (2008) are analyzed
by concrete phrasal constructions. Verb-argument combinations of the kind discussed in the first part of
this book are analyzed with abstract combinatory schemata. See Müller (2013c) and Chapter 21.
29
Pinker & Jackendoff (2005: 230) note, however, that navigation differs from the kind of recursive system
described by Chomsky and that recursion is not part of counting systems in all cultures. They assume
that those cultures that have developed infinite counting systems could do this because of their linguistic
capabilities. This is also assumed by Fitch, Hauser & Chomsky (2005: 203). The latter authors claim that all
forms of recursion in other domains depend on language. For more on this point, see Chomsky (2007: 7–
8). Luuk & Luuk (2011) note that natural numbers are defined recursively, but the mathematical definition
does not necessarily play a role for the kinds of arithmetic operations carried out by humans.
13.2 Speed of language acquisition
domain-specific abilities (Faculty of Language in the Narrow Sense = FLN) required for
language.
13.1.9 Summary
In sum, we can say that there are no linguistic universals for which there is a consensus
that one has to assume domain-specific innate knowledge to explain them. At the 2008
meeting of the Deutsche Gesellschaft für Sprachwissenschaft, Wolfgang Klein promised
€100 to anyone who could name a non-trivial property that all languages share (see
Klein 2009). This raises the question of what is meant by ‘trivial’. It seems clear that
all languages share predicate-argument structures and dependency relations in some
sense (Hudson 2010b; Longobardi & Roberts 2010: 2701) and that all languages have complex
expressions whose meaning can be determined compositionally (Manfred Krifka was
promised €20 for coming up with compositionality). However, as has been noted at
various points, universality by no means implies innateness (Bates 1984: 189; Newmeyer
2005: 205): Newmeyer gives the example that words for sun and moon probably exist in
all languages. This has to do with the fact that these celestial bodies play an important
role in everyone’s lives and thus one needs words to refer to them. It cannot be concluded
from this that the corresponding concepts have to be innate. Similarly, a word that is
used to express a relation between two objects (e.g., catch) has to be connected to the
words describing both of these objects (I, elephant) in a transparent way. However, this
does not necessarily entail that this property of language is innate.
Even if we can find structural properties shared by all languages, this is still not proof
of innate linguistic knowledge, as these similarities could be traced back to other factors.
It is argued that all languages must be made in such a way as to be acquirable with the
paucity of resources available to small children (Hurford 2002: Section 10.7.2; Behrens
2009: 433). It follows from this that, in the relevant phases of its development, our brain
is a constraining factor. Languages have to fit into our brains and since our brains are
similar, languages are also similar in certain respects (see Kluender 1992: 251).
about the physical world around us. For example, the kind of knowledge we need when
we want to pour liquids into a container, skip with a skipping rope or the knowledge
we have about the ballistic properties of objects. The complexity in comparing these
domains of knowledge in order to be able to make claims about language acquisition
may turn out to be far from trivial. For an in-depth discussion of this aspect, see Sampson
(1989: 214–218). Müller & Riemer (1998: 1) point out that children at the age of six can
understand 23,700 words and use over 5000. It follows from this that, in the space of
four and a half years, they learn on average 14 new words every day. This is indeed an
impressive feat, but cannot be used as an argument for innate linguistic knowledge as
all theories of acquisition assume that words have to be learned from data rather than
being predetermined by a genetically-determined Universal Grammar. In any case the
assumption of genetic encoding would be highly implausible for newly created words
such as fax, iPod, e-mail, Tamagotchi.
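The figure of 14 words per day can be checked with a short calculation. It assumes a 365-day year and takes the comprehension vocabulary of 23,700 words as the quantity acquired over the four and a half years mentioned above.

```latex
\[
\frac{23{,}700 \text{ words}}{4.5 \text{ years} \times 365 \text{ days/year}}
= \frac{23{,}700}{1642.5} \approx 14.4 \text{ words per day}
\]
```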
Furthermore, the claim that first language acquisition is effortless and rapid when
compared to second language acquisition is a myth as has been shown by estimations
by Klein (1986: 9): if we assume that children hear linguistic utterances for five hours a
day (as a conservative estimate), then in the first five years of their lives, they have 9100
hours of linguistic training. But at the age of five, they have still not acquired all com-
plex constructions. In comparison, second-language learners, assuming the necessary
motivation, can learn the grammar of a language rather well in a six-week crash course
with twelve hours a day (500 hours in total).
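The two totals in this estimate can be reproduced with a few lines of Python. The variable names are invented for the example, and the day counts (365 days per year, a course that runs seven days a week) are assumptions that the text does not state explicitly.

```python
# Figures from the estimate attributed to Klein (1986: 9);
# the day counts are assumptions made for this sketch.
HOURS_PER_DAY_CHILD = 5
DAYS_PER_YEAR = 365
YEARS = 5

child_hours = HOURS_PER_DAY_CHILD * DAYS_PER_YEAR * YEARS

WEEKS = 6
DAYS_PER_WEEK = 7          # assumes the crash course runs every day
HOURS_PER_DAY_COURSE = 12

course_hours = WEEKS * DAYS_PER_WEEK * HOURS_PER_DAY_COURSE

print(child_hours)   # 9125, given as roughly 9100 in the text
print(course_hours)  # 504, given as roughly 500 in the text
print(round(child_hours / course_hours, 1))  # 18.1
```

On these assumptions, the child's exposure is about eighteen times that of the crash course.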
13.4 Lack of acquisition among non-human primates
that a second language is learned significantly worse from the age of 15. Elman, Bates,
Johnson, Karmiloff-Smith, Parisi & Plunkett (1996: 187–188) have, however, pointed out
that there is a different curve for Johnson and Newport’s data that fits the individual
data better. The alternative curve shows no abrupt change but rather a steady decrease
in the ability to learn language and therefore offers no proof of an effect created by a
critical period.
Hakuta, Bialystok & Wiley (2003) evaluate data from a questionnaire of 2,016,317 Span-
ish speakers and 324,444 speakers of Mandarin Chinese that immigrated to the United
States. They investigated which correlations there were between age, the point at im-
migration, the general level of education of the speakers and the level of English they
acquired. They could not identify a critical point in time after which language acquisi-
tion was severely restricted. Instead, there is a steady decline in the ability to learn as
age increases. This can also be observed in other domains: for example, learning to drive
at an older age is much harder.
Summing up, it seems to be relatively clear that a critical period cannot be proven to
exist for second-language acquisition. Sometimes, it is assumed anyway that second-
language acquisition is not driven by an innate UG, but is in fact a learning process
that accesses knowledge already acquired during the critical period (Lenneberg 1967:
176). One would therefore have to show that there is a critical period for first-language
acquisition. This is, however, not straightforward as, for ethical reasons, one cannot ex-
perimentally manipulate the point at which the input is available. We cannot, say, take
20 children and let them grow up without linguistic input to the age of 3, 4, 5, 6, … or
15 and then compare the results. This kind of research is dependent on thankfully very
rare cases of neglect. For example, Curtiss (1977) studied a girl called Genie. At the time,
Genie was 13 years old and had grown up in isolation. She is a so-called feral child. As
Curtiss showed, she was no longer able to learn certain linguistic rules. For an objective
comparison, one would need other test subjects that had not grown up in complete isola-
tion and in inhumane conditions. The only possibility of gaining relevant experimental
data is to study deaf subjects that did not receive any input from a sign language up
to a certain age. Johnson & Newport (1989: 63) carried out relevant experiments with
learners of American Sign Language. It was also shown here that there is a linear decline
in the ability to learn, but nothing like a sudden drop after a certain age or even a
complete loss of the ability to acquire language.
edge and could constitute an important prerequisite for the development of language
(p. 179).
The question is, however, whether we differ from other primates in having special
cognitive capabilities that are specific to language or whether our capability to acquire
languages is due to domain-general differences in cognition. Fanselow (1992b: Section 2)
speaks of a human-specific formal competence that does not necessarily have to be spe-
cific to language, however. Similarly, Chomsky (2007: 7–8) has considered whether
Merge (the only structure-building operation, in his opinion), does not belong to lan-
guage-specific innate abilities, but rather to general human-specific competence (see,
however, Section 13.1.8, in particular footnote 29).
One can ascertain that non-human primates do not understand particular pointing
gestures. Humans like to imitate things. Other primates also imitate, but not for
social reasons (Tomasello 2006b: 9–10). According to Tomasello et al. (2005: 676), only
humans have the ability and motivation to carry out coordinated activities with common
goals and socially-coordinated action plans. Primates do understand intentional actions;
however, only humans act with a common goal in mind (shared intentionality). Only humans
use and understand hand gestures (Tomasello et al. 2005: 685, 724, 726). Language
is collaborative to a high degree: symbols are used to refer to objects and sometimes also
to the speaker or hearer. In order to be able to use this kind of communication system,
one has to be able to put oneself in the shoes of the interlocutor and develop common
expectations and goals (Tomasello et al. 2005: 683). Non-human primates could thus lack
the social and cognitive prerequisites for language, that is, the difference between hu-
mans and other primates does not have to be explained by innate linguistic knowledge
(Tomasello 2003: Section 8.1.2; Tomasello et al. 2005).
13.6 Localization in special parts of the brain
Bickerton’s claims have been criticized as it cannot be verified whether children had
input in the individual languages of the adults (Samarin 1984: 207; Seuren 1984: 209).
All that can be said considering this lack of evidence is that there are a number of de-
mographic facts that suggest that this was the case for at least some creole languages
(Arends 2008). This means that children did not only have the strings from the pid-
gin languages as an input but also sentences from the individual languages spoken by
parents and others around them. Many creolists assume that adults contribute specific
grammatical forms to the emerging language. For example, in the case of Hawaiian Cre-
ole English one can observe that there are influences from the mother tongues of the
speakers involved: Japanese speakers use SOV order as well as SVO and Filipinos use
VOS order as well as SVO order. In total, there is quite a lot of variation in the language
that can be traced back to the various native languages of the individual speakers.
It is also possible to explain the effects observed for creolization without the assump-
tion of innate language-specific knowledge: the fact that children regularize language
can be attributed to a phenomenon independent of language. In experiments, partici-
pants were shown two light bulbs and the test subjects had to predict which of the light
bulbs would be turned on next. If one of the bulbs was switched on 70% of the time,
the participants also picked this one 70% of the time (although they would have actually
had a higher success rate if they had always chosen the bulb turned on with 70% proba-
bility). This behavior is known as Probability Matching. If we add another light bulb to
this scenario and then turn this lamp on in 70% of cases and the other two each 15% of
the time, then participants choose the more frequently lit one 80–90% of the time, that
is, they regularize in the direction of the most frequent occurrence (Gardner 1957; Weir
1964).
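The remark that participants would have done better by always choosing the more frequent bulb can be checked with a small calculation. The sketch below assumes that the lights and the guesses are independent from trial to trial; the function names are invented for the example.

```python
def expected_accuracy(light_probs, choice_probs):
    """Chance of a correct guess when lights and guesses are independent."""
    return sum(p * q for p, q in zip(light_probs, choice_probs))


def maximize(light_probs):
    """Choice distribution that always picks the most frequent light."""
    best = max(range(len(light_probs)), key=light_probs.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(light_probs))]


two_bulbs = [0.7, 0.3]
three_bulbs = [0.7, 0.15, 0.15]

for probs in (two_bulbs, three_bulbs):
    matching = expected_accuracy(probs, probs)  # Probability Matching
    maximizing = expected_accuracy(probs, maximize(probs))
    print(probs, round(matching, 3), round(maximizing, 3))
```

With two bulbs, matching yields an expected success rate of 0.58 against 0.7 for always choosing the frequent bulb; with three bulbs, matching would yield 0.535, so moving toward the most frequent option is the more successful strategy.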
Children regularize more than adults (Hudson & Newport 1999; Hudson Kam & New-
port 2005), a fact that can be traced back to their limited brain capacity (“less is more”-
hypothesis, Newport 1990; Elman 1993).
A situation similar to creolization can be found in certain social contexts with the
acquisition of sign language: Singleton & Newport (2004) have shown that a child (Simon)
who learned American Sign Language (ASL) makes considerably fewer mistakes than
his parents. The parents first learned ASL at the age of 15 or 16 and performed partic-
ular obligatory movements only 70% of the time. Simon made these movements 90%
of the time. He regularized the input from his parents, whereby the consistent use of
form-meaning pairs plays an important role, that is, he does not simply use Probability
Matching, but learns selectively. Singleton & Newport (2004: 401) suspect that these
kinds of regularizations also play a role for the emergence of creole and sign languages.
However, the relevant statistical data that one would need to confirm this hypothesis
are not available.
cessing (see Friederici (2009) for a current overview). Chomsky talks about there being
a center of language and even calls this (metaphorically) an organ (Chomsky 1977: 164;
Chomsky 2005: 1; Chomsky 2008: 133). This localization was seen as evidence for the
innate basis for our linguistic knowledge (see also Pinker 1994: 297–314).
However, it is the case that if these parts are damaged, other areas of the brain can
take over the relevant functions. If the damage occurs in early childhood, language can
also be learned without these special areas of the brain (for sources, see Dąbrowska 2004:
Section 4.1).
Apart from that, it can also be observed that a particular area of the brain is activated
when reading. If the conclusion about the localization of processing in a particular part of
the brain leading to the innateness of linguistic knowledge were valid, then the activation
of certain areas of the brain during reading should also lead us to conclude that the ability
to read is innate (Elman et al. 1996: 242; Bishop 2002: 57). This is, however, not assumed
(see also Fitch, Hauser & Chomsky 2005: 196).
It should also be noted that language processing affects several areas of the brain and
not just Broca’s and Wernicke’s areas (Fisher & Marcus 2005: 11; Friederici 2009). On
the other hand, Broca’s and Wernicke’s areas are also active during non-linguistic tasks
such as imitation, motor coordination and processing of music (Maess et al. 2001). For
an overview and further sources, see Fisher & Marcus (2005).
Musso et al. (2003) investigated brain activity during second-language acquisition.
They gave German native speakers data from Italian and Japanese and noticed that there
was activation in Broca’s area. They then compared this to artificial languages that used
Italian and Japanese words but did not correspond to the principles of Universal Gram-
mar as assumed by the authors. An example of the processes assumed in their artificial
language is the formation of questions by reversing the word order as shown in (39).
(39) a. This is a statement.
b. Statement a is this?
The authors then observed that different areas of the brain were activated when learning
this artificial language. This is an interesting result, but does not show that we have
innate linguistic knowledge. It only shows that the areas that are active when processing
our native languages are also active when we learn other languages and that playing
around with words such as reversing the order of words in a sentence affects other areas
of the brain.
A detailed discussion of localization of languages in particular parts of the brain can
be found in Dąbrowska (2004: Chapter 4).
13.7 Differences between language and general cognition
or that there are people of normal intelligence whose linguistic ability is restricted, then
one can show that language and general cognition are independent.
directly responsible for the development of organs or areas of organs but rather regulates
a cascade of different genes. FoxP2 can therefore not be referred to as the language
gene; it is just a gene that interacts with other genes in complex ways. It is, among other
things, important for our language ability. However, calling FoxP2 a language gene makes
no more sense than connecting a hereditary muscle disorder with a ‘walking gene’ just
because this myopathy prevents upright walking
(Bishop 2002: 58). A similar argument can be found in Karmiloff-Smith (1998: 392): there
is a genetic defect that leads some people to begin to lose their hearing from the age of
ten and become completely deaf by age thirty. This genetic defect causes changes in the
hairs inside the ear that one requires for hearing. In this case, one would also not want
to talk about a ‘hearing gene’.
Fitch, Hauser & Chomsky (2005: 190) are also of the opinion that FoxP2 cannot be
responsible for linguistic knowledge. For an overview of this topic, see Bishop (2002)
and Dąbrowska (2004: Section 6.4.2.2) and for genetic questions in general, see Fisher &
Marcus (2005).
13.8 Poverty of the Stimulus
31
On page 281, I discussed a description that corresponds to the S symbol in phrase structure grammars. If
one omits the specification of head features in this description, then one gets a description of all complete
phrases, that is, also the man or now. See also Ginzburg & Sag (2000: Section 8.1.4) for a unary branching
rule that projects an utterance fragment to a sentential category incorporating the utterance context. See
Nykiel & Kim (2020) for further details and references on ellipsis in HPSG.
After every input, one can guess that the language is V∗ − 𝜎, where 𝜎 stands for the
alphabetically first sequence with the shortest length that has not yet been seen. If the
sequence in question occurs later, then this hypothesis is revised accordingly. In this
way, one will eventually arrive at the correct language.
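The guessing procedure just described can be sketched as follows. This is only an illustration under stated assumptions: the function names, the two-letter vocabulary and the sample text are hypothetical, and the empty sequence is counted as a member of V∗.

```python
from itertools import count, product

def shortlex(alphabet):
    """Yield all strings over `alphabet`: shortest first, then alphabetically."""
    letters = sorted(alphabet)
    for length in count(0):
        for combo in product(letters, repeat=length):
            yield "".join(combo)

def guess_missing(seen, alphabet):
    """Return sigma, the first string in shortlex order not observed so far.
    The learner's current hypothesis is then the language V* minus sigma."""
    for s in shortlex(alphabet):
        if s not in seen:
            return s

def learn(text, alphabet):
    """Return the learner's guess for sigma after each input sentence."""
    seen, guesses = set(), []
    for sentence in text:
        seen.add(sentence)
        guesses.append(guess_missing(seen, alphabet))
    return guesses

# Hypothetical target: V* minus "b" over V = {a, b}; the text never contains "b".
guesses = learn(["", "a", "aa", "ab", "ba"], "ab")
print(guesses)  # ['a', 'b', 'b', 'b', 'b']
```

The first guess is revised once the sequence a is encountered; from then on the learner stays with b, which never appears in a text for this target.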
If we expand the set of languages from which we have to choose by V∗ , then our
learning process will no longer work since, if V∗ is the target language, then the guessing
will perpetually yield incorrect results. If there were a procedure capable of learning
this language class, then it would have to correctly identify V∗ after a certain number of
inputs. Let us assume that this input is 𝑥_𝑘. How can the learning procedure tell us at this
point that the language we are looking for is not V∗ − 𝑥_𝑗 for 𝑗 ≠ 𝑘? If 𝑥_𝑘 causes one to
guess the wrong grammar V∗, then every input that comes after that will be compatible
with both the correct (V∗ − 𝑥_𝑗) and the incorrect (V∗) result. Since we only have positive
data, no input allows us to distinguish between the two hypotheses and provide the
information that we have found a superset of the language we are looking for. Gold has
shown that none of the classes of grammars assumed in the theory of formal languages
(for example, regular, context-free and context-sensitive languages) can be identified
in a finite number of steps given the input of a text with example utterances. This is
true for all classes of languages that contain all finite languages and at least one infinite
language. The situation is different if positive and negative data are used for learning
instead of text.
The conclusion that has been drawn from Gold’s results is that, for language acquisi-
tion, one requires knowledge that helps to avoid particular hypotheses from the start.
Pullum (2003) criticizes the use of Gold’s findings as evidence that linguistic
knowledge must be innate. He lists a number of assumptions that have to be made in
order for Gold’s results to be relevant for the acquisition of natural languages. He then
shows that none of these is uncontroversial.
4. Learners could work in terms of improvements. If one allows for a certain degree
of tolerance, then acquisition is easier and it even becomes possible to learn the
class of recursively enumerable languages (Wharton 1974).
Furthermore, Pullum notes that it is also possible to learn the class of context-sensitive
grammars with Gold’s procedure with positive input only in a finite number of steps if
there is an upper bound 𝑘 for the number of rules, where 𝑘 is an arbitrary number. It
is possible to make 𝑘 so big that the cognitive abilities of the human brain would not
be able to use a grammar with more rules than this. Since it is normally assumed that
natural languages can be described by context-sensitive grammars, it can therefore be
shown that the syntax of natural languages in Gold’s sense can be learned from texts
(see also Scholz & Pullum 2002: 195–196).
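Why a bound on the number of rules helps can be illustrated with a toy learner for a finite class of languages. The sketch below is not Gold's or Pullum's formulation. It assumes a hypothetical class of three languages, represented as finite sets that stand in for the languages of the finitely many grammars with at most 𝑘 rules, and a learner that always picks a smallest language consistent with the data.

```python
def minimal_consistent(seen, language_class):
    """Pick a language that contains all data seen so far and has no proper
    subset in the class that is also consistent (avoids overgeneralizing)."""
    consistent = [L for L in language_class if seen <= L]
    for L in consistent:
        if not any(other < L for other in consistent):
            return L
    return None  # the data fit no language in the class

# Hypothetical toy class (assumption for illustration only).
L1 = frozenset({"a"})
L2 = frozenset({"a", "ab"})
L3 = frozenset({"a", "ab", "abb"})
CLASS = [L3, L2, L1]

seen, guesses = set(), []
for sentence in ["a", "ab", "a", "abb", "ab"]:  # a text for L3
    seen.add(sentence)
    guesses.append(minimal_consistent(seen, CLASS))

print([sorted(g) for g in guesses])
```

Because the class is finite, the learner can change its mind only finitely often, and since it never guesses a superset while a smaller consistent language is available, positive data suffice to correct every wrong guess.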
Johnson (2004) adds that there is another important point that has been overlooked
in the discussion about language acquisition. Gold’s problem of identifiability is differ-
ent from the problem of language acquisition that has played an important role in the
nativism debate. In order to make the difference clear, Johnson differentiates between
identifiability (in the Goldian sense) and learnability in the sense of language acquisi-
tion. Identifiability for a language class C means that there must be a function 𝑓 that for
each environment 𝐸 for each language 𝐿 in 𝐶 permanently converges on hypothesis 𝐿
as the target language in a finite amount of time.
Johnson proposes the following as the definition of learnability (p. 585): A class 𝐶
of natural languages is learnable iff, given almost any normal human child and almost
any normal linguistic environment for any language 𝐿 in 𝐶, the child will acquire 𝐿 (or
something sufficiently similar to 𝐿) as a native language between the ages of one and five.
Johnson adds the caveat that this definition does not correspond to any theory of learn-
ability in psycholinguistics, but rather it is a hint in the direction of a realistic conception
of acquisition.
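The two definitions can be set side by side in a semi-formal notation. This is only a sketch of the paraphrases given above: the symbols Env(𝐿), NormEnv(𝐿), Most and acquires are hypothetical shorthand introduced here, and the relative order of the quantifiers in the second formula is one possible reading.

```latex
% Identifiability (in the Goldian sense): a single function f must
% eventually and permanently converge on L in every environment E for L.
\[
\exists f \;\; \forall L \in C \;\; \forall E \in \mathrm{Env}(L) \;\;
\exists t \;\; \forall t' \geq t : \; f(E_{1..t'}) = L
\]

% Learnability (Johnson 2004: 585): almost every normal child c in almost
% every normal environment E acquires L, or something sufficiently similar,
% between the ages of one and five.
\[
\forall L \in C \;\; \mathrm{Most}\; c \;\; \mathrm{Most}\; E \in \mathrm{NormEnv}(L) \;\;
\exists L' \approx L : \; \mathrm{acquires}(c, E, L') \text{ between ages 1 and 5}
\]
```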
Johnson notes that in most interpretations of Gold’s theorem, identifiability and learn-
ability are viewed as one and the same and shows that this is not logically correct: the
main difference between the two depends on the use of two quantifiers. Identifiability
of one language 𝐿 from a class 𝐶 requires that the learner converges on 𝐿 in every envi-
ronment after a finite amount of time. This time can differ greatly from environment to
environment. There is not even an upper bound for the time in question. It is straightforward
to construct a sequence of environments 𝐸_1, 𝐸_2, … for 𝐿, so that a learner in the
environment 𝐸_𝑖 will not guess 𝐿 earlier than the time 𝑡_𝑖. Unlike identifiability, learnability
means that there is a point in time after which in every normal environment, every
normal child has converged on the correct language. This means that children acquire
their language after a particular time span. Johnson quotes Morgan (1989: 352) claiming
that children learn their native language after they have heard approximately 4,280,000
sentences. If we assume that the concept of learnability has a finite upper-bound for
available time, then very few language classes can be identified in the limit. Johnson has
shown this as follows: let 𝐶 be a class of languages containing 𝐿 and 𝐿′, where 𝐿 and
𝐿′ have some elements in common. It is possible to construct a text such that the first 𝑛
sentences are contained both in 𝐿 and in 𝐿′. If the learner has 𝐿 as its working hypothesis,
then the text is continued with sentences from 𝐿′; if it has 𝐿′ as its hypothesis, then
the text is continued with sentences from 𝐿. In each case, the learner has entertained a false
hypothesis after 𝑛 steps. This means that identifiability is not a plausible model for language
acquisition.
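The construction can be made concrete with a small sketch. The two languages, the learner and all names below are hypothetical; the point is only that whatever the learner guesses after the shared prefix, the text can be continued so that the guess was wrong.

```python
def adversarial_continuation(learner, shared_prefix, L, L_prime):
    """After the shared prefix, choose as target whichever language the
    learner is NOT currently guessing and continue with its sentences."""
    hypothesis = learner(shared_prefix)
    target = L_prime if hypothesis == L else L
    continuation = sorted(target - set(shared_prefix))
    return hypothesis, target, continuation

# Hypothetical languages with the elements "a" and "b" in common.
L = frozenset({"a", "b", "c"})
L_PRIME = frozenset({"a", "b", "d"})

def cautious_learner(data):
    # Hypothetical learner: assumes L unless it has already seen "d".
    return L_PRIME if "d" in data else L

hypothesis, target, continuation = adversarial_continuation(
    cautious_learner, ["a", "b"], L, L_PRIME)
print(hypothesis == target)  # False
print(continuation)          # ['d']
```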
Aside from the fact that identifiability is psychologically unrealistic, it is not compatible
with learnability (Johnson 2004: 586). For identifiability, only one learner has to
be found (the function 𝑓 mentioned above); learnability, however, quantifies over (almost)
all normal children. If one keeps all factors constant, then it is easier to show the
identifiability of a language class rather than its learnability. On the one hand, identi-
fiability quantifies universally over all environments, regardless of whether these may
seem odd or of how many repetitions these may contain. Learnability, on the other hand,
has (almost) universal quantification exclusively over normal environments. Therefore,
learnability refers to fewer environments than identifiability, such that there are fewer
possibilities for problematic texts that could occur as an input and render a language
unlearnable. Furthermore, learnability is defined in such a way that the learner does not
have to learn 𝐿 exactly, but rather learn something sufficiently similar to 𝐿. With respect
to this aspect, learnability is a weaker property of a language class than identifiability.
Therefore, learnability does not follow from identifiability, nor does the reverse hold.
Finally, Gold is dealing with the acquisition of syntactic knowledge without taking
semantic knowledge into consideration. However, children possess a vast amount of
information from the context that they employ when acquiring a language (Tomasello
et al. 2005). As pointed out by Klein (1986: 44), humans do not learn anything if they are
placed in a room and sentences in Mandarin Chinese are played to them. Language is
acquired in a social and cultural context.
In sum, one should note that the existence of innate linguistic knowledge cannot be
derived from mathematical findings about the learnability of languages.
T stands for tense, M for a modal verb and en stands for the participle morpheme
(-en in been/seen/… and -ed in rained). The brackets here indicate the optionality of the
expressions. Kimball notes that it is only possible to formulate this rule if (42h) is well-
formed. If this were not the case, then one would have to reorganize the material in
rules such that the three cases (M)(have+en), (M)(be+ing) and (have+en)(be+ing) would
be covered. Kimball assumes that children master the complex rule since they know that
sentences such as (42h) are well-formed and since they know the order in which modal
and auxiliary verbs must occur. Kimball assumes that children do not have positive
evidence for the order in (42h) and concludes from this that the knowledge about the
rule in (43) must be innate.
Pullum and Scholz note two problems with this Poverty of the Stimulus Argument:
first, they have found hundreds of examples, among them some from children’s stories,
so that Kimball’s claim that sentences such as (42h) are “vanishingly rare” should
32
Also, see Abney (1996: 7) for examples from the Wall Street Journal.
be called into question. For PSA arguments, one should at least specify how many oc-
currences there are allowed to be if one still wants to claim that nothing can be learned
from them (Pullum & Scholz 2002: 29).
The second problem is that it does not make sense to assume that the rule in (43)
plays a role in our linguistic knowledge. Empirical findings have shown that this rule is
not descriptively adequate. If the rule in (43) is not descriptively adequate, then it cannot
achieve explanatory adequacy and therefore, one no longer has to explain how it can be
acquired.
Instead of a rule such as (43), all theories discussed here currently assume that auxil-
iary or modal verbs embed a phrase, that is, one does not have an Aux node containing
all auxiliary and modal verbs, but rather a structure for (42h) that looks as follows:
(44) It [may [have [been raining]]].
Here, the auxiliary or modal verb always selects the embedded phrase. The acquisition
problem now looks completely different: a speaker has to learn the form of the head
verb in the verbal projection selected by the auxiliary or modal verb. If this information
has been learned, then it is irrelevant how complex the embedded verbal projections
are: may can be combined with a non-finite lexical verb (42b) or a non-finite auxiliary
(42c,d).
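The selection-based view can be sketched in a few lines. The lexicon, the form labels (fin, bse, psp, prp) and the function name are hypothetical and not taken from any of the theories discussed; passive be and the finiteness of the first verb are ignored.

```python
# Hypothetical mini-lexicon: word -> (lemma, verb form).
LEXICON = {
    "may": ("may", "fin"), "has": ("have", "fin"), "is": ("be", "fin"),
    "have": ("have", "bse"), "be": ("be", "bse"), "rain": ("rain", "bse"),
    "been": ("be", "psp"), "rained": ("rain", "psp"),
    "raining": ("rain", "prp"),
}
# Form that each auxiliary or modal requires of the head of its complement.
SELECTS = {"may": "bse", "have": "psp", "be": "prp"}

def well_formed(verbs):
    """Check a verb chain pairwise: each selecting verb constrains only the
    form of the next verb, never the rest of the chain."""
    for head, complement in zip(verbs, verbs[1:]):
        lemma, _ = LEXICON[head]
        _, form = LEXICON[complement]
        if SELECTS.get(lemma) != form:
            return False
    return True

print(well_formed(["may", "have", "been", "raining"]))  # True
print(well_formed(["may", "rain"]))                     # True
print(well_formed(["may", "been", "raining"]))          # False
```

Three lexical facts are enough here. No rule mentions the complete sequence of modal and auxiliaries, so the longest chain comes for free once the individual selection requirements are known.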
This means that there is nothing to learn with regard to the well-formedness of the
structure in (46). Furthermore, the available data for acquiring the fact that one can
refer to larger constituents is not as hopeless as Baker (p. 416) claims: there are examples
that only allow an interpretation where one refers to a larger string of words. Pullum
and Scholz offer examples from various corpora. They also provide examples from the
CHILDES corpus, a corpus that contains communication with children (MacWhinney
1995). The following example is from a daytime TV show:
(48) A: “Do you think you will ever remarry again? I don’t.”
B: “Maybe I will, someday. But he’d have to be somebody very special. Sensitive
and supportive, giving. Hey, wait a minute, where do they make guys like
this?”
A: “I don’t know. I’ve never seen one up close.”
Here, it is clear that one cannot refer to guys since A has certainly already seen guys.
Instead, it refers to guys like this, that is, men who are sensitive and supportive.
Once again, the question arises here as to how many instances a learner has to hear
for it to count as evidence in the eyes of proponents of the PSA.
speakers of English only rarely or even never produce examples such as (49d) (Chom-
sky in Piattelli-Palmarini (1980: 114–115)). With the help of corpus data and plausibly
constructed examples, Pullum (1996) has shown that this claim is clearly wrong. Pul-
lum (1996) provides examples from the Wall Street Journal and Pullum & Scholz (2002)
discuss the relevant examples in more detail and add to them with examples from the
CHILDES corpus showing not only that adult speakers can produce the relevant kinds of
sentences, but also that these occur in the child’s input.34 Examples from CHILDES that
disprove the hypothesis that the first auxiliary has to be fronted are given in (50):35
(50) a. Is the ball you were speaking of in the box with the bowling pin?
b. Where’s this little boy who’s full of smiles?
c. While you’re sleeping, shall I make the breakfast?
Pullum and Scholz point out that wh-questions such as (50b) are also relevant if one
assumes that these are derived from polar questions (see page 97 in this book) and if one
wishes to show how the child can learn the structure-dependent hypothesis. This can be
explained with the examples in (51): the base form from which (51a) is derived is (51b). If
we were to front the first auxiliary in (51b), we would produce (51c).
(51) a. Where’s the application Mark promised to fill out?36
b. the application Mark [AUX PAST] promised to fill out [AUX is] there
c. * Where did the application Mark promised to fill out is?
Evidence for the fact that (51c) is not correct can, however, also be found in language
addressed to children. Pullum and Scholz provide the examples in (52):37
(52) a. Where’s the little blue crib that was in the house before?
b. Where’s the other dolly that was in here?
c. Where’s the other doll that goes in there?
These questions have the form Where’s NP?, where NP contains a relative clause.
In (50c), there is another clause preceding the actual interrogative, an adjunct clause
containing an auxiliary as well. This sentence therefore provides evidence for the falsehood
of the hypothesis that the linearly first auxiliary must be fronted (Sampson 1989: 223).
In total, there are a number of attested sentence types in the input of children that
would allow them to choose between the two hypotheses. Once again, the question
arises as to how much evidence should be viewed as sufficient.
34
For more on this point, see Sampson (1989: 223). Sampson cites part of a poem by William Blake, that is
studied in English schools, as well as a children’s encyclopedia. These examples surely do not play a role
in acquisition of auxiliary position since this order is learned at the age of 3;2, that is, it has already been
learned by the time children reach school age.
35
See Lewis & Elman (2001). Researchers on language acquisition agree that the frequency of this kind of
examples in communication with children is in fact very low. See Ambridge et al. (2008: 223).
36
From the transcription of a TV program in the CHILDES corpus.
37
These sentences are taken from NINA05.CHA in DATABASE/ENG/SUPPES/.
Pullum and Scholz’s article has been criticized by Lasnik & Uriagereka (2002) and
Legate & Yang (2002). Lasnik and Uriagereka argue that the acquisition problem is much
bigger than presented by Pullum and Scholz since a learner without any knowledge
about the language he was going to acquire could entertain not just the hypotheses in (53)
that were already discussed but also the additional hypotheses in (54):
(53) a. Place the first auxiliary at the front of the clause.
b. Place the first auxiliary in matrix-Infl at the front of the clause.
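The difference between the two hypotheses in (53) can be illustrated by applying both to the declarative the boy who is smoking is crazy. This is a minimal sketch: the auxiliary list and function names are hypothetical, and the structure-dependent rule is simply handed the boundary between the subject NP and the rest of the clause rather than computing it.

```python
AUXILIARIES = {"is", "can", "will", "was"}  # hypothetical word list

def front_first_auxiliary(words):
    """Linear rule in the spirit of (53a): move the leftmost auxiliary."""
    i = next(k for k, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]

def front_matrix_auxiliary(subject, rest):
    """Structure-dependent rule in the spirit of (53b): move the auxiliary
    that follows the complete subject NP, whatever the NP contains."""
    assert rest[0] in AUXILIARIES
    return [rest[0]] + subject + rest[1:]

subject = ["the", "boy", "who", "is", "smoking"]
rest = ["is", "crazy"]

linear = " ".join(front_first_auxiliary(subject + rest))
structural = " ".join(front_matrix_auxiliary(subject, rest))
print(linear)      # is the boy who smoking is crazy
print(structural)  # is the boy who is smoking crazy
```

Only declaratives whose subject contains an auxiliary distinguish the two rules, which is why the frequency of such sentences in the input matters for the argument.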
The position of auxiliaries in English is learned by children at the age of 3;2. Accord-
ing to Legate and Yang, another acquisition phenomenon that is learned at the age of
3;2 is needed for comparison. The authors focus on subject drop,38 which is learned at 36
months (two months earlier than auxiliary inversion). According to the authors, acquisition
problems involve a binary decision: in the first case, one has to choose between
the two hypotheses in (53). In the second case, the learner has to determine whether
a language uses overt subjects. The authors assume that the use of expletives such as
there serves as evidence for learners that the language they are learning is not one with
optional subjects. They then count the sentences in the CHILDES corpus that contain
there-subjects and estimate F2 at 1.2 % of the sentences heard by the learner. Since, in
their opinion, we are dealing with equally difficult phenomena here, sentences such as
(49d) and (52) should constitute 1.2 % of the input in order for auxiliary inversion to be
learnable.
The authors then searched in the Nina and Adam corpora (both part of CHILDES) and
note that 0.068 to 0.045 % of utterances have the form of (52) and none have the form of
(49d). They conclude that this number is not sufficient as positive evidence.
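To get a sense of the size of the gap Legate and Yang point to, the two reported rates can be compared with their 1.2 % benchmark. The ratios below are simple arithmetic on the figures cited above, not numbers reported by the authors:

```latex
\[
\frac{1.2\,\%}{0.068\,\%} \approx 17.6
\qquad\qquad
\frac{1.2\,\%}{0.045\,\%} \approx 26.7
\]
```

On their assumptions, the relevant sentences are thus roughly 18 to 27 times rarer than the evidence they take to be required.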
Legate and Yang are right in pointing out that Pullum and Scholz’s data from the Wall
Street Journal are not necessarily relevant for language acquisition and also in pointing
out that examples with complex subject noun phrases do not occur in the data or occur
only to a negligible degree. There are, however, three serious problems with their
argumentation: first, there is no correlation between the occurrence of expletive subjects
and the property of being a pro-drop language: Galician (Raposo & Uriagereka 1990:
Section 2.5) is a pro-drop language with subject expletive pronouns; in Italian, there is
an existential expletive ci,39 even though Italian counts as a pro-drop language; and Franks
(1995) lists Upper and Lower Sorbian as pro-drop languages that have expletives in subject
position. Since expletive pronouns therefore have nothing to do with the pro-drop
parameter, their frequency is irrelevant for the acquisition of a parameter value. If there
were a correlation between the possibility of omitting subjects and the occurrence of sub-
ject expletives, then Norwegian and Danish children should learn that there has to be a
38
This phenomenon is also called pro-drop. For a detailed discussion of the pro-drop parameter see Sec-
tion 16.1.
39
However, ci is not treated as an expletive by all authors. See Remberger (2009) for an overview.
subject in their languages earlier than children learning English since expletives occur
a higher percentage of the time in Danish and Norwegian (Scholz & Pullum 2002: 220).
In Danish, the constructions corresponding to there-constructions in English are twice
as frequent. It is still unclear whether there are actually differences in rate of acquisition
(Pullum 2009: 246).
Second, in constructing their Poverty of the Stimulus argument, Legate and Yang as-
sume that there is innate linguistic knowledge (the pro-drop parameter). Therefore their
argument is circular since it is supposed to show that the assumption of innate linguistic
knowledge is indispensable (Scholz & Pullum 2002: 220).
The third problem in Legate and Yang’s argumentation is that they assume that a
transformational analysis is the only possibility. This becomes clear from the following
citation (Legate & Yang 2002: 153):
The correct operation for question formation is, of course, structure dependent: it
involves parsing the sentence into structurally organized phrases, and fronting the
auxiliary that follows the subject NP, which can be arbitrarily long:
The analysis put forward by Chomsky (see page 97) is a transformation-based one, that
is, a learner has to learn exactly what Legate and Yang describe: the auxiliary must move
in front of the subject noun phrase. There are, however, alternative analyses that do not
require transformations or equivalent mechanisms. If our linguistic knowledge does not
contain any information about transformations, then their claim about what has to be
learned is wrong. For example, one can assume, as in Categorial Grammar, that auxil-
iaries form a word class with particular distributional properties. One possible placement
for them is initial positions as observed in questions, the alternative is after the subject
(Villavicencio 2002: 104). There would then be the need to acquire information about
whether the subject is realized to the left or to the right of its head. As an alternative to
this lexicon-based analysis, one could pursue a Construction Grammar (Fillmore 1988:
44; 1999; Kay & Fillmore 1999: 18), Cognitive Grammar (Dąbrowska 2004: Chapter 9), or
HPSG (Ginzburg & Sag 2000; Sag et al. 2020) approach. In these frameworks, there are
simply two40 schemata for the two sequences that assign different meanings according
to the order of verb and subject. The acquisition problem is then that the learners have
to identify the corresponding phrasal patterns in the input. They have to realize that
Aux NP VP is a well-formed structure in English that has interrogative semantics. The
relevant theories of acquisition in the Construction Grammar-oriented literature have
been very well worked out (see Sections 16.3 and 16.4). Construction-based theories of
acquisition are also supported by the fact that one can see that there are frequency ef-
fects, that is, auxiliary inversion is first produced by children for just a few auxiliaries
and only in later phases of development is it then extended to all auxiliaries. If speakers
40
Fillmore (1999) assumes subtypes of the Subject Auxiliary Inversion Construction since this kind of inver-
sion does not only occur in questions.
have learned that auxiliary constructions have the pattern Aux NP VP, then the coordi-
nation data provided by Lasnik and Uriagereka in (58) no longer pose a problem since, if
we only assign the first conjunct to the NP in the pattern Aux NP VP, then the rest of the
coordinate structure (and those who are not coming) remains unanalyzed and cannot be
incorporated into the entire sentence. The hearer is thereby forced to revise his assump-
tion that will those who are coming corresponds to the sequence Aux NP in Aux NP VP
and instead to use the entire NP those who are coming and those who are not coming. For
acquisition, it is therefore enough to simply learn the pattern Aux NP VP first for some
and then eventually for all auxiliaries in English. This has also been shown by Lewis &
Elman (2001), who trained a neural network exclusively with data that did not contain
NPs with relative clauses in auxiliary constructions. Relative clauses were, however,
present in other structures. The complexity of the training material was increased bit
by bit just as is the case for the linguistic input that children receive (Elman 1993).41 The
neural network can predict the next symbol after a sequence of words. For sentences
with interrogative word order, the predictions are correct. Even the relative pronoun in
(60) is predicted despite the sequence Aux Det N Relp never occurring in the training
material.
(60) Is the boy who is smoking crazy?
Furthermore, the system signals an error if the network is presented with the ungram-
matical sentence (61):
(61) * Is the boy who smoking is crazy?
A present participle is not expected after the relative pronoun, but rather a finite verb.
The constructed neural network is of course not yet an adequate model of what is going
on in our heads during acquisition and speech production.42 The experiment shows,
however, that the input that the learner receives contains rich statistical information
that can be used when acquiring language. Lewis and Elman point out that the statistical
information about the distribution of words in the input is not the only information that
speakers have. In addition to information about distribution, they are also exposed to
information about the context and can make use of phonological similarities in words.
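How distributional information alone can support such predictions may be illustrated with a much cruder model than the one used in the study. The sketch below is not Lewis and Elman's network: it substitutes a simple bigram count model over word categories, and the category labels and training sequences are hypothetical. As in the experiment, no auxiliary-initial training sequence contains a relative clause.

```python
from collections import defaultdict

# Hypothetical training data as category sequences.
TRAINING = [
    ["Aux", "Det", "N", "Adj"],                          # Is the boy crazy?
    ["Det", "N", "Relp", "Aux", "V-ing", "Aux", "Adj"],  # The boy who is smoking is crazy.
    ["Det", "N", "Aux", "V-ing", "Adj"],                 # The boy is feeling sick.
]

counts = defaultdict(lambda: defaultdict(int))
for sequence in TRAINING:
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1

def predicted(prefix):
    """Categories that followed the last category of `prefix` in training."""
    return set(counts[prefix[-1]])

def first_error(sequence):
    """Index of the first category never seen after its predecessor."""
    for i in range(1, len(sequence)):
        if sequence[i] not in predicted(sequence[:i]):
            return i
    return None

grammatical = ["Aux", "Det", "N", "Relp", "Aux", "V-ing", "Adj"]
ungrammatical = ["Aux", "Det", "N", "Relp", "V-ing", "Aux", "Adj"]
print("Relp" in predicted(["Aux", "Det", "N"]))  # True
print(first_error(grammatical))                  # None
print(first_error(ungrammatical))                # 4
```

A bigram model conditions on a single preceding category only, so it also accepts many sequences that a network with memory for longer contexts would reject. The sketch shows only that the relative pronoun after Aux Det N is predictable from noun contexts elsewhere in the data.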
In connection with the ungrammatical sentence in (61), it has been claimed that the
fact that such sentences are never produced shows that children already know that
grammatical operations are structure-dependent and this is why they do not entertain
the hypothesis that it is simply the linearly first verb that is moved (Crain & Nakayama
1987). The claim simply cannot be verified since children do not normally form the relevant
complex utterances. It is therefore only possible to experimentally elicit utterances
where they could make the relevant mistakes. Crain & Nakayama (1987) have carried out
41
There are cultural differences. In some cultures, adults do not talk to children that have not attained
full linguistic competence (Ochs 1982; Ochs & Schieffelin 1985) (also see Section 13.8.4). Children have to
therefore learn the language from their environment, that is, the sentences that they hear reflect the full
complexity of the language.
42
See Hurford (2002: 324) and Jackendoff (2007: Section 6.2) for problems that arise for certain kinds of neural
networks and Pulvermüller (2003; 2010) for an alternative architecture that does not have these problems.
such experiments. Their study has been criticized by Ambridge, Rowland & Pine (2008)
since these authors could show that children really do make mistakes when fronting
auxiliaries. The authors attribute the difference from the results of Crain and
Nakayama’s study to an unfortunate choice of auxiliary in that study. Due
to the use of the auxiliary is, the ungrammatical examples had pairs of words that never
or only very rarely occur next to each other (who running in (62a)).
(62) a. The boy who is running fast can jump high. →
* Is the boy who running fast can jump high?
b. The boy who can run fast can jump high. →
* Can the boy who run fast can jump high?
If one uses the auxiliary can, this problem disappears since who and run certainly do
appear together. This then leads to the children actually making mistakes that they
should not make, as the incorrect utterances violate a constraint that is supposed
to be part of innate linguistic knowledge.
Estigarribia (2009) investigated English polar questions in particular. He shows that
not even half of the polar questions in children’s input have the form Aux NP VP (p. 74).
Instead, parents communicated with their children in a simplified form and used sen-
tences such as:
(63) a. That your tablet?
b. He talking?
c. That taste pretty good?
Estigarribia divides the various patterns into complexity classes of the following kind:
FRAG (fragmentary), SPRED (subject predicate) and AUX-IN (auxiliary inversion). (64) shows
corresponding examples:
(64) a. coming tomorrow? (FRAG)
b. you coming tomorrow? (SPRED)
c. Are you coming tomorrow? (AUX-IN)
What we see is that the complexity increases from class to class. Estigarribia suggests a
system of language acquisition where simpler classes are acquired before more complex
ones and the latter ones develop from peripheral modifications of more simple classes
(p. 76). He assumes that question forms are learned from right to left (right to left elab-
oration), that is, (64a) is learned first, then the pattern in (64b) containing a subject in
addition to the material in (64a), and then in a third step, the pattern (64c) in which
an additional auxiliary occurs (p. 82). In this kind of learning procedure, no auxiliary
inversion is involved. This view is compatible with constraint-based analyses such as
that of Ginzburg & Sag (2000). A similar approach to acquisition by Freudenthal, Pine,
Aguado-Orea & Gobet (2007) will be discussed in Section 16.3.
A further interesting study has been carried out by Bod (2009b). He shows that it is
possible to learn auxiliary inversion assuming trees with any kind of branching even
if there is no auxiliary inversion with complex noun phrases present in the input. The
procedure he uses as well as the results he obtains are very interesting and will be discussed
in Section 13.8.3 in more detail.
In conclusion, we can say that children do make mistakes with regard to the position
of auxiliaries that they probably should not make if the relevant knowledge were innate.
Information about the statistical distribution of words in the input is enough to learn the
structures of complex sentences without actually having this kind of complex sentences
in the input.
13.8.2.5 Summary
Pullum & Scholz (2002: 19) show what an Argument from Poverty of the Stimulus (APS)
would have to look like if it were constructed correctly:
(65) APS specification schema:
a. ACQUIRENDUM CHARACTERIZATION: describe in detail what is alleged to
be known.
b. LACUNA SPECIFICATION: identify a set of sentences such that if the learner
had access to them, the claim of data-driven learning of the acquirendum
would be supported.
c. INDISPENSABILITY ARGUMENT: give reason to think that if learning were
data-driven, then the acquirendum could not be learned without access to
sentences in the lacuna.
d. INACCESSIBILITY EVIDENCE: support the claim that tokens of sentences in
the lacuna were not available to the learner during the acquisition process.
e. ACQUISITION EVIDENCE: give reason to believe that the acquirendum does
in fact become known to learners during childhood.
As the four case studies have shown, there can be reasons for rejecting the acquirendum.
If the acquirendum does not have to be acquired, then there is no longer any evidence for
innate linguistic knowledge. The acquirendum must at least be descriptively adequate.
This is an empirical question that can be answered by linguists. In three of the four PoS
arguments discussed by Pullum and Scholz, there were parts which were not descrip-
tively adequate. In previous sections, we already encountered other PoS arguments that
involve claims regarding linguistic data that cannot be upheld empirically (for example,
the Subjacency Principle). For the remaining points in (65), interdisciplinary work is
required: the specification of the lacuna falls into formal language theory (the
specification of a set of utterances), the argument of indispensability is a mathematical
task from the realm of learning theory, the evidence for inaccessibility is an empirical
question that can be approached by using corpora, and finally the evidence for acquisi-
tion is a question for experimental developmental psychologists (Pullum & Scholz 2002:
19–20).
Pullum & Scholz (2002: 46) point out an interesting paradox with regard to (65c):
without results from mathematical theories of learning, one cannot achieve (65c). If one
wishes to provide a valid Poverty of the Stimulus Argument, then this should automat-
ically lead to improvements in theories of learning, that is, it is possible to learn more
than was previously assumed.
Figure 13.2: Possible binary-branching structures for Watch the dog and The dog barks.
In the third step, we now have to compute the best tree for each utterance. For The dog
barks., there are two trees in the set of the subtrees that correspond exactly to this ut-
terance. But it is also possible to build structures out of subtrees. There are therefore
multiple derivations possible for The dog barks. all of which use the trees in Figure 13.3:
on the one hand, trivial derivations that use the entire tree, and on the other, derivations
that build trees from smaller subtrees. Figure 13.4 gives an impression of how this
construction of subtrees happens. If we now want to decide which of the analyses in (67)
is the best, then we have to compute the probability of each tree.
(67) a. [[the dog] barks]
b. [the [dog barks]]
Figure 13.4: Analysis of The dog barks using subtrees from Figure 13.3
The probability of a tree is the sum of the probabilities of all its analyses. There are two
analyses for (67b), which can be found in Figure 13.4. The probability of the first analysis
of (67b) corresponds to the probability of choosing exactly the complete tree for [the
[dog barks]] from the set of all subtrees. Since there are twelve subtrees, the probability
of choosing that one is 1/12. The probability of the second analysis is the product of the
probabilities of the subtrees that are combined and is therefore 1/12 × 1/12 = 1/144. The
probability of the analysis in (67b) is therefore 1/12 + (1/12 × 1/12) = 13/144. One can then
calculate the probability of the tree in (67a) in the same way. The only difference here is
that the tree for [the dog] occurs twice in the set of subtrees. Its probability is therefore
2/12. The probability of the tree [[the dog] barks] is therefore: 1/12 + (1/12 × 2/12) =
14/144. We have thus extracted knowledge about plausible structures from the corpus.
This knowledge can also be applied whenever one hears a new utterance for which there
is no complete tree. It is then possible to use already known subtrees to calculate the
probabilities of possible analyses of the new utterance. Bod’s model can also be combined
with weights: sentences that the speaker heard longer ago receive
a lower weight. One can thereby also account for the fact that children do not have all
sentences that they have ever heard available simultaneously. This extension makes the
U-DOP model more plausible for language acquisition.
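The arithmetic for (67a) and (67b) can be retraced in a few lines of Python. This is only a sketch: the counts (twelve subtrees in total, with the subtree for [the dog] occurring twice) are taken from the text, whereas the names TOTAL_SUBTREES, relative_frequency and tree_probability are made up for illustration and do not come from Bod's implementation.

```python
from fractions import Fraction

TOTAL_SUBTREES = 12  # size of the subtree collection, as stated in the text


def relative_frequency(count):
    """Probability of picking a subtree that occurs `count` times."""
    return Fraction(count, TOTAL_SUBTREES)


def tree_probability(derivations):
    """Sum over derivations; within a derivation, the relative
    frequencies of the combined subtrees are multiplied."""
    total = Fraction(0)
    for derivation in derivations:
        prob = Fraction(1)
        for count in derivation:
            prob *= relative_frequency(count)
        total += prob
    return total


# (67b) [the [dog barks]]: the complete tree, or two subtrees that each occur once
p_67b = tree_probability([[1], [1, 1]])
# (67a) [[the dog] barks]: the complete tree, or a frame plus [the dog] (occurs twice)
p_67a = tree_probability([[1], [1, 2]])

print(p_67b == Fraction(13, 144), p_67a == Fraction(14, 144), p_67a > p_67b)
```

Exact fractions are used instead of floating point numbers so that the results can be compared directly with the values 13/144 and 14/144 given above.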
In the example above, we did not assign categories to the words. If we were to do this,
then we would get the tree in Figure 13.5 on the following page as a possible subtree.
These kinds of discontinuous subtrees are important if one wants to capture dependen-
cies between elements that occur in different subtrees of a given tree. Some examples
are the following sentences:
(68) a. BA carried more people than cargo in 2005.
b. What’s this scratch doing on the table?
c. Most software companies in Vietnam are small sized.
It is then also possible to learn auxiliary inversion in English with these kinds of discon-
tinuous trees. All one needs are tree structures for the two sentences in (69) in order to
prefer the correct sentence (70a) over the incorrect one (70b).
U-DOP can learn the structures for (69) in Figure 13.6 on the next page from the sentences
in (71):
Note that these sentences do not contain any instance of the structure in (70a). With
the structures learned here, it is possible to show that the shortest possible derivation
for the position of the auxiliary is also the correct one: the correct order Is the man who
is eating hungry? only requires that the fragments in Figure 13.7 on the facing page
are combined, whereas the structure for * Is the man who eating is hungry? requires at
least four subtrees from Figure 13.6 to be combined with each other. This is shown by
Figure 13.8 on page 504.
The motivation for always taking the derivation that consists of the fewest subparts is
that one maximizes similarity to already known material.
The tree for (72) containing one auxiliary too many can also be created from Figure 13.6
with just two subtrees (with the tree [X isX X] and the entire tree for The man who is eating
is hungry).
(72) * Is the man who is eating is hungry?
Figure 13.6: Structures that U-DOP learned from the examples in (69) and (71)
Figure 13.7: Derivation of the correct structure for combination with an auxiliary using
two subtrees from Figure 13.6
Interestingly, children do produce these kinds of incorrect sentences (Crain & Nakayama
1987: 530; Ambridge, Rowland & Pine 2008). However, if we consider the probabilities
of the subtrees in addition to the number of combined subparts, we get the correct
result, namely (70a) and not (72). This is due to the fact that the man who is eating
occurs in the corpus twice, in (70a) and in (71a). Thus, the probability of the man who
is eating is just as high as the probability of the man who is eating is hungry and thus
the derivation in Figure 13.7 is preferred over the one for (72). This works for the constructed
examples here; however, one can imagine that in a realistic corpus, sequences of the
form the man who is eating are more frequent than sequences with further words since
the man who is eating can also occur in other contexts. Bod has applied this process
Figure 13.8: Derivation of the incorrect structure for the combination with an auxiliary
using two subtrees from Figure 13.6
43
Computational linguistic algorithms for determining parts of speech often look at an entire corpus. But
children are always dealing with just a particular part of it. The corresponding learning process must then
also include a curve of forgetting. See Braine (1987: 67).
only occurrences of words in structures are evaluated. Nothing is said about whether
words stand in a particular regular relationship to one another or not (for example, a
lexical rule connecting a passive participle and perfect participle). Furthermore, nothing
is said about how the meaning of expressions arises (are they rather holistic in the
sense of Construction Grammar or projected from the lexicon?). These are questions
that still concern theoretical linguists (see Chapter 21) and cannot straightforwardly be
derived from the statistical distribution of words and the structures computed from them
(see Section 21.8.1 for more on this point).
A second comment is also needed: we have seen that statistical information can
be used to derive the structure of complex linguistic expressions. This raises the
question of how this relates to Chomsky’s earlier argumentation against statistical approaches
(Chomsky 1957: 16). Abney (1996: Section 4.2) discusses this in detail. The problem
with Chomsky’s earlier argumentation is that it referred to Markov models. These
are statistical versions of finite automata. Finite automata can only describe type 3
languages and are therefore not appropriate for analyzing natural language. However,
Chomsky’s criticism cannot be applied to statistical methods in general.
The child can conclude from the fact that adults use a more involved causative construc-
tion with make that the verb disappear, unlike other verbs such as melt, cannot be used
transitively. An immediately instructive example for the role played by indirect negative
evidence comes from morphology. There are certain productive rules that nevertheless
cannot be applied if there is a word that blocks the application of the rule. An example
is the -er nominalization suffix in German. By adding an -er to a verb stem, one can de-
rive a noun that refers to someone who carries out a particular action (often habitually)
(Raucher ‘smoker’, Maler ‘painter’, Sänger ‘singer’, Tänzer ‘dancer’). However, Stehler
‘stealer’ is very unusual. The formation of Stehler is blocked by the existence of Dieb
‘thief’. Language learners therefore have to infer from the non-existence of Stehler that
the nominalization rule does not apply to stehlen ‘to steal’.
Similarly, a speaker with a grammar of English that does not have any restrictions on
the position of manner adverbs would expect that both orders in (75) are possible (Scholz
& Pullum 2002: 206):
(75) a. call the police immediately
b. * call immediately the police
Learners can conclude indirectly from the fact that verb phrases such as (75b) (almost)
never occur in the input that these are probably not part of the language. This can be
modeled using the relevant statistical learning algorithms.
The examples for the existence of negative evidence provided so far are more arguments
from plausibility. Stefanowitsch (2008) has combined corpus linguistic studies on
the statistical distribution with acceptability experiments and has shown that negative
evidence gained from expected frequencies correlates with acceptability judgments of
speakers. I will briefly discuss this procedure here. Stefanowitsch assumes the following
principle:
(76) Form expectations about the frequency of co-occurrence of linguistic features or
elements on the basis of their individual frequency of occurrence and check these
expectations against the actual frequency of co-occurrence. (Stefanowitsch 2008:
518)
Stefanowitsch works with the part of the International Corpus of English that contains
British English (ICE-GB). In this corpus, the verb say occurs 3,333 times and sentences
with ditransitive verbs (Subj Verb Obj Obj) occur 1,824 times. The total number of verbs
in the corpus is 136,551. If all verbs occurred in all kinds of sentences with the same
frequencies, then we would expect say to occur 44.52 times (X / 1,824 = 3,333 / 136,551
and hence X = 1,824 × 3,333 / 136,551) in the ditransitive construction. But the number of
actual occurrences is 0 since, unlike (77b), sentences such as (77a) are not used
by speakers of English.
(77) a. * Dad said Sue something nice.
b. Dad said something nice to Sue.
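The expected frequency for say in the ditransitive pattern follows directly from principle (76) and the three corpus counts. The following sketch merely recomputes that number; the constant and function names are assumptions chosen for illustration, and the significance test that Stefanowitsch applies is not reproduced here.

```python
# Counts from ICE-GB as reported in the text (Stefanowitsch 2008)
SAY_TOKENS = 3_333            # occurrences of the verb say
DITRANSITIVE_TOKENS = 1_824   # sentences with ditransitive verbs
VERB_TOKENS = 136_551         # all verb tokens in the corpus
OBSERVED = 0                  # say in the ditransitive pattern


def expected_cooccurrence(item_count, construction_count, total):
    """Expected co-occurrence frequency if the item and the construction
    were distributed independently of each other."""
    return item_count * construction_count / total


expected = expected_cooccurrence(SAY_TOKENS, DITRANSITIVE_TOKENS, VERB_TOKENS)
print(round(expected, 2))   # 44.52
print(OBSERVED < expected)  # True
```

The gap between the expected 44.52 occurrences and the observed 0 is what can serve as indirect negative evidence.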
Stefanowitsch shows that the non-occurrence of say in the ditransitive sentence pattern
is significant. Furthermore, he investigated how acceptability judgments compare to the
frequent occurrence or non-occurrence of verbs in certain constructions. In a first exper-
iment, he was able to show that the frequent non-occurrence of elements in particular
constructions correlates with the acceptability judgments of speakers, whereas this is
not the case for the frequent occurrence of a verb in a construction.
In sum, we can say that indirect negative evidence can be derived from linguistic input
and that it seems to play an important role in language acquisition.
13.9 Summary
It follows from all this that not a single one of the arguments in favor of innate linguistic
knowledge remains uncontroversial. This of course does not rule out there still being
innate linguistic knowledge but those who wish to incorporate this assumption into their
theories have to take more care than was previously the case to prove that what they
assume to be innate is actually part of our linguistic knowledge and that it cannot be
learned from the linguistic input alone.
Comprehension questions
1. Which arguments are there for the assumption of innate linguistic knowl-
edge?
Further reading
Pinker’s book (1994) is the best written book arguing for nativist models of lan-
guage.
Elman, Bates, Johnson, Karmiloff-Smith, Parisi & Plunkett (1996) discuss all
the arguments that have been proposed in favor of innate linguistic knowledge
and show that the relevant phenomena can be explained differently. The authors
adopt a connectionist view. They work with neural networks, which
are assumed to model what is happening in our brains relatively accurately. The
book also contains chapters about the basics of genetics and the structure of the
brain, going into detail about why a direct encoding of linguistic knowledge in
our genome is implausible.
Certain approaches using neural networks have been criticized because
they cannot capture certain aspects of human abilities such as recursion or the
multiple usage of the same words in an utterance. Pulvermüller (2010) discusses
an architecture that has memory and uses this to analyze recursive structures. In
his overview article, certain works are cited that show that the existence of more
abstract rules or schemata of the kind theoretical linguists take for granted can
be demonstrated on the neuronal level. Pulvermüller does not, however, assume
that linguistic knowledge is innate (p. 173).
Pullum and Scholz have dealt with the Poverty-of-the-Stimulus argument
in detail (Pullum & Scholz 2002; Scholz & Pullum 2002).
Goldberg (2006) and Tomasello (2003) are the most prominent proponents of
Construction Grammar, a theory that explicitly tries to do without the assump-
tion of innate linguistic knowledge.
14 Generative-enumerative vs.
model-theoretic approaches
Generative-enumerative approaches assume that a grammar generates a set of sequences
of symbols (strings of words). This is where the term Generative Grammar comes from.
Thus, it is possible to use the grammar on page 53, repeated here as (1), to derive the
string er das Buch dem Mann gibt ‘he the book the man gives’.
(1) NP → D, N NP → er N → Buch
S → NP, NP, NP, V D → das N → Mann
D → dem V → gibt
Beginning with the start symbol (S), symbols are replaced until one reaches a sequence of
symbols only containing words. The set of all strings derived in this way is the language
described by the grammar.
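The enumeration of strings just described can be sketched for the grammar in (1). The dictionary encoding of the rules and the function name expand are assumptions made for this illustration; the simple recursion only terminates because the grammar in (1) contains no recursive rules.

```python
from itertools import product

# Grammar (1): each nonterminal maps to its possible right-hand sides.
GRAMMAR = {
    "S": [["NP", "NP", "NP", "V"]],
    "NP": [["D", "N"], ["er"]],
    "D": [["das"], ["dem"]],
    "N": [["Buch"], ["Mann"]],
    "V": [["gibt"]],
}


def expand(symbol):
    """Return all word sequences that can be derived from `symbol`."""
    if symbol not in GRAMMAR:  # a terminal symbol, that is, a word
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # combine every expansion of every symbol on the right-hand side
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


# The language described by the grammar: all strings derivable from S
language = {" ".join(words) for words in expand("S")}
print(len(language))
print("er das Buch dem Mann gibt" in language)
```

The set language contains every string that the grammar generates and nothing else, which is the generative-enumerative view of what a grammar describes.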
The following are classed as generative-enumerative approaches:
• all phrase structure grammars
• Transformational Grammars in almost all variants
• GPSG in the formalism of Gazdar, Klein, Pullum & Sag (1985)
• many variants of Categorial Grammar
• many variants of TAG
• Chomsky’s Minimalist Grammars
LFG was also originally designed to be a generative grammar.
The opposite of such theories of grammar are model-theoretic or constraint-based
approaches (MTA). MTAs formulate well-formedness conditions on the expressions that
the grammar describes. In Section 6.7, we already discussed a model-theoretic approach
for theories that use feature structures to model phenomena. To illustrate this point,
I will discuss another HPSG example: (2) shows the lexical item for kennst ‘know’. In
the description of (2), it is ensured that the PHON value of the relevant linguistic sign is
⟨ kennst ⟩, that is, this value of PHON is constrained. There are parallel restrictions for
the features given in (2): the SYNSEM value is given. In SYNSEM, there are restrictions on
the LOC and NONLOC value. In CAT, there are individual restrictions for HEAD and COMPS.
The value of COMPS is a list with descriptions of dependent elements. The descriptions
are given as abbreviations here, which actually stand for complex feature descriptions
that also consist of feature-value pairs. For the first argument of kennst, a HEAD value
of type noun is required, the PER value in the semantic index has to be second and the
NUM value has to be sg. The structure sharings in (2) are a special kind of constraint.
Values that are not specified in the descriptions of lexical entries can vary in accordance
with the feature geometry given by the type system. In (2), neither the SLASH value of
the nominative NP nor the one of the accusative NP is fixed. This means that SLASH can
either be an empty or non-empty list.
The constraints in lexical items such as (2) interact with further constraints that hold
for the signs of type phrase. For instance, in head-argument structures, the non-head
daughter must correspond to an element from the COMPS list of the head daughter.
Generative-enumerative and model-theoretic approaches view the same problem from
different sides: the generative side only allows what can be generated by a given set of
rules, whereas the model-theoretic approach allows everything that is not ruled out by
constraints.1
Pullum & Scholz (2001: 19–20) and Pullum (2007) list the following model-theoretic
approaches:2
• the non-procedural variant of Transformational Grammar of Lakoff, that formu-
lates constraints on potential tree sequences,
• Johnson and Postal’s formalization of Relational Grammar (1980),
1
Compare this to an old joke: in dictatorships, everything that is not allowed is banned, in democracies,
everything that is not banned is allowed and in France, everything that is banned is allowed. Generative-
enumerative approaches correspond to the dictatorships, model-theoretic approaches are the democracies
and France is something that has no correlate in linguistics.
2
See Pullum (2007) for a historical overview of Model Theoretic Syntax (MTS) and for further references.
14.1 Graded acceptability
• GPSG in the variants developed by Gazdar et al. (1988), Blackburn et al. (1993) and
Rogers (1997),
• LFG in the formalization of Kaplan (1989)3 and
3
According to Pullum (2013: Section 3.2), there seems to be a problem for model-theoretic formalizations of
so-called constraining equations.
4
The reader should take note here: there are differing views with regard to how generative-enumerative
and MTS models are best formalized and not all of the assumptions discussed here are compatible with
every formalism. The following sections mirror the important points in the general discussion.
(4) Studenten stürmen mit Flugblättern und Megafon die Mensa und rufen alle
students storm with flyers and megaphone the canteen and call all
auf zur Vollversammlung in der Glashalle zum kommen. Vielen bleibt das
up to plenary.meeting in the glass.hall to.the come many.DAT stays the
Essen im Mund stecken und kommen sofort mit.5
food in.the mouth stick and come immediately with
‘Students stormed into the university canteen with flyers and a megaphone calling
for everyone to come to a plenary meeting in the glass hall. For many, the food
stuck in their throats and they immediately joined them.’
Chomsky (1975: Chapter 5; 1964b) tried to use a string distance function to determine the
relative acceptability of utterances. This function compares the string of an ungrammat-
ical expression with that of a grammatical expression and assigns an ungrammaticality
score of 1, 2 or 3 according to certain criteria. This treatment is not adequate, however,
as there are much more fine-grained differences in acceptability and the string distance
function also makes incorrect predictions. For examples of this and technical problems
with calculating the function, see Pullum & Scholz (2001: 29).
In model-theoretic approaches, grammar is understood as a system of well-formedness
conditions. The more well-formedness conditions an expression violates, the worse it
becomes (Pullum & Scholz 2001: 26–27). In (3b), the person and number requirements of
the lexical item for the verb kennst are violated. In addition, the case requirements for
the object have not been fulfilled in (3c). There is a further violation of a linearization
rule for the noun phrase in (3d).
Well-formedness conditions can be weighted in such a way as to explain why certain
violations lead to more severe deviations than others (Sorace & Keller 2005). Further-
more, performance factors also play a role when judging sentences (for more on the dis-
tinction between performance and competence, see Chapter 15). As we will see in Chap-
ter 15, constraint-based approaches work very well as performance-compatible grammar
models. If we combine the relevant grammatical theory with performance models, we
will arrive at explanations for graded acceptability differences owing to performance
factors.
5
Streikzeitung der Universität Bremen, 04.12.2003, p. 2. The emphasis is mine.
14.2 Utterance fragments
Figure 14.1: Structure of the fragment “and of the” following Pullum & Scholz (2001: 32)
14.3 A problem for model-theoretic approaches?
(10) a. Niels and Odette are cousins. They are very smart.
b. The cousins/brothers/sisters are standing over there. They are very smart.
No distinctions are found in the plural when it comes to nominal inflection (brothers, sisters,
books). In German, this is different. There are differences with both nominal inflection
and the reference of (some) noun phrases with regard to the sexus of the referent. Ex-
amples of this are the previously mentioned examples Cousin ‘male cousin’ and Cousine
‘female cousin’ as well as forms with the suffix -in as in Kindergärtnerin ‘female nursery
teacher’. However, gender is normally a grammatical notion that has nothing to do with
sexus. An example is the neuter noun Mitglied ‘member’, which can refer to both female
and male persons.
The question that one has to ask when discussing ten Hacken’s problem is the following:
does gender play a role for pronominal binding in German? If this is not the case,
then the gender feature is only relevant within the morphology component, and here
the gender value is determined for each noun in the lexicon. For the binding of personal
pronouns, there is no gender difference in German.
(11) Die Schwestern / Brüder / Vereinsmitglieder / Geschwister stehen dort.
the sisters.F brothers.M club.members.N siblings stand there
Sie lächeln.
they smile.
‘The sisters/brothers/club members/siblings are standing there. They are smiling.’
Nevertheless, there are adverbials in German that agree in gender with the noun to which
they refer (Höhle 1983: Chapter 6):
(12) a. Die Fenster wurden eins nach dem anderen geschlossen.
the windows.N were one.N after the other closed
‘The windows were closed one after the other.’
b. Die Türen wurden eine nach der anderen geschlossen.
the doors.F were one.F after the other closed
‘The doors were closed one after the other.’
c. Die Riegel wurden einer nach dem anderen zugeschoben.
the bolts.M were one.M after the other closed
‘The bolts were closed one after the other.’
For animate nouns, it is possible to diverge from the gender of the noun in question and
use a form of the adverbial that corresponds to the biological sex:
(13) a. Die Mitglieder des Politbüros wurden eines / einer nach dem anderen
the members.N of.the politburo were one.N one.M after the other
aus dem Saal getragen.
out.of the hall carried
‘The members of the politburo were carried out of the hall one after the other.’
Figure 14.2: Type hierarchy for one of the solutions of ten Hacken’s problem
In general, it is clear that cases such as the one constructed by ten Hacken will never be
a problem since there are either values that make sense, or there are contexts for which
there is no value that makes sense and one therefore does not require the features.
So, while ten Hacken’s problem is a non-issue, there are certain problems of a more
technical nature. I have pointed out one such technical problem in Müller (1999b: Sec-
tion 14.4). I show that spurious ambiguities arise for a particular analysis of verbal com-
plexes in German when one resolves the values of a binary feature (FLIP). I also show
how this problem can be avoided by the complicated stipulation of a value in certain
contexts.
Further reading
Pullum & Scholz (2001) is the main reference for a discussion of the model-theoretic
approach in comparison to generative-enumerative approaches.
15 The competence/performance
distinction
The distinction between competence and performance (Chomsky 1965: Section 1.1),
which is assumed by several theories of grammar, was already discussed in Section 12.6.3
about the analysis of scrambling and verbal complexes in TAG. Theories of competence
are intended to describe linguistic knowledge and performance theories are assigned
the task of explaining how linguistic knowledge is used as well as why mistakes are
made in speech production and comprehension. A classic example in the competence/performance
discussion is center self-embedding. Chomsky & Miller (1963:
286) discuss the following example with recursively embedded relative clauses:
(1) (the rat (the cat (the dog chased) killed) ate the malt)
(2b) is a corresponding example in German:
(2) a. dass der Hund bellt, der die Katze jagt, die die Maus kennt, die
that the dog.M barks that.M the cat chases that.F the mouse knows who
im Keller lebt
in.the basement lives
‘that the dog that chases the cat that knows the mouse who is living in the
basement is barking’
b. dass der Hund, [1 der die Katze, [2 die die Maus, [3 die im Keller
that the dog that the cat that the mouse who in.the basement
lebt, 3 ] kennt, 2 ] jagt 1 ] bellt
lives knows chases barks
The examples in (1) and (2b) are entirely incomprehensible for most people. If one re-
arranges the material somewhat, it is possible to process the sentences and assign a
meaning to them.1 For sentences such as (2b), it is often assumed that they fall within
1
The sentence in (2a) can be continued following the pattern that was used to create the sentence. For in-
stance by adding die unter der Treppe lebte, die meine Freunde repariert haben ‘who lived under the staircase
which my friends repaired’. This shows that a restriction of the number of elements that depend on one
head to seven (Leiss 2003: 322) does not restrict the set of the sentences that are generated or licensed by
a grammar to be finite. There are at most two dependents of each head in (2a). The extraposition of the
relative clauses allows the hearer to group material into processable and reducible chunks, which reduces
the cognitive burden during processing.
This means that the restriction to seven dependents does not cause a finitization of recursion (“Verend-
lichung von Rekursivität”) as was claimed by Leiss (2003: 322). Leiss argued that Miller could not use his
insights regarding short term memory, since he worked within Transformational Grammar rather than
in Dependency Grammar. The discussion shows that dependency plays an important role, but that linear
order is also important for processing.
our grammatical competence, that is, we possess the knowledge required to assign a
structure to the sentence, although the processing of utterances such as (2b) exceeds
language-independent abilities of our brain. In order to successfully process (2b), we
would have to retain the first five noun phrases and corresponding hypotheses about
the further progression of the sentence in our heads and could only begin to combine
syntactic material when the verbs appear. Our brains become overwhelmed by this task.
These problems do not arise when analyzing (2a) as it is possible to immediately begin
to integrate the noun phrases into a larger unit.
Nevertheless, center self-embedding of relative clauses can also be constructed in such
a way that our brains can handle them. Hans Uszkoreit (p. c. 2009) gives the following
example:
(3) Die Bänke, [1 auf denen damals die Alten des Dorfes, [2 die allen
the benches on which back.then the old.people of.the village that all
Kindern, [3 die vorbeikamen 3 ], freundliche Blicke zuwarfen 2 ], lange Stunden
children that came.by friendly glances gave long hours
schweigend nebeneinander saßen 1 ], mussten im letzten Jahr einem
silent next.to.each.other sat must in.the last year a
Parkplatz weichen.
car.park give.way.to
‘The benches on which the older residents of the village, who used to give friendly
glances to all the children who came by, used to sit silently next to one another
for hours had to give way to a car park last year.’
Therefore, one does not wish to include in the description of our grammatical knowledge
that relative clauses are not allowed to be included inside each other as in (2b) as this
would also rule out (3).
We can easily accept the fact that our brains are not able to process structures past a
certain degree of complexity and also that corresponding utterances then become unac-
ceptable. The contrast in the following examples is far more fascinating:2
(4) a. # The patient [ who the nurse [ who the clinic had hired ] admitted ] met Jack.
b. * The patient who the nurse who the clinic had hired met Jack.
Although (4a) is syntactically well-formed and (4b) is not, Gibson & Thomas (1999) were
able to show that (4b) is rated better by speakers than (4a). It does not occur to some
people that an entire VP is missing. There are a number of explanations for this fact, all
of which in some way make the claim that previously heard words are forgotten as soon
as new words are heard and a particular degree of complexity is exceeded (Frazier 1985:
178; Gibson & Thomas 1999).
Instead of developing grammatical theories that treat (2b) and (4a) as unacceptable
and (3) and (4b) as acceptable, descriptions have been developed that equally allow (2b),
2
See Gibson & Thomas (1999: 227). Frazier (1985: 178) attributes the discovery of this kind of sentence to
Janet Fodor.
15.1 The derivational theory of complexity
(3), and (4a) (competence models) and then additionally investigate the way utterances
are processed in order to find out what kinds of structures our brains can handle and
what kinds of structures they cannot. The result of this research is then a performance
model (see Gibson (1998), for example). This does not rule out that there are language-specific
differences affecting language processing. For example, Vasishth, Suckow, Lewis
& Kern (2010) have shown that the effects that arise in center self-embedding structures
in German are different from those that arise in the corresponding English cases such as
(4): due to the frequent occurrence of verb-final structures in German, speakers of German
were able to better store predictions about the anticipated verbs in their working
memory (p. 558).
Theories in the framework of Categorial Grammar, GB, LFG, GPSG and HPSG are
theories about our linguistic competence.3 If we want to develop a grammatical theory
that directly reflects our cognitive abilities, then there should also be a corresponding
performance model to go with a particular competence model. In the following two
sections, I will recount some arguments from Sag & Wasow (2011) in favor of constraint-
based theories such as GPSG, LFG and HPSG.
3
For an approach where the parser is equated with UG, see Abney & Cole (1986: Section 3.4). For a perfor-
mance-oriented variant of Minimalism, see Phillips (2003).
In Construction Grammar, the question of whether a distinction between competence and performance
would be justified at all is controversially discussed (see Section 10.6.4.9.1). Fanselow, Schlesewsky, Cavar
& Kliegl (1999) also suggest a model – albeit for different reasons – where grammatical properties con-
siderably affect processing properties. The aforementioned authors work in the framework of Optimality
Theory and show that the OT constraints that they assume can explain parsing preferences. OT is not a
grammatical theory on its own but rather a meta theory. It is assumed that there is a component GEN that
creates a set of candidates. A further component EVAL then chooses the optimal candidate from this
set of candidates. GEN contains a generative grammar of the kind that we have seen in this book. Normally,
a GB/MP variant or also LFG is assumed as the base grammar. If one assumes a transformational theory,
then one automatically has a problem with the Derivational Theory of Complexity that we will encounter
in the following section. If one wishes to develop OT parsing models, then one has to make reference to
representational variants of GB as the aforementioned authors seem to.
Theory of Complexity was in fact correct (Chomsky 1976a: 249–250).4 Some years later,
however, most psycholinguists rejected the DTC. For discussion of several experiments
that provide evidence against the DTC, see Fodor, Bever & Garrett (1974: 320–328). One set of
phenomena for which the DTC makes incorrect predictions under the respective analyses is that
of elliptical constructions (Fodor, Bever & Garrett 1974: 324): in elliptical
constructions, particular parts of the utterance are left out or replaced by auxiliaries. In
transformation-based approaches, it was assumed that (5b) is derived from (5a) by means
of deletion of swims and (5c) is derived from (5b) by inserting do.5
(5) a. John swims faster than Bob swims.
b. John swims faster than Bob.
c. John swims faster than Bob does.
The DTC predicts that (5b) should require more time to process than (5a), since the
analysis of (5b) first requires building up the structure in (5a) and then deleting swims. This
prediction was not confirmed.
Similarly, no difference could be identified for the pairs in (6) and (7) even though one
of the sentences, given the relevant theoretical assumptions, requires more transforma-
tions for the derivation from a base structure (Fodor, Bever & Garrett 1974: 324).
(6) a. John phoned up the girl.
b. John phoned the girl up.
In (6), we are dealing with local reordering of the particle and the object. (7b) contains
a passive clause that should be derived from an active clause under Transformational
Grammar assumptions. If we compare this sentence with an equally long sentence with
4
In the Transformational Grammar literature, transformations were later viewed as a metaphor (Lohnstein
2014: 170, also in Chomsky 2001: Footnote 4), that is, they were no longer assumed to have psycholinguistic
reality. In Derivation by phase and On phases, Chomsky refers once again to processing aspects such as
computational and memory load (Chomsky 2001: 11, 12, 15; 2007: 3, 12; 2008: 138, 145, 146, 155). See also
Marantz (2005: 440) and Richards (2015). Trinh (2011: 17; 2019: 9) cites Chomsky (p.c.) with the following
quote: “As speaking involves cognitive effort, Pronunciation Economy might be derived from the general
principle of minimizing computation.”
A structure building operation that begins with words and is followed by transformations/internal
merge and further combinations, as recently assumed by theories in the Minimalist Program, is psycholin-
guistically implausible for sentence parsing. See Labelle (2007) and Section 15.2 for more on incremental
processing.
Chomsky (2007: 6) (written later than On phases) seems to adopt a constraint-based view. He writes
that “a Merge-based system involves parallel operations” and compares the analysis of an utterance with
a proof and explicitly mentions the competence/performance distinction.
5
Similar analyses are assumed today in the Minimalist Program. For example, Trinh (2011: 63) assumes that
VP ellipsis is deletion at Phonological Form (PF). This means that a complete structure is built which is
then not pronounced. Since he talks about cognitive effort and computation with respect to the activity
of speaking (p. 17), it follows that he regards the structures he is assuming as cognitively real.
15.1 The derivational theory of complexity
an adjective, like (7a), the passive clause should be more difficult to process. This is,
however, not the case.
It is necessary to add two qualifications to Sag & Wasow’s claims: if one has experi-
mental data that show that the DTC makes incorrect predictions for a particular analy-
sis, this does not necessarily mean that the DTC has been disproved. One could also try
to find a different analysis for the phenomenon in question. For example, instead of a
transformation that deletes material, one could assume empty elements for the analysis
of elliptical structures that are inserted directly into the structure without deleting any
material (see page 68 for the assumption of an empty nominal head in structures with
noun ellipsis in German). Data such as (5) would then be irrelevant to the discussion.6
However, reordering such as (6b) and the passive in (7b) are the kinds of phenomena
that are typically explained using transformations.
The second qualification pertains to analyses for which there is a representational
variant: it is often said that transformations are simply metaphors (Jackendoff 2000:
22–23; 2007: 5, 20): for example, we have seen that extractions with a transformational
grammar yield structures that are similar to those assumed in HPSG. Figure 15.1 shows
cyclic movement in GB theory compared to the corresponding HPSG analysis.
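[Figure 15.1: Cyclic movement in GB theory (left-hand tree, with traces _𝑖 in the SpecCP and object positions) compared to the corresponding HPSG analysis (right-hand tree, with the slashed categories CP/NP, VP/NP, V′/NP and NP/NP)]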
In GB, an element is moved to the specifier position of CP (SpecCP) and can then be
moved from there to the next higher SpecCP position.
(8) a. Chris𝑖 , we think [CP _𝑖 Anna claims [CP _𝑖 that David saw _𝑖 ]]. (GB)
b. Chris𝑖 , we think [CP/NP Anna claims [CP/NP that David saw _𝑖 ]]. (HPSG)
In HPSG, the same effect is achieved by structure sharing. Information about a long-
distance dependency is not located in the specifier node but rather in the mother node
6
Culicover & Jackendoff (2005: Chapters 1 and 7) argue in favor of analyzing ellipsis as a semantic or prag-
matic phenomenon rather than a syntactic one anyway.
of the projection itself. In Section 19.2, I will discuss various ways of eliminating empty
elements from grammars. If we apply these techniques to structures such as the GB
structure in Figure 15.1, then we arrive at structures where information about missing
elements is integrated into the mother node (CP) and the position in SpecCP is unfilled.
This roughly corresponds to the HPSG structure in Figure 15.1.7 It follows from this that
there are classes of phenomena that can be spoken about in terms of transformations
without expecting empirical differences with regard to performance when compared to
transformation-less approaches. However, it is important to note that we are dealing
with an S-structure in the left-hand tree in Figure 15.1. As soon as one assumes that this
is derived by moving constituents out of other structures, this equivalence of approaches
disappears.
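The way in which information about a missing element is recorded on the mother node, as in (8b), can be sketched in a few lines of code. This is only an illustration under simplifying assumptions: the class `Node`, the attributes `is_gap` and `binds` and the functions `slash` and `annotate` are hypothetical names, and the tree is a reduced stand-in for (8b) rather than an actual HPSG analysis.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    daughters: List["Node"] = field(default_factory=list)
    is_gap: bool = False          # an extraction site of category `label`
    binds: Optional[str] = None   # category bound off by a filler at this node


def slash(node: Node) -> frozenset:
    """Categories of missing elements that are still unbound at this node."""
    if node.is_gap:
        return frozenset({node.label})
    inherited = frozenset().union(*(slash(d) for d in node.daughters))
    return inherited - {node.binds}


def annotate(node: Node) -> str:
    """Render a node label in the X/Y notation used in (8b)."""
    return node.label + "".join("/" + cat for cat in sorted(slash(node)))


# Reduced stand-in for (8b): Chris, we think [Anna claims [that David saw _]].
gap = Node("NP", is_gap=True)
inner_cp = Node("CP", [Node("C"), Node("NP"), Node("VP", [Node("V"), gap])])
outer_cp = Node("CP", [Node("NP"), Node("VP", [Node("V"), inner_cp])])
clause = Node("S", [Node("NP"), Node("VP", [Node("V"), outer_cp])])
top = Node("S", [Node("NP"), clause], binds="NP")   # the filler Chris binds the gap

print(annotate(gap), annotate(inner_cp), annotate(outer_cp), annotate(top))
```

No node is moved or copied in this sketch: the labels CP/NP are computed from the daughters alone, which is why such a representation can be stated without reference to a derivation.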
15.2 Incremental processing
utterance of even more complexity before it is possible to conclude anything about the
meaning of a phrase/utterance. In particular, analyses in the Minimalist Program which
assume that only entire phrases or so-called phases8 are interpreted (see Chomsky 1999
and also Marantz 2005: 441, who explicitly contrasts the MP to Categorial Grammar)
must therefore be rejected as inadequate from a psycholinguistic perspective.9,10
With contrastive emphasis of individual adjectives in complex noun phrases (e.g., the
BIG blue triangle), hearers assumed that there must be a corresponding counterpart to
the reference object, e.g., a small blue triangle. The eye-tracking studies carried out by
Tanenhaus et al. (1996) have shown that taking this kind of information into account
results in objects being identified more quickly.
Similarly, Arnold et al. (2004) have shown, also using eye-tracking studies, that hearers
tend to direct their gaze to previously unmentioned objects if the interlocutor interrupts
their speech with um or uh. This can be traced back to hearers assuming that
describing previously unmentioned objects is more complex than referring
to objects already under discussion. The speaker can gain more time by
using um or uh.
Examples such as those above constitute evidence for approaches that assume that
when processing language, information from all available channels is used and that this
information is also used as soon as it is available and not only after the structure of the
entire utterance or complete phrase has been constructed. The results of experimental
research therefore show that the hypothesis of a strictly modular organization of linguis-
tic knowledge must be rejected. Proponents of this hypothesis assume that the output
of one module constitutes the input of another without a given module having access to
the inner states of another module or the processes taking place inside it. For example,
the morphology module could provide the input for syntax and then this would be processed
later by the semantic module. One kind of evidence for this kind of organization
of linguistic knowledge that is often cited comes from so-called garden path sentences such as (9):
(9) a. The horse raced past the barn fell.
b. The boat floated down the river sank.
The vast majority of English speakers struggle to process these sentences since their
parser is led down a garden path as it builds up a complete structure for (10a) or (10b) only
then to realize that there is another verb that cannot be integrated into this structure.
8
Usually, only CP and vP are assumed to be phases.
9
Sternefeld (2006: 729–730) points out that in theories in the Minimalist Program, the common assumption
of uninterpretable features is entirely unjustified. Chomsky assumes that there are features that have to
be deleted in the course of a derivation since they are only relevant for syntax. If they are not checked, the
derivation crashes at the interface to semantics. It follows from this that NPs should not be interpretable
under the assumptions of these theories since they contain a number of features that are irrelevant for the
semantics and therefore have to be deleted (see Section 4.1.2 of this book and Richards 2015). As we have
seen, these kinds of theories are incompatible with the facts.
10
It is sometimes claimed that current Minimalist theories are better suited to explain production (generation)
than perception (parsing). But these models are as implausible for generation as they are for parsing. The
reason is that it is assumed that there is a syntax component that generates structures that are then shipped
to the interfaces. This is not what happens in generation though. Usually speakers know what they want
to say (at least partly), that is, they start with semantics.
Pulman (1985), Stabler (1991) and Shieber & Johnson (1993: 301–308) have shown, how-
ever, that it is possible to build semantic structures incrementally, using the kind of
phrase structure grammars we encountered in Chapter 2. This means that a partial se-
mantic representation for the string das britische ‘the British’ can be computed with-
out having to assume that the two words form a constituent in (14). Therefore, one
does not necessarily need a grammar that licenses the immediate combination of words
directly. Furthermore, Shieber & Johnson (1993) point out that from a purely techni-
cal point of view, synchronous processing is more costly than asynchronous process-
ing since synchronous processing requires additional mechanisms for synchronization
whereas asynchronous processing processes information as soon as it becomes avail-
able (p. 297–298). Shieber and Johnson do not clarify whether this also applies to syn-
chronous/asynchronous processing of syntactic and semantic information. See Shieber
& Johnson (1993) for incremental processing and for a comparison of Steedman’s Cate-
gorial Grammar and TAG.
What kind of conclusions can we draw from the data we have previously discussed?
Are there further data that can help to determine the kinds of properties a theory of
grammar should have in order to count as psycholinguistically plausible? Sag, Wasow
& Bender (2003) and Sag & Wasow (2011; 2015) list the following properties that a per-
formance-compatible competence grammar should have:11
• surface-oriented
• model-theoretic and therefore constraint-based
• sign-based organization
• strictly lexicalist
• representational underspecification of semantic information
Approaches such as CG, GPSG, LFG, HPSG, CxG and TAG are surface-oriented since
they do not assume a base structure from which other structures are derived via trans-
formations. Transformational approaches, however, require additional assumptions.12
This will be briefly illustrated in what follows. In Section 3.1.5, we encountered the fol-
lowing analysis of English interrogatives:
(15) [CP What𝑖 [C′ will𝑘 [IP Ann [I′ _𝑘 [VP read _𝑖 ]]]]].
11
Also, see Jackendoff (2007) for reflections on a performance model for a constraint-based, surface-oriented
linguistic theory.
12
An exception among transformational approaches is Phillips (2003). Phillips assumes that structures rel-
evant for phenomena such as ellipsis, coordination and fronting are built up incrementally. These con-
stituents are then reordered in later steps by transformations. For example, in the analysis of (i), the string
Wallace saw Gromit in forms a constituent where in is dominated by a node with the label P(P). This node
is then turned into a PP in a subsequent step (p. 43–44).
While this approach is a transformation-based approach, the kind of transformation here is very idiosyn-
cratic and incompatible with other variants of the theory. In particular, the modification of constituents
contradicts the assumption of Structure Preservation when applying transformations as well as the No
Tampering Condition of Chomsky (2008). Furthermore, the conditions under which an incomplete string
such as Wallace saw Gromit in forms a constituent are not entirely clear.
This structure is derived from (16a) by two transformations (two applications of Move-𝛼):
(16) a. Ann will read what?
b. * Will Ann read what
The first transformation creates the order in (16b) from (16a), and the second creates (15)
from (16b).
When a hearer processes the sentence in (15), he begins to build structure as soon as
he hears the first word. Transformations can, however, only be carried out when the
entire utterance has been heard. One can, of course, assume that hearers process surface
structures. However, since – as we have seen – they begin to access semantic knowledge
early in an utterance, this raises the question of what we need a deep structure for at
all.
In analyses such as those of (15), deep structure is superfluous since the relevant in-
formation can be reconstructed from the traces. Corresponding variants of GB have
been proposed in the literature (see page 123). They are compatible with the require-
ment of being surface-oriented. Chomsky (1981a: 181; 1986a: 49) and Lasnik & Saito
(1992: 59–60) propose analyses where traces can be deleted. In these analyses, the deep
structure cannot be directly reconstructed from the surface structure and one requires
transformations in order to relate the two. If we assume that transformations are applied
‘online’ during the analysis of utterances, then this would mean that the hearer would
have to keep a structure derived from previously heard material as well as a list of pos-
sible transformations during processing in his working memory. In constraint-based
grammars, entertaining hypotheses about potential upcoming transformation steps is
not necessary since there is only a single surface structure that is processed directly.
At present, it is still unclear whether it is actually possible to distinguish between these
models empirically. But for Minimalist models with a large number of movements (see
Figure 4.20 on page 149, for example), it should be clear that they are unrealistic since
storage space is required to manage the hypotheses regarding such movements and we
know that such short-term memory is very limited in humans.
Frazier & Clifton (1996: 27) assume that a transformation-based competence grammar
yields a grammar with pre-compiled rules or rather templates that is then used for pars-
ing. Therefore, theorems derived from UG are used for parsing and not axioms of UG
directly. Johnson (1989) also suggests a parsing system that applies constraints from dif-
ferent sub-theories of GB as early as possible. This means that while he does assume
the levels of representation D-structure, S-structure, LF and PF, he specifies the relevant
constraints (X̄ theory, Theta-Theory, Case Theory, …) as logical conditions that can be
reorganized, then be evaluated in a different but logically equivalent order and be used
for structure building.13 Chomsky (2007: 6) also compares human parsing to working
through a proof, where each step of the proof can be carried out in different orders.
This view does not assume the psychological reality of levels of grammatical representa-
tion when processing language, but simply assumes that principles and structures play
13
Stabler (1992: Section 15.7) also considers a constraint-based view, but arrives at the conclusion that parsing
and other linguistic tasks should use the structural levels of the competence theory. This would again pose
problems for the DTC.
a role when it comes to language acquisition. As we have seen, the question of whether
we need UG to explain language acquisition has not yet been decided in favor of UG-based
approaches. Instead, all available evidence seems to point in the opposite direction. How-
ever, even if innate linguistic knowledge does exist, the question arises as to why one
would want to represent this as several structures linked via transformations when it is
clear that these do not play a role for humans (especially language learners) when pro-
cessing language. Approaches that can represent this knowledge using fewer technical
means, e.g., without transformations, are therefore preferable. For more on this point,
see Kuhn (2007: 615).
The requirement for constraint-based grammars is supported by incremental process-
ing and also by the ability to deduce what will follow from previously heard material.
Stabler (1991) has pointed out that Steedman’s (1989b) argumentation with regard to
incrementally processable grammars is incorrect, and instead argues for maintaining a
modular view of grammar. Stabler has developed a constraint-based grammar where syn-
tactic and semantic knowledge can be accessed at any time. He formulates both syntactic
structures and the semantic representations attached to them as conjoined constraints
and then presents a processing system that processes structures based on the availability
of parts of syntactic and semantic knowledge. Stabler rejects models of performance that
assume that one must first apply all syntactic constraints before the semantic ones can
be applied. If one abandons this strict view of modularity, then we arrive at something
like (17):
(17) (Syn1 ∧ Syn2 ∧ … ∧ Syn𝑛) ∧ (Sem1 ∧ Sem2 ∧ … ∧ Sem𝑛)
Syn1–Syn𝑛 stand for syntactic rules or constraints and Sem1–Sem𝑛 stand for semantic
rules or constraints. If one so desires, the expressions in brackets can be referred to as
modules. Since it is possible to randomly reorder conjoined expressions, one can imagine
performance models that first apply some rules from the syntax module and then, when
enough information is present, respective rules from the semantic module. The order of
processing could therefore be as in (18), for example:
(18) Syn2 ∧ Sem1 ∧ Syn1 ∧ … ∧ Syn𝑛 ∧ Sem2 ∧ … ∧ Sem𝑛
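The point that conjoined constraints can be evaluated in any order without affecting the result can be illustrated with a small sketch. The four constraints and the dictionary-based representation of an analysis are invented for this illustration only and are not taken from Stabler's system.

```python
from itertools import permutations


# Hypothetical toy constraints on a candidate analysis (a plain dict).
def syn1(analysis):
    return analysis["category"] == "NP"


def syn2(analysis):
    return analysis["has_determiner"]


def sem1(analysis):
    return analysis["referent"] in analysis["context"]


def sem2(analysis):
    return analysis["number"] == "sg"


CONSTRAINTS = [syn1, syn2, sem1, sem2]


def outcomes(analysis):
    """Evaluate the conjunction under every ordering and collect the distinct results."""
    return {all(c(analysis) for c in order) for order in permutations(CONSTRAINTS)}


good = {"category": "NP", "has_determiner": True, "number": "sg",
        "referent": "dog1", "context": {"dog1", "cat1"}}
bad = dict(good, number="pl")

print(outcomes(good))   # {True}
print(outcomes(bad))    # {False}
```

All 24 orderings agree on the result. What differs between orderings is only how early a violated constraint is detected, and this is the aspect that a performance model is free to settle.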
If one subscribes to this view of modularity, then theories such as HPSG or CxG also
have a modular structure. In the representation assumed in the HPSG variant of Pollard
& Sag (1987) and Sign-Based CxG (see Section 10.6.2), the value of SYN would correspond
to the syntax module, the value of SEM to the semantic module and the value of PHON
to the phonology module. If one were to remove the respective other parts of the lexical
entries/dominance schemata, then one would be left with the part of the theory corre-
sponding exactly to the level of representation in question.14 Jackendoff (2000) argues
14
In current theories in the Minimalist Program, an increasing amount of morphological, syntactic, semantic
and information-structural information is being included in analyses (see Section 4.6.1). While there are
suggestions for using feature-value pairs (Sauerland & Elbourne 2002: 290–291), a strict structuring of
information as in GPSG, LFG, HPSG, CxG and variants of CG and TAG is not present. This means that
there are the levels for syntax, Phonological Form and Logical Form, but the information relevant for these
levels is an unstructured part of syntax, smeared all over syntactic trees.
for this form of modularity with the relevant interfaces between the modules for phonol-
ogy, syntax, semantics and further modules from other areas of cognition. Exactly what
there is to be gained from assuming these modules and how these could be proved empir-
ically remains somewhat unclear to me. For skepticism with regard to the very concept
of modules, see Jackendoff (2000: 22, 27). For more on interfaces and modularization in
theories such as LFG and HPSG, see Kuhn (2007).
Furthermore, Sag & Wasow (2015: 53–54) argue that listeners often leave semantic
interpretation underspecified until enough information is present either in the utterance
itself or the context. They do not commit to a certain reading early and then run into garden
paths or have to backtrack to other readings. This is modeled appropriately by theories that use
a variant of underspecified semantics. For a concrete example of underspecification in
semantics see Section 19.3.
In conclusion, we can say that surface-oriented, model-theoretic and strongly lexi-
calist grammatical theories such as CG, LFG, GPSG, HPSG, CxG and the correspond-
ing GB/MP variants (paired with appropriate semantic representations) can plausibly
be combined with processing models, while this is not the case for the overwhelming
majority of GB/MP theories.
16 Language acquisition
Linguists and philosophers are fascinated by the human ability to acquire language. As-
suming the relevant input during childhood, language acquisition normally takes place
completely effortlessly. Chomsky (1965: 24–25) put forward the requirement that a gram-
matical theory must provide a plausible model of language acquisition. Only then could
it actually explain anything and would otherwise remain descriptive at best. In this sec-
tion, we will discuss theories of acquisition from a number of theoretical standpoints.
1
See Haider (1994) and Haider (2001: Section 2.2) for an overview. Haider assumes that there is at least
a correlation between the absence of expletive subjects and pro-drop. However, Galician is a pro-drop
language with expletive subject pronouns (Raposo & Uriagereka 1990: Section 2.5). Franks (1995: 314) cites
Upper and Lower Sorbian as pro-drop languages with expletive subjects. Scholz & Pullum (2002: 218)
point out that there is an expletive pronoun ci in modern Italian although Italian is classed as a pro-drop
language.
however, not the case (Bloom 1993: 731). Fodor (1998a: 343–344) also notes the following
three problems: 1) Parameters can affect things that are not visible from the perceptible
constituent order. 2) Many sentences are ambiguous with regard to the setting of a
particular parameter, that is, there are sometimes multiple combinations of parameters
compatible with one utterance. Therefore, the respective utterances cannot be used to
set any parameters (Berwick & Niyogi 1996; Fodor 1998b). 3) There is a problem with
the interaction of parameters. Normally multiple parameters play a role in an utterance
such that it can be difficult to determine which parameter contributes what and thus
how the values should be determined.
Points 1) and 2) can be explained using the constituent order parameters of Gibson &
Wexler: imagine a child hears sentences such as the English and the German examples
in (2):
(2) a. Daddy drinks juice.
b. Papa trinkt Saft.
daddy drinks juice
These sentences look exactly the same, even though radically different structures are
assumed for each. According to the theories under discussion, the English sentence
has the structure shown in Figure 3.9 on page 100 given in abbreviated form in (3a).
The German sentence, on the other hand, has the structure in Figure 3.14 on page 108
corresponding to (3b):
(3) a. [IP Daddy [I′ _𝑘 [VP drinks𝑘 juice]]].
b. [CP Papa𝑖 [C′ trinkt𝑘 [IP _𝑖 [I′ [VP Saft _𝑘 ] _𝑘 ]]]].
English has the basic constituent order SVO. The verb forms a constituent with the object
(VP) and this is combined with the subject. The parameter setting must therefore be SV,
VO and −V2. German, on the other hand, is analyzed as a verb-final and verb-second
language and the parameter values would therefore have to be SV, OV and +V2. If we
consider the sentences in (2), we see that the two sentences do not differ from one another
with regard to the order of the verb and its arguments.
Fodor (1998a,b) concludes from this that one first has to build a structure in order to
see what grammatical class the grammar licensing the structure belongs to since one
first needs the structure in (3b) in order to be able to see that the verb in the partial
constituent occurs after its argument in the VP (Saft _𝑘 ). The question is now how one
achieves this structure. A UG with 30 parameters corresponds to 2³⁰ = 1,073,741,824
fully instantiated grammars. It is an unrealistic assumption that children try out these
grammars successively or simultaneously.
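The size of this search space is easy to check. The following lines compute the number of grammars for binary parameters; the rate of one tested grammar per second is an arbitrary assumption that is only meant to give a sense of scale.

```python
def grammar_count(n_parameters: int) -> int:
    """Number of fully instantiated grammars for n binary parameters."""
    return 2 ** n_parameters


SECONDS_PER_YEAR = 60 * 60 * 24 * 365

n = grammar_count(30)
print(n)                      # 1073741824
# Assumed rate (hypothetical): one grammar tested per second, without pause.
print(n / SECONDS_PER_YEAR)   # about 34 years
```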
Gibson & Wexler (1994) discuss a number of solutions for this problem: parameters
have a default value and the learner can only change a parameter value if a sentence
that could previously not be analyzed can then be analyzed with the new parameter
setting (Greediness Constraint). In this kind of procedure, only one parameter can be
changed at a time (Single Value Constraint), which aims at ruling out great leaps leading
to extremely different grammars (see Berwick & Niyogi 1996: 612–613, however). This
reduces the processing demands. However, with 40 parameters, the worst case could still
be that one has to test 40 parameter values separately, that is, try to parse a sentence with
40 different grammars. This processing feat is still unrealistic, which is why Gibson &
Wexler (1994: 442) additionally assume that one hypothesis is tested per input sentence.
A further modification of the model is the assumption that certain parameters only begin
to play a role during the maturation of the child. At a given point in time, there could be
only a few accessible parameters that also need to be set. After setting these parameters,
new parameters could become available.
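A learner that obeys the Greediness Constraint and the Single Value Constraint and tests one hypothesis per input sentence can be sketched as follows. The sketch is a toy under strong assumptions: the function `parses` merely stands in for a parser, sentences are represented as the parameter values they require, and the three parameters do not interact, so none of the difficulties that arise in richer parameter spaces show up here.

```python
import random


def parses(grammar, sentence):
    """Toy stand-in for a parser (an assumption): a sentence is a dict that maps
    parameter positions to the values a grammar needs in order to analyze it."""
    return all(grammar[i] == value for i, value in sentence.items())


def greedy_single_value_step(grammar, sentence, rng):
    """Process one input sentence, testing at most one alternative grammar."""
    if parses(grammar, sentence):
        return grammar                               # nothing to repair
    i = rng.randrange(len(grammar))                  # Single Value Constraint
    candidate = grammar[:i] + (not grammar[i],) + grammar[i + 1:]
    # Greediness Constraint: adopt the new value only if the sentence
    # can now be analyzed.
    return candidate if parses(candidate, sentence) else grammar


# Hypothetical three-parameter example; every parameter has the default False.
target = (True, False, True)
inputs = [{0: True}, {1: False}, {2: True}]          # sentences of the target language
rng = random.Random(0)
grammar = (False, False, False)
for _ in range(500):
    grammar = greedy_single_value_step(grammar, rng.choice(inputs), rng)
print(grammar == target)
```

Each input sentence triggers at most one call to `parses` for an alternative grammar, which corresponds to the assumption that only one hypothesis is tested per sentence.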
In their article, Gibson & Wexler show that the interaction between input and pa-
rameter setting is in no way trivial. In their example scenario with three parameters, a
situation can arise in which a learner sets a parameter in order to analyze a new sentence,
but setting this parameter means that the target grammar can no longer
be acquired because only one value can be changed at a time and changes can only be
made if more sentences can be analyzed than before. The learner reaches a so-called
local maximum in these problematic cases.2 Gibson & Wexler then suggest assigning
a default value to particular parameters, whereby the default value is the one that will
cause the learner to avoid problematic situations. For the V2 parameter, they assume ‘−’
as the default value.
Berwick & Niyogi (1996) show that Gibson & Wexler calculated the problematic con-
ditions incorrectly and that, if one shares their assumptions, it is even more frequently
possible to arrive at parameter combinations from which it is not possible to reach the
target grammar by changing individual parameter values. They show that one of the
problematic cases not addressed by Gibson & Wexler is −V2 (p. 609) and that the assump-
tion of a default value for a parameter does not solve the problem as both ‘+’ and ‘–’ can
lead to problematic combinations of parameters.3 In their article, Berwick and Niyogi
show that learners in the example scenario above (with three parameters) learn the tar-
get grammar faster if one abandons the Greediness or else the Single Value Constraint.
They suggest a process that simply randomly changes one parameter if a sentence can-
not be analyzed (Random Step, p. 615–616). The authors note that this approach does not
share the problems with the local maxima that Gibson & Wexler had in their example
and that it also reaches its goal faster than theirs. However, the fact that Random Step
converges more quickly has to do with the quality of the parameter space (p. 618). Since
there is no consensus about parameters in the literature, it is not possible to assess how
the entire system works.
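The Random Step procedure can be sketched for a deliberately constructed toy space. The four grammars and their languages in `LANGUAGES` are invented: they are chosen so that no grammar adjacent to the start grammar analyzes any sentence of the target language, which means that a greedy learner could never leave the start grammar. Nothing here reproduces the parameter space discussed by Berwick & Niyogi.

```python
import random

# Hypothetical toy space: two binary parameters; each grammar licenses a set
# of sentence types.
LANGUAGES = {
    (0, 0): {"s1"},
    (0, 1): {"s2"},
    (1, 0): {"s3"},
    (1, 1): {"s4", "s5"},   # target grammar
}


def parses(grammar, sentence):
    return sentence in LANGUAGES[grammar]


def flip(grammar, i):
    """Change the value of parameter i."""
    return grammar[:i] + (1 - grammar[i],) + grammar[i + 1:]


def random_step(grammar, sentence, rng):
    """If the sentence cannot be analyzed, change one randomly chosen parameter.
    The new value is kept even if the sentence still cannot be analyzed."""
    if parses(grammar, sentence):
        return grammar
    return flip(grammar, rng.randrange(len(grammar)))


def learn(start, target, rng, max_inputs=10_000):
    """Feed sentences of the target language to the learner. Comparing with the
    target is only used to stop the simulation."""
    grammar = start
    inputs = sorted(LANGUAGES[target])
    for n in range(max_inputs):
        if grammar == target:
            return grammar, n
        grammar = random_step(grammar, rng.choice(inputs), rng)
    return grammar, max_inputs


final, n_inputs = learn((0, 0), (1, 1), random.Random(0))
print(final, n_inputs)
```

Once the target grammar is reached, every input sentence can be analyzed and no further changes are made, so the target is a stable end point of the random walk.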
Yang (2004: 453) has criticized the classic Principles & Parameters model since abrupt
switching between grammars after setting a parameter cannot be observed. Instead, he
proposes the following learning mechanism:
(4) For an input sentence, 𝑠, the child: (i) with probability P𝑖 selects a grammar G𝑖 , (ii)
analyzes 𝑠 with G𝑖 , (iii) if successful, reward G𝑖 by increasing P𝑖 , otherwise punish
G𝑖 by decreasing P𝑖 .
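The mechanism in (4) can be made concrete as follows. The quoted formulation leaves open how exactly probabilities are increased and decreased. The sketch assumes a linear reward-penalty update with a learning rate `gamma`, and this update rule, the two toy languages and all names are assumptions made for the purpose of illustration.

```python
import random


def variational_learner(languages, inputs, gamma=0.05, seed=0):
    """Return the grammar probabilities after processing all input sentences."""
    rng = random.Random(seed)
    n = len(languages)
    probs = [1.0 / n] * n                                # no initial preference
    for sentence in inputs:
        i = rng.choices(range(n), weights=probs)[0]      # (i) select G_i with P_i
        success = sentence in languages[i]               # (ii) analyze s with G_i
        for j in range(n):                               # (iii) reward or punish G_i
            if success:
                if j == i:
                    probs[j] += gamma * (1 - probs[j])
                else:
                    probs[j] *= 1 - gamma
            else:
                if j == i:
                    probs[j] *= 1 - gamma
                else:
                    probs[j] = gamma / (n - 1) + (1 - gamma) * probs[j]
    return probs


# Hypothetical example: grammar 0 analyzes every input, grammar 1 only half of it.
languages = [{"a", "b"}, {"a"}]
input_rng = random.Random(1)
inputs = [input_rng.choice(["a", "b"]) for _ in range(2000)]
probs = variational_learner(languages, inputs)
print(probs)
```

Under this update rule, the probabilities always sum to one and the probability of the grammar that is never punished rises gradually, so that there is no point at which the learner switches abruptly from one grammar to another.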
2
If one imagines the acquisition process as climbing a hill, then the Greediness Constraint ensures that one
can only go uphill. It could be the case, however, that one begins to climb the wrong hill and can no longer
get back down.
3
Kohl (1999; 2000) has investigated this acquisition model in a case with twelve parameters. Of the 4096
possible grammars, 2336 (57%) are unlearnable if one assumes the best initial values for the parameters.
16.1 Principles & Parameters
Yang discusses the example of the pro-drop and topic drop parameters. In pro-drop
languages (e.g., Italian), it is possible to omit the subject and in topic drop languages (e.g.,
Mandarin Chinese), it is possible to omit both the subject and the object if it is a topic. Yang
compares English-speaking and Chinese-speaking children noting that English-speaking children
omit both subjects and objects in an early linguistic stage. He claims that the reason for
this is that English-speaking children start off using the Chinese grammar.
The pro-drop parameter is one of the most widely discussed parameters in the con-
text of Principles & Parameters theory and it will therefore be discussed in more detail
here. It is assumed that speakers of English have to learn that all sentences in English
require a subject, whereas speakers of Italian learn that subjects can be omitted. One
can observe that children learning both English and Italian omit subjects (German children
too in fact). Subjects are also omitted notably more often than objects. There are
two possible explanations for this: a competence-based one and a performance-based
one. In competence-based approaches, it is assumed that children use a grammar that
allows them to omit subjects and then only later acquire the correct grammar (by set-
ting parameters or increasing the rule apparatus). In performance-based approaches,
by contrast, the omission of subjects is traced back to the fact that children are not yet
capable of planning and producing long utterances due to their limited brain capacity.
Since the cognitive demands are greatest at the beginning of an utterance, this leads to
subjects being increasingly left out. Valian (1991) investigated these various hypotheses
and showed that the frequency with which children learning English and Italian respec-
tively omit subjects is not the same. Subjects are omitted more often than objects. She
therefore concludes that competence-based explanations are not empirically adequate.
The omission of subjects should then be viewed more as a performance phenomenon
(see also Bloom 1993). Another argument for the influence of performance factors is the
fact that articles of subjects are left out more often than articles of objects (31% vs. 18%,
see Gerken 1991: 440). As Bloom notes, no subject article-drop parameter has been pro-
posed so far. If we explain this phenomenon as a performance phenomenon, then it is
also plausible to assume that the omission of complete subjects is due to performance
issues.
Gerken (1991) shows that the metrical properties of utterances also play a role: in
experiments where children had to repeat sentences, they omitted the subject/article of
the subject more often than the object/article of the object. Here, it made a difference
whether the intonation pattern was iambic (weak-strong) or trochaic (strong-weak). It
can even be observed with individual words that children leave out weak syllables at the
beginning of words more often than at the end of the word. Thus, it is more probable
that “giRAFFE” is reduced to “RAFFE” than “MONkey” to “MON”. Gerken assumes the
following for the metrical structure of utterances:
1. A metrical foot contains one and only one strong syllable.
16 Language acquisition
Subject pronouns in English are sentence-initial and form an iambic foot with the following
strongly emphasized verb as in (5a). Object pronouns, however, can form the weak
syllable of a trochaic foot as in (5b).
(5) a. she KISSED + the DOG
b. the DOG + KISSED her
c. PETE + KISSED the + DOG
Furthermore, articles in iambic feet as in the object of (5a) and the subject of (5b) are
omitted more often than in trochaic feet such as with the object of (5c).
It follows from this that there are multiple factors that influence the omission of ele-
ments and that one cannot simply take the behavior of children as evidence for switching
between two grammars.
Apart from what has been discussed so far, the pro-drop parameter is of interest for
another reason: there is a problem when it comes to setting parameters. The standard
explanation is that learners identify that a subject must occur in all English sentences,
which is suggested by the appearance of expletive pronouns in the input.
As discussed on page 531, there is no relation between the pro-drop property and the
presence of expletives in a language. Since the pro-drop property does not correlate with
any of the other putative properties either, only the existence of subject-less sentences
in the input constitutes decisive evidence for setting a parameter. The problem is that
there are grammatical utterances where there is no visible subject. Examples of this are
imperatives such as (6), declaratives with a dropped subject as in (7a) and even declarative
sentences without an expletive such as the example in (7b) found by Valian (1991: 32) in
the New York Times.
(6) a. Give me the teddy bear!
b. Show me your toy!
The following title of a Nirvana song also comes from the same year as Valian’s article:
(8) Smells like Teen Spirit.
Teen Spirit refers to a deodorant and smell is a verb that, both in German and English,
requires a referential subject but can also be used with an expletive it as subject. The usage
that Kurt Cobain had in mind cannot be reconstructed.4 Independent of the intended
meaning, however, the subject in (8) is missing. Imperatives do occur in the input that children
receive and are therefore relevant for acquisition. Valian (1991: 33) says the following
about them:
4 See http://de.wikipedia.org/wiki/Smells_Like_Teen_Spirit. 2018-02-20.
16.1 Principles & Parameters
What is acceptable in the adult community forms part of the child’s input, and is
also part of what children must master. The utterances that I have termed “accept-
able” are not grammatical in English (since English does not have pro subjects,
and also cannot be characterized as a simple VP). They lack subjects and therefore
violate the extended projection principle (Chomsky 1981a), which we are assuming.
Children are exposed to fully grammatical utterances without subjects, in the form
of imperatives. They are also exposed to acceptable utterances which are not fully
grammatical, such as [(7a)], as well as forms like, “Want lunch now?” The Amer-
ican child must grow into an adult who not only knows that overt subjects are
grammatically required, but also knows when subjects can acceptably be omitted.
The child must not only acquire the correct grammar, but also master the discourse
conditions that allow relaxation of the grammar. (Valian 1991: 33)
This passage turns the relations on their head: we cannot conclude from the fact that a
particular grammatical theory is not compatible with certain data that these data should
not be described by this theory. Instead, we should modify the incompatible grammar or,
if this is not possible, reject it. Since utterances with imperatives are entirely
regular, there is no reason to categorize them as utterances that do not follow grammat-
ical rules. The quotation above represents a situation where a learner has to acquire
two grammars: one that corresponds to the innate grammar and a second that partially
suppresses the rules of innate grammar and also adds some additional rules.
The question we can pose at this point is: how does a child distinguish which of the
data it hears are relevant for which of the two grammars?
Fodor (1998a: 347) pursues a different analysis that does not suffer from many of the
aforementioned problems. Rather than assuming that learners try to find a correct gram-
mar among a billion others, she instead assumes that children work with a single gram-
mar that contains all possibilities. She suggests using parts of trees (treelets) rather than
parameters. These treelets can also be underspecified and in extreme cases, a treelet can
consist of a single feature (Fodor 1998b: 6). A language learner can deduce whether a
language has a given property from the usage of a particular treelet. As an example, she
provides a VP treelet consisting of a verb and a prepositional phrase. This treelet must be
used for the analysis of the VP occurring in Look at the frog. Similarly, the analysis of an
interrogative clause with a fronted who would make use of a treelet with a wh-NP in the
specifier of a complementizer phrase (see Figure 3.7 on page 99). In Fodor’s version of
Principles and Parameters Theory, this treelet would be the parameter that licenses wh-
movement in (overt) syntax. Fodor assumes that there are defaults that allow a learner
to parse a sentence even when no or very few parameters have been set. This allows one
to learn from utterances that one would not otherwise have been able to use since there
would have been multiple possible analyses for them. Assuming a default can lead to
misanalyses, however: due to a default value, a second parameter could be set because
an utterance was analyzed with the treelets t1 and t3, for example, but t1 was not suited to
the particular language in question and the utterance should have instead been analyzed
with the non-default treelet t2 and the treelet t17. In this acquisition model, there must
therefore be the possibility to correct wrong decisions in the parameter setting process.
Fodor therefore assumes that there is a frequency-based degree of activation for param-
eters (p. 365): treelets that are often used in analyses have a high degree of activation,
whereas those used less often have a lower degree of activation. In this way, it is not
necessary to assume a particular parameter value while excluding others.
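The idea of a frequency-based degree of activation can be illustrated with a minimal sketch. The class `TreeletLearner` and its methods are hypothetical names introduced only for this illustration; Fodor's proposal is not stated in this form, and the treelet labels are simply those used in the text above.

```python
from collections import Counter


class TreeletLearner:
    """Toy learner that tracks how often each treelet is used in an analysis."""

    def __init__(self, treelets):
        # Every treelet starts with zero activation; none is excluded outright.
        self.activation = Counter({t: 0 for t in treelets})

    def observe(self, used_treelets):
        # Raise the activation of every treelet that took part in an analysis.
        for t in used_treelets:
            self.activation[t] += 1

    def prefer(self, candidates):
        # Among competing treelets, pick the one with the highest activation.
        return max(candidates, key=lambda t: self.activation[t])


learner = TreeletLearner(["t1", "t2", "t3", "t17"])
learner.observe(["t1", "t3"])        # early misanalysis based on a default
for _ in range(3):
    learner.observe(["t2", "t17"])   # later input favours the non-default treelets
print(learner.prefer(["t1", "t2"]))  # t2
```

In this sketch the early misanalysis is never explicitly retracted: t1 simply loses out once t2 has been used more often, which is the sense in which no parameter value has to be fixed to the exclusion of others.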
Furthermore, Fodor proposes that parameters should be structured hierarchically, that
is, only if a parameter has a particular value does it then make sense to think about
specific other parameter values.
Fodor’s analysis is – as she herself notes (Fodor 2001: 385) – compatible with theories
such as HPSG and TAG. Pollard & Sag (1987: 147) characterize UG as the conjunction of
all universally applicable principles:
(9) UG = P1 ∧ P2 ∧ … ∧ P𝑛
As well as principles that hold universally, there are other principles that are specific
to a particular language or a class of languages. Pollard & Sag give the example of the
constituent ordering principle that only holds for English. English can be characterized
as follows if one assumes that P𝑛+1–P𝑚 are language-specific principles, L1–L𝑝 a complete
list of lexical entries and R1–R𝑞 a list of dominance schemata relevant for English.
(10) English = P1 ∧ P2 ∧ … ∧ P𝑚 ∧ (L1 ∨ … ∨ L𝑝 ∨ R1 ∨ … ∨ R𝑞)
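The logical shape of (10) can be made concrete with a small sketch: a linguistic object is licensed if it obeys every principle (the conjunction) and instantiates at least one lexical entry or dominance schema (the disjunction). Everything below is an assumption made for illustration: signs are plain dictionaries, and the principle, the lexical entry and the schema are toy stand-ins, not Pollard & Sag's actual definitions.

```python
def licensed(sign, principles, lexical_entries, schemata):
    """True iff the sign satisfies all principles (conjunction) and
    matches at least one lexical entry or schema (disjunction)."""
    obeys_all = all(p(sign) for p in principles)
    matches_one = any(d(sign) for d in lexical_entries + schemata)
    return obeys_all and matches_one


# Hypothetical toy principle: a phrase shares its category with its head daughter.
def head_feature_principle(sign):
    head = sign.get("head_dtr")
    return head is None or head["cat"] == sign["cat"]


# Hypothetical lexical entry and dominance schema, given as simple predicates.
def entry_sleeps(sign):
    return sign.get("phon") == ["sleeps"] and sign["cat"] == "verb"


def head_complement_schema(sign):
    return "head_dtr" in sign and "comp_dtrs" in sign


word = {"phon": ["sleeps"], "cat": "verb"}
good_phrase = {"cat": "verb", "head_dtr": word, "comp_dtrs": []}
bad_phrase = {"cat": "noun", "head_dtr": word, "comp_dtrs": []}

grammar = ([head_feature_principle], [entry_sleeps], [head_complement_schema])
print(licensed(word, *grammar))         # True
print(licensed(good_phrase, *grammar))  # True
print(licensed(bad_phrase, *grammar))   # False
```

The last case shows the division of labour: the phrase matches a schema, but it is still ruled out because it violates one of the conjoined principles.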
In Pollard & Sag’s conception, only those properties of language that equally hold for all
languages are part of UG. Pollard & Sag do not count the dominance schemata as part
of this. However, one can indeed also describe UG as follows:
(11) UG = P1 ∧ P2 ∧ … ∧ P𝑛 ∧ (Ren-1 ∨ … ∨ Ren-𝑞 ∨ Rde-1 ∨ … ∨ Rde-𝑟 ∨ …)
P1–P𝑛 are, as before, universally applicable principles and Ren-1–Ren-𝑞 are the (core) dominance
schemata of English and Rde-1–Rde-𝑟 are the dominance schemata in German. The
dominance schemata in (11) are combined by means of disjunctions, that is, not every
disjunct needs to have a realization in a specific language. Principles can make reference
to particular properties of lexical entries and rule out certain phrasal configurations. If
a language only contains heads that are marked for final-position in the lexicon, then
grammatical rules that require a head in initial position as their daughter can never be
combined with these heads or their projections. Furthermore, theories with a type sys-
tem are compatible with Fodor’s approach to language acquisition because constraints
can easily be underspecified. As such, constraints in UG do not have to make reference
to all properties of grammatical rules: principles can refer to feature values, but the
language-specific values themselves do not have to be contained in UG already. Similarly, a
supertype describing multiple dominance schemata that have similar but language-specific
instantiations can also be part of UG; the language-specific details remain
open and are then deduced by the learner upon parsing (see Ackerman & Webelhuth
1998: Section 9.2). The differences in activation assumed by Fodor can be captured by
weighting the constraints: the dominance schemata Ren-1–Ren-𝑞 etc. are sets of feature-value
pairs as well as path equations. As explained in Chapter 15, weights can be added
to such constraints and also to sets of constraints. In Fodor’s acquisition model, given
a German input, the weights for the rules of English would be reduced and those for
16.2 Principles and the lexicon
the German rules would be increased. Note that in this acquisition scenario,
there are no triggers for parameter setting, unlike in Fodor’s model. Furthermore, properties
that were previously disjunctively specified as part of UG will now be acquired
directly. Using the treelet t17 (or rather a possibly underspecified dominance schema),
it is not the case that the value ‘+’ is set for a parameter P5 but rather the activation
potential of t17 is increased such that t17 will be prioritized for future analyses.
then part of the periphery. Critics of the Principles & Parameters model have pointed
out that idiomatic and irregular constructions constitute a relatively large part of our
language and that the distinction, both fluid and somewhat arbitrary, is only motivated
theory-internally (Jackendoff 1997: Chapter 7; Culicover 1999; Ginzburg & Sag 2000: 5;
Newmeyer 2005: 48; Kuhn 2007: 619). For example, it is possible to note that there are
interactions between various idioms and syntax (Nunberg, Sag & Wasow 1994). Most
idioms in German with a verbal component allow the verb to be moved to initial position
(12b), some allow parts of the idiom to be fronted (12c) and some can undergo
passivization (12d).
It is assumed that the periphery and lexicon are not components of UG (Chomsky 1986b:
150–151; Fodor 1998a: 343) but rather are acquired using other learning methods – namely
inductively directly from the input. The question posed by critics is now why these
5 Frankfurter Rundschau, 28.06.1997, p. 2.
6 Mannheimer Morgen, 28.06.1999, Sport; Schrauben allein genügen nicht.
16.3 Pattern-based approaches
methods should not work for regular aspects of the language as well (Abney 1996: 20;
Goldberg 2003a: 222; Newmeyer 2005: 100; Tomasello 2006c: 36; 2006b: 20): the areas
of the so-called ‘core’ are by definition more regular than components of the periphery,
which is why they should be easier to learn.
Tomasello (2000; 2003) has pointed out that a Principles & Parameters model of lan-
guage acquisition is not compatible with the observable facts. The Principles and Param-
eters Theory predicts that children should no longer make mistakes in a particular area
of grammar once they have set a particular parameter correctly (see Chomsky 1986b:
146, Radford 1990: 21–22 and Lightfoot 1997: 175). Furthermore, it is assumed that a
parameter is responsible for very different areas of grammar (see the discussion of the
pro-drop parameter in Section 16.1). When a parameter value is set, then there should be
sudden developments with regard to a number of phenomena (Lightfoot 1997: 174). This
is, however, not the case. Instead, children acquire language from utterances in their
input and begin to generalize from a certain age. Depending on the input, they can re-
order certain auxiliaries and not others, although movement of auxiliaries is obligatory
in English.7 One argument put forward against these kinds of input-based theories is
that children produce utterances that cannot be observed with any significant frequency in
the input. One much-discussed phenomenon of this kind is that of so-called root infinitives (RI)
or optional infinitives (OI) (Wexler 1998). These are infinitive forms that can be used in
non-embedded clauses (root sentences) instead of a finite verb. Optional infinitives are
those where children use both a finite (13a) and non-finite (13b) form (Wexler 1998: 59):
(13) a. Mary likes ice cream.
b. Mary like ice cream.
Wijnen, Kempen & Gillis (2001: 656) showed that Dutch children use the order object
infinitive 90 % of the time during the two-word phase although these orders occur in less
than 10 % of their mother’s utterances that contained a verb. Compound verb forms, e.g.,
with a modal in initial position as in (14) that contain another instance of this pattern
only occurred in 30 % of the input containing a verb (Wijnen, Kempen & Gillis 2001: 647).
(14) Willst du Brei essen?
want you porridge eat
‘Do you want to eat porridge?’
At first glance, there seems to be a discrepancy between the input and the child’s ut-
terances. However, this deviation could also be explained by an utterance-final bias in
learning (Wijnen et al. 2001; Freudenthal, Pine & Gobet 2006). A number of factors can
be held responsible for the salience of verbs at the end of an utterance: 1) Restrictions of
the infant brain. It has been shown that humans (both children and adults) forget words
during the course of an utterance, that is, the activation potential decreases. Since the
cognitive capabilities of small children are restricted, it is clear why elements at the
end of an utterance have an important status. 2) Easier segmentation at the end of an
7 Here, Yang’s suggestion to combine grammars with a particular probability does not help since one would
have to assume that the child uses different grammars for different auxiliaries, which is highly unlikely.
utterance. At the end of an utterance, part of the segmentation problem for hearers disap-
pears: the hearer first has to divide a sequence of phonemes into individual words before
he can understand them and combine them to create larger syntactic entities. This seg-
mentation is easier at the end of an utterance since the word boundary is already given
by the end of the utterance. Furthermore, according to Wijnen, Kempen & Gillis (2001:
637), utterance-final words have an above-average length and bear a pitch accent.
This effect occurs more often in language directed at children.
Freudenthal, Pine, Aguado-Orea & Gobet (2007) have modeled language acquisition
for English, German, Dutch, and Spanish. The computer model could reproduce dif-
ferences between these languages based on input. At first glance, it is surprising that
there are even differences between German and Dutch and between English and Span-
ish with regard to the use of infinitives as German and Dutch have a very similar syntax
(SOV+V2). Similarly, English and Spanish are both languages with SVO order. Never-
theless, children learning English make OI mistakes, whereas this is hardly ever the case
for children learning Spanish.
Freudenthal, Pine, Aguado-Orea & Gobet (2007) trace the differences in error frequen-
cies back to the distributional differences in each language: the authors note that 75 %
of verb-final utterances8 in English consist of compound verbs (finite verb + dependent
verb, e.g., Can he go?), whereas this is only the case 30 % of the time in Dutch.
German also differs from Dutch with regard to the number of utterance-final infini-
tives. Dutch has a progressive form that does not exist in Standard German:
(15) Wat ben je aan het doen?
what are you on it do.INF
‘What are you doing?’
Furthermore, verbs such as zitten ‘to sit’, lopen ‘to run’ and staan ‘to stand’ can be used
in conjunction with the infinitive to describe events happening in that moment:
(16) Zit je te spelen?
sit you to play
‘Are you sitting and playing?’
Furthermore, there is a future form in Dutch that is formed with ga ‘go’. These factors
contribute to the fact that Dutch has 20 % more utterance-final infinitives than German.
Spanish differs from English in that it has object clitics:
(17) (Yo) Lo quiero.
I it want
‘I want it.’
Short pronouns such as lo in (17) are realized in front of the finite verb so that the verb
appears in final position. In English, the object follows the verb, however. Furthermore,
8 For English, the authors only count utterances with a subject in third person singular since it is only in
these cases that a morphological difference between the finite and infinitive form becomes clear.
there are a greater number of compound verb forms in the English input (70 %) than in
Spanish (25 %). This is due to the higher frequency of the progressive in English and the
presence of do-support in question formation.
The relevant differences in the distribution of infinitives are captured correctly by the
proposed acquisition model, whereas alternative approaches that assume that children
possess an adult grammar but use infinitives instead of the finite forms cannot explain
the gradual nature of this phenomenon.
Freudenthal, Pine & Gobet (2009) could even show that input-based learning is supe-
rior to other explanations for the distribution of NPs and infinitives. They can explain
why this order is often used with a modal meaning (e.g., to want) in German and Dutch
(Ingram & Thompson 1996). In these languages, infinitives occur with modal verbs in
the corresponding interrogative clauses. Alternative approaches that assume that the
linguistic structures in question correspond to those of adults and only differ from them
in that a modal verb is not pronounced cannot explain why not all utterances of object
and verb produced by children learning German and Dutch have a modal meaning.
Furthermore, the main difference to English cannot be accounted for: in English, the
number of modal meanings is considerably lower. Input-based models predict exactly this
since English can use the dummy verb do to form questions:
(18) a. Did he help you?
b. Can he help you?
If larger entities are acquired from the end of an utterance, then there would be both
a modal and non-modal context for he help you. Since German and Dutch normally do
not use the auxiliary tun ‘do’, the relevant endings of utterances are always associated
with modal contexts. One can thereby explain why infinitival expressions have a modal
meaning significantly more often in German and Dutch than in English.
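The reasoning about utterance-final learning can be sketched as follows. This is only a toy illustration under strong simplifying assumptions (whitespace tokenization, a fixed maximal chunk length, no frequency information); it is not the model used by Freudenthal and colleagues, and the function names `final_chunks` and `learn` are invented for this example.

```python
from collections import defaultdict


def final_chunks(utterance, max_len=3):
    """Return the utterance-final word sequences of length 1..max_len."""
    words = utterance.lower().rstrip("?.!").split()
    return [tuple(words[-n:]) for n in range(1, min(max_len, len(words)) + 1)]


def learn(utterances, max_len=3):
    """Map each utterance-final chunk to the set of words that preceded it."""
    contexts = defaultdict(set)
    for u in utterances:
        words = u.lower().rstrip("?.!").split()
        for chunk in final_chunks(u, max_len):
            if len(chunk) < len(words):
                contexts[chunk].add(words[-len(chunk) - 1])
    return contexts


english = learn(["Did he help you?", "Can he help you?"])
german = learn(["Willst du Brei essen?"])

print(sorted(english[("he", "help", "you")]))   # ['can', 'did']
print(sorted(german[("du", "brei", "essen")]))  # ['willst']
```

The English chunk he help you is attested after both a modal and the non-modal did, whereas the German chunk in this tiny sample is attested only after a modal, which mirrors the asymmetry described in the text.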
Following this discussion of the arguments against input-based theories of acquisi-
tion, I will turn to Tomasello’s pattern-based approach. According to Tomasello (2003:
Section 4.2.1), a child hears a sentence such as (19) and realizes that particular slots can
be filled freely (see also Dąbrowska (2001) for analogous suggestions in the framework
of Cognitive Grammar).
(19) a. Do you want more juice/milk?
b. Mommy is gone.
From these utterances, it is possible to derive so-called pivot schemata such as those in
(20) into which words can then be inserted:
(20) a. more ___ → more juice/milk
b. ___ gone → mommy/juice gone
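The notion of a pivot schema with an open slot can be illustrated with a short sketch. It assumes, purely for illustration, that the input consists of two-word utterances and that a schema is posited whenever a word recurs in the same position with different partners; the function names are hypothetical and nothing here is meant as a claim about how children actually extract such schemata.

```python
from collections import defaultdict


def pivot_schemata(utterances):
    """Group two-word utterances by a shared word; the other position is the slot."""
    fillers = defaultdict(set)
    for u in utterances:
        first, second = u.split()
        fillers[f"{first} ___"].add(second)
        fillers[f"___ {second}"].add(first)
    # Keep only schemata attested with more than one filler.
    return {schema: words for schema, words in fillers.items() if len(words) > 1}


def fill(schema, word):
    """Insert a word into the open slot (marked '___') of a pivot schema."""
    return schema.replace("___", word)


schemata = pivot_schemata(["more juice", "more milk", "mommy gone", "juice gone"])
print(sorted(schemata))          # ['___ gone', 'more ___']
print(fill("more ___", "milk"))  # more milk
```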
In this stage of development (22 months), children do not generalize using these schemata;
the schemata are instead construction islands and do not yet have any syntax
(Tomasello et al. 1997). The ability to use previously unknown verbs with a subject and
an object in an SVO order is acquired slowly between the ages of three and four (Tomasello
2003: 128–129). More abstract syntactic and semantic relations only emerge with
time: when confronted with multiple instantiations of the transitive construction, the
child is then able to generalize:
(21) a. [S [NP The man/the woman] sees [NP the dog/the rabbit/it]].
b. [S [NP The man/the woman] likes [NP the dog/the rabbit/it]].
c. [S [NP The man/the woman] kicks [NP the dog/the rabbit/it]].
According to Tomasello (2003: 107), this abstraction takes the form [Sbj TrVerb Obj].
Tomasello’s approach is immediately plausible since one can recognize how abstraction
works: it is a generalization about reoccurring patterns. Each pattern is then assigned a
semantic contribution. These generalizations can be captured in inheritance hierarchies
(see page 209) (Croft 2001: 26). The problem with this kind of approach, however, is that
it cannot explain the interaction between different areas of phenomena in the language:
it is possible to represent simple patterns such as the use of transitive verbs in (21),
but transitive verbs interact with other areas of the grammar such as negation. If one
wishes to connect the construction one assumes for the negation of transitive verbs with
the transitive construction, then one arrives at a problem since this is not possible in
inheritance hierarchies.
(22) The woman did not kick the dog.
The problem is that the transitive construction has a particular semantic contribution but
the negated transitive construction has the opposite meaning. The values of SEM features
would therefore be contradictory. There are technical tricks to avoid this problem.
However, since there are a vast number of these kinds of interactions between syntax
and semantics, this kind of technical solution will result in something highly implausible
from a cognitive perspective (Müller 2006; 2007b,a; 2010b; Müller & Wechsler 2014a).
For discussion of Croft’s analysis, see Section 21.4.1.
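Why contradictory SEM values are a problem for inheritance can be seen in a minimal sketch. It assumes that constructions are flat sets of feature-value pairs and that a subconstruction must be consistent with everything it inherits, so that inheritance amounts to unification. The feature names and values are invented for this illustration and do not reproduce any particular formalization.

```python
def unify(a, b):
    """Unify two flat feature structures; return None on conflicting values."""
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # contradictory values: no consistent subtype exists
        result[feature] = value
    return result


# Hypothetical constraints of a transitive and a negated construction.
transitive = {"FORM": "Sbj TrVerb Obj", "SEM": "kick(x, y)"}
negated = {"SEM": "not kick(x, y)"}

print(unify(transitive, {"TENSE": "past"}))  # compatible information is simply added
print(unify(transitive, negated))            # None
```

Compatible information can be added monotonically, but a construction that needs a different SEM value cannot inherit from the transitive construction without some additional mechanism.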
At this point, proponents of pattern-based analyses might try to argue that these
kinds of problems are only the result of a poor/inadequate formalization and would
rather do without a formalization (Goldberg 2009: Section 5). However, this does not
help here as the problem is not the formalization itself; rather, the formalization allows
one to see the problem more clearly.
An alternative to an approach built entirely on inheritance is a TAG-like approach
that allows one to insert syntactic material into phrasal constructions. Such a pro-
posal was discussed in Section 10.6.3. Bergen & Chang (2005: 170) working in Embod-
ied Construction Grammar suggest an Active-Ditransitive Construction with the form
[RefExpr Verb RefExpr RefExpr], where RefExpr stands for a referential expression and
the first RefExpr and the verb may be non-adjacent. In this way, it is possible to analyze
(23a,b), while ruling out (23c):
While the compulsory adjacency of the verb and the object correctly predicts that (23c)
is ruled out, the respective constraint also rules out coordinate structures such as (24):
(24) Mary tossed me a juice and Peter a water.
Part of the meaning of this sentence corresponds to what the ditransitive construction
contributes to Mary tossed Peter a water. There is, however, a gap between tossed and
Peter. Similarly, one can create examples where there is a gap between both objects of a
ditransitive construction:
(25) He showed me and bought for Mary the book that was recommended in the Guard-
ian last week.
In (25), me is not adjacent to the book …. It is not my aim here to request a coordination
analysis. Coordination is a very complex phenomenon for which most theories do not
have a straightforward analysis (see Section 21.6.2). Instead, I would simply like to point
out that the fact that constructions can be realized discontinuously poses a problem for
approaches that claim that language acquisition is exclusively pattern-based. The point
is the following: in order to understand coordination data in a language, a speaker must
learn that a verb which has its arguments somewhere in the sentence has a particular
meaning together with these arguments. The actual pattern [Sbj V Obj1 Obj2] can, how-
ever, be interrupted in all positions. In addition to the coordination examples, there is
also the possibility of moving elements out of the pattern either to the left or the right.
In sum, we can say that language learners have to learn that there is a relation between
functors and their arguments. This is all that is left of pattern-based approaches but
this insight is also covered by the selection-based approaches that we will discuss in the
following section.
A defender of pattern-based approaches could perhaps object that there is a relevant
construction for (25) that combines all material. This means that one would have a con-
struction with the form [Sbj V Obj1 Conj V PP Obj2]. It would then have to be determined
experimentally or with corpus studies whether this actually makes sense. The general-
ization that linguists have found is that categories with the same syntactic properties
can be coordinated (N, N′, NP, V, V′, VP, …). For the coordination of verbs or verbal
projections, it must hold that the coordinated phrases require the same arguments:
(26) a. Er [arbeitet] und [liest viele Bücher].
he works and reads many books
b. Er [kennt und liebt] diese Schallplatte.
he knows and loves this record
c. Er [zeigt dem Jungen] und [gibt der Frau] die Punk-Rock-CD.
he shows the boy and gives the woman the punk rock CD
d. Er [liebt diese Schallplatte] und [schenkt ihr ein Buch].
he loves this record and gives her a book
In an approach containing only patterns, one would have to assume an incredibly large
number of constructions and so far we are only considering coordinations that consist
of exactly two conjuncts. However, the phenomenon discussed above is not only re-
stricted to coordination of two elements. If we do not wish to abandon the distinction
between competence and performance (see Chapter 15), then the number of conjuncts
is not constrained at all (by the competence grammar):
(27) Er [kennt, liebt und verborgt] diese Schallplatte.
he knows loves and lends.out this record
It is therefore extremely unlikely that learners have patterns for all possible cases in
their input. It is much more likely that they draw the same kind of generalizations as
linguists from the data occurring in their input: words and phrases with the same syn-
tactic properties can be coordinated. If this turns out to be true, then all that is left for
pattern-based approaches is the assumption of discontinuously realized constructions
and thus a dependency between parts of constructions that states that they do not have
to be immediately adjacent to one another. The acquisition problem is then the same as
for selection-based approaches that will be the topic of the following section: what ultimately
has to be learned are dependencies between elements or valences (see Behrens
2009: 439; the author reaches the same conclusion following different considerations).
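The generalization that elements with the same syntactic properties can be coordinated, independently of the number of conjuncts, can be sketched as a single rule rather than as a list of patterns. The representation below is an assumption made for illustration: a sign is reduced to its form, its category and the arguments it still requires, and the valence frames given for the German verbs are simplified guesses rather than an analysis from the text.

```python
from typing import NamedTuple, Tuple


class Sign(NamedTuple):
    form: str
    cat: str                # syntactic category, e.g. "V"
    needs: Tuple[str, ...]  # arguments that still have to be combined with


def coordinate(*conjuncts):
    """Coordinate any number of signs that share category and open arguments."""
    first = conjuncts[0]
    if any((c.cat, c.needs) != (first.cat, first.needs) for c in conjuncts):
        return None  # conjuncts differ in category or in required arguments
    forms = [c.form for c in conjuncts]
    form = ", ".join(forms[:-1]) + " und " + forms[-1]
    return Sign(form, first.cat, first.needs)


transitive = ("NP[nom]", "NP[acc]")  # assumed valence frame
kennt = Sign("kennt", "V", transitive)
liebt = Sign("liebt", "V", transitive)
verborgt = Sign("verborgt", "V", transitive)
arbeitet = Sign("arbeitet", "V", ("NP[nom]",))

print(coordinate(kennt, liebt).form)            # kennt und liebt
print(coordinate(kennt, liebt, verborgt).form)  # kennt, liebt und verborgt
print(coordinate(kennt, arbeitet))              # None
```

The result of a coordination has the same category and the same open arguments as its conjuncts, so one rule covers two, three or more conjuncts without a separate pattern for each case.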
What needs to be acquired is the same in each case: there is particular material that has
to be combined with other material in order to yield a complete utterance.
In her article, Green shows how long-distance dependencies and the position of En-
glish auxiliaries can be acquired in later stages of development. The acquisition of gram-
mar proceeds in a monotone fashion, that is, knowledge is added – for example, knowl-
edge about the fact that material can be realized outside of the local context – and pre-
vious knowledge does not have to be revised. In her model, mistakes in the acquisition
process are in fact mistakes in the assignment of lexical entries to valence classes. These
mistakes have to be correctable.
In sum, one can say that all of Tomasello’s insights can be applied directly to selec-
tion-based approaches and the problems with pattern-based approaches do not surface
with selection-based approaches. It is important to point out explicitly once again here
that the selection-based approach discussed here is also a construction-based approach.
Constructions are just lexical and not phrasal. The important point is that, in both ap-
proaches, words and also more complex phrases are pairs of form and meaning and can
be acquired as such.
In Chapter 21, we will discuss pattern-based approaches further and we will also ex-
plore areas of the grammar where phrasal patterns should be assumed.
16.5 Summary
We should take from the preceding discussion that models of language acquisition that
assume that a grammar is chosen from a large set of grammars by setting binary pa-
rameters are in fact inadequate. All theories that make reference to parameters have in
common that they are purely hypothetical since there is no non-trivial set of parameters
that all proponents of the model equally agree on. In fact there is not even a trivial one.
In a number of experiments, Tomasello and his colleagues have shown that, in its
original form, the Principles & Parameters model makes incorrect predictions and that
language acquisition is much more pattern-based than assumed by proponents of P&P
analyses. Syntactic competence develops starting from verb islands. Depending on the
frequency of the input, certain verbal constructions can be mastered even though the
same construction has not yet been acquired with less frequent verbs.
The interaction with other areas of grammar still remains problematic for pattern-
based approaches: in a number of publications, it has been shown that the interac-
tion of phenomena that one can observe in complex utterances can in fact not be ex-
plained with phrasal patterns since embedding cannot be captured in an inheritance
hierarchy. This problem is not shared by selection-based approaches. All experimental
results and insights of Tomasello can, however, be successfully extended to selection-
based approaches.
Further reading
Meisel (1995) gives a very good overview of theories of acquisition in the Princi-
ples & Parameters model.
Adele Goldberg and Michael Tomasello are the most prominent proponents of
Construction Grammar, a theory that explicitly tries to do without the assump-
tion of innate linguistic knowledge. They published many papers and books
about topics related to Construction Grammar and acquisition. The most impor-
tant books probably are Goldberg (2006) and Tomasello (2003).
An overview of different theories of acquisition in German can be found in
Klann-Delius (2008); an English overview is Ambridge & Lieven (2011).
17 Generative capacity and grammar formalisms
In several of the preceding chapters, the complexity hierarchy for formal languages was
mentioned. The simplest languages are the so-called regular languages (Type-3). They are
followed by the languages described by context-free grammars (Type-2), then those described
by context-sensitive grammars (Type-1), and finally by those described by unrestricted grammars
(Type-0), which generate the recursively enumerable languages, the most complex class.
In creating theories, a conscious effort was made to use formal means that correspond
to what one can actually observe in natural language. This led to the abandonment
of unrestricted Transformational Grammar since this has generative power of Type-0
(see page 86). GPSG was deliberately designed in such a way as to be able to analyze
just the context-free languages and not more. In the mid-80s, it was shown that natu-
ral languages have a higher complexity than context-free languages (Shieber 1985; Culy
1985). It is now assumed that so-called mildly context-sensitive grammars are sufficient
for analyzing natural languages. Researchers working on TAG are developing
variants of TAG that fall into exactly this category. Similarly, it was shown for
different variants of Stabler’s Minimalist Grammars (see Section 4.6.4 and Stabler 2001;
2011b) that they have a mildly context-sensitive capacity (Michaelis 2001). Peter Hell-
wig’s Dependency Unification Grammar is also mildly context-sensitive (Hellwig 2003:
595). LFG and HPSG, as well as Chomsky’s theory in Aspects, fall into the class of Type-0
languages (Berwick 1982; Johnson 1988). The question at this point is whether it is an
ideal goal to find a descriptive language that has exactly the same power as the object
it describes. Carl Pollard (1997: 9) once said that it would be odd to claim that certain
theories in physics were not adequate simply because they make use of tools from mathe-
matics that are too powerful.1 It is not the descriptive language that should constrain the
theory but rather the theory contains the restrictions that must hold for the objects in
question. This is the view that Chomsky (1981b: 277, 280) takes. Also, see Berwick (1982:
Section 4), Kaplan & Bresnan (1982: Section 8) on LFG and Johnson (1988: Section 3.5)
on the Off-Line Parsability Constraint in LFG and attribute-value grammars in general.
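The levels of the hierarchy can be made concrete with three standard string languages. The following Python sketch is only an illustration: the function names and the choice of example languages are assumptions made here, not part of the theories discussed. The language a*b* is regular, aⁿbⁿ is context-free but not regular, and aⁿbⁿcⁿ is not context-free but can be described by mildly context-sensitive formalisms such as TAG. The recognizers are written in a general-purpose programming language, so they show the string sets themselves rather than the restricted devices that define each class.

```python
import re


def is_a_star_b_star(s: str) -> bool:
    """a*b*: a regular (Type-3) language; a finite-state pattern suffices."""
    return re.fullmatch(r"a*b*", s) is not None


def is_anbn(s: str) -> bool:
    """a^n b^n: context-free (Type-2) but not regular; the two counts must match."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n


def is_anbncn(s: str) -> bool:
    """a^n b^n c^n: not context-free; three counts must match."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n


# "aab" is in a*b* but not in a^n b^n; "aabbcc" is in a^n b^n c^n.
examples = {s: (is_a_star_b_star(s), is_anbn(s), is_anbncn(s))
            for s in ["aab", "aabb", "aabbcc"]}
```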
There is of course a technical reason to look for a grammar with the lowest level of
complexity possible: we know that it is easier for computers to process grammars with
1 If physicists required the formalism to constrain the theory:
Editor: Professor Einstein, I’m afraid we can’t accept this manuscript of yours on general relativity.
Einstein: Why? Are the equations wrong?
Editor: No, but we noticed that your differential equations are expressed in the first-order language of
set theory. This is a totally unconstrained formalism! Why, you could have written down ANY
set of differential equations! (Pollard 1997: 9)
lower complexity than grammars with higher complexity. To get an idea about the complexity
of a task, the so-called ‘worst case’ for the relevant computations is determined, that is,
one determines how long a program needs in the least favorable case to produce a result for
an input of a certain length and a grammar from a certain class. This raises the question of
whether the worst case is actually relevant. For example, some grammars that allow discontinuous
constituents perform less favorably in the worst case than normal phrase structure gram-
mars that only allow for combinations of continuous strings (Reape 1991: Section 8). As
I have shown in Müller (2004d), a parser that builds up larger units starting from words
(a bottom-up parser) is far less efficient when processing a grammar assuming a verb
movement analysis than is the case for a bottom-up parser that allows for discontinuous
constituents. This has to do with the fact that verb traces do not contribute any phonolog-
ical material and a parser cannot locate them without further machinery. It is therefore
assumed that a verb trace exists in every position in the string and in most cases these
traces do not contribute to an analysis of the complete input. Since the verb trace is not
specified with regard to its valence information, it can be combined with any material
in the sentence, which results in an enormous computational load. On the other hand,
if one allows discontinuous constituents, then one can do without verb traces and the
computational load is thereby reduced. The analysis using discontinuous constituents
was eventually discarded for linguistic reasons (Müller 2005b,c; 2007a; 2017a).
However, the investigation of the parsing behavior of both grammars is still interesting,
as it shows that worst case properties are not always informative.
I will discuss another example of the fact that language-specific restrictions can re-
strict the complexity of a grammar: Gärtner & Michaelis (2007: Section 3.2) assume that
Stabler’s Minimalist Grammars (see Section 4.6.4) with extensions for late adjunction
and extraposition are actually more powerful than mildly context-sensitive. If one bans
extraction from adjuncts (Frey & Gärtner 2002: 46) and also assumes the Shortest Move
Constraint (see footnote 32 on page 165), then one arrives at a grammar that is mildly
context-sensitive (Gärtner & Michaelis 2007: 178). The same is true of grammars with
the Shortest Move Constraint and a constraint for extraction from specifiers.
Whether extraction takes place from a specifier or not depends on the organization
of the particular grammar in question. In some grammars, all arguments are specifiers
(Kratzer 1996: 120–123, also see Figure 18.4 on page 561). A ban on extraction from spec-
ifiers would imply that extraction out of arguments would be impossible. This is, of
course, not true in general. Normally, subjects are treated as specifiers (also by Frey &
Gärtner 2002: 44). It is often claimed that subjects are islands for extraction (see Grewen-
dorf 1989: 35, 41; G. Müller 1996b: 220; 1998: 32, 163; Sabel 1999: 98; Fanselow 2001: 422).
Several authors have noted, however, that extraction from subjects is possible in Ger-
man (see Dürscheid 1989: 25; Haider 1993: 173; Pafel 1993; Fortmann 1996: 27; Suchsland
1997: 320; Vogel & Steinbach 1998: 87; Ballweg 1997: 2066; Müller 1999b: 100–101; De
Kuthy 2002: 7). The following data are attested examples:
(1) a. [Von den übrigbleibenden Elementen]𝑖 scheinen [die Determinantien _𝑖 ] die
of the left.over elements seem the determinants the
wenigsten Klassifizierungsprobleme aufzuwerfen.2
fewest classification.problems to.throw.up
‘Of the remaining elements, the determinants seem to pose the fewest prob-
lems for classification.’
b. [Von den Gefangenen]𝑖 hatte eigentlich [keine _𝑖 ] die Nacht der Bomben
of the prisoners had actually none the night of.the bombs
überleben sollen.3
survive should
‘None of the prisoners should have actually survived the night of the bomb-
ings.’
c. [Von der HVA]𝑖 hielten sich [etwa 120 Leute _𝑖 ] dort in ihren Gebäuden
of the HVA held REFL around 120 people there in their buildings
auf.4
PART
‘Around 120 people from the HVA stayed there inside their buildings.’
d. [Aus dem „Englischen Theater“]𝑖 stehen [zwei Modelle _𝑖 ] in den Vitrinen.5
from the English theater stand two models in the cabinets
‘Two models from the ‘English Theater’ are in the cabinets.’
e. [Aus der Fraktion]𝑖 stimmten ihm [viele _𝑖 ] zu darin, dass die
from the faction agreed him many PART there.in that the
Kaufkraft der Bürger gepäppelt werden müsse, nicht die gute Laune
buying.power of.the citizens boosted become must not the good mood
der Wirtschaft.6
of.the economy
‘Many of the faction agreed with him that it is the buying power of citizens
that needs to be increased, not the good spirits of the economy.’
f. [Vom Erzbischof Carl Theodor Freiherr von Dalberg]𝑖 gibt es
from archbishop Carl Theodor Freiherr from Dalberg gives it
beispielsweise [ein Bild _𝑖 ] im Stadtarchiv.7
for.example a picture in.the city.archives
‘For example, there is a picture of archbishop Carl Theodor Freiherr of Dalberg
in the city archives.’
2 In the main text of Engel (1970: 102).
3 Bernhard Schlink, Der Vorleser, Diogenes Taschenbuch 22953, Zürich: Diogenes Verlag, 1997, p. 102.
4 Spiegel, 3/1999, p. 42.
5 Frankfurter Rundschau, quoted from De Kuthy (2001: 52).
6 taz, 16.10.2003, p. 5.
7 Frankfurter Rundschau, quoted from De Kuthy (2002: 7).
This means that a ban on extraction from specifiers cannot hold for German. As such, it
cannot be true for all languages.
We have a situation that is similar to the one with discontinuous constituents: since
it is not possible to integrate the ban on extraction discussed here into the grammar
formalism, it is more powerful than what is required for describing natural language.
However, the restrictions in actual grammars – in this case, the restrictions on extraction
from specifiers in the relevant languages – ensure that the respective language-specific
grammars have a mildly context-sensitive capacity.
18 Binary branching, locality, and
recursion
This chapter discusses three points: Section 18.1 deals with the question of whether all
linguistic structures should be binary branching or not. Section 18.2 discusses the question
of what information should be available for selection, that is, whether governing heads
can access the internal structure of selected elements or whether everything should be
restricted to local selection. Finally, Section 18.3 discusses recursion and how/whether
it is captured in the different grammar theories that are discussed in this book.
[Figure: three tree structures over the string Mummy must leave now]
However, Haegeman (1994: 88) provides evidence for the fact that (1) has the structure
in (2):
(2) [Mummy [must [leave now]]]
The relevant tests showing this include elliptical constructions, that is, the fact that it
is possible to refer to the constituents in (2) with pronouns. This means that there is
actually evidence for the structure of (1) that is assumed by linguists and we therefore
do not have to assume that it is just hard-wired in our brains that only binary-branching
structures are allowed. Haegeman (1994: 143) mentions a consequence of the binary
branching hypothesis: if all structures are binary-branching, then it is not possible to
straightforwardly account for sentences with ditransitive verbs in X̄ theory. In X̄ theory,
it is assumed that a head is combined with all its complements at once (see Section 2.5).
So in order to account for ditransitive verbs in X̄ theory, an empty element (little v) has
to be assumed (see Section 4.1.4).
It should have become clear in the discussion of the arguments for the Poverty of
the Stimulus in Section 13.8 that the assumption that a restriction to binary-branching
structures is part of our innate linguistic knowledge is nothing more than pure
speculation. Haegeman offers no evidence for this assumption. As shown in the
discussions of the various theories we have seen, it is possible to capture the data with
flat structures. For example, it is possible to assume that, in English, the verb is combined
with its complements in a flat structure (Pollard & Sag 1994: 39). There are sometimes
theory-internal reasons for deciding for one kind of branching or another, but these are
not always applicable to other theories. For example, Binding Theory in GB theory is
formulated with reference to dominance relations in trees (Chomsky 1981a: 188). If one
assumes that syntactic structure plays a crucial role for the binding of pronouns (see
page 90), then it is possible to make assumptions about syntactic structure based on the
observable binding relations (see also Section 4.1.4). Binding data have, however, received
a very different treatment in various theories. In LFG, constraints on f-structure are used
for Binding Theory (Dalrymple 1993), whereas Binding Theory in HPSG operates on
argument structure lists (valence information that is ordered in a particular way, see
Section 9.1.1).
The opposite of Haegeman’s position is the argumentation for flat structures put for-
ward by Croft (2001: Section 1.6.2). In his Radical Construction Grammar FAQ, Croft
observes that a phrasal construction such as the one in (3a) can be translated into a
Categorial Grammar lexical entry like (3b).
(3) a. [VP V NP ]
b. VP/NP
He claims that a disadvantage of Categorial Grammar is that it only allows for binary-
branching structures and yet there exist constructions with more than two parts (p. 49).
The exact reason why this is a problem is not explained, however. He even acknowledges
himself that it is possible to represent constructions with more than two arguments in
Categorial Grammar. For a ditransitive verb, the entry in Categorial Grammar of English
would take the form of (4):
18.1 Binary branching
(4) ((s\np)/np)/np
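How the entry in (4) gives rise to a binary-branching derivation can be sketched in a few lines of Python. This is a minimal illustration that assumes only forward and backward function application; the class name `Fn` and the function `combine` are names chosen here, not notation from Categorial Grammar itself. Each step combines exactly two categories, so a verb with three arguments is saturated in three binary steps.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Fn:
    """A functor category: result/arg looks right, result\\arg looks left."""
    result: "Cat"
    slash: str
    arg: "Cat"


Cat = Union[str, Fn]


def combine(left: Cat, right: Cat) -> Optional[Cat]:
    """Forward application (X/Y Y => X) or backward application (Y X\\Y => X)."""
    if isinstance(left, Fn) and left.slash == "/" and left.arg == right:
        return left.result
    if isinstance(right, Fn) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


VP = Fn("s", "\\", "np")                    # s\np
GIVES = Fn(Fn(VP, "/", "np"), "/", "np")    # ((s\np)/np)/np, as in (4)

step1 = combine(GIVES, "np")   # verb + first object   => (s\np)/np
step2 = combine(step1, "np")   # + second object       => s\np
step3 = combine("np", step2)   # subject + verb phrase => s
```

Since `combine` only ever takes two categories, the flat phrasal pattern with three arguments corresponds to a sequence of binary combinations rather than to a single local tree.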
[Figure 18.2: flat tree [S NP↓ [VP [V gives] NP↓ NP↓]] and binary-branching tree
[S NP↓ [VP [V′ [V gives] NP↓] NP↓]]]
If we consider the elementary trees for TAG in Figure 18.2, it becomes clear that it
is equally possible to incorporate semantic information into a flat tree and a
binary-branching tree. The binary-branching tree corresponds to a Categorial Grammar
derivation. In both analyses in Figure 18.2, a meaning is assigned to a head that occurs with
a certain number of arguments. Ultimately, the exact structure required depends on
the kinds of restrictions on structures that one wishes to formulate. In this book, such
restrictions are not discussed, but as explained above some theories model binding re-
lations with reference to tree structures. Reflexive pronouns must be bound within a
particular local domain inside the tree. In theories such as LFG and HPSG, these bind-
ing restrictions are formulated without any reference to trees. This means that evidence
from binding data for one of the structures in Figure 18.2 (or for other tree structures)
constitutes nothing more than theory-internal evidence.
Another reason to assume trees with more structure is the possibility of inserting adjuncts
at any node. In Chapter 9, an HPSG analysis for German that assumes binary-branching
structures was proposed. With this analysis, it is possible to attach an adjunct to any
node and thereby explain the free ordering of adjuncts in the middle field:
(5) a. [weil] der Mann der Frau das Buch gestern gab
because the man the woman the book yesterday gave
‘because the man gave the woman the book yesterday’
b. [weil] der Mann der Frau gestern das Buch gab
because the man the woman yesterday the book gave
c. [weil] der Mann gestern der Frau das Buch gab
because the man yesterday the woman the book gave
d. [weil] gestern der Mann der Frau das Buch gab
because yesterday the man the woman the book gave
This analysis is not the only one possible, however. One could also assume an entirely
flat structure where arguments and adjuncts are dominated by one node. Kasper (1994)
suggests this kind of analysis in HPSG (see also Section 5.1.5 for GPSG analyses that make
use of metarules for the introduction of adjuncts). Kasper requires complex relational
constraints that create syntactic relations between elements in the tree and also compute
the semantic contribution of the entire constituent using the meaning of both the verb
and the adjuncts. The analysis with binary-branching structures is simpler than those
with complex relational constraints and – in the absence of theory-external evidence for
flat structures – should be preferred to the analysis with flat structures. At this point, one
could object that adjuncts in English cannot occur in all positions between arguments
and therefore the binary-branching Categorial Grammar analysis and the TAG analysis
in Figure 18.2 are wrong. This is not correct, however, as it is the specification of adjuncts
with regard to the adjunction site that is crucial in Categorial Grammar. An adverb has
the category (s\np)\(s\np) or (s\np)/(s\np) and can therefore only be combined with con-
stituents that correspond to the VP node in Figure 18.2. In the same way, an elementary
tree for an adverb in TAG can only attach to the VP node (see Figure 12.3 on page 419).
For the treatment of adjuncts in English, binary-branching structures therefore do not
make any incorrect predictions.
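The point about the category of adverbs can be checked mechanically. The following Python sketch assumes a fragment with function application only (no composition or type raising), and the tuple encoding of categories is an assumption made for this illustration. Under these assumptions, an adverb of category (s\np)\(s\np) or (s\np)/(s\np) combines with a complete s\np but not with the verb before it has found its object.

```python
def combine(left, right):
    """Forward application (X/Y Y => X) or backward application (Y X\\Y => X).

    Categories are atoms (strings) or triples (result, slash, argument).
    """
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


VP = ("s", "\\", "np")          # s\np
TV = (VP, "/", "np")            # (s\np)/np, a transitive verb
ADV_POST = (VP, "\\", VP)       # (s\np)\(s\np), follows the VP
ADV_PRE = (VP, "/", VP)         # (s\np)/(s\np), precedes the VP

vp = combine(TV, "np")                   # verb + object => s\np
vp_with_adverb = combine(vp, ADV_POST)   # VP + adverb   => s\np
adverb_with_vp = combine(ADV_PRE, vp)    # adverb + VP   => s\np
blocked = combine(TV, ADV_POST)          # verb + adverb before the object => None
```

The restriction on the adjunction site thus follows from the lexical category of the adverb and not from the binary-branching structure as such.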
18.2 Locality
The question of local accessibility of information has been treated in various ways by
the theories discussed in this book. In the majority of theories, one tries to make infor-
mation about the inner workings of phrases inaccessible for adjacent or higher heads,
that is, glaubt ‘believe’ in (6) selects a sentential argument but it cannot “look inside”
this sentential argument.
(6) a. Karl glaubt, dass morgen seine Schwester kommt.
Karl believes that tomorrow his sister comes
‘Karl believes that his sister is coming tomorrow.’
b. Karl glaubt, dass seine Schwester morgen kommt.
Karl believes that his sister tomorrow comes
Thus for example, glauben cannot enforce that the subject of the verb has to begin with a
consonant or that the complementizer has to be combined with a verbal projection start-
ing with an adjunct. In Section 1.5, we saw that it is a good idea to classify constituents in
terms of their distribution and independent of their internal structure. If we are talking
about an NP box, then it is not important what this NP box actually contains. It is only
of importance that a given head wants to be combined with an NP with a particular case
marking. This is called locality of selection.
Various linguistic theories have tried to implement locality of selection. The simplest
form of this implementation is shown by phrase structure grammars of the kind dis-
cussed in Chapter 2. The rule in (17) on page 59, repeated here as (7), states that a ditran-
sitive verb can occur with three noun phrases, each with the relevant case:
(7) S → NP(Per1,Num1,nom)
NP(Per2,Num2,dat)
NP(Per3,Num3,acc)
V(Per1,Num1,ditransitive)
Since the symbols for NPs do not have any further internal structure, the verb cannot
require that there has to be a relative clause in an NP, for example. The internal prop-
erties of the NP are not visible to the outside. We have already seen in the discussion
in Chapter 2, however, that certain properties of phrases have to be outwardly visible.
This was the information that was written on the boxes themselves. For noun phrases,
at least information about person, number and case is required in order to correctly
capture their relation to a head. The gender value is important in German as well, since
adverbial phrases such as einer nach dem anderen ‘one after the other’ have to agree in
gender with the noun they refer to (see example (12) on page 515). Apart from that, in-
formation about the length of the noun phrases is required, in order to determine their
order in a clause. Heavy constituents are normally ordered after lighter ones, and are
also often extraposed (cf. Behaghel’s Gesetz der wachsenden Glieder ‘Law of increasing
constituents’ (1909: 139; 1930: 86)).
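The way in which rule (7) enforces locality can be illustrated with a small Python sketch. The class and function names are assumptions made for this illustration. The NP symbol is modeled as a record that carries only person, number and case, so a rule that refers to such a record has no way of asking about a relative clause or any other internal property. The shared variables Per1 and Num1 of the rule correspond to the equality check between the nominative NP and the verb.

```python
from typing import NamedTuple


class NP(NamedTuple):
    per: int
    num: str
    case: str


class V(NamedTuple):
    per: int
    num: str
    valence: str


def licenses_s(np1: NP, np2: NP, np3: NP, verb: V) -> bool:
    """Check a sequence of daughters against rule (7)."""
    return (
        (np1.case, np2.case, np3.case) == ("nom", "dat", "acc")
        and verb.valence == "ditransitive"
        # Per1 and Num1 are shared between the nominative NP and the verb.
        and (np1.per, np1.num) == (verb.per, verb.num)
    )


subject = NP(3, "sg", "nom")
indirect_object = NP(3, "sg", "dat")
direct_object = NP(3, "sg", "acc")

ok = licenses_s(subject, indirect_object, direct_object, V(3, "sg", "ditransitive"))
agreement_clash = licenses_s(subject, indirect_object, direct_object,
                             V(3, "pl", "ditransitive"))
```

Making further properties visible, such as gender or weight, would amount to adding fields to the `NP` record, which is the analogue of writing more information on the box.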
Theories that strive to be as restrictive as possible with respect to locality therefore
have to develop mechanisms that allow one to only access information that is required
to explain the distribution of constituents. This is often achieved by projecting certain
properties to the mother node of a phrase. In X̄ theory, the part of speech a head
belongs to is passed up to the maximal projection: if the head is an N, for example, then
the maximal projection is an NP. In GPSG, HPSG and variants of CxG, there are Head
Feature Principles responsible for the projection of features. Head Feature Principles
ensure that an entire group of features, so-called head features, are present on the max-
imal projection of a head. Furthermore, every theory has to be capable of representing
the fact that a constituent can lack one of its parts and this part is then realized via a
long-distance dependency in another position in the clause. As previously discussed on
page 305, there are languages in which complementizers inflect depending on whether
their complement is missing a constituent or not. This means that this property must be
somehow accessible. In GPSG, HPSG and variants of CxG, there are additional groups
of features that are present at every node between a filler and a gap in a long-distance
dependency. In LFG, there is f-structure instead. Using Functional Uncertainty, one can
look for the position in the f-structure where a particular constituent is missing. In GB
theory, movement proceeds cyclically, that is, an element is moved into the specifier of
CP and can be moved from there into the next highest CP. It is assumed in GB theory
that heads can look inside their arguments, at least they can see the elements in the
specifier position. If complementizers can access the relevant specifier positions, then
they can determine whether something is missing from an embedded phrase or not. In
GB theory, there was also an analysis of case assignment in infinitive constructions in
which the case-assigning verb governs into the embedded phrase and assigns case to the
element in SpecIP. Figure 18.3 shows the relevant structure taken from Haegeman (1994:
170). Since the Case Principle is formulated in such a way that only finite I can assign
case to the subject (cf. page 110), him does not receive case from I. Instead, it is assumed
that the verb believe assigns case to the subject of the embedded infinitive.
[IP NP [I′ I [VP [V′ V [IP NP [I′ I [VP [V′ V NP]]]]]]]]
Figure 18.3: Analysis of the AcI construction with Exceptional Case Marking
Verbs that can assign case across phrase boundaries are referred to as ECM verbs,
where ECM stands for Exceptional Case Marking. As the name suggests, this instance
of case assignment into a phrase was viewed as an exception. In newer versions of the
theory (e.g., Kratzer 1996: 120–123), all case assignment is to specifier positions. For
example, the Voice head in Figure 18.4 on the next page assigns accusative to the DP in
the specifier of VP. Since the Voice head governs into the VP, case assignment to a run-
of-the-mill object in this theory is an instance of exceptional case assignment as well.
The same is true in Adger’s version of Minimalism, which was discussed in Chapter 4:
Adger (2010) argues that his theory is more restrictive than LFG or HPSG since it is
only one feature that can be selected by a head, whereas in LFG and HPSG complex
feature bundles are selected. However, the strength of this kind of locality constraint
is weakened by the operation Agree, which allows for nonlocal feature checking. As in
Kratzer’s proposal, case is assigned nonlocally by little v to the object inside the VP (see
Section 4.1.5.2).
[Figure 18.4: [VoiceP DP [Voice′ Voice [VP DP V′]]]]
Adger discusses PP arguments of verbs like depend and notes that these verbs need
specific PPs, that is, the form of the preposition in the PP has to be selectable. While
this is trivial in Dependency Grammar, where the preposition is selected right away,
the respective information is projected in theories like HPSG and is then selectable at
the PP node. However, this requires that the governing verb can determine at least two
properties of the selected element: its part of speech and the form of the preposition.
This is not possible in Adger’s system and he left this for further research. Of course
it would be possible to assume an onP (a phrasal projection of on that has the category
‘on’). Similar solutions have been proposed in Minimalist theories (see Section 4.6.1 on
functional projections), but such a solution would obviously miss the generalization that
all prepositional phrases have something in common, which would not be covered in a
system with atomic categories that are word specific.
In theories such as LFG and HPSG, case assignment takes place locally in constructions
such as those in (8):
Although him, ihn ‘him’, er ‘he’ and den Teich ‘the pond’ are not semantic arguments
of the finite verbs, they are syntactic arguments (they are raised) and can therefore be
assigned case locally. See Bresnan (1982a: 348–349 and Section 8.2) and Pollard & Sag
(1994: Section 3.5) for an analysis of raising in LFG and HPSG respectively. See Meurers
(1999c), Przepiórkowski (1999b), and Müller (2007a: Section 17.4) for case assignment in
HPSG and for its interaction with raising.
There are various phenomena that are incompatible with strict locality and require the
projection of at least some information. For example, there are question tags in English
that must match the subject of the clause with which they are combined:
(9) a. She is very smart, isn’t she / * he?
b. They are very smart, aren’t they?
Bender & Flickinger (1999), Flickinger & Bender (2003) therefore propose making infor-
mation about agreement or the referential index of the subject available on the sentence
node.1 In Sag (2007), all information about phonology, syntax and semantics of the sub-
ject is represented as the value of a feature XARG (EXTERNAL ARGUMENT). Here, external
argument does not stand for what it does in GB theory, but should be understood in
a more general sense. For example, it makes the possessive pronoun accessible on the
node of the entire NP. Sag (2007) argues that this is needed to force coreference in En-
glish idioms:
(10) a. He𝑖 lost [his𝑖 / *her 𝑗 marbles].
b. They𝑖 kept/lost [their𝑖 / *our 𝑗 cool].
The use of the XARG feature looks like an exact parallel to accessing the specifier position
as we saw in the discussion of GB. However, Sag proposes that complements of prepo-
sitions in Polish are also made accessible by XARG since there are data suggesting that
higher heads can access elements inside PPs (Przepiórkowski 1999a: Section 5.4.1.2).
In Section 10.6.2 about Sign-based Construction Grammar, we already saw that a the-
ory that only makes the reference to one argument available on the highest node of a
projection cannot provide an analysis for idioms of the kind given in (11). This is because
the subject is made available with verbal heads, whereas it is the object that needs to
be accessed in sentences such as (11). This means that one has to be able to formulate
constraints affecting larger portions of syntactic structure.
(11) a. Ich glaube, mich / # dich tritt ein Pferd.2
I believe me you kicks a horse
‘I am utterly surprised.’
b. Jonas glaubt, ihn tritt ein Pferd.3
Jonas believes him kicks a horse
‘Jonas is utterly surprised.’
c. # Jonas glaubt, dich tritt ein Pferd.
Jonas believes you kicks a horse
‘Jonas believes that a horse kicks you.’
1 See also Sag & Pollard (1991: 89).
2 Richter & Sailer (2009: 311).
3 http://www.machandel-verlag.de/der-katzenschatz.html, 2015-07-06.
Theories of grammar with extended locality domains do not have any problems with
this kind of data.4 An example of this kind of theory is TAG. In TAG, one can specify
trees of exactly the right size (Abeillé 1988; Abeillé & Schabes 1989). All the material that
is fixed in an idiom is simply determined in the elementary tree. Figure 18.5 shows the
tree for kick the bucket as it is used in (12a).
(12) a. The cowboys kicked the bucket.
b. Cowboys often kick the bucket.
c. He kicked the proverbial bucket.
[Figure 18.5: elementary tree for kicked the bucket: NP↓ [VP [V kicked] [NP [D the] [N bucket]]]]
Since TAG trees can be split up by adjunction, it is possible to insert elements between
the parts of an idiom as in (12b,c) and thus explain the flexibility of idioms with regard to
adjunction and embedding.5 Depending on whether the lexical rules for the passive and
long-distance dependencies can be applied, the idiom can occur in the relevant variants.
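The interplay of a fixed idiom tree with adjunction can be sketched as follows. This Python fragment is an illustration under several assumptions: the `Node` class, the S root of the idiom tree, the category label A for the adjective and the encoding of the NA marking as a boolean are all choices made here, and substitution is not implemented. Adjunction replaces a node by an auxiliary tree whose foot node takes over the material that the node dominated.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None   # lexical anchor, if any
    foot: bool = False           # foot node of an auxiliary tree
    no_adjoin: bool = False      # NA marking: adjunction is blocked here


def find_foot(node: Node) -> Optional[Node]:
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target: Node, aux: Node) -> None:
    """Adjoin the auxiliary tree `aux` at `target`, modifying the tree in place."""
    if target.no_adjoin:
        raise ValueError(f"adjunction at {target.label} is blocked (NA)")
    foot = find_foot(aux)
    if foot is None or foot is aux or aux.label != target.label != foot.label:
        raise ValueError("auxiliary tree does not fit the target node")
    # The foot node takes over what the target dominated ...
    foot.children, foot.word, foot.foot = target.children, target.word, False
    # ... and the target now has the structure of the auxiliary tree's root.
    target.children, target.word = aux.children, aux.word


def leaves(node: Node) -> List[str]:
    if node.children:
        return [w for child in node.children for w in leaves(child)]
    return [node.word if node.word is not None else f"[{node.label}]"]


def idiom_tree(fixed: bool = False) -> Node:
    """Tree for 'kicked the bucket' with an open subject slot (S root assumed)."""
    noun = Node("N", word="bucket", no_adjoin=fixed)
    obj = Node("NP", [Node("D", word="the"), noun])
    return Node("S", [Node("NP"), Node("VP", [Node("V", word="kicked"), obj])])


tree = idiom_tree()
bucket = tree.children[1].children[1].children[1]
adjoin(bucket, Node("N", [Node("A", word="proverbial"), Node("N", foot=True)]))
print(" ".join(leaves(tree)))  # [NP] kicked the proverbial bucket
```

Calling `idiom_tree(fixed=True)` marks the noun node as NA, and the same call to `adjoin` then raises an error, which is the sketch's counterpart of a fully fixed idiom part.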
4 Or more carefully put: they do not have any serious problems since the treatment of idioms in all their
many aspects is by no means trivial (Sailer 2000).
5 Interestingly, variants of Embodied CxG are strikingly similar to TAG. The Ditransitive Construction that
was discussed on page 342 allows for additional material to occur between the subject and the verb.
The problems that arise for the semantics construction are also similar. Abeillé & Schabes (1989: 9)
assume that the semantics of John kicked the proverbial bucket is computed from the parts John′ , kick-the-
bucket ′ and proverbial′ , that is, the added modifiers always have scope over the entire idiom. This is not
adequate for all idioms (Fischer & Keil 1996):
In the idiom in (i), Bär ‘bear’ actually means ‘lie’ and the adjective has to be interpreted accordingly. The
relevant tree should therefore contain nodes that contribute semantic information and also say something
about the composition of these features.
In the same way, when computing the semantics of noun phrases in TAG and Embodied Construction
Grammar, one should bear in mind that the adjective that is combined with a discontinuous NP Construc-
tion (see page 340) or an NP tree can have narrow scope over the noun (all alleged murderers).
In cases where the entire idiom or parts of the idiom are fixed, it is possible to rule out
adjunction to the nodes of the idiom tree. Figure 18.6 shows a pertinent example from
Abeillé & Schabes (1989: 7). The ban on adjunction is marked by a subscript NA.
[Figure 18.6: elementary tree for takes into account: NP↓ [VP [V takes] NP↓ [PP_NA [P into] [NP_NA [N_NA account]]]]]
The question that also arises for other theories is whether the efforts that have been
made to enforce locality should be abandoned altogether. In our box model in Section 1.5,
this would mean that all boxes were transparent. Since plastic boxes do not allow all of
the light through, objects contained in multiple boxes cannot be seen as clearly as those
in the topmost box (the path of Functional Uncertainty is longer). This is parallel to a
suggestion made by Kay & Fillmore (1999) in CxG. Kay and Fillmore explicitly represent
all the information about the internal structure of a phrase on the mother node and
therefore have no locality restrictions at all in their theory. In principle, one can motivate
this kind of theory in parallel to the argumentation in Chapter 17. The argument there
made reference to the complexity of the grammatical formalism: the kind of complexity
that the language of description has is unimportant; it is only important what one does
with it. In the same way, one can say that regardless of what kind of information is in
principle accessible, it is not accessed if this is not permitted. This was the approach
taken by Pollard & Sag (1987: 143–145).
It is also possible to assume a world in which all the boxes contain transparent areas
where it is possible to see parts of their contents. This is more or less the LFG world: the
information about all levels of embedding contained in the f-structure is visible to both
the inside and the outside. We have already discussed Nordlinger’s (1998) LFG analysis
of Wambaya on page 309. In Wambaya, words that form part of a noun phrase can be
distributed throughout the clause. For example, an adjective that refers to a noun can
occur in a separate position from it. Nordlinger models this by assuming that an adjective
can make reference to an argument in the f-structure and then agrees with it in terms of
case, number and gender. Bender (2008c) has shown that this analysis can be transferred
to HPSG: instead of no longer representing an argument on the mother node after it has
been combined with a head, simply marking the argument as realized allows us to keep it
in the representation (Meurers 1999c; Przepiórkowski 1999b; Müller 2007a: Section 17.4).
Detmar Meurers compares both of these HPSG approaches to different ways of working
through a shopping list: in the standard approach taken by Pollard & Sag (1994), one
tears away parts of the shopping list once the relevant item has been found. In the other
case, the relevant item on the list is crossed out. At the end of the shopping trip, one
ends up with a list of what has been bought as well as the items themselves.
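The two ways of working through the shopping list can be sketched in Python. This is a minimal illustration in which valence lists are modeled as plain Python lists; the function names and the string labels for the arguments are assumptions made here and do not correspond to a particular HPSG implementation.

```python
from typing import List, Tuple

CrossOutList = List[Tuple[str, bool]]   # (argument, already realized?)


def combine_cancel(comps: List[str], arg: str) -> List[str]:
    """Standard approach: the found item is torn off the list."""
    rest = list(comps)
    rest.remove(arg)          # raises ValueError if the argument is not selected
    return rest


def combine_cross_out(comps: CrossOutList, arg: str) -> CrossOutList:
    """Alternative: the found item stays on the list but is marked as realized."""
    result, found = [], False
    for item, realized in comps:
        if item == arg and not realized and not found:
            result.append((item, True))
            found = True
        else:
            result.append((item, realized))
    if not found:
        raise ValueError(f"{arg} is not an unrealized argument")
    return result


after_cancel = combine_cancel(["NP[nom]", "NP[acc]"], "NP[acc]")
after_cross_out = combine_cross_out(
    [("NP[nom]", False), ("NP[acc]", False)], "NP[acc]")
```

After the object has been combined, `after_cancel` only contains the subject, whereas `after_cross_out` still lists both arguments, so information about the object remains available at the mother node.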
I have proposed the crossing-out analysis for depictive predicates in German and En-
glish (Müller 2004a; 2008). Depictive predicates say something about the state of a per-
son or object during the event expressed by a verb:
(13) a. Er sah sie nackt.6
he saw her naked
b. He saw her naked.
In (13), the depictive adjective can either refer to the subject or the object. However, there
is a strong preference for readings where the antecedent noun precedes the depictive
predicate (Lötscher 1985: 208). Figure 18.7 on the following page shows analyses for the
sentences in (14):
(14) a. dass er𝑖 die Äpfel 𝑗 ungewaschen𝑖/𝑗 isst
that he the apples unwashed eats
‘that he eats the apples unwashed’
b. dass er𝑖 ungewaschen𝑖/∗𝑗 die Äpfel 𝑗 isst
that he unwashed the apples eats
‘that he eats the apples (while he is) unwashed’
Arguments that have been realized are still represented on the upper nodes, however,
they are crossed-out and thereby marked as “realized”. In German, this preference for
the antecedent noun can be captured by assuming a restriction that states that the an-
tecedent noun must not yet have been realized.
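The two shopping-list strategies and the German restriction just stated can be illustrated with a small Python sketch. It is not the formalization used in the cited HPSG work: the names Arg, tear_off, cross_out, german_antecedents and english_antecedents are hypothetical, and the valence list is simplified to a list of labelled arguments.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Arg:
    label: str              # e.g. "er" or "die Äpfel"
    realized: bool = False  # True corresponds to a crossed-out list element


def tear_off(comps, label):
    """Standard cancellation: the combined argument disappears from the list."""
    return [a for a in comps if a.label != label]


def cross_out(comps, label):
    """Non-cancellation: the combined argument stays but is marked as realized."""
    return [replace(a, realized=True) if a.label == label else a for a in comps]


def german_antecedents(comps):
    """Assumed German restriction: only arguments not yet realized qualify."""
    return [a.label for a in comps if not a.realized]


def english_antecedents(comps):
    """Assumed English pattern: realized and unrealized arguments both qualify."""
    return [a.label for a in comps]


isst = [Arg("er"), Arg("die Äpfel")]

# (14a): the depictive combines with the verb before any argument is realized.
reading_14a = german_antecedents(isst)
# (14b): the depictive combines after "die Äpfel" has been realized.
reading_14b = german_antecedents(cross_out(isst, "die Äpfel"))
print(reading_14a, reading_14b)
```

With tear_off, the realized object would no longer be available at the higher node at all, so the English pattern could not be stated over the valence list.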
It is commonly assumed for English that adjuncts are combined with a VP.
(15) a. John [[VP ate the apples𝑖 ] unwashed𝑖 ].
b. You can’t [[VP give them𝑖 injections] unconscious𝑖 ].7
In approaches where the arguments of the verb are accessible at the VP node, it is possible
to establish a relation between the depictive predicate and an argument although the
antecedent noun is inside the VP. English differs from German in that depictives can
refer to both realized (them in (15b)) and unrealized (you in (15b)) arguments.
6
Haider (1985b: 94).
7
Simpson (2005a: 17).
Figure 18.7: Analysis of dass er die Äpfel ungewaschen isst ‘that he the apples unwashed
eats’ and dass er ungewaschen die Äpfel isst ‘that he unwashed the apples eats’
Higginbotham (1985: 560) and Winkler (1997) have proposed corresponding non-can-
cellation approaches in GB theory. There are also parallel suggestions in Minimalist
theories: checked features are not deleted, but instead marked as already checked (Sta-
bler 2011b: 14). However, these features are still viewed as inaccessible.
Depending on how detailed the projected information is, it can be possible to see adjuncts
and arguments in embedded structures as well as their phonological, syntactic and
semantic properties. In the CxG variant proposed by Kay and Fillmore, all information
is available. In LFG, information about grammatical function, case and similar proper-
ties is accessible. However, the part of speech is not contained in the f-structure. If
the part of speech does not stand in a one-to-one relation to grammatical function, it
cannot be restricted using selection via f-structure. Nor is phonological information rep-
resented completely in the f-structure. If the analysis of idioms requires nonlocal access
to phonological information or part of speech, then this has to be explicitly encoded in
the f-structure (see Bresnan (1982b: 46–50) for more on idioms).
In the HPSG variant that I adopt, only information about arguments is projected. Since
arguments are always represented by descriptions of type synsem, no information about
their phonological realization is present. However, there are daughters in the structure
so that it is still possible to formulate restrictions for idioms as in TAG or Construction
Grammar (see Richter & Sailer (2009) for an analysis of the ‘horse’ example in (11a)).
This may seem somewhat like overkill: although we already have the tree structure, we
are still projecting information about arguments that have already been realized (un-
fortunately these also contain information about their arguments and so on). At this
point, one could be inclined to prefer TAG or LFG since these theories only make use
of one extension of locality: TAG uses trees of arbitrary or rather exactly the necessary
size and LFG makes reference to a complete f-structure. However, things are not quite
that simple: if one wants to create a relation to an argument when adjoining a depictive
predicate in TAG, then one requires a list of possible antecedents. Syntactic factors (e.g.,
reference to dative vs. accusative noun phrases, to arguments vs. adjuncts, coordination
of verbs vs. nouns) play a role in determining the referent noun; the choice cannot be
reduced to semantic relations. Similarly, there are considerably different restrictions for different
kinds of idioms and these cannot all be formulated in terms of restrictions on f-structure
since f-structure does not contain information about parts of speech.
One should bear in mind that some phenomena require reference to larger portions
of structure. The majority of phenomena can be treated in terms of head domains and
extended head domains; however, there are idioms that go beyond the sentence level.
Every theory has to account for this somehow.
18.3 Recursion
Every theory in this book can deal with self-embedding in language as it was discussed
on page 4. The example (2) is repeated here as (16):
(16) that Max thinks [that Julia knows [that Otto claims [that Karl suspects [that
Richard confirms [that Friederike is laughing]]]]]
Most theories capture this directly with recursive phrase structure rules or dominance
schemata. TAG is special with regard to recursion since recursion is factored out of the
trees. The corresponding effects are created by an adjunction operation that allows any
amount of material to be inserted into trees. It is sometimes claimed that Construction
Grammar cannot capture the existence of recursive structure in natural language (e.g.,
Leiss 2009: 269). This impression is understandable since many analyses are extremely
surface-oriented. For example, one often talks of a [Sbj TrVerb Obj] construction. How-
ever, the grammars in question also become recursive as soon as they contain a sentence
embedding or relative clause construction. A sentence embedding construction could
have the form [Sbj that-Verb that-S], where a that-Verb is one that can take a sentential
complement and that-S stands for the respective complement. A that-clause can then
be inserted into the that-S slot. Since this that-clause can also be the result of the appli-
cation of this construction, the grammar is able to produce recursive structures such as
those in (17):
(17) Otto claims [that-S that Karl suspects [that-S that Richard sleeps]].
In (17), both Karl suspects that Richard sleeps and the entire clause are instances of the
[Sbj that-Verb that-S] construction. The entire clause therefore contains an embedded
subpart that is licensed by the same construction as the clause itself. (17) also contains a
constituent of the category that-S that is embedded inside of that-S. For more on recur-
sion and self-embedding in Construction Grammar, see Verhagen (2010).
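The way a single [Sbj that-Verb that-S] construction yields self-embedding can be made concrete with a short Python sketch. The function name embed and the string-based bracket output are illustrative assumptions, not part of any Construction Grammar formalization.

```python
def embed(pairs, innermost):
    """Apply the [Sbj that-Verb that-S] construction once per (subject, verb) pair."""
    if not pairs:
        # Base case: a clause that takes no sentential complement.
        return innermost
    (subject, verb), rest = pairs[0], pairs[1:]
    # The that-S slot is filled by a that-clause, which may itself be an
    # instance of the same construction: this is the recursive step.
    return f"{subject} {verb} [that-S that {embed(rest, innermost)}]"


example_17 = embed([("Otto", "claims"), ("Karl", "suspects")], "Richard sleeps")
print(example_17)
```

Adding further (subject, verb) pairs gives deeper embeddings such as (16) without any change to the construction itself.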
Similarly, every Construction Grammar that allows a noun to combine with a genitive
noun phrase also allows for recursive structures. The construction in question could
have the form [Det N NP[gen] ] or [ N NP[gen] ]. The [Det N NP[gen] ] construction
licenses structures such as (18):
(18) [NP des Kragens [NP des Mantels [NP der Vorsitzenden]]]
the collar of.the coat of.the chairwoman
‘the collar of the coat of the chairwoman’
Jurafsky (1996) and Bannard, Lieven & Tomasello (2009) use probabilistic context-free
grammars (PCFG) for a Construction Grammar parser with a focus on psycholinguistic
plausibility and modeling of acquisition. Context-free grammars have no problems with
self-embedding structures like those in (18) and thus this kind of Construction Grammar
itself does not encounter any problems with self-embedding.
Goldberg (1995: 192) assumes that the resultative construction for English has the
following form:
(19) [SUBJ [V OBJ OBL]]
This corresponds to a complex structure as assumed for elementary trees in TAG. LTAG
differs from Goldberg’s approach in that every structure requires a lexical anchor, that
is, for example (19), the verb would have to be fixed in LTAG. But in Goldberg’s analysis,
verbs can be inserted into independently existing constructions (see Section 21.1). In TAG
publications, it is often emphasized that elementary trees do not contain any recursion.
The entire grammar is recursive, however, since additional elements can be added to the
tree using adjunction and – as (17) and (18) show – insertion into substitution nodes can
also create recursive structures.
19 Empty elements
This chapter deals with empty elements. I first discuss the general attitude of various
research traditions towards empty elements and then show how they can be eliminated
from grammars (Section 19.2). Section 19.3 discusses empty elements that have been
suggested in order to facilitate semantic interpretation. Section 19.4 discusses possible
motivation for empty elements with a special focus on cross-linguistic comparison and
the final Section 19.5 shows that certain accounts with transformations, lexical rules, and
empty elements can be translated into each other.
1
Note that empty elements in TAG are slightly different from empty elements in other theories. In TAG the
empty elements are usually part of elementary trees, that is, they are not lexical items that are combined
with other material.
19.2 Eliminating empty elements from grammars
copulas, controlled infinitives, and for coordinate structures, but Groß & Osborne (2009:
73) reject empty elements (with the exception of ellipsis, Osborne 2018a).
No empty elements are assumed in Construction Grammar (Michaelis & Ruppenhofer
2001: 49–50; Goldberg 2003a: 219; Goldberg 2006: 10), the related Simpler Syntax (Culi-
cover & Jackendoff 2005) as well as in Cognitive Grammar.2 The argumentation against
empty elements runs along the following lines:
1. There is no evidence for invisible objects.
2. There is no innate linguistic knowledge.
3. Therefore, knowledge about empty elements cannot be learned, which is why they
cannot be assumed as part of our grammar.
This raises the question of whether all the premises on which the conclusion is based
actually hold. If we consider an elliptical construction such as (2), then it is clear that a
noun has been omitted:
(2) Ich nehme den roten Ball und du den blauen.
I take the.ACC red.ACC ball and you the.ACC blue.ACC
‘I’ll take the red ball and you take the blue one.’
Despite there being no noun in den blauen ‘the blue’, this group of words behaves both
syntactically and semantically just like a noun phrase. (2) is of course not necessarily
evidence for there being empty elements, because one could simply say that den blauen
is a noun phrase consisting only of an article and an adjective (Wunderlich 1987).
Just as it is understood that a noun is missing in (2), speakers of English
know that something is missing after like:
(3) Bagels, I like.
Every theory of grammar has to somehow account for these facts. It must be represented
in some way that like in (3) behaves just like a verb phrase that is missing something.
One possibility is to use traces. Bar-Hillel, Perles & Shamir (1961: 153, Lemma 4.1) have
shown that it is possible to turn phrase structure grammars with empty elements into
those without any. In many cases, the same techniques can be applied to the theories
presented here and we will therefore discuss the point in more detail in the following
section.
(4) v → np, v
v → np, pp, v
np → 𝜖
(5) v → np, v
v → v
v → np, pp, v
v → pp, v
This can also lead to cases where all elements on the right-hand side of a rule are re-
moved. Thus, what one has done is actually create a new empty category and then one
has to apply the respective replacement processes again. We will see an example of this
in a moment. Looking at the pair of grammars in (4)–(5), it is clear that the number
of rules has increased in (5) compared to (4) despite the grammars licensing the same
sequences of symbols. The fact that an NP argument can be omitted is not expressed
directly in (5) but instead is implicitly contained in two rules.
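The step from (4) to (5) can be reproduced mechanically. The following Python sketch uses the textbook procedure for removing empty productions from a context-free grammar; the function name eliminate_empty and the encoding of rules as (left-hand side, right-hand side tuple) pairs are assumptions made for illustration. As usual for this procedure, the empty string itself would be lost if the original grammar licensed it.

```python
from itertools import product


def eliminate_empty(rules):
    """Return rules without empty right-hand sides, licensing the same non-empty strings."""
    # Step 1: find all symbols that can be rewritten as the empty string.
    # The loop repeats because removing symbols can make further rules empty.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    # Step 2: for every rule, add each variant in which nullable symbols
    # are either kept or left out; drop variants with an empty right-hand side.
    result = []
    for lhs, rhs in rules:
        options = [[(sym,), ()] if sym in nullable else [(sym,)] for sym in rhs]
        for choice in product(*options):
            new_rhs = tuple(sym for part in choice for sym in part)
            if new_rhs and (lhs, new_rhs) not in result:
                result.append((lhs, new_rhs))
    return result


GRAMMAR_4 = [("v", ("np", "v")), ("v", ("np", "pp", "v")), ("np", ())]
GRAMMAR_5 = eliminate_empty(GRAMMAR_4)
for lhs, rhs in GRAMMAR_5:
    print(lhs, "→", ", ".join(rhs))
```

The while loop in step 1 corresponds to the observation above that removing elements can create a new empty category, so that the replacement has to be applied again.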
If one applies this procedure to the HPSG grammar in Chapter 9, then the trace does
not have a specific category such as NP. The trace simply has to be compatible with a
non-head daughter. As the examples in (6) show, adjuncts, arguments and parts of verbal
complexes can be extracted.
(6) a. Er𝑖 liest t𝑖 die Berichte.
he reads the reports
b. Oft𝑖 liest er die Berichte t𝑖 nicht.
often reads he the reports not
‘Often, he does not read the reports.’
c. Lesen𝑖 wird er die Berichte t𝑖 müssen.
read will he the reports must
‘He will have to read the reports.’
The relevant elements are combined with their head in a specific schema (Head-Argu-
ment Schema, Head-Adjunct Schema, Predicate Complex Schema). See Chapter 9 for
the first two schemata; the Predicate Complex Schema is motivated in detail in Müller
(2002a: Chapter 2; 2007a: Chapter 15). If one wishes to do without traces, then one needs
additional schemata for the fronting of adjuncts, of arguments and of parts of
predicate complexes. The combination of a head with a trace is given in Figure 19.1 on
the next page. The trace-less analysis is shown in Figure 19.2 on the facing page. In
Figure 19.1, the element in the COMPS list of kennen is identified with the SYNSEM value
of the trace 4 . The lexical entry of the trace prescribes that the LOCAL value of the trace
should be identical to the element in the INHER|SLASH list.
The Non-Local Feature Principle (page 303) ensures that the SLASH information is
present on the mother node. Since an argument position gets saturated in Head-Argu-
ment structures, the accusative object is no longer contained in the COMPS list of the
mother node.
V[COMPS ⟨ NP[nom] ⟩, INHER|SLASH ⟨ 1 ⟩] dominating the trace _ and liest ‘reads’
Figure 19.1: Introduction of information about long-distance dependencies with a trace
V[COMPS ⟨ NP[nom] ⟩, INHER|SLASH ⟨ 1 ⟩] dominating only liest ‘reads’
Figure 19.2: Introduction of information about long-distance dependencies using a unary
projection
Figure 19.2 shows the parallel trace-less structure. The effect that one gets by combin-
ing a trace in argument position in Head-Argument structures is represented directly on
the mother node in Figure 19.2: the LOCAL value of the accusative object was identified
with the element in INHER|SLASH on the mother node and the accusative object does not
occur in the valence list any more.
The grammar presented in Chapter 9 contains another empty element: a verb trace.
This would then also have to be eliminated.
Figure 19.3 on the next page shows the combination of a verb trace with an accusative
object. The verb trace is specified such that the DSL value is identical to the LOCAL value of
the trace (see p. 297). Since DSL is a head feature, the corresponding value is also present
on the mother node.
Figure 19.3 (tree): [V[HEAD|DSL 1, COMPS 2] [3 NP[acc] die Berichte ‘the reports’]
[V 1 [HEAD|DSL 1, COMPS 2 ⊕ ⟨ 3 NP[acc] ⟩] _ ]]
Figure 19.4 shows the structures that we get by omitting the empty
node. This structure may look odd at first sight since a noun phrase is projected to a
verb (see page 235 for similar verb-less structures in LFG).
Figure 19.4 (tree, surviving node): 3 NP[acc] die Berichte ‘the reports’
This structure contains the information that a verb is missing, just as the structure
with the verb trace does. It is the DSL value that is decisive for the contexts in which
the structure in Figure 19.4 can appear. This is identical to the value in Figure 19.3 and
contains the information that a verb that requires an accusative object is missing in the
structure in question. So far, we have seen that extraction traces can be removed
from the grammar by stipulating three additional rules. Similarly, three new rules are
needed for the verb trace. Unfortunately, it does not stop here as the traces for extraction
and head movement can also interact. For example, the NP in the tree in Figure 19.4 could
be an extraction trace. Therefore, the combination of traces can result in more empty
elements that then also have to be eliminated. Since we have three schemata, we will
have three new empty elements if we combine the non-head daughter with an extraction
trace and the head daughter with a verb trace. (8) shows these cases:
Eliminating two empty elements therefore comes at the price of twelve new rules. These
rules are not particularly transparent and it is not immediately obvious why the mother
node describes a linguistic object that follows general grammatical laws. For example,
there are no heads in the structures following the pattern in Figure 19.4. Since there is
no empirical difference between the theoretical variant with twelve additional schemata
and the variant with two empty elements, one should prefer the theory that makes fewer
assumptions (Occam’s Razor) and that is the theory with two empty elements.
One might think that the problem discussed here is just a problem specific to HPSG not
shared by trace-less analyses such as the LFG approach that was discussed in Section 7.5.
If we take a closer look at the rule proposed by Dalrymple (2006: 84), we see that the
situation in LFG grammars is entirely parallel. The brackets around the category symbols
mark their optionality. The asterisk following the PP means that any number of PPs (zero
or more) can occur in this position.
(11) V′ → (V) (NP) PP*
This means that (11) is a shorthand for rules such as those in (12):
(12) a. V′ → V
b. V′ → V NP
c. V′ → V NP PP
d. V′ → V NP PP PP
e. …
f. V′ → NP
g. V′ → NP PP
h. V′ → NP PP PP
i. …
Since all the elements on the right-hand side of the rule are optional, the rule in (11) also
stands for (13):
(13) V′ → 𝜖
Thus, one does in fact have an empty element in the grammar although the empty el-
ement is not explicitly listed in the lexicon. This follows from the optionality of all
elements on the right-hand side of a rule. The rule in (12f) corresponds to the schema
that licenses the structure in Figure 19.4. In the licensed LFG structure, there is also no
head present. Furthermore, one has a large number of rules that correspond to exactly
the schemata that we get when we eliminate empty elements from an HPSG grammar.
This fact is, however, hidden in the representational format of the LFG rules. The rule
schemata of LFG allow for handy abbreviations of sometimes huge sets of rules (even
infinite sets when using ‘*’).
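How much the abbreviation in (11) hides becomes visible when it is expanded. The Python sketch below does so. The function name expand, the tags "optional" and "star", and the bound max_star are illustrative assumptions. Since PP* stands for infinitely many rules, the expansion has to be cut off at some number of PPs.

```python
from itertools import product


def expand(lhs, rhs, max_star=2):
    """Expand an abbreviated rule into plain rules, with at most max_star copies per starred symbol."""
    options = []
    for symbol, kind in rhs:
        if kind == "optional":
            options.append([(), (symbol,)])            # absent or present
        elif kind == "star":
            options.append([(symbol,) * n for n in range(max_star + 1)])
        else:
            options.append([(symbol,)])                # obligatory
    return [
        (lhs, tuple(sym for part in choice for sym in part))
        for choice in product(*options)
    ]


# (11) V′ → (V) (NP) PP*
RULE_11 = [("V", "optional"), ("NP", "optional"), ("PP", "star")]
RULES = expand("V'", RULE_11)
for lhs, rhs in RULES:
    print(lhs, "→", " ".join(rhs) if rhs else "ε")
```

The expansion contains the head-less rule (12f) as well as the empty rule (13), although neither is written down anywhere in (11).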
Pollard (1988) has shown that Steedman’s trace-less analysis of long-distance depen-
dencies is not without its problems. As discussed in Section 8.5.3, a vast number of
recategorization rules or lexical entries for relative pronouns are required.
19.3 Empty elements and semantic interpretation
Sentences such as (14) are interesting since they have multiple readings (see Dowty
1979: Section 5.6) and it is not obvious how these can be derived.
(14) dass Max alle Fenster wieder öffnete
that Max all windows again opened
‘that Max opened all the windows again’
There is a difference between a repetitive and a restitutive reading: for the repetitive
reading of (14), Max has to have opened every window at least once before, whereas the
restitutive reading only requires that all windows were open at some point, that is, they
could have been opened by someone else.
These different readings are explained by decomposing the predicate open′ into at
least two sub-predicates. Egg (1999) suggests the decomposition into CAUSE and open′:
(15) CAUSE(x, open′(y))
This means that there is a CAUSE operator that has scope over the relation open′. Using
this kind of decomposition, it is possible to capture the varying scope of wieder ‘again’:
in one of the readings, wieder scopes over CAUSE; in the other, it scopes over open′ but
below CAUSE. If we assume that öffnen has the meaning in (15), then we still have
to explain how the adverb can modify elements of a word’s meaning, that is, how wieder
‘again’ can refer to open′. Von Stechow (1996: 93) developed the analysis in Figure 19.5
on the next page. AgrS and AgrO are functional heads proposed for subject and object
agreement in languages like Basque and have been adopted for German (see Section 4.6).
Noun phrases have to be moved from the VoiceP into the specifier position of the AgrS
and AgrO heads in order to receive case. T stands for Tense and corresponds to Infl in
the GB theory (see Section 3.1.5 and Section 4.1.5). What is important is that there is
the Voice head and the separate representation of offen ‘open’ as the head of its own
phrase. In the figure, everything below Voice′ corresponds to the verb öffnen. By as-
suming a separate Voice head that contributes causative meaning, it becomes possible
to derive both readings in syntax: in the reading with narrow scope of wieder ‘again’,
the adverb is adjoined to the XP and has scope over open(x). In the reading with wide
scope, the adverb attaches to VoiceP or some higher phrase and therefore has scope over
CAUSE(BECOME(open(x))).
Jäger & Blutner (2003) point out that this analysis predicts that sentences such as (16)
only have the repetitive reading, that is, the reading where wieder ‘again’ has scope over
CAUSE.
(16) dass Max wieder alle Fenster öffnete
that Max again all windows opened
This is because wieder precedes alle Fenster and therefore all heads that are inside VoiceP.
Thus, wieder can only be combined with AgrOP or higher phrases and therefore has (too)
wide scope. (16) does permit a restitutive reading, however: all windows were open at
an earlier point in time and Max reestablishes this state.
Figure 19.5 (tree, node labels only): [AgrSP DP [AgrS′ [TP [AgrOP DP [AgrO′ [VoiceP DP
[Voice′ Voice [VP XP V]]] AgrO]] T] AgrS]]
Egg (1999) develops an analysis for these wieder cases using Constraint Language for
Lambda-Structures (CLLS). CLLS is an underspecification formalism, that is, no logical
formulae are given but instead expressions that describe logical formulae. Using expressions
of this kind, it is possible to leave scope relations underspecified. I have already mentioned
Minimal Recursion Semantics (MRS) (Copestake, Flickinger, Pollard & Sag 2005)
in several chapters of this book. Like CLLS, MRS, Underspecified Discourse
Representation Theory (Reyle 1993; Frank & Reyle 1995) and Hole Semantics (Bos
1996; Blackburn & Bos 2005) belong to the class of underspecification formalisms. See
Baldridge & Kruijff (2002) for an underspecification analysis in Categorial Grammar and
Nerbonne (1993) for an early underspecification analysis in HPSG. In the following, I will
reproduce Egg’s analysis in an MRS-like notation.
Before we turn to (14) and (16), let us consider the simpler sentence in (17):
(17) dass Max alle Fenster öffnete
that Max all windows opened
‘that Max opened all the windows’
This sentence can mean that in a particular situation, it is true of all windows that Max
opened them. A less readily accessible reading is the one in which Max causes all of the
windows to be open. It is possible to force this reading if one rules out the first reading
through contextual information (Egg 1999):
(18) Erst war nur die Hälfte der Fenster im Bus auf, aber dann öffnete Max
first was only the half of.the windows in.the bus open but then opened Max
alle Fenster.
all windows
‘At first, only half of the windows in the bus were open, but then Max opened all
of the windows.’
The two readings under discussion here differ with regard to the scope of the universal
quantifier. The reading where Max opens all the windows himself corresponds to wide scope
of the quantifier in (19a). The reading where some windows could have already been open
corresponds to (19b):
(19) a. ∀ x window′(x) → CAUSE(max′, open′(x))
     b. CAUSE(max′, ∀ x window′(x) → open′(x))
Using underspecification, both of these readings can be represented in one dominance
graph such as the one given in Figure 19.6. Each relation in Figure 19.6 has a name that
one can use to refer to the relation or “grasp” it. These names are referred to as handles.
The dominance graph states that ℎ0 dominates both ℎ1 and ℎ6 and that ℎ2 dominates
ℎ4, ℎ3 dominates ℎ5, and ℎ7 dominates ℎ5. The exact scopal relations are underspecified:
the universal quantifier can have scope over CAUSE or CAUSE can have scope over the
universal quantifier. Figures 19.7 and 19.8 show the variants of the graph with resolved
scope. The underspecified graph in Figure 19.6 does not say anything about the relation
between ℎ3 and ℎ6. The only thing it says is that ℎ3 somehow has to dominate ℎ5.
Figure 19.6: Dominance graph for Max alle Fenster öffnete
Figure 19.7: Dominance graph for the reading ∀ x window(x) → CAUSE(max,open(x)).
Figure 19.8: Graph for the reading CAUSE(max, ∀ x window(x) → open(x)).
In Figure 19.7 every′ (ℎ3) dominates CAUSE (ℎ6) and CAUSE dominates open′ (ℎ5). So,
every′ dominates open′ indirectly. In Figure 19.8, CAUSE dominates every′ and every′
dominates open′. Again the constraints of Figure 19.6 are fulfilled, but ℎ7 dominates ℎ5
only indirectly.
The fact that the quantifier dominates ℎ4 is determined by the lexical entry of the
quantifier. The fact that the quantifier dominates ℎ5 does not have to be made explicit in
the analysis since the quantifier binds a variable in the relation belonging to ℎ5, namely
x. The dominance relation between ℎ7 and ℎ5 is always determined in the lexicon since
CAUSE and open′ both belong to the semantic contribution of a single lexical entry.
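The step from the underspecified description to the resolved readings can be sketched as a brute-force search. This is an illustration, not Egg's CLLS or an MRS implementation: the names RELS, HOLES, DOMINANCE, below, render and readings are hypothetical. The dominance requirements are those stated for Figure 19.6, and the quantifier is printed in the form every(x, restriction, scope) rather than with ∀ and →.

```python
from itertools import permutations

# Labelled relations; arguments that are handles (h2, h3, h7) are holes.
RELS = {
    "h1": ("every", "x", "h2", "h3"),
    "h4": ("window", "x"),
    "h6": ("CAUSE", "max", "h7"),
    "h5": ("open", "x"),
}
HOLES = ["h0", "h2", "h3", "h7"]
# (hole, label) pairs: the hole has to dominate the label.
DOMINANCE = [("h0", "h1"), ("h0", "h6"), ("h2", "h4"), ("h3", "h5"), ("h7", "h5")]


def below(hole, plugging):
    """Return all labels reachable from a hole under a given plugging."""
    seen, stack = set(), [hole]
    while stack:
        label = plugging[stack.pop()]
        if label not in seen:
            seen.add(label)
            stack.extend(arg for arg in RELS[label][1:] if arg in HOLES)
    return seen


def render(hole, plugging):
    """Spell out the formula that fills a hole."""
    pred, *args = RELS[plugging[hole]]
    parts = [render(arg, plugging) if arg in HOLES else arg for arg in args]
    return pred + "(" + ", ".join(parts) + ")"


def readings():
    """Try every way of plugging labels into holes and keep the admissible ones."""
    found = []
    for labels in permutations(RELS):
        plugging = dict(zip(HOLES, labels))
        is_tree = below("h0", plugging) == set(RELS)
        if is_tree and all(label in below(hole, plugging) for hole, label in DOMINANCE):
            found.append(render("h0", plugging))
    return sorted(found)


RESOLVED = readings()
for formula in RESOLVED:
    print(formula)
```

Under these assumptions, exactly the two resolutions corresponding to (19a) and (19b) survive; every other plugging either fails to form a tree or violates one of the dominance requirements.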
The exact syntactic theory that one adopts for this analysis is, in the end, not of great
importance. I have chosen HPSG here. As Figure 19.9 on the next page shows, the analysis
of alle Fenster öffnet contains a simple structure with a verb and an object. This structure
does not differ from the one that would be assumed for alle Kinder kennt ‘all children
know’, involving the semantically simplex verb kennen ‘to know’. The only difference
comes from the meaning of the individual words involved. As shown in Section 9.1.6,
relations between individual words are passed on upwards. The same happens with scopal
restrictions. These are also represented in lists. HCONS stands for handle constraints.
=𝑞 in h0 =𝑞 h6 stands for equality modulo quantifier scope.
Figure 19.9 (tree):
Mother: V′[COMPS ⟨ NP 𝑦 ⟩,
  RELS ⟨ h1:every(x, h2, h3), h4:window(x), h6:CAUSE(y,h7), h5:open(x) ⟩,
  HCONS ⟨ h0 =𝑞 h1, h2 =𝑞 h4, h0 =𝑞 h6, h7 =𝑞 h5 ⟩ ]
Daughter: 2 NP 𝑥 [RELS ⟨ h1:every(x, h2, h3), h4:window(x) ⟩,
  HCONS ⟨ h0 =𝑞 h1, h2 =𝑞 h4 ⟩ ]
Daughter: V[COMPS ⟨ NP 𝑦 , 2 ⟩,
  RELS ⟨ h6:CAUSE(y,h7), h5:open(x) ⟩,
  HCONS ⟨ h0 =𝑞 h6, h7 =𝑞 h5 ⟩ ]
Egg lists the following readings for the sentence in (16) – repeated here as (20):
(20) dass Max wieder alle Fenster öffnete
that Max again all windows opened
‘that Max opened all the windows again’
1. Max opened every window and he had already done that at least once for each
window (again′(∀(CAUSE(open))); repetitive)
2. Max caused every window to be open and he had done that at least once before
(again′(CAUSE(∀(open))); repetitive)
3. At some earlier point in time, all windows were simultaneously open and Max
re-established this state (CAUSE(again′(∀(open))); restitutive)
These readings correspond to the dominance graph in Figure 19.10 on the following page.
Figure 19.11 on the next page shows the graph for (14) – repeated here as (21):
(21) dass Max alle Fenster wieder öffnete
that Max all windows again opened
To derive these dominance graphs from the ones without wieder ‘again’, all one has
to do is add the expression h8:again(h9) and the dominance requirements that demand
that ℎ9 dominates quantifiers occurring to the right of wieder and that it is dominated
by quantifiers to the left of wieder.
Figure 19.10: Dominance graph for Max wieder alle Fenster öffnete ‘that Max opened all
the windows again’
Figure 19.11: Dominance graph for Max alle Fenster wieder öffnete ‘that Max opened all
the windows again’
It is therefore unproblematic to derive the relevant readings for modification by wieder
without empty elements for CAUSE and BECOME. The meaning of the word öffnen is
decomposed in a similar way but the decomposed meaning is assigned to a single ele-
ment, the verb. By underspecification of the scopal relations in the lexicon, the relevant
readings can then be derived.
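That the readings for wieder follow from one additional relation and one additional dominance requirement can be checked with a small brute-force resolver. The Python sketch below is a simplification under stated assumptions: the names is_hole, resolve, RELS, BASE, WIEDER_FIRST and QUANTIFIER_FIRST are hypothetical, the requirement for (20) is encoded as ‘h9 dominates h1’, and the requirement for (21) is encoded as ‘h3 dominates h8’. Quantifiers are printed as every(x, restriction, scope).

```python
from itertools import permutations


def is_hole(arg):
    """Handles are written h0, h1, ...; every other argument is a variable or constant."""
    return arg.startswith("h") and arg[1:].isdigit()


def resolve(rels, dominance, top="h0"):
    """Enumerate all formulae compatible with the relations and dominance requirements."""
    holes = [top] + [a for rel in rels.values() for a in rel[1:] if is_hole(a)]

    def below(hole, plugging):
        seen, stack = set(), [hole]
        while stack:
            label = plugging[stack.pop()]
            if label not in seen:
                seen.add(label)
                stack.extend(a for a in rels[label][1:] if is_hole(a))
        return seen

    def render(hole, plugging):
        pred, *args = rels[plugging[hole]]
        parts = [render(a, plugging) if is_hole(a) else a for a in args]
        return pred + "(" + ", ".join(parts) + ")"

    found = []
    for labels in permutations(rels):
        plugging = dict(zip(holes, labels))
        # All relations must hang below the top hole, which guarantees a tree.
        if below(top, plugging) != set(rels):
            continue
        if all(label in below(hole, plugging) for hole, label in dominance):
            found.append(render(top, plugging))
    return sorted(found)


RELS = {
    "h1": ("every", "x", "h2", "h3"),
    "h4": ("window", "x"),
    "h6": ("CAUSE", "max", "h7"),
    "h5": ("open", "x"),
    "h8": ("again", "h9"),          # the only addition for wieder
}
BASE = [("h2", "h4"), ("h3", "h5"), ("h7", "h5")]
WIEDER_FIRST = BASE + [("h9", "h1")]       # (20): wieder precedes alle Fenster
QUANTIFIER_FIRST = BASE + [("h3", "h8")]   # (21): alle Fenster precedes wieder

READINGS_20 = resolve(RELS, WIEDER_FIRST)
READINGS_21 = resolve(RELS, QUANTIFIER_FIRST)
for formula in READINGS_20:
    print(formula)
```

For (20), the sketch yields the three scope orders listed by Egg: again over every over CAUSE, again over CAUSE over every, and CAUSE over again over every. No empty CAUSE or BECOME head is involved at any point; the decomposition lives entirely in the relations contributed by the verb.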
19.4 Evidence for empty elements
as a schema that connects the subject to the VP (Pollard & Sag 1994: 39). In the lexical
items for finite verbs, it is already determined what the tree will look like in the end. As
in Categorial Grammar, adjuncts in HPSG can be combined with various intermediate
projections. Depending on the dominance schemata used in a particular grammar, the
lexical item will determine the constituent structure in which it can occur or allow for
multiple structures. In the grammar of German proposed in Chapter 9, it is possible to
analyze six different sequences with a lexical item for a ditransitive verb, that is, the lex-
ical item can – putting adjuncts aside – occur in six different structures with verb-final
order. Two sequences can be analyzed with the passive lexical item, which only has two
arguments. As in Categorial Grammar, sets of licensed structures are related to other
sets of licensed structures. In HPSG theorizing and also in Construction Grammar, there
have been attempts to replace lexical rules with other mechanisms since their “status is
dubious and their interaction with other analyses is controversial” (Bouma, Malouf &
Sag 2001: 19). Bouma, Malouf & Sag (2001) propose an analysis for extraction that, rather
than connecting lexical items with differing valence lists, establishes a relation between
a subset of a particular list in a lexical item and another list in the same lexical item. The
results of the two alternative analyses are shown in (22) and (23), respectively:
(22) a. [COMPS ⟨ NP[nom], NP[acc] ⟩, SLASH ⟨⟩]
     b. [COMPS ⟨ NP[nom] ⟩, SLASH ⟨ NP[acc] ⟩]
In (22), (22a) is the basic entry and (22b) is related to (22a) via a lexical rule. The alternative
analysis would only involve specifying the appropriate value of the ARG-ST feature3
and the COMPS and SLASH values are then derived from the ARG-ST value using the relevant
constraints. (23) shows two of the licensed lexical items.
(23) a. [ARG-ST ⟨ NP[nom], NP[acc] ⟩, COMPS ⟨ NP[nom], NP[acc] ⟩, SLASH ⟨⟩]
     b. [ARG-ST ⟨ NP[nom], NP[acc] ⟩, COMPS ⟨ NP[nom] ⟩, SLASH ⟨ NP[acc] ⟩]
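The idea of deriving COMPS and SLASH from a single ARG-ST value can be sketched as follows. This is a deliberate simplification and not the constraint system of Bouma, Malouf & Sag (2001), which is stated over typed feature structures: here every argument is assumed to be either realized locally (COMPS) or extracted (SLASH), and the function name licensed_entries is hypothetical.

```python
from itertools import product


def licensed_entries(arg_st):
    """List every way of distributing the ARG-ST elements over COMPS and SLASH."""
    entries = []
    for extracted in product([False, True], repeat=len(arg_st)):
        entries.append({
            "ARG-ST": list(arg_st),
            "COMPS": [arg for arg, gap in zip(arg_st, extracted) if not gap],
            "SLASH": [arg for arg, gap in zip(arg_st, extracted) if gap],
        })
    return entries


ENTRIES = licensed_entries(["NP[nom]", "NP[acc]"])
for entry in ENTRIES:
    print(entry)
```

The first two entries produced correspond to (23a) and (23b). No lexical rule relates them: both are licensed by one underspecified entry together with the constraint linking the three lists.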
If we want to eliminate lexical rules entirely in this way, then we would require an
additional feature for each change.4 Since there are many interacting valence-chang-
ing processes, things only work out with the stipulation of a large number of auxiliary
features. The consequences of assuming such analyses have been discussed in detail
in Müller (2007a: Section 7.5.2.2). The problems that arise are parallel for inheritance-based
approaches for argument structure-changing processes: they also require auxiliary
features since it is not possible to model embedding and multiple changes of valence
information with inheritance. See Section 10.2.
3
ARG-ST stands for Argument Structure. The value of ARG-ST is a list containing all the arguments of a head.
For more on ARG-ST, see Section 9.1.1.
4
Alternatively, one could assume a very complex relation that connects ARG-ST and COMPS. But this would
then have to deliver the result of an interaction of a number of phenomena and the interaction of these
phenomena would not be captured in a transparent way.
19.5 Transformations, lexical rules, and empty elements
Furthermore, the claim that the status of lexical rules is dubious must be rejected:
there are worked-out formalizations of lexical rules (Meurers 2001; Copestake & Briscoe
1992; Lascarides & Copestake 1999) and their interaction with other analyses is not con-
troversial. Most HPSG implementations make use of lexical rules and the interaction of a
number of rules and constraints can be easily verified by experiments with implemented
fragments.
Jackendoff (1975) presents two possible conceptions of lexical rules: in one variant,
the lexicon contains all words in a given language and there are just redundancy rules
saying something about how certain properties of lexical entries behave with regard to
properties of other lexical entries. For example, les- ‘read-’ and lesbar ‘readable’ would
both have equal status in the lexicon. In the other way of thinking of lexical rules, there
are a few basic lexical entries and the others are derived from these using lexical rules.
The stem les- ‘read-’ would be the basic entry and lesbar would be derived from it. In
HPSG, the second of the two variants is more often assumed. This is equivalent to the
assumption of unary rules. In Figure 9.9 on page 293, this has been shown accordingly:
the verb kennt ‘knows’ is mapped by a lexical rule to a verb that selects the projection
of an empty verbal head. With this conception of lexical rules, it is possible to remove
lexical rules from the grammar by assuming binary-branching structures with an empty
head rather than unary rules. For example, in HPSG analyses of resultative constructions
such as (24), lexical rules have been proposed (Verspoor 1997; Wechsler 1997; Wechsler
& Noh 2001; Müller 2002a: Chapter 5).
(24) [dass] Peter den Teich leer fischt
that Peter the pond empty fishes
‘that Peter fishes the pond empty’
In my own analysis, a lexical rule connects a verb used intransitively to a verb that
selects an accusative object and a predicate. Figure 19.12 on the following page shows
the corresponding tree. If we consider what (24) means, then we notice that the fishing
act causes the pond to become empty. This causation is not contained in any of the basic
lexical items for the words in (24). In order for this information to be present in the
semantic representation of the entire expression, it has to be added by means of a lexical
rule. The lexical rule says: if a verb is used with an additional predicate and accusative
object, then the entire construction has a causative meaning.
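The effect of such a lexical rule can be sketched as a function from lexical items to lexical items. The Python below is an illustrative sketch only: the dictionary representation, the names `resultative_lr` and `fischen`, and the strings standing in for semantic representations are assumptions made for this example, not the feature geometry used in the HPSG analyses cited above.

```python
def resultative_lr(verb):
    """Map an intransitively used verb to a verb that additionally
    selects an accusative object and a predicate (hypothetical encoding)."""
    if verb["COMPS"] != ["NP[nom]"]:
        raise ValueError("rule applies to intransitively used verbs only")
    return {
        "PHON": verb["PHON"],  # the phonological form is unchanged
        "COMPS": ["NP[nom]", "NP[acc]", "Pred"],
        # the causative relation is contributed by the rule, not by the input
        "SEM": f"cause({verb['SEM']}, become(Pred(y)))",
    }

fischen = {"PHON": "fischt", "COMPS": ["NP[nom]"], "SEM": "fish(x)"}
resultative = resultative_lr(fischen)
print(resultative["COMPS"])
print(resultative["SEM"])
```

Since the rule has one input and one output and leaves the phonology untouched, it behaves like a unary rule.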
Figure 19.13 on page 587 shows how a lexical rule can be replaced by an empty head.
The empty head requires the intransitive verb and additionally an adjective, an accusative
object and a subject. The subject of fischt ‘fishes’ must of course be identical to the sub-
ject that is selected by the combination of fischt and the empty head. This is not shown
in the figure. It is possible, however, to establish this identity (see Hinrichs & Nakazawa
1994a). The causative semantics is contributed by the empty head in this analysis. The
trick that is being implemented here is exactly what was done in Section 19.2, just in the
opposite direction: in the previous section, binary-branching structures with an empty
19 Empty elements
5
Here, we are discussing lexical rules, but this transformation trick can also be applied to other unary rules.
Semanticists often use such rules for type shifting. For example, a rule that turns a referential NP such as
a trickster in (i.a) into a predicative one (i.b) (Partee 1987).
These changes can be achieved by a unary rule that is applied to an NP or with a special empty head that
takes an NP as its argument. In current Minimalist approaches, empty heads are used (Ramchand 2005:
370), in Categorial Grammar and HPSG unary-branching rules are more common (Flickinger 2008: 91–92;
Müller 2009c; 2012b).
20 Extraction, scrambling, and passive:
one or several descriptive devices?
An anonymous reviewer suggested discussing one issue in which transformational theo-
ries differ from theories like LFG and HPSG. The reviewer claimed that Transformational
Grammars use just one tool for the description of active/passive alternations, scrambling,
and extraction, while theories like LFG and HPSG use different techniques for all three
phenomena. If this claim were correct and if the analyses made correct predictions, the
respective GB/Minimalism theories would be better than their competitors, since the
general aim in science is to develop theories that need a minimal set of assumptions.
I already commented on the analysis of passive in GB in Section 3.4, but I want to ex-
tend this discussion here and include a Minimalist analysis and one from Dependency
Grammar.
The task of any passive analysis is to explain the difference in argument realization
in examples like (1):
(1) a. She beats him.
b. He was beaten.
In these examples about chess, the accusative object of beat is realized as the nominative
in (1b). In addition, it can be observed that the position of the elements is different: while
him is realized postverbally in object position in (1a), it is realized preverbally in (1b). In
GB this is explained by a movement analysis. It is assumed that the object does not get
case in passive constructions and hence has to move into the subject position where
case is assigned by the finite verb. This analysis is also assumed in Minimalist work as
in David Adger’s textbook (2003), for instance. Figure 20.1 on the following page shows
his analysis of (2):
(2) Jason was killed.
TP stands for Tense Phrase and corresponds to the IP that was discussed in Chapter 3.
PassP is a functional head for passives. vP is a special category for the analysis of verb
phrases that was originally introduced for the analysis of ditransitives (Larson 1988) and
VP is the normal VP that consists of verb and object. In Adger’s analysis, the verb kill
moves from the verb position in VP to the head position of v, the passive auxiliary be
moves from the head position of PassP to the head position of the Tense Phrase. Features
like Infl are ‘checked’ in combination with such movements. The exact implementation
of these checking and valuing operations does not matter here. What is important is
that Jason moves from the object position to a position that was formerly known as the
Figure 20.1: Adger’s Minimalist movement-based analysis of the passive (p. 231)
specifier position of T (see Footnote 28 on page 161 on the notion of specifier). All these
analyses assume that the participle cannot assign accusative to its object and that the
object has to move to another position to get case or check features. How exactly one can
formally represent the fact that the participle cannot assign case is hardly ever made
explicit in the GB literature. The following is a list of statements that can be found in
the literature:
(3) a. We shall assume that a passivized verb loses the ability to assign structural
ACCUSATIVE case to its complement. (Haegeman 1994: 183)
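b. das Objekt des Aktivsatzes wird zum Subjekt des Passivsatzes, weil die passivische
Verbform keinen Akkusativ-Kasus regieren kann (Akk-Kasus-Absorption).
(Lohnstein 2014: 172)
‘The object of the active clause becomes the subject of the passive clause because
the passive verb form cannot govern accusative case (accusative case absorption).’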
In addition, it is sometimes said that the external theta-role is absorbed by the verb mor-
phology (Jaeggli 1986; Haegeman 1994: 183). Now, what would it entail if we made this
explicit? There is some lexical item for verbs like beat. The active form has the ability
to assign accusative to its object, but the passive form does not. Since this is a property
that is shared by all transitive verbs (by definition of the term transitive verb), this is
some regularity that has to be captured. One way to capture this is the assumption of a
special passive morpheme that suppresses the agent and changes something in the case
specification of the stem it attaches to. How this works in detail was never made
explicit. Let us compare this morpheme-based analysis with lexical rule-based analyses: as
was explained in Section 19.5, empty heads can be used instead of lexical rules in those
cases in which the phonological form of the input and the output do not differ. For example,
lexical rules that license additional arguments, as in resultative constructions, can be
replaced by an empty head. However, as was explained in Section 9.2,
lexical rules are also used to model morphology. This is also true for Construction Gram-
mar (see Gert Booij’s work on Construction Morphology (2010), which is in many ways
similar to Riehemann’s work in HPSG (1993; 1998)). In the case of the passive lexical rule,
the participle morphology is combined with the stem and the subject is suppressed in
the corresponding valence list. This is exactly what is described in the GB/MP literature.
The respective lexical rule for the analysis of ge-lieb-t ‘loved’ is depicted in Figure 20.2
to the left. The morpheme-based analysis is shown to the right. To keep things simple, I
assume a flat analysis, but those who insist on binary branching structures would have
to come up with a way of deciding whether the ge- or the -t is combined first with the
stem and in which way selection and percolation of features takes place. Independent
of how morphology is done, the fact that the inflected form (the top node in both fig-
ures) has different properties than the verb stem has to be represented somehow. In
the morpheme-based world, the morpheme is responsible for suppressing the agent and
changing the case assignment properties, in the lexical rule/construction world this is
done by the respective lexical rule. There is no difference in terms of needed tools and
necessary stipulations.
The situation in Minimalist theories is a little different. For instance, Adger (2003:
229, 231) writes the following:
Passives are akin to unaccusatives, in that they do not assign accusative case to
their object, and they do not appear to have a thematic subject. […] Moreover, the
idea that the function of this auxiliary is to select an unaccusative little vP simul-
taneously explains the lack of accusative case and the lack of a thematic subject.
(Adger 2003: 229, 231)
So this is an explicit statement. The relation between a stem and a passive participle form
that was assumed in GB analyses is now a verb stem that is combined with two different
versions of little v. Which v is chosen is determined by the governing head, a functional
Perf head or a Pass head. This can be depicted as in Figure 20.3 on the following page.
When kill is used in the perfect or the passive, it is spelled out as killed. If it is used
in the active with a 3rd person singular subject it is spelled out as kills. This can be
compared with a lexical analysis, for instance the one assumed in HPSG. The analysis
is shown in Figure 20.4 on the next page. The left figure shows a lexical item that is
licensed by a lexical rule that is applied to the stem kill-. The stem has two elements in
its argument structure list and for the active forms the complete argument structure list
Figure 20.3: Analysis of the perfect and the passive in a Minimalist theory involving two
different versions of little v
Left (active forms): V[SPR ⟨ 1 ⟩, COMPS ⟨ 2 ⟩, ARG-ST ⟨ 1 NP[str], 2 NP[str] ⟩]
Right (passive): V[SPR ⟨ 2 ⟩, COMPS ⟨ ⟩, ARG-ST ⟨ 2 NP[str] ⟩]
Figure 20.4: Lexical rule-based analysis of the perfect and the passive in HPSG
is shared between the licensed lexical item and the stem. The first element of the ARG-ST
list is mapped to SPR and the other elements to COMPS (in English). Passive is depicted
in the right figure: the first element of the ARG-ST with structural case is suppressed and
since the element that was the second element in the ARG-ST list of the stem ( 2 ) is now
the first element, this item is mapped to SPR. See Section 9.2 for passive in HPSG and
Section 9.1.1 for comments on ARG-ST and the differences between German and English.
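The mapping from ARG-ST to the valence features and the effect of the passive lexical rule can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `map_arguments` and `passive`, the string encoding of arguments, and the labels `NP1`/`NP2` are hypothetical, and the mapping is the English one (first element to SPR, the rest to COMPS).

```python
def map_arguments(arg_st):
    """English-style mapping: first ARG-ST element to SPR, the rest to COMPS."""
    return {"SPR": arg_st[:1], "COMPS": arg_st[1:], "ARG-ST": list(arg_st)}

def passive(arg_st):
    """Suppress the first ARG-ST element that bears structural case."""
    for i, arg in enumerate(arg_st):
        if arg.endswith("[str]"):
            return arg_st[:i] + arg_st[i + 1:]
    raise ValueError("no argument with structural case to suppress")

kill = ["NP1[str]", "NP2[str]"]            # ARG-ST of the stem kill-
active_item = map_arguments(kill)           # complete ARG-ST is kept
passive_item = map_arguments(passive(kill))
print(active_item)
print(passive_item)
```

In the passive item the former second argument is the only element left on ARG-ST and therefore ends up in SPR, without any reference to movement.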
The discussion of Figures 20.3 and 20.4 is a further illustration of a point made in
Section 19.5: lexical rules can be replaced by empty heads and vice versa. While HPSG says
there are stems that are related to inflected forms and corresponding to the inflection the
arguments are realized in a certain way, Minimalist theories assume two variants of little
v that differ in their selection of arguments. Now, the question is: are there empirical
differences between the two approaches? I think there are differences if one considers
the question of language acquisition. What children can acquire from data is that there
are various inflected forms and that they are related somehow. What remains question-
able is whether they really would be able to detect empty little vs. One could claim of
course that children operate with chunks of structures such as the ones in Figure 20.3.
But then a verb would be just a chunk consisting of little v and V and having some open
slots. This would be indistinguishable from what the HPSG analysis assumes.
As far as the “lexical rules as additional tool” aspect is concerned, the discussion is
closed, but note that the standard GB/Minimalism analyses differ in another way from
LFG and HPSG analyses, since they assume that passive has something to do with move-
ment, that is, they assume that the same mechanisms are used that are used for nonlocal
dependencies.1 This works for languages like English in which the object has to be real-
ized in postverbal position in the active and in preverbal position in the passive, but it
fails for languages like German in which the order of constituents is more free. Lenerz
(1977: Section 4.4.3) discussed the examples in (44) on page 112 – which are repeated here
as (4) for convenience:
(4) a. weil das Mädchen dem Jungen den Ball schenkt
because the girl.NOM the.DAT boy the.ACC ball gives
‘because the girl gives the ball to the boy’
b. weil dem Jungen der Ball geschenkt wurde
because the.DAT boy the.NOM ball given was
c. weil der Ball dem Jungen geschenkt wurde
because the.NOM ball the.DAT boy given was
‘because the ball was given to the boy’
While both orders in (4b) and (4c) are possible, the one with dative–nominative order
in (4b) is the unmarked one. There is a strong linearization preference in German de-
manding that animate NPs be serialized before inanimate ones (Hoberg 1981: 46). This
linearization rule is unaffected by passivization. Theories that assume that passive is
movement either have to assume that the passive of (4a) is (4c) and (4b) is derived from
(4c) by a further reordering operation (which would be implausible since usually one
assumes that more marked constructions require more transformations), or they would
have to come up with other explanations for the fact that the subject of the passive sen-
tence has the same position as the object in active sentences. As was already explained in
Section 3.4, one such explanation is to assume an empty expletive subject that is placed
in the position where nominative is assigned and to somehow connect this expletive el-
ement to the subject in object position. While this somehow works, it should be clear
that the price for rescuing a movement-based analysis of passive is rather high: one has
to assume an empty expletive element, that is, something that has neither a form nor a
meaning. The existence of such an object could not be inferred from the input unless it
is assumed that the structures in which it is assumed are given. Thus, a rather rich UG
would have to be assumed.
The question one needs to ask here is: why does the movement-based analysis have
these problems and why does the valence-based analysis not have them? The cause
of the problem is that the analysis of the passive mixes two things: the fact that SVO
languages like English encode subjecthood positionally, and the fact that the subject is
suppressed in passives. If these two things are separated the problem disappears. The fact
that the object of the active sentence in (1a) is realized as the subject in (1b) is explained
1
There is another option in Minimalist theories. Since Agree can check features nonlocally, T can assign
nominative to an embedded element. So, in principle the object may get nominative in the VP without
moving to T. However, Adger (2003: 368) assumes that German has a strong EPP feature on T, so that
the underlying object has to move to the specifier of T. This is basically the old GB analysis of passive in
German with all its conceptual problems and disadvantages.
by the assumption that the first NP on the argument structure list with structural case
is realized as subject and mapped to the respective valence feature: SPR in English. Such
mappings can be language specific (see Section 9.1.1 and Müller (2019c) where I discuss
Icelandic, which is an SVO language with subjects with lexical case).
In what follows, I discuss another set of examples that are sometimes seen as evidence
for a movement-based analysis. The examples in (5) are instances of the so-called remote
passive (Höhle 1978: 175–176).2
(5) a. daß er auch von mir zu überreden versucht wurde3
that he.NOM also from me to persuade tried got
‘that an attempt to persuade him was also made by me’
b. weil der Wagen oft zu reparieren versucht wurde
because the car.NOM often to repair tried was
‘because many attempts were made to repair the car’
What is interesting about these examples is that the subject is the underlying object of a
deeply embedded verb. This seems to suggest that the object is extracted out of the verb
phrase. So the analysis of (5b) would be (6):
(6) weil [IP der Wagen𝑖 [VP oft [VP [VP [VP _𝑖 zu reparieren] versucht] wurde]
because the car.NOM often to repair tried was
While this is a straightforward explanation of the fact that (5b) is grammatical, another
explanation is possible as well. In the HPSG analysis of German (and Dutch) it is assumed
that verbs like those in (5b) form a verbal complex, that is, zu reparieren versucht wurde
‘to repair tried was’ forms one unit. When two or more verbs form a complex, the highest
verb attracts the arguments from the verb it embeds (Hinrichs & Nakazawa 1989b; 1994a;
Bouma & van Noord 1998). A verb like versuchen ‘to try’ selects a subject, an infinitive
with zu ‘to’ and all complements that are selected by this infinitive. In the analysis of
(7), versuchen ‘to try’ selects for its subject, the object of reparieren ‘to repair’ and for the
verb zu reparieren ‘to repair’.
(7) weil er den Wagen zu reparieren versuchen will
because he.NOM the.ACC car to repair try wants
‘because he wants to try to repair the car’
Now if the passive lexical rule applies to versuch-, it suppresses the first argument of
versuch- with structural case, which is the subject of versuch-. The next argument of
versuch- is the object of zu reparieren. Since this element is the first NP with structural
case, it gets nominative as in (5b). So, this shows that there is an analysis of the remote
passive that does not rely on movement. Since movement-based analyses were shown to
be problematic and since there are no data that cannot be explained without movement,
analyses without movement have to be preferred.
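The interaction of argument attraction, passive, and structural case assignment can be sketched as follows. The Python is an illustration under simplifying assumptions: all function names and the labels `NP_agent` and `NP_car` are hypothetical, the embedded infinitive contributes only its object, and case resolution is reduced to "first NP with structural case gets nominative, all others accusative".

```python
def attract(subject, embedded_verb):
    """ARG-ST of a complex-forming verb like versuch-: its own subject,
    the complements attracted from the infinitive, and the infinitive."""
    return [subject] + embedded_verb["COMPS"] + [embedded_verb["PHON"]]

def suppress_first_structural(arg_st):
    """Passive: drop the first argument that bears structural case."""
    for i, arg in enumerate(arg_st):
        if arg.endswith("[str]"):
            return arg_st[:i] + arg_st[i + 1:]
    return list(arg_st)

def assign_case(arg_st):
    """First NP with structural case gets nominative, the others accusative."""
    result, seen_structural = [], False
    for arg in arg_st:
        if arg.endswith("[str]"):
            case = "[acc]" if seen_structural else "[nom]"
            result.append(arg.replace("[str]", case))
            seen_structural = True
        else:
            result.append(arg)
    return result

zu_reparieren = {"PHON": "zu reparieren", "COMPS": ["NP_car[str]"]}
versuch = attract("NP_agent[str]", zu_reparieren)
active = assign_case(versuch)
remote_passive = assign_case(suppress_first_structural(versuch))
print(active)
print(remote_passive)
```

In the active, the attracted object is the second structural NP and gets accusative, as in (7). Once the subject of versuch- is suppressed, the same object is the first structural NP and gets nominative, as in (5b).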
2
See Müller (2002a: Section 3.1.4.1) and Wurmbrand (2003b) for corpus examples.
3
Oppenrieder (1991: 212).
This leaves us with movement-based accounts of local reordering (scrambling). The
reviewer suggested that scrambling, passive, and nonlocal extraction may be analyzed
with the same mechanism. It was long thought that scope facts made the assumption of
movement-based analyses of scrambling necessary, but it was pointed out by Kiss (2001:
146) and Fanselow (2001: Section 2.6) that the reverse is true: movement-based accounts
of scrambling make wrong predictions with regard to available quantifier scopings. I
discussed the respective examples in Section 3.5 already and will not repeat the discus-
sion here. The conclusion that has to be drawn from this is that passive, scrambling,
and long distance extraction are three different phenomena that should be treated dif-
ferently. The solution for the analysis of the passive that is adopted in HPSG is based on
an analysis by Haider (1986a), who worked within the GB framework. The “scrambling-
as-base generation” approach to local reordering that was used in HPSG right from the
beginning (Gunji 1986) is also adopted by some practitioners of GB/Minimalism, e.g.,
Fanselow (2001).
Having discussed the analyses in GB/Minimalism, I now turn to Dependency Gram-
mar. Groß & Osborne (2009) suggest that w-fronting, topicalization, scrambling, extra-
position, splitting, and also the remote passive should be analyzed by what they call
rising. The concept was already explained in Section 11.5. Figures 20.5 and 20.6
show examples of the fronting and the scrambling of an object.
Figure 20.5: Analysis of Die Idee wird jeder verstehen. ‘Everybody will understand the
idea.’ involving rising
Groß and Osborne assume
that the object depends on the main verb in sentences with auxiliary verbs, while
the subject depends on the auxiliary. Therefore, the object die Idee ‘the idea’ and the
object sich ‘himself’ have to rise to the next higher verb in order to keep the structures
projective. Figure 20.7 on the following page shows the analysis of the remote passive.
The object of zu reparieren ‘to repair’ rises to the auxiliary wurde ‘was’.
Groß and Osborne use the same mechanism for all these phenomena, but it should be
clear that there have to be differences in the exact implementation. Groß and Osborne
say that English does not have scrambling, while German does. If this is to be captured,
there must be a way to distinguish the two phenomena, since if this were not possible,
one would predict that English has scrambling as well, since both German and English
Figure 20.6: Analysis of Gestern hat sich der Spieler verletzt. ‘Yesterday, the player injured
himself.’ involving rising of the object of the main verb verletzt ‘injured’
Figure 20.7: Analysis of the remote passive dass der Wagen zu reparieren versucht wurde
‘that it was tried to repair the car’ involving rising
allow long distance fronting. Groß & Osborne (2009: 58) assume that object nouns that
rise must take the nominative. But if the kind of rising that they assume for remote
passives is identical to the one that they assume for scrambling, they would predict that
den Wagen gets nominative in (8) as well:
(8) dass den Wagen niemand repariert hat
that the.ACC car nobody.NOM repaired has
‘that nobody repaired the car’
Since den Wagen ‘the car’ and repariert ‘repaired’ are not adjacent, den Wagen has to
rise to the next higher head in order to allow for a projective realization of elements.
So in order to assign case properly, one has to take into account the arguments that
are governed by the head to which a certain element rises. Since the auxiliary hat ‘has’
already governs a nominative, the NP den Wagen has to be realized in the accusative. An
analysis that assumes that both the accusative and nominative depend on hat ‘has’ in
(8) is basically the verbal complex analysis assumed in HPSG and some GB variants.
Note, however, that this does not extend to nonlocal dependencies. Case is assigned
locally by verbs or verbal complexes, but not to elements that come from far away. The
long distance extraction of NPs is more common in southern variants of German and
there are only a few verbs that do not take a nominative argument themselves. The
examples below involve dünken ‘to think’, which governs an accusative and a sentential
object and scheinen ‘to seem’, which governs a dative and a sentential object. If (9a) is
analyzed with den Wagen rising to dünkt, one might expect that den Wagen ‘the car’ gets
nominative since there is no other element in the nominative. However, (9b) is entirely
out.
Similarly there is no agreement between the fronted element and the verb to which it
attaches:
(10) a. Mir scheint, dass die Wagen ihm gefallen.
me.DAT.1SG seems.3SG that the cars.3PL him please.3PL
‘He seems to me to like the cars.’
b. Die Wagen scheint mir, dass ihm gefallen.
the cars.3PL seem.3SG me.DAT that him please.3PL
‘The cars, he seems to me to like.’
c. * Die Wagen scheinen mir, dass ihm gefällt.
the cars.3PL seem.3PL me.DAT that him pleases.3SG
d. * Die Wagen scheinen mir, dass ihm gefallen.
the cars.3PL seem.3PL me.DAT that him please.3PL
This shows that scrambling/remote passive and extraction should not be dealt with by
the same mechanism or if they are dealt with by the same mechanism one has to make
sure that there are specialized variants of the mechanism that take the differences into
account. I think what Groß and Osborne did is simply recode the attachment relations of
phrase structure grammars. die Idee ‘the idea’ has some relation to wird jeder verstehen
‘will everybody understand’ in Figure 20.5, as it does in GB, LFG, GPSG, HPSG, and other
similar frameworks. In HPSG, die Idee ‘the idea’ is the filler in a filler-head configuration.
The remote passive and local reorderings of arguments of auxiliaries, modal verbs, and
other verbs that behave similarly are explained by verbal complex formation where all
non-verbal arguments depend on the highest verb (Hinrichs & Nakazawa 1994a).
Concluding this chapter, it can be said that local reorderings and long-distance depen-
dencies are two different things that should be described with different tools (or there
should be further constraints that differ for the respective phenomena when the same
tool is used). Similarly, movement-based analyses of the passive are problematic since
passive does not necessarily imply reordering.
21 Phrasal vs. lexical analyses
coauthored with Stephen Wechsler
This section deals with a rather crucial aspect when it comes to the comparison of
the theories described in this book: valence and the question whether sentence struc-
ture, or rather syntactic structure in general, is determined by lexical information or
whether syntactic structures have an independent existence (and meaning) and lexical
items are just inserted into them. Roughly speaking, frameworks like GB/Minimalism,
LFG, CG, HPSG, and DG are lexical, while GPSG and Construction Grammar (Goldberg
1995; 2003a; Tomasello 2003; 2006b; Croft 2001) are phrasal approaches. This catego-
rization reflects tendencies, but there are non-lexical approaches in Minimalism (Borer’s
exoskeletal approach, 2003) and LFG (Alsina 1996; Asudeh et al. 2008; 2013) and there
are lexical approaches in Construction Grammar (Sign-Based Construction Grammar,
see Section 10.6.2). The phrasal approach is also widespread in frameworks like Cognitive
Grammar (Dąbrowska 2001; Langacker 2009: 169) and Simpler Syntax (Culicover &
Jackendoff 2005; Jackendoff 2008) that could not be discussed in this book.
The question is whether the meaning of an utterance like (1a) is contributed by the
verb give and the structure needed for the NPs occurring together with the verb does
not contribute any meaning or whether there is a phrasal pattern [X Verb Y Z] that
contributes some “ditransitive meaning” whatever this may be.1
(1) a. Peter gives Mary the book.
b. Peter fishes the pond empty.
Similarly, there is the question of how the constituents in (1b) are licensed. This sentence
is interesting since it has a resultative meaning that is not part of the meaning of the verb
fish: Peter’s fishing causes the pond to become empty. Nor is this additional meaning
part of the meaning of any other item in the sentence. On the lexical account, there
is a lexical rule that licenses a lexical item that selects for Peter, the pond, and empty.
This lexical item also contributes the resultative meaning. On the phrasal approach, it is
1
Note that the prototypical meaning is a transfer of possession in which Y receives Z from X, but the reverse
holds in (i.b):
assumed that there is a pattern [Subj V Obj Obl]. This pattern contributes the resultative
meaning, while the verb that is inserted into this pattern just contributes its prototypical
meaning, e.g., the meaning that fish would have in an intransitive construction. I call
such phrasal approaches plugging approaches, since lexical items are plugged into ready-
made structures that do most of the work.
In what follows I will examine these proposals in more detail and argue that the lexical
approaches to valence are the correct ones. The discussion will be based on earlier work
of mine (Müller 2006; 2007b; 2010b) and work that I did together with Steve Wechsler
(Müller & Wechsler 2014a,b). Some of the sections in Müller & Wechsler (2014a) started
out as translations of Müller (2013a), but the material was reorganized and refocused
due to intensive discussion with Steve Wechsler. So rather than using a translation of
Section 11.11 of Müller (2013a), I use parts of Müller & Wechsler (2014a) here and add
some subsections that had to be left out of the article due to space restrictions (Subsec-
tions 21.3.6 and 21.7.3). Because there have been misunderstandings in the past (e.g., Boas
(2014), see Müller & Wechsler (2014b)), a disclaimer is necessary here: this section is not
an argument against Construction Grammar. As was mentioned above Sign-Based Con-
struction Grammar is a lexical variant of Construction Grammar and hence compatible
with what I believe to be correct. This section is also not against phrasal constructions
in general, since there are phenomena that seem to be best captured with phrasal con-
structions. These are discussed in detail in Subsection 21.10. What I will argue against in
the following subsections is a special kind of phrasal construction, namely phrasal argu-
ment structure constructions (phrasal ASCs). I believe that all phenomena that have to
do with valence and valence alternations should be treated lexically.
21.1 Some putative advantages of phrasal models
include many different kinds of specific linguistic symbols. But never are they
empty rules devoid of semantic content or communicative function. (Tomasello
2003: 100)
Thus constructions are said to differ from grammatical rules in two ways: they must
carry meaning; and they reflect the actual “patterns of usage” fairly directly.
Consider first the constraint that every element of the grammar must carry meaning,
which we call the semiotic dictum. Do lexical or phrasal theories hew the most closely
to this dictum? Categorial Grammar, the paradigm of a lexical theory (see Chapter 8),
is a strong contender: it consists of meaningful words, with only a few very general
combinatorial rules such as X/Y ∗ Y = X. Given the rule-to-rule assumption, those com-
binatorial rules specify the meaning of the whole as a function of the parts. Whether
such a rule counts as meaningful in itself in Tomasello’s sense is not clear.
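To make the status of such a combinatorial rule concrete, here is a minimal Python sketch of forward application together with the rule-to-rule assumption. The class `Sign`, the function `forward_apply`, the restriction to flat categories like `NP/N`, and the strings standing in for meanings are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Sign:
    cat: str  # e.g. "NP/N" or "N"; only flat categories are handled here
    sem: Any  # a Python value or function standing in for the meaning

def forward_apply(functor: Sign, arg: Sign) -> Sign:
    """X/Y * Y = X; the meaning of the whole is functor.sem applied to arg.sem."""
    result, slash, wanted = functor.cat.rpartition("/")
    if not slash or wanted != arg.cat:
        raise ValueError(f"cannot combine {functor.cat} with {arg.cat}")
    return Sign(result, functor.sem(arg.sem))

the = Sign("NP/N", lambda n: f"the({n})")
book = Sign("N", "book")
np = forward_apply(the, book)
print(np)
```

The rule itself contributes no meaning of its own: it only states how the meanings of the two words are put together.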
What does seem clear is that the combinatorial rules of Construction Grammar, such
as Goldberg’s Correspondence Principle for combining a verb with a construction (1995:
50), have the same status as those combinatorial rules:
(2) The Correspondence Principle: each participant that is lexically profiled and ex-
pressed must be fused with a profiled argument role of the construction. If a verb
has three profiled participant roles, then one of them may be fused with a non-
profiled argument role of a construction. (Goldberg 1995: 50)
Both verbs and constructions are specified for participant roles, some of which are pro-
filed. Argument profiling for verbs is “lexically determined and highly conventionalized”
(Goldberg 1995: 46). Profiled argument roles of a construction are mapped to direct gram-
matical functions, i. e., SUBJ, OBJ, or OBJ2. By the Correspondence Principle the lexically
profiled argument roles must be direct, unless there are three of them, in which case one
may be indirect.2 With respect to the semiotic dictum, the Correspondence Principle has
the same status as the Categorial Grammar combinatorial rules: a meaningless algebraic
rule that specifies the way to combine meaningful items.
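Read in this way, the Correspondence Principle in (2) is a well-formedness check on pairings of roles. The following Python sketch spells this out under the reading given in footnote 2 (the second sentence of (2) licenses one exception). The function name, the set and dictionary encoding, and all role labels are assumptions made for illustration; they are not Goldberg's notation.

```python
def satisfies_correspondence(verb_profiled, construction_profiled, fusion):
    """Check a role fusion against the Correspondence Principle in (2).

    verb_profiled: set of the verb's profiled participant roles
    construction_profiled: set of the construction's profiled argument roles
    fusion: dict mapping expressed participant roles to argument roles;
            roles missing from the dict count as unexpressed
    """
    # Profiled, expressed participants that were fused with a non-profiled role.
    mismatches = [
        role for role in verb_profiled
        if role in fusion and fusion[role] not in construction_profiled
    ]
    # Second sentence of (2): with three profiled roles, one mismatch is allowed.
    allowed = 1 if len(verb_profiled) == 3 else 0
    return len(mismatches) <= allowed


# Illustrative role labels, not taken from the source.
construction_profiled = {"agent", "theme"}  # "goal" is non-profiled here

three_roles = satisfies_correspondence(
    {"putter", "puttee", "place"},
    construction_profiled,
    {"putter": "agent", "puttee": "theme", "place": "goal"},
)
two_roles = satisfies_correspondence(
    {"hitter", "hittee"},
    construction_profiled,
    {"hitter": "agent", "hittee": "goal"},
)
```

The check compares labels and counts mismatches; it does not itself contribute any meaning to the combination.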
Turning now to the lexicalist syntax we favor, some elements abide by the semiotic
dictum while others do not. Phrase structure rules for intransitive and transitive VPs (or
the respective HPSG ID schema) do not. Lexical valence structures clearly carry meaning
since they are associated with particular verbs. In an English ditransitive, the first object
expresses the role of “intended recipient” of the referent of the second object. Hence He
carved her a toy entails that he carved a toy with the intention that she receive it. So the
lexical rule that adds a benefactive recipient argument to a verb adds meaning. Alter-
natively, a phrasal ditransitive construction might contribute that “recipient” meaning.3
Which structures have meaning is an empirical question for us.
In Construction Grammar, however, meaning is assumed for all constructions a pri-
ori. But while the ditransitive construction plausibly contributes meaning, no truth-
2
We assume that the second sentence of (2) provides for exceptions to the first sentence.
3
In Section 21.2.1 we argue that the recipient should be added in the lexical argument structure, not through a
phrasal construction. See Wechsler (1991: 111–113; 1995: 88–89) for an analysis of English ditransitives with
elements of both constructional and lexical approaches. It is based on Kiparsky’s notion of a thematically
restricted positional linker (1987; 1988).
21 Phrasal vs. lexical analyses
conditional meaning has yet been discovered for either the intransitive or bivalent tran-
sitive constructions. Clearly the constructionist’s evidence for the meaningfulness of cer-
tain constructions such as the ditransitive does not constitute evidence that all phrasal
constructions have meaning. So the lexical and phrasal approaches seem to come out
the same, as far as the semiotic dictum is concerned.
Now consider the second usage-based dictum, that the elements of the grammar di-
rectly reflect patterns of usage, which we call the transparency dictum. The Construction
Grammar literature often presents its constructions informally in ways that suggest
that they represent surface constituent order patterns: the transitive construction is “[X
VERB Y]” (Tomasello) or “[Subj V Obj]” (Goldberg 1995; 2006)4 ; the passive construction
is “X was VERBed by Y” (Tomasello 2003: 100) or “Subj aux Vpp (PPby)” (Goldberg 2006:
5). But a theory in which constructions consist of surface patterns was considered in de-
tail and rejected by Müller (2006: Section 2), and does not accurately reflect Goldberg’s
actual theory.5 The more detailed discussions present argument structure constructions,
which are more abstract and rather like the lexicalists’ grammatical elements (or perhaps
an LFG f-structure): the transitive construction resembles a transitive valence structure
(minus the verb itself); the passive construction resembles the passive lexical rule.
With respect to fulfilling the desiderata of usage-based theorists, we do not find any
significant difference between the non-lexical and lexical approaches.
21.1.2 Coercion
Researchers working with plugging proposals usually take coercion as an indication of
the usefulness of phrasal constructions. For instance, Anatol Stefanowitsch (Lecture in
the lecture series Algorithmen und Muster – Strukturen in der Sprache, 2009) discussed
the example in (3):
(3) Das Tor zur Welt Hrnglb öffnete sich ohne Vorwarnung und verschlang [sie] … die
Welt Hrnglb wird von Magiern erschaffen, die Träume zu Realität formen können,
aber nicht in der Lage sind zu träumen. Haltet aus, Freunde. Und ihr da draußen,
bitte träumt ihnen ein Tor.6
The crucial part is bitte träumt ihnen ein Tor ‘Dream a gate for them’. In this fantasy con-
text the word träumen, which is intransitive, is forced into the ditransitive construction
and therefore gets a certain meaning. This forcing of a verb corresponds to overwriting
or rather extending properties of the verb by the phrasal construction.
4
Goldberg et al. (2004: 300) report on a language acquisition experiment that involves an SOV pattern.
The SOV order is mentioned explicitly and seen as part of the construction.
5
This applies to argument structure constructions only. In some of her papers Goldberg assumes that very
specific phrase structural configurations are part of the constructions. For instance in her paper on complex
predicates in Persian (Goldberg 2003b) she assigns V0 and V categories. See Müller (2010b: Section 4.9) for
a critique of that analysis.
6
http://www.elbenwaldforum.de/showflat.php?Cat=&Board=Tolkiens_Werke&Number=1457418&page=
3&view=collapsed&sb=5&o=&fpart=16, 2010-02-27.
‘The gate to the world Hrnglb opened without warning and swallowed them. The world Hrnglb is
created by magicians that can form reality from dreams but cannot dream themselves. Hold out, friends!
And you out there, please, dream a gate for them.’
7
Kay (2005), working in the framework of CxG, also suggests unary constructions.
8
Douglas Adams. 1979. The Hitchhiker’s Guide to the Galaxy, Harmony Books. Quoted from Goldberg
(2003a: 220).
the complement structure of a verb, but this is simply a special case of a more general
phenomenon that has been analyzed by rules of systematic polysemy.
wrong place and misses the real differences between these approaches. This argument
from simplicity is often repeated and so it is important to understand why it is incorrect.
Tomasello (2003) presents the argument as follows. Discussing first the lexical rules
approach, Tomasello (2003: 160) writes that
One implication of this view is that a verb must have listed in the lexicon a different
meaning for virtually every different construction in which it participates […]. For
example, while the prototypical meaning of cough involves only one participant,
the cougher, we may say such things as He coughed her his cold, in which there
are three core participants. In the lexical rules approach, in order to produce this
utterance the child’s lexicon must have as an entry a ditransitive meaning for the
verb cough. (Tomasello 2003: 160)
Tomasello (2003: 160) then contrasts a Construction Grammar approach, citing Fillmore
et al. (1988), Goldberg (1995), and Croft (2001). He concludes as follows:
The main point is that if we grant that constructions may have meaning of their
own, in relative independence of the lexical items involved, then we do not need to
populate the lexicon with all kinds of implausible meanings for each of the verbs we
use in everyday life. The construction grammar approach in which constructions
have meanings is therefore both much simpler and much more plausible than the
lexical rules approach. (Tomasello 2003: 161)
This reflects a misunderstanding of lexical rules, as they are normally understood. There
is no implausible sense populating the lexicon. The lexical rule approach to He coughed
her his cold states that when the word coughed appears with two objects, the whole com-
plex has a certain meaning (see Müller 2006: 876). Furthermore we explicitly distinguish
between listed elements (lexical entries) and derived ones. The general term subsuming
both is lexical item.
The simplicity argument also relies on a misunderstanding of a theory Tomasello ad-
vocates, namely the theory due to Goldberg (1995; 2006). For his argument to go through,
Tomasello must tacitly assume that verbs can combine freely with constructions, that is,
that the grammar does not place extrinsic constraints on such combinations. If it is nec-
essary to also stipulate which verbs can appear in which constructions, then the claim
to greater simplicity collapses: each variant lexical item with its “implausible meaning”
under the lexical rule approach corresponds to a verb-plus-construction combination
under the phrasal approach.
Passages such as the following may suggest that verbs and constructions are assumed
to combine freely:9
Constructions are combined freely to form actual expressions as long as they can
be construed as not being in conflict (invoking the notion of construal is intended
9
The context of these quotes makes clear that the verb and the argument structure construction are consid-
ered constructions. See Goldberg (2006: 21, ex. (2)).
21.2 Evidence for lexical approaches
(9) a. cause-to-receive-by-kicking(x, y, z)
b. cause(kick(x, y),receive(z,y))
The same sort of “composite fused structure” is assumed under either view. With respect
to the semantic structure, the number and plausibility of senses, and the polyadicity of
the semantic relations, the two theories are identical. They mainly differ in the way this
representation fits into the larger theory of syntax. They also differ in another respect:
on the lexical view, the derived three-argument valence structure is associated with the
phonological string kicked. Next, we present evidence for this claim.
Interestingly, it is possible to coordinate basic ditransitive verbs with verbs that have
additional arguments licensed by the lexical rule. (12) provides examples in English and
German ((12b) is quoted from Müller (2013a: 420)):
These sentences show that both verbs are 3-argument verbs at the V0 level, since they
involve V0 coordination:
(13) [V0 offered and made] [NP me] [NP a wonderful espresso]
This is expected under the lexical rule analysis but not the non-lexical constructional
one.13
Summarizing the coordination argument: coordinated verbs generally must have com-
patible syntactic properties like valence properties. This means that in (12b), for example,
gebacken ‘baked’ and gegeben ‘given’ have the same valence properties. On the lexical
approach the creation verb gebacken, together with a lexical rule, licenses a ditransitive
verb. It can therefore be coordinated with gegeben. On the phrasal approach however,
the verb gebacken has two argument roles and is not compatible with the verb gegeben,
which has three argument roles. In the phrasal model, gebacken can only realize three
arguments when it enters the ditransitive phrasal construction or argument structure
construction. But in sentences like (12) it is not gebacken alone that enters the phrasal
syntax, but rather the combination of gebacken and gegeben. On this view, the verbs are
incompatible as far as the semantic roles are concerned.
To fix this under the phrasal approach, one could posit a mechanism such that the
semantic roles that are required for the coordinate phrase baked and given are shared by
each of its conjunct verbs and that they are therefore compatible. But this would amount
to saying that there are several verb senses for baked, something that the anti-lexicalists
claim to avoid, as discussed in the next section.
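The coordination argument can be restated as a small Python sketch. It makes several simplifying assumptions that are not in the source: valence is a plain tuple of case-marked NPs, "compatible" is approximated by identity of valence, and benefactive_rule is a hypothetical stand-in for the lexical rule that adds the recipient argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verb:
    form: str
    valence: tuple  # arguments the verb selects, in an illustrative encoding


def benefactive_rule(verb: Verb) -> Optional[Verb]:
    """Hypothetical lexical rule: license a three-place variant of a two-place verb."""
    if verb.valence == ("NP[nom]", "NP[acc]"):
        return Verb(verb.form, ("NP[nom]", "NP[dat]", "NP[acc]"))
    return None


def coordinate(left: Verb, right: Verb, conj: str = "und") -> Optional[Verb]:
    """V0 coordination: both conjuncts must bring the same valence,
    which is then the valence of the coordinate structure."""
    if left.valence != right.valence:
        return None
    return Verb(f"{left.form} {conj} {right.form}", left.valence)


gegeben = Verb("gegeben", ("NP[nom]", "NP[dat]", "NP[acc]"))
gebacken = Verb("gebacken", ("NP[nom]", "NP[acc]"))

without_rule = coordinate(gebacken, gegeben)                # None: 2 vs. 3 arguments
with_rule = coordinate(benefactive_rule(gebacken), gegeben)  # three-place coordination
```

On the lexical view the derived verb exists before coordination applies, so with_rule is well-formed. On the phrasal view only the two-place gebacken is available at this point, which corresponds to without_rule.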
11
http://www.thespinroom.com.au/?p=102, 2012-07-07.
12
http://www.musiker-board.de/diverses-ot/35977-die-liebe-637-print.html, 2012-06-08.
13
One might wonder whether these sentences could be instances of Right Node Raising (RNR) out of coordi-
nated VPs (Bresnan 1974; Abbott 1976):
(i) She [ offered ___ ] and [ made me ___ ] a wonderful espresso.
But this cannot be correct. Under such an analysis the first verb has been used without a benefactive or
recipient object. But me is interpreted as the recipient of both the offering and making. Secondly, the
second object can be an unstressed pronoun (She offered and made me it), which is not possible in RNR.
Note that offered and made cannot be a pseudo-coordination meaning ‘offered to make’. This is possible
only with stem forms of certain verbs such as try.
(16) a. active present participles (cf. The leaf is falling): the falling leaf
b. active past participles (cf. The leaf has fallen): the fallen leaf
c. passive participles (cf. The toy is being broken (by the child).): the broken toy
That the derived forms are adjectives, not verbs, is shown by a host of properties, includ-
ing negative un- prefixation: unbroken means ‘not broken’, just as unkind means ‘not
kind’, while the un- appearing on verbs indicates, not negation, but action reversal, as in
untie (Bresnan 1982b: 21; 2001: Chapter 3). Predicate adjectives preserve the subject of
predication of the verb and for prenominal adjectives the rule is simply that the role that
would be assigned to the subject goes to the modified noun instead (The toy remained
(un-)broken.; the broken toy). Being an A0, such a form can be coordinated with another
A0, as in the following:
In (17b), three adjectives are coordinated, one underived (old), one derived from a present
participle (rotting), and one from a passive participle (broken). Such coordination is com-
pletely mundane on a lexical theory. Each A0 conjunct has a valence feature (in HPSG it
would be the SPR feature for predicates or the MOD feature for the prenominal modifiers),
which is shared with the mother node of the coordinate structure. But the point of the
phrasal (or ASC) theory is to deny that words have such valence features.
The claim that lexical derivation of valence structure is distinct from phrasal combina-
tion is further supported with evidence from deverbal nominalization (Wechsler 2008a).
To derive nouns from verbs, -ing suffixation productively applies to all inflectable verbs
(the shooting of the prisoner), while morphological productivity is severely limited for
various other suffixes such as -(a)tion (* the shootation of the prisoner). So forms such
as destruction and distribution must be retrieved from memory while -ing nouns such as
looting or growing could be (and in the case of rare verbs or neologisms, must be) derived
from the verb or the root through the application of a rule (Zucchi 1993). This difference
explains why -ing nominals always retain the argument structure of the cognate verb,
while other forms show some variation. A famous example is the lack of the agent argu-
ment for the noun growth versus its retention by the noun growing: * John’s growth of
tomatoes versus John’s growing of tomatoes (Chomsky 1970).15
But what sort of rule derives the -ing nouns, a lexical rule or a phrasal one? In
Marantz’s (1997) phrasal analysis, a phrasal construction (notated as vP) is responsible
for assigning the agent role of -ing nouns such as growing. For him, none of the words
directly selects an agent via its argument structure. The -ing forms are permitted to ap-
pear in the vP construction, which licenses the possessive agent. Non-ing nouns such as
destruction and growth do not appear in vP. Whether they allow expression of the agent
depends on semantic and pragmatic properties of the word: destruction involves external
causation so it does allow an agent, while growth involves internal causation so it does
not allow an agent.
15
See Section 21.3.3 for further discussion.
However, a problem for Marantz is that these two types of nouns can coordinate and
share dependents (example (18a) is from Wechsler (2008a: Section 7)):
(18) a. With nothing left after the soldier’s [destruction and looting] of their home,
they reboarded their coach and set out for the port of Calais.16
b. The [cultivation, growing or distribution] of medical marijuana within the
County shall at all times occur within a secure, locked, and fully enclosed
structure, including a ceiling, roof or top, and shall meet the following re-
quirements.17
On the phrasal analysis, the nouns looting and growing occur in one type of syntactic
environment (namely vP), while the forms destruction, cultivation, and distribution occur in
a different syntactic environment. This places contradictory demands on the structure
of coordinations like those in (18). As far as we know, neither this problem nor the others
raised by Wechsler (2008a) have even been addressed by advocates of the phrasal theory
of argument structure.
Consider one last example. In an influential phrasal analysis, Hale and Keyser (1993)
derived denominal verbs like to saddle through noun incorporation out of a structure
akin to [PUT a saddle ON x]. Again, verbs with this putative derivation routinely coor-
dinate and share dependents with verbs of other types:
(19) Realizing the dire results of such a capture and that he was the only one to prevent
it, he quickly [saddled and mounted] his trusted horse and with a grim determi-
nation began a journey that would become legendary.18
As in all of these X0 coordination cases, under the phrasal analysis the two verbs place
contradictory demands on a single phrase structure.
A lexical valence structure is an abstraction or generalization over various occurrences
of the verb in syntactic contexts. To be sure, one key use of that valence structure is sim-
ply to indicate what sort of phrases the verb must (or can) combine with, and the result
of semantic composition; if that were the whole story then the phrasal theory would
be viable. But it is not. As it turns out, this lexical valence structure, once abstracted,
can alternatively be used in other ways: among other possibilities, the verb (crucially
including its valence structure) can be coordinated with other verbs that have similar
valence structures; or it can serve as the input to lexical rules specifying a new word
bearing a systematic relation to the input word. The coordination and lexical derivation
facts follow from the lexical view, while the phrasal theory at best leaves these facts as
mysterious and at worst leads to irreconcilable contradictions for the phrase structure.
as phrasal constructions.19 As was argued in Müller (2006) this is incompatible with the
assumption of lexical integrity. Lexical integrity means that word formation happens
before syntax and that the morphological structure is inaccessible to syntactic processes
(Bresnan & Mchombo 1995).20 Let us consider a concrete example, such as (20):
(20) a. Er tanzt die Schuhe blutig / in Stücke.
he dances the shoes bloody into pieces
b. die in Stücke / blutig getanzten Schuhe
the into pieces bloody danced shoes
c. * die getanzten Schuhe
the danced shoes
The shoes are not a semantic argument of tanzt. Nevertheless the referent of the NP
that is realized as accusative NP in (20a) is the element the adjectival participle in (20b)
predicates over. Adjectival participles like the one in (20b) are derived from a passive
participle of a verb that governs an accusative object. If the accusative object is licensed
phrasally by configurations like the one in (20a), then it is not possible to explain why
the participle getanzten can be formed despite the absence of an accusative object in the
valence specification of the verb. See Müller (2006: Section 5) for further examples of the
interaction of resultatives and morphology. The conclusion drawn by Dowty (1978: 412)
and Bresnan (1982b: 21) in the late 70s and early 80s is that phenomena which feed mor-
phology should be treated lexically. The natural analysis in frameworks like HPSG, CG,
CxG, and LFG is therefore one that assumes a lexical rule for the licensing of resultative
constructions. See Verspoor (1997), Wechsler (1997), Wechsler & Noh (2001), Wunderlich
(1992: 45; 1997: 120–126), Kaufmann & Wunderlich (1998), Müller (2002a: Chapter 5),
Kay (2005), and Simpson (1983) for lexical proposals in some of these frameworks. The
lexical approach assumes that the lexical item for the mono-valent tanz- is related to an-
other lexical item for tanz- that selects an object and a result predicate in addition to the
subject selected by the mono-valent variant. Inflection and adjective derivation apply to
this derived stem and the respective results can be used in (20a) or (20b).
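The interaction between the lexical rule and morphology can be illustrated with a short Python sketch. Everything in it is an assumption made for exposition: the Stem class, the argument labels, and the two functions are hypothetical and greatly simplify the analyses cited above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stem:
    form: str
    valence: tuple  # illustrative labels for the selected arguments


def resultative_rule(stem: Stem) -> Optional[Stem]:
    """Hypothetical lexical rule: relate a mono-valent stem to a variant
    that also selects an accusative object and a result predicate."""
    if stem.valence == ("subj",):
        return Stem(stem.form, ("subj", "obj[acc]", "result-pred"))
    return None


def adjectival_participle(stem: Stem, participle: str) -> Optional[dict]:
    """Adjective derivation inspects the valence of its input: it requires
    a stem that governs an accusative object, which becomes the modified noun."""
    if "obj[acc]" not in stem.valence:
        return None
    remaining = tuple(a for a in stem.valence if a not in ("subj", "obj[acc]"))
    return {"form": participle, "modifies": "obj[acc]", "valence": remaining}


tanz = Stem("tanz-", ("subj",))

bare = adjectival_participle(tanz, "getanzt")
# None, cf. (20c): * die getanzten Schuhe
derived = adjectival_participle(resultative_rule(tanz), "getanzt")
# still selects the result predicate, cf. (20b): die in Stücke getanzten Schuhe
```

The derivation succeeds only because the object and the result predicate are already present in the valence of the stem that morphology applies to.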
This argument for a lexical treatment of resultative constructions is similar to the one
that was discussed in connection with the GPSG representation of valence in Section 5.5:
morphological processes have to be able to see the valence of the element they apply to.
19
Asudeh & Toivonen (2014: Section 2.3) argue that their account is not constructional. If a construction is
a form-meaning pair, their account is constructional, since a certain c-structure is paired with a semantic
contribution. Asudeh & Toivonen (2014: Section 2.2) compare their approach with approaches in Con-
structional HPSG (Sag 1997) and Sign-Based Construction Grammar (see Section 10.6.2), which they term
constructional. The only difference between these approaches and the approach by Asudeh, Dalrymple &
Toivonen is that the constructions in the HPSG-based theories are modeled using types and hence have a
name.
20
Asudeh et al. (2013: 14) claim that the Swedish Directed Motion Construction does not interact with deriva-
tional morphology. However, the parallel German construction does interact with derivational morphol-
ogy. The absence of this interaction in Swedish can be explained by other factors of Swedish grammar and
given this I believe it to be more appropriate to assume an analysis that captures both the German and the
Swedish data in the same way.
This is not the case if arguments are introduced by phrasal configurations after the level
of morphology.
Asudeh, Dalrymple & Toivonen’s papers are about the concept of lexical integrity and
about constructions. Asudeh & Toivonen (2014) replied to our target article and pointed
out (again) that their template approach makes it possible to specify the functional struc-
ture of words and phrases alike. In the original paper they discussed the Swedish word
vägen, which is the definite form of väg ‘way’. They showed that the f-structure is parallel
to the f-structure for the English phrase the way. In our reply (2014b), we gave in
too early, I believe. The point is not about being able to provide the f-structure
of words; the point is about morphology, that is (in LFG terms) about deriving the
f-structure by a morphological analysis. More generally speaking, one wants to derive all
properties of the involved words, that is, their valence, their meaning, and the linking of
this meaning to their dependents. What we used in our argument based on the sentences
in (20) was parallel to what Bresnan (1982b: 21; 2001: 31) used in her classical argument
for a lexical treatment of the passive. So either Bresnan’s argument (and ours) is invalid
or both arguments are valid and there is a problem for Asudeh, Dalrymple & Toivonen’s
approach and for phrasal approaches in general. I want to give another example that was
already discussed in Müller (2006: 869) but was omitted in Müller & Wechsler (2014a)
due to space limitations. I will first point out why this example is problematic for phrasal
approaches and then explain why it is not sufficient to be able to assign certain f-struc-
tures to words: in (21a), we are dealing with a resultative construction. According to the
plugging approach, the resultative meaning is contributed by a phrasal construction into
which the verb fischt is inserted. There is no lexical item that requires a resultative pred-
icate as its argument. If no such lexical item exists, then it is unclear how the relation
between (21a) and (21b) can be established:
(21) a. [dass] jemand die Nordsee leer fischt
that somebody the North.Sea empty fishes
‘that somebody fishes the North Sea empty’
b. wegen der Leerfischung der Nordsee21
because of.the empty.fishing of.the North.Sea
‘because of the fishing that resulted in the North Sea being empty’
As Figure 21.1 on the next page shows, both the arguments selected by the heads and
the structures are completely different. In (21b), the element that is the subject of the
related construction in (21a) is not realized. As is normally the case in nominalizations,
it is possible to realize it in a PP with the preposition durch ‘by’:
(22) wegen der Leerfischung der Nordsee durch die Anrainerstaaten
because of.the empty.fishing of.the North.Sea by the neighboring.states
‘because of the fishing by the neighboring states that resulted in the North Sea
being empty’
If one assumes that the resultative meaning comes from a particular configuration in
which a verb is realized, there would be no explanation for (21b) since no verb is in-
volved in the analysis of this example. One could of course assume that a verb stem is
21
taz, 20.06.1996, p. 6.
[Figure 21.1: structures of the resultative clause in (21a) and the nominalization in (21b)]
inserted into a construction both in (21a) and (21b). The inflectional morpheme -t and
the derivational morpheme -ung as well as an empty nominal inflectional morpheme
would then be independent syntactic components of the analysis. However, since Gold-
berg (2003b: 119) and Asudeh et al. (2013) assume lexical integrity, only entire words can
be inserted into syntactic constructions and hence the analysis of the nominalization of
resultative constructions sketched here is not an option for them.
One might be tempted to try and account for the similarities between the phrases in
(21) using inheritance. One would specify a general resultative construction standing in
an inheritance relation to the resultative construction with a verbal head and the nom-
inalization construction. I have discussed this proposal in more detail in Müller (2006:
Section 5.3). It does not work as one needs embedding for derivational morphology and
this cannot be modeled in inheritance hierarchies (Krieger & Nerbonne (1993), see also
Müller (2006) for a detailed discussion).
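One way to see why embedding is needed is the following Python sketch. It assumes, purely for illustration, that inheriting from two descriptions amounts to unifying flat attribute-value descriptions; the attribute names and values are hypothetical and the sketch does not reproduce the formal argument in the works cited.

```python
def unify(a: dict, b: dict):
    """Combine two flat descriptions; fail (None) on conflicting values."""
    out = dict(a)
    for key, value in b.items():
        if key in out and out[key] != value:
            return None
        out[key] = value
    return out


# Illustrative descriptions, not taken from the source.
resultative_verb = {"cat": "V", "args": ("subj", "obj", "result-pred")}
ung_noun = {"cat": "N"}

# Inheritance: one object would have to satisfy both descriptions at once.
inherited = unify(resultative_verb, ung_noun)  # None: V and N conflict

# Embedding: the derived noun contains the verb as its base instead.
embedded = {"cat": "N", "affix": "-ung", "base": resultative_verb}
```

With embedding, the properties of the verb, including its resultative valence, remain accessible inside the noun without clashing with the noun's own properties.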
It would also be possible to assume that both constructions in (23), for which struc-
tures such as those in Figure 21.1 would have to be assumed, are connected via meta-
rules.22,23
(23) a. [ Sbj Obj Obl V ]
b. [ Det [ [ Adj V -ung ] ] NP[gen] ]
The construction in (23b) corresponds to Figure 21.2.24 The genitive NP is an argument
of the adjective. It has to be linked semantically to the subject slot of the adjective.
22
Goldberg (p. c. 2007, 2009) suggests connecting certain constructions using GPSG-like metarules. Depper-
mann (2006: 51), who has a more Croftian view of CxG, rules this out. He argues for active/passive al-
ternations that the passive construction has other information structural properties. Note also that GPSG
metarules relate phrase structure rules, that is, local trees. The structure in Figure 21.2, however, is highly
complex.
23
The structure in (23b) violates a strict interpretation of lexical integrity as is commonly assumed in LFG.
Booij (2005; 2009), working in Construction Grammar, subscribes to a somewhat weaker version, however.
24
I do not assume zero affixes for inflection. The respective affix in Figure 21.2 is there to show that
there is structure. Alternatively one could assume a unary branching rule/construction as is common
in HPSG/Construction Morphology.
[Figure 21.2: tree for the construction in (23b), with the nodes NP, Det, N′, N, NP[gen], N-Stem, and N-Affix]
Alternatively, one could assume that the construction only has the form [Adj V -ung ],
that is, that it does not include the genitive NP. But then one could also assume that
the verbal variant of the resultative construction has the form [OBL V] and that Sbj and
Obj are only represented in the valence lists. This would almost be a lexical analysis,
however.
Turning to lexical integrity again, I want to point out that all that Asudeh & Toivonen
can do is assign some f-structure to the N in Figure 21.2. What is needed, however, is
a principled account of how this f-structure comes about and how it is related to the
resultative construction on the sentence level.
My argument regarding -ung-nominalization from Müller (2006: Section 5.1) was also
taken up by Bruening (2018). I noted that Leerfischung should not be analyzed as a com-
pound of leer and Fischung with Fischung being the result of combining -ung with the
intransitive verb lexeme fisch- but rather that -ung should apply to a version of fisch-
that selects for a result predicate and its subject and that this version of Fischung is then
combined with the result predicate it selects for. Bruening argued against my analysis
claiming that all arguments of nouns are optional and that my analysis would predict
that there is a noun Fischung with a resultative meaning but without the resultative
predicate. As I pointed out in Müller (2006: 869), a noun Fischung does exist but it refers
to parts of a boat, not to an event nominalization. Bruening concludes that a syntactic
approach is needed and that -ung applies to the combination of leer and fisch-.
Now, while it is generally true that arguments of nouns can be omitted, there are
situations in which the argument cannot be omitted without changing the meaning.
Sebastian Nordhoff (p. c. 2017) found the following examples:
(24) a. Bartträger
beard.carrier
‘bearded man’
b. Spaßmacher
joke.maker
‘jester’
c. Arbeitgeber
work.giver
‘employer’
d. Unfallbauer
crash.builder
‘crasher’
e. Abibauer
secondary.school.leaving.examination.builder
‘secondary school leaving examination taker’
f. Traumfänger
dream.catcher
‘dreamcatcher’
g. Zeitschinder
time.grinder
‘temporizer’
h. Pläneschmieder
plans.forger
‘contriver’
For example, a Bartträger is somebody who has a beard. If one omitted the first part of
the compound, one would get a Träger ‘carrier’; a relation to the original sense cannot
be established. Similarly a Spaßmacher is literally a ‘joke.maker’. Without the first part
of the compound this would be Macher, which translates as ‘doer’ or ‘action man’. What
the examples above have in common is the following: the verbal parts are frequent and
in the most frequent uses of the verb the object is concrete. In the compounds above the
first part is unusual in that it is abstract. If the first element of the compound is omitted,
we get the default reading of the verb, something that is incompatible with the meaning
of the verb in the complete compound.
The contrast between Leerfischung and #Fischung can be explained in a similar way:
the default reading of fisch- is the one without resultative meaning. Without the realized
predicate we get the derivation product Fischung, which does not exist (with the relevant
meaning).
So, in a lexical analysis of resultatives we have to make sure that the resultative pred-
icate is not optional and this is what my analysis does. It says that fisch- needs a re-
sultative predicate. It does not say that it optionally takes a result predicate. What is
needed is a careful formulation of a theory of what can be dropped that ensures that no
arguments are omitted that are crucial for recognizing the sense of a certain construc-
tion/collocation. The nominalization rules have to be set up accordingly.25 I do not see
25
Note that this also applies to lexical theories of idioms of the kind suggested by Sag (2007) and Kay, Sag &
Flickinger (2015). If one analyses idioms like kick the habit and kick the bucket with a special lexical item
for kick, one has to make sure that the object of kick is not omitted since the idioms are not recognizable
without the object.
any problems for the analyses of resultatives and particle verbs that I suggested in Müller
(2002a; 2003c).
Before I turn to approaches with radical underspecification of argument structure in
the next section, I want to comment on a more recent paper by Asudeh, Giorgolo &
Toivonen (2014). The authors discuss the phrasal introduction of cognate objects and
benefactives. (25a) is an example of the latter construction.
(25) a. The performer sang the children a song.
b. The children were sung a song.
According to the authors, the noun phrase the children is not an argument of sing but
contributed by the c-structure rule that optionally licenses a benefactive.
(26) V′ →  V                  DP             DP
           ↑=↓                (↑ OBJ) = ↓    (↑ OBJ𝜃) = ↓
           ( @BENEFACTIVE )
Whenever this rule is called, the template BENEFACTIVE can add a benefactive role and
the respective semantics, provided this is compatible with the verb that is inserted into
the structure. The authors show how the mappings for the passive example in (25b)
work, but they do not provide the c-structure rule that licenses such examples. Unless
one assumes that arguments in (26) can be optional (see below), one would need a c-
structure rule for passive VPs and this rule has to license a benefactive as well.26 So it
would be:
(27) V′ →  V[pass]            DP
           ↑=↓                (↑ OBJ𝜃) = ↓
           ( @BENEFACTIVE )
Note that a benefactive cannot be added to just any verb: Adding a benefactive to an
intransitive verb as in (28a) is out and the passive that would correspond to (28a) is
ungrammatical as well, as (28b) shows:
(28) a. * He laughed the children.
b. * The children were laughed.
The benefactive template would account for the ungrammaticality of (28) since it requires
an ARG2 to be present and the intransitive laugh does not have an ARG2, but this
account would not extend to other verbs. For example, the template would admit the
sentences in (29b–c) since give with prepositional object has an ARG2 (Kibort 2008: 317).
(29) a. He gave it to Mary.
b. * He gave Peter it to Mary.
c. * Peter was given it to Mary.
26
See for instance Bergen & Chang (2005) and van Trijp (2011) for Construction Grammar analyses that as-
sume active and passive variants of phrasal constructions. See Cappelle (2006) on allostructions in general.
21 Phrasal vs. lexical analyses
give could combine with the to PP semantically and would then be equivalent to a transitive
verb as far as resources are concerned (looking for an ARG1 and an ARG2). The
benefactive template would map the ARG2 to ARG3 and hence (29b) would be licensed.
Similar examples can be constructed with other verbs that take prepositional objects,
for instance accuse sb. of something. Since there are verbs that take a benefactive and a
PP object as shown by (30), (29b) cannot be ruled out with reference to non-existing c-
structure rules.
(30) I buy him a coat for hundred dollar.
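The overgeneration problem can be made concrete with a small sketch. The Python fragment below is an illustration only: the dictionary VERBS, the function benefactive_applies, and the reduction of the template to a bare check for the presence of an ARG2 are simplifying assumptions of mine, not the formalization of Asudeh, Giorgolo & Toivonen (2014).

```python
# Toy model: each verb is listed with the argument slots it is still looking
# for. "give + to-PP" stands for give after it has combined with its
# prepositional object (an assumption made for this illustration).
VERBS = {
    "laugh": {"ARG1"},
    "sing": {"ARG1", "ARG2"},
    "give + to-PP": {"ARG1", "ARG2"},
}


def benefactive_applies(verb):
    """Simplified template condition: the verb must provide an ARG2."""
    return "ARG2" in VERBS[verb]


for verb in VERBS:
    print(verb, benefactive_applies(verb))
```

Under these assumptions the check excludes laugh, as required for (28), but it also admits give with a to PP, which corresponds to the unwanted result in (29b–c).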
So, if the c-structure is to play a role in argument structure constructions at all, one
could not just claim that all c-structure rules optionally introduce a benefactive argu-
ment. Therefore there is something special about the two rules in (26) and (27). The
problem is that there is no relation between these rules. They are independent state-
ments saying that there can be a benefactive in the active and that there can be one in
the passive. This is what Chomsky (1957: 43) criticized in 1957 with respect to simple
phrase structure grammar and this was the reason for the introduction of transforma-
tions. Bresnan-style LFG captured the generalizations by lexical rules (Bresnan 1978;
1982b) and later by lexical rules in combination with Lexical Mapping Theory (Toivonen
2013). But if elements are added outside the lexical representations, the representations
where these elements are added have to be related too. One could say that our knowl-
edge about formal tools has changed since 1957. We now can use inheritance hierarchies
to capture generalizations. So one can assume a type (or a template) that is the supertype
of all those c-structure rules that introduce a benefactive. But since not all rules allow
for the introduction of a benefactive element, this basically amounts to saying: c-structure
rules A, B, and C allow for the introduction of a benefactive. In comparison, lexical
rule-based approaches have one statement introducing the benefactive. The lexical rule
states what verbs are appropriate for adding a benefactive and syntactic rules are not
affected.
Asudeh (p. c. May 2016) and an anonymous reviewer of HeadLex16 pointed out to me
that the rules in (26) and (27) can be generalized over if the arguments in (26) are made
optional. (31) shows the rule in (26) with the DPs marked as optional by the brackets
enclosing them.
(31) V′ →  V                  (DP)           (DP)
           ↑=↓                (↑ OBJ) = ↓    (↑ OBJ𝜃) = ↓
           ( @BENEFACTIVE )
Since both of the DPs are optional, (31) is equivalent to a specification of four rules,
namely (26) and the three versions of the rule in (32):
(32) a. V′ →  V                  DP
              ↑=↓                (↑ OBJ𝜃) = ↓
              ( @BENEFACTIVE )
     b. V′ →  V                  DP
              ↑=↓                (↑ OBJ) = ↓
              ( @BENEFACTIVE )
     c. V′ →  V
              ↑=↓
              ( @BENEFACTIVE )
(32a) is the variant of (31) in which the OBJ is omitted (needed for (33a)), (32b) is the
variant in which the OBJ𝜃 is omitted (needed for (33b)) and in (32c) both DPs are omitted
(needed for (33c)).
(33) a. She had been prepared divine and elaborate meals.
b. What kind of picture did the kids draw the teacher?
c. Such divine and elaborate meals, she had never been prepared before, not even
by her ex-husband who was a professional chef.
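The equivalence between a rule with optional daughters and a set of fully specified rules can be checked mechanically. The following Python sketch is only an illustration: the function expand_optional, the triple encoding of daughters, and the ASCII labels (OBJ_theta for OBJ𝜃, "head" for ↑=↓) are my own assumptions and not part of any LFG implementation.

```python
from itertools import product


def expand_optional(mother, daughters):
    """Expand a rule with optional daughters into all fully specified variants.

    daughters is a list of (category, annotation, optional) triples.
    """
    # An optional daughter may be kept or dropped; an obligatory one is kept.
    choices = [(True, False) if optional else (True,)
               for _, _, optional in daughters]
    rules = []
    for keep in product(*choices):
        rhs = [(cat, ann) for (cat, ann, _), k in zip(daughters, keep) if k]
        rules.append((mother, rhs))
    return rules


# Rule (31): the verb is obligatory, both DPs are optional.
RULE_31 = [
    ("V", "head, (@BENEFACTIVE)", False),
    ("DP", "OBJ", True),
    ("DP", "OBJ_theta", True),
]

for mother, rhs in expand_optional("V'", RULE_31):
    print(mother, "->", "  ".join(f"{cat}[{ann}]" for cat, ann in rhs))
```

The loop prints four rules: one with both DPs, one with only the OBJ, one with only the OBJ𝜃, and one with the verb alone.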
Hence, (31) can be used for V′s containing two objects, for V′s in the passive containing
just one object, for V′s with the secondary object extracted and for V′s in the passive with
the secondary object extracted. The template-based approach does not overgenerate
since the benefactive template is specified such that it requires the verb it applies to to
select for an ARG2. Since intransitives like laugh do not select for an ARG2 a benefactive
cannot be added. So, in fact the actual configuration in the c-structure rule plays only
a minor role: the account mainly relies on semantics and resource sensitivity. There
is one piece of information that is contributed by the c-structure rule: it constrains the
grammatical functions of ARG2 and ARG3, which are disjunctively specified in the template
definitions for ARG2 and ARG3: ARG2 can be realized as SUBJ or as OBJ. In the active
case ARG1 will be the SUBJ and because of function argument bi-uniqueness (Bresnan
et al. 2016: 334) no other element can be the SUBJ and hence ARG2 has to be an OBJ. ARG3
can be either an OBJ or an OBJ𝜃. Since ARG2 is an OBJ in the active, ARG3 has to be an
OBJ𝜃 in the active. In the passive case ARG1 is suppressed or realized as OBL𝜃 (by PP).
ARG2 will be realized as SUBJ (since English requires a SUBJ to be realized) and ARG3 could
be realized as either OBJ or OBJ𝜃. This is not constrained by the template specifications
so far. Because of the optionality in (31), either the OBJ or the OBJ𝜃 function could be
chosen for ARG3. This means that either Lexical Mapping Theory has to be revised or
one has to make sure that the c-structure rule used in the passive of benefactives states
the grammatical function of the object correctly. Hence one would need the c-structure
rule in (27) and then there would be the missing generalization I pointed out above.
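The reasoning about grammatical functions in the preceding paragraph can be replayed as a small enumeration. The Python sketch below is a simplification under stated assumptions: OPTIONS encodes only the two disjunctions mentioned in the text, the function name mappings is hypothetical, and the suppression of ARG1 in the passive is modeled by simply leaving it out.

```python
from itertools import product

# Disjunctive options for ARG2 and ARG3 as described in the text.
OPTIONS = {"ARG2": ("SUBJ", "OBJ"), "ARG3": ("OBJ", "OBJ_theta")}


def mappings(passive):
    """Enumerate assignments of grammatical functions to ARG2 and ARG3 that
    respect bi-uniqueness and the requirement that a SUBJ be realized."""
    # In the active ARG1 is the SUBJ; in the passive it is left out here.
    fixed = {} if passive else {"ARG1": "SUBJ"}
    results = []
    for f2, f3 in product(OPTIONS["ARG2"], OPTIONS["ARG3"]):
        assignment = dict(fixed, ARG2=f2, ARG3=f3)
        functions = list(assignment.values())
        bi_unique = len(functions) == len(set(functions))
        has_subject = "SUBJ" in functions
        if bi_unique and has_subject:
            results.append(assignment)
    return results


print(mappings(passive=False))
print(mappings(passive=True))
```

In the active exactly one assignment survives (ARG2 as OBJ, ARG3 as OBJ𝜃). In the passive two assignments survive, which differ in whether ARG3 is OBJ or OBJ𝜃; this is the underdetermination discussed above.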
If one finds a way to set up the mappings to grammatical functions without reference
to c-structures in lexical templates, this means that it is not the case that an argument is
added by a certain configuration the verb enters in. Since any verb may enter (31) and
since the only important thing is the interaction between the lexical specification of the
verb and the benefactive template, the same structures would be licensed if the benefac-
tive template were added to the lexical items of verbs directly. The actual configuration
would not constrain anything. All (alleged) arguments from language acquisition and
psycholinguistics (see Sections 21.6 and 21.7) for phrasal analyses would not apply to
such a phrasal account.
If the actual c-structure configuration does not contribute any restrictions as to what
arguments may be realized and what grammatical functions they get, the difference
between the lexical use of the benefactive template and the phrasal introduction as exe-
cuted in (31) is really minimal. However, there is one area in grammar where there is a
difference: coordination. As Müller & Wechsler (2014a: Section 6.1) pointed out, it is pos-
sible to coordinate ditransitive verbs with verbs that appear together with a benefactive.
(34) is one of their examples:
(34) She then offered and made me a wonderful espresso — nice.27
If the benefactive information is introduced at the lexical level the coordinated verbs
basically have the same selectional requirements. If the benefactive information is introduced
at the phrasal level, offered and made are coordinated and then the benefactive
constraints are imposed on the result of the coordination by the c-structure rule. While
it is clear that the lexical items that would be assumed in a lexical approach can be coor-
dinated in a symmetric coordination, problems seem to arise for the phrasal approach. It
is unclear how the asymmetric coordination of the mono- and ditransitive verbs can be
accounted for and how the constraints of the benefactive template are distributed over
the two conjuncts. The fact that the benefactive template is optional does not help here
since the optionality means that the template is either called or it is not. The situation
is depicted in Figure 21.3. The optionality of the template call in the top figure basi-
cally corresponds to the disjunction of the two trees in the lower part of the figure. The
optionality does not allow for a distribution to one of the daughters in a coordination.
Mary Dalrymple (p. c. 2016) pointed out that the coordination rule that coordinates
two verbs can be annotated with two optional calls of the benefactive template.
(35) V →  V                  Conj  V
          ( @BENEFACTIVE )         ( @BENEFACTIVE )
In an analysis of the example in (34), the template in rule (26) would not be called but the
respective templates in (35) would be called instead. While this does work technically,
similar coordination rules would be needed for all other constructions that introduce
arguments in c-structures. Furthermore, the benefactive would have to be introduced
in several unrelated places in the grammar and finally the benefactive is introduced at
nodes consisting of a single verb without any additional arguments being licensed, which
means that one could have gone for the lexical approach right away.
Timm Lichte (p. c. 2016) pointed out an important consequence of a treatment of coor-
dination via (35): since the result of the coordination behaves like a normal ditransitive
verb it would enter the normal ditransitive construction. Toivonen’s original motivation
for a phrasal analysis was the observation that extraction out of and passivization of
benefactive constructions is restricted for some speakers (Toivonen 2013: 416):
27
http://www.thespinroom.com.au/?p=102 2012-07-07
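[Figure 21.3: Top: a VP consisting of V, NP and NP with an optional call of the BENEFACTIVE template, the V being a coordination V Conj V. Bottom: the two trees that the optional call is equivalent to (≡).]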
(37) a. * My sister was carved a soap statue of Bugs Bunny (by a famous sculptor).
b. My sister was given a soap statue of Bugs Bunny (by a famous sculptor).
(2014) argued that Swedish is different from German and hence there would not be a
problem. However, the situation is different with the benefactive constructions. Al-
though English and German do differ in many respects, both languages have similar
dative constructions:
(38) a. He baked her a cake.
b. Er buk ihr einen Kuchen.
he baked her.DAT a.ACC cake
Now, the analysis of the free constituent order in German was explained by assuming
binary branching structures in which a VP node is combined with one of its arguments
or adjuncts (see Section 7.4). The c-structure rule is repeated in (39):
(39) VP →  NP                           VP
           (↑ SUBJ | OBJ | OBJ𝜃) = ↓    ↑=↓
The dependent elements contribute to the f-structure of the verb and coherence/com-
pleteness ensure that all arguments of the verb are present. One could add the intro-
duction of the benefactive argument to the VP node of the right-hand side of the rule.
However, since the verb-final variant of (38b) would have the structure in (40), one would
get spurious ambiguities, since the benefactive could be introduced at every node:
(40) weil [VP er [VP ihr [VP einen Kuchen [VP [V buk]]]]]
because he her a cake baked
So the only option seems to be to introduce the benefactive at the rule that got the
recursion going, namely the rule that projected the lexical verb to the VP level. The rule
(39) from page 237 is repeated as (41) for convenience.
(41) VP → (V)
↑=↓
Note also that benefactive datives appear in adjectival environments as in (42):
(42) a. der seiner Frau einen Kuchen backende Mann
the his.DAT wife a.ACC cake baking man
‘the man who is baking a cake for her’
b. der einen Kuchen seiner Frau backende Mann
the a.ACC cake his.DAT wife baking man
‘the man who is baking a cake for her’
In order to account for these datives one would have to assume that the adjective-to-AP
rule that would be parallel to (41) introduces the dative. The semantics of the benefac-
tive template would have to somehow make sure that the benefactive argument is not
added to intransitive verbs like lachen ‘to laugh’ or participles like lachende ‘laughing’.
While this may be possible, I find the overall approach unattractive. First it does not
have anything to do with the original constructional proposal but just states that the
21.3 Radical underspecification: the end of argument structure?
benefactive may be introduced at several places in syntax, second the unary branching
syntactic rule is applying to a lexical item and hence is very similar to a lexical rule and
third the analysis does not capture cross-linguistic commonalities of the construction.
In a lexical rule-based approach such as the one that was suggested by Briscoe & Copestake
(1999: Section 5), a benefactive argument is added to certain verbs and the lexical rule is
parallel in all languages that have this phenomenon (Müller 2018a). The respective lan-
guages differ simply in the way the arguments are realized with respect to their heads.
In languages that have adjectival participles, these are derived from the respective verbal
stems. The morphological rule is the same independent of benefactive arguments and
the syntactic rules for adjectival phrases do not have to mention benefactive arguments.
I discuss the template-based approach in more detail in Müller (2018a). This book also
contains a fully worked out analysis of the benefactive and the resultative constructions
and their interactions in German and English in the framework of HPSG.
28
See Müller (2010a: Section 11.11.3) for a detailed discussion of Haugereid’s approach.
29
Dowty (1989) called the system in (43a) an ordered argument system.
(1996) further noted the possibility of mixed accounts such as (43c), in which the agent
(subject) argument is severed from the kill′ relation, but the theme (object) remains an
argument of the kill′ relation.30
In other words, the lexical approach is neutral on the question of the “conceptual struc-
ture” of eventualities, as noted already in a different connection in Section 21.1.4. For this
reason, certain semantic arguments for the neo-Davidsonian approach, such as those put
forth by Schein (1993: Chapter 4) and Lohndal (2012), do not directly bear upon the issue
of lexicalism, as far as we can tell.
But Kratzer (1996), among others, has gone further and argued for an account that
is neo-Davidsonian (or rather, mixed) “in the syntax”. Kratzer’s claim is that the verb
specifies only the internal argument(s), as in (45a) or (45b), while the agent (external
argument) role is assigned by the phrasal structure. On the “neo-Davidsonian in the
syntax” view, the lexical representation of the verb has no arguments at all, except the
event variable, as shown in (45c).
On such accounts, the remaining dependents of the verb receive their semantic roles
from silent secondary predicates, which are usually assumed to occupy the positions of
functional heads in the phrase structure. An Event Identification rule identifies the event
variables of the verb and the silent light verb (Kratzer 1996: 22); this is why the existential
quantifiers in (43) have been replaced with lambda operators in (45). A standard term
for the agent-assigning silent predicate is “little v” (see Section 4.1.4 on little v). These
30
The event variable is shown as existentially bound, as in Davidson’s original account. As discussed below,
in Kratzer’s version it must be bound by a lambda operator instead.
extra-lexical dependents are the analogs of the ones contributed by the constructions in
Construction Grammar.
In the following subsections we address arguments that have been put forth in favor
of the little v hypothesis, from idiom asymmetries (Section 21.3.2) and deverbal nomi-
nals (Section 21.3.3). We argue that the evidence actually favors the lexical view. Then
we turn to problems for exoskeletal approaches, from idiosyncratic syntactic selection
(Section 21.3.4) and expletives (Section 21.3.5). We conclude with a look at the treatment
of idiosyncratic syntactic selection under Borer’s exoskeletal theory (Section 21.3.7), and
a summary (Section 21.3.8).
On the other hand, one does not often find special meanings of a verb associated with
the choice of subject, leaving the object position open (examples from Marantz (1984:
26)):
Kratzer observes that a mixed representation of kill as in (48a) allows us to specify vary-
ing meanings that depend upon its sole NP argument.
On the polyadic (Davidsonian) theory, the meaning could similarly be made to depend
upon the filler of the agent role. On the polyadic view, “there is no technical obstacle”
(Kratzer 1996: 116) to conditions like those in (48b), except reversed, so that it is the filler
of the agent role instead of the theme role that affects the meaning. But, she writes, this
could not be done if the agent is not an argument of the verb. According to Kratzer, the
agent-severed representation (such as (48a)) disallows similar constraints on the mean-
ing that depend upon the agent, thereby capturing the idiom asymmetry.
But as noted by Wechsler (2005), “there is no technical obstacle” to specifying agent-
dependent meanings even if the Agent has been severed from the verb as Kratzer pro-
poses. It is true that there is no variable for the agent in (48a). But there is an event
variable e, and the language user must be able to identify the agent of e in order to in-
terpret the sentence. So one could replace the variable a with “the agent of e” in the
expressions in (48b), and thereby create verbs that violate the idiom asymmetry.
While this may seem to be a narrow technical or even pedantic point, it is nonetheless
crucial. Suppose we try to repair Kratzer’s argument with an additional assumption: that
modulations in the meaning of a polysemous verb can only depend upon arguments of
the relation denoted by that verb, and not on other participants in the event. Under
that additional assumption, it makes no difference whether the agent is severed from
the lexical entry or not. For example, consider the following (mixed) neo-Davidsonian
representation of the semantic content in the lexical entry of kill:
(49) kill: 𝜆𝑦𝜆𝑥𝜆𝑒 [𝑘𝑖𝑙𝑙 (𝑒, 𝑦) ∧ 𝑎𝑔𝑒𝑛𝑡 (𝑒, 𝑥)]
Assuming that sense modulations can only be affected by arguments of the kill(e,y) rela-
tion, we derive the idiom asymmetry, even if (49) is the lexical entry for kill. So suppose
that we try to fix Kratzer’s argument with a different assumption: that modulations in
the meaning of a polysemous verb can only depend upon an argument of the lexically
denoted function. Kratzer’s “neo-Davidsonian in the syntax” lexical entry in (45a) lacks
the agent argument, while the lexical entry in (49) clearly has one. But Kratzer’s entry
still fails to predict the asymmetry because, as noted above, it has the e argument and
so the sense modulation can be conditioned on the “agent of e”. As noted above, that
event argument cannot be eliminated (for example through existential quantification)
because it is needed in order to undergo event identification with the event argument of
the silent light verb that introduces the agent (Kratzer 1996: 22).
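The technical point can be illustrated with a toy model. In the Python sketch below, events are plain dictionaries; the names agent_of and severed_entry and the sense labels are hypothetical and serve only to show that an entry without an agent argument can still condition its sense on "the agent of e". This is a sketch under these assumptions, not a rendering of Kratzer's or Wechsler's formal systems.

```python
def agent_of(event):
    """Return the agent participant recorded for an event."""
    return event["agent"]


def severed_entry(theme):
    """Agent-severed entry in the spirit of 'lambda y. lambda e. verb(e, y)'.

    The returned function takes only the event, yet its sense label is
    conditioned on the agent of that event.
    """
    def sense(event):
        if event["theme"] != theme:
            return None  # the event does not involve this theme
        if agent_of(event) == "hypothetical agent":
            return "agent-dependent sense"
        return "default sense"
    return sense


event = {"agent": "hypothetical agent", "theme": "y"}
print(severed_entry("y")(event))
```

Although the agent is never passed to severed_entry, the sense still varies with it, since the event argument gives access to the agent.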
Moreover, recasting Kratzer’s account in lexicalist terms allows for verbs to vary. This
is an important advantage, because the putative asymmetry is only a tendency. The
following are examples in which the subject is a fixed part of the idiom and there are
open slots for non-subjects:31
(50) a. A little bird told X that S.
‘X heard the rumor that S.’
b. The cat’s got X’s tongue.
‘X cannot speak.’
31
(50a) is from Nunberg, Sag & Wasow (1994: 526); (50b) and (50c) are from
Bresnan (1982a: 349–350).
c. What’s eating X?
‘Why is X so galled?’
Further data and discussion of subject idioms in English and German can be found in
Müller (2007a: Section 3.2.1).
The tendency towards a subject-object asymmetry plausibly has an independent ex-
planation. Nunberg, Sag & Wasow (1994) argue that the subject-object asymmetry is
a side-effect of an animacy asymmetry. The open positions of idioms tend to be ani-
mate while the fixed positions tend to be inanimate. Nunberg et al. (1994) derive these
animacy generalizations from the figurative and proverbial nature of the metaphorical
transfers that give rise to idioms. If there is an independent explanation for this ten-
dency, then a lexicalist grammar successfully encodes those patterns, perhaps with a
mixed neo-Davidsonian lexical decomposition, as explained above (see Wechsler (2005)
for such a lexical account of the verbs buy and sell). But the little v hypothesis rigidly
predicts this asymmetry for all agentive verbs, and that prediction is not borne out.
In contrast, nominals derived from obligatorily transitive verbs such as destroy allow
expression of the agent, as shown in (54a):
Following a suggestion by Chomsky (1970), Marantz (1997) argued on the basis of these
data that the agent role is lacking from lexical entries. In verbal projections like (51) and
(53) the agent role is assigned in the syntax by little v. Nominal projections like (52)
and (54) lack little v. Instead, pragmatics takes over to determine which agents can be
expressed by the possessive phrase: the possessive can express “the sort of agent implied
by an event with an external rather than an internal cause” because only the former
can “easily be reconstructed” (quoted from Marantz (1997: 218)). The destruction of a
city has a cause external to the city, while the growth of tomatoes is internally caused
by the tomatoes themselves (Smith 1970). Marantz points out that this explanation is
unavailable if the noun is derived from a verb with an argument structure specifying
its agent, since the deverbal nominal would inherit the agent of a causative alternation
verb.
The empirical basis for this argument is the putative mismatch between the allowa-
bility of agent arguments, across some verb-noun cognate pairs: e.g., grow allows the
agent but growth does not. But it turns out that the grow/growth pattern is rare. Most
deverbal nominals precisely parallel the cognate verb: if the verb has an agent, so does
the noun. Moreover, there is a ready explanation for the exceptional cases that exhibit
the grow/growth pattern (Wechsler 2008a). First consider non-alternating theme-only
intransitives (unaccusatives), as in (55) and non-alternating transitives as in (56). The
pattern is clear: if the verb is agentless, then so is the noun:
(55) arriv(al), disappear(ance), fall etc.:
a. A letter arrived.
b. the arrival of the letter
c. * The mailman arrived a letter.
d. * the mailman’s arrival of the letter
(56) destroy/destruction, construct(ion), creat(ion), assign(ment) etc.:
a. The army is destroying the city.
b. the army’s destruction of the city
This favors the view that the noun inherits the lexical argument structure of the verb.
For the anti-lexicalist, the badness of (55c) and (55d), respectively, would have to receive
independent explanations. For example, on Harley & Noyer’s (2000) proposal, (55c) is
disallowed because a feature of the root ARRIVE prevents it from appearing in the con-
text of v, but (55d) is instead ruled out because the cause of an event of arrival cannot be
easily reconstructed from world knowledge. This exact duplication in two separate com-
ponents of the linguistic system would have to be replicated across all non-alternating
intransitive and transitive verbs, a situation that is highly implausible.
Turning to causative alternation verbs, Marantz’s argument is based on the implicit
generalization that noun cognates of causative alternation verbs (typically) lack the
agent argument. But apart from the one example of grow/growth, there do not seem
to be any clear cases of this pattern. Besides grow(th), Chomsky (1970: examples (7c)
and (8c)) cited two experiencer predicates, amuse and interest: John amused (interested)
the children with his stories versus * John’s amusement (interest) of the children with his
stories. But this was later shown by Rappaport (1983) and Dowty (1989) to have an inde-
pendent aspectual explanation. Deverbal experiencer nouns like amusement and interest
typically denote a mental state, where the corresponding verb denotes an event in which
such a mental state comes about or is caused. These result nominals lack not only the
agent but all the eventive arguments of the verb, because they do not refer to events. Ex-
actly to the extent that such nouns can be construed as representing events, expression
of the agent becomes acceptable.
In a response to Chomsky (1970), Carlota Smith (1972) surveyed Webster’s dictionary
and found no support for Chomsky’s claim that deverbal nominals do not inherit agent
arguments from causative alternation verbs. She listed many counterexamples, includ-
ing “explode, divide, accelerate, expand, repeat, neutralize, conclude, unify, and so on at
length.” (Smith 1972: 137). Harley & Noyer (2000) also noted many so-called “exceptions”:
explode, accumulate, separate, unify, disperse, transform, dissolve/dissolution,
detach(ment), disengage(ment), and so on. The simple fact is that these are not exceptions
because there is no generalization to which they can be exceptions. These long lists of
verbs represent the norm, especially for suffix-derived nominals (in -tion, -ment, etc.).
Many zero-derived nominals from alternating verbs also allow the agent, such as change,
release, and use: my constant change of mentors from 1992–1997; the frequent release of the
prisoners by the governor; the frequent use of sharp tools by underage children (examples
from Borer (2003: fn. 13)).32
Like the experiencer nouns mentioned above, many zero-derived nominals lack event
readings. Some reject all the arguments of the corresponding eventive verb, not just the
agent: * the freeze of the water, * the break of the window, and so on. According to Stephen
Wechsler, his drop of the ball is slightly odd, but the drop of the ball has exactly the
same degree of oddness. The locution a drop in temperature matches the verbal one The
temperature dropped, and both verbal and nominal forms disallow the agent: * The storm
dropped the temperature. * the storm’s drop of the temperature. In short, the facts seem to
point in exactly the opposite direction from what has been assumed in this oft-repeated
argument against lexical valence. Apart from the one isolated case of grow/growth, event-
denoting deverbal nominals match their cognate verbs in their argument patterns.
Turning to grow/growth itself, we find a simple explanation for its unusual behavior
(Wechsler 2008a). When the noun growth entered the English language, causative (tran-
sitive) grow did not exist. The OED provides these dates of the earliest attestations of
grow and growth:
Thus growth entered the language at a time when transitive grow did not exist. The
argument structure and meaning were inherited by the noun from its source verb, and
then preserved into present-day English. This makes perfect sense if, as we claim, words
have predicate argument structures. Nominalization by -th suffixation is not productive
32
Pesetsky (1996: 79, ex. (231)) assigns a star to the thief’s return of the money, but it is acceptable to many
speakers. The Oxford English Dictionary lists a transitive sense for the noun return (definition 11a), and
corpus examples like her return of the spoils are not hard to find.
in English, so growth is listed in the lexicon. To explain why growth lacks the agent we
need only assume that a lexical entry’s predicate argument structure dictates whether
it takes an agent argument or not. So even this one word provides evidence for lexical
argument structure.
There are language-internal niches with the same type of prepositional objects. For in-
stance, freuen ‘to rejoice over/about’ and lachen ‘laugh at/about’ take über as well. But
there is no general way to predict on semantic grounds which preposition has to be
taken.
It is often impossible to find semantic motivation for case. In German there is a ten-
dency to replace genitive (61a) with dative (61b) with no apparent semantic motivation:
(61) a. dass der Opfer gedacht werde
that the victims.GEN remembered is
‘that the victims would be remembered’
b. daß auch hier den Opfern des Faschismus gedacht werde […]33
that also here the victims.DAT of.the fascism remembered is
‘that the victims of fascism would be remembered here too’
The synonyms treffen and begegnen ‘to meet’ govern different cases (example from Pol-
lard & Sag (1987: 126)).
(62) a. Er traf den Mann.
he.NOM met the.ACC man
b. Er begegnete dem Mann.
he.NOM met the.DAT man
One has to specify the case that the respective verbs require in the lexical items of the
verbs.34
A radical variant of the plugging approach is suggested by Haugereid (2009).35 Hau-
gereid (pages 12–13) assumes that the syntax combines a verb with an arbitrary combi-
nation of a subset of five different argument roles. Which arguments can be combined
with a verb is not restricted by the lexical item of the verb.36 A problem for such views
is that the meaning of an ambiguous verb sometimes depends on which of its arguments
are expressed. The German verb borgen has the two translations ‘borrow’ and ‘lend’,
which basically are two different perspectives on the same event (see Kunze (1991; 1993)
for an extensive discussion of verbs of exchange of possession). Interestingly, the dative
object is obligatory only with the ‘lend’ reading (Müller 2010a: 403):
(63) a. Ich borge ihm das Eichhörnchen.
I lend him the squirrel
‘I lend the squirrel to him.’
33
Frankfurter Rundschau, 07.11.1997, p. 6.
34
Or at least mark the fact that treffen takes an object with the default case for objects and begegnen takes a
dative object in German. See Haider (1985b), Heinz & Matiasek (1994), and Müller (2001) on structural and
lexical case.
35
Technical aspects of Haugereid’s approach are discussed in Section 21.3.6.
36
Haugereid has the possibility to impose valence restrictions on verbs, but he claims that he uses this pos-
sibility just in order to get a more efficient processing of his computer implementation (p. 13).
21.3.5 Expletives
Final examples of the irreducibility of valence to semantics are verbs that select for
expletives and reflexive arguments of inherently reflexive verbs in German:
(64) a. weil es regnet
because it rains
b. weil (es) mir (vor der Prüfung) graut
because EXPL me.DAT before the exam dreads
‘because I am dreading the exam’
c. weil er es bis zum Professor bringt
because he EXPL until to.the professor brings
‘because he made it to professor’
21.3 Radical underspecification: the end of argument structure?
• Arg2: patient
• Arg3: benefactive or recipient
• Arg4: goal
• Arg5: antecedent
Here, antecedent is a more general role that stands for instrument, comitative, manner
and source. The roles Arg1–Arg3 correspond to subject and objects. Arg4 is a resultative
predicate of the end of a path. Arg4 can be realized by a PP, an AP or an NP. (65) gives
examples for the realization of Arg4:
(65) a. John smashed the ball out of the room.
b. John hammered the metal flat.
c. He painted the car a brilliant red.
Whereas Arg4 follows the other participants in the causal chain of events, the antecedent
precedes the patient in the order of events. It is realized as a PP. (66) is an example of
the realization of Arg5:
(66) John punctured the balloon with a needle.
Haugereid now assumes that argument frames consist of these roles. He provides the
examples in (67):
(67) a. John smiles. (arg1-frame)
b. John smashed the ball. (arg12-frame)
c. The boat arrived. (arg2-frame)
d. John gave Mary a book. (arg123-frame)
e. John gave a book to Mary. (arg124-frame)
f. John punctured the ball with a needle. (arg125-frame)
Haugereid points out that multiple verbs can occur in multiple argument frames. He
provides the variants in (68) for the verb drip:
(68) a. The roof drips. (arg1-frame)
b. The doctor drips into the eyes. (arg14-frame)
c. The doctor drips with water. (arg15-frame)
d. The doctor drips into the eyes with water. (arg145-frame)
e. The roof drips water. (arg12-frame)
f. The roof drips water into the bucket. (arg124-frame)
g. The doctor dripped the eyes with water. (arg125-frame)
h. The doctor dripped into the eyes with water. (arg145-frame)
i. John dripped himself two drops of water. (arg123-frame)
j. John dripped himself two drops of water into his eyes. (arg1234-frame)
k. John dripped himself two drops of water into his eyes with a drop counter.
(arg12345-frame)
l. Water dripped. (arg2-frame)
m. It drips. (arg0-frame)
He proposes the inheritance hierarchy in Figure 21.4 in order to represent all possible
argument combinations. The Arg5 role is omitted due to space considerations.
Haugereid assumes binary-branching structures where arguments can be combined
with a head in any order. There is a dominance schema for each argument role. The
schema realizing the argument role 3 provides a link value arg3+. If the argument role 2
is provided by another schema, we arrive at the frame arg23. For unergative intransitive
verbs, it is possible to determine that they have an argument frame of arg1. This frame is only
compatible with the types arg1+, arg2−, arg3− and arg4−. Verbs that have an optional
object are assigned to arg1-12 according to Haugereid. This type allows for the following
combinations: arg1+, arg2−, arg3− and arg4− as well as arg1+, arg2+, arg3− and arg4−.
This approach comes very close to an idea by Goldberg: verbs are underspecified
with regard to the sentence structures in which they occur and it is only the actual
realization of arguments in the sentence that decides which combinations of arguments
are realized. One should bear in mind that the hierarchy in Figure 21.4 corresponds to
a considerable disjunction: it lists all possible realizations of arguments. If we say that
essen ‘to eat’ has the type arg1-12, then this corresponds to the disjunction arg1 ∨ arg12.
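The disjunctive character of such types can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: frames are modeled as sets of role numbers, and the names `FRAME_TYPES` and `licenses` are hypothetical and not taken from Haugereid's implementation.

```python
# Illustrative sketch only: argument frames as sets of realized roles.
# FRAME_TYPES and licenses() are hypothetical names, not Haugereid's code.

# Each type is spelled out as the list of frames (a disjunction) it allows.
FRAME_TYPES = {
    "arg1": [frozenset({1})],                        # unergative intransitives
    "arg12": [frozenset({1, 2})],                    # strictly transitive
    "arg1-12": [frozenset({1}), frozenset({1, 2})],  # optional object: arg1 or arg12
}


def licenses(frame_type, realized_roles):
    """Return True if the roles realized in a clause match one disjunct."""
    return frozenset(realized_roles) in FRAME_TYPES[frame_type]


# essen 'to eat' is assumed to have the type arg1-12.
print(licenses("arg1-12", {1}))        # subject only
print(licenses("arg1-12", {1, 2}))     # subject and object
print(licenses("arg1-12", {1, 2, 3}))  # an additional arg3
```

Listing the disjuncts explicitly, as done here, is exactly what the type hierarchy encodes implicitly.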
In addition to the information in the hierarchy above, one also requires information
about the syntactic properties of the arguments (case, the form of prepositions, verb
forms in verbal complements). Since this information is in part specific to each verb (see
Section 21.1), it cannot be present in the dominance schemata and must instead be listed
in each individual lexical entry. For the lexical entry for warten auf ‘wait for’, there must
be information about the fact that the subject has to be an NP and that the prepositional
object is an auf -PP with accusative. The use of a type hierarchy then allows one to
elegantly encode the fact that the prepositional object is optional. The difference from a
disjunctively specified COMPS list with the form of (69) is just a matter of formalization.
(69) COMPS ⟨ NP[str] ⟩ ∨ ⟨ NP[str], PP[auf , acc] ⟩
Since Haugereid’s structures are binary-branching, it is possible to derive all permu-
tations of arguments (70a–b), and adjuncts can be attached to every branching node
(70c–d).
(70) a. dass [arg1 keiner [arg2 Pizza isst]]
that nobody pizza eats
‘that nobody eats pizza’
b. dass [arg2 Pizza [arg1 keiner isst]]
that pizza nobody eats
argument of the main verb. It is a semantic argument of the secondary predicate leer
‘empty’ and has been raised to the object of the resultative construction. Depending on
the exact analysis assumed, the accusative object is either a syntactic argument of the
verb or of the adjective; however, it is never a semantic argument of the verb. In addition
to this problem, the representation in (73b) does not capture the fact that leer ‘empty’
predicates over the object. Haugereid (2007, p.c.) suggests that this is implicit in the
representation and follows from the fact that all arg4s predicate over all arg2s. Unlike
Haugereid’s analysis, analyses using lexical rules that relate a lexical item of a verb to
another verbal item with a resultative meaning allow for a precise specification of the
semantic representation that then captures the semantic relation between the predicates
involved. In addition, the lexical rule-based analysis makes it possible to license lexical
items that do not establish a semantic relation between the accusative object and the
verb (Wechsler 1997; Wechsler & Noh 2001; Müller 2002a: Chapter 5).
Haugereid sketches an analysis of the syntax of the German clause and tackles ac-
tive/passive alternations. However, certain aspects of the grammar are not elaborated
on. In particular, it remains unclear how complex clauses containing AcI verbs such as
sehen ‘to see’ and lassen ‘to let’ should be analyzed. Arguments of embedded and
embedding verbs can be permuted in these constructions. Haugereid (2007, p.c.) assumes
special rules that allow arguments of more deeply embedded verbs to be saturated, for
example, a special rule that combines an arg2 argument of an argument with a verb. In
order to combine das Nilpferd and nicht füttern helfen lässt in sentences such as (74), he
is forced to assume a special grammatical rule that combines an argument of a doubly
embedded verb with another verb:
(74) weil Hans Cecilia John das Nilpferd nicht füttern helfen lässt
because Hans Cecilia John the hippo not feed help let
‘because Hans is not letting Cecilia help John feed the hippo.’
In Müller (2004d: 220), I have argued that embedding under complex-forming predi-
cates is only constrained by performance factors (see also Section 12.6.3). In German,
verbal complexes with more than four verbs are barely acceptable. Evers (1975: 58–59)
has pointed out, however, that the situation in Dutch is different since Dutch verbal
complexes have a different branching: in Dutch, verbal complexes with up to five verbs
are possible. Evers attributes this difference to a greater processing load for German
verbal complexes (see also Gibson 1998: Section 3.7). Haugereid would have to assume
that there are more rules for Dutch than for German. In this way, he would give up
the distinction between competence and performance and incorporate performance re-
strictions directly into the grammar. If he wanted to maintain a distinction between the
two, then Haugereid would be forced to assume an infinite number of schemata or a
schema with functional uncertainty since depth of embedding is only constrained by
performance factors. Existing HPSG approaches to the analysis of verbal complexes do
without functional uncertainty (Hinrichs & Nakazawa 1994a). Since such raising analy-
ses are required for object raising anyway (as discussed above), they should be given
preference.
Summing up, it must be said that Haugereid’s exoskeletal approach does account for
different orderings of arguments, but it neither derives the correct semantic representations
nor does it offer a solution for the problem of idiosyncratic selection of arguments
and the selection of expletives.
Borer goes on to pose various questions for future research, related to constraining the
class of possible idioms. With regard to that research program it should be noted that
a major focus of lexicalist research has been narrowing the class of subcategorization
and extricating derivable properties from idiosyncratic subcategorization. Those are the
functions of HPSG lexical hierarchies, for example.
21.3.8 Summary
In Sections 21.3.2–21.3.5 we showed that the question of which arguments must be real-
ized in a sentence cannot be reduced to semantics and world knowledge or to general
21.4 Relations between constructions
facts about subjects. The consequence is that valence information has to be connected
to lexical items. One therefore must either assume a connection between a lexical item
and a certain phrasal configuration as in Croft’s approach (2003) and in LTAG or assume
our lexical variant. In a Minimalist setting the right set of features must be specified lexi-
cally to ensure the presence of the right case assigning functional heads. This is basically
similar to the lexical valence structures we are proposing here, except that it needlessly
introduces various problems discussed above, such as the problem of coordination raised
in Section 21.2.1.
phrasal configurations and thereby explain the regular relation between active and pas-
sive. The only proposals to date involve the use of inheritance hierarchies, so let us
examine them.
Researchers working in various frameworks, both with lexical and phrasal orientation,
have tried to develop inheritance-based analyses that could capture the relation between
valence patterns such as those in (76) and (77) (see for instance Kay & Fillmore 1999: 12;
Michaelis & Ruppenhofer 2001: Chapter 4; Candito 1996; Clément & Kinyon 2003: 188;
Kallmeyer & Osswald 2012: 171–172; Koenig 1999: Chapter 3; Davis & Koenig 2000; Kor-
doni 2001 for proposals in CxG, TAG, and HPSG). The idea is that a single representation
(lexical or phrasal, depending on the theory) can inherit properties from multiple con-
structions. In a phrasal approach the description of the pattern in (76b) inherits from
the transitive and the active construction and the description of (77b) inherits from both
the transitive and the passive constructions. Figure 21.5 illustrates the inheritance-based
lexical approach: a lexical entry for a verb such as read or eat is combined with either
an active or passive representation. The respective representations for the active and
passive are responsible for the expression of the arguments. As was already discussed
in Section 10.2, inheritance-based analyses cannot account for multiple changes in va-
lence as for instance the combination of passive and impersonal construction that can
be observed in languages like Lithuanian (Timberlake 1982: Section 5), Irish (Noonan
1994), and Turkish (Özkaragöz 1986). Özkaragöz’s Turkish examples are repeated here
with the original glossing as (78) for convenience:
(78) a. Bu şato-da boğ-ul-un-ur. (Turkish)
this château-LOC strangle-PASS-PASS-AOR
‘One is strangled (by one) in this château.’
b. Bu oda-da döv-ül-ün-ür.
this room-LOC hit-PASS-PASS-AOR
‘One is beaten (by one) in this room.’
c. Harp-te vur-ul-un-ur.
war-LOC shoot-PASS-PASS-AOR
‘One is shot (by one) in war.’
Another example from Section 10.2 that cannot be handled with inheritance is multiple
causativization in Turkish. Turkish allows double and even triple causativization (Lewis
1967: 146):
An inheritance-based analysis would not work, since inheriting the same information
several times does not add anything new. Krieger & Nerbonne (1993) make the same
point with respect to derivational morphology in cases like preprepreversion: inheriting
information about the prefix pre- twice or more often does not add anything.
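The point can be illustrated with a deliberately simplified sketch. It assumes that constraints are modeled as flat dictionaries of feature-value pairs and that inheriting from a type amounts to unifying its constraints with those already present. The function `unify` and all feature names are hypothetical.

```python
# Illustrative sketch: inheritance as unification of flat feature dictionaries.
# unify() and the feature names are assumptions made for this example.

def unify(a, b):
    """Merge two constraint sets; return None if they assign conflicting values."""
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # incompatible information
        result[feature] = value
    return result


causative = {"morph": "caus", "adds_causer": True}
verb = {"stem": "verb-stem", "arity": 1}

once = unify(verb, causative)
twice = unify(once, causative)  # "inheriting" the causative a second time

# The second unification contributes nothing new.
print(once == twice)
```

Because unification is idempotent, a second or third causative cannot be distinguished from a single one in such a setup.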
So assuming phrasal models, the only way to capture the generalization with regard
to (76) and (77) seems to be to assume GPSG-like metarules that relate the constructions
in (76) to the ones in (77). If the constructions are lexically linked as in LTAG, the respec-
tive mapping rules would be lexical rules. For approaches that combine LTAG with the
Goldbergian plugging idea such as the one by Kallmeyer & Osswald (2012) one would
have to have extended families of trees that reflect the possibility of having additional
arguments and would have to make sure that the right morphological form is inserted
into the respective trees. The morphological rules would be independent of the syntactic
structures in which the derived verbal lexemes could be used. One would have to assume
two independent types of rules: GPSG-like metarules that operate on trees and morphological
rules that operate on stems and words. We believe that this is an unnecessary
complication. Apart from being complicated, the morphological rules would not be
acceptable as form-meaning pairs in the CxG sense, since one aspect of the form, namely
that additional arguments are required, is not captured in these morphological rules. If
such morphological rules were accepted as proper constructions then there would not
be any reason left to require that the arguments have to be present in a construction in
order for it to be recognizable, and hence, the lexical approach would be accepted.38
Inheritance hierarchies are the main explanatory device in Croft’s Radical Construc-
tion Grammar (Croft 2001). He also assumes phrasal constructions and suggests repre-
senting these in a taxonomic network (an inheritance hierarchy). He assumes that every
idiosyncrasy of a linguistic expression is represented on its own node in this kind of
network. Figure 21.6 shows part of the hierarchy he assumes for sentences. There are
[Figure 21.6: part of Croft’s taxonomic hierarchy for sentences. Recoverable nodes: Clause; Sbj sleep; Sbj run; Sbj kick Obj; Sbj kiss Obj]
38
Compare the discussion of Totschießen ‘shoot dead’ in example (94) below.
sentences with intransitive verbs and sentences with transitive verbs. Sentences with
the form [Sbj kiss Obj] are special instances of the construction [Sbj TrVerb Obj]. The
[Sbj kick Obj] construction also has further sub-constructions, namely the constructions
[Sbj kick the bucket] and [Sbj kick the habit]. Since constructions are always pairs of
form and meaning, this gives rise to a problem: in a normal sentence with kick, there is
a kicking relation between the subject and the object of kick. This is not the case for the
idiomatic use of kick in (80):
(80) He kicked the bucket.
This means that there cannot be a normal inheritance relation between the [Sbj kick
Obj] and the [Sbj kick the bucket] construction. Instead, only parts of the information
may be inherited from the [Sbj kick Obj] construction. The other parts are redefined by
the sub-construction. This kind of inheritance is referred to as default inheritance.
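A minimal sketch may help to see what default inheritance amounts to. It assumes that constructions are represented as flat dictionaries; the function `inherit_with_defaults` and the feature names `form`, `valence` and `meaning` are hypothetical and serve only as an illustration.

```python
# Illustrative sketch of default inheritance with hypothetical feature names.

def inherit_with_defaults(parent, overrides):
    """Copy the parent's properties and let the sub-construction redefine some."""
    child = dict(parent)
    child.update(overrides)  # overriding makes the inheritance non-monotonic
    return child


sbj_kick_obj = {
    "form": "Sbj kick Obj",
    "valence": ("Sbj", "Obj"),
    "meaning": "kick(subject, object)",
}

sbj_kick_the_bucket = inherit_with_defaults(
    sbj_kick_obj,
    {"form": "Sbj kick the bucket", "meaning": "die(subject)"},
)

print(sbj_kick_the_bucket["valence"])  # inherited unchanged
print(sbj_kick_the_bucket["meaning"])  # redefined by the sub-construction
```

The valence information is inherited, whereas the kicking relation is replaced by the idiomatic meaning.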
kick the bucket is a rather fixed expression, that is, it is not possible to passivize it
or front parts of it without losing the idiomatic reading (Nunberg, Sag & Wasow 1994:
508). However, this is not true for all idioms. As Nunberg, Sag & Wasow (1994: 510) have
shown, there are idioms that can be passivized (81a) as well as realizations of idioms
where parts of idioms occur outside of the clause (81b).
(81) a. The beans were spilled by Pat.
b. The strings [that Pat pulled] got Chris the job.
The problem is now that one would have to assume two nodes in the inheritance hier-
archy for idioms that can undergo passivization since the realization of the constituents
is different in active and passive variants but the meaning is nevertheless idiosyncratic.
The relation between the active and passive form would not be captured. Kay (2002) has
proposed an algorithm for computing objects (Construction-like objects = CLOs) from
hierarchies that then license active and passive variants. As I have shown in Müller
(2006: Section 3), this algorithm does not deliver the desired results and it is far from
straightforward to improve it to the point that it actually works. Even if one were to
adopt the changes I proposed, there are still phenomena that cannot be described using
inheritance hierarchies (see Section 10.2 in this book).
A further interesting point is that the verbs have to be explicitly listed in the constructions.
This raises the question of how constructions should be represented where the
verbs are used differently. If a new node in the taxonomic network is assumed for cases
like (82), then Goldberg’s criticism of lexical analyses that assume several lexical entries
for a verb that can appear in various constructions39 will be applicable here: one would
have to assume constructions for every verb and every possible usage of that verb.
(82) He kicked the bucket into the corner.
39
Note the terminology: I used the word lexical entry rather than lexical item. The HPSG analysis uses lexical
rules that correspond to Goldberg’s templates. What Goldberg criticizes is lexical rules that relate lexical
entries, not lexical rules that licence new lexical items, which may be stored or not. HPSG takes the latter
approach to lexical rules. See Section 9.2.
For sentences with negation, Croft assumes the hierarchy with multiple inheritance
given in Figure 21.7. The problem with this kind of representation is that it remains
unclear as to how the semantic embedding of the verb meaning under negation can be
represented. If all constructions are pairs of form and meaning, then there would have
to be a semantic representation for [Sbj IntrVerb] (CONT value or SEM value). Similarly,
there would have to be a meaning for [Sbj Aux-n’t Verb]. The problem now arises that
the meaning of [Sbj IntrVerb] has to be embedded under the meaning of the negation and
this cannot be achieved directly using inheritance since X and not(X) are incompatible.
There is a technical solution to this problem using auxiliary features. Since there are a
number of interactions in grammars of natural languages, this kind of analysis is highly
implausible if one claims that features are a direct reflection of observable properties of
linguistic objects. For a more detailed discussion of approaches with classifications of
phrasal patterns, see Müller (2010b) as well as Müller (2007a: Section 18.3.2.2) and for the
use of auxiliary features in inheritance-based analyses of the lexicon, see Müller (2007a:
Section 7.5.2.2).
Figure 21.8 shows Ziem & Lasch’s hierarchy for German sentences with the verbs
lachen ‘laugh’, weinen ‘cry’, drücken ‘push’, mögen ‘like’ that is similar to Croft’s hierar-
chy in spirit (Ziem & Lasch 2013: 97). Things that I mentioned with respect to Croft’s
[Figure 21.8: Inheritance hierarchy for clauses by Ziem & Lasch (2013: 97). Recoverable nodes: Satz; [[NPnom] [weinen]]; [[NPnom] [lachen]]; [[NPnom] drücken [NPacc]]; [[NPnom] mag [NPacc]]]
hierarchy also apply to this hierarchy for German and in fact they demonstrate the prob-
lems even more clearly. The idiomatic usages of den Preis drücken ‘to beat down the price’
and die Schulbank drücken ‘to go to school’ are not as fixed as the hierarchy seems to
suggest. For example, den Preis drücken may appear with an indefinite article and there
may be NP-internal modification:
(83) a. Einen solch guten Preis kann man nur schwer weiter drücken.
a such good price can one only difficult further press
‘It is hardly possible to beat down such a good price even further.’
b. So kann man den schon recht guten Preis weiter drücken.
this.way can one the yet right good price further press
‘This way the rather good price can be reduced even further.’
Note also that it would be wrong to claim that all instances of den Preis drücken
involve the realization of an NP with nominative case.
(84) Einen solchen Preis zu drücken ist nicht leicht.
a such price to press is not easy
‘It is not easy to beat down such a price.’
Since Construction Grammar does not use empty elements, (84) cannot be explained
without the assumption of a separate phrasal construction, one without an NP[nom].
So what has to be said about the special usage of drücken ‘to press’ is that drücken has
to cooccur with an NP (definite or indefinite) containing Preis ‘price’. If one insists on
a phrasal representation like the one in Figure 21.8, one has to explain how the clauses
represented in this figure are related to other clauses in which the NP contains an ad-
jective. One would be forced to assume relations between complex linguistic objects
(basically something equivalent to transformations with the power that was assumed
in the 1950s, see p. 86) or one would have to assume that den Preis ‘the price’ is not a
more specific description of the NP[acc] but rather corresponds to some underspecified
representation that allows the integration of adjectives between determiner and noun
(see the discussion of the phrasal lexical item in (40) on p. 334). A third way to capture
the relation between den Preis drücken ‘reduce the price’ and den guten Preis drücken
‘reduce the good price’ would be to assume a TAG-style grammar that can take a tree,
break it up and insert an adjective in the middle. I have never seen a discussion of these issues
anywhere in the literature. Hierarchies like the ones in Figure 21.6 and Figure 21.8 seem
to classify some attested examples but they do not say anything about general grammar.
Note for instance, that (83a) differs from the representation in Figure 21.8 by having an
additional modal verb and by having the NP containing Preis fronted. Such frontings can
be nonlocal. How is this accounted for? If the assumption is that the elements in Fig-
ure 21.8 are not ordered, what would be left of the original constructional motivation? If
the assumption is that elements of the phrasal constructions may be discontinuous, what
are the restrictions on this (see Section 10.6.4.7 and Section 11.7.1)? The alternative to
classifying (some of the) possible phrases is the hierarchy of lexical types given in Fig-
ure 21.9 on the next page. The respective lexical items have valence specifications that
allow them to be used in certain configurations: arguments can be scrambled, can be
extracted, passivization lexical rules may apply and so on.
tion. Culicover & Jackendoff (2005: 204) explicitly avoid names like Subject and Object
since this is crucial for their analysis of the passive to work. They assume that the first
GF following a bracket is the subject of the clause the bracket corresponds to (p. 195–
196) and hence has to be mapped to an appropriate tree position in English. Note that
this view of grammatical functions and obliqueness does not account for subjectless sen-
tences that are possible in some languages, for instance in German.40
Regarding the passive, the authors write:
we wish to formulate the passive not as an operation that deletes or alters part
of the argument structure, but rather as a piece of structure in its own right that
can be unified with the other independent pieces of the sentence. The result of
the unification is an alternative licensing relation between syntax and semantics.
(Culicover & Jackendoff 2005: 203)
Although Culicover and Jackendoff emphasize the similarity between their approach
and Relational Grammar (Perlmutter 1983), there is an important difference: in Relational
Grammar additional levels (strata) can be stipulated if additional remappings are needed.
In Culicover and Jackendoff’s proposal there is no additional level. This causes problems
for the analysis of languages which allow for multiple argument alternations. Examples
from Turkish were provided in (78). Approaches that assume that the personal passive
is the unification of a general structure with a passive-specific structure will not be able
to capture this, since they commit to a certain structure too early. The problem for
approaches that state syntactic structure for the passive is that such a structure, once
stated, cannot be modified. Culicover and Jackendoff’s proposal works in this respect
40
Of course one could assume empty expletive subjects, as was suggested by Grewendorf (1993: 1311), but
empty elements and especially those without meaning are generally avoided in the constructionist litera-
ture. See Müller (2010a: Section 3.4, Section 11.1.1.3) for further discussion.
since there are no strong constraints on the right-hand side of their constraint in (85).
But there is a different problem: when passivization is applied the second time, it has to
apply to the innermost bracket, that is, the result of applying (85) should be:
(86) [GF𝑖 > [GF𝑗 …]]𝑘 ⇔ [… V𝑘 + pass … (by NP𝑖) … (by NP𝑗) …]𝑘
This cannot be done with unification: unification only checks for compatibility, and since
the first application of the passive was possible, a second application would be possible as
well. Dots in representations are always dangerous, and in the example at hand one
would have to make sure that NP𝑖 and NP𝑗 are distinct, since the statement in (85) just
says there has to be a by-PP somewhere. What is needed instead of unification would
be something that takes a GF representation and searches for the outermost bracket and
then places a bracket to the left of the next GF. But this is basically a rule that maps one
representation onto another one, just like lexical rules do.
If Culicover and Jackendoff want to stick to a mapping analysis, the only option to
analyze the data seems to be to assume an additional level for impersonal passives from
which the mapping to phrase structure is done. In the case of Turkish sentences like (87),
which is a personal passive, the mapping to this level would be the identity function.
(87) Arkadaş-ım bu oda-da döv-ül-dü.
friend-my this room-LOC hit-PASS-AOR
‘My friend is beaten (by one) in this room.’
In the case of passivization + impersonal construction, the correct mappings would be
implemented by two mappings between the three levels that finally result in a mapping
like the one seen in (78b), repeated here as (88) for convenience.
(88) Bu oda-da döv-ül-ün-ür.
this room-LOC hit-PASS-PASS-AOR
‘One is beaten (by one) in this room.’
Note that passivization + impersonal construction is also problematic for purely
inheritance-based approaches. All that these approaches can do is
stipulate four different relations between argument structure and phrase structure:
active, passive, impersonal construction, passive + impersonal construction. But this
misses the fact that (88) is an impersonal variant of the passive in (87).
In contrast, the lexical rule-based approach suggested by Müller (2003b) does not have
any problems with such multiple alternations: the application of the passivization lexical
rule suppresses the least oblique argument and provides a lexical item with the argument
structure of a personal passive. Then the impersonal lexical rule applies and suppresses
the now least oblique argument (the object of the active clause). The result is an impersonal
construction without any arguments, like the one in (88).
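The successive application of lexical rules can be sketched as function composition. The following fragment is an illustration only and not the HPSG formalization: lexical items are assumed to be dictionaries with an obliqueness-ordered argument list, and the role labels and the function names `passive` and `impersonal` are hypothetical.

```python
# Illustrative sketch: lexical rules as functions from lexical items to lexical items.
# The dictionary representation and the role labels are assumptions of this example.

def suppress_least_oblique(item, marker):
    """Return a new item in which the least oblique argument is suppressed."""
    if not item["args"]:
        raise ValueError("no argument left to suppress")
    return {
        "stem": item["stem"],
        "morph": item["morph"] + [marker],
        "args": item["args"][1:],  # drop the first (least oblique) argument
    }


def passive(item):
    return suppress_least_oblique(item, "PASS")


def impersonal(item):
    # In the Turkish examples the impersonal is glossed as a second PASS.
    return suppress_least_oblique(item, "PASS")


doev = {"stem": "döv", "morph": [], "args": ["NP[agent]", "NP[patient]"]}

personal_passive = passive(doev)                   # compare (87)
impersonal_passive = impersonal(personal_passive)  # compare (88)

print(personal_passive["args"])
print(impersonal_passive["morph"], impersonal_passive["args"])
```

Since each rule takes the output of the previous one as its input, the second application operates on the already passivized item, which is what unification of static structures cannot express.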
believe that the essential problem with them is that they fail to capture the derivational
character of the relationship between certain word forms. Alternations signaled by pas-
sive voice and causative morphology are relatively simple and regular when formulated
as operations on lexical valence structures that have been abstracted from their phrasal
context. But non-transformational rules or systems formulated on the phrasal structures
encounter serious problems that have not yet been solved.
21.6 Arguments from language acquisition
(2002a,c; 2003c; 2007c).41 A German example is given in (91); several pages of attested
examples can be found in the cited references and some more complex examples will
also be discussed in Section 21.7.3 on page 660.
(91) Los damit geht es schon am 15. April.42
PART there.with goes it already at.the 15 April
‘It already starts on April the 15th.’
Particle verbs are mini-idioms. So the conclusion is that idiomatic expressions that al-
low for a certain flexibility in order should not be represented as phrasal configurations
describing adjacent elements. For some idioms, a lexical analysis along the lines of Sag
(2007) seems to be required.43 The issue of particle verbs will be taken up in Section 21.7.3
again, where we discuss evidence for/against phrasal analyses from neuroscience.
phrasal patterns have to be broken up in coordination structures. This was already mentioned
in Section 16.3, but I think it is illuminating to have a look at concrete proposals.
In Categorial Grammar, there is a very elegant treatment of coordination (see Steed-
man 1991). A generalization with regard to so-called symmetric coordination is that two
objects with the same syntactic properties are combined to form an object with those properties.
We have already encountered the relevant data in the discussion of the motivation
for feature geometry in HPSG on page 275. Their English versions are repeated below
as (95):
(95) a. the man and the woman
b. He knows and loves this record.
c. He is dumb and arrogant.
Steedman (1991) analyzes examples such as those in (95) with a single rule:
(96) X conj X ⇒ X
This rule combines two categories of the same kind with a conjunction in between to
form a category that has the same category as the conjuncts.48 Figure 21.12 shows the
analysis of (95a) and Figure 21.13 gives an analysis of the corresponding English example
of (95b).
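The schema in (96) can be sketched in a few lines. The fragment below is a simplification for illustration: categories are plain strings, the function name `coordinate` is hypothetical, and the category (S\NP)/NP for transitive verbs is an assumption of this example.

```python
# Illustrative sketch of the schema X conj X => X; categories are plain strings.

def coordinate(left, conj, right):
    """Apply X conj X => X; return None if the conjuncts differ in category."""
    if conj == "conj" and left == right:
        return left
    return None


TV = "(S\\NP)/NP"  # assumed category of transitive verbs such as knows and loves

print(coordinate("NP", "conj", "NP"))  # the man and the woman
print(coordinate(TV, "conj", TV))      # knows and loves
print(coordinate("NP", "conj", TV))    # conjuncts of different categories
```

One rule covers all categories because X is a variable over categories rather than a fixed symbol.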
48
Alternatively, one could analyze all three examples using a single lexical entry for the conjunction and: and
is a functor that takes a word or phrase of any category to its right and after this combination then needs
to be combined with an element of the same category to its left in order to form the relevant category after
combining with this second element. This means that the category for and would have the form (X\X)/X.
This analysis does not require any coordination rules. If one wants to assume, as is common in GB/MP,
that every structure has a head, then a headless analysis that assumes a special rule for coordination like
the one in (96) would be ruled out.
If we compare this analysis to the one that would have to be assumed in traditional
phrase structure grammars, it becomes apparent what the advantages are: one rule was
required for the analysis of NP coordination where two NPs are coordinated to form
an NP and another was required for the analysis of V coordination. This is not only
undesirable from a technical point of view; it also fails to capture the basic property of
symmetric coordination: two symbols with the same syntactic category are combined
with each other.
It is interesting to note that it is possible to analyze phrases such as (97) in this way:
(97) give George a book and Martha a record
In Section 1.3.2.4, we have seen that sentences of this kind are problematic for constituent
tests. However, in Categorial Grammar, it is possible to analyze them without any problems
if one adopts rules for type raising and composition as Dowty (1988) and Steedman
(1991) do. In Section 8.5, we have already seen forward type raising as well as forward
and backward composition. In order to analyze (97), one requires backward type
raising, repeated in (98), and backward composition, repeated in (99):
(98) Backward type raising (< T)
X ⇒ T\(T/X)
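How these rules interact in an analysis of (97) can be sketched as follows. The sketch assumes the standard formulation of backward composition (Y\Z X\Y ⇒ X\Z, with the result written first), the categories VP/NP and (VP/NP)/NP for transitive and ditransitive verbs, and a tuple encoding of categories; all function names are hypothetical.

```python
from typing import Optional, Tuple, Union

# A category is an atom such as "NP" or a triple (slash, result, argument).
# "/" looks for its argument to the right, "\\" to the left; the result is
# written first, as in T\(T/X).
Cat = Union[str, Tuple[str, "Cat", "Cat"]]


def fwd(result: Cat, arg: Cat) -> Cat:
    return ("/", result, arg)


def bwd(result: Cat, arg: Cat) -> Cat:
    return ("\\", result, arg)


def backward_type_raise(x: Cat, t: Cat) -> Cat:
    """Rule (98): X => T\\(T/X)."""
    return bwd(t, fwd(t, x))


def backward_compose(left: Cat, right: Cat) -> Optional[Cat]:
    """Assumed rule (standard backward composition): Y\\Z  X\\Y => X\\Z."""
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[0] == "\\" and right[0] == "\\" and right[2] == left[1]):
        return bwd(right[1], left[2])
    return None


def backward_apply(arg: Cat, functor: Cat) -> Optional[Cat]:
    """Backward application: Y  X\\Y => X."""
    if isinstance(functor, tuple) and functor[0] == "\\" and functor[2] == arg:
        return functor[1]
    return None


def coordinate(left: Optional[Cat], right: Optional[Cat]) -> Optional[Cat]:
    """Rule (96): X conj X => X."""
    return left if left is not None and left == right else None


TV = fwd("VP", "NP")   # VP/NP
DTV = fwd(TV, "NP")    # (VP/NP)/NP, the category assumed here for give


def object_cluster() -> Optional[Cat]:
    indirect = backward_type_raise("NP", TV)    # (VP/NP)\((VP/NP)/NP)
    direct = backward_type_raise("NP", "VP")    # VP\(VP/NP)
    return backward_compose(indirect, direct)   # VP\((VP/NP)/NP)


george_a_book = object_cluster()
martha_a_record = object_cluster()
conjoined = coordinate(george_a_book, martha_a_record)
result = backward_apply(DTV, conjoined)
print(result)
```

Under these assumptions, each pair of objects forms a constituent that is still looking for a ditransitive verb to its left, so the two pairs can be coordinated by (96) before they combine with give.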
This kind of type-raising analysis has often been criticized because raising categories leads
to many different analytical possibilities for simple sentences. For example, one could
first combine a type-raised subject with the verb and then combine the resulting constituent
with the object. This would mean that we would have a [[S V] O] in addition
to the standard [S [V O]] analysis. Steedman (1991) argues that the two analyses differ in
terms of information structure and it is therefore valid to assume different structures for
the sentences in question.
I will not go into these points further here. However, I would like to compare Steedman’s
lexical approach to phrasal analyses: all approaches that assume that the ditransitive
construction represents a continuous pattern encounter a serious problem with the
examples discussed above. This can best be understood by considering the TAG analysis
of coordination proposed by Sarkar & Joshi (1996). If one assumes that [Sbj TransVerb
Obj] or [S [V O]] constitutes a fixed unit, then the trees in Figure 21.15 form the starting
point for the analysis of coordination.
[Figure 21.15: elementary trees for knows and loves: [S NP↓ [VP [V knows] NP↓]] and [S NP↓ [VP [V loves] NP↓]]]
If one wants to use these trees/constructions for the analysis of (100), there are in
principle two possibilities: one assumes that two complete sentences are coordinated or
alternatively, one assumes that some nodes are shared in a coordinated structure.
(100) He knows and loves this record.
Abeillé (2006) has shown that it is not possible to capture all the data if one assumes
that cases of coordination such as those in (100) always involve the coordination of two
complete clauses. It is also necessary to allow for lexical coordination of the kind we
saw in Steedman’s analysis (see also Section 4.6.3). Sarkar & Joshi (1996) develop a TAG
analysis in which nodes are shared in coordinate structures. The analysis of (100) can be
seen in Figure 21.16. The subject and object nodes are only present once in this figure.
[Figure 21.16: analysis of (100) with shared subject and object NP nodes, following Sarkar & Joshi (1996)]
The S nodes of the two elementary trees both dominate the NP he. In the same way, the
object NP node belongs to both VPs. The conjunction connects the two verbs, as indicated by
the thick lines. Sarkar and Joshi provide an algorithm that determines which nodes are
to be shared. The structure may look strange at first, but for TAG purposes, it is not the
derived tree but rather the derivation tree that is important, since this is the one that is
used to compute the semantic interpretation. The authors show that the derivation trees
for the example under discussion and even more complex examples can be constructed
correctly.
In theories such as HPSG and LFG where structure building is, as in Categorial Gram-
mar, driven by valence, the above sentence is unproblematic: both verbs are conjoined
and then the combination behaves like a simple verb. The analysis of this is given in Fig-
ure 21.17. This analysis is similar to the Categorial Grammar analysis in Figure 21.13.49
Figure 21.17: Selection-based analysis of He knows and loves this record. in tree notation
With Goldberg’s plugging analysis one could also adopt this approach to coordination:
here, knows and loves would first be plugged into a coordination construction and the
result would then be plugged into the transitive construction. Exactly how the semantics
of knows and loves is combined with that of the transitive construction is unclear
since the meaning of this phrase is something like and′(know′(x, y), love′(x, y)), that is, a
complex event with at least two open argument slots x and y (and possibly additionally
an event and a world variable depending on the semantic theory that is used). Goldberg
would probably have to adopt an analysis such as the one in Figure 21.16 in order to
maintain the plugging analysis.
Croft would definitely have to adopt the TAG analysis since the verb is already present
in his constructions. For the example in (97), both Goldberg and Croft would have to
draw from the TAG analysis in Figure 21.18 on the next page. The consequence of this
is that one requires discontinuous constituents. Since coordination allows a consider-
able number of variants, there can be gaps between all arguments of constructions. An
example with a ditransitive verb is given in (101):
(101) He gave George and sent Martha a record.
See Crysmann (2008) and Beavers & Sag (2004) for HPSG analyses that assume discon-
tinuous constituents for particular coordination structures.
49
A parallel analysis in Dependency Grammar is possible as well. Tesnière’s original analysis was different
though. See Section 11.6.2.1 for discussion.
21.7 Arguments from psycho- and neurolinguistics
Figure 21.18: TAG analysis of He gave George a book and Martha a record.
The result of these considerations is that the argument that particular elements oc-
cur next to each other and that this occurrence is associated with a particular meaning
is considerably weakened. What competent speakers do acquire is the knowledge that
heads must occur with their arguments somewhere in the utterance and that all the
requirements of the heads involved have to somehow be satisfied (θ-Criterion, coherence/completeness,
empty SPR and COMPS list). The heads themselves need not necessarily
occur directly adjacent to their arguments. See the discussion in Section 16.3 about
pattern-based models of language acquisition.
The computation of the semantic contribution of complex structures such as those in
Figure 21.18 is by no means trivial. In TAG, there is the derivation tree in addition to
the derived tree that can then be used to compute the semantic contribution of a linguis-
tic object. Construction Grammar does not have this separate level of representation.
The question of how the meaning of the sentences discussed here is derived from their
component parts still remains open for phrasal approaches.
Concluding the section on language acquisition, we assume that a valence representa-
tion is the result of language acquisition, since this is necessary for establishing the de-
pendency relations in various possible configurations in an utterance. See also Behrens
(2009: 439) for a similar conclusion.
guity like those in (102) and sentences with two verbs with the same core meaning have
different processing times.
(102) a. Bill set the alarm clock onto the shelf.
b. Bill set the alarm clock for six.
Errors due to lexical ambiguity cause a bigger increase in processing time than errors in
the use of the same verb. Experiments showed that there was a bigger difference in pro-
cessing times for the sentences in (102) than for the sentences in (103). The difference in
processing times between (103a) and (103b) would be explained by different preferences
for phrasal constructions. In a lexicon-based approach one could explain the difference
by assuming that one lexical item is more basic, that is, stored in the mental dictionary
and the other is derived from the stored one. The application of lexical rules would be
time consuming, but since the lexical items are related, the overall time consumption is
smaller than the time needed to process two unrelated items (Müller 2002a: 405).
Alternatively one could assume that the lexical items for both valence patterns are
the result of lexical rule applications. As with the phrasal constructions, the lexical rules
would have different preferences. This shows that the lexical approach can explain the
experimental results as well, so that they do not force us to prefer phrasal approaches.
Goldberg (1995: 18) claims that lexical approaches have to assume two variants of
load with different meaning and that this would predict that load alternations would
behave like two verbs that really have absolutely different meanings. The experiments
discussed above show that such predictions are wrong and hence lexical analyses would
be falsified. However, as was shown in Müller (2010a: Section 11.11.8.2), the argumenta-
tion contains two flaws: let’s assume that the construction meaning of the construction
that licenses (103a) is C1 and the construction meaning of the construction that licenses
(103b) is C2. Under such assumptions, the semantic contribution of the two lexical items
in the lexical analysis would be (104). load(…) is the contribution of the verb that would
be assumed in phrasal analyses.
(104) a. load (onto): C1 ∧ load(…)
b. load (with): C2 ∧ load(…)
(104) shows that the lexical items partly share their semantic contribution. We hence
predict that the processing of the dispreferred argument realization of load is simpler
than the dispreferred meaning of set: in the latter case a completely new verb has to be
activated while in the first case parts of the meaning are activated already.50
Goldberg (1995: 107) argues against lexical rule-based approaches for locative alterna-
tions like (105), since according to her such approaches have to assume that one of the
verb forms has to be the more basic form.
50
See also Croft (2003: 64–65) for a brief rejection of Goldberg’s interpretation of the experiment that corre-
sponds to what is said here.
lexical rule would have to be used rather than the bivalent one. Building syntactic struc-
ture and lexicon access in general place different demands on our processing capacities.
However, when (106c) is parsed, the lexical items for drink are already active; we only
have to use a different one. It is currently unclear to us whether psycholinguistic experiments
can differentiate between the two approaches, but it seems unlikely.
items) cause brain responses that differ in polarity from brain responses on incorrect
strings of words, that is, syntactic combinations. This suggests that there is indeed an
empirical basis for deciding the issue.
Concerning the standard example of the Caused-Motion Construction in (109) the
authors write the following:
(109) She sneezed the foam off the cappuccino.52
this constellation of brain activities may initially lead to the co-activation of the
verb sneeze with the DCNAs for blow and thus to the sentence mentioned. Ulti-
mately, such co-activation of a one-place verb and DCNAs associated with other
verbs may result in the former one-place verb being subsumed into a three-place
verb category and DCNA set, a process which arguably has been accomplished
for the verb laugh as used in the sequence laugh NP off the stage. (Pulvermüller,
Cappelle & Shtyrov 2013)
(See Section 21.5.1 for discussion). However, in general, it remains an open question what
it means to be a discontinuous lexical item. The idea of discontinuous words is quite old
(Wells 1947), but there have not been many formal accounts of this idea. Nunberg, Sag &
Wasow (1994) suggest a representation in a linearization-based framework of the kind
that was proposed by Reape (1994), and Kathol (1995: 244–248) and Crysmann (2002)
worked out such analyses in detail. Kathol’s lexical item for aufwachen ‘to wake up’ is
given in (110):
(110) aufwachen (following Kathol 1995: 246):
      [ …|HEAD   [1] verb
        …|VCOMP  ⟨⟩
        DOM      ⟨ [ ⟨ wachen ⟩
                     …|HEAD   [1]
                     …|VCOMP  ⟨ [2] ⟩ ] ⟩
                 ⃝
                 ⟨ [ vc
                     ⟨ auf ⟩
                     SYNSEM [2] [ …|HEAD [ sepref
                                           FLIP − ] ] ] ⟩ ]
The lexical representation contains the list-valued feature DOM that contains a description
of the main verb and the particle (see Section 11.7.2.2 for details). The DOM list
contains the dependents of a head. The dependents can be serialized in any order
provided no linearization rule is violated (Reape 1994). The dependency between the
particle and the main verb is characterized by the value of the VCOMP feature, which
is a valence feature for the selection of arguments that form a complex predicate with
their head. The shuffle operator ⃝ combines two lists without specifying an order
between the elements of the two lists, that is, both ⟨ wachen, auf ⟩ and ⟨ auf, wachen ⟩ are
possible. The marking vc is an assignment to a topological field in the clause.
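The effect of the shuffle operator can be made concrete with a minimal sketch. It assumes the usual reading of shuffle in linearization-based work, namely that the order inside each of the two lists is kept while elements from different lists may be interleaved freely; the function name shuffle and the use of plain strings for domain objects are simplifications for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def shuffle(xs: Sequence[str], ys: Sequence[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every interleaving of xs and ys that keeps the internal order
    of each list but leaves the order between the two lists open."""
    if not xs:
        yield tuple(ys)
        return
    if not ys:
        yield tuple(xs)
        return
    # Either the first element of xs or the first element of ys comes first.
    for rest in shuffle(xs[1:], ys):
        yield (xs[0],) + rest
    for rest in shuffle(xs, ys[1:]):
        yield (ys[0],) + rest


orders: List[Tuple[str, ...]] = list(shuffle(["wachen"], ["auf"]))
print(orders)
```

For the two singleton lists in (110), the sketch yields exactly the two serializations mentioned in the text; which of them surfaces in a given clause is then a matter of the linearization rules.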
I criticized such linearization-based proposals since it is unclear how analyses that
claim that the particle is just linearized in the domain of its verb can account for sen-
tences like (111), in which complex syntactic structures are involved (Müller 2007b). Ger-
man is a V2 language and the fronting of a constituent into the position before the finite
verb is usually described as some sort of nonlocal dependency; that is, even authors
who favor linearization-based analyses do not assume that the initial position is filled
by simple reordering of material (Kathol 2000; Müller 1999b; 2002a; Bjerre 2006).53
(111) a. [vf [mf Den Atem] [vc an]] hielt die ganze Judenheit.54
the breath PART held the whole Jewish.community
‘The whole Jewish community held their breath.’
53
Kathol (1995: Section 6.3) working in HPSG suggested such an analysis for simple sentences, but later
changed his view. Wetta (2011) also working in HPSG assumes a purely linearization-based approach. Sim-
ilarly Groß & Osborne (2009) working in Dependency Grammar assume that there is a simple dependency
structure in simple sentences while there are special mechanisms to account for extraction out of embedded
clauses. I argue against such proposals in Müller (2017a) referring to the scope of adjuncts, coordination of
simple with complex sentences and Across the Board Extraction and apparent multiple frontings. See also
Section 11.7.1.
54
Lion Feuchtwanger, Jud Süß, p. 276, quoted from Grubačić (1965: 56).
b. [vf [mf Wieder] [vc an]] treten auch die beiden Sozialdemokraten.55
again PART kick also the two Social.Democrats
‘The two Social Democrats are also running for office again.’
c. [vf [vc Los] [nf damit]] geht es schon am 15. April.56
PART there.with goes it already at.the 15 April
‘It already starts on April the 15th.’
The conclusion that has to be drawn from examples like (111) is that particles interact in
complex ways with the syntax of sentences. This is captured by the lexical treatment that
was suggested in Müller (2002a: Chapter 6) and Müller (2003c): the main verb selects
for the verbal particle. By assuming that wachen selects for auf, the tight connection
between verb and particle is represented.57 Such a lexical analysis provides an easy way
to account for fully nontransparent particle verbs like an-fangen ‘to begin’. However, I
also argued for a lexical treatment of transparent particle verbs like losfahren ‘to start
to drive’ and jemanden/etwas anfahren ‘drive directed towards somebody/something’.
The analysis involves a lexical rule that licenses a verbal item selecting for an adjunct
particle. The particles an and los can modify verbs and contribute arguments (in the
case of an) and the particle semantics. This analysis can be shown to be compatible
with the neuro-mechanical findings: if it is the case that even transparent particle verb
combinations with low frequency are stored, then the rather general lexical rule that I
suggested in the works cited above is the generalization of the relation between a large
number of lexical particle verb items and their respective main verbs. The individual
particle verbs would be special instantiations that have the form of the particle specified,
as is also the case for non-transparent particle verbs like anfangen. If it should turn
out that productive combinations with particle verbs of low frequency cause syntactic
reflexes in the brain, this could be explained as well: the lexical rule licenses an item
that selects for an adverbial element. This selection would then be seen as parallel to the
relation between the determiner and the noun in the NP der Mut ‘the courage’, which
Cappelle et al. (2010: 191) discuss as an example of a syntactic combination. Note that
this analysis is also compatible with another observation made by Shtyrov, Pihko &
Pulvermüller (2005): morphological affixes also cause the lexical reflexes. In my analysis
the stem of the main verb is related to another stem that selects for a particle. This stem
can be combined with (derivational and inflectional) morphological affixes causing the
lexical activation pattern in the brain. After this combination the verb is combined with
the particle and the dependency can be either a lexical or a syntactic one, depending on
the results of the experiments to be carried out. The analysis is compatible with both
results.
55
taz, bremen, 24.05.2004, p. 21.
56
taz, 01.03.2002, p. 8.
57
Cappelle et al. (2010: 197) write: “the results provide neurophysiological evidence that phrasal verbs are
lexical items. Indeed, the increased activation that we found for existing phrasal verbs, as compared to
infelicitous combinations, suggests that a verb and its particle together form one single lexical representa-
tion, i. e. a single lexeme, and that a unified cortical memory circuit exists for it, similar to that encoding a
single word”. I believe that my analysis is compatible with this statement.
58
However, see Booij (2009) for some challenges to lexical integrity.
59
http://www.coffee2watch.at/egala. 2012-03-23
21.8 Arguments from statistical distribution
distributional analyses cannot decide the question whether argument structure construc-
tions are phrasal or lexical.
[Three tree diagrams over unlabeled X nodes]
The first figure corresponds to the Goldbergian view of phrasal constructions where the verb is inserted
into the construction and the meaning is present at the topmost node. In the second figure, there is a
lexical rule that provides the resultative semantics and the corresponding valence information. In the third
analysis, there is an empty head that combines with the verb and has ultimately the same effect as the
lexical rule.
Figure 21.19: Three possible analyses for resultative construction: holistic construction,
lexical rule, empty head
an empty head into one with a lexical rule. For the present example, any argumentation
for a particular analysis will be purely theory-internal.
Although Unsupervised Data-Oriented Parsing (U-DOP) cannot help us to decide be-
tween analyses, there are areas of grammar for which these structures are of interest:
under the assumption of binary-branching structures, there are different branching pos-
sibilities depending on whether one assumes an analysis with verb movement or not.
This means that although one does not see an empty element in the input, there is a re-
flex in statistically-derived trees. The left tree in Figure 21.20 shows a structure that one
would expect from an analysis following Steedman (2000: 159), see Section 8.3. The tree
on the right shows a structure that would be expected from a GB-type verb movement
analysis (see Section 3.2). But at present, there is no clear finding in this regard (Bod, p. c.
2009). There is a great deal of variance in the U-DOP trees. The structure assigned to an
utterance depends on the verb (Bod, referring to the Wall Street Journal). Here, it would
be interesting to see if this changes with a larger data sample. In any case, it would be
interesting to look at how all verbs as well as particular verb classes behave. The U-DOP
procedure applies to trees containing at least one word each. If one makes use of parts
of speech in addition, this results in structures that correspond to the ones we have seen
in the preceding chapters. Subtrees would then not have two Xs as their daughters but
rather NP and V, for example. It is also possible to do statistical work with this kind of
subtree and use the part of speech symbols of words (the preterminal symbols) rather
than the words themselves in the computation. For example, one would get trees for the
symbol V instead of many trees for specific verbs. So instead of having three different
trees for küssen ‘kiss’, kennen ‘know’ and sehen ‘see’, one would have three identical trees
for the part of speech “verb” that corresponds to the trees that are needed for transitive
verbs. The probability of the V tree is therefore higher than the probabilities of the trees
for the individual verbs. Hence one would have a better set of data to compute structures
for utterances such as those in Figure 21.20. I believe that there are further results in this
area to be found in the years to come.
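The point about probabilities can be illustrated with plain relative frequencies. The following sketch is not Bod's estimator; it only assumes that the probability of a subtree is approximated by its share in a bag of subtrees, and the counts are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical toy observations: (verb, part of speech) for the head of a
# transitive subtree. The counts are invented for illustration only.
observations: List[Tuple[str, str]] = [
    ("küssen", "V"), ("kennen", "V"), ("kennen", "V"),
    ("sehen", "V"), ("sehen", "V"), ("sehen", "V"),
]

# Assumption: the six transitive subtrees come from a bag of ten subtrees.
TOTAL_SUBTREES = 10


def relative_frequencies(labels: List[str], total: int) -> Dict[str, float]:
    """Relative frequency of each label with respect to the whole bag."""
    counts = Counter(labels)
    return {label: count / total for label, count in counts.items()}


lexical = relative_frequencies([verb for verb, _ in observations], TOTAL_SUBTREES)
by_pos = relative_frequencies([pos for _, pos in observations], TOTAL_SUBTREES)

print(lexical)
print(by_pos)
```

Because all verb-specific subtrees collapse into one subtree for V, the share of the V subtree is the sum of the shares of the individual verbs and is therefore at least as large as any of them.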
Concluding this subsection, we contend that Bod’s paper is a milestone in the Poverty
of the Stimulus debate, but it does not and cannot show that a particular version of
constructionist theories, namely the phrasal one, is correct.
21.8.2 Collostructions
Stefanowitsch & Gries (2009: Section 5) assume a plugging analysis: “words occur in
(slots provided by) a given construction if their meaning matches that of the construc-
tion”. The authors claim that their collostructional analysis has confirmed [the plugging
analysis] from various perspectives. Stefanowitsch and Gries are able to show that cer-
tain verbs occur more often than not in particular constructions, while other verbs never
occur in the respective constructions. For instance, give, tell, send, offer and show are
attracted by the Ditransitive Construction, while make and do are repelled by this construction,
that is, they occur significantly less often in this construction than would be
expected given the overall frequency of verbs in the corpus. Regarding this distribution,
the authors write:
These results are typical for collexeme analysis in that they show two things. First,
there are indeed significant associations between lexical items and grammatical
structures. Second, these associations provide clear evidence for semantic coher-
ence: the strongly attracted collexemes all involve a notion of ‘transfer’, either liter-
ally or metaphorically, which is the meaning typically posited for the ditransitive.
This kind of result is typical enough to warrant a general claim that collostructional
analysis can in fact be used to identify the meaning of a grammatical construction
in the first place. (Stefanowitsch & Gries 2009: 943)
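What attraction and repulsion amount to can be shown with a small computation. Collostructional strength is commonly computed with the Fisher exact test on a two-by-two table of corpus counts; the sketch below implements a one-sided version with the standard library only. The counts and the names verb_a and verb_b are invented and do not come from Stefanowitsch & Gries.

```python
from math import comb


def expected_count(a: int, b: int, c: int, d: int) -> float:
    """Expected frequency of the verb in the construction under independence."""
    return (a + b) * (a + c) / (a + b + c + d)


def fisher_one_sided(a: int, b: int, c: int, d: int, attraction: bool = True) -> float:
    """One-sided Fisher exact p-value for the 2x2 table

                      in construction   elsewhere
        verb                 a              b
        other verbs          c              d

    attraction=True sums the upper tail (at least a co-occurrences),
    attraction=False the lower tail (at most a co-occurrences).
    """
    verb_total = a + b
    cx_total = a + c
    n = a + b + c + d
    lowest = max(0, cx_total - (n - verb_total))
    highest = min(verb_total, cx_total)
    ks = range(a, highest + 1) if attraction else range(lowest, a + 1)
    tail = sum(comb(verb_total, k) * comb(n - verb_total, cx_total - k) for k in ks)
    return tail / comb(n, cx_total)


# Invented counts for two hypothetical verbs in a corpus of 1000 verb tokens.
verb_a = (40, 60, 60, 840)    # observed 40, expected 10
verb_b = (1, 199, 99, 701)    # observed 1, expected 20

p_attracted = fisher_one_sided(*verb_a, attraction=True)
p_repelled = fisher_one_sided(*verb_b, attraction=False)
print(expected_count(*verb_a), p_attracted)
print(expected_count(*verb_b), p_repelled)
```

Such a measure only registers that a verb and a pattern co-occur more or less often than expected. It is computed from surface counts and is silent on how the pattern is licensed.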
We hope that the preceding discussion has made clear that the distribution of words in
a corpus cannot be seen as evidence for a phrasal analysis. The corpus study shows that
give is usually used with three arguments in a certain pattern that is typical for English
(Subject Verb Object1 Object2) and that this verb forms a cluster with other verbs that
have a transfer component in their meaning. The corpus data do not show whether this
meaning is contributed by a phrasal pattern or by lexical entries that are used in a certain
configuration.
21.9 Conclusion
The essence of the lexical view is that a verb is stored with a valence structure indicat-
ing how it combines semantically and syntactically with its dependents. Crucially, that
structure is abstracted from the actual syntactic context of particular tokens of the verb.
Once abstracted, that valence structure can meet other fates besides licensing the phrasal
structure that it most directly encodes: it can undergo lexical rules that manipulate that
structure in systematic ways; it can be composed with the valence structure of another
predicate; it can be coordinated with similar verbs; and so on. Such an abstraction al-
lows for simple explanations of a wide range of robust, complex linguistic phenomena.
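The idea that an abstracted valence structure can serve as input to further processes can be sketched in a few lines. The representation below is a strong simplification: valence is a tuple of category labels, and the rule passive_rule as well as the function coordinate are hypothetical stand-ins for the much richer lexical rules and coordination schemata of theories such as HPSG.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LexicalItem:
    form: str
    valence: Tuple[str, ...]   # categories of the dependents, subject first


def passive_rule(item: LexicalItem) -> LexicalItem:
    """Hypothetical, simplified lexical rule: suppress the first (subject)
    element of the valence list. Case changes are not modeled."""
    if len(item.valence) < 2:
        raise ValueError("rule needs a subject and at least one object")
    return replace(item, form=item.form + "+pass", valence=item.valence[1:])


def coordinate(left: LexicalItem, right: LexicalItem) -> Optional[LexicalItem]:
    """Conjoin two items with identical valence; the result keeps that valence."""
    if left.valence != right.valence:
        return None
    return LexicalItem(f"{left.form} and {right.form}", left.valence)


knows = LexicalItem("knows", ("NP[nom]", "NP[acc]"))
loves = LexicalItem("loves", ("NP[nom]", "NP[acc]"))
sleeps = LexicalItem("sleeps", ("NP[nom]",))

print(coordinate(knows, loves))
print(coordinate(knows, sleeps))
print(passive_rule(knows))
```

Both operations refer only to the valence tuple and never to a phrasal configuration, which is the sense in which the valence structure is abstracted from particular syntactic contexts.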
We have surveyed the arguments against the lexical valence approach and in favor of
a phrasal representation instead. We find the case for a phrasal representation of argu-
ment structure to be unconvincing: there are no compelling arguments in favor of such
approaches, and they introduce a number of problems:
• They offer no account of the interaction of valence-changing processes and derivational
morphology.
• They offer no account of the interaction of valence-changing processes and coordination
of words.
Assuming a lexical valence structure allows us to solve all the problems that arise with
phrasal approaches.
21.10 Why (phrasal) constructions?
two do not interact. In these cases, there is mostly a choice between analyses with silent
heads and those with phrasal constructions. In this section, I will discuss some of these
cases.
Cases such as (114) can be analyzed with an empty head that corresponds to haltend
‘holding’. For (113), on the other hand, one would require either a syntactic structure with
multiple empty elements, or an empty head that selects both parts of the construction
and contributes the components of meaning that are present in (115). If one adopts the
first approach with multiple silent elements, then one would have to explain why these
elements cannot occur in other constructions. For example, it would be necessary to
assume an empty element corresponding to man ‘one’/‘you’. But such an empty element
could never occur in embedded clauses since subjects cannot simply be omitted there:
(117) * weil dieses Buch gerne liest
because this book gladly reads
Intended: ‘because he/she/it likes to read this book’
If one were to follow the second approach, one would be forced to assume an empty
head with particularly odd semantics.
The directives in (118) and (119) are similarly problematic (see also Jackendoff & Pinker
(2005: 220) for parallel examples in English):
(118) a. Her mit dem Geld / dem gestohlenen Geld!
here with the money the stolen money
‘Hand over the (stolen) money!’
b. Weg mit dem Krempel / dem alten Krempel!
away with the junk the old junk
‘Get rid of this (old) junk!’
c. Nieder mit den Studiengebühren / den sozialfeindlichen Studiengebühren!
down with the tuition.fees the antisocial tuition.fees
‘Down with (antisocial) tuition fees!’
Here, it is also not possible to simply identify an elided verb. It is, of course, possible
to assume an empty head that selects an adverb or a mit-PP, but this would be ad hoc.
Alternatively, it would be possible to assume that adverbs in (118) select the mit-PP. Here,
one would have to disregard the fact that adverbs do not normally take any arguments.
The same is true of Jacobs’s examples in (119). For these, one would have to assume that
in and zur ‘to the’ are the respective heads. Each of the prepositions would then have to
select a noun phrase and a mit-PP. While this is technically possible, it is as unattractive
as the multiple lexical entries that Categorial Grammar has to assume for pied-piping
constructions (see Section 8.6).
An empty passive morpheme absorbs the capability of the verb to assign accusative (see
also Section 3.4 on the analysis of the passive in GB theory). The object therefore has
to be realized as a PP or not at all. It follows from Burzio’s Generalization that as the
accusative object has been suppressed, there cannot be an external argument. G. Müller
assumes, like proponents of Distributed Morphology (e.g., Marantz 1997), that lexical en-
tries are inserted into complete trees post-syntactically. The antipassive morpheme cre-
ates a feature bundle in the relevant tree node that is not compatible with German verbs
such as schmeißen ‘throw’ and this is why only a null verb with the corresponding spec-
ifications can be inserted. Movement of the directional PP is triggered by mechanisms
that cannot be discussed further here. The antipassive morpheme forces an obligatory
reordering of the verb into initial position (to C, see Section 3.2 and Section 4.2). By stipulation,
filling the prefield is only possible in sentences where the C position is filled by a
visible verb, and this is why G. Müller’s analysis derives only V1 clauses. These are
interpreted as imperatives or polar questions. Figure 21.21 on the following page gives
the analysis of (120b). Budde (2010) and Maché (2010) note that the discussion of the data
has neglected the fact that there are also interrogative variants of the construction:
(121) a. Wohin mit den Klamotten?
where.to with the clothes
‘Where should the clothes go?’
b. Wohin mit dem ganzen Geld?
where.to with the entire money
‘Where should all this money go?’
Since these questions correspond to V2 sentences, one does not require the constraint
that the prefield can only be filled if the C position is filled.
One major advantage of this analysis is that it derives the different sentence types
that are possible with this kind of construction: the V1-variants correspond to polar
questions and imperatives, and the V2-variants with a question word correspond to wh-
Figure 21.21: In den Müll mit diesen Klamotten ‘in the trash with these clothes’ as an
antipassive following G. Müller (2009a)
Nevertheless one should still bear the price of this analysis in mind: it assumes an empty
antipassive morpheme that is otherwise not needed in German. It would only be used
in constructions of the kind discussed here. This morpheme is not compatible with any
verb and it also triggers obligatory verb movement, which is something that is not known
from any other morpheme used to form verb diatheses.
The costs of this analysis are, of course, less severe if one assumes that humans already
have this antipassive morpheme anyway, that is, this morpheme is part of our innate
Universal Grammar. But if one follows the argumentation from the earlier sections of
this chapter, then one should only assume innate linguistic knowledge if there is no
alternative explanation.
G. Müller’s analysis can be translated into HPSG. The result is given in (124):
(124) [ verb-initial-lr
        RELS     ⟨ [ imperative-or-interrogative
                     EVENT [2] ] ⟩ ⊕ [4]
        LEX-DTR  [ PHON    ⟨⟩
                   SS|LOC  [ CAT   [ HEAD|MOD  none
                                     SPR       ⟨⟩
                                     COMPS     ⟨ XP[MOD … IND [1]], (PP[mit]_[1]) ⟩ ]
                             CONT  [ IND   [2]
                                     RELS  [4] ⟨ [ directive
                                                   EVENT    [2]
                                                   PATIENT  [1] ] ⟩ ] ] ] ]
(124) contains a lexical entry for an empty verb in verb-initial position. directive′ is a
placeholder for a more general relation that should be viewed as a supertype of all possible
meanings of this construction. These subsume both schmeißen ‘to throw’ and cases
such as (125) that were pointed out to me by Monika Budde:
(125) Und mit dem Klavier ganz langsam durch die Tür!
and with the piano very slowly through the door
‘Carry the piano very slowly through the door!’
Since only verb-initial and verb-second orders are possible in this construction, the ap-
plication of the lexical rule for verb-initial position (see page 296) is obligatory. This can
be achieved by writing the result of the application of this lexical rule into the lexicon,
without the object to which the rule would have applied actually being present
in the lexicon itself. Koenig (1999: Section 3.4.2, 5.3) proposed something similar for En-
glish rumored ‘it is rumored that …’ and aggressive. There is no active variant of the verb
rumored, a fact that can be captured by the assumption that only the result of applying a
passive lexical rule is present in the lexicon. The actual verb or verb stem from which the
participle form has been derived exists only as the daughter of a lexical rule but not as
an independent linguistic object. Similarly, the verb * aggress only exists as the daughter
of a (non-productive) adjective rule that licenses aggressive and a nominalization rule
licensing aggression.
The optionality of the mit-PP is signaled by the brackets in (124). If one adds the
information inherited from the type verb-initial-lr under SYNSEM, then the result is (126).
21 Phrasal vs. lexical analyses
(126) [ verb-initial-lr
        SYNSEM|LOC  [ HEAD   [ verb
                               VFORM    fin
                               INITIAL  +
                               DSL      none ]
                      SPR    ⟨⟩
                      COMPS  ⟨ [ LOC|CAT  [ HEAD   [ verb
                                                     DSL 3 ]
                                            COMPS  ⟨⟩ ] ] ⟩ ]
        RELS        ⟨ imperative-or-interrogative[ EVENT 2 ] ⟩ ⊕ 4
        LEX-DTR     [ PHON    ⟨⟩
                      SS|LOC 3 [ CAT   [ HEAD|MOD  none
                                         SPR       ⟨⟩
                                         COMPS     ⟨ XP[MOD … IND 1 ], (PP[mit] 1 ) ⟩ ]
                                 CONT  [ IND   2
                                         RELS 4 ⟨ directive[ EVENT 2 , PATIENT 1 ] ⟩ ] ] ] ]
The valence properties of the empty verb in (126) are to a large extent determined by the
lexical rule for verb-initial order: the V1-LR licenses a verbal head that requires a VP to
its right that is missing a verb with the local properties of the LEX-DTR ( 3 ).
Semantic information dependent on sentence type (assertion, imperative or question)
is determined inside the V1-LR depending on the morphological make-up of the verb
and the SLASH value of the selected VP (see Müller 2007a: Section 10.3; 2015b; 2017a).
Setting the semantics to imperative-or-interrogative rules out assertion as it occurs in V2-
clauses. Whether this type is resolved in the direction of imperative or interrogative is
ultimately decided by further properties of the utterance such as intonation or the use
of interrogative pronouns.
The valence of the lexical daughters in (126) as well as the connection to the semantic
role (the linking to the patient role) are simply stipulated. Every approach has to stipulate
that an argument of the verb has to be expressed as a mit-PP. Since there is no antipassive
in German, the effect that could be otherwise achieved by an antipassive lexical rule in
(126) is simply written into the LEX-DTR of the verb movement rule.
The COMPS list of LEX-DTR contains a modifier (adverb, directional PP) and the mit-PP.
This mit-PP is co-indexed with the patient of directive′ and the modifier refers to
the referent of the mit-PP. The agent of directive′ is unspecified since it depends on the
context (speaker, hearer, third person).
This analysis is shown in Figure 21.22. Here, V[LOC 2 ] corresponds
to the LEX-DTR in (126). The V1-LR licenses an element that requires a maximal verb
projection with that exact DSL value 2 . Since DSL is a head feature, the information is
present along the head path. The DSL value is identified with the LOCAL value ( 2 in
Figure 21.22) in the verb movement trace (see page 297). This ensures that the empty
element at the end of the sentence has exactly the same local properties that the LEX-DTR in
(126) has. Thus, both the correct syntactic and semantic information is present on the
verb trace and structure building involving the verb trace follows the usual principles.

[Figure 21.22: tree diagram not reproduced. Recoverable node labels: V[COMPS ⟨⟩];
V[LOC 2 ]; 3 PP; V[HEAD|DSL 2 , COMPS ⟨ 3 ⟩]; 4 PP[mit]; V 2 [HEAD|DSL 2 , COMPS ⟨ 3 , 4 ⟩]]

The structures correspond to the structures that were assumed for German sentences
in Chapter 9. Therefore, there are the usual possibilities for integrating adjuncts. The
correct derivation of the semantics, in particular embedding under imperative or inter-
rogative semantics, follows automatically (for the semantics of adjuncts in conjunction
with verb position, see Müller (2007a: Section 9.4)). Also, the ordering variants with the
mit-PP preceding the direction (125) and direction preceding the mit-PP (120b) follow
from the usual mechanisms.
If one rejects the analyses discussed up to this point, then one is only really left with
phrasal constructions or dominance schemata that connect parts of the construction
and contribute the relevant semantics. Exactly how one can integrate adjuncts into the
phrasal construction in a non-stipulative way remains an open question; however, there
are already some initial results by Jakob Maché (2010) suggesting that directives can still
be sensibly integrated into the entire grammar provided an appropriate phrasal schema
is assumed.
tion of aspect marking inside the VP:61 if the first VP contains a perfect marker, then we
have the meaning ‘VP1 in order to do/achieve VP2’ (127a). If the second VP contains a
perfect marker, then the entire construction means ‘VP2 because VP1’ (127b), and if the
first VP contains a durative marker and the verb hold or use, then the entire construction
means ‘VP2 using VP1’ (127c).
(127) a. Ta1 qu3 le qian2 qu4 guang1jie1.
he withdraw PRF money go shop
‘He withdrew money to go shopping.’
b. Ta1 chi1 pu2tao tu3 le Pu2taopi2.
he eat grape spit PRF grape.skin
‘He spat grape skins because he ate grapes.’
c. Ta1 na2 zhe kuai4zi chi1 fan4.
he hold DUR chopsticks eat food
‘He eats with chopsticks.’
If we consider the sentences, we only see two adjacent VPs. The meanings of the entire
sentences, however, contain parts of meaning that go beyond the meaning of the verb
phrases. Depending on the kind of aspect marking, we arrive at different interpretations
with regard to the semantic combination of verb phrases. As can be seen in the transla-
tions, English sometimes uses conjunctions in order to express relations between clauses
or verb phrases.
There are three possible ways to capture these data:
1. One could claim that speakers of Chinese simply deduce the relation between the
VPs from the context,
2. one could assume that there are empty heads in Chinese corresponding to because
or to, or
3. one could assume a phrasal construction for serial verbs that contributes the cor-
rect semantics for the complete meaning depending on the aspect marking inside
the VPs.
The first approach is unsatisfactory because the meaning does not vary arbitrarily. There
are grammaticalized conventions that should be captured by a theory. The second solu-
tion has a stipulative character and thus, if one wishes to avoid empty elements, only
the third solution remains. Müller & Lipenkova (2009) have presented a corresponding
analysis.
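
The third option can be made concrete with a minimal sketch, assuming a toy representation that is not part of the analysis in Müller & Lipenkova (2009): the class VP and the function serial_verb_relation are hypothetical names, and the three relations are simply the paraphrases given for (127a-c). The point is only that the construction as a whole, not any word in it, contributes the relation between the two VPs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VP:
    """Toy verb phrase: the gloss of its verb and its aspect marker (hypothetical encoding)."""
    verb: str
    aspect: Optional[str] = None  # "PRF", "DUR" or None


def serial_verb_relation(vp1: VP, vp2: VP) -> Optional[str]:
    """Return the relation a phrasal serial verb construction adds between VP1 and VP2."""
    if vp1.aspect == "PRF":
        return "VP1 in order to do/achieve VP2"   # cf. (127a)
    if vp2.aspect == "PRF":
        return "VP2 because VP1"                  # cf. (127b)
    if vp1.aspect == "DUR" and vp1.verb in {"hold", "use"}:
        return "VP2 using VP1"                    # cf. (127c)
    return None  # this sketch leaves all other combinations unresolved


print(serial_verb_relation(VP("withdraw", "PRF"), VP("go")))
print(serial_verb_relation(VP("eat"), VP("spit", "PRF")))
print(serial_verb_relation(VP("hold", "DUR"), VP("eat")))
```

Nothing in the two VP objects themselves encodes purpose, cause or instrument; the mapping lives in the function that combines them, which is the analogue of a phrasal schema.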
phrase and a clause or a verb phrase missing the fronted phrase. The fronted phrase
contains a relative or interrogative pronoun.
(128) a. the man [who] sleeps
b. the man [who] we know
c. the man [whose mother] visited Kim
d. a house [in which] to live
The GB analysis of relative clauses is given in Figure 21.23. In this analysis, an empty
head is in the C position and an element from the IP is moved to the specifier position.
[Figure 21.23 (tree): CP[rel] dominates NP and C[rel]; C[rel] dominates C0[rel] and IP]
The alternative analysis shown in Figure 21.24 involves combining the subparts
directly in order to form a relative clause.

[Figure 21.24 (tree): S[rel] dominates NP and S]

Borsley (2006) has shown that one would require
six empty heads in order to capture the various types of relative clauses possible in
English if one wanted to analyze them lexically. These heads can be avoided and replaced
by corresponding schemata (see Chapter 19 on empty elements). A parallel argument
can also be found in Webelhuth (2011) for German: grammars of German would also
have to assume six empty heads for the relevant types of relative clause.
Unlike the resultative constructions that were already discussed, there is no variability
among interrogative and relative clauses with regard to the order of their parts. There are
no changes in valence and no interaction with derivational morphology. Thus, nothing
speaks against a phrasal analysis. If one wishes to avoid the assumption of empty heads,
then one should opt for the analysis of relative clauses by Sag, or the variant in Müller
(1999b: Chapter 10; 2007a: Chapter 11). The latter analysis does without a special schema
for noun-relative clause combinations since the semantic content of the relative clause
is provided by the relative clause schema.
Sag (2010) discusses long-distance dependencies in English that are subsumed under
the term wh-movement in GB theory and the Minimalist Program. He shows that this is
by no means a uniform phenomenon. He investigates wh-questions (130), wh-exclama-
tives (131), topicalization (132), wh-relative clauses (133) and the-clauses (134):
(130) a. How foolish is he?
b. I wonder how foolish he is.
These individual constructions vary in many respects. Sag lists the following questions
that have to be answered for each construction:
• Is there a special wh-element in the filler daughter and, if so, what kind of element
is it?
• Which syntactic categories can the filler daughters have?
• Can the head-daughter be inverted or finite? Is this obligatory?
• What is the semantic and/or syntactic category of the mother node?
• What is the semantic and/or syntactic category of the head-daughter?
• Is the sentence an island? Does it have to be an independent clause?
The variation that exists in this domain has to be captured somehow by a theory of
grammar. Sag develops an analysis with multiple schemata that ensure that the cate-
gory and semantic contribution of the mother node correspond to the properties of both
daughters. The constraints for both classes of constructions and specific constructions
are represented in an inheritance hierarchy so that the similarities between the construc-
tions can be accounted for. The analysis can of course also be formulated in a GB-style
using empty heads. One would then have to find some way of capturing the generaliza-
tions pertaining to the construction. This is possible if one represents the constraints on
empty heads in an inheritance hierarchy. Then, the approaches would simply be nota-
tional variants of one another. If one wishes to avoid empty elements in the grammar,
then the phrasal approach would be preferable.
62
Zwölf Städte. Einstürzende Neubauten. Fünf auf der nach oben offenen Richterskala, 1987.
For this kind of structure, it would be necessary to assume that a preposition selects a
noun to its right and, if it finds this, it then requires a second noun of this exact form to
its left. For N-um-N and N-für-N, it is not entirely clear what the entire construction has
to do with the individual prepositions. One could also try to develop a lexical analysis
for this phenomenon, but the facts are different to those for resultative constructions:
in resultative constructions, the semantics of simplex verbs clearly plays a role. Further-
more, unlike with the resultative construction, the order of the component parts of the
construction is fixed in the N-P-N construction. It is not possible to extract a noun or
place the preposition in front of both nouns. Syntactically, the N-P-N combination with
some prepositions behaves like an NP (Jackendoff 2008: 9):
(138) Student after/upon/*by student flunked.
This is also strange if one wishes to view the preposition as the head of the construction.
Instead of a lexical analysis, Jackendoff proposes the following phrasal construction
for N-after-N combinations:
(139) Meaning: MANY X_i s IN SUCCESSION [or however it is encoded]
Syntax: [NP N_i P_j N_i ]
Phonology: Wd_i after_j Wd_i
The entire meaning as well as the fact that the N-P-N has the syntactic properties of an
NP would be captured on the construction level.
I already discussed examples by Bargmann (2015) in Section 11.7.2.4 that show that
N-P-N constructions may be extended by further P-N combinations:
(140) Day after day after day went by, but I never found the courage to talk to her.
So rather than an N-P-N pattern Bargmann suggests the pattern in (141), where ‘+’ stands
for at least one repetition of a sequence.
(141) N (P N)+
As was pointed out on page 414 this pattern is not easy to cover in selection-based ap-
proaches. One could assume that an N takes arbitrarily many P-N combinations, which
would be very unusual for heads. Alternatively, one could assume recursion, so N would
be combined with a P and with an N-P-N to yield N-P-N-P-N. But such an analysis would
make it really difficult to enforce the restrictions regarding the identity of the nouns in
the complete construction. In order to enforce such an identity the N that is combined
with N-P-N would have to impose constraints regarding deeply embedded nouns inside
the embedded N-P-N object (see also Section 18.2).
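
The identity requirement is easy to state once the whole sequence is visible at once, as in a flat phrasal pattern. The following is a minimal sketch, assuming a tokenized, lowercased input; the function name is_n_p_n and the default preposition set (taken from the acceptable variants in (138)) are illustrative assumptions, not part of Bargmann's or Jackendoff's proposals.

```python
def is_n_p_n(tokens, prepositions=frozenset({"after", "upon"})):
    """Check the pattern N (P N)+ from (141) with one and the same noun throughout."""
    # The shortest instance is N P N; every repetition adds one P and one N,
    # so the number of tokens has to be odd.
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        return False
    noun = tokens[0]
    if noun in prepositions:
        return False
    for i in range(1, len(tokens), 2):
        preposition, next_noun = tokens[i], tokens[i + 1]
        # Identity is checked against the first noun, however far away it is.
        if preposition not in prepositions or next_noun != noun:
            return False
    return True


print(is_n_p_n("day after day after day".split()))
print(is_n_p_n("day after night".split()))
```

In a recursive, selection-based analysis the comparison `next_noun != noun` would have to reach into an already built N-P-N object, which is the non-locality problem described above.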
G. Müller (2011) proposes a lexical analysis of the N-P-N construction. He assumes
that prepositions can have a feature REDUP. In the analysis of Buch um Buch ‘book after
book’, the preposition is combined with the right noun um Buch. In the phonological
component, reduplication of Buch is triggered by the REDUP feature, thereby yielding
Buch um Buch. This analysis also suffers from the problems pointed out by Jackendoff:
in order to derive the semantics of the construction, the semantics would have to be
present in the lexical entry of the reduplicating preposition (or in a relevant subsequent
component that interprets the syntax). Furthermore it is unclear how a reduplication
analysis would deal with the Bargmann data.
Further reading
22 Structure, potential structure and
underspecification
The previous chapter extensively dealt with the question whether one should adopt a
phrasal or a lexical analysis of valence alternations. This rather brief chapter deals with
a related issue. I discuss the analysis of complex predicates consisting of a preverb and
a light verb. Preverbs often have an argument structure of their own. They describe an
event and the light verb can be used to realize either the full number of arguments or
a reduced set of arguments. (1) provides the example from Hindi discussed by Vaidya,
Rambow & Palmer (2019).
tareef ‘praise’ is a noun that can be combined with the light verb kar ‘do’ to form an
active sentence as in (1a) or with the light verb ho ‘be’ to form a passive sentence as in
(1b). Similar examples can of course be found in other languages making heavy use of
complex predicates (Müller 2010b).
In what follows I compare the analysis of Vaidya et al. (2019) in the framework of
Lexicalized TAG with an HPSG analysis. As the name indicates, LTAG is a lexicalized
framework, something that was argued for in the previous chapter. However, TAG is
similar to phrasal Construction Grammar in that it makes use of phrasal configurations
to represent argument slots. This differs from Categorial Grammar and HPSG since
the latter frameworks assume descriptions of arguments (head/functor representations
in CG and valence lists in HPSG) rather than structures containing these arguments. So
while TAG elementary trees contain actual structure, CG and HPSG lexical items contain
potential structure. TAG structures can be taken apart and items can be inserted into
the middle of an existing structure, but usually the structure is not transformed into
something else.1 This is an interesting difference that becomes crucial when talking
about the formation of complex predicates and in particular about certain active/passive
alternations.
1
One way to “delete” parts of the structure would be to assume empty elements that can be inserted into
substitution nodes (see Chapter 19 for discussion).
Vaidya et al. (2019) assume that the structures for the examples in (1) are composed
of elementary trees for tareef ‘praise’ and the respective light verbs. This is shown in
Figure 22.1 and Figure 22.2 respectively. The TAG analysis is only sketched here. The
authors use feature-based TAG, which makes it possible to enforce obligatory adjunction:
the elementary tree for tareef is specified in a way that makes it necessary to take
the tree apart and insert nodes of another tree (see page 430). This way it can be ensured
that the preverb has to be augmented by a light verb. This results in XP_f being inserted
at XP1 in Figures 22.1 and 22.2.

Figure 22.1: Analysis of logon=ne pustak=kii tareef k-ii ‘People praised the book.’ The
tree of the light verb is adjoined into the tree of the preverb, into the XP1
position
[Tree diagrams not reproduced. Recoverable node labels: XP0, XP_r, XP_f, XP1, XP2,
NP1↓, NP2↓, VP, V, X. Glossed terminals: logon=ne ‘people’, pustak=kii ‘book’,
tareef ‘praise’, k-ii ‘do’]

Figure 22.2: Analysis of pustak=kii tareef hu-ii ‘The book got praised.’
[Tree diagrams not reproduced. Recoverable node labels: XP_r, XP_f, XP1, XP2, NP1↓,
VP, V, X. Glossed terminals: pustak=kii ‘book’, tareef ‘praise’, hu-ii ‘be’]

What the analysis clearly shows is that TAG assumes two lexical items for the preverb:
one with two arguments for the active case and one with just one argument for the
passive. In general one would say that tareef is a noun describing the praising event, that
is, one person praises another one. Now this noun can be combined with a light verb and
depending on which light verb is used we get an active sentence with both arguments
realized or a passive sentence with the agent of the eventive noun suppressed. There is
no morphological reflex of this active/passive alternation on the noun. It is just the same
noun tareef, both in the active sentence in (1a) and in the passive one in (1b).
And here we see a real difference between the frameworks: TAG is a framework in
which structure is assembled: the basic operations are substitution and adjunction. The
lexicon consists of ready-made building blocks that are combined to yield the trees we
want to have in the end. This differs from Categorial Grammar and HPSG where lexical
items do not encode real structure to be used in an analysis, but potential structure:
lexical items come with a list of their arguments, that is, items that are required for the
lexical element under consideration to project to a full phrase. However, lexical heads
may enter relations with their valents and form NPs, APs, VPs, PPs or other phrases,
but they do not have to. Geach (1970) developed a technique that is called functional
composition or argument composition within the framework of Categorial Grammar and
this was transferred to HPSG by Hinrichs & Nakazawa (1989b; 1994a). Since the 1990s this
technique has been used for the analysis of complex predicates in HPSG for German (Hinrichs &
Nakazawa 1989b; 1994a; Kiss 1995; Meurers 1999a; Müller 1999b; Kathol 2000), Romance
(Miller & Sag 1997: 600; Monachesi 1998; Abeillé & Godard 2002), Korean (Chung 1998),
and Persian (Müller 2010b). See Godard & Samvelian (2020) for an overview. For instance,
Müller (2010b: 642) analyzes the light verbs kardan ‘do’ and šodan ‘become’ this way:
both raise the subject of the embedded predicate and make it their own argument but
kardan introduces an additional argument while šodan does not do so.
Applying the argument composition technique to our example, we get the following
lexical item for tareef :
(2) Sketch of lexical item for tareef ‘praise’:
HEAD noun
SUBJ ⟨ 1 ⟩
COMPS ⟨ 2 NP ⟩
ARG-ST ⟨ 1 NP, 2 NP ⟩
The ARG-ST list contains all arguments of a head. The arguments are linked to the
semantic representation and are mapped to valence features like SPECIFIER and COMPLEMENTS.
Depending on the language and on whether subjects can be realized within projections, the
subject may be mapped to a separate feature, which is a HEAD feature. HEAD features are
projected along the head path but the features contained under HEAD do not license
combinations with the head.
The lexical items for kar ‘do’ and ho ‘be’ are:
(3) a. Sketch of lexical item for kar ‘do’:
HEAD verb
ARG-ST 1 ⊕ 2 ⊕ ⟨ N[SUBJ 1 , COMPS 2 ] ⟩
b. Sketch of lexical item for ho ‘be’:
HEAD verb
ARG-ST 1 ⊕ ⟨ N[COMPS 1 ] ⟩

Figure 22.3: Analysis of logon=ne pustak=kii tareef k-ii ‘People praised the book.’ The
arguments of the preverb are taken over by the light verb
V
  1 NP
  V
    2 NP
    V
      3 N[SUBJ ⟨ 1 ⟩, COMPS ⟨ 2 ⟩, ARG-ST ⟨ 1 , 2 ⟩]
      V[SUBJ ⟨ ⟩, COMPS 4 , ARG-ST 4 ⟨ 1 , 2 , 3 ⟩]

The verb kar ‘do’ selects for a noun and takes whatever the subject of this noun is ( 1 )
and concatenates the list of complements the noun takes ( 2 ) with the value of SUBJ. The
result is 1 ⊕ 2 and it is a prefix of the ARG-ST list of the light verb. The lexical item for ho
‘be’ is similar, the difference being that the subject of the embedded noun is not attracted
to the higher ARG-ST list, only the complements ( 1 ) are.
For finite verbs it is assumed that all arguments are mapped to the COMPS list of the
verb, so the COMPS list is identical to the ARG-ST list. The analysis of our example sen-
tences is shown in Figures 22.3 and 22.4.
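
The list concatenations in (3) can be replayed with ordinary lists. This is a minimal sketch under stated assumptions: the class Noun, the functions kar_arg_st and ho_arg_st and the placeholder strings "NP_1" and "NP_2" (standing for the arguments tagged 1 and 2) are hypothetical names, and real HPSG descriptions are typed feature structures with structure sharing rather than Python lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Noun:
    """Toy version of the preverb entry in (2)."""
    form: str
    subj: List[str]   # value of SUBJ
    comps: List[str]  # value of COMPS


def kar_arg_st(preverb: Noun) -> List[str]:
    # (3a): ARG-ST = 1 ⊕ 2 ⊕ ⟨ N[SUBJ 1, COMPS 2] ⟩
    return preverb.subj + preverb.comps + [preverb.form]


def ho_arg_st(preverb: Noun) -> List[str]:
    # (3b): ARG-ST = 1 ⊕ ⟨ N[COMPS 1] ⟩, the subject of the noun is not attracted
    return preverb.comps + [preverb.form]


tareef = Noun("tareef", subj=["NP_1"], comps=["NP_2"])

# For finite verbs COMPS is identical to ARG-ST, so these are also the COMPS lists.
print(kar_arg_st(tareef))  # active: both arguments plus the preverb
print(ho_arg_st(tareef))   # passive: only the complement plus the preverb
```

The same lexical item tareef serves both light verbs; the difference between active and passive lies entirely in what the light verb does with the preverb's valence lists.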
The conclusion is that HPSG has a representation of potential structure. When light
verbs are present, they can take over valents and “execute” them according to their own
preferences. This is not possible in TAG since once structure is assembled it cannot be
changed. We may insert items into the middle of an already assembled structure but we
cannot take out arguments or reorder them. This is possible in Categorial Grammar and
in HPSG: the governing head may choose which arguments to take over and in which
order they should be represented in the valence representations of the governing head.
Figure 22.4: Analysis of pustak=kii tareef hu-ii ‘The book got praised.’
V
  2 NP
  V
    3 N[SUBJ ⟨ 1 NP ⟩, COMPS ⟨ 2 ⟩, ARG-ST ⟨ 1 , 2 ⟩]
    V[SUBJ ⟨ ⟩, COMPS 4 , ARG-ST 4 ⟨ 2 , 3 ⟩]

LFG is somewhere in the middle between TAG and HPSG: the phrase structural
configurations are not fully determined as in TAG since LFG does not store and manipulate
phrase markers. But lexical items are associated with f-structures and these f-structures
are responsible for which elements are realized in syntax. As complex predicates are
assumed to be monoclausal it is not sufficient to embed the f-structure of the preverb
within the f-structure of the light verb (Butt et al. 2003). Since the grammatical
functions that are ultimately realized in the clause do not depend on the preverb alone, the
light verb may have to determine the grammatical functions contributed by the preverb.
In order to be able to do this Butt et al. (2003) use the restriction operator (Kaplan &
Wedekind 1993), which restricts out certain features or path equations provided by the
preverb’s and the light verb’s f-structures. The statement of grammatical functions in
f-structures is another instance of too strict specifications: once specified, it is difficult
to get rid of it and special means like partial copying via restriction are needed. An al-
ternative not relying on restriction was suggested by Butt (1997): embedding relations
can be specified on the a-structure representation and then a mapping is defined that
maps the complex a-structure to the desired f-structure. Mapping between several lev-
els of representation is a general tool that is also used in HPSG: for instance, Bouma,
Malouf & Sag (2001) used ARG-ST, DEPS, and COMPS in the treatment of nonlocal depen-
dencies. See also Koenig (1999) on the introduction of arguments via additional auxiliary
features. As I showed in Müller (2007a: Section 7.5.2.2), one would need an extra feature
for every kind of argument alternation that is to be modeled this way. Recent versions
of LFG use glue semantics to keep track of arguments (Dalrymple, Lamping & Saraswat
1993; Dalrymple 2001: Chapter 8). Glue semantics can be used to do argument extension
and argument manipulation in general in ways that are parallel to argument attraction
approaches. See for instance Asudeh, Dalrymple & Toivonen (2013) for a treatment of
benefactive arguments.
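
What "restricting out" a feature amounts to can be illustrated with a toy encoding of f-structures as dictionaries. This is only a sketch of the basic idea of restriction (an f-structure minus one attribute); the function name restrict and the attribute values are hypothetical, and the sketch makes no claim about the specific equations used by Butt et al. (2003).

```python
def restrict(fstructure: dict, attribute: str) -> dict:
    """Return a copy of the f-structure that lacks the given attribute."""
    return {attr: value for attr, value in fstructure.items() if attr != attribute}


# Hypothetical f-structure contributed by a preverb that specifies two
# grammatical functions.
preverb_f = {"PRED": "praise", "SUBJ": {"PRED": "people"}, "OBJ": {"PRED": "book"}}

# A passive-like light verb would need everything except the SUBJ.
print(restrict(preverb_f, "SUBJ"))
```

The original f-structure is left untouched; the light verb works with a partial copy. This is the sense in which a specification, once made, cannot simply be withdrawn but has to be copied around.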
Summing up, I showed that there are indeed differences between the frameworks
that are due to the basic representational formalisms they assume. While TAG assumes
that the lexicon contains trees with a certain structure, HPSG assumes that lexical items
come with valence specifications, that is, they have descriptions of items that have to
be combined with the lexical item. But the way in which the items have to be combined
with the head is determined by dominance schemata (grammar rules) that are separate
from the lexical items. So the valence specifications specify possible structures. Since
Further reading
23 Universal Grammar and doing
comparative linguistics without an a
priori assumption of a (strong) UG
The following two sections deal with the tools that I believe to be necessary to capture
generalizations and the way one can derive such generalizations.
23.1 Formal tools for capturing generalizations
23 Universal Grammar and comparative linguistics without UG
Figure 23.1: Section of an inheritance hierarchy with lexical entries and dominance
schemata
[Diagram not reproduced. Recoverable type labels: sign, intransitive-verb, transitive-verb]
In addition to a type for roots, the above figure contains types for stems and words.
Complex stems are complex objects that are derived from simple roots but still have
to be inflected (lesbar- ‘readable’, besing- ‘to sing about’). Words are objects that do
not inflect. Examples of these are the pronouns er ‘he’, sie ‘she’ etc. as well as preposi-
tions. An inflected form can be formed from a verbal stem (geliebt ‘loved’, besingt ‘sings
about’). Relations between inflected words and (complex) stems can be formed again us-
ing derivation rules. In this way, geliebt ‘loved’ can be recategorized as an adjective stem
that must then be combined with adjectival endings (geliebt-e). The relevant descriptions
of complex stems/words are subtypes of complex-stem or word. These subtypes describe
the form that complex words such as geliebte must have. For a technical implementation
of this, see Müller (2002a: Section 3.2.7). Using dominance schemata, all words can be
combined to form phrases. The hierarchy given here is of course by no means complete.
There are a number of additional valence classes and one could also assume more general
types that simply describe one, two and three-place predicates. Such types are probably
plausible for the description of other languages. Here, we are only dealing with a small
part of the type hierarchy in order to have a comparison to the Croftian hierarchy: in
Figure 23.1, there are no types for sentence patterns with the form [Sbj IntrVerb], but
rather types for lexical objects with a particular valence (V[COMPS ⟨ NP[str] ⟩]). Lexical
rules can then be applied to the relevant lexical objects that license objects with another
valence or introduce information about inflection. Complete words can be combined in
the syntax with relatively general rules, for example in head-argument structures. The
problems from which purely phrasal approaches suffer are thereby avoided. Nevertheless,
generalizations about lexeme classes and the utterances that can be formed can be
captured in the hierarchy.
There are also principles in addition to inheritance hierarchies: the Semantics Princi-
ple presented in Section 9.1.6 holds for all languages. The Case Principle that we also saw
is a constraint that only applies to a particular class of languages, namely nominative-
accusative languages. Other languages have an ergative-absolutive system.
The assumption of innate linguistic knowledge is not necessary for the theory of
language sketched here. As the discussion in Chapter 13 has shown, the question of whether
this kind of knowledge exists has still not been answered conclusively. Should it turn
out that this knowledge actually exists, the question arises of what exactly is innate. It
would be a plausible assumption that the part of the inheritance hierarchy that is rele-
vant for all languages is innate together with the relevant principles (e.g., the constraints
on Head-Argument structures and the Semantics Principle). It could, however, also be
the case that only a part of the more generally valid types and principles is innate since
something being innate does not follow from the fact that it is present in all languages
(see also Section 13.1.9).
In sum, one can say that theories that describe linguistic objects using a consistent
descriptive inventory and make use of inheritance hierarchies to capture generalizations
are the ones best suited to represent similarities between languages. Furthermore, this
kind of theory is compatible with both a positive and a negative answer to the question
of whether there is innate linguistic knowledge.
passive is movement makes unwanted predictions for German, since the subject of pas-
sives stays in the object position in German. Furthermore, this analysis requires the
assumption of invisible expletives, that is, entities that cannot be seen and do not have
any meaning.
On the other extreme of the spectrum we find people working in Construction Gram-
mar or without any framework at all (see footnote 1 on page 1 for discussion) who claim
that all languages are so different that we cannot even use the same vocabulary to an-
alyze them. Moreover, within languages, we have so many different objects that it is
impossible (or too early) to state any generalizations. Again, what I describe here are
extreme positions and clichés.
In what follows, I sketch the procedure that we apply in the CoreGram project1 (Müller
2013b; 2015c). In the CoreGram project we work on a set of typologically diverse lan-
guages in parallel:
• German (Müller 2007a; 2009c; 2012b; Müller & Ørsnes 2011; 2013a; Müller 2014b;
2017a)
• Danish (Ørsnes 2009b; Müller 2009c; 2012b; Müller & Ørsnes 2011; 2013a,b; 2015)
• Persian (Müller 2010b; Müller & Ghayoomi 2010)
• Maltese (Müller 2009b)
• Mandarin Chinese (Lipenkova 2009; Müller & Lipenkova 2009; 2013; 2016)
• Yiddish (Müller & Ørsnes 2011)
• English (Müller 2009c; 2012b; Müller & Ørsnes 2013a)
• Hindi
• Spanish (Machicao y Priemer 2015)
• French
These languages belong to diverse language families (Indo-European, Afro-Asiatic,
Sino-Tibetan), and the Indo-European languages among them belong to different
groups (Germanic, Romance, Indo-Iranian). Figure 23.2 provides an overview. We work
out fully formalized, computer-processable grammar fragments in the framework of
HPSG that have a semantics component. The details will not be discussed here, but
the interested reader is referred to Müller (2015c).
As was argued in previous sections, the assumption of innate language-specific knowl-
edge should be kept to a minimum. This is also what Chomsky suggested in his Min-
imalist Program. There may even be no language-specific innate knowledge at all, a
view taken in Construction Grammar/Cognitive Grammar. So, instead of imposing con-
straints from one language onto other languages, a bottom-up approach seems to be
1
https://hpsg.hu-berlin.de/Projects/CoreGram.html, 2nd September 2020.
23.2 How to develop linguistic theories that capture cross-linguistic generalizations
Figure 23.2: Language families and groups of the languages covered in the CoreGram
project
[Tree not reproduced. Root: Languages. Leaves: Danish, English, German, Yiddish, French,
Spanish, Hindi, Persian, Maltese, Mandarin Chinese]

[Figure 23.3, diagram not reproduced: Set 3 (labeled Arg St, V2, SOV, VC) above
Set 1 (German) and Set 2 (Dutch)]
between German and Dutch (Set 3). For instance, the argument structure of lexical items,
a list containing descriptions of syntactic and semantic properties of arguments and the
linking of these arguments to the meaning of the lexical items, is contained in Set 3. In
addition to the constraints for SOV languages, the verb position and the fronting of a
constituent in V2 clauses are contained in Set 3. The respective constraints are shared be-
tween the two grammars. Although these sets are arranged in a hierarchy in Figure 23.3
and the following figures this has nothing to do with the type hierarchies that have been
discussed in the previous subsection. These type hierarchies are part of our linguistic
theories and various parts of such hierarchies can be in different sets: those parts of
the type hierarchy that concern more general aspects can be in Set 3 in Figure 23.3 and
23 Universal Grammar and comparative linguistics without UG
those that are specific to Dutch or German are in the respective other sets. When we add
another language, say Danish, we get further differences. While German and Dutch are
SOV, Danish is an SVO language. Figure 23.4 shows the resulting situation: the topmost
node represents constraints that hold for all the languages considered so far (for instance
the argument structure constraints, linking and V2) and the node below it (Set 4) con-
tains constraints that hold for German and Dutch only.2 For instance, Set 4 contains
constraints regarding verbal complexes and SOV order. The union of Set 4 and Set 5 is
Set 3 of Figure 23.3.
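The way such constraint sets arise can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the constraint labels follow the figures, while the labels svo, german-specific and dutch-specific and the function name constraint_sets are hypothetical and not part of the CoreGram grammars. Constraints are grouped by the exact set of languages whose grammars contain them.

```python
from collections import defaultdict

# Toy inventories. The labels arg-str, v2, sov and vc follow the figures;
# svo, german-specific and dutch-specific are hypothetical placeholders.
GRAMMARS = {
    "German": {"arg-str", "v2", "sov", "vc", "german-specific"},
    "Dutch": {"arg-str", "v2", "sov", "vc", "dutch-specific"},
    "Danish": {"arg-str", "v2", "svo"},
}


def constraint_sets(grammars):
    """Group constraints by the exact set of languages that share them."""
    holders = defaultdict(set)
    for language, constraints in grammars.items():
        for constraint in constraints:
            holders[constraint].add(language)
    grouped = defaultdict(set)
    for constraint, languages in holders.items():
        grouped[frozenset(languages)].add(constraint)
    return dict(grouped)


if __name__ == "__main__":
    result = constraint_sets(GRAMMARS)
    # Print the most widely shared constraint sets first.
    for languages, constraints in sorted(result.items(), key=lambda kv: -len(kv[0])):
        print(sorted(languages), "->", sorted(constraints))
```

With all three languages, the constraints shared by every grammar end up in one group and those shared by German and Dutch only in another; if Danish is left out, the two groups collapse into a single one, which mirrors the statement that the union of Set 4 and Set 5 is Set 3.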
[Figure 23.4: Set 5 (Arg Str, V2) is the topmost node; below it is Set 4 (SOV, VC), which
holds for German and Dutch only]
If we add further languages, further constraint sets will be distinguished. Figure 23.5
on the facing page shows the situation that results when we add English and French.
Again, the picture is not complete since there are constraints that are shared by Danish
and English but not by French, but the general idea should be clear: by systematically
working this way, we should arrive at constraint sets that directly correspond to those
that have been established in the typological literature.
The interesting question is what will be the topmost set if we consider enough lan-
guages. At first glance, one would expect that all languages have valence representations
and linkings between these and the semantics of lexical items (argument structure lists
in the HPSG framework). However, Koenig & Michelson (2012) argue for an analysis
of Oneida (a Northern Iroquoian language) that does not include a representation of
syntactic valence. If this analysis is correct, syntactic argument structure would not be
universal. It would, of course, be characteristic of a large number of languages, but it
would not be part of the topmost set. So this leaves us with just one candidate for the top-
2
In principle, there could be constraints that hold for Dutch and Danish, but not for German or for German
and Danish, but not for Dutch. These constraints would be removed from Set 1 and Set 2, respectively, and
inserted into another constraint set higher up in the hierarchy. These sets are not illustrated in the figure
and I keep the names Set 1 and Set 2 from Figure 23.3 for the constraint sets for German and Dutch.
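[Figure 23.5: constraint sets after adding English and French; of the diagram, only the
node Set 4 (SOV, VC) is preserved here]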
most set from the area of syntax: the constraints that license the combination of two or
more linguistic objects. This is basically Chomsky’s External Merge without the binarity
restriction.3 In addition, the topmost set would, of course, contain the basic machinery
for representing phonology and semantics.
It should be clear from what has been said so far that the goal of every scientist who
works this way is to find generalizations and to describe a new language in a way that
reuses theoretical constructs that have been found useful for a language that is already
covered. However, as was explained above, the resulting grammars should be motivated
by data of the respective languages and not by facts from other languages. In situations
where more than one analysis would be compatible with a given dataset for language X,
the evidence from language Y with similar constructs is most welcome and can be used
as evidence in favor of one of the two analyses for language X. I call this approach the
bottom-up approach with cheating: unless there is contradicting evidence, we can reuse
analyses that have been developed for other languages.
Note that this approach is compatible with the rather agnostic view advocated by
Haspelmath (2010a), Dryer (1997), Croft (2001: Section 1.4.2–1.4.3), and others, who ar-
gue that descriptive categories should be language-specific, that is, the notion of subject
for Tagalog is different from the one for English, the category noun in English is different
3
Note that binarity is more restrictive than flat structures: there is an additional constraint that there have
to be exactly two daughters. As was argued in Section 21.10.4 one needs phrasal constructions with more
than two constituents.
from the category noun in Persian and so on. Even if one follows such extreme positions,
one can still derive generalizations regarding constituent structure, head-argument re-
lations and so on. However, I believe that some categories can fruitfully be used cross-
linguistically; if not universally, then at least for language classes. As Newmeyer (2010:
692) notes with regard to the notion of subject: calling two items subject in one language
does not entail that they have identical properties. The same is true for two linguistic
items from different languages: calling a Persian linguistic item subject does not entail
that it has exactly the same properties as an English linguistic item that is called sub-
ject. The same is, of course, true for all other categories and relations, for instance, parts
of speech: Persian nouns do not share all properties with English nouns.4 Haspelmath
(2010c: 697) writes: “Generative linguists try to use as many crosslinguistic categories in
the description of individual languages as possible, and this often leads to insurmount-
able problems.” If the assumption of a category results in problems, they have to be
solved. If this is not possible with the given set of categories/features, new ones have
to be assumed. This is not a drawback of the methodology, quite the opposite is true: if
we have found something that does not integrate nicely into what we already have, this
is a sign that we have discovered something new and exciting. If we stick to language-
particular categories and features, it is much harder to notice that a special phenomenon
is involved, since all categories and features are specific to one language anyway. Note
also that not all speakers of a language community have exactly the same categories. If
one were to take the idea of language-particular category symbols to an extreme, one
would end up with person-specific category symbols like Klaus-English-noun.
After my talk at MIT in 2013, members of the linguistics department objected
to the approach taken in the CoreGram project and claimed that it would not make any
predictions as far as possible/impossible languages are concerned. Regarding predictions,
two things must be said: firstly, predictions are being made on a language-particular
basis. As an example, consider the following sentences from Netter (1991):
4
Note that using labels like Persian Noun and English Noun (see for instance Haspelmath 2010a: Section 2
for such a suggestion regarding case, e.g., Russian Dative, Korean Dative, …) is somehow strange since
it implies that both Persian nouns and English nouns are somehow nouns. Instead of using the category
Persian Noun one could assign objects of the respective class to the class noun and add a feature LANGUAGE
with the value persian. This simple trick allows one to assign both objects of the type Persian Noun and
objects of the type English Noun to the class noun and still maintain the fact that there are differences. Of
course, no theoretical linguist would introduce the LANGUAGE feature to differentiate between Persian and
English nouns, but nouns in the respective languages have other features that make them differ. So the
part of speech classification as noun is a generalization over nouns in various languages and the categories
Persian Noun and English Noun are feature bundles that contain further, language-specific information.
When I first read these sentences I had no idea about their structure. I switched on
my computer and typed them in and within milliseconds I got an analysis of the sen-
tences and by inspecting the result I realized that these sentences are combinations of
partial verb phrase fronting and the so-called third construction (Müller 1999b: 439). I
had previously implemented analyses of both phenomena but had never thought about
the interaction of the two. The grammar predicted that examples like (4) are grammatical.
Similarly, the constraints of the grammar can interact to rule out certain structures.
So predictions about ungrammaticality/impossible structures are in fact made as well.
Secondly, the topmost constraint set holds for all languages seen so far. It can be re-
garded as a hypothesis about properties that are shared by all languages. This constraint
set contains constraints about the connection between syntax and information struc-
ture and such constraints allow for V2 languages but rule out languages with the verb in
penultimate position (see Kayne 1994: 50 for the claim that such languages do not exist.
Kayne develops a complicated syntactic system that predicts this). Of course, if a lan-
guage is found that places the verb in penultimate position for the encoding of sentence
types or some other communicative effect, a more general topmost set has to be defined.
But the same holds for Minimalist theories: if languages are found that are incompatible
with basic assumptions, the basic assumptions have to be revised. As with the language-
particular constraints, the constraints in the topmost set make certain predictions about
what can be and what cannot be found in languages.
Frequently discussed examples such as those languages that form questions by revers-
ing the order of the words in a string (Haider 2015: 224; Musso et al. 2003) need not be
ruled out by the grammar, since they are ruled out by language-external constraints: we
simply lack the working memory to do such complex computations.
A variant of this argument comes from David Pesetsky and was raised in Facebook
discussions of an article by Paul Ibbotson and Michael Tomasello published in The Guardian.5
Pesetsky claimed that Tomasello’s theory of language acquisition could not explain
why we find V2 languages but no V3 languages. First, I do not know of anything that
blocks V3 languages in current Minimalist theories. So per se the fact that V3 languages
may not exist cannot be used to support any of the competing approaches. Of course,
the question could be asked whether the V3 pattern would be useful for reaching our
communicative goals and whether it can be easily acquired. Now, with V2 as a pattern
it is clear that we have exactly one position that can be used for special purposes in the
V2 sentence (topic or focus). For monovalent and bivalent verbs we have an argument
that can be placed in initial position. The situation is different for the hypothetical V3
languages, though: If we have monovalent verbs like sleep, there is nothing for the sec-
ond position. As Pesetsky pointed out in the answer to my comment on a blog post,
languages solve such problems by using expletives. For instance some languages insert
an expletive to mark subject extraction in embedded interrogative sentences, since oth-
erwise the fact that the subject is extracted would not be recognizable by the hearer. So
5
The roots of language: What makes us different from other animals? Published 2015-11-05.
http://www.theguardian.com/science/head-quarters/2015/nov/05/roots-language-what-makes-us-
different-animals, 2018/04/25.
the expletive helps to make the structure transparent. V2 languages also use expletives
to fill the initial position if speakers want to avoid something in the special, designated
position:
(5) Es kamen drei Männer zum Tor hinein.
EXPL came three men to.the gate in
‘Three men came through the gate.’
In order to do the same in V3 languages one would have to put two expletives in front of
the verb. So there seem to be many disadvantages of a V3 system that V2 systems do not
have and hence one would expect that V3 systems are less likely to come into existence.
If they existed, they would be expected to be subject to change in the course of time; e.g.,
omission of the expletive with intransitives, optional V2 with transitives and finally V2
in general. With the new modeling techniques for language acquisition and agent-based
community simulation one can actually simulate such processes and I guess in the years
to come, we will see exciting work in this area.
Cinque (1999: 106) suggested a cascade of functional projections to account for recurring
orderings in the languages of the world. He assumes elaborate tree structures to
play a role in the analysis of all sentences in all languages even if there is no evidence
for respective morphosyntactic distinctions in a particular language (see also Cinque &
Rizzi 2010: 55). In the latter case, Cinque assumes that the respective tree nodes are
empty. Cinque’s results could be incorporated in the model advocated here. We would
define part of speech categories and morpho-syntactic features in the topmost set and
state linearization constraints that enforce the order that Cinque encoded directly in
his tree structure. In languages in which such categories are not manifested by lexical
material, the constraints would never apply. Neither empty elements nor elaborate tree
structures would be needed. Thus Cinque’s data could be covered in a better way in
an HPSG with a rich UG but I, nevertheless, refrain from introducing 400 categories (or
features) into the theories of all languages and, again, I point out that such a rich and
language-specific UG is implausible from a genetic point of view. Therefore, I wait for
other, probably functional, explanations of the Cinque data.
Note also that implicational universals can be derived from hierarchically organized
constraint sets such as the ones proposed here. For instance, one can derive from Figure 23.5
the implicational statement that all SOV languages are V2 languages, since there is no
language that has constraints from Set 4 that does not also have the constraints of Set 7.
Of course, this implicational statement is wrong, since there are lots and lots of SOV lan-
guages and just exceptionally few V2 languages. So, as soon as we add other languages,
such as Persian or Japanese, the picture will change.
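How an implication from SOV (Set 4) to V2 falls out of a small language sample, and how it disappears once further languages are added, can be sketched as follows. The property inventories are deliberately simplified assumptions made for illustration (two properties per language at most), and the function name implications is hypothetical.

```python
from itertools import permutations


def implications(grammars):
    """Return pairs (a, b) such that every sampled language with a also has b."""
    properties = set().union(*grammars.values())
    found = set()
    for a, b in permutations(sorted(properties), 2):
        with_a = [props for props in grammars.values() if a in props]
        if with_a and all(b in props for props in with_a):
            found.add((a, b))
    return found


# Simplified property inventories (an assumption for illustration only).
SAMPLE = {
    "German": {"sov", "v2"},
    "Dutch": {"sov", "v2"},
    "Danish": {"svo", "v2"},
    "English": {"svo"},
    "French": {"svo"},
}

# Adding SOV languages without V2 removes the spurious implication.
LARGER = dict(SAMPLE, Persian={"sov"}, Japanese={"sov"})

if __name__ == "__main__":
    print(("sov", "v2") in implications(SAMPLE))  # True for the small sample
    print(("sov", "v2") in implications(LARGER))  # False for the larger sample
```

The sketch only shows that such implications are artifacts of the sample as long as the sample is small; it makes no claim about how the constraint sets of the CoreGram grammars are actually stored.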
The methodology suggested here differs from what is done in MGG, since MGG stip-
ulates the general constraints that are supposed to hold for all languages on the basis
of general speculations about language. In the best case, these general assumptions are
fed by a lot of experience with different languages and grammars, in the worst case they
are derived from insights gathered from one or more Indo-European languages. Quite
often impressionistic data is used to motivate rather far-reaching fundamental design de-
cisions (Fanselow 2009; Sternefeld & Richter 2012; Haider 2014). It is interesting to note
that this is exactly what members of the MGG camp reproach typologists for. Evans &
Levinson (2009a) pointed out that counterexamples can be found for many alleged uni-
versals. A frequent response to this is that unanalyzed data cannot refute grammatical
hypotheses (see, for instance, Freidin 2009: 454). In the very same way it has to be said
that unanalyzed data should not be used to build theories on (Fanselow 2009). In the
CoreGram project, we aim to develop broad-coverage grammars of several languages,
so those constraints that make it to the top node are motivated and not stipulated on the
basis of intuitive implicit knowledge about language.
Since it is data-oriented and does not presuppose innate language-specific knowledge,
this research strategy is compatible with work carried out in Construction Grammar (see
Goldberg 2013b: 481 for an explicit statement to this end) and in any case it should also
be compatible with the Minimalist world.
24 Conclusion
The analyses discussed in this book show a number of similarities. All frameworks use
complex categories to describe linguistic objects. This is most obvious for GPSG, LFG,
HPSG, CxG and FTAG; however, GB/Minimalism and Categorial Grammar also talk
about NPs in third person singular and the relevant features for part of speech, person
and number form part of a complex category. In GB, there are the features N and V with
binary values (Chomsky 1970: 199), Stabler (1992: 119) formalizes Barriers with feature-
value pairs and Sauerland & Elbourne (2002: 290–291) argue for the use of feature-value
pairs in a Minimalist theory. Also, see Veenstra (1998: ) for a constraint-based formal-
ization of a Minimalist analysis using typed feature descriptions. Dependency Grammar
dialects like Hellwig’s Dependency Unification Grammar also use feature-value pairs
(Hellwig 2003: 612).
Furthermore, there is a consensus in all current frameworks (with the exception of
Construction Grammar and Dependency Grammar) about how the sentence structure of
German should be analyzed: German is an SOV and V2 language. Clauses with verb-ini-
tial order resemble verb-final ones in terms of structure. The finite verb is either moved
(GB) or stands in a relation to an element in verb-final position (HPSG). Verb-second
clauses consist of verb-initial clauses out of which one constituent has been extracted.
It is also possible to see some convergence with regard to the analysis of the passive:
some ideas originally formulated by Haider (1984; 1985b; 1986a) in the framework of GB
have been adopted by HPSG. Some variants of Construction Grammar also make use of
a specially marked ‘designated argument’ (Michaelis & Ruppenhofer 2001: 55–57).
If we consider new developments in the individual frameworks, it becomes clear that
the nature of the proposed analyses can sometimes differ drastically. Whereas CG, LFG,
HPSG and CxG are surface-oriented, sometimes very abstract structures are assumed
in Minimalism and in some cases, one tries to trace all languages back to a common
base structure (Universal Base Hypothesis).1 This kind of approach only makes sense if
one assumes that there is innate linguistic knowledge about this base structure common
to all languages as well as about the operations necessary to derive the surface struc-
tures. As was shown in Chapter 13, all arguments for the assumption of innate linguistic
knowledge are either not tenable or controversial at the very least. The acquisition of lin-
guistic abilities can to a large extent receive an input-based explanation (Section 13.8.3,
Section 16.3 and Section 16.4). Not all questions about acquisition have been settled once
and for all, but input-based approaches are at least plausible enough for one to be very
cautious about any assumption of innate linguistic knowledge.
1
It should be noted that there are currently many subvariants and individual opinions in the Minimalist
community so that it is only possible – as with CxG – to talk about tendencies.
Models such as LFG, CG, HPSG, CxG and TAG are compatible with performance data,
something that is not true of certain transformation-based approaches, which are viewed
as theories of competence that do not make any claims about performance. In MGG, it
is assumed that there are other mechanisms for working with linguistic knowledge, for
example, mechanisms that combine ‘chunks’ (fragments of linguistic material). If one
wishes to make these assumptions, then it is necessary to explain how chunks and the
processing of chunks are acquired and not how a complex system of transformations
and transformation-comparing constraints is acquired. This means that the problem
of language acquisition would be a very different one. If one assumes a chunk-based
approach, then the innate knowledge about a universal transformational base would
only be used to derive a surface-oriented grammar. This then poses the question of what
exactly the evidence for transformations in a competence grammar is and if it would not
be preferable to simply assume that the competence grammar is of the kind assumed by
LFG, CG, HPSG, CxG or TAG. One can therefore conclude that constraint-based analyses
and the kind of transformational approaches that allow a constraint-based reformulation
are the only approaches that are compatible with the current facts, whereas all other
analyses require additional assumptions.
A number of works in Minimalism differ from those in other frameworks in that they
assume structures (sometimes also invisible structures) that can only be motivated by
evidence from other languages. This can streamline the entire apparatus for deriving
different structures, but the overall costs of the approach are not reduced: some amount
of the cost is just transferred to the UG component. The abstract grammars that result
cannot be learned from the input.
One can take from this discussion that only constraint-based, surface-oriented models
are adequate and explanatory: they are also compatible with psycholinguistic facts and
plausible from the point of view of acquisition.
If we now compare these approaches, we see that a number of analyses can be trans-
lated into one another. LFG (and some variants of CxG and DG) differ from all other
theories in that grammatical functions such as subject and object are primitives of the
theory. If one does not want this, then it is possible to replace these labels with Argu-
ment1, Argument2, etc. The numbering of arguments would correspond to their relative
obliqueness. LFG would then move closer to HPSG. Alternatively, one could additionally mark
arguments in HPSG and CxG with regard to their grammatical function. This is
what is done for the analysis of the passive (DESIGNATED ARGUMENT).
LFG, HPSG, CxG and variants of Categorial Grammar (Moens et al. 1989; Briscoe 2000;
Villavicencio 2002) possess means for the hierarchical organization of knowledge, which
is important for capturing generalizations. It is, of course, possible to expand any other
framework in this way, but this has never been done explicitly, except in computer im-
plementations, and inheritance hierarchies do not play an active role in theorizing in the
other frameworks.
In HPSG and CxG, roots, stems, words, morphological and syntactic rules are all ob-
jects that can be described with the same means. This then allows one to make gen-
eralizations that affect very different objects (see Chapter 23). In LFG, c-structures are
viewed as something fundamentally different, which is why this kind of generalization
is not possible. In cross-linguistic work, there is an attempt to capture similarities in
the f-structure; the c-structure is less important and is not even discussed in a number
of works. Furthermore, its implementation from language to language can differ
enormously. For this reason, my personal preference is for frameworks that describe
all linguistic objects using the same means, that is, HPSG and CxG. Formally, nothing
stands in the way of a description of the c-structure of an LFG grammar using feature-
value pairs so that in years to come there could be even more convergence between the
theories. For hybrid forms of HPSG and LFG, see Ackerman & Webelhuth (1998) and
Hellan & Haugereid (2003), for example.
If one compares CxG and HPSG, it becomes apparent that the degree of formaliza-
tion in CxG works is relatively low and a number of questions remain unanswered. The
more formal approaches in CxG (with the exception of Fluid Construction Grammar) are
variants of HPSG. There are relatively few precisely worked-out analyses in Construc-
tion Grammar and no description of German that would be comparable to the other
approaches presented in this book. To be fair, it must be said that Construction Gram-
mar is the youngest of the theories discussed here. Its most important contributions to
linguistic theory have been integrated into frameworks such as HPSG and LFG.
The theories of the future will be a fusion of surface-oriented, constraint-based and
model-theoretic approaches like CG, LFG, HPSG, Construction Grammar, equivalent
variants of TAG and GB/Minimalist approaches that will be reformulated as constraint-
based. (Variants of) Minimalism and (variants of) Construction Grammar are the most
widely adopted approaches at present. I actually suspect the truth to lie somewhere in
the middle. The linguistics of the future will be data-oriented. Introspection as the sole
method of data collection has proven unreliable (Müller 2007c; Meurers & Müller 2009)
and is being increasingly complemented by experimental and corpus-based analyses.
Statistical information and statistical processes play a very important role in machine
translation and are becoming more important for linguistics in the narrow sense (Abney
1996). We have seen that statistical information is important in the acquisition process
and Abney discusses cases of other areas of language such as language change, parsing
preferences and gradience with grammaticality judgments. Following a heavy focus
on statistical procedures, there is now a transition to hybrid forms in computational
linguistics,2 since it has been noticed that it is not possible to exceed certain levels of
quality with statistical methods alone (Steedman 2011; Church 2011; Kay 2011). The same
holds here as above: the truth is somewhere in between, that is, in combined systems.
In order to have something to combine, the relevant linguistic theories first need to be
developed. As Manfred Pinkal said: “It is not possible to build systems that understand
language without understanding language.”
2
See Kaufmann & Pfister (2007) and Kaufmann (2009) for the combination of a speech recognizer with an
HPSG grammar.
Appendix A: Solutions to the exercises
A.1 Introduction and basic terms
(1) a. [VF Karl] [LS isst].
    b. [VF Der Mann] [LS liebt] [MF eine Frau], [NF [VF den] [MF Peter] [RS kennt]].
    c. [VF Der Mann] [LS liebt] [MF eine Frau, [VF die] [MF Peter] [RS kennt]].
    d. [VF Die Studenten] [LS haben] [RS behauptet], [NF [MF nur wegen der Hitze] [RS einzuschlafen]].
    e. [VF [LS Dass] [MF Peter nicht] [RS kommt]], [LS ärgert] [MF Klaus].
    f. [VF [MF Einen Mann] [RS küssen], [NF [VF der] [MF ihr nicht] [RS gefällt]]], [LS würde] [MF sie nie].
On (1c): theoretically, this could also be a case of extraposition of the relative clause to
the postfield. Since eine Frau, die Peter kennt is a constituent, however, it is assumed that
no reordering of the relative clause has taken place. Instead, we have a simpler structure
with eine Frau, die Peter kennt as a complete NP in the middle field.
A Solutions to the exercises
2. In general, it is assumed that the grammar with the fewest rules is the best one.
Therefore, we can reject grammars that contain unnecessary rules such as (2).
One should bear in mind what the aim of a theory of grammar is. If our goal is
to describe the human language capacity, then a grammar with more rules could
be better than other grammars with fewer rules. This is because psycholinguistic
research has shown that highly frequent units are simply stored in our brains and
not built up from their individual parts each time, although we would of course
be able to do this.
3. The problem here is the fact that it is possible to derive a completely empty noun
phrase (see Figure A.1). This noun phrase could be inserted in all positions where
[Figure A.1: NP with the daughters Det and N, both of which are empty (_)]
This problem can be solved using a feature that determines whether the left pe-
riphery of the N is empty. Visible Ns and N with at least an adjective would have
the value ‘–’ and all others ‘+’. Empty determiners could then only be combined
with Ns that have the value ‘–’. See Netter (1994).
A.2 Phrase structure grammars
7. Adjective phrases such as those in (7) cannot be analyzed since the degree modifier
occurs between the complement and the adjective:
One would either have to allow for specifiers to be combined with their heads
before complements or allow crossing lines in trees. Another assumption could
be that German is like English, however then adjectival complements would have
to be obligatorily reordered before their specifier. For a description of this kind of
reordering, see Chapter 3. See Section 13.1.2 for a discussion of X-bar theory.
8. Write a phrase structure grammar that can analyze the sentences in (8), but does
not allow the strings of words in (9).
In order to rule out the last two sentences, the grammar has to contain information
about case. The following grammar will do the job:
A.3 Transformational Grammar – Government & Binding
i. np(nom) → er
j. np(dat) → ihr
k. d(nom) → der
l. d(dat) → der
m. d(acc) → das
n. d(acc) → ein
o. n(nom) → Mann
p. n(dat) → Frau
q. n(acc) → Buch
r. n(acc) → Wunder
s. p(auf,acc) → auf
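The effect of the case information in the lexical rules i. to r. can be illustrated with a small Python sketch. It assumes a hypothetical rule np(C) → d(C) n(C) that shares the case value between determiner and noun (the phrasal rules themselves are not repeated here), and the function name np_cases is likewise only chosen for this illustration.

```python
# Lexical rules k.-r. from the grammar above, stored as word -> set of cases.
DETERMINERS = {"der": {"nom", "dat"}, "das": {"acc"}, "ein": {"acc"}}
NOUNS = {"Mann": {"nom"}, "Frau": {"dat"}, "Buch": {"acc"}, "Wunder": {"acc"}}
# Rules i. and j.: pronouns are complete NPs.
PRONOUNS = {"er": {"nom"}, "ihr": {"dat"}}


def np_cases(words):
    """Return the cases for which the word sequence is licensed as an NP.

    Assumes the hypothetical rule np(C) -> d(C) n(C): determiner and noun
    must share the case value C.
    """
    if len(words) == 1:
        return set(PRONOUNS.get(words[0], set()))
    if len(words) == 2:
        det, noun = words
        return DETERMINERS.get(det, set()) & NOUNS.get(noun, set())
    return set()


if __name__ == "__main__":
    for candidate in (["der", "Mann"], ["der", "Frau"], ["das", "Mann"], ["er"]):
        print(candidate, sorted(np_cases(candidate)))
```

An empty result means that no case value is compatible with both words, so the sequence is not licensed as an NP by this fragment.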
[Two tree diagrams, each with the node layers CP; C′; C0 IP; NP I′; VP I0; V′; NP V0,
over the following strings:]
dass die Frau den Mann _𝑗 lieb-𝑗 -t
that the woman the man love- -s
dass der Mann𝑖 _𝑖 geliebt _𝑗 wir-𝑗 -d
that the man loved is
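[Two tree diagrams; the first has the node layers CP; NP C′; C0 IP; NP I′; VP I0; V′; NP V0,
the second the layers CP; C′; C0 IP; NP I′; VP I0; V′; NP V0:]
der Mann𝑖 (wir-𝑗 -d)𝑘 _𝑖 _𝑖 geliebt _𝑗 _𝑘
the man is loved
dass der Mann der Frau _𝑗 hilf-𝑗 -t
that the man the woman help- -s
[A further tree diagram with the node layers CP; NP C′; C0 IP; NP I′; VP I0; V′; NP V0;
its terminal string is not preserved.]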
A.4 Generalized Phrase Structure Grammar
The rules (12b,c) correspond to X-bar rules that we encountered in Section 2.4.1. They only
differ from these rules in that the part of speech of the head is not given on the right-
hand side of the rule. The part of speech is determined by the Head Feature Convention.
The part of speech of the head is identical to that on the left-hand side of the rule, that
is, it must be N in (12b,c). It also follows from the Head Feature Convention that the
whole NP has the same case as the head and therefore does not have to be mentioned
additionally in the rule above. 27 is the SUBCAT value. This number is arbitrary.
In order for the verb to appear in the correct position, we need linearization rules:
(14) V[+MC] < X
X < V[−MC]
The fact that the determiner precedes the noun is ensured by the following LP-rule:
(15) Det < X
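What the LP rules in (14) and (15) admit can be made explicit with a minimal sketch. The representation of daughters as dictionaries with the keys cat and mc, as well as the function names, are assumptions made for this illustration only; an LP rule A < B is read as: whenever A and B are sisters, A precedes B.

```python
def pair_ok(left, right):
    """Check one ordered pair of sisters against the LP rules (14) and (15)."""
    # X < V[-MC]: a verb marked -MC may not precede any sister.
    if left["cat"] == "V" and left.get("mc") is False:
        return False
    # V[+MC] < X: a verb marked +MC may not follow any sister.
    if right["cat"] == "V" and right.get("mc") is True:
        return False
    # Det < X: a determiner may not follow any sister.
    if right["cat"] == "Det":
        return False
    return True


def lp_admissible(daughters):
    """True if every ordered pair of daughters respects the LP rules."""
    return all(
        pair_ok(daughters[i], daughters[j])
        for i in range(len(daughters))
        for j in range(i + 1, len(daughters))
    )


if __name__ == "__main__":
    np = {"cat": "NP"}
    print(lp_admissible([{"cat": "V", "mc": True}, np, np]))   # verb-initial order
    print(lp_admissible([np, np, {"cat": "V", "mc": False}]))  # verb-final order
    print(lp_admissible([np, {"cat": "V", "mc": True}]))       # excluded order
```

The sketch checks local trees only, which is how LP rules apply in GPSG: they constrain the order of sisters, not the order of arbitrary words in the sentence.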
V3[+FIN, +MC]
grammar that licenses the sentences in (11) should have (at least) the following parts:
1. ID rules:
2. LP rules:
A.5 Feature descriptions
3. Metarules:
(22) V3 → W, X ↦
     V3/X → W
4. Lexical entries
p-o-s
2. Lists can be described using recursive structures that consist of both a list begin-
ning and a rest. The rest can either be a non-empty list (ne_list) or the empty list
(e_list). The list ⟨ a, b, c ⟩ can be represented as follows:
(24) [ ne_list
       FIRST a
       REST  [ ne_list
               FIRST b
               REST  [ ne_list
                       FIRST c
                       REST  e_list ] ] ]
(25) [ diff-list
       LIST [ ne_list
              FIRST a
              REST  [ ne_list
                      FIRST b
                      REST  [1] list ] ]
       LAST [1] ]
Unlike the list representation in (24), the REST value of the end of the list is not
e_list, but rather simply list. It is then possible to extend a list by adding another
list to the point where it ends. The concatenation of (25) and (26a) is (26b).
(26) a. [ diff-list
          LIST [ ne_list
                 FIRST c
                 REST  [2] list ]
          LAST [2] ]
     b. [ diff-list
          LIST [ ne_list
                 FIRST a
                 REST  [ ne_list
                         FIRST b
                         REST  [ ne_list
                                 FIRST c
                                 REST  [2] list ] ] ]
          LAST [2] ]
In order to combine the lists, the LIST value of the second list has to be identi-
fied with the LAST value of the first list. The LAST value of the resulting list then
corresponds to the LAST value of the second list (the tag 2 in the example).
Information about the encoding of difference lists can be found by searching for
the keywords list, append, and feature structure. In the search results, one can find
pages on developing grammars that explain difference lists.
A.6 Lexical Functional Grammar
The analysis is parallel to the analysis in Figure 7.5 on page 242. The difference is
that the object is in the dative and not in the accusative. The respective grammat-
ical function is OBJ𝜃 rather than OBJ.
The necessary c-structure rules are given in (29):
(29) a. VP →  NP                        VP
              (↑SUBJ|OBJ|OBJθ) = ↓      ↑ = ↓
     b. VP →  (V)
              ↑ = ↓
     c. C′ →  C        VP
              ↑ = ↓    ↑ = ↓
     d. CP →  XP                        C′
              (↑DF) = ↓                 ↑ = ↓
              (↑DF) = (↑COMP* GF)
These rules allow two f-structures for the example in question: one in which the
NP dem Kind ‘the child’ is the topic and another in which this NP is the focus.
Figure A.3 shows the analysis with a topicalized constituent in the prefield.
[Figure A.3: c-structure and f-structure of the analysis with a topicalized constituent.
The f-structure is:
  PRED   ‘VERSCHLINGEN⟨SUBJ, OBJθ⟩’
  SUBJ   [ PRED ‘SANDY’, CASE nom ]
  TENSE  PRES
  TOPIC  [ PRED ‘Kind’, CASE dat ]
  OBJθ   (structure-shared with TOPIC) ]
Figure A.4: Categorial Grammar analysis of The children in the room laugh loudly.
2. The analysis of the picture of Mary is given in Figure A.5. n/pp corresponds to N⁰,
n corresponds to N′ and np corresponds to NP.
A.8 Head-Driven Phrase Structure Grammar
2. An analysis of the difference in (30) has to capture the fact that the case of the ad-
jective has to agree with that of the noun. In (30a), the genitive form of interessant
‘interesting’ is used, whereas (30b) contains a form that is incompatible with the
genitive singular.
The structure sharing of the case value of the adjective with the case value of the
N under MOD identifies the case values of the noun and the adjective. interessanten
can therefore be combined with Mannes, but not with Mann. Similarly, interessan-
ter can only be combined with the nominative Mann, but not with the genitive
Mannes.
For a refinement of the analysis of agreement inside the noun phrase, see Müller
(2007a: Section 13.2).
1
http://idioms.thefreedictionary.com/, 2018-02-20.
A.10 Dependency Grammar
einen Mann getroffen _ der blonde Haare hat habe ich noch nie
a man met who blond hair has have I yet never
A.11 Tree Adjoining Grammar
The elementary trees in bracket notation (↓ marks a substitution node, * marks the foot
node):
der ‘the’:        [Det der]
dem ‘the’:        [Det dem]
König ‘king’:     [NP Det↓ [N′ [N König]]]
Diener ‘servant’: [NP Det↓ [N′ [N Diener]]]
treue ‘loyal’:    [N′ [AP [A′ NP↓ [A treue]]] N′*]
Figure A.6: Elementary trees for der dem König treue Diener ‘the servant loyal to the
king’
By substituting the tree for dem ‘the’ in the substitution node of König ‘king’, one then
arrives at a full NP. This can then be inserted into the substitution node of treue ‘loyal’.
Similarly, the tree for der ‘the’ can be combined with the one for Diener. One then has
both of the trees in Figure A.7.
[N′ [AP [A′ [NP [Det dem] [N′ [N König]]] [A treue]]] N′*]
[NP [Det der] [N′ [N Diener]]]
Figure A.7: Trees for der dem König treue and der Diener ‘the servant loyal to the king’
after substitution
The adjective tree can then be adjoined to the N′-node of der Diener, which yields the
structure in Figure A.8 on the next page.
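The two operations used in this derivation, substitution and adjunction, can be sketched
as follows. The sketch is a simplified illustration and not a TAG parser: the class Node
and the functions substitute, adjoin and words are hypothetical names, and the trees are
entered by hand to match the elementary trees for der, dem, König, Diener and treue.

```python
# Sketch of TAG substitution and adjunction on simple tree objects.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None
    subst: bool = False   # substitution node (marked with a down arrow)
    foot: bool = False    # foot node of an auxiliary tree (marked with *)

def substitute(tree, initial):
    """Replace the leftmost substitution node whose label matches the root of initial."""
    for i, child in enumerate(tree.children):
        if child.subst and child.label == initial.label:
            tree.children[i] = initial
            return True
        if substitute(child, initial):
            return True
    return False

def find_foot(tree):
    """Return the foot node of an auxiliary tree, or None if there is none."""
    if tree.foot:
        return tree
    for child in tree.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None

def adjoin(target, aux):
    """Adjoin the auxiliary tree aux at the node target (all labels must match)."""
    foot = find_foot(aux)
    assert foot is not None and foot.label == target.label == aux.label
    # The material below target moves below the foot node ...
    foot.children, foot.word, foot.foot = target.children, target.word, False
    # ... and target takes over the daughters of the auxiliary tree's root.
    target.children, target.word = aux.children, aux.word

def words(tree):
    """Read off the terminal string from left to right."""
    if tree.word is not None:
        return [tree.word]
    return [w for child in tree.children for w in words(child)]

der = Node("Det", word="der")
dem = Node("Det", word="dem")
koenig = Node("NP", [Node("Det", subst=True), Node("N′", [Node("N", word="König")])])
diener = Node("NP", [Node("Det", subst=True), Node("N′", [Node("N", word="Diener")])])
treue = Node("N′", [Node("AP", [Node("A′", [Node("NP", subst=True),
                                            Node("A", word="treue")])]),
                    Node("N′", foot=True)])

substitute(koenig, dem)        # dem König
substitute(treue, koenig)      # dem König treue
substitute(diener, der)        # der Diener
adjoin(diener.children[1], treue)
print(" ".join(words(diener)))
```

The N′ node of der Diener is addressed directly as diener.children[1]; a fuller
implementation would select adjunction sites by searching the tree.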
Figure A.8 (in bracket notation):
[NP [Det der] [N′ [AP [A′ [NP [Det dem] [N′ [N König]]] [A treue]]] [N′ [N Diener]]]]
References
Abbott, Barbara. 1976. Right node raising as a test for constituenthood. Linguistic Inquiry
7(4). 639–642.
Abeillé, Anne. 1988. Parsing French with Tree Adjoining Grammar: Some linguistic ac-
counts. In Dénes Vargha (ed.), Proceedings of COLING 88, 7–12. University of Budapest:
Association for Computational Linguistics. http://www.aclweb.org/anthology/C/C88/
C88-1002.pdf (18 August, 2020).
Abeillé, Anne. 2006. In defense of lexical coordination. In Olivier Bonami & Patricia
Cabredo Hofherr (eds.), Empirical issues in formal syntax and semantics, vol. 6, 7–36.
Paris: CNRS. http://www.cssp.cnrs.fr/eiss6/ (18 August, 2020).
Abeillé, Anne & Rui P. Chaves. 2020. Coordination. In Stefan Müller, Anne Abeillé,
Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Gram-
mar: The handbook (Empirically Oriented Theoretical Morphology and Syntax). To
appear. Berlin: Language Science Press.
Abeillé, Anne & Danièle Godard. 2002. The syntactic structure of French auxiliaries.
Language 78(3). 404–452.
Abeillé, Anne & Owen Rambow (eds.). 2000. Tree Adjoining Grammars: formalisms, lin-
guistic analysis and processing (CSLI Lecture Notes 156). Stanford, CA: CSLI Publica-
tions.
Abeillé, Anne & Yves Schabes. 1989. Parsing idioms in Lexicalized TAG. In Harold Somers
& Mary McGee Wood (eds.), Proceedings of the Fourth Conference of the European Chap-
ter of the Association for Computational Linguistics, 1–9. Manchester, England: Associ-
ation for Computational Linguistics.
Abney, Steven P. 1987. The English noun phrase in its sentential aspect. Cambridge, MA:
MIT. (Doctoral dissertation). http://www.vinartus.net/spa/87a.pdf (18 August, 2020).
Abney, Steven P. 1996. Statistical methods and linguistics. In Judith L. Klavans & Philip
Resnik (eds.), The balancing act: combining symbolic and statistical approaches to lan-
guage (Language, Speech, and Communication), 1–26. London, England/Cambridge,
MA: MIT Press.
Abney, Steven P. & Jennifer Cole. 1986. A Government-Binding parser. In S. Berman,
J-W. Choe & J. McDonough (eds.), Proceedings of NELS 16, 1–17. University of Mas-
sachusetts, Amherst: GLSA.
Abraham, Werner. 1995. Deutsche Syntax im Sprachenvergleich: Grundlegung einer typolo-
gischen Syntax des Deutschen (Studien zur deutschen Grammatik 41). Tübingen: Stauf-
fenburg Verlag.
Abraham, Werner. 2003. The syntactic link between thema and rhema: The syntax-
discourse interface. Folia Linguistica 37(1–2). 13–34.
CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/10/ (18 August,
2020).
Altmann, Hans & Ute Hofmann. 2004. Topologie fürs Examen: Verbstellung, Klammerstruktur,
Stellungsfelder, Satzglied- und Wortstellung (Linguistik fürs Examen 4). Wiesbaden:
VS Verlag für Sozialwissenschaften/GWV Fachverlage GmbH.
Ambridge, Ben & Adele E. Goldberg. 2008. The island status of clausal complements:
Evidence in favor of an information structure explanation. Cognitive Linguistics 19.
349–381.
Ambridge, Ben & Elena V. M. Lieven. 2011. Child language acquisition: Contrasting theo-
retical approaches. Cambridge, UK: Cambridge University Press.
Ambridge, Ben, Caroline F. Rowland & Julian M. Pine. 2008. Is structure dependence
an innate constraint? New experimental evidence from children’s complex-question
production. Cognitive Science: A Multidisciplinary Journal 32(1). 222–255.
Anderson, John M. 1971. The grammar of case: Towards a localistic theory. Vol. 4 (Cam-
bridge Studies in Linguistics). Cambridge, UK: Cambridge University Press.
Anderson, Stephen R. 1992. A-morphous morphology (Cambridge Studies in Linguistics
62). Cambridge: Cambridge University Press.
Aoun, Joseph & David W. Lightfoot. 1984. Government and contraction. Linguistic In-
quiry 15(3). 465–473.
Aoun, Joseph & Dominique Sportiche. 1983. On the formal theory of government. The
Linguistic Review 2(3). 211–236.
Arad Greshler, Tali, Livnat Herzig Sheinfux, Nurit Melnik & Shuly Wintner. 2015. Devel-
opment of maximally reusable grammars: Parallel development of Hebrew and Arabic
grammars. In Stefan Müller (ed.), Proceedings of the 22nd International Conference on
Head-Driven Phrase Structure Grammar, Nanyang Technological University (NTU), Sin-
gapore, 27–40. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/
HPSG/2015/ahmw.pdf (18 August, 2020).
Arends, Jacques. 2008. A demographic perspective on Creole formation. In Silvia
Kouwenberg & John Victor Singler (eds.), The handbook of pidgin and creole studies,
309–331. Oxford/Cambridge: Blackwell Publishers Ltd.
Arka, I Wayan, Avery Andrews, Mary Dalrymple, Meladel Mistica & Jane Simpson. 2009.
A linguistic and computational morphosyntactic analysis for the applicative -i in In-
donesian. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2009
conference, 85–105. Stanford, CA: CSLI Publications. http://csli-publications.stanford.
edu/LFG/14/ (18 August, 2020).
Arnold, Doug & Andrew Spencer. 2015. A constructional analysis for the skeptical. In
Stefan Müller (ed.), Proceedings of the 22nd International Conference on Head-Driven
Phrase Structure Grammar, Nanyang Technological University (NTU), Singapore, 41–
60. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/HPSG/2015/
arnold-spencer.pdf (18 August, 2020).
Arnold, Jennifer E., Michael K. Tanenhaus, Rebecca J. Altmann & Maria Fagnano. 2004.
The old and thee, uh, new. Psychological Science 15(9). 578–582.
Baldridge, Jason, Sudipta Chatterjee, Alexis Palmer & Ben Wing. 2007. DotCCG and
VisCCG: Wiki and programming paradigms for improved grammar engineering with
OpenCCG. In Tracy Holloway King & Emily M. Bender (eds.), Grammar Engineering
across Frameworks 2007 (Studies in Computational Linguistics ONLINE), 5–25. Stan-
ford, CA: CSLI Publications. http://csli-publications.stanford.edu/GEAF/2007/ (18 Au-
gust, 2020).
Baldridge, Jason & Geert-Jan M. Kruijff. 2002. Coupling CCG and Hybrid Logic De-
pendency Semantics. In Pierre Isabelle (ed.), 40th Annual Meeting of the Association
for Computational Linguistics: Proceedings of the conference, 319–326. University of
Pennsylvania, Philadelphia: Association for Computational Linguistics.
DOI: 10.3115/1073083.1073137. https://www.aclweb.org/anthology/events/acl-2002/
(18 August, 2020).
Ballweg, Joachim. 1997. Stellungsregularitäten in der Nominalphrase. In Hans-Werner
Eroms, Gerhard Stickel & Gisela Zifonun (eds.), Grammatik der deutschen Sprache,
vol. 7.3 (Schriften des Instituts für deutsche Sprache), 2062–2072. Berlin: Walter de
Gruyter.
Baltin, Mark. 1981. Strict bounding. In Carl Lee Baker & John J. McCarthy (eds.), The
logical problem of language acquisition, 257–295. Cambridge, MA: MIT Press.
Baltin, Mark. 2004. Remarks on the relation between language typology and Universal
Grammar: Commentary on Newmeyer. Studies in Language 28(3). 549–553.
Baltin, Mark. 2017. Extraposition. In Martin Everaert & Henk van Riemsdijk (eds.), The
Blackwell companion to syntax, 2nd edn. (Blackwell Handbooks in Linguistics), 237–
271. Oxford: Blackwell Publishers Ltd. DOI: 10.1002/9781118358733.
Bangalore, Srinivas, Aravind K. Joshi & Owen Rambow. 2003. Dependency and valency
in other theories: Tree Adjoining Grammar. In Vilmos Ágel, Ludwig M. Eichinger,
Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer & Henning Lobin (eds.),
Dependenz und Valenz / Dependency and valency: Ein internationales Handbuch der
zeitgenössischen Forschung / An international handbook of contemporary research, vol. 1
(Handbücher zur Sprach- und Kommunikationswissenschaft 25), 669–678. Berlin: Wal-
ter de Gruyter.
Bannard, Colin, Elena Lieven & Michael Tomasello. 2009. Modeling children’s early
grammatical knowledge. Proceedings of the National Academy of Sciences 106(41).
17284–17289.
Bar-Hillel, Yehoshua, Micha A. Perles & Eliahu Shamir. 1961. On formal properties of sim-
ple phrase-structure grammars. Zeitschrift für Phonetik, Sprachwissenschaft und Kom-
munikationsforschung 14(2). 143–172.
Bargmann, Sascha. 2015. Syntactically flexible VP-idioms and the N-after-N Construction.
Poster presentation at the 5th General Meeting of PARSEME, Iasi, 23–24 September
2015.
Bartsch, Renate & Theo Vennemann. 1972. Semantic structures: A study in the rela-
tion between semantics and syntax (Athenäum-Skripten Linguistik 9). Frankfurt/Main:
Athenäum.
Barwise, Jon & John Perry. 1983. Situations and attitudes. Cambridge, MA: MIT Press.
Barwise, Jon & John Perry. 1987. Situationen und Einstellungen – Grundlagen der Situa-
tionssemantik. Berlin, New York: de Gruyter.
Baschung, K., G. G. Bes, A. Corluy & T. Guillotin. 1987. Auxiliaries and clitics in French
UCG grammar. In Bente Maegaard (ed.), Proceedings of the Third Conference of the Eu-
ropean Chapter of the Association for Computational Linguistics, 173–178. Copenhagen,
Denmark: Association for Computational Linguistics.
Bates, Elizabeth A. 1984. Bioprograms and the innateness hypothesis. The Behavioral and
Brain Sciences 7(2). 188–190.
Baumgärtner, Klaus. 1965. Spracherklärung mit den Mitteln der Abhängigkeitsstruktur.
Beiträge zur Sprachkunde und Informationsverarbeitung 5. 31–53.
Baumgärtner, Klaus. 1970. Konstituenz und Dependenz: Zur Integration beider gramma-
tischer Prinzipien. In Hugo Steger (ed.), Vorschläge für eine strukturelle Grammatik des
Deutschen (Wege der Forschung 144), 52–77. Darmstadt: Wissenschaftliche Buchge-
sellschaft.
Bausewein, Karin. 1990. Haben kopflose Relativsätze tatsächlich keine Köpfe? In Gisbert
Fanselow & Sascha W. Felix (eds.), Strukturen und Merkmale syntaktischer Kategorien
(Studien zur deutschen Grammatik 39), 144–158. Tübingen: originally Gunter Narr
Verlag now Stauffenburg Verlag.
Bayer, Josef & Jaklin Kornfilt. 1989. Restructuring effects in German. DYANA Report. Uni-
versity of Edinburgh.
Beavers, John. 2003. A CCG implementation for the LKB. LinGO Working Paper 2002-08.
Stanford, CA: CSLI Stanford.
Beavers, John. 2004. Type-inheritance Combinatory Categorial Grammar. In Proceedings
of COLING 2004, 57–63. Geneva, Switzerland: Association for Computational Linguis-
tics.
Beavers, John, Elias Ponvert & Stephen Mark Wechsler. 2008. Possession of a controlled
substantive. In T. Friedman & S. Ito (eds.), Proceedings of Semantics and Linguistic
Theory (SALT) XVIII, 108–125. Ithaca, NY: Cornell University.
Beavers, John & Ivan A. Sag. 2004. Coordinate ellipsis and apparent non-constituent
coordination. In Stefan Müller (ed.), Proceedings of the 11th International Confer-
ence on Head-Driven Phrase Structure Grammar, Center for Computational Linguistics,
Katholieke Universiteit Leuven, 48–69. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/HPSG/2004/ (18 August, 2020).
Bech, Gunnar. 1955. Studien über das deutsche Verbum infinitum (Linguistische Arbeiten
139). 2nd unchanged edition 1983. Tübingen: Max Niemeyer Verlag.
Becker, Tilman, Aravind K. Joshi & Owen Rambow. 1991. Long-distance scrambling and
Tree Adjoining Grammars. In Fifth Conference of the European Chapter of the Associa-
tion for Computational Linguistics. Proceedings of the conference, 21–26. Berlin: Associ-
ation for Computational Linguistics. http://www.aclweb.org/anthology/E91-1005.pdf
(18 August, 2020).
Beermann, Dorothee & Lars Hellan. 2004. A treatment of directionals in two imple-
mented HPSG grammars. In Stefan Müller (ed.), Proceedings of the 11th International
Conference on Head-Driven Phrase Structure Grammar, Center for Computational Lin-
guistics, Katholieke Universiteit Leuven, 357–377. Stanford, CA: CSLI Publications. http:
//csli-publications.stanford.edu/HPSG/2004/ (18 August, 2020).
Beghelli, Filippo & Timothy Stowell. 1997. Distributivity and negation: The syntax of
each and every. In Anna Szabolcsi (ed.), Ways of scope taking, 71–107. Dordrecht:
Kluwer Academic Publishers.
Behaghel, Otto. 1909. Beziehung zwischen Umfang und Reihenfolge von Satzgliedern.
Indogermanische Forschungen 25. 110–142.
Behaghel, Otto. 1930. Von deutscher Wortstellung. Zeitschrift für Deutschkunde 44. 81–
89.
Behrens, Heike. 2009. Konstruktionen im Spracherwerb. Zeitschrift für Germanistische
Linguistik 37(3). 427–444.
Bellugi, Ursula, Liz Lichtenberger, Wendy Jones, Zona Lai & Marie St. George. 2000. The
neurocognitive profile of Williams Syndrome: A complex pattern of strengths and
weaknesses. Journal of Cognitive Neuroscience 12. 7–29.
Bender, Emily & Daniel P. Flickinger. 1999. Peripheral constructions and core phenom-
ena: Agreement in tag questions. In Gert Webelhuth, Jean-Pierre Koenig & Andreas
Kathol (eds.), Lexical and Constructional aspects of linguistic explanation (Studies in
Constraint-Based Lexicalism 1), 199–214. Stanford, CA: CSLI Publications.
Bender, Emily M. 2001. Syntactic variation and linguistic competence: The case of AAVE
copula absence. Stanford University. (Doctoral dissertation). http://faculty.washington.
edu/ebender/dissertation/ (18 August, 2020).
Bender, Emily M. 2008a. Evaluating a crosslinguistic grammar resource: A case study of
Wambaya. In Johanna D. Moore, Simone Teufel, James Allan & Sadaoki Furui (eds.),
Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics:
Human Language Technologies, 977–985. Columbus, Ohio: Association for Computa-
tional Linguistics. http://aclweb.org/anthology/P08-1111 (18 August, 2020).
Bender, Emily M. 2008b. Grammar engineering for linguistic hypothesis testing. In
Nicholas Gaylord, Alexis Palmer & Elias Ponvert (eds.), Proceedings of the Texas Lin-
guistics Society X Conference: Computational linguistics for less-studied languages, 16–
36. Stanford CA: CSLI Publications ONLINE.
Bender, Emily M. 2008c. Radical non-configurationality without shuffle operators: An
analysis of Wambaya. In Stefan Müller (ed.), Proceedings of the 15th International Con-
ference on Head-Driven Phrase Structure Grammar, 6–24. Stanford, CA: CSLI Publica-
tions. http://csli-publications.stanford.edu/HPSG/2008/ (18 August, 2020).
Bender, Emily M. 2010. Reweaving a grammar for Wambaya: A case study in grammar
engineering for linguistic hypothesis testing. Linguistic Issues in Language Technology
– LiLT 3(3). 1–34. http://journals.linguisticsociety.org/elanguage/lilt/article/view/662/
523.html (18 August, 2020).
Bender, Emily M. & Guy Emerson. 2020. Computational linguistics and grammar engi-
neering. In Stefan Müller, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.),
Head-Driven Phrase Structure Grammar: The handbook (Empirically Oriented Theoret-
ical Morphology and Syntax). To appear. Berlin: Language Science Press.
Bender, Emily M., Daniel P. Flickinger & Stephan Oepen. 2002. The Grammar Matrix:
An open-source starter-kit for the rapid development of cross-linguistically consistent
broad-coverage precision grammars. In John Carroll, Nelleke Oostdijk & Richard Sut-
cliffe (eds.), Proceedings of the Workshop on Grammar Engineering and Evaluation at
the 19th International Conference on Computational Linguistics, 8–14. Taipei, Taiwan.
Bender, Emily M. & Melanie Siegel. 2005. Implementing the syntax of Japanese numeral
classifiers. In Keh-Yih Su, Oi Yee Kwong, Jun’ichi Tsujii & Jong-Hyeok Lee (eds.),
Natural language processing IJCNLP 2004 (Lecture Notes in Artificial Intelligence 3248),
626–635. Berlin: Springer Verlag.
Bergen, Benjamin K. & Nancy Chang. 2005. Embodied Construction Grammar in
simulation-based language understanding. In Jan-Ola Östman & Mirjam Fried (eds.),
Construction Grammars: Cognitive grounding and theoretical extensions, 147–190. Am-
sterdam: John Benjamins Publishing Co.
Berman, Judith. 1996. Eine LFG-Grammatik des Deutschen. In Deutsche und französische
Syntax im Formalismus der LFG (Linguistische Arbeiten 344), 11–96. Tübingen: Max
Niemeyer Verlag.
Berman, Judith. 1999. Does German satisfy the Subject Condition? In Miriam Butt &
Tracy Holloway King (eds.), Proceedings of the LFG ’99 conference, University of
Manchester. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/4/
(18 August, 2020).
Berman, Judith. 2003a. Clausal syntax of German (Studies in Constraint-Based Lexical-
ism). Stanford, CA: CSLI Publications.
Berman, Judith. 2003b. Zum Einfluss der strukturellen Position auf die syntaktische
Funktion der Komplementsätze. Deutsche Sprache 3. 263–286.
Berman, Judith. 2007. Functional identification of complement clauses in German and the
specification of COMP. In Annie Zaenen, Jane Simpson, Tracy Holloway King, Jane
Grimshaw, Joan Maling & Chris Manning (eds.), Architectures, rules, and preferences:
Variations on themes by Joan W. Bresnan, 69–83. Stanford, CA: CSLI Publications.
Berwick, Robert C. 1982. Computational complexity and Lexical-Functional Grammar.
American Journal of Computational Linguistics 8(3–4). 97–109.
Berwick, Robert C. & Samuel David Epstein. 1995. On the convergence of ‘Minimalist’
Syntax and Categorial Grammar. In Anton Nijholt, Giuseppe Scollo & Rene Steet-
skamp (eds.), Algebraic methods in language processing, 143–148. Enschede: University
of Twente. http://eprints.eemcs.utwente.nl/9555/01/twlt10.pdf (25 September, 2018).
Berwick, Robert C. & Partha Niyogi. 1996. Learning from triggers. Linguistic Inquiry 27.
605–622.
Berwick, Robert C., Paul Pietroski, Beracah Yankama & Noam Chomsky. 2011. Poverty
of the Stimulus revisited. Cognitive Science 35(7). 1207–1242. DOI: 10.1111/j.1551-6709.
2011.01189.x.
Bick, Eckhard. 2001. En Constraint Grammar parser for dansk. In Peter Widell & Mette
Kunøe (eds.), 8. Møde om Udforskningen af Dansk Sprog, 12.–13. October 2000, vol. 8,
40–50. Århus: Århus University.
Bick, Eckhard. 2003. A Constraint Grammar-based question answering system for
Portuguese. In Fernando Moura Pires & Salvador Abreu (eds.), Progress in artificial
intelligence: 11th Portuguese Conference on Artificial Intelligence, EPIA 2003, Beja,
Portugal, December 4–7, 2003, proceedings (Lecture Notes in Computer Science 2902),
414–418. Berlin: Springer Verlag.
Bick, Eckhard. 2006. A Constraint Grammar parser for Spanish. In Proceedings of TIL
2006 – 4th Workshop on Information and Human Language Technology (Ribeirão Preto,
October 27–28, 2006), 3–10. http://www.nilc.icmc.usp.br/til/til2006/ (23 December,
2015).
Bick, Eckhard. 2009. A Dependency Constraint Grammar for Esperanto. In Eckhard Bick,
Kristin Hagen, Kaili Müürisep & Trond Trosterud (eds.), Constraint Grammar and ro-
bust parsing: Proceedings of the NODALIDA 2009 workshop (NEALT Proceedings Series
8), 8–12. Tartu: Tartu University Library.
Bick, Eckhard. 2010. FrAG: A hybrid Constraint Grammar parser for French. In Nico-
letta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios
Piperidis, Mike Rosner & Daniel Tapias (eds.), Proceedings of the Seventh International
Conference on Language Resources and Evaluation (LREC’10), 794–798. Valletta, Malta:
European Language Resources Association (ELRA).
Bick, Eckhard & Lars Nygaard. 2007. Using Danish as a CG interlingua: A wide-coverage
Norwegian-English machine translation system. In Joakim Nivre, Heiki-Jaan Kaalep,
Kadri Muischnek & Mare Koit (eds.), Proceedings of the 16th Nordic Conference of Com-
putational Linguistics, 21–28. Forlag uden navn.
Bickerton, Derek. 1984a. Creole is still king. The Behavioral and Brain Sciences 7(2).
212–218.
Bickerton, Derek. 1984b. The Language Bioprogram Hypothesis. The Behavioral and
Brain Sciences 7(2). 173–188.
Bickerton, Derek. 1997. How to acquire language without positive evidence: What ac-
quisitionists can learn from Creoles. In Michel DeGraff (ed.), Language creation and
language change: creolization, diachrony, and development (Learning, Development,
and Conceptual Change), 49–74. Cambridge, MA: MIT Press.
Bierwisch, Manfred. 1963. Grammatik des deutschen Verbs (studia grammatica 2). Berlin:
Akademie Verlag.
Bierwisch, Manfred. 1966. Strukturalismus: Geschichte, Probleme und Methoden. Kurs-
buch 5. 77–152.
Bierwisch, Manfred. 1992. Grammatikforschung in der DDR: Auch ein Rückblick. Lingui-
stische Berichte 139. 169–181.
Bildhauer, Felix. 2008. Representing information structure in an HPSG grammar of Spanish.
Universität Bremen. (Dissertation).
Bildhauer, Felix. 2011. Mehrfache Vorfeldbesetzung und Informationsstruktur: Eine Be-
standsaufnahme. Deutsche Sprache 39(4). 362–379.
Bildhauer, Felix. 2014. Head-Driven Phrase Structure Grammar. In Andrew Carnie,
Yosuke Sato & Dan Siddiqi (eds.), The Routledge handbook of syntax, 526–555. Oxford:
Routledge.
Bildhauer, Felix & Philippa Cook. 2010. German multiple fronting and expected topic-
hood. In Stefan Müller (ed.), Proceedings of the 17th International Conference on Head-
Driven Phrase Structure Grammar, Université Paris Diderot, 68–79. Stanford, CA: CSLI
Publications. http://csli-publications.stanford.edu/HPSG/2010/ (18 August, 2020).
Bird, Steven & Ewan Klein. 1994. Phonological analysis in typed feature systems. Com-
putational Linguistics 20(3). 455–491.
Bishop, Dorothy V. M. 2002. Putting language genes in perspective. TRENDS in Genetics
18(2). 57–59.
Bjerre, Tavs. 2006. Object positions in a topological sentence model for Danish: A lineariza-
tion-based HPSG approach. Presentation at Ph.D.-Course at Sandbjerg, Denmark. http:
//www.hum.au.dk/engelsk/engsv/objectpositions/workshop/Bjerre.pdf (18 August,
2020).
Blackburn, Patrick & Johan Bos. 2005. Representation and inference for natural language:
A first course in computational semantics. Stanford, CA: CSLI Publications.
Blackburn, Patrick, Claire Gardent & Wilfried Meyer-Viol. 1993. Talking about trees. In
Steven Krauwer, Michael Moortgat & Louis des Tombe (eds.), Sixth Conference of the
European Chapter of the Association for Computational Linguistics. Proceedings of the
conference, 21–29. Utrecht: Association for Computational Linguistics.
Błaszczak, Joanna & Hans-Martin Gärtner. 2005. Intonational phrasing, discontinuity,
and the scope of negation. Syntax 8(1). 1–22.
Blevins, James P. 2003. Passives and impersonals. Journal of Linguistics 39(3). 473–520.
DOI: 10.1017/S0022226703002081.
Block, Hans-Ulrich & Rudolf Hunze. 1986. Incremental construction of c- and f-structure
in a LFG-parser. In Makoto Nagao (ed.), Proceedings of COLING 86, 490–493. University
of Bonn: Association for Computational Linguistics.
Blom, Corrien. 2005. Complex predicates in Dutch: Synchrony and diachrony (LOT Disser-
tation Series 111). Utrecht: Utrecht University.
Bloom, Paul. 1993. Grammatical continuity in language development: The case of sub-
jectless sentences. Linguistic Inquiry 24(4). 721–734.
Boas, Hans C. 2003. A Constructional approach to resultatives (Stanford Monographs in
Linguistics). Stanford, CA: CSLI Publications.
Boas, Hans C. 2014. Lexical approaches to argument structure: Two sides of the same
coin. Theoretical Linguistics 40(1–2). 89–112.
Bobaljik, Jonathan. 1999. Adverbs: The hierarchy paradox. Glot International 4(9/10). 27–
28.
Bod, Rens. 2009a. Constructions at work or at rest? Cognitive Linguistics 20(1). 129–134.
Bod, Rens. 2009b. From exemplar to grammar: Integrating analogy and probability in
language learning. Cognitive Science 33(4). 752–793.
DOI: 10.1111/j.1551-6709.2009.01031.x.
Bögel, Tina, Miriam Butt & Sebastian Sulger. 2008. Urdu ezafe and the morphology-
syntax interface. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG
2008 conference, 129–149. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/LFG/13/ (18 August, 2020).
Bohnet, Bernd. 2010. Very high accuracy and fast Dependency Parsing is not a contra-
diction. In Chu-Ren Huang & Dan Jurafsky (eds.), Proceedings of the 23rd International
Conference on Computational Linguistics, 89–97. Stroudsburg, PA, USA: Association
for Computational Linguistics.
Bolc, Leonard, Krzysztof Czuba, Anna Kupść, Małgorzata Marciniak, Agnieszka
Mykowiecka & Adam Przepiórkowski. 1996. A survey of systems for implementing
HPSG grammars. Tech. rep. 814. Warsaw, Poland: Institute of Computer Science, Polish
Academy of Sciences. http://www.cs.cmu.edu/~kczuba/systems-wide.ps.gz
(18 August, 2020).
Booij, Geert E. 2002. Separable complex verbs in Dutch: A case of periphrastic word
formation. In Nicole Dehé, Ray S. Jackendoff, Andrew McIntyre & Silke Urban (eds.),
Verb-particle explorations (Interface Explorations 1), 21–41. Berlin: Mouton de Gruyter.
Booij, Geert E. 2005. Construction-Dependent Morphology. Lingue e linguaggio 4. 31–46.
Booij, Geert E. 2009. Lexical integrity as a formal universal: A Constructionist view.
In Sergio Scalise, Elisabetta Magni & Antonietta Bisetto (eds.), Universals of language
today (Studies in Natural Language and Linguistic Theory 76), 83–100. Berlin: Springer
Verlag.
Booij, Geert E. 2010. Construction morphology. Language and Linguistics Compass 4(7).
543–555. DOI: 10.1111/j.1749-818X.2010.00213.x.
Booij, Geert E. 2012. Construction morphology. Ms. Leiden University.
Borer, Hagit. 1994. The projection of arguments. In E. Benedicto & J. Runner (eds.),
Functional projections (UMass Occasional Papers in Linguistics (UMOP) 17), 19–47.
Massachusetts: University of Massachusetts Graduate Linguistic Student Association.
Borer, Hagit. 2003. Exo-skeletal vs. endo-skeletal explanations: Syntactic projections and
the lexicon. In John Moore & Maria Polinsky (eds.), The nature of explanation in lin-
guistic theory, 31–67. Stanford, CA: CSLI Publications.
Borer, Hagit. 2005. Structuring sense: In name only. Vol. 1. Oxford: Oxford University
Press.
Borsley, Robert D. 1987. Subjects and complements in HPSG. Report No. CSLI-87-107. Stan-
ford, CA: Center for the Study of Language & Information.
Borsley, Robert D. 1989. Phrase-Structure Grammar and the Barriers conception of clause
structure. Linguistics 27(5). 843–863.
Borsley, Robert D. 1991. Syntactic theory: A unified approach. London: Edward Arnold.
Borsley, Robert D. 1999a. Mutation and constituent structure in Welsh. Lingua 109(4).
267–300. DOI: 10.1016/S0024-3841(99)00019-4.
Borsley, Robert D. 1999b. Syntactic theory: A unified approach. 2nd edn. London: Edward
Arnold.
Borsley, Robert D. 2005. Against ConjP. Lingua 115(4). 461–482. DOI: 10.1016/j.lingua.
2003.09.011.
Borsley, Robert D. 2006. Syntactic and lexical approaches to unbounded dependencies. Es-
sex Research Reports in Linguistics 49. University of Essex. 31–57. http://core.ac.uk/
download/pdf/4187949.pdf#page=35 (18 August, 2020).
Borsley, Robert D. 2007. Hang on again! Are we ‘on the right track’? In Andrew Radford
(ed.), Martin Atkinson – the Minimalist muse (Essex Research Reports in Linguistics
53), 43–69. Essex: Department of Language & Linguistics, University of Essex.
Borsley, Robert D. 2009. On the superficiality of Welsh agreement. Natural Language &
Linguistic Theory 27(2). 225–265. DOI: 10.1007/s11049-009-9067-3.
Borsley, Robert D. 2012. Don’t move! Iberia: An International Journal of Theoretical Lin-
guistics 4(1). 110–139.
Borsley, Robert D. 2013. On the nature of Welsh unbounded dependencies. Lingua 133.
1–29. DOI: 10.1016/j.lingua.2013.03.005.
Borsley, Robert D. & Stefan Müller. 2020. Minimalism. In Stefan Müller, Anne Abeillé,
Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar:
The handbook (Empirically Oriented Theoretical Morphology and Syntax). To appear.
Berlin: Language Science Press.
Bos, Johan. 1996. Predicate logic unplugged. In Paul J. E. Dekker & M. Stokhof
(eds.), Proceedings of the Tenth Amsterdam Colloquium, 133–143. Amsterdam:
ILLC/Department of Philosophy, University of Amsterdam.
Bosse, Solveig & Benjamin Bruening. 2011. Benefactive versus experiencer datives. In
Mary Byram Washburn, Katherine McKinney-Bock, Erika Varis, Ann Sawyer & Bar-
bara Tomaszewicz (eds.), Proceedings of the 28th West Coast Conference on Formal Lin-
guistics, 69–77. Somerville, MA: Cascadilla Press.
Boukedi, Sirine & Kais Haddar. 2014. HPSG grammar treating of different forms of Arabic
coordination. Research in Computing Science 86: Advances in Computational Linguis-
tics and Intelligent Decision Making. 25–41.
Boullier, Pierre & Benoît Sagot. 2005a. Analyse syntaxique profonde à grande échelle:
SXLFG. Traitement Automatique des Langues (T.A.L.) 46(2). 65–89.
Boullier, Pierre & Benoît Sagot. 2005b. Efficient and robust LFG parsing: SXLFG. In
Proceedings of IWPT 2005, 1–10. Vancouver, Canada: Association for Computational
Linguistics.
Boullier, Pierre, Benoît Sagot & Lionel Clément. 2005. Un analyseur LFG efficace pour le
français: SxLfg. In Actes de TALN 05, 403–408. Dourdan, France.
Bouma, Gosse. 1996. Extraposition as a nonlocal dependency. In Geert-Jan Kruijff, Glynn
V. Morrill & Dick Oehrle (eds.), Proceedings of Formal Grammar 96, 1–14. Prague.
http://www.let.rug.nl/gosse/papers.html (18 August, 2020).
Bouma, Gosse, Robert Malouf & Ivan A. Sag. 2001. Satisfying constraints on extraction
and adjunction. Natural Language & Linguistic Theory 19(1). 1–65.
DOI: 10.1023/A:1006473306778.
Bouma, Gosse & Gertjan van Noord. 1994. Constraint-based Categorial Grammar. In
James Pustejovsky (ed.), 32nd Annual Meeting of the Association for Computational Linguistics.
Proceedings of the conference, 147–154. Las Cruces: Association for Computational
Linguistics.
Bouma, Gosse & Gertjan van Noord. 1998. Word order constraints on verb clusters in
German and Dutch. In Erhard W. Hinrichs, Andreas Kathol & Tsuneko Nakazawa
(eds.), Complex predicates in nonderivational syntax (Syntax and Semantics 30), 43–72.
San Diego: Academic Press. DOI: 10.1163/9780585492223_003. (18 August, 2020).
Bouma, Gosse, Gertjan van Noord & Robert Malouf. 2001. Alpino: wide-coverage com-
putational analysis of Dutch. In Walter Daelemans, Khalil Sima’an, Jorn Veenstra &
Jakub Zavrel (eds.), Computational linguistics in the Netherlands 2000: Selected papers
from the Eleventh CLIN Meeting (Language and Computers 37). Amsterdam/New York,
NY: Rodopi.
Braine, Martin D. S. 1987. What is learned in acquiring word classes—A step toward an
acquisition theory. In Brian MacWhinney (ed.), Mechanisms of language acquisition,
65–87. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Brame, Michael. 1982. The head-selector theory of lexical specifications and the non-
existence of coarse categories. Linguistic Analysis 10(4). 321–325.
Branco, António & Francisco Costa. 2008a. A computational grammar for deep linguistic
processing of Portuguese: LXGram, version A.4.1. Tech. rep. TR-2008-17. Universidade
de Lisboa, Faculdade de Ciências, Departamento de Informática.
Branco, António & Francisco Costa. 2008b. LXGram in the shared task ‘comparing
semantic representations’ of STEP 2008. In Johan Bos & Rodolfo Delmonte (eds.),
Semantics in text processing: STEP 2008 conference proceedings, vol. 1 (Research in Computational
Semantics), 299–314. London: College Publications. http://www.aclweb.org/anthology/W08-2224
(18 August, 2020).
Branco, António & Francisco Costa. 2010. LXGram: A deep linguistic processing gram-
mar for Portuguese. In Thiago A.S. Pardo (ed.), Computational processing of the Por-
tuguese language: 9th International Conference, PROPOR 2010, Porto Alegre, RS, Brazil,
April 27–30, 2010. Proceedings (Lecture Notes in Artificial Intelligence 6001), 86–89.
Berlin: Springer Verlag.
Brants, Sabine, Stefanie Dipper, Peter Eisenberg, Silvia Hansen-Schirra, Esther König,
Wolfgang Lezius, Christian Rohrer, George Smith & Hans Uszkoreit. 2004. TIGER:
Linguistic interpretation of a German corpus. Research on Language and Computation
2(4). 597–620.
Bresnan, Joan. 1974. The position of certain clause-particles in phrase structure. Linguis-
tic Inquiry 5(4). 614–619.
Bresnan, Joan. 1978. A realistic Transformational Grammar. In Morris Halle, Joan Bres-
nan & George A. Miller (eds.), Linguistic theory and psychological reality, 1–59. Cam-
bridge, MA: MIT Press.
Bresnan, Joan. 1982a. Control and complementation. Linguistic Inquiry 13(3). 343–434.
Bresnan, Joan. 1982b. The passive in lexical theory. In Joan Bresnan (ed.), The mental
representation of grammatical relations (MIT Press Series on Cognitive Theory and
Mental Representation), 3–86. Cambridge, MA: MIT Press.
Bresnan, Joan. 2001. Lexical-Functional Syntax. Oxford: Blackwell Publishers Ltd.
Bresnan, Joan, Ash Asudeh, Ida Toivonen & Stephen Wechsler. 2016. Lexical-functional
syntax. 2nd edn. (Blackwell Textbooks in Linguistics 16). Oxford: Wiley Blackwell.
DOI: 10.1002/9781119105664.
Bresnan, Joan & Jane Grimshaw. 1978. The syntax of free relatives in English. Linguistic
Inquiry 9. 331–392.
Bresnan, Joan & Jonni M. Kanerva. 1989. Locative inversion in Chicheŵa: A case study
of factorization in grammar. Linguistic Inquiry 20(1). 1–50.
Bresnan, Joan & Ronald M. Kaplan. 1982. Introduction: grammars as mental represen-
tations of language. In Joan Bresnan (ed.), The mental representation of grammatical
relations (MIT Press Series on Cognitive Theory and Mental Representation), xvii–lii.
Cambridge, MA: MIT Press.
Bresnan, Joan & Sam A. Mchombo. 1995. The Lexical Integrity Principle: Evidence from
Bantu. Natural Language & Linguistic Theory 13. 181–254. DOI: 10.1007/BF00992782.
Bresnan, Joan & Annie Zaenen. 1990. Deep unaccusativity in LFG. In Katarzyna Dziwirek,
Patrick Farrell & Errapel Mejías-Bikandi (eds.), Grammatical relations: A cross-theoretical
perspective, 45–57. Stanford, CA: CSLI Publications.
Brew, Chris. 1995. Stochastic HPSG. In Steven P. Abney & Erhard W. Hinrichs (eds.),
Proceedings of the Seventh Conference of the European Chapter of the Association for
Computational Linguistics, 83–89. Dublin: Association for Computational Linguistics.
Briscoe, Ted J. 1997. Review of Edward P. Stabler, Jr., The logical approach to syntax:
Foundations, specifications, and implementations of theories of Government and Bind-
ing. Journal of Linguistics 33(1). 223–225.
Briscoe, Ted J. 2000. Grammatical acquisition: inductive bias and coevolution of language
and the language acquisition device. Language 76(2). 245–296.
Briscoe, Ted J. & Ann Copestake. 1999. Lexical rules in constraint-based grammar. Computational
Linguistics 25(4). 487–526. http://www.aclweb.org/anthology/J99-4002
(7 October, 2018).
Bröker, Norbert. 2003. Formal foundations of Dependency Grammar. In Vilmos Ágel,
Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer &
Henning Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein interna-
tionales Handbuch der zeitgenössischen Forschung / An international handbook of con-
temporary research, vol. 1 (Handbücher zur Sprach- und Kommunikationswissenschaft
25), 294–310. Berlin: Walter de Gruyter.
Brosziewski, Ulf. 2003. Syntactic derivations: A nontransformational view (Linguistische
Arbeiten 470). Tübingen: Max Niemeyer Verlag.
Brown, Roger & Camille Hanlon. 1970. Derivational complexity and order of acquisition
in child speech. In John R. Hayes (ed.), Cognition and the development of language, 11–
53. New York: John Wiley & Sons, Inc.
Bruening, Benjamin. 2009. Selectional asymmetries between CP and DP suggest that the
DP hypothesis is wrong. In Laurel MacKenzie (ed.), Proceedings of the 32nd Annual Penn
Linguistics Colloquium (Penn Working Papers in Linguistics 15.1), 26–35. Philadelphia.
Bruening, Benjamin. 2018. The lexicalist hypothesis: both wrong and superfluous. Lan-
guage 94(1). 1–42. DOI: 10.1353/lan.2018.0000.
Bryant, John. 2003. Constructional analysis. University of California at Berkeley. (MA
thesis). https://www2.eecs.berkeley.edu/bears/2004/STARS/bryant-constructional.pdf
(18 August, 2020).
Budde, Monika. 2010. Konstruktionen integrativ: Topik-Konstruktionen als rein-syntak-
tisches Pendant zu den lexikalisch verankerten Komplement-Konstruktionen. Vortrag
auf der Tagung Konstruktionsgrammatik: Neue Perspektiven zur Untersuchung des
Deutschen und Englischen. Internationale Fachtagung an der Christian-Albrechts-
Universität zu Kiel vom 18. bis 20. Februar 2010.
Bungeroth, Jan. 2002. A formal description of Sign Language using HPSG. Karlsruhe: De-
partment of Computer Science, University of Stellenbosch, Lehrstuhl Informatik für
Ingenieure und Naturwissenschaftler, Universität Karlsruhe (TH). (Diploma Thesis).
http://www-i6.informatik.rwth-aachen.de/~bungeroth/diplarb.pdf (18 August, 2020).
Burzio, Luigi. 1981. Intransitive verbs and Italian auxiliaries. MIT. (Doctoral dissertation).
Burzio, Luigi. 1986. Italian syntax: A Government-Binding approach (Studies in Natural
Language and Linguistic Theory 1). Dordrecht: D. Reidel Publishing Company.
Busemann, Stephan. 1992. Generierung natürlicher Sprache mit generalisierten Phrasen-
strukturgrammatiken. Vol. 313 (Informatik-Fachberichte). Berlin: Springer Verlag.
Bußmann, Hadumod (ed.). 1983. Lexikon der Sprachwissenschaft (Kröners Taschenaus-
gabe 452). Stuttgart: Alfred Kröner Verlag.
Bußmann, Hadumod (ed.). 1990. Lexikon der Sprachwissenschaft. 2nd edn. (Kröners
Taschenausgabe 452). Stuttgart: Alfred Kröner Verlag.
Butt, Miriam. 1997. Complex predicates in Urdu. In Alex Alsina, Joan Bresnan & Peter
Sells (eds.), Complex predicates (CSLI Lecture Notes 64), 107–149. Stanford, CA: CSLI
Publications.
Butt, Miriam. 2003. The light verb jungle. In C. Quinn, C. Bowern & G. Aygen (eds.),
Papers from the Harvard/Dudley House light verb workshop (Harvard Working Papers
in Linguistics 9), 1–49. Harvard University, Department of Linguistics.
Butt, Miriam, Stefanie Dipper, Anette Frank & Tracy Holloway King. 1999. Writing large-
scale parallel grammars for English, French and German. In Miriam Butt & Tracy Hol-
loway King (eds.), Proceedings of the LFG ’99 conference, University of Manchester. Stan-
ford, CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/4/ (18 August,
2020).
Butt, Miriam, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi & Christian Rohrer.
2002. The Parallel Grammar Project. In Proceedings of COLING-2002 Workshop on
Grammar Engineering and Evaluation, 1–7.
Butt, Miriam, Tracy Holloway King & John T. Maxwell III. 2003. Complex predication
via restriction. In Miriam Butt & Tracy Holloway King (eds.), Nominals: Inside and out
(Studies in Constraint-Based Lexicalism 16), 92–104. Stanford, CA: CSLI Publications.
Butt, Miriam, Tracy Holloway King, María-Eugenia Niño & Frédérique Segond. 1999. A
grammar writer’s cookbook (CSLI Lecture Notes 95). Stanford, CA: CSLI Publications.
Butt, Miriam, Tracy Holloway King & Sebastian Roth. 2007. Urdu correlatives: Theo-
retical and implementational issues. In Miriam Butt & Tracy Holloway King (eds.),
Proceedings of the LFG 2007 conference, 107–127. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/LFG/12/ (18 August, 2020).
Cahill, Aoife, Michael Burke, Martin Forst, Ruth O’Donovan, Christian Rohrer, Josef van
Genabith & Andy Way. 2005. Treebank-based acquisition of multilingual unification
grammar resources. Research on Language and Computation 3(2). 247–279.
Cahill, Aoife, Michael Burke, Ruth O’Donovan, Stefan Riezler, Josef van Genabith &
Andy Way. 2008. Wide-coverage deep statistical parsing using automatic dependency
structure annotation. Computational Linguistics 34(1). 81–124.
Calder, Jonathan, Ewan Klein & Henk Zeevat. 1988. Unification Categorial Grammar: A
concise, extendable grammar for natural language processing. In Dénes Vargha (ed.),
Proceedings of COLING 88, 83–86. University of Budapest: Association for Computational
Linguistics. https://aclanthology.info/pdf/C/C88/C88-1018.pdf (20 March, 2018).
Callmeier, Ulrich. 2000. PET—A platform for experimentation with efficient HPSG pro-
cessing techniques. Journal of Natural Language Engineering 1(6). (Special Issue on
Efficient Processing with HPSG: Methods, Systems, Evaluation), 99–108.
Candito, Marie-Hélène. 1996. A principle-based hierarchical representation of LTAGs. In
Jun-ichi Tsuji (ed.), Proceedings of COLING-96. 16th International Conference on Com-
putational Linguistics (COLING96). Copenhagen, Denmark, August 5–9, 1996, 194–199.
Copenhagen, Denmark: Association for Computational Linguistics.
Candito, Marie-Hélène. 1998. Building parallel LTAG for French and Italian. In Pierre
Isabelle (ed.), Proceedings of the 36th Annual Meeting of the Association for Computa-
tional Linguistics and 17th International Conference on Computational Linguistics, 211–
217. Montreal, Quebec, Canada: Association for Computational Linguistics.
DOI: 10.3115/980845.980953.
Candito, Marie-Hélène. 1999. Organisation modulaire et paramétrable de grammaires élec-
troniques lexicalisées. Application au français et à l’italien. Université Paris 7. (Doctoral
dissertation).
Candito, Marie-Hélène & Sylvain Kahane. 1998. Can the TAG derivation tree represent
a semantic graph? An answer in the light of Meaning-Text Theory. In TAG+4, 25–28.
Cappelle, Bert. 2006. Particle placement and the case for “allostructions”. Constructions
online 1(7). 1–28.
Cappelle, Bert, Yury Shtyrov & Friedemann Pulvermüller. 2010. Heating up or cooling up
the brain? MEG evidence that phrasal verbs are lexical units. Brain and Language 115.
189–201.
Carlson, Gregory N. & Michael K. Tanenhaus. 1988. Thematic roles and language com-
prehension. In Wendy Wilkins (ed.), Thematic relations (Syntax and Semantics 21),
263–289. San Diego: Academic Press.
Carnie, Andrew. 2013. Syntax: A generative introduction. 3rd edn. (Introducing Linguistics
20). Wiley-Blackwell.
Carpenter, Bob. 1992. The logic of typed feature structures (Tracts in Theoretical Computer
Science). Cambridge: Cambridge University Press.
Carpenter, Bob. 1994. A natural deduction theorem prover for type-theoretic Categorial
Grammars. Tech. rep. Carnegie Mellon Laboratory for Computational Linguistics.
http://www.essex.ac.uk/linguistics/external/clmt/papers/cg/carp_cgparser_doc.ps
(18 August, 2020).
Carpenter, Bob. 1998. Type-logical semantics. Cambridge, MA: MIT Press.
Carpenter, Bob & Gerald Penn. 1996. Efficient parsing of compiled typed attribute value
logic grammars. In Harry Bunt & Masaru Tomita (eds.), Recent advances in parsing
technology (Text, Speech and Language Technology 1), 145–168. Dordrecht: Kluwer
Academic Publishers.
Çetinoğlu, Özlem & Kemal Oflazer. 2006. Morphology-syntax interface for Turkish LFG.
In Nicoletta Calzolari, Claire Cardie & Pierre Isabelle (eds.), Proceedings of the 21st
International Conference on Computational Linguistics and 44th Annual Meeting of the
Association for Computational Linguistics, 153–160. Sydney, Australia: Association for
Computational Linguistics.
Chang, Nancy Chih-Lin. 2008. Constructing grammar: A computational model of the emer-
gence of early constructions. Technical Report UCB/EECS-2009-24. Electrical Engineer-
ing & Computer Sciences, University of California at Berkeley.
Chaves, Rui P. 2009. Construction-based cumulation and adjunct extraction. In Stefan
Müller (ed.), Proceedings of the 16th International Conference on Head-Driven Phrase
Structure Grammar, University of Göttingen, Germany, 47–67. Stanford, CA: CSLI Pub-
lications. http://csli-publications.stanford.edu/HPSG/2009/ (18 August, 2020).
Chesi, Cristiano. 2015. On directionality of phrase structure building. Journal of Psy-
cholinguistic Research 44(1). 65–89. DOI: 10.1007/s10936-014-9330-6.
Choi, Hye-Won. 1999. Optimizing structure in scrambling: Scrambling and information
structure (Dissertations in Linguistics). Stanford, CA: CSLI Publications.
Chomsky, Noam. 1956. Three models for the description of language. IRE Transactions
on Information Theory 2. 113–124.
Chomsky, Noam. 1957. Syntactic structures (Janua Linguarum / Series Minor 4). The
Hague: Mouton.
Chomsky, Noam. 1959. On certain formal properties of grammars. Information and Con-
trol 2(2). 137–167.
Chomsky, Noam. 1964a. Current issues in linguistic theory (Janua Linguarum / Series
Minor 38). The Hague/Paris: Mouton.
Chomsky, Noam. 1964b. Degrees of grammaticalness. In Jerry A. Fodor & Jerrold J. Katz
(eds.), The structure of language, 384–389. Englewood Cliffs, NJ: Prentice-Hall.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1968. Language and the mind. Psychology Today 1(9). Reprint as: Chom-
sky (1976a), 48–68.
Chomsky, Noam. 1970. Remarks on nominalization. In Roderick A. Jacobs & Peter S.
Rosenbaum (eds.), Readings in English Transformational Grammar, chap. 12, 184–221.
Waltham, MA/Toronto/London: Ginn & Company.
Chomsky, Noam. 1971. Problems of knowledge and freedom. London: Fontana.
Chomsky, Noam. 1973. Conditions on transformations. In Stephen R. Anderson & Paul
Kiparsky (eds.), A festschrift for Morris Halle, 232–286. New York: Holt, Rinehart &
Winston.
Chomsky, Noam. 1975. The logical structure of linguistic theory. New York: Plenum Press.
Chomsky, Noam. 1976a. Language and the mind. In Diane D. Bornstein (ed.), Readings
in the theory of grammar: from the 17th to the 20th century, 241–251. Reprint from:
Chomsky (1968). Cambridge, MA: Winthrop.
Chomsky, Noam. 1976b. Reflections on language. New York: Pantheon Books.
Chomsky, Noam. 1977. Essays on form and interpretation. New York: North Holland.
Chomsky, Noam. 1980. Rules and representations. Oxford: Basil Blackwell.
Chomsky, Noam. 1981a. Lectures on government and binding (Studies in Generative Gram-
mar 9). Dordrecht: Foris Publications.
Chomsky, Noam. 1981b. Reply to comments of Thompson. Philosophical Transactions of
the Royal Society of London. Series B, Biological Sciences 295(1077). 277–281.
Chomsky, Noam. 1982. Some concepts and consequences of the theory of Government and
Binding (Linguistic Inquiry Monographs 5). Cambridge, MA: MIT Press.
Chomsky, Noam. 1986a. Barriers (Linguistic Inquiry Monographs 13). Cambridge, MA:
MIT Press.
Chomsky, Noam. 1986b. Knowledge of language: Its nature, origin, and use (Convergence).
New York, NY: Praeger.
Chomsky, Noam. 1988. Language and problems of knowledge: The Managua lectures (Cur-
rent Studies in Linguistics 16). Cambridge, MA: MIT Press.
Chomsky, Noam. 1989. Some notes on economy of derivation and representation. In I.
Laka & Anoop Mahajan (eds.), Functional heads and clause structure (MIT Working
Papers in Linguistics 10), 43–74. Cambridge, MA: Department of Linguistics & Philos-
ophy.
Chomsky, Noam. 1990. On formalization and formal linguistics. Natural Language and
Linguistic Theory 8(1). 143–147.
Chomsky, Noam. 1991. Some notes on economy of derivation and representation. In
Robert Freidin (ed.), Principles and parameters in Generative Grammar, 417–454.
Reprint as: Chomsky (1995b: 129–166). Cambridge, MA: MIT Press.
Chomsky, Noam. 1993. A Minimalist Program for linguistic theory. In Kenneth Hale &
Samuel Jay Keyser (eds.), The view from building 20: Essays in linguistics in honor of
Sylvain Bromberger (Current Studies in Linguistics 24), 1–52. Cambridge, MA: MIT
Press.
Chomsky, Noam. 1995a. Bare phrase structure. In Hector Campos & Paula Kempchinsky
(eds.), Evolution and revolution in linguistic theory: Essays in honor of Carlos Otero, 51–
109. Washington, DC: Georgetown U Press.
Chomsky, Noam. 1995b. The Minimalist Program (Current Studies in Linguistics 28). Cam-
bridge, MA: MIT Press.
Chomsky, Noam. 1998. Noam Chomsky’s Minimalist Program and the philosophy of
mind: An interview [with] Camilo J. Cela-Conde and Gisèle Marty. Syntax 1(1). 19–36.
Chomsky, Noam. 1999. Derivation by phase. MIT Occasional Papers in Linguistics 18.
Reprint in: Michael Kenstowicz, ed. 2001. Ken Hale. A Life in Language. Cambridge,
MA: MIT Press, 1–52. MIT.
Chomsky, Noam. 2000. New horizons in the study of language and mind. Cambridge, UK:
Cambridge University Press.
Chomsky, Noam. 2001. Derivation by phase. In Michael Kenstowicz (ed.), Ken Hale: A
life in language, 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam. 2002. On nature and language. Cambridge, UK: Cambridge University
Press.
Chomsky, Noam. 2005. Three factors in language design. Linguistic Inquiry 36(1). 1–22.
Chomsky, Noam. 2007. Approaching UG from below. In Uli Sauerland & Hans-Martin
Gärtner (eds.), Interfaces + recursion = language? Chomsky’s Minimalism and the view
from syntax-semantics (Studies in Generative Grammar 89), 1–29. Berlin: Mouton de
Gruyter.
Chomsky, Noam. 2008. On phases. In Robert Freidin, Carlos P. Otero & Maria Luisa
Zubizarreta (eds.), Foundational issues in linguistic theory: Essays in honor of Jean-Roger
Vergnaud, 133–166. Cambridge, MA: MIT Press.
Chomsky, Noam. 2010. Restricting stipulations: consequences and challenges. Talk given
in Stuttgart.
Chomsky, Noam. 2013. Problems of projection. Lingua 130. 33–49.
DOI: 10.1016/j.lingua.2012.12.003.
Chomsky, Noam. 2014. Minimal recursion: Exploring the prospects. In Tom Roeper &
Margaret Speas (eds.), Recursion: Complexity in cognition, vol. 43 (Studies in Theoreti-
cal Psycholinguistics), 1–15. Springer Verlag.
Chomsky, Noam & George A. Miller. 1963. Introduction to the formal analysis of natural
languages. In R. Duncan Luce, Robert R. Bush & Eugene Galanter (eds.), Handbook of
mathematical psychology, vol. 2, 269–321. New York: John Wiley & Sons, Inc.
Chouinard, Michelle M. & Eve V. Clark. 2003. Adult reformulations of child errors as
negative evidence. Journal of Child Language 30. 637–669.
Chrupala, Grzegorz & Josef van Genabith. 2006. Using machine-learning to assign func-
tion labels to parser output for Spanish. In Nicoletta Calzolari, Claire Cardie & Pierre
Isabelle (eds.), Proceedings of the 21st International Conference on Computational Lin-
guistics and 44th Annual Meeting of the Association for Computational Linguistics, 136–
143. Sydney, Australia: Association for Computational Linguistics.
Chung, Chan. 1998. Argument composition and long-distance scrambling in Korean. In
Erhard W. Hinrichs, Andreas Kathol & Tsuneko Nakazawa (eds.), Complex predicates
in nonderivational syntax (Syntax and Semantics 30), 159–220. San Diego: Academic
Press.
Chung, Sandra & James McCloskey. 1983. On the interpretation of certain island facts in
GPSG. Linguistic Inquiry 14. 704–713.
Church, Kenneth. 2011. A pendulum swung too far. Linguistic Issues in Language Tech-
nology 6(5). Special Issue on Interaction of Linguistics and Computational Linguistics,
1–27. http://journals.linguisticsociety.org/elanguage/lilt/article/view/2581.html
(18 August, 2020).
Cinque, Guglielmo. 1994. On the evidence for partial N movement in the Romance DP.
In Guglielmo Cinque, Jan Koster, Jean-Yves Pollock, Luigi Rizzi & Raffaella Zanuttini
(eds.), Paths towards Universal Grammar: Studies in honor of Richard S. Kayne, 85–110.
Washington, D.C.: Georgetown University Press.
Cinque, Guglielmo. 1999. Adverbs and functional heads: A cross-linguistic perspective. New
York, Oxford: Oxford University Press.
Cinque, Guglielmo & Luigi Rizzi. 2010. The cartography of syntactic structures. In Bernd
Heine & Heiko Narrog (eds.), The Oxford handbook of linguistic analysis, 51–65. Oxford:
Oxford University Press.
Citko, Barbara. 2008. Missing labels. Lingua 118(7). 907–944.
Clark, Alexander. 2000. Inducing syntactic categories by context distribution clustering.
In Proceedings CoNLL 2000, 91–94. Stroudsburg, PA: Association for Computational
Linguistics.
Clark, Herbert H. & Jean E. Fox Tree. 2002. Using uh and um in spontaneous speaking.
Cognition 84(1). 73–111.
Clark, Herbert H. & Thomas Wasow. 1998. Repeating words in spontaneous speech. Cog-
nitive Psychology 37(3). 201–242.
Clark, Stephen & James Curran. 2007. Wide-coverage efficient statistical parsing with
CCG and log-linear models. Computational Linguistics 33(4). 493–552.
Clark, Stephen, Julia Hockenmaier & Mark Steedman. 2002. Building deep dependency
structures with a wide-coverage CCG parser. In Pierre Isabelle (ed.), 40th Annual Meet-
ing of the Association for Computational Linguistics: Proceedings of the conference, 327–
334. University of Pennsylvania, Philadelphia: Association for Computational Linguistics.
DOI: 10.3115/1073083.1073138. https://www.aclweb.org/anthology/events/acl-2002/
(18 August, 2020).
Clément, Lionel. 2009. XLFG5 documentation. Translated from French by Olivier Bonami.
http://www.xlfg.org/ (18 August, 2020).
Clément, Lionel & Alexandra Kinyon. 2001. XLFG—An LFG parsing scheme for French.
In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2001 conference.
Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/6/
(18 August, 2020).
Clément, Lionel & Alexandra Kinyon. 2003. Generating parallel multilingual LFG-TAG
grammars from a MetaGrammar. In Erhard Hinrichs & Dan Roth (eds.), Proceedings
of the 41st Annual Meeting of the Association for Computational Linguistics, 184–191.
Sapporo, Japan: Association for Computational Linguistics.
Clifton, Charles Jr. & Penelope Odom. 1966. Similarity relations among certain English
sentence constructions. Psychological Monographs: General and Applied 80(5). 1–35.
Coch, Jose. 1996. Overview of AlethGen. In Demonstrations and posters of the Eighth In-
ternational Natural Language Generation Workshop (INLG’96), 25–28.
Cook, Philippa. 2006. The datives that aren’t born equal: Beneficiaries and the dative
passive. In Daniel Hole, André Meinunger & Werner Abraham (eds.), Datives and sim-
ilar cases: Between argument structure and event structure, 141–184. Amsterdam: John
Benjamins Publishing Co.
Cook, Philippa Helen. 2001. Coherence in German: An information structure approach. De-
partments of Linguistics & German, University of Manchester. (Doctoral dissertation).
Cooper, Robin, Kuniaki Mukai & John Perry (eds.). 1990. Situation Theory and its appli-
cations. Vol. 1 (CSLI Lecture Notes 22). Stanford, CA: CSLI Publications.
Coopmans, Peter. 1989. Where stylistic and syntactic processes meet: Locative inversion
in English. Language 65(4). 728–751.
Copestake, Ann. 2002. Implementing typed feature structure grammars (CSLI Lecture
Notes 110). Stanford, CA: CSLI Publications.
Copestake, Ann. 2007. Applying robust semantics. In Proceedings of the 10th Conference
of the Pacific Association for Computational Linguistics (PACLING), 1–12.
Copestake, Ann & Ted Briscoe. 1995. Semi-productive polysemy and sense extension.
Journal of Semantics 12(1). 15–67.
Copestake, Ann & Ted J. Briscoe. 1992. Lexical operations in a unification based frame-
work. In James Pustejovsky & Sabine Bergler (eds.), Lexical semantics and knowledge
representation. SIGLEX 1991 (Lecture Notes in Artificial Intelligence 627), 101–119. Ber-
lin: Springer Verlag. DOI: 10.1007/3-540-55801-2_30.
Copestake, Ann & Daniel P. Flickinger. 2000. An open-source grammar development
environment and broad-coverage English grammar using HPSG. In Proceedings of the
second Linguistic Resources and Evaluation Conference, 591–600. Athens, Greece.
Copestake, Ann, Daniel P. Flickinger, Carl J. Pollard & Ivan A. Sag. 2005. Minimal Re-
cursion Semantics: An introduction. Research on Language and Computation 3(2–3).
281–332. DOI: 10.1007/s11168-006-6327-9.
Correa, Nelson. 1987. An Attribute-Grammar implementation of Government-Binding
Theory. In Candy Sidner (ed.), 25th Annual Meeting of the Association for Computa-
tional Linguistics, 45–51. Stanford, CA: Association for Computational Linguistics.
Covington, Michael A. 1990. Parsing discontinuous constituents in Dependency Gram-
mar. Computational Linguistics 16(4). 234–236.
Crabbé, Benoit. 2005. Représentation informatique de grammaires d’arbres fortement lexi-
calisées: le cas de la grammaire d’arbres adjoints. Université Nancy 2. (Doctoral disser-
tation).
Crain, Stephen, Drew Khlentzos & Rosalind Thornton. 2010. Universal Grammar versus
language diversity. Lingua 120(12). 2668–2672.
Crain, Stephen & Mineharu Nakayama. 1987. Structure dependence in grammar forma-
tion. Language 63(3). 522–543.
Crain, Stephen & Mark Steedman. 1985. On not being led up the garden path: The use
of context by the psychological syntax processor. In David R. Dowty, Lauri Karttunen
& Arnold M. Zwicky (eds.), Natural language processing, 320–358. Cambridge, UK:
Cambridge University Press.
Crain, Stephen, Rosalind Thornton & Drew Khlentzos. 2009. The case of the missing
generalizations. Cognitive Linguistics 20(1). 145–155. DOI: 10.1515/COGL.2009.008.
Cramer, Bart & Yi Zhang. 2009. Construction of a German HPSG grammar from a de-
tailed treebank. In Tracy Holloway King & Marianne Santaholma (eds.), Proceedings
of the 2009 Workshop on Grammar Engineering Across Frameworks (GEAF 2009), 37–45.
Suntec, Singapore: Association for Computational Linguistics.
https://www.aclweb.org/anthology/events/ws-2009/#w09-26 (18 August, 2020).
Crocker, Matthew Walter & Ian Lewin. 1992. Parsing as deduction: Rules versus princi-
ples. In Bernd Neumann (ed.), ECAI 92. 10th European Conference on Artificial Intelli-
gence, 508–512. John Wiley & Sons, Inc.
Croft, William. 2001. Radical Construction Grammar: Syntactic theory in typological per-
spective. Oxford: Oxford University Press.
Croft, William. 2003. Lexical rules vs. constructions: A false dichotomy. In Hubert Cuy-
ckens, Thomas Berg, René Dirven & Klaus-Uwe Panther (eds.), Motivation in language:
Studies in honour of Günter Radden (Current Issues in Linguistic Theory 243), 49–68.
Amsterdam: John Benjamins Publishing Co. DOI: 10.1075/cilt.243.07cro.
Croft, William. 2009. Syntax is more diverse, and evolutionary linguistics is already here.
The Behavioral and Brain Sciences 32(5). 457–458.
Crysmann, Berthold. 2001. Clitics and coordination in linear structure. In Birgit Gerlach
& Janet Grijzenhout (eds.), Clitics in phonology, morphology and syntax (Linguistik
Aktuell/Linguistics Today 36), 121–159. Amsterdam: John Benjamins Publishing Co.
Crysmann, Berthold. 2002. Constraint-based co-analysis: Portuguese cliticisation and mor-
phology-syntax interaction in HPSG (Saarbrücken Dissertations in Computational Lin-
guistics and Language Technology 15). Saarbrücken: Deutsches Forschungszentrum
für Künstliche Intelligenz und Universität des Saarlandes.
Crysmann, Berthold. 2003. On the efficient implementation of German verb placement
in HPSG. In Proceedings of RANLP 2003, 112–116. Borovets, Bulgaria.
Crysmann, Berthold. 2004. Underspecification of intersective modifier attachment: Some
arguments from German. In Stefan Müller (ed.), Proceedings of the 11th International
Conference on Head-Driven Phrase Structure Grammar, Center for Computational Linguistics,
Katholieke Universiteit Leuven, 378–392. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/HPSG/2004/ (18 August, 2020).
Crysmann, Berthold. 2005a. An inflectional approach to Hausa final vowel shortening. In
Geert Booij & Jaap van Marle (eds.), Yearbook of morphology 2004, 73–112. Dordrecht:
Kluwer Academic Publishers.
Crysmann, Berthold. 2005b. Relative clause extraposition in German: An efficient and
portable implementation. Research on Language and Computation 1(3). 61–82.
Crysmann, Berthold. 2005c. Syncretism in German: A unified approach to underspecifi-
cation, indeterminacy, and likeness of case. In Stefan Müller (ed.), Proceedings of the
12th International Conference on Head-Driven Phrase Structure Grammar, Department
of Informatics, University of Lisbon, 91–107. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/HPSG/2005/ (18 August, 2020).
Crysmann, Berthold. 2008. An asymmetric theory of peripheral sharing in HPSG: Con-
junction reduction and coordination of unlikes. In Gerhard Jäger, Paola Monachesi,
Gerald Penn & Shuly Wintner (eds.), Proceedings of Formal Grammar 2003, Vienna,
Austria, 47–62. Stanford, CA: CSLI Publications.
Crysmann, Berthold. 2009. Autosegmental representations in an HPSG of Hausa. In
Tracy Holloway King & Marianne Santaholma (eds.), Proceedings of the 2009 Work-
shop on Grammar Engineering Across Frameworks (GEAF 2009), 28–36. Suntec, Singapore:
Association for Computational Linguistics.
https://www.aclweb.org/anthology/events/ws-2009/#w09-26 (18 August, 2020).
Crysmann, Berthold. 2011. A unified account of Hausa genitive constructions. In Philippe
de Groote, Markus Egg & Laura Kallmeyer (eds.), Formal Grammar: 14th International
Conference, FG 2009, Bordeaux, France, July 25–26, 2009, revised selected papers (Lecture
Notes in Artificial Intelligence 5591), 102–117. Berlin: Springer Verlag.
Crysmann, Berthold. 2012. HaG: A computational grammar of Hausa. In Michael R.
Marlo, Nikki B. Adams, Christopher R. Green, Michelle Morrison & Tristan M. Purvis
(eds.), Selected proceedings of the 42nd Annual Conference on African Linguistics (ACAL
42), 321–337. Somerville, MA: Cascadilla Press.
http://www.lingref.com/cpp/acal/42/paper2780.pdf (18 August, 2020).
Crysmann, Berthold. 2013. On the locality of complement clause and relative clause ex-
traposition. In Gert Webelhuth, Manfred Sailer & Heike Walker (eds.), Rightward move-
ment in a comparative perspective (Linguistik Aktuell/Linguistics Today 200), 369–396.
Amsterdam: John Benjamins Publishing Co.
Crysmann, Berthold. 2016. Representing morphological tone in a computational gram-
mar of Hausa. Journal of Language Modelling 3(2). 463–512.
Culicover, Peter W. 1999. Syntactic nuts: Hard cases, syntactic theory, and language
acquisition. Foundations of syntax. Vol. 1 (Oxford Linguistics). Oxford: Oxford University
Press.
Culicover, Peter W. & Ray S. Jackendoff. 2005. Simpler Syntax. Oxford: Oxford University
Press. DOI: 10.1093/acprof:oso/9780199271092.001.0001.
Culy, Christopher. 1985. The complexity of the vocabulary of Bambara. Linguistics and
Philosophy 8(3). 345–351. DOI: 10.1007/BF00630918.
Curtiss, Susan. 1977. Genie: A psycholinguistic study of a modern-day “wild child”. New
York: Academic Press.
Dąbrowska, Ewa. 2001. From formula to schema: The acquisition of English questions.
Cognitive Linguistics 11(1–2). 83–102.
Dąbrowska, Ewa. 2004. Language, mind and brain: Some psychological and neurological
constraints on theories of grammar. Washington, D.C.: Georgetown University Press.
Dahl, Östen. 1980. Some arguments for higher nodes in syntax: A reply to Hudson’s
‘constituency and dependency’. Linguistics 18(5–6). 485–488. DOI: 10.1515/ling.1980.18.
5-6.485.
Dahl, Östen & Viveka Velupillai. 2013a. Perfective/imperfective aspect. In Matthew S.
Dryer & Martin Haspelmath (eds.), The world atlas of language structures online.
Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/65
(18 August, 2020).
Dahl, Östen & Viveka Velupillai. 2013b. The past tense. In Matthew S. Dryer & Martin
Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck
Institute for Evolutionary Anthropology. http://wals.info/chapter/66 (18 August,
2020).
Dahllöf, Mats. 2002. Token dependency semantics and the paratactic analysis of inten-
sional constructions. Journal of Semantics 19(4). 333–368.
Dahllöf, Mats. 2003. Two reports on computational syntax and semantics. Reports from
Uppsala University (RUUL) 36. Department of Linguistics.
http://stp.ling.uu.se/~matsd/pub/ruul36.pdf (18 August, 2020).
Dalrymple, Mary. 1993. The syntax of anaphoric binding (CSLI Lecture Notes 36). Stan-
ford, CA: CSLI Publications.
Dalrymple, Mary (ed.). 1999. Semantics and syntax in Lexical Functional Grammar: The
Resource Logic approach (Language, Speech, and Communication). Cambridge, MA:
MIT Press.
Dalrymple, Mary. 2001. Lexical Functional Grammar (Syntax and Semantics 34). New
York, NY: Academic Press.
Dalrymple, Mary. 2006. Lexical Functional Grammar. In Keith Brown (ed.), The encyclo-
pedia of language and linguistics, 2nd edn., 82–94. Oxford: Elsevier Science Publisher
B.V. (North-Holland).
Dalrymple, Mary, Ronald M. Kaplan & Tracy Holloway King. 2001. Weak crossover and
the absence of traces. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the
LFG 2001 conference, 66–82. Stanford, CA: CSLI Publications. http://csli-publications.
stanford.edu/LFG/6/ (18 August, 2020).
Dalrymple, Mary, Ronald M. Kaplan & Tracy Holloway King. 2004. Linguistic general-
izations over descriptions. In Miriam Butt & Tracy Holloway King (eds.), Proceedings
of the LFG 2004 conference, 199–208. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/LFG/9/ (18 August, 2020).
Dalrymple, Mary, Ronald M. Kaplan, John T. Maxwell III & Annie Zaenen (eds.). 1995.
Formal issues in Lexical-Functional Grammar (CSLI Lecture Notes 47). Stanford, CA:
CSLI Publications.
Dalrymple, Mary, John Lamping & Vijay Saraswat. 1993. LFG semantics via constraints.
In Steven Krauwer, Michael Moortgat & Louis des Tombe (eds.), Sixth Conference of
the European Chapter of the Association for Computational Linguistics. Proceedings of
the conference, 97–105. Utrecht: Association for Computational Linguistics. DOI:
10.3115/976744.976757.
Dalrymple, Mary, Maria Liakata & Lisa Mackie. 2006. Tokenization and morphologi-
cal analysis for Malagasy. Computational Linguistics and Chinese Language Processing
11(4). 315–332.
Dalrymple, Mary & Helge Lødrup. 2000. The grammatical functions of complement
clauses. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2000
conference. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/LFG/5/pdfs/lfg00dalrympl-lodrup.pdf (18 August, 2020).
Davidson, Donald. 1967. The logical form of action sentences. In Nicholas Rescher (ed.),
The logic of decision and action, 81–95. Pittsburgh: University of Pittsburgh Press.
Davis, Anthony R. 1996. Lexical semantics and linking in the hierarchical lexicon. Stanford
University. (Doctoral dissertation).
Davis, Anthony R. & Jean-Pierre Koenig. 2000. Linking as constraints on word classes
in a hierarchical lexicon. Language 76(1). 56–91. DOI: 10.1353/lan.2000.0068.
De Kuthy, Kordula. 2000. Discontinuous NPs in German — A case study of the interaction of
syntax, semantics and pragmatics. Saarbrücken: Universität des Saarlandes. (Doctoral
dissertation).
De Kuthy, Kordula. 2001. Splitting PPs from NPs. In Walt Detmar Meurers & Tibor Kiss
(eds.), Constraint-based approaches to Germanic syntax (Studies in Constraint-Based
Lexicalism 7), 31–76. Stanford, CA: CSLI Publications.
De Kuthy, Kordula. 2002. Discontinuous NPs in German (Studies in Constraint-Based Lex-
icalism 14). Stanford, CA: CSLI Publications.
De Kuthy, Kordula, Vanessa Metcalf & Walt Detmar Meurers. 2004. Documentation of
the implementation of the Milca English Resource Grammar in the Trale system. Ohio
State University, ms.
De Kuthy, Kordula & Walt Detmar Meurers. 2001. On partial constituent fronting in
German. Journal of Comparative Germanic Linguistics 3(3). 143–205.
De Kuthy, Kordula & Walt Detmar Meurers. 2003a. Dealing with optional complements
in HPSG-based grammar implementations. In Stefan Müller (ed.), Proceedings of the
10th International Conference on Head-Driven Phrase Structure Grammar, Michigan
State University, East Lansing, 88–96. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/HPSG/2003/ (18 August, 2020).
De Kuthy, Kordula & Walt Detmar Meurers. 2003b. The secret life of focus exponents,
and what it tells us about fronted verbal projections. In Stefan Müller (ed.), Proceedings
of the 10th International Conference on Head-Driven Phrase Structure Grammar, Michi-
gan State University, East Lansing, 97–110. Stanford, CA: CSLI Publications. http://csli-
publications.stanford.edu/HPSG/2003/ (18 August, 2020).
de Saussure, Ferdinand. 1916a. Cours de linguistique générale (Bibliothèque Scientifique
Payot). Edited by Charles Bally and Albert Sechehaye. Paris: Payot.
de Saussure, Ferdinand. 1916b. Grundfragen der allgemeinen Sprachwissenschaft. 2nd edi-
tion 1967. Berlin: Walter de Gruyter & Co.
de Alencar, Leonel. 2004. Complementos verbais oracionais – uma análise léxico-funci-
onal. Lingua(gem) 1(1). 173–218.
de Alencar, Leonel. 2013. BrGram: uma gramática computacional de um fragmento do
português brasileiro no formalismo da LFG. In Proceedings of the 9th Brazilian Sympo-
sium in Information and Human Language Technology. Fortaleza, Ceará, Brazil, October
20–24, 183–188. Fortaleza, Ceará: Sociedade Brasileira de Computação. http://www.
aclweb.org/anthology/W13-4823 (18 August, 2020).
de Alencar, Leonel Figueiredo. 2015. A Passiva em português como construção predica-
tiva adjetival: evidência morfológica e implementação computacional em LFG/XLE
[passive as adjective predicative construction in Portuguese: Morphological evidence
and implementation in LFG/XLE]. Estudos da Língua(gem) 13(2). 35–57. http://www.
estudosdalinguagem.org/index.php/estudosdalinguagem/article/view/471 (28 Febru-
ary, 2018).
de Alencar, Leonel Figueiredo. 2017. A computational implementation of periphrastic
verb constructions in French. Alfa: Revista de Linguística (São José do Rio Preto) 61(2).
437–466. http://seer.fclar.unesp.br/alfa/article/view/8537/6749 (28 February, 2018).
Delmonte, Rodolfo. 1990. Semantic parsing with an LFG-based lexicon and conceptual
representations. Computers and the Humanities 24(5–6). 461–488.
Demberg, Vera & Frank Keller. 2008. A psycholinguistically motivated version of TAG. In
Proceedings of the 9th International Workshop on Tree Adjoining Grammars and Related
Formalisms TAG+9, 25–32. Tübingen.
Demske, Ulrike. 2001. Merkmale und Relationen: Diachrone Studien zur Nominalphrase
des Deutschen (Studia Linguistica Germanica 56). Berlin: Walter de Gruyter Verlag.
den Besten, Hans. 1983. On the interaction of root transformations and lexical deletive
rules. In Werner Abraham (ed.), On the formal syntax of the Westgermania: Papers
from the 3rd Groningen Grammar Talks, Groningen, January 1981 (Linguistik Aktuell/
Linguistics Today 3), 47–131. Amsterdam: John Benjamins Publishing Co.
den Besten, Hans. 1985. Some Remarks on the Ergative Hypothesis. In Werner Abraham
(ed.), Erklärende Syntax des Deutschen (Studien zur deutschen Grammatik 25), 53–74.
Tübingen: originally Gunter Narr Verlag now Stauffenburg Verlag.
Deppermann, Arnulf. 2006. Construction Grammar – eine Grammatik für die Interak-
tion? In Arnulf Deppermann, Reinhard Fiehler & Thomas Spranz-Fogasy (eds.), Gram-
matik und Interaktion, 43–65. Radolfzell: Verlag für Gesprächsforschung.
Derbyshire, Desmond C. 1979. Hixkaryana (Lingua Descriptive Series 1). Amsterdam:
North Holland.
Devlin, Keith. 1992. Logic and information. Cambridge: Cambridge University Press.
Dhonnchadha, E. Uí & Josef van Genabith. 2006. A part-of-speech tagger for Irish using
finite-state morphology and Constraint Grammar disambiguation. In Proceedings of
LREC’06, 2241–2244.
Diesing, Molly. 1992. Indefinites. Cambridge, MA: MIT Press.
Dione, Cheikh Mouhamadou Bamba. 2013. Handling Wolof Clitics in LFG. In Christine
Meklenborg Salvesen & Hans Petter Helland (eds.), Challenging clitics (Linguistik Ak-
tuell/Linguistics Today 206), 87–118. Amsterdam: John Benjamins Publishing Co.
Dione, Cheikh Mouhamadou Bamba. 2014. An LFG approach to Wolof cleft construc-
tions. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2014 con-
ference, 157–176. Stanford, CA: CSLI Publications.
Dipper, Stefanie. 2003. Implementing and documenting large-scale grammars – German
LFG. Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung (AIMS), Volume
9, Number 1. IMS, University of Stuttgart. (Doctoral dissertation).
Donati, Caterina. 2006. On wh-head-movement. In Lisa Lai-Shen Cheng & Norbert
Corver (eds.), Wh-movement: Moving on (Current Studies in Linguistics 42), 21–46.
Cambridge, MA: MIT Press.
Donohue, Cathryn & Ivan A. Sag. 1999. Domains in Warlpiri. In Sixth International Con-
ference on HPSG–Abstracts. 04–06 August 1999, 101–106. Edinburgh.
Doran, Christine, Beth Ann Hockey, Anoop Sarkar, Bangalore Srinivas & Fei Xia. 2000.
Evolution of the XTAG system. In Anne Abeillé & Owen Rambow (eds.), Tree Adjoining
Grammars: formalisms, linguistic analysis and processing (CSLI Lecture Notes 156), 371–
403. Stanford, CA: CSLI Publications.
Dörre, Jochen & Michael Dorna. 1993. CUF: A formalism for linguistic knowledge repre-
sentation. DYANA 2 deliverable R.1.2A. Stuttgart, Germany: IMS.
Dörre, Jochen & Roland Seiffert. 1991. A formalism for natural language — STUF. In
Otthein Herzog & Claus-Rainer Rollinger (eds.), Text understanding in LILOG (Lecture
Notes in Artificial Intelligence 546), 29–38. Berlin: Springer Verlag.
Dowty, David. 1997. Non-constituent coordination, wrapping, and Multimodal Catego-
rial Grammars: Syntactic form as logical form. In Maria Luisa Dalla Chiara, Kees Doets,
Daniele Mundici & Johan Van Benthem (eds.), Structures and norms in science (Syn-
these Library 260), 347–368. Springer Verlag. DOI: 10.1007/978-94-017-0538-7_21.
Dowty, David R. 1978. Governed transformations as lexical rules in a Montague Gram-
mar. Linguistic Inquiry 9(3). 393–426.
Dowty, David R. 1979. Word meaning and Montague Grammar (Synthese Language Li-
brary 7). Dordrecht: D. Reidel Publishing Company.
Dowty, David R. 1988. Type raising, functional composition, and nonconstituent coordi-
nation. In Richard Oehrle, Emmon Bach & Deirdre Wheeler (eds.), Categorial Gram-
mars and natural language structures, 153–198. Dordrecht: D. Reidel Publishing Com-
pany.
Dowty, David R. 1989. On the semantic content of the notion ‘thematic role’. In Gennaro
Chierchia, Barbara H. Partee & Raymond Turner (eds.), Properties, types and meaning,
vol. 2 (Studies in Linguistics and Philosophy), 69–130. Dordrecht: Kluwer Academic
Publishers.
Dowty, David R. 1991. Thematic proto-roles and argument selection. Language 67(3).
547–619.
Dowty, David R. 2003. The dual analysis of adjuncts and complements in Categorial
Grammar. In Ewald Lang, Claudia Maienborn & Cathrine Fabricius-Hansen (eds.),
Modifying adjuncts (Interface Explorations 4), 33–66. Berlin: Mouton de Gruyter.
Dras, Mark, François Lareau, Benjamin Börschinger, Robert Dale, Yasaman Motazedi,
Owen Rambow, Myfany Turpin & Morgan Ulinski. 2012. Complex predicates in Ar-
rernte. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2012 con-
ference, 177–197. Stanford, CA: CSLI Publications.
Drellishak, Scott. 2009. Widespread but not universal: Improving the typological coverage
of the Grammar Matrix. University of Washington. (Doctoral dissertation).
Drosdowski, Günther. 1984. Duden: Grammatik der deutschen Gegenwartssprache.
4th edn. Vol. 4. Mannheim, Wien, Zürich: Dudenverlag.
Drosdowski, Günther. 1995. Duden: Die Grammatik. 5th edn. Vol. 4. Mannheim, Leipzig,
Wien, Zürich: Dudenverlag.
Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language 68(1). 81–
138.
Dryer, Matthew S. 1997. Are grammatical relations universal? In Joan Bybee, John
Haiman & Sandra Thompson (eds.), Essays on language function and language type:
Dedicated to T. Givón, 115–143. Amsterdam: John Benjamins Publishing Co.
Dryer, Matthew S. 2013a. Order of adposition and noun phrase. In Matthew S. Dryer
& Martin Haspelmath (eds.), The world atlas of language structures online. Leipzig:
Max Planck Institute for Evolutionary Anthropology. http://wals.info/chapter/85
(18 August, 2020).
Dryer, Matthew S. 2013b. Order of object and verb. In Matthew S. Dryer & Martin Haspel-
math (eds.), The world atlas of language structures online. Leipzig: Max Planck Institute
for Evolutionary Anthropology. http://wals.info/chapter/83 (18 August, 2020).
Dryer, Matthew S. 2013c. Order of subject, object and verb. In Matthew S. Dryer & Martin
Haspelmath (eds.), The world atlas of language structures online. Leipzig: Max Planck
Institute for Evolutionary Anthropology. http://wals.info/chapter/81 (18 August,
2020).
Dürscheid, Christa. 1989. Zur Vorfeldbesetzung in deutschen Verbzweit-Strukturen (FOKUS
1). Trier: Wissenschaftlicher Verlag.
Dürscheid, Christa. 2003. Syntax: Grundlagen und Theorien. 2nd edn. (Studienbücher zur
Linguistik 3). Westdeutscher Verlag.
Dyvik, Helge, Paul Meurer & Victoria Rosén. 2005. LFG, Minimal Recursion Semantics
and translation. Paper presented at the LFG conference 2005.
Egg, Markus. 1999. Derivation and resolution of ambiguities in wieder-sentences. In Paul
J. E. Dekker (ed.), Proceedings of the 12th Amsterdam Colloquium, 109–114.
Eisele, Andreas & Jochen Dörre. 1986. A Lexical Functional Grammar system in Prolog.
In Makoto Nagao (ed.), Proceedings of COLING 86, 551–553. University of Bonn: Asso-
ciation for Computational Linguistics. https://www.aclweb.org/anthology/C86-1129
(14 August, 2019).
Eisenberg, Peter. 1992. Platos Problem und die Lernbarkeit der Syntax. In Peter Suchsland
(ed.), Biologische und soziale Grundlagen der Sprache (Linguistische Arbeiten 280), 371–
378. Tübingen: Max Niemeyer Verlag.
Eisenberg, Peter. 1994a. German. In Ekkehard König & Johan van der Auwera (eds.),
The Germanic languages (Routledge Language Family Descriptions), 349–387. London:
Routledge.
Eisenberg, Peter. 1994b. Grundriß der deutschen Grammatik. 3rd edn. Stuttgart, Weimar:
Verlag J. B. Metzler.
Eisenberg, Peter. 2004. Grundriß der deutschen Grammatik. 2nd edn. Vol. 2. Der Satz.
Stuttgart, Weimar: Verlag J. B. Metzler.
Eisenberg, Peter, Jörg Peters, Peter Gallmann, Cathrine Fabricius-Hansen, Damaris
Nübling, Irmhild Barz, Thomas A. Fritz & Reinhard Fiehler. 2005. Duden: Die Gram-
matik. 7th edn. Vol. 4. Mannheim, Leipzig, Wien, Zürich: Dudenverlag.
Ellefson, Michelle R. & Morten Christiansen. 2000. Subjacency constraints without Uni-
versal Grammar: Evidence from artificial language learning and connectionist mod-
eling. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society,
645–650. Mahwah, NJ: Lawrence Erlbaum Associates.
Elman, Jeffrey L. 1993. Learning and development in neural networks: The importance
of starting small. Cognition 48(1). 71–99.
Elman, Jeffrey L., Elizabeth A. Bates, Mark H. Johnson, Annette Karmiloff-Smith,
Domenico Parisi & Kim Plunkett. 1996. Rethinking innateness: A connectionist perspec-
tive on development. Cambridge, MA: Bradford Books/MIT Press.
Embick, David. 2004. On the structure of resultative participles in English. Linguistic
Inquiry 35(3). 355–392.
Emirkanian, Louisette, Lyne Da Sylva & Lorne H. Bouchard. 1996. The implementation
of a computational grammar of French using the Grammar Development Environ-
ment. In Jun-ichi Tsuji (ed.), Proceedings of COLING-96. 16th International Conference
on Computational Linguistics (COLING96). Copenhagen, Denmark, August 5–9, 1996,
1024–1027. Copenhagen, Denmark: Association for Computational Linguistics.
Engdahl, Elisabet & Enric Vallduví. 1996. Information packaging in HPSG. In Claire
Grover & Enric Vallduví (eds.), Edinburgh Working Papers in Cognitive Science, vol. 12:
Studies in HPSG, chap. 1, 1–32. Edinburgh: Centre for Cognitive Science, University of
Edinburgh. ftp://ftp.cogsci.ed.ac.uk/pub/CCS-WPs/wp-12.ps.gz (20 March, 2018).
Engel, Ulrich. 1970. Regeln zur Wortstellung. Forschungsberichte des Instituts für deu-
tsche Sprache 5. Mannheim: Institut für deutsche Sprache. 3–148.
Engel, Ulrich. 1977. Syntax der deutschen Gegenwartssprache. Vol. 22 (Grundlagen der
Germanistik). Berlin: Erich Schmidt Verlag.
Engel, Ulrich. 1996. Tesnière mißverstanden. In Gertrud Gréciano & Helmut Schumacher
(eds.), Lucien Tesnière – Syntaxe Structurale et Opérations Mentales. Akten des
deutsch-französischen Kolloquiums anläßlich der 100. Wiederkehr seines Geburtstages.
Strasbourg 1993 (Linguistische Arbeiten 348), 53–61. Tübingen: Max Niemeyer Verlag.
Engel, Ulrich. 2014. Die dependenzielle Verbgrammatik (DVG). In Jörg Hagemann & Sven
Staffeldt (eds.), Syntaxtheorien: Analysen im Vergleich (Stauffenburg Einführungen 28),
43–62. Tübingen: Stauffenburg Verlag.
Erbach, Gregor. 1995. ProFIT: prolog with features, inheritance and templates. In Steven
P. Abney & Erhard W. Hinrichs (eds.), Proceedings of the Seventh Conference of the
European Chapter of the Association for Computational Linguistics, 180–187. Dublin:
Association for Computational Linguistics.
Ernst, Thomas. 1992. The phrase structure of English negation. The Linguistic Review
9(2). 109–144.
Eroms, Hans-Werner. 1985. Eine reine Dependenzgrammatik für das Deutsche. Deutsche
Sprache 13. 306–326.
Eroms, Hans-Werner. 1987. Passiv und Passivfunktionen im Rahmen einer Dependenz-
grammatik. In Centre de Recherche en Linguistique Germanique (Nice) (ed.), Das Pas-
siv im Deutschen (Linguistische Arbeiten 183), 73–95. Tübingen: Max Niemeyer Verlag.
Eroms, Hans-Werner. 2000. Syntax der deutschen Sprache (de Gruyter Studienbuch). Ber-
lin: Walter de Gruyter Verlag.
Eroms, Hans-Werner & Hans Jürgen Heringer. 2003. Dependenz und lineare Ordnung. In
Vilmos Ágel, Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen
Heringer & Henning Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein
internationales Handbuch der zeitgenössischen Forschung / An international handbook
of contemporary research, vol. 1 (Handbücher zur Sprach- und Kommunikationswis-
senschaft 25), 247–263. Berlin: Walter de Gruyter.
Eroms, Hans-Werner, Gerhard Stickel & Gisela Zifonun (eds.). 1997. Grammatik der deut-
schen Sprache. Vol. 7 (Schriften des Instituts für deutsche Sprache). Berlin: Walter de
Gruyter.
Erteschik-Shir, Nomi. 1973. On the nature of island constraints. Cambridge, MA: MIT.
(Doctoral dissertation).
Erteschik-Shir, Nomi. 1981. More on extractability from quasi-NPs. Linguistic Inquiry
12(4). 665–670.
Erteschik-Shir, Nomi & Shalom Lappin. 1979. Dominance and the functional explanation
of island phenomena. Theoretical Linguistics 6(1–3). 41–86.
Estigarribia, Bruno. 2009. Facilitation by variation: right-to-left learning of English
yes/no questions. Cognitive Science 34(1). 68–93.
Evans, Nicholas & Stephen C. Levinson. 2009a. The myth of language universals: Lan-
guage diversity and its importance for cognitive science. The Behavioral and Brain
Sciences 32(5). 429–448.
Evans, Nicholas & Stephen C. Levinson. 2009b. With diversity in mind: freeing the lan-
guage sciences from Universal Grammar. The Behavioral and Brain Sciences 32(5). 472–
492.
Evans, Roger. 1985. ProGram: A development tool for GPSG grammars. Linguistics 23(2).
213–244.
Everett, Daniel L. 2005. Cultural constraints on grammar and cognition in Pirahã. Current
Anthropology 46(4). 621–646.
Everett, Daniel L. 2009. Pirahã culture and grammar: A response to some criticisms.
Language 85(2). 405–442.
Everett, Daniel L. 2012. What does Pirahã grammar have to teach us about human lan-
guage and the mind? Cognitive Science 3(6). 555–563. DOI: 10.1002/wcs.1195.
Evers, Arnold. 1975. The transformational cycle in Dutch and German. University of
Utrecht. (Doctoral dissertation).
Faaß, Gertrud. 2010. A morphosyntactic description of Northern Sotho as a basis for an auto-
mated translation from Northern Sotho into English. Pretoria, South Africa: University
of Pretoria. (Doctoral dissertation). http://hdl.handle.net/2263/28569 (18 August,
2020).
Fabregas, Antonio, Tom Stroik & Michael Putnam. 2016. Is simplest merge too simple?
Ms. Penn State University.
Falk, Yehuda N. 1984. The English auxiliary system: A Lexical-Functional analysis. Lan-
guage 60(3). 483–509.
Fan, Zhenzhen, Sanghoun Song & Francis Bond. 2015. An HPSG-based shared-grammar
for the Chinese languages: ZHONG [|]. In Emily M. Bender, Lori Levin, Stefan Müller,
Yannick Parmentier & Aarne Ranta (eds.), Proceedings of the Grammar Engineering
Across Frameworks (GEAF) Workshop, 17–24. The Association for Computational Lin-
guistics.
Fang, Ji & Tracy Holloway King. 2007. An LFG Chinese grammar for machine use. In
Tracy Holloway King & Emily M. Bender (eds.), Grammar Engineering across Frame-
works 2007 (Studies in Computational Linguistics ONLINE), 144–160. Stanford, CA:
CSLI Publications. http://csli-publications.stanford.edu/GEAF/2007/ (18 August,
2020).
Fanselow, Gisbert. 1981. Zur Syntax und Semantik der Nominalkomposition (Linguistische
Arbeiten 107). Tübingen: Max Niemeyer Verlag.
Fanselow, Gisbert. 1987. Konfigurationalität (Studien zur deutschen Grammatik 29). Tü-
bingen: originally Gunter Narr Verlag now Stauffenburg Verlag.
Fanselow, Gisbert. 1988. Aufspaltung von NPn und „das“ Problem der ‚freien‘ Wortstel-
lung. Linguistische Berichte 114. 91–113.
Fanselow, Gisbert. 1990. Scrambling as NP-movement. In Günther Grewendorf & Wolf-
gang Sternefeld (eds.), Scrambling and Barriers (Linguistik Aktuell/Linguistics Today
5), 113–140. Amsterdam: John Benjamins Publishing Co.
Fanselow, Gisbert. 1992a. „Ergative“ Verben und die Struktur des deutschen Mittelfelds.
In Ludger Hoffmann (ed.), Deutsche Syntax: Ansichten und Aussichten (Institut für deut-
sche Sprache, Jahrbuch 1991), 276–303. Berlin: de Gruyter.
Fanselow, Gisbert. 1992b. Zur biologischen Autonomie der Grammatik. In Peter Suchs-
land (ed.), Biologische und soziale Grundlagen der Sprache (Linguistische Arbeiten 280),
335–356. Tübingen: Max Niemeyer Verlag.
Fanselow, Gisbert. 1993. Die Rückkehr der Basisgenerierer. Groninger Arbeiten zur Ger-
manistischen Linguistik 36. 1–74.
Fanselow, Gisbert. 2000a. Does constituent length predict German word order in the
Middle Field? In Josef Bayer & Christine Römer (eds.), Von der Philologie zur Gram-
matiktheorie: Peter Suchsland zum 65. Geburtstag, 63–77. Tübingen: Max Niemeyer
Verlag.
Fanselow, Gisbert. 2000b. Optimal exceptions. In Barbara Stiebels & Dieter Wunderlich
(eds.), The lexicon in focus (studia grammatica 45), 173–209. Berlin: Akademie Verlag.
Fanselow, Gisbert. 2001. Features, θ-roles, and free constituent order. Linguistic Inquiry
32(3). 405–437.
Fanselow, Gisbert. 2002. Against remnant VP-movement. In Artemis Alexiadou, Elena
Anagnostopoulou, Sjef Barbiers & Hans-Martin Gärtner (eds.), Dimensions of move-
ment: From features to remnants (Linguistik Aktuell/Linguistics Today 48), 91–127. Am-
sterdam: John Benjamins Publishing Co.
Fanselow, Gisbert. 2003a. Free constituent order: A Minimalist interface account. Folia
Linguistica 37(1–2). 191–231.
Fanselow, Gisbert. 2003b. Münchhausen-style head movement and the analysis of verb
second. In Anoop Mahajan (ed.), Proceedings of the workshop on head movement (UCLA
Working Papers in Linguistics 10). Los Angeles: UCLA, Linguistics Department.
Fanselow, Gisbert. 2003c. Zur Generierung der Abfolge der Satzglieder im Deutschen.
Neue Beiträge zur Germanistik 112. 3–47.
Fanselow, Gisbert. 2004a. Cyclic phonology-syntax-interaction: PPT Movement in Ger-
man (and other languages). In Shinichiro Ishihara, Michaela Schmitz & Anne Schwarz
(eds.), Interdisciplinary studies on information structure (Working Papers of the SFB 632
1), 1–42. Potsdam: Universitätsverlag.
Fanselow, Gisbert. 2004b. Fakten, Fakten, Fakten! Linguistische Berichte 200. 481–493.
Fanselow, Gisbert. 2004c. Münchhausen-style head movement and the analysis of verb
second. In Ralf Vogel (ed.), Three papers on German verb movement (Linguistics in
Potsdam 22), 9–49. Universität Potsdam.
Fanselow, Gisbert. 2006. On pure syntax (uncontaminated by information structure). In
Patrick Brandt & Eric Fuss (eds.), Form, structure and grammar: A festschrift presented
to Günther Grewendorf on occasion of his 60th birthday (Studia grammatica 63), 137–
157. Berlin: Akademie Verlag.
Fanselow, Gisbert. 2009. Die (generative) Syntax in den Zeiten der Empiriediskussion.
Zeitschrift für Sprachwissenschaft 28(1). 133–139.
Fanselow, Gisbert & Sascha W. Felix. 1987. Sprachtheorie 2. Die Rektions- und Bindungsthe-
orie (UTB für Wissenschaft: Uni-Taschenbücher 1442). Tübingen: A. Francke Verlag
GmbH.
Fanselow, Gisbert, Matthias Schlesewsky, Damir Cavar & Reinhold Kliegl. 1999. Optimal
parsing, syntactic parsing preferences, and Optimality Theory. Rutgers Optimality
Archive (ROA) 367. Universität Potsdam. http://roa.rutgers.edu/view.php3?roa=367
(18 August, 2020).
Feldhaus, Anke. 1997. Fragen über Fragen: Eine HPSG-Analyse ausgewählter Phänomene
des deutschen w-Fragesatzes. Working Papers of the Institute for Logic and Linguistics
27. IBM Scientific Center Heidelberg: Institute for Logic & Linguistics.
Feldman, Jerome. 1972. Some decidability results on grammatical inference and complex-
ity. Information and Control 20(3). 244–262.
Fillmore, Charles J. 1968. The case for case. In Emmon Bach & Robert T. Harms (eds.),
Universals of linguistic theory, 1–88. New York: Holt, Rinehart, & Winston.
Fillmore, Charles J. 1971. Plädoyer für Kasus. In Werner Abraham (ed.), Kasustheorie
(Schwerpunkte Linguistik und Kommunikationswissenschaft 2), 1–118. Frank-
furt/Main: Athenäum.
Fillmore, Charles J. 1988. The mechanisms of “Construction Grammar”. In Shelley Ax-
maker, Annie Jaisser & Helen Singmaster (eds.), Proceedings of the 14th Annual Meeting
of the Berkeley Linguistics Society, 35–55. Berkeley, CA: Berkeley Linguistics Society.
Fillmore, Charles J. 1999. Inversion and Constructional inheritance. In Gert Webelhuth,
Jean-Pierre Koenig & Andreas Kathol (eds.), Lexical and Constructional aspects of lin-
guistic explanation (Studies in Constraint-Based Lexicalism 1), 113–128. Stanford, CA:
CSLI Publications.
Fillmore, Charles J., Paul Kay & Mary Catherine O’Connor. 1988. Regularity and id-
iomaticity in grammatical constructions: The case of let alone. Language 64(3). 501–
538.
Fillmore, Charles J., Russell R. Lee-Goldmann & Russell Rhomieux. 2012. The FrameNet
constructicon. In Hans C. Boas & Ivan A. Sag (eds.), Sign-Based Construction Grammar
(CSLI Lecture Notes 193), 309–372. Stanford, CA: CSLI Publications.
Fischer, Ingrid & Martina Keil. 1996. Parsing decomposable idioms. In Jun-ichi Tsuji
(ed.), Proceedings of COLING-96. 16th International Conference on Computational Lin-
guistics (COLING96). Copenhagen, Denmark, August 5–9, 1996, 388–393. Copenhagen,
Denmark: Association for Computational Linguistics.
Fischer, Kerstin & Anatol Stefanowitsch (eds.). 2006. Konstruktionsgrammatik: Von der
Anwendung zur Theorie (Stauffenburg Linguistik 40). Tübingen: Stauffenburg Verlag.
Fisher, Simon E. & Gary F. Marcus. 2005. The eloquent ape: genes, brains and the evolu-
tion of language. Nature Reviews Genetics 7(1). 9–20.
Fisher, Simon E., Faraneh Vargha-Khadem, Kate E. Watkins, Anthony P. Monaco & Mar-
cus E. Pembrey. 1998. Localisation of a gene implicated in a severe speech and language
disorder. Nature Genetics 18(2). 168–170.
Fitch, W. Tecumseh. 2010. Three meanings of “recursion”: key distinctions for biolinguis-
tics. In Richard K. Larson, Viviane Déprez & Hiroko Yamakido (eds.), The evolution of
human language: biolinguistic perspectives (Approaches to the Evolution of Language
2), 73–90. Cambridge, UK: Cambridge University Press.
Fitch, W. Tecumseh, Marc D. Hauser & Noam Chomsky. 2005. The evolution of the lan-
guage faculty: clarifications and implications. Cognition 97(2). 179–210.
Flickinger, Dan. 2000. On building a more efficient grammar by exploiting types. Natural
Language Engineering 6(1). Special Issue on Efficient Processing with HPSG: Methods,
Systems, Evaluation, 15–28. DOI: 10.1017/S1351324900002370.
Flickinger, Dan, Tom Wasow & Carl Pollard. 2020. The evolution of HPSG. In Stefan
Müller, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven
Phrase Structure Grammar: The handbook (Empirically Oriented Theoretical Morphol-
ogy and Syntax). To appear. Berlin: Language Science Press.
Flickinger, Daniel P. 1983. Lexical heads and phrasal gaps. In Proceedings of the West Coast
Conference on Formal Linguistics, vol. 2. Stanford University Linguistics Dept.
Flickinger, Daniel P. 1987. Lexical rules in the hierarchical lexicon. Stanford University.
(Doctoral dissertation).
Flickinger, Daniel P. 2008. Transparent heads. In Stefan Müller (ed.), Proceedings of the
15th International Conference on Head-Driven Phrase Structure Grammar, 87–94. Stan-
ford, CA: CSLI Publications. http://csli-publications.stanford.edu/HPSG/2008/abstr-
flickinger.shtml (18 August, 2020).
Flickinger, Daniel P. & Emily M. Bender. 2003. Compositional semantics in a multilin-
gual grammar resource. In Emily M. Bender, Daniel P. Flickinger, Frederik Fouvry &
Melanie Siegel (eds.), Proceedings of the ESSLLI 2003 Workshop “Ideas and Strategies for
Multilingual Grammar Development”, 33–42. Vienna, Austria.
Flickinger, Daniel P., Ann Copestake & Ivan A. Sag. 2000. HPSG analysis of English. In
Wolfgang Wahlster (ed.), Verbmobil: Foundations of speech-to-speech translation (Arti-
ficial Intelligence), 254–263. Berlin: Springer Verlag.
Flickinger, Daniel P., Carl J. Pollard & Thomas Wasow. 1985. Structure-sharing in lexi-
cal representation. In William C. Mann (ed.), Proceedings of the Twenty-Third Annual
Meeting of the Association for Computational Linguistics, 262–267. Chicago, IL. DOI:
10.3115/981210.981242.
Fodor, Janet Dean. 1998a. Parsing to learn. Journal of Psycholinguistic Research 27(3). 339–
374.
Fodor, Janet Dean. 1998b. Unambiguous triggers. Linguistic Inquiry 29(1). 1–36.
Fodor, Janet Dean. 2001. Parameters and the periphery: reflections on syntactic nuts.
Journal of Linguistics 37. 367–392.
Fodor, Jerry A., Thomas G. Bever & Merrill F. Garrett. 1974. The psychology of language:
An introduction to psycholinguistics and Generative Grammar. New York: McGraw-Hill
Book Co.
Fokkens, Antske. 2011. Metagrammar engineering: Towards systematic exploration of
implemented grammars. In Proceedings of the 49th Annual Meeting of the Association
for Computational Linguistics: Human Language Technologies, 1066–1076. Portland,
Oregon, USA: Association for Computational Linguistics. http://www.aclweb.org/
anthology/P11-1107 (20 February, 2018).
Fokkens, Antske, Laurie Poulson & Emily M. Bender. 2009. Inflectional morphology in
Turkish VP coordination. In Stefan Müller (ed.), Proceedings of the 16th International
Conference on Head-Driven Phrase Structure Grammar, University of Göttingen, Ger-
many, 110–130. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/
HPSG/2009/ (18 August, 2020).
Fong, Sandiway. 1991. Computational properties of principle-based grammatical theories.
MIT Artificial Intelligence Lab. (Doctoral dissertation).
Fong, Sandiway. 2014. Unification and efficient computation in the Minimalist Program.
In L. Francis & L. Laurent (eds.), Language and recursion, 129–138. Berlin: Springer
Verlag.
Fong, Sandiway & Jason Ginsburg. 2012. Computation with doubling constituents:
Pronouns and antecedents in Phase Theory. In Anna Maria Di Sciullo (ed.),
Towards a Biolinguistic understanding of grammar: Essays on interfaces (Linguistik Ak-
tuell/Linguistics Today 194), 303–338. Amsterdam: John Benjamins Publishing Co.
Fordham, Andrew & Matthew Walter Crocker. 1994. Parsing with principles and proba-
bilities. In Judith L. Klavans & Philip Resnik (eds.), The balancing act: combining symbolic
and statistical approaches to language. Las Cruces, New Mexico, USA: Association for
Computational Linguistics.
Forst, Martin. 2006. COMP in (parallel) grammar writing. In Miriam Butt & Tracy Hol-
loway King (eds.), Proceedings of the LFG 2006 conference. Stanford, CA: CSLI Publica-
tions. http://csli-publications.stanford.edu/LFG/11/pdfs/lfg06forst.pdf (18 August,
2020).
Forst, Martin & Christian Rohrer. 2009. Problems of German VP coordination. In Miriam
Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2009 conference, 297–316.
Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/14/ (18 Au-
gust, 2020).
Fortmann, Christian. 1996. Konstituentenbewegung in der DP-Struktur: Zur funktionalen
Analyse der Nominalphrase im Deutschen (Linguistische Arbeiten 347). Tübingen: Max
Niemeyer Verlag.
Fourquet, Jean. 1957. Review of: Heinz Anstock: Deutsche Syntax – Lehr- und Übungs-
buch. Wirkendes Wort 8. 120–122.
Fourquet, Jean. 1970. Prolegomena zu einer deutschen Grammatik (Sprache der Gegenwart
– Schriften des Instituts für deutsche Sprache in Mannheim 7). Düsseldorf: Pädagogis-
cher Verlag Schwann.
Fouvry, Frederik. 2003. Lexicon acquisition with a large-coverage unification-based
grammar. In Proceedings of EACL 03, 10th Conference of the European Chapter of the
Association for Computational Linguistics, research notes and demos, April 12–17, 2003,
Budapest, Hungary, 87–90.
Fraj, Fériel Ben, Chiraz Zribi & Mohamed Ben Ahmed. 2008. ArabTAG: A Tree Adjoin-
ing Grammar for Arabic syntactic structures. In Proceedings of the International Arab
Conference on Information Technology. Sfax, Tunisia.
Frank, Anette. 1994. Verb second by lexical rule or by underspecification. Arbeitspapiere
des SFB 340 No. 43. Heidelberg: IBM Deutschland GmbH. ftp://ftp.ims.uni-stuttgart.
de/pub/papers/anette/v2-usp.ps.gz (20 March, 2018).
Frank, Anette. 1996. Eine LFG-Grammatik des Französischen. In Deutsche und französis-
che Syntax im Formalismus der LFG (Linguistische Arbeiten 344), 97–244. Tübingen:
Max Niemeyer Verlag.
Frank, Anette. 2006. (Discourse-) functional analysis of asymmetric coordination. In
Miriam Butt, Mary Dalrymple & Tracy Holloway King (eds.), Intelligent linguistic ar-
chitectures: Variations on themes by Ronald M. Kaplan, 259–285. Stanford, CA: CSLI
Publications.
Frank, Anette & Uwe Reyle. 1995. Principle based semantics for HPSG. In Steven P. Abney
& Erhard W. Hinrichs (eds.), Proceedings of the Seventh Conference of the European
Chapter of the Association for Computational Linguistics, 9–16. Dublin: Association for
Computational Linguistics.
Frank, Anette & Annie Zaenen. 2002. Tense in LFG: Syntax and morphology. In Hans
Kamp & Uwe Reyle (eds.), How we say when it happens: Contributions to the theory
of temporal reference in natural language, 17–52. Reprint as: Frank & Zaenen (2004).
Tübingen: Max Niemeyer Verlag.
Frank, Anette & Annie Zaenen. 2004. Tense in LFG: Syntax and morphology. In Louisa
Sadler & Andrew Spencer (eds.), Projecting morphology, 23–66. Stanford, CA: CSLI
Publications.
Frank, Robert. 2002. Phrase structure composition and syntactic dependencies (Current
Studies in Linguistics 38). Cambridge, MA/London: MIT Press.
Franks, Steven. 1995. Parameters in Slavic morphosyntax. New York, Oxford: Oxford Uni-
versity Press.
Frazier, Lyn. 1985. Syntactic complexity. In David R. Dowty, Lauri Karttunen & Arnold
M. Zwicky (eds.), Natural language processing, 129–189. Cambridge, UK: Cambridge
University Press.
Frazier, Lyn & Charles Jr. Clifton. 1996. Construal. Cambridge, MA: MIT Press.
Freidin, Robert. 1975. The analysis of passives. Language 51(2). 384–405.
Freidin, Robert. 1997. Review article: The Minimalist Program. Language 73(3). 571–582.
Freidin, Robert. 2009. A note on methodology in linguistics. The Behavioral and Brain
Sciences 32(5). 454–455.
Freudenthal, Daniel, Julian M. Pine, Javier Aguado-Orea & Fernand Gobet. 2007.
Modeling the developmental patterning of finiteness marking in English, Dutch, Ger-
man, and Spanish using MOSAIC. Cognitive Science 31(2). 311–341. DOI:
10.1080/15326900701221454.
Freudenthal, Daniel, Julian M. Pine & Fernand Gobet. 2006. Modeling the development
of children’s use of optional infinitives in Dutch and English using MOSAIC. Cognitive
Science 30(2). 277–310. DOI: 10.1207/s15516709cog0000_47.
Freudenthal, Daniel, Julian M. Pine & Fernand Gobet. 2009. Simulating the referential
properties of Dutch, German, and English root infinitives in MOSAIC. Language Learn-
ing and Development 5(1). 1–29. DOI: 10.1080/15475440802502437.
Frey, Werner. 1993. Syntaktische Bedingungen für die semantische Interpretation: Über
Bindung, implizite Argumente und Skopus (studia grammatica 35). Berlin: Akademie
Verlag.
Frey, Werner. 2000. Über die syntaktische Position der Satztopiks im Deutschen. In
Ewald Lang, Marzena Rochon, Kerstin Schwabe & Oliver Teuber (eds.), Issues on topics
(ZAS Papers in Linguistics 20), 137–172. Berlin: ZAS, Humboldt-Universität zu Berlin.
Frey, Werner. 2001. About the whereabouts of indefinites. Theoretical Linguistics 27(2/3).
Special Issue: NP Interpretation and Information Structure, Edited by Klaus von
Heusinger and Kerstin Schwabe, 137–161. DOI: 10.1515/thli.2001.27.2-3.137.
Frey, Werner. 2004a. A medial topic position for German. Linguistische Berichte 198. 153–
190.
Frey, Werner. 2004b. The grammar-pragmatics interface and the German prefield. For-
schungsprogramm Sprache und Pragmatik 52. Germanistisches Institut der Univer-
sität Lund. 1–39.
Frey, Werner. 2005. Pragmatic properties of certain German and English left peripheral
constructions. Linguistics 43(1). 89–129.
Frey, Werner & Hans-Martin Gärtner. 2002. On the treatment of scrambling and ad-
junction in Minimalist Grammars. In Gerhard Jäger, Paola Monachesi, Gerald Penn
& Shuly Wintner (eds.), Proceedings of Formal Grammar 2002, 41–52. Trento.
Frey, Werner & Uwe Reyle. 1983a. A Prolog implementation of Lexical Functional Gram-
mar as a base for a natural language processing system. In Antonio Zampolli (ed.),
First Conference of the European Chapter of the Association for Computational Linguis-
tics: Proceedings of the conference, 52–57. Pisa, Italy: Association for Computational
Linguistics. https://www.aclweb.org/anthology/events/eacl-1983/ (18 August, 2020).
Frey, Werner & Uwe Reyle. 1983b. Lexical Functional Grammar und Diskursrepräsen-
tationstheorie als Grundlagen eines sprachverarbeitenden Systems. Linguistische Be-
richte 88. 79–100.
Fried, Mirjam. 2013. Principles of constructional change. In Thomas Hoffmann & Graeme
Trousdale (eds.), The Oxford handbook of Construction Grammar (Oxford Handbooks),
419–437. Oxford: Oxford University Press.
Fried, Mirjam. 2015. Construction Grammar. In Tibor Kiss & Artemis Alexiadou (eds.),
Syntax – theory and analysis: An international handbook, vol. 42.3 (Handbooks of Lin-
guistics and Communication Science), 974–1003. Berlin: Mouton de Gruyter. DOI:
10.1515/9783110363685.
Friederici, Angela D. 2009. Pathways to language: fiber tracts in the human brain. Trends
in Cognitive Sciences 13(4). 175–181.
Friedman, Joyce. 1969. Applications of a computer system for Transformational Gram-
mar. In Research Group for Quantitative Linguistics (ed.), Proceedings of COLING 69,
1–27.
Friedman, Joyce, Thomas H. Bredt, Robert W. Doran, Bary W. Pollack & Theodore S.
Martner. 1971. A computer model of Transformational Grammar (Mathematical Lin-
guistics and Automatic Language Processing 9). New York: Elsevier.
Fries, Norbert. 1988. Über das Null-Topik im Deutschen. Forschungsprogramm Sprache
und Pragmatik 3. Germanistisches Institut der Universität Lund. 19–49.
Fukui, Naoki & Margaret Speas. 1986. Specifiers and projection. In N. Fukui, T. R.
Rapoport & E. Sagey (eds.), Papers in theoretical linguistics (MIT Working Papers 8),
128–172. Cambridge, MA: MIT.
Futrell, Richard, Laura Stearns, Daniel L. Everett, Steven T. Piantadosi & Edward Gibson.
2016. A corpus investigation of syntactic embedding in Pirahã. PLoS ONE 11(3). 1–20.
DOI: 10.1371/journal.pone.0145289.
Gaifman, Haim. 1965. Dependency systems and phrase-structure systems. Information
and Control 8(3). 304–397. DOI: 10.1016/S0019-9958(65)90232-9.
Gallmann, Peter. 2003. Grundlagen der deutschen Grammatik. Lecture notes Friedrich-
Schiller-Universität Jena. http://www.syntax-theorie.de (18 August, 2020).
Gardner, R. Allen. 1957. Probability-learning with two and three choices. The American
Journal of Psychology 70(2). 174–185.
Gärtner, Hans-Martin & Jens Michaelis. 2007. Some remarks on locality conditions and
Minimalist Grammars. In Uli Sauerland & Hans-Martin Gärtner (eds.), Interfaces + re-
cursion = language? Chomsky’s Minimalism and the view from syntax-semantics (Stud-
ies in Generative Grammar 89), 161–195. Berlin: Mouton de Gruyter.
Gärtner, Hans-Martin & Markus Steinbach. 1997. Anmerkungen zur Vorfeldphobie
pronominaler Elemente. In Franz-Josef d’Avis & Uli Lutz (eds.), Zur Satzstruktur im
Deutschen (Arbeitspapiere des SFB 340 No. 90), 1–30. Tübingen: Eberhard-Karls-Uni-
versität Tübingen.
Gazdar, Gerald. 1981a. On syntactic categories. Philosophical Transactions of the Royal
Society of London. Series B, Biological Sciences 295(1077). 267–283.
Gazdar, Gerald. 1981b. Unbounded dependencies and coordinate structure. Linguistic In-
quiry 12(2). 155–184.
Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum & Ivan A. Sag. 1985. Generalized Phrase
Structure Grammar. Cambridge, MA: Harvard University Press.
Gazdar, Gerald, Geoffrey K. Pullum, Bob Carpenter, Ewan Klein, Thomas E. Hukari &
Robert D. Levine. 1988. Category structures. Computational Linguistics 14(1). 1–19.
Geach, Peter Thomas. 1970. A program for syntax. Synthese 22. 3–17.
Geißler, Stefan & Tibor Kiss. 1994. Erläuterungen zur Umsetzung einer HPSG im Basis-
formalismus STUF III. Tech. rep. 19. Heidelberg: IBM Informationssysteme GmbH –
Institut für Logik und Linguistik (Verbundvorhaben Verbmobil).
Gerdes, Kim. 2002a. DTAG? In Proceedings of the Sixth International Workshop on Tree
Adjoining Grammar and Related Frameworks (TAG+6), 242–251. Università di Venezia.
Gerdes, Kim. 2002b. Topologie et grammaires formelles de l’allemand. Ecole doctorale
Science du langage, UFR de linguistique, Université Paris 7. (Doctoral dissertation).
Gerdes, Kim & Sylvain Kahane. 2001. Word order in German: A formal Dependency
Grammar using a topological hierarchy. In Proceedings of the 39th Annual Meeting on
Association for Computational Linguistics, 220–227. Toulouse, France: Association for
Computational Linguistics. DOI: 10.3115/1073012.1073041.
Gerken, LouAnn. 1991. The metrical basis for children’s subjectless sentences. Journal of
Memory and Language 30. 431–451.
Gibson, Edward. 1998. Linguistic complexity: Locality of syntactic dependencies. Cogni-
tion 68(1). 1–76.
Gibson, Edward & James Thomas. 1999. Memory limitations and structural forgetting:
The perception of complex ungrammatical sentences as grammatical. Language and
Cognitive Processes 14(3). 225–248.
Gibson, Edward & Kenneth Wexler. 1994. Triggers. Linguistic Inquiry 25(3). 407–454.
Ginzburg, Jonathan & Ivan A. Sag. 2000. Interrogative investigations: The form, meaning,
and use of English interrogatives (CSLI Lecture Notes 123). Stanford, CA: CSLI Publica-
tions.
Godard, Danièle & Pollet Samvelian. 2020. Complex predicates. In Stefan Müller, Anne
Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure
Grammar: The handbook (Empirically Oriented Theoretical Morphology and Syntax),
357–419. To appear. Berlin: Language Science Press.
Gold, Mark E. 1967. Language identification in the limit. Information and Control 10(5).
447–474.
Goldberg, Adele E. 1995. Constructions: A Construction Grammar approach to argument
structure (Cognitive Theory of Language and Culture). Chicago: The University of
Chicago Press.
Goldberg, Adele E. 2003a. Constructions: A new theoretical approach to language. Trends
in Cognitive Sciences 7(5). 219–224.
Goldberg, Adele E. 2003b. Words by default: The Persian Complex Predicate Construc-
tion. In Elaine J. Francis & Laura A. Michaelis (eds.), Mismatch: form-function incon-
gruity and the architecture of grammar (CSLI Lecture Notes 163), 117–146. Stanford,
CA: CSLI Publications.
Goldberg, Adele E. 2006. Constructions at work: The nature of generalization in language
(Oxford Linguistics). Oxford: Oxford University Press.
Goldberg, Adele E. 2009. Constructions work. [response]. Cognitive Linguistics 20(1).
201–224.
Goldberg, Adele E. 2013a. Argument structure Constructions vs. lexical rules or deriva-
tional verb templates. Mind and Language 28(4). 435–465. DOI: 10.1111/mila.12026.
Goldberg, Adele E. 2013b. Explanation and Constructions: Response to Adger. Mind and
Language 28(4). 479–491. DOI: 10.1111/mila.12028.
Goldberg, Adele E. 2014. Fitting a slim dime between the verb template and argument
structure construction approaches. Theoretical Linguistics 40(1–2). 113–135.
Goldberg, Adele E., Devin Casenhiser & Nitya Sethuraman. 2004. Learning argument
structure generalizations. Cognitive Linguistics 15(3). 289–316.
Goldberg, Adele E. & Ray S. Jackendoff. 2004. The English resultative as a family of
Constructions. Language 80(3). 532–568.
Gopnik, Myrna & Martha B. Crago. 1991. Familial aggregation of a developmental lan-
guage disorder. Cognition 39(1). 1–50.
Gordon, Peter. 1985. Level ordering in lexical development. Cognition 21(2). 73–93. DOI:
10.1016/0010-0277(85)90046-0.
Gosch, Angela, Gabriele Städing & Rainer Pankau. 1994. Linguistic abilities in children
with Williams-Beuren Syndrome. American Journal of Medical Genetics 52(3). 291–296.
Götz, Thilo, Walt Detmar Meurers & Dale Gerdemann. 1997. The ConTroll manual: (Con-
Troll v.1.0 beta, XTroll v.5.0 beta). User’s Manual. Universität Tübingen: Seminar für
Sprachwissenschaft. http://www.sfs.uni-tuebingen.de/controll/code.html (18 August,
2020).
Grebe, Paul & Helmut Gipper. 1966. Duden: Grammatik der deutschen Gegenwartssprache.
2nd edn. Vol. 4. Mannheim, Wien, Zürich: Dudenverlag.
Green, Georgia M. 2011. Modelling grammar growth: Universal Grammar without in-
nate principles or parameters. In Robert D. Borsley & Kersti Börjars (eds.), Non-
transformational syntax: Formal and explicit models of grammar: A guide to current
models, 378–403. Oxford, UK/Cambridge, MA: Blackwell Publishers Ltd.
Grewendorf, Günther. 1983. Reflexivierungen in deutschen A.c.I.-Konstruktionen – kein
transformationsgrammatisches Dilemma mehr. Groninger Arbeiten zur Germanisti-
schen Linguistik 23. 120–196.
Grewendorf, Günther. 1985. Anaphern bei Objekt-Koreferenz im Deutschen: Ein Pro-
blem für die Rektions-Bindungs-Theorie. In Werner Abraham (ed.), Erklärende Syntax
des Deutschen (Studien zur deutschen Grammatik 25), 137–171. Tübingen: originally
Gunter Narr Verlag now Stauffenburg Verlag.
Grewendorf, Günther. 1987. Kohärenz und Restrukturierung: Zu verbalen Komplexen
im Deutschen. In Brigitte Asbach-Schnitker & Johannes Roggenhofer (eds.), Neuere
Forschungen zur Wortbildung und Historiographie: Festgabe für Herbert E. Brekle zum 50.
Geburtstag (Tübinger Beiträge zur Linguistik 284), 123–144. Tübingen: Gunter Narr
Verlag.
Grewendorf, Günther. 1988. Aspekte der deutschen Syntax: Eine Rektions-Bindungs-Ana-
lyse (Studien zur deutschen Grammatik 33). Tübingen: originally Gunter Narr Verlag
now Stauffenburg Verlag.
Grewendorf, Günther. 1989. Ergativity in German (Studies in Generative Grammar 35).
Dordrecht: Foris Publications. DOI: 10.1515/9783110859256.
Grewendorf, Günther. 1993. German: A grammatical sketch. In Joachim Jacobs, Arnim
von Stechow, Wolfgang Sternefeld & Theo Vennemann (eds.), Syntax – Ein interna-
tionales Handbuch zeitgenössischer Forschung, vol. 9.2 (Handbücher zur Sprach- und
Kommunikationswissenschaft), 1288–1319. Berlin: Walter de Gruyter Verlag. DOI:
10.1515/9783110095869.1.
Grewendorf, Günther. 2002. Minimalistische Syntax (UTB für Wissenschaft: Uni-
Taschenbücher 2313). Tübingen, Basel: A. Francke Verlag GmbH.
Grewendorf, Günther. 2009. The left clausal periphery: Clitic left dislocation in Italian
and left dislocation in German. In Benjamin Shear, Philippa Helen Cook, Werner Frey
& Claudia Maienborn (eds.), Dislocated elements in discourse: Syntactic, semantic, and
pragmatic perspectives (Routledge Studies in Germanic Linguistics), 49–94. New York:
Routledge.
Grimshaw, Jane. 1986. Subjacency and the S/S′ Parameter. Linguistic Inquiry 17(2). 364–
369.
Grimshaw, Jane. 1997. Projections, heads, and optimality. Linguistic Inquiry 28. 373–422.
Grinberg, Dennis, John D. Lafferty & Daniel Dominic Sleator. 1995. A robust parsing
algorithm for Link Grammars. In Proceedings of the Fourth International Workshop on
Parsing Technologies. Also as Carnegie Mellon University Computer Science Technical
Report CMU-CS-95-125. http://arxiv.org/abs/cmp-lg/9508003 (18 August, 2020).
Groos, Anneke & Henk van Riemsdijk. 1981. Matching effects in free relatives: A param-
eter of core grammar. In A. Belletti, L. Brandi & L. Rizzi (eds.), Theory of markedness
in Generative Grammar, 171–216. Pisa: Scuola Normale Superiore.
Groß, Thomas M. & Timothy Osborne. 2009. Toward a practical Dependency Grammar
theory of discontinuities. SKY Journal of Linguistics 22. 43–90.
Groß, Thomas Michael. 2003. Dependency Grammar’s limits – and ways of extending
them. In Vilmos Ágel, Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans
Jürgen Heringer & Henning Lobin (eds.), Dependenz und Valenz / Dependency and va-
lency: Ein internationales Handbuch der zeitgenössischen Forschung / An international
handbook of contemporary research, vol. 1 (Handbücher zur Sprach- und Kommunika-
tionswissenschaft 25), 331–351. Berlin: Walter de Gruyter.
Grosu, Alexander. 1973. On the status of the so-called Right Roof Constraint. Language
49(2). 294–311.
Grover, Claire, John Carroll & Ted J. Briscoe. 1993. The Alvey Natural Language Tools
grammar (4th release). Technical Report 284. Computer Laboratory, Cambridge Uni-
versity, UK.
Grubačić, Emilija. 1965. Untersuchungen zur Frage der Wortstellung in der deutschen
Prosadichtung der letzten Jahrzehnte. Zagreb: Philosophische Fakultät. (Doctoral dis-
sertation).
Gruber, Jeffrey. 1965. Studies in lexical relations. MIT. (Doctoral dissertation).
Gunji, Takao. 1986. Subcategorization and word order. In William J. Poser (ed.), Papers
from the Second International Workshop on Japanese Syntax, 1–21. Stanford, CA: CSLI
Publications.
Günther, Carsten, Claudia Maienborn & Andrea Schopp. 1999. The processing of infor-
mation structure. In Peter Bosch & Rob van der Sandt (eds.), Focus: Linguistic, cognitive,
and computational perspectives (Studies in Natural Language Processing), 18–42. Rev.
papers orig. presented at a conference held 1994, Schloss Wolfsbrunnen, Germany.
Cambridge, UK: Cambridge University Press.
Guo, Yuqing, Haifeng Wang & Josef van Genabith. 2007. Recovering non-local depen-
dencies for Chinese. In Proceedings of the Joint Conference on Empirical Methods in
Natural Language Processing and Natural Language Learning, (EMNLP-CoNLL 2007),
257–266. Prague, Czech Republic: Association for Computational Linguistics.
Guzmán Naranjo, Matías. 2015. Unifying everything: Integrating quantitative effects into
formal models of grammar. In Proceedings of the 6th Conference on Quantitative Inves-
tigations in Theoretical Linguistics. Tübingen. DOI: 10.15496/publikation-8636.
Haddar, Kais, Sirine Boukedi & Ines Zalila. 2010. Construction of an HPSG grammar for
the Arabic language and its specification in TDL. International Journal on Information
and Communication Technologies 3(3). 52–64.
Haegeman, Liliane. 1994. Introduction to Government and Binding Theory. 2nd edn. (Black-
well Textbooks in Linguistics 1). Oxford: Blackwell Publishers Ltd.
Haegeman, Liliane. 1995. The syntax of negation. Cambridge, UK: Cambridge University
Press.
Haftka, Brigitta. 1995. Syntactic positions for topic and contrastive focus in the Ger-
man middlefield. In Inga Kohlhof, Susanne Winkler & Hans-Bernhard Drubig (eds.),
Proceedings of the Göttingen Focus Workshop, 17 DGfS, March 1–3 (Arbeitspapiere des
SFB 340 No. 69), 137–157. Eberhard-Karls-Universität Tübingen.
Haftka, Brigitta. 1996. Deutsch ist eine V/2-Sprache mit Verbendstellung und freier Wort-
folge. In Ewald Lang & Gisela Zifonun (eds.), Deutsch – typologisch (Institut für deu-
tsche Sprache, Jahrbuch 1995), 121–141. Berlin: Walter de Gruyter. DOI:
10.1515/9783110622522-007.
Hagen, Kristin, Janne Bondi Johannessen & Anders Nøklestad. 2000. A constraint-based
tagger for Norwegian. In C.-E. Lindberg & S. N. Lund (eds.), 17th Scandinavian Confer-
ence of Linguistics, Odense, vol. I (Odense Working Papers in Language and Communi-
cation 19), 1–15.
Hahn, Michael. 2011. Null conjuncts and bound pronouns in Arabic. In Stefan Müller
(ed.), Proceedings of the 18th International Conference on Head-Driven Phrase Structure
Grammar, University of Washington, 60–80. Stanford, CA: CSLI Publications. http://
csli-publications.stanford.edu/HPSG/2011/ (18 August, 2020).
Haider, Hubert. 1982. Dependenzen und Konfigurationen: Zur deutschen V-Projektion.
Groninger Arbeiten zur Germanistischen Linguistik 21. 1–60.
Haider, Hubert. 1984. Was zu haben ist und was zu sein hat – Bemerkungen zum Infinitiv.
Papiere zur Linguistik 30(1). 23–36.
Haider, Hubert. 1985a. Der Rattenfängerei muß ein Ende gemacht werden. Wiener Lin-
guistische Gazette 35–36. 28–50.
Haider, Hubert. 1985b. The case of German. In Jindřich Toman (ed.), Studies in German
grammar (Studies in Generative Grammar 21), 23–64. Dordrecht: Foris Publications.
Haider, Hubert. 1985c. Über sein oder nicht sein: Zur Grammatik des Pronomens sich.
In Werner Abraham (ed.), Erklärende Syntax des Deutschen (Studien zur deutschen
Grammatik 25), 223–254. Tübingen: originally Gunter Narr Verlag now Stauffenburg
Verlag.
Haider, Hubert. 1986a. Fehlende Argumente: Vom Passiv zu kohärenten Infinitiven. Lin-
guistische Berichte 101. 3–33.
Haider, Hubert. 1986b. Nicht-sententiale Infinitive. Groninger Arbeiten zur Germanisti-
schen Linguistik 28. 73–114.
Haider, Hubert. 1990a. Pro-bleme? In Gisbert Fanselow & Sascha W. Felix (eds.), Struk-
turen und Merkmale syntaktischer Kategorien (Studien zur deutschen Grammatik 39),
121–143. Tübingen: originally Gunter Narr Verlag now Stauffenburg Verlag.
Haider, Hubert. 1990b. Topicalization and other puzzles of German syntax. In Günther
Grewendorf & Wolfgang Sternefeld (eds.), Scrambling and Barriers (Linguistik Aktuell/
Linguistics Today 5), 93–112. Amsterdam: John Benjamins Publishing Co. DOI:
10.1075/la.5.06hai.
Haider, Hubert. 1991. Fakultativ kohärente Infinitivkonstruktionen im Deutschen. Arbeits-
papiere des SFB 340 No. 17. Heidelberg: IBM Deutschland GmbH.
Haider, Hubert. 1993. Deutsche Syntax – generativ: Vorstudien zur Theorie einer projektiven
Grammatik (Tübinger Beiträge zur Linguistik 325). Tübingen: Gunter Narr Verlag.
Haider, Hubert. 1994. (Un-)heimliche Subjekte: Anmerkungen zur Pro-drop Causa, im
Anschluß an die Lektüre von Osvaldo Jaeggli & Kenneth J. Safir, eds., The Null Subject
Parameter. Linguistische Berichte 153. 372–385.
Haider, Hubert. 1995. Studies on phrase structure and economy. Arbeitspapiere des SFB
340 No. 70. Stuttgart: Universität Stuttgart.
Haider, Hubert. 1997a. Projective economy: On the minimal functional structure of the
German clause. In Werner Abraham & Elly van Gelderen (eds.), German: Syntactic
problems—Problematic syntax (Linguistische Arbeiten 374), 83–103. Tübingen: Max
Niemeyer Verlag.
Haider, Hubert. 1997b. Typological implications of a directionality constraint on projec-
tions. In Artemis Alexiadou & T. Alan Hall (eds.), Studies on Universal Grammar and ty-
pological variation (Linguistik Aktuell/Linguistics Today 13), 17–33. Amsterdam: John
Benjamins Publishing Co.
Haider, Hubert. 1999. The license to license: Structural case plus economy yields Burzio’s
Generalization. In Eric Reuland (ed.), Arguments and case: Explaining Burzio’s General-
ization (Linguistik Aktuell/Linguistics Today 34), 31–55. Amsterdam: John Benjamins
Publishing Co.
Haider, Hubert. 2000. OV is more basic than VO. In Peter Svenonius (ed.), The derivation
of VO and OV, 45–67. Amsterdam: John Benjamins Publishing Co.
Haider, Hubert. 2001. Parametrisierung in der Generativen Grammatik. In Martin Haspel-
math, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Sprachtypologie
und sprachliche Universalien – Language typology and language universals: Ein interna-
tionales Handbuch – An international handbook, 283–294. Berlin: Mouton de Gruyter.
Haider, Hubert. 2014. Scientific ideology in grammar theory. Ms. Universität Salzburg,
Dept. of Linguistics and Centre for Cognitive Neuroscience.
Haider, Hubert. 2015. “Intelligent design” of grammars – a result of cognitive evolution. In
Aria Adli, Marco García García & Göz Kaufmann (eds.), Variation in language: System-
and usage-based approaches (linguae & litterae 50), 203–238. de Gruyter. DOI:
10.1515/9783110346855-009.
Haider, Hubert. 2016. On predicting resultative adjective constructions. Ms. Universität
Salzburg.
Hajičová, Eva & Petr Sgall. 2003. Dependency Syntax in Functional Generative Descrip-
tion. In Vilmos Ágel, Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans
Jürgen Heringer & Henning Lobin (eds.), Dependenz und Valenz / Dependency and va-
lency: Ein internationales Handbuch der zeitgenössischen Forschung / An international
handbook of contemporary research, vol. 1 (Handbücher zur Sprach- und Kommunika-
tionswissenschaft 25), 570–592. Berlin: Walter de Gruyter.
Hakuta, Kenji, Ellen Bialystok & Edward Wiley. 2003. Critical evidence: A test of
the Critical-Period Hypothesis for second-language acquisition. Psychological Science
14(1). 31–38.
Hale, Kenneth. 1976. The adjoined relative clause in Australia. In R.M.W. Dixon (ed.),
Grammatical categories of Australian languages (Linguistic Series 22), 78–105. New
Jersey: Humanities Press.
Hale, Kenneth & Samuel Jay Keyser. 1993. On argument structure and the lexical expres-
sion of syntactic relations. In Kenneth Hale & Samuel Jay Keyser (eds.), The view from
building 20: Essays in linguistics in honor of Sylvain Bromberger (Current Studies in
Linguistics 24), 53–109. Cambridge, MA: MIT Press.
Hale, Kenneth & Samuel Jay Keyser. 1997. On the complex nature of simple predicators.
In Alex Alsina, Joan Bresnan & Peter Sells (eds.), Complex predicates (CSLI Lecture
Notes 64), 29–65. Stanford, CA: CSLI Publications.
Han, Chung-hye, Juntae Yoon, Nari Kim & Martha Palmer. 2000. A feature-based Lexi-
calized Tree Adjoining Grammar for Korean. Technical Report IRCS-00-04. University
of Pennsylvania Institute for Research in Cognitive Science. https://repository.upenn.
edu/ircs_reports/35/ (18 August, 2020).
Harbour, Daniel. 2011. Mythomania? Methods and morals from ‘The myth of language
universals’. Lingua 121(12). 1820–1830.
Harley, Heidi & Rolf Noyer. 2000. Formal versus encyclopedic properties of vocabulary:
Evidence from nominalizations. In Bert Peeters (ed.), The lexicon–encyclopedia inter-
face, 349–374. Amsterdam: Elsevier.
Harman, Gilbert. 1963. Generative grammars without transformation rules: A defence of
phrase structure. Language 39. 597–616.
Harris, Zellig S. 1957. Co-occurrence and transformation in linguistic structure. Language
33(3). 283–340.
Haspelmath, Martin. 2008. Parametric versus functional explanations of syntactic univer-
sals. In T. Biberauer (ed.), The limits of syntactic variation, 75–107. Amsterdam: John
Benjamins Publishing Co.
Haspelmath, Martin. 2009. The best-supported language universals refer to scalar pat-
terns deriving from processing costs. The Behavioral and Brain Sciences 32(5). 457–
458.
Haspelmath, Martin. 2010a. Comparative concepts and descriptive categories in crosslin-
guistic studies. Language 86(3). 663–687.
Haspelmath, Martin. 2010b. Framework-free grammatical theory. In Bernd Heine &
Heiko Narrog (eds.), The Oxford handbook of grammatical analysis, 341–365. Oxford:
Oxford University Press.
Haspelmath, Martin. 2010c. The interplay between comparative concepts and descriptive
categories (reply to Newmeyer). Language 86(3). 696–699.
Haugereid, Petter. 2007. Decomposed phrasal constructions. In Stefan Müller (ed.),
Proceedings of the 14th International Conference on Head-Driven Phrase Structure Gram-
mar, 120–129. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/
HPSG/2007/ (18 August, 2020).
Haugereid, Petter. 2009. Phrasal subconstructions: A Constructionalist grammar design,
exemplified with Norwegian and English. Norwegian University of Science & Technol-
ogy, Trondheim. (Doctoral dissertation).
Haugereid, Petter. 2017. Increasing grammar coverage through fine-grained lexical dis-
tinctions. In Victoria Rosén & Koenraad De Smedt (eds.), The very model of a modern
linguist — in honor of Helge Dyvik (Bergen Language and Linguistics Studies (BeLLS)
8), 97–111. University of Bergen. DOI: 10.15845/bells.v8i1.
Haugereid, Petter, Nurit Melnik & Shuly Wintner. 2013. Nonverbal predicates in Mod-
ern Hebrew. In Stefan Müller (ed.), Proceedings of the 20th International Conference
on Head-Driven Phrase Structure Grammar, Freie Universität Berlin, 69–89. Stanford,
CA: CSLI Publications. http://csli-publications.stanford.edu/HPSG/2013/hmw.pdf
(18 August, 2020).
Hauser, Marc D., Noam Chomsky & W. Tecumseh Fitch. 2002. The faculty of language:
What is it, who has it, and how did it evolve? Science 298(5598). 1569–1579. DOI:
10.1126/science.298.5598.1569.
Hausser, Roland. 1992. Complexity in left-associative grammar. Theoretical Computer Sci-
ence 106(2). 283–308.
Hawkins, John A. 1999. Processing complexity and filler-gap dependencies across gram-
mars. Language 75(2). 244–285.
Hawkins, John A. 2004. Efficiency and complexity in grammars. Oxford: Oxford Univer-
sity Press.
Hays, David G. 1964. Dependency Theory: A formalism and some observations. Lan-
guage 40(4). 511–525.
Hays, David G. & T. W. Ziehe. 1960. Studies in machine translation: 10–Russian sentence-
structure determination. Tech. rep. Rand Corporation.
Heinecke, Johannes, Jürgen Kunze, Wolfgang Menzel & Ingo Schröder. 1998. Eliminative
parsing with graded constraints. In Pierre Isabelle (ed.), Proceedings of the 36th Annual
Meeting of the Association for Computational Linguistics and 17th International Confer-
ence on Computational Linguistics, 526–530. Montreal, Quebec, Canada: Association
for Computational Linguistics. DOI: 10.3115/980845.980953.
Heinz, Wolfgang & Johannes Matiasek. 1994. Argument structure and case assignment
in German. In John Nerbonne, Klaus Netter & Carl Pollard (eds.), German in Head-
Driven Phrase Structure Grammar (CSLI Lecture Notes 46), 199–236. Stanford, CA:
CSLI Publications.
Helbig, Gerhard & Joachim Buscha. 1969. Deutsche Grammatik: Ein Handbuch für den
Ausländerunterricht. Leipzig: VEB Verlag Enzyklopädie.
Helbig, Gerhard & Joachim Buscha. 1998. Deutsche Grammatik: Ein Handbuch für den
Ausländerunterricht. 18th edn. Leipzig Berlin München: Langenscheidt Verlag Enzyk-
lopädie.
Helbig, Gerhard & Wolfgang Schenkel. 1969. Wörterbuch zur Valenz und Distribution
deutscher Verben. Leipzig: VEB Bibliographisches Institut Leipzig.
Hellan, Lars. 1986. The headedness of NPs in Norwegian. In Peter Muysken & Henk
van Riemsdijk (eds.), Features and projections, 89–122. Dordrecht/Cinnaminson, U.S.A.:
Foris Publications.
Hellan, Lars. 2007. On ‘deep evaluation’ for individual computational grammars and
for cross-framework comparison. In Tracy Holloway King & Emily M. Bender (eds.),
Grammar Engineering across Frameworks 2007 (Studies in Computational Linguistics
ONLINE), 161–181. Stanford, CA: CSLI Publications. http://csli-publications.stanford.
edu/GEAF/2007/ (18 August, 2020).
Hellan, Lars & Dorothee Beermann. 2006. The ‘specifier’ in an HPSG grammar imple-
mentation of Norwegian. In S. Werner (ed.), Proceedings of the 15th NODALIDA Con-
ference, Joensuu 2005 (Ling@JoY: University of Joensuu electronic publications in lin-
guistics and language technology 1), 57–64. Joensuu: University of Joensuu.
Hellan, Lars & Petter Haugereid. 2003. Norsource – An exercise in the Matrix Grammar
building design. In Emily M. Bender, Daniel P. Flickinger, Frederik Fouvry & Melanie
Siegel (eds.), Proceedings of the ESSLLI 2003 Workshop “Ideas and Strategies for Multi-
lingual Grammar Development”. Vienna, Austria.
Hellwig, Peter. 1978. PLAIN – Ein Programmsystem zur Sprachbeschreibung und
maschinellen Sprachbearbeitung. Sprache und Datenverarbeitung 1(2). 16–31.
Hellwig, Peter. 1986. Dependency Unification Grammar. In Makoto Nagao (ed.),
Proceedings of COLING 86, 195–198. University of Bonn: Association for Computa-
tional Linguistics.
Hellwig, Peter. 2003. Dependency Unification Grammar. In Vilmos Ágel, Ludwig M.
Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer & Henning
Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein internationales Hand-
buch der zeitgenössischen Forschung / An international handbook of contemporary re-
search, vol. 1 (Handbücher zur Sprach- und Kommunikationswissenschaft 25), 593–
635. Berlin: Walter de Gruyter.
Hellwig, Peter. 2006. Parsing with Dependency Grammars. In Vilmos Ágel, Ludwig M.
Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer & Henning
Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein internationales Hand-
Höhle, Tilman N. 1978. Lexikalische Syntax: Die Aktiv-Passiv-Relation und andere Infinit-
konstruktionen im Deutschen (Linguistische Arbeiten 67). Tübingen: Max Niemeyer
Verlag.
Höhle, Tilman N. 1982. Explikationen für „normale Betonung“ und „normale Wortstel-
lung“. In Werner Abraham (ed.), Satzglieder im Deutschen – Vorschläge zur syntakti-
schen, semantischen und pragmatischen Fundierung (Studien zur deutschen Gramma-
tik 15), 75–153. Republished as Höhle (2018c). Tübingen: originally Gunter Narr Verlag
now Stauffenburg Verlag.
Höhle, Tilman N. 1983. Topologische Felder. Köln, ms, Published as Höhle (2018f).
Höhle, Tilman N. 1986. Der Begriff „Mittelfeld“: Anmerkungen über die Theorie der topol-
ogischen Felder. In Walter Weiss, Herbert Ernst Wiegand & Marga Reis (eds.), Akten
des VII. Kongresses der Internationalen Vereinigung für germanische Sprach- und Liter-
aturwissenschaft. Göttingen 1985. Band 3. Textlinguistik contra Stilistik? – Wortschatz
und Wörterbuch – Grammatische oder pragmatische Organisation von Rede? (Kontro-
versen, alte und neue 4), 329–340. Republished as Höhle (2018b). Tübingen: Max Nie-
meyer Verlag.
Höhle, Tilman N. 1988. Verum-Fokus. Netzwerk Sprache und Pragmatik 5. Republished
as Höhle (2018g). Lund: Universität Lund, Germanistisches Institut.
Höhle, Tilman N. 1991a. On reconstruction and coordination. In Hubert Haider & Klaus
Netter (eds.), Representation and derivation in the theory of grammar (Studies in Natu-
ral Language and Linguistic Theory 22), 139–197. Republished as Höhle (2018d). Dor-
drecht: Kluwer Academic Publishers.
Höhle, Tilman N. 1991b. Projektionsstufen bei V-Projektionen: Bemerkungen zu F/T. Ms.
Published as Höhle (2018e).
Höhle, Tilman N. 1997. Vorangestellte Verben und Komplementierer sind eine natürliche
Klasse. In Christa Dürscheid, Karl Heinz Ramers & Monika Schwarz (eds.), Sprache im
Fokus: Festschrift für Heinz Vater zum 65. Geburtstag, 107–120. Republished as Höhle
(2018h). Tübingen: Max Niemeyer Verlag.
Höhle, Tilman N. 1999. An architecture for phonology. In Robert D. Borsley & Adam
Przepiórkowski (eds.), Slavic in Head-Driven Phrase Structure Grammar, 61–90. Repub-
lished as Höhle (2018a). Stanford, CA: CSLI Publications.
Höhle, Tilman N. 2018a. An Architecture for Phonology. In Stefan Müller, Marga Reis
& Frank Richter (eds.), Beiträge zur Grammatik des Deutschen: Gesammelte Schriften
von Tilman N. Höhle (Classics in Linguistics 5), 571–607. Originally published as Höhle
(1999). Berlin: Language Science Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018b. Der Begriff „Mittelfeld“: Anmerkungen über die Theorie der
topologischen Felder. In Stefan Müller, Marga Reis & Frank Richter (eds.), Beiträge
zur Grammatik des Deutschen: Gesammelte Schriften von Tilman N. Höhle (Classics
in Linguistics 5), 279–294. First published as Höhle (1986). Berlin: Language Science
Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018c. Explikationen für „normale Betonung“ und „normale Wortstel-
lung“. In Stefan Müller, Marga Reis & Frank Richter (eds.), Beiträge zur Grammatik
des Deutschen: Gesammelte Schriften von Tilman N. Höhle (Classics in Linguistics 5),
107–191. Berlin: Language Science Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018d. On Reconstruction and Coordination. In Stefan Müller, Marga
Reis & Frank Richter (eds.), Beiträge zur Grammatik des Deutschen: Gesammelte Schrif-
ten von Tilman N. Höhle (Classics in Linguistics 5), 311–368. Berlin: Language Science
Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018e. Projektionsstufen bei V-Projektionen: Bemerkungen zu F/T. In
Stefan Müller, Marga Reis & Frank Richter (eds.), Beiträge zur Grammatik des Deut-
schen: Gesammelte Schriften von Tilman N. Höhle (Classics in Linguistics 5), 369–379.
First circulated in 1991. Berlin: Language Science Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018f. Topologische Felder. In Stefan Müller, Marga Reis & Frank Rich-
ter (eds.), Beiträge zur Grammatik des Deutschen: Gesammelte Schriften von Tilman N.
Höhle (Classics in Linguistics 5), 7–89. First circulated as draft in 1983. Berlin: Language
Science Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018g. Über Verum-Fokus im Deutschen. In Stefan Müller, Marga Reis
& Frank Richter (eds.), Beiträge zur Grammatik des Deutschen: Gesammelte Schriften
von Tilman N. Höhle (Classics in Linguistics 5), 381–416. Originally published as Höhle
(1988). Berlin: Language Science Press. DOI: 10.5281/zenodo.1145680.
Höhle, Tilman N. 2018h. Vorangestellte Verben und Komplementierer sind eine natür-
liche Klasse. In Stefan Müller, Marga Reis & Frank Richter (eds.), Beiträge zur Gram-
matik des Deutschen: Gesammelte Schriften von Tilman N. Höhle (Classics in Lingu-
istics 5), 417–433. First published as Höhle (1997). Berlin: Language Science Press. DOI:
10.5281/zenodo.1145680.
Holler, Anke. 2005. Weiterführende Relativsätze: Empirische und theoretische Aspekte (stu-
dia grammatica 60). Berlin: Akademie Verlag.
Hornstein, Norbert. 2013. Three grades of grammatical involvement: Syntax from a Min-
imalist perspective. Mind and Language 28(4). 392–420.
Hornstein, Norbert, Jairo Nunes & Kleanthes K. Grohmann. 2005. Understanding Minimalism
(Cambridge Textbooks in Linguistics). Cambridge, UK: Cambridge University
Press. DOI: 10.1017/CBO9780511840678.
Huddleston, Rodney, Geoffrey K. Pullum & Peter Peterson. 2002. Relative constructions
and unbounded dependencies. In Rodney Huddleston & Geoffrey K. Pullum (eds.), The
Cambridge grammar of the English language, 1031–1096. Cambridge, UK: Cambridge
University Press.
Hudson, Carla L. & Elissa L. Newport. 1999. Creolization: could adults really have done it
all? In Annabel Greenhill, Heather Littlefield & Cheryl Tano (eds.), Proceedings of the
Boston University Conference on Language Development, vol. 23, 265–276. Somerville,
MA: Cascadilla Press.
Hudson, Richard. 1980. Constituency and dependency. Linguistics 18. 179–198.
Hudson, Richard. 1984. Word grammar. Oxford: Basil Blackwell.
Hudson, Richard. 1988. Coordination and grammatical relations. Journal of Linguistics
24(2). 303–342.
Hudson, Richard. 1989. Towards a computer-testable Word Grammar of English. UCL
Working Papers in Linguistics 1. 321–339.
Hudson, Richard. 1990. English Word Grammar. Oxford: Basil Blackwell.
Hudson, Richard. 1991. English Word Grammar. Oxford: Basil Blackwell.
Hudson, Richard. 1997. German partial VP fronting. Ms. University College London. http:
//dickhudson.com/papers/ (20 February, 2018).
Hudson, Richard. 2000. Discontinuity. Dependency Grammars, TAL 41(1). 15–56.
Hudson, Richard. 2003. Mismatches in default inheritance. In Elaine J. Francis & Laura A.
Michaelis (eds.), Mismatch: form-function incongruity and the architecture of grammar
(CSLI Lecture Notes 163), 355–402. Stanford, CA: CSLI Publications.
Hudson, Richard. 2004. Are determiners heads? Functions of Language 11(1). 7–42.
Hudson, Richard. 2007. Language networks: The new Word Grammar. Oxford: Oxford
University Press.
Hudson, Richard. 2010a. An introduction to Word Grammar (Cambridge Textbooks in
Linguistics). Cambridge, UK: Cambridge University Press.
Hudson, Richard. 2010b. Reaction to: “The myth of language universals and cognitive
science”: On the choice between phrase structure and dependency structure. Lingua
120(12). 2676–2679.
Hudson, Richard. 2018. Pied piping in cognition. Journal of Linguistics 54(1). 85–138. DOI:
10.1017/S0022226717000056.
Hudson, Richard. 2020. HPSG and Dependency Grammar. In Stefan Müller, Anne
Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure
Grammar: The handbook (Empirically Oriented Theoretical Morphology and Syntax).
To appear. Berlin: Language Science Press.
Hudson Kam, Carla L. & Elissa L. Newport. 2005. Regularizing unpredictable variation:
The roles of adult and child learners in language formation and change. Language
Learning and Development 1. 151–195.
Humboldt, Wilhelm von. 1988. Gesammelte Werke. Berlin, New York: Walter de Gruyter.
Hurford, James R. 2002. Expression/induction models of language evolution: dimensions
and issues. In Ted J. Briscoe (ed.), Linguistic evolution through language acquisition,
301–344. Cambridge, UK: Cambridge University Press.
Hurskainen, Arvi. 2006. Constraint Grammar in unconventional use: Handling complex
Swahili idioms and proverbs. Suominen, Mickael et al.: A Man of Measure: Festschrift
in Honour of Fred Karlsson on his 60th Birthday. Special Supplement to SKY Journal of
Linguistics 19. 397–406.
Imrényi, András. 2013. Constituency or dependency? Notes on Sámuel Brassai’s syn-
tactic model of Hungarian. In Péter Szigetvári (ed.), Papers presented to László Varga
on his 70th birthday, 167–182. Budapest: Tinta.
Ingram, David & William Thompson. 1996. Early syntactic acquisition in German: Evi-
dence for the modal hypothesis. Language 72(1). 97–120.
Iordanskaja, L., M. Kim, R. Kittredge, B. Lavoie & A. Polguère. 1992. Generation of ex-
tended bilingual statistical reports. In Antonio Zampolli (ed.), 14th International Con-
Jacobson, Pauline. 1987b. Review of Generalized Phrase Structure Grammar. Linguistics
and Philosophy 10(3). 389–426.
Jaeggli, Osvaldo A. 1986. Passive. Linguistic Inquiry 17(4). 587–622.
Jäger, Gerhard & Reinhard Blutner. 2003. Competition and interpretation: The Ger-
man adverb wieder ‘again’. In Ewald Lang, Claudia Maienborn & Cathrine Fabricius-
Hansen (eds.), Modifying adjuncts (Interface Explorations 4), 393–416. Berlin: Mouton
de Gruyter.
Jäppinen, H., A. Lehtola & K. Valkonen. 1986. Functional structures for parsing depen-
dency constraints. In Makoto Nagao (ed.), Proceedings of COLING 86, 461–463. Bonn,
Germany: Association for Computational Linguistics. DOI: 10.3115/991365.991501.
Johnson, David E. & Shalom Lappin. 1997. A critique of the Minimalist Program.
Linguistics and Philosophy 20(3). 273–333.
Johnson, David E. & Shalom Lappin. 1999. Local constraints vs. economy (Stanford Mono-
graphs in Linguistics). Stanford, CA: CSLI Publications.
Johnson, David E. & Paul M. Postal. 1980. Arc Pair Grammar. Princeton, NJ: Princeton
University Press.
Johnson, Jacqueline S. & Elissa L. Newport. 1989. Critical period effects in second lan-
guage learning: The influence of maturational state on the acquisition of English as a
second language. Cognitive Psychology 21(1). 60–99.
Johnson, Kent. 2004. Gold’s theorem and cognitive science. Philosophy of Science 71(4).
571–592.
Johnson, Mark. 1986. A GPSG account of VP structure in German. Linguistics 24(5). 871–
882.
Johnson, Mark. 1988. Attribute-value logic and the theory of grammar (CSLI Lecture Notes
16). Stanford, CA: CSLI Publications.
Johnson, Mark. 1989. Parsing as deduction: The use of knowledge of language. Journal
of Psycholinguistic Research 18(1). 105–128.
Johnson, Mark, Stuart Geman, Stephen Canon, Zhiyi Chi & Stefan Riezler. 1999.
Estimators for stochastic “unification-based” grammars. In Robert Dale & Ken Church
(eds.), Proceedings of the Thirty-Seventh Annual Meeting of the ACL, 535–541.
Joshi, Aravind K. 1985. Tree Adjoining Grammars: How much context-sensitivity is re-
quired to provide reasonable structural descriptions? In David Dowty, Lauri Karttunen
& Arnold Zwicky (eds.), Natural language parsing, 206–250. Cambridge University
Press.
Joshi, Aravind K. 1987a. Introduction to Tree Adjoining Grammar. In Alexis Manaster-
Ramer (ed.), The mathematics of language, 87–114. Amsterdam: John Benjamins Pub-
lishing Co.
Joshi, Aravind K. 1987b. Word-order variation in natural language generation. In AAAI
87, Sixth National Conference on Artificial Intelligence, 550–555. Seattle.
Joshi, Aravind K., Tilman Becker & Owen Rambow. 2000. Complexity of scrambling: A
new twist to the competence-performance distinction. In Anne Abeillé & Owen Ram-
bow (eds.), Tree Adjoining Grammars: formalisms, linguistic analysis and processing
(CSLI Lecture Notes 156), 167–181. Stanford, CA: CSLI Publications.
Joshi, Aravind K., Leon S. Levy & Masako Takahashi. 1975. Tree Adjunct Grammar.
Journal of Computer and System Sciences 10(2). 136–163.
Joshi, Aravind K. & Yves Schabes. 1997. Tree-Adjoining Grammars. In G. Rozenberg &
A. Salomaa (eds.), Handbook of formal languages, 69–123. Berlin: Springer Verlag.
Joshi, Aravind K., K. Vijay-Shanker & David Weir. 1990. The convergence of mildly context-
sensitive grammar formalisms. Tech. rep. MS-CIS-90-01. Department of Computer &
Information Science, University of Pennsylvania. https://repository.upenn.edu/cis_
reports/539/ (18 August, 2020).
Jungen, Oliver & Horst Lohnstein. 2006. Einführung in die Grammatiktheorie (UTB 2676).
München: Wilhelm Fink Verlag.
Jurafsky, Daniel. 1996. A probabilistic model of lexical and syntactic access and disam-
biguation. Cognitive Science 20(2). 137–194.
Kahane, Sylvain. 1997. Bubble trees and syntactic representations. In Tilman Becker &
Hans-Ulrich Krieger (eds.), Proceedings of Mathematics of Language (MOL5) Meeting,
70–76. Saarbrücken.
Kahane, Sylvain. 2003. The Meaning-Text Theory. In Vilmos Ágel, Ludwig M. Eichinger,
Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer & Henning Lobin (eds.),
Dependenz und Valenz / Dependency and valency: Ein internationales Handbuch der
zeitgenössischen Forschung / An international handbook of contemporary research, vol. 1
(Handbücher zur Sprach- und Kommunikationswissenschaft 25), 546–570. Berlin: Wal-
ter de Gruyter.
Kahane, Sylvain. 2009. On the status of phrases in Head-Driven Phrase Structure Gram-
mar: Illustration by a fully lexical treatment of extraction. In Alain Polguère & Igor A.
Mel’čuk (eds.), Dependency in linguistic description (Studies in Language Companion
Series 111), 111–150. Amsterdam: John Benjamins Publishing Co.
Kahane, Sylvain, Alexis Nasr & Owen Rambow. 1998. Pseudo-projectivity: A poly-
nomially parsable non-projective Dependency Grammar. In Pierre Isabelle (ed.),
Proceedings of the 36th Annual Meeting of the Association for Computational Linguis-
tics and 17th International Conference on Computational Linguistics, 646–652. Montreal,
Quebec, Canada: Association for Computational Linguistics. DOI: 10.3115/980845.980953.
Kahane, Sylvain & Timothy Osborne. 2015. Translators’ introduction. In Elements of struc-
tural syntax, xxix–lxxiii. Translated by Timothy Osborne and Sylvain Kahane. Ams-
terdam: John Benjamins Publishing Co.
Kallmeyer, Laura. 2005. Tree-local Multicomponent Tree Adjoining Grammars with
shared nodes. Computational Linguistics 31(2). 187–225.
Kallmeyer, Laura & Aravind K. Joshi. 2003. Factoring predicate argument and scope se-
mantics: underspecified semantics with LTAG. Research on Language and Computation
1(1–2). 3–58. DOI: 10.1023/A:1024564228892.
Kallmeyer, Laura, Timm Lichte, Wolfgang Maier, Yannick Parmentier, Johannes Dellert
& Kilian Evang. 2008. TuLiPA: Towards a multi-formalism parsing environment for
grammar engineering. In Stephen Clark & Tracy Holloway King (eds.), Coling 2008:
Proceedings of the Workshop on Grammar Engineering Across Frameworks, 1–8. Manch-
ester, England: Association for Computational Linguistics.
Kallmeyer, Laura & Rainer Osswald. 2012. A frame-based semantics of the dative alter-
nation in Lexicalized Tree Adjoining Grammars. In Christopher Piñón (ed.), Empirical
issues in syntax and semantics, vol. 9, 167–184. Paris: CNRS.
Kallmeyer, Laura & Maribel Romero. 2008. Scope and situation binding in LTAG using
semantic unification. Research on Language and Computation 6(1). 3–52.
Kallmeyer, Laura & SinWon Yoon. 2004. Tree-local MCTAG with shared nodes: An analy-
sis of word order variation in German and Korean. Traitement automatique des langues
TAL 45(3). 49–69.
Kamp, Hans & Uwe Reyle. 1993. From discourse to logic: Introduction to modeltheoretic se-
mantics of natural language, formal logic and Discourse Representation Theory (Studies
in Linguistics and Philosophy 42). Dordrecht: Kluwer Academic Publishers.
Kaplan, Ronald M. 1989. The formal architecture of Lexical-Functional Grammar. Journal
of Information Science and Engineering 5(4). 305–322.
Kaplan, Ronald M. & Joan Bresnan. 1982. Lexical-Functional Grammar: A formal system
for grammatical representation. In Joan Bresnan (ed.), The mental representation of
grammatical relations (MIT Press Series on Cognitive Theory and Mental Represen-
tation), 173–281. Reprint in: Dalrymple, Kaplan, Maxwell III & Zaenen (1995: 29–130).
Cambridge, MA: MIT Press.
Kaplan, Ronald M. & John T. Maxwell III. 1996. LFG grammar writer’s workbench. Tech.
rep. Xerox PARC.
Kaplan, Ronald M., Stefan Riezler, Tracy Holloway King, John T. Maxwell III, Alexander
Vasserman & Richard Crouch. 2004. Speed and accuracy in shallow and deep stochas-
tic parsing. In Proceedings of the Human Language Technology Conference and the 4th
Annual Meeting of the North American Chapter of the Association for Computational
Linguistics (HLT-NAACL’04). Boston, MA: Association for Computational Linguistics.
Kaplan, Ronald M. & Jürgen Wedekind. 1993. Restriction and correspondence-based
translation. In Proceedings of the 31st Annual Meeting of the Association for Compu-
tational Linguistics, 193–202. Columbus, Ohio, USA: Association for Computational
Linguistics.
Kaplan, Ronald M. & Annie Zaenen. 1989. Long-distance dependencies, constituent
structure and functional uncertainty. In Mark R. Baltin & Anthony S. Kroch (eds.),
Alternative conceptions of phrase structure, 17–42. Chicago/London: The University of
Chicago Press.
Karimi, Simin. 2005. A Minimalist approach to scrambling: Evidence from Persian (Studies
in Generative Grammar 76). Berlin, New York: Mouton de Gruyter.
Karimi-Doostan, Gholamhossein. 2005. Light verbs and structural case. Lingua 115(12).
1737–1756.
Karlsson, Fred. 1990. Constraint Grammar as a framework for parsing running text. In
Hans Karlgren (ed.), COLING-90: Papers presented to the 13th International Conference
on Computational Linguistics, 168–173. Helsinki: Association for Computational Lin-
guistics.
Kay, Martin. 2000. David G. Hays. In William J. Hutchins (ed.), Early years in machine
translation (Amsterdam Studies in the Theory and History of Linguistics Science Se-
ries 3), 165–170. Amsterdam: John Benjamins Publishing Co.
Kay, Martin. 2011. Zipf’s law and L’Arbitraire du Signe. Linguistic Issues in Language Tech-
nology 6(8). Special Issue on Interaction of Linguistics and Computational Linguistics,
1–25. http://journals.linguisticsociety.org/elanguage/lilt/article/view/2584.html
(18 August, 2020).
Kay, Paul. 2002. An informal sketch of a formal architecture for Construction Grammar.
Grammars 5(1). 1–19.
Kay, Paul. 2005. Argument structure constructions and the argument-adjunct distinc-
tion. In Mirjam Fried & Hans C. Boas (eds.), Grammatical constructions: Back to the
roots (Constructional Approaches to Language 4), 71–98. Amsterdam: John Benjamins
Publishing Co.
Kay, Paul & Charles J. Fillmore. 1999. Grammatical constructions and linguistic general-
izations: The What’s X Doing Y? Construction. Language 75(1). 1–33.
Kay, Paul, Ivan A. Sag & Daniel P. Flickinger. 2015. A lexical theory of phrasal idioms. Ms.
CSLI Stanford.
Kayne, Richard S. 1994. The antisymmetry of syntax (Linguistic Inquiry Monographs 25).
Cambridge, MA: MIT Press.
Kayne, Richard S. 2011. Why are there no directionality parameters? In Mary Byram
Washburn, Katherine McKinney-Bock, Erika Varis, Ann Sawyer & Barbara Tomaszew-
icz (eds.), Proceedings of the 28th West Coast Conference on Formal Linguistics, 1–23.
Somerville, MA: Cascadilla Press.
Keenan, Edward L. & Bernard Comrie. 1977. Noun phrase accessibility and Universal
Grammar. Linguistic Inquiry 8(1). 63–99.
Keller, Frank. 1994. German functional HPSG – An experimental CUF encoding. Tech. rep.
Stuttgart: Institut für Maschinelle Sprachverarbeitung.
Keller, Frank. 1995. Towards an account of extraposition in HPSG. In Steven P. Abney
& Erhard W. Hinrichs (eds.), Proceedings of the Seventh Conference of the European
Chapter of the Association for Computational Linguistics, 301–306. Dublin: Association
for Computational Linguistics.
Kern, Franz. 1884. Grundriß der Deutschen Satzlehre. Berlin: Nicolaische Verlags-
Buchhandlung.
Kettunen, Kimmo. 1986. On modelling dependency-oriented parsing. In Fred Karlsson
(ed.), Papers from the Fifth Scandinavian Conference of Computational Linguistics, 113–
120. Helsinki.
Kibort, Anna. 2008. On the syntax of ditransitive constructions. In Miriam Butt & Tracy
Holloway King (eds.), Proceedings of the LFG 2008 conference, 312–332. Stanford, CA:
CSLI Publications. http://csli-publications.stanford.edu/LFG/13/ (18 August, 2020).
Kiefer, Bernd, Hans-Ulrich Krieger & Mark-Jan Nederhof. 2000. Efficient and robust pars-
ing of word hypotheses graphs. In Wolfgang Wahlster (ed.), Verbmobil: Foundations of
speech-to-speech translation (Artificial Intelligence), 280–295. Berlin: Springer Verlag.
LINE), 182–202. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/GEAF/2007/
(18 August, 2020).
Kinyon, Alexandra, Owen Rambow, Tatjana Scheffler, SinWon Yoon & Aravind K.
Joshi. 2006. The Metagrammar goes multilingual: A cross-linguistic look at the V2-
phenomenon. In Laura Kallmeyer & Tilman Becker (eds.), TAG+8: The Eighth Interna-
tional Workshop on Tree Adjoining Grammar and Related Formalisms: Proceedings of
the workshop, 17–24. Sydney, Australia: Association for Computational Linguistics.
Kiparsky, Paul. 1987. Morphology and grammatical relations. unpublished paper, Stanford
University, Stanford.
Kiparsky, Paul. 1988. Agreement and linking theory. unpublished paper, Stanford Univer-
sity, Stanford.
Kiparsky, Paul & Carol Kiparsky. 1970. Fact. In Manfred Bierwisch & Karl Erich Heidolph
(eds.), Progress in linguistics, 143–173. The Hague/Paris: Mouton.
Kiss, Katalin E. 2003. Argument scrambling, focus movement and topic movement in
Hungarian. In Simin Karimi (ed.), Word order and scrambling, 22–43. London: Black-
well.
Kiss, Tibor. 1991. The grammars of LILOG. In Otthein Herzog & Claus-Rainer Rollinger
(eds.), Text understanding in LILOG (Lecture Notes in Artificial Intelligence 546), 183–
199. Berlin: Springer Verlag.
Kiss, Tibor. 1992. Variable Subkategorisierung: Eine Theorie unpersönlicher Einbettun-
gen im Deutschen. Linguistische Berichte 140. 256–293.
Kiss, Tibor. 1993. Infinite Komplementation – Neue Studien zum deutschen Verbum infini-
tum. Arbeiten des SFB 282 No. 42. Bergische Universität Gesamthochschule Wupper-
tal.
Kiss, Tibor. 1995. Infinite Komplementation: Neue Studien zum deutschen Verbum infini-
tum (Linguistische Arbeiten 333). Tübingen: Max Niemeyer Verlag.
Kiss, Tibor. 2001. Configurational and relational scope determination in German. In Walt
Detmar Meurers & Tibor Kiss (eds.), Constraint-based approaches to Germanic syntax
(Studies in Constraint-Based Lexicalism 7), 141–175. Stanford, CA: CSLI Publications.
Kiss, Tibor. 2005. Semantic constraints on relative clause extraposition. Natural Lan-
guage & Linguistic Theory 23(2). 281–334. DOI: 10.1007/s11049-003-1838-7.
Kiss, Tibor & Birgit Wesche. 1991. Verb order and head movement. In Otthein Herzog &
Claus-Rainer Rollinger (eds.), Text understanding in LILOG (Lecture Notes in Artificial
Intelligence 546), 216–242. Berlin: Springer Verlag.
Klann-Delius, Gisela. 2008. Spracherwerb. 2nd edn. Stuttgart: J.B. Metzler–Verlag.
Klein, Wolfgang. 1971. Parsing: Studien zur maschinellen Satzanalyse mit Abhängigkeits-
grammatiken und Transformationsgrammatiken. Vol. 2. Frankfurt a. M.: Athenäum Ver-
lag.
Klein, Wolfgang. 1985. Ellipse, Fokusgliederung und thematischer Stand. In Reinhard
Meyer-Hermann & Hannes Rieser (eds.), Ellipsen und fragmentarische Ausdrücke, 1–
24. Tübingen: Max Niemeyer Verlag.
Klein, Wolfgang. 1986. Second language acquisition (Cambridge Textbooks in Linguistics).
Cambridge, UK: Cambridge University Press.
Klein, Wolfgang. 2009. Finiteness, Universal Grammar and the language faculty. In Jian-
sheng Guo, Elena Lieven, Nancy Budwig, Susan Ervin-Tripp, Keiko Nakamura &
Seyda Ozcaliskan (eds.), Cross-linguistic approaches to the study of language: Research
in the tradition of Dan Isaac Slobin (Psychology Press Festschrift Series), 333–344. New
York: Psychology Press.
Klenk, Ursula. 2003. Generative Syntax (Narr Studienbücher). Tübingen: Gunter Narr
Verlag.
Kluender, Robert. 1992. Deriving island constraints from principles of predication. In
Helen Goodluck & Michael Rochemont (eds.), Island constraints: Theory, acquisition,
and processing, 223–258. Dordrecht: Kluwer Academic Publishers.
Kluender, Robert & Marta Kutas. 1993. Subjacency as a processing phenomenon. Lan-
guage and Cognitive Processes 8(4). 573–633.
Knecht, Laura. 1985. Subject and object in Turkish. M.I.T. (Doctoral dissertation).
Kobele, Gregory M. 2008. Across-the-board extraction in Minimalist Grammars. In
Proceedings of the Ninth International Workshop on Tree Adjoining Grammar and Re-
lated Formalisms (TAG+9), 113–128.
Koenig, Jean-Pierre. 1999. Lexical relations (Stanford Monographs in Linguistics). Stan-
ford, CA: CSLI Publications.
Koenig, Jean-Pierre & Karin Michelson. 2010. Argument structure of Oneida kinship
terms. International Journal of American Linguistics 76(2). 169–205.
Koenig, Jean-Pierre & Karin Michelson. 2012. The (non)universality of syntactic selection
and functional application. In Christopher Piñón (ed.), Empirical issues in syntax and
semantics, vol. 9, 185–205. Paris: CNRS.
Kohl, Dieter. 1992. Generation from under- and overspecified structures. In Antonio Zam-
polli (ed.), 14th International Conference on Computational Linguistics (COLING ’92),
August 23–28, 686–692. Nantes, France: Association for Computational Linguistics.
Kohl, Dieter, Claire Gardent, Agnes Plainfossé, Mike Reape & Stefan Momma. 1992. Text
generation from semantic representation. In Gabriel G. Bes & Thierry Guillotin (eds.),
The construction of a natural language and graphic interface: Results and perspectives
from the ACORD project, 94–161. Berlin: Springer Verlag.
Kohl, Karen T. 1999. An analysis of finite parameter learning in linguistic spaces. Mas-
sachusetts Institute of Technology. (MA thesis). http://karentkohl.org/papers/SM.pdf
(18 August, 2020).
Kohl, Karen T. 2000. Language learning in large parameter spaces. In Proceedings of the
Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on
Innovative Applications of Artificial Intelligence, 1080. AAAI Press / The MIT Press.
Kolb, Hans-Peter. 1997. GB blues: Two essays on procedures and structures in Generative
Syntax. Arbeitspapiere des SFB 340 No. 110. Tübingen: Eberhard-Karls-Universität.
Kolb, Hans-Peter & Craig L. Thiersch. 1991. Levels and empty categories in a Principles
and Parameters based approach to parsing. In Hubert Haider & Klaus Netter (eds.),
Representation and derivation in the theory of grammar (Studies in Natural Language
and Linguistic Theory 22), 251–301. Dordrecht: Kluwer Academic Publishers.
Koller, Alexander. 2017. A feature structure algebra for FTAG. In Proceedings of the 13th
International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+13).
Umeå. http://www.coli.uni-saarland.de/~koller/papers/ftag-irtg-17.pdf (28 February,
2018).
Konieczny, Lars. 1996. Human sentence processing: A semantics-oriented parsing approach.
IIG-Berichte 3/96. Universität Freiburg. (Dissertation).
König, Esther. 1999. LexGram: A practical Categorial Grammar formalism. Journal of
Language and Computation 1(1). 33–52.
Koopman, Hilda & Dominique Sportiche. 1991. The position of subjects. Lingua 85(2–3).
211–258.
Kordoni, Valia (ed.). 1999. Tübingen studies in Head-Driven Phrase Structure Grammar
(Arbeitspapiere des SFB 340, No. 132, Volume 1). Tübingen: Eberhard-Karls-Universität
Tübingen.
Kordoni, Valia. 2001. Linking experiencer-subject psych verb constructions in Modern
Greek. In Dan Flickinger & Andreas Kathol (eds.), The proceedings of the 7th International
Conference on Head-Driven Phrase Structure Grammar: University of California,
Berkeley, 22–23 July, 2000, 198–213. CSLI Publications. http://csli-publications.stanford.edu/HPSG/2001/
(18 August, 2020).
Kordoni, Valia & Julia Neu. 2005. Deep analysis of Modern Greek. In Keh-Yih Su, Oi Yee
Kwong, Jun’ichi Tsujii & Jong-Hyeok Lee (eds.), Natural language processing IJCNLP
2004 (Lecture Notes in Artificial Intelligence 3248), 674–683. Berlin: Springer Verlag.
Kornai, András & Geoffrey K. Pullum. 1990. The X-bar Theory of phrase structure. Lan-
guage 66(1). 24–50.
Koster, Jan. 1975. Dutch as an SOV language. Linguistic Analysis 1(2). 111–136.
http://www.dbnl.org/tekst/kost007dutc01_01/kost007dutc01_01.pdf (18 August, 2020).
Koster, Jan. 1978. Locality principles in syntax. Dordrecht: Foris Publications.
Koster, Jan. 1986. The relation between pro-drop, scrambling, and verb movements.
Groningen Papers in Theoretical and Applied Linguistics 1. 1–43.
Koster, Jan. 1987. Domains and dynasties: The radical autonomy of syntax. Dordrecht:
Foris Publications.
Kratzer, Angelika. 1984. On deriving syntactic differences between German and English.
TU Berlin, ms.
Kratzer, Angelika. 1996. Severing the external argument from its verb. In Johan Rooryck
& Laurie Zaring (eds.), Phrase structure and the lexicon, 109–137. Dordrecht: Kluwer
Academic Publishers.
Krieger, Hans-Ulrich & John Nerbonne. 1993. Feature-based inheritance networks for
computational lexicons. In Ted Briscoe, Ann Copestake & Valeria de Paiva (eds.),
Inheritance, defaults, and the lexicon, 90–136. A version of this paper is available as
DFKI Research Report RR-91-31. Also published in: Proceedings of the ACQUILEX
Workshop on Default Inheritance in the Lexicon, Technical Report No. 238, Univer-
sity of Cambridge, Computer Laboratory, October 1991. Cambridge, UK: Cambridge
University Press.
Kunze, Jürgen. 1975. Abhängigkeitsgrammatik (studia grammatica 12). Berlin: Akademie
Verlag.
Kunze, Jürgen. 1991. Kasusrelationen und semantische Emphase (studia grammatica
XXXII). Berlin: Akademie Verlag.
Kunze, Jürgen. 1993. Sememstrukturen und Feldstrukturen (studia grammatica 36). unter
Mitarbeit von Beate Firzlaff. Berlin: Akademie Verlag.
Labelle, Marie. 2007. Biolinguistics, the Minimalist Program, and psycholinguistic real-
ity. Snippets 14. 6–7. http://www.ledonline.it/snippets/allegati/snippets14002.pdf
(18 August, 2020).
Laczkó, Tibor, György Rákosi & Ágoston Tóth. 2010. HunGram vs. EngGram in ParGram:
On the comparison of Hungarian and English in an international computational lin-
guistics project. In Irén Hegedűs & Sándor Martsa (eds.), Selected papers in linguistics
from the 9th HUSSE Conference, vol. 1, 81–95. Pécs: Institute of English Studies, Faculty
of Humanities, University of Pécs.
Laenzlinger, Christoph. 2004. A feature-based theory of adverb syntax. In Jennifer R.
Austin, Stefan Engelberg & Gisa Rauh (eds.), Adverbials: The interplay between mean-
ing, context, and syntactic structure (Linguistik Aktuell/Linguistics Today 70), 205–252.
Amsterdam: John Benjamins Publishing Co.
Lai, Cecilia S. L., Simon E. Fisher, Jane A. Hurst, Faraneh Vargha-Khadem & Anthony P.
Monaco. 2001. A forkhead-domain gene is mutated in a severe speech and language
disorder. Nature 413(6855). 519–523. DOI: 10.1038/35097076.
Lakoff, George. 1987. Women, fire, and dangerous things: What categories reveal about the
mind. Chicago: The University of Chicago Press.
Langacker, Ronald W. 1987. Foundations of Cognitive Grammar. Vol. 1. Stanford, CA: Stan-
ford University Press.
Langacker, Ronald W. 2000. A dynamic usage-based model. In Michael Barlow & Suzanne
Kemmer (eds.), Usage-based models of language, 1–63. Stanford, CA: CSLI Publications.
Langacker, Ronald W. 2008. Cognitive Grammar: A basic introduction. Oxford: Oxford
University Press.
Langacker, Ronald W. 2009. Cognitive (Construction) Grammar. Cognitive Linguistics
20(1). 167–176.
Lappin, Shalom, Robert D. Levine & David E. Johnson. 2000a. The revolution confused:
A response to our critics. Natural Language & Linguistic Theory 18(4). 873–890.
Lappin, Shalom, Robert D. Levine & David E. Johnson. 2000b. The structure of unscien-
tific revolutions. Natural Language & Linguistic Theory 18(3). 665–671.
Lappin, Shalom, Robert D. Levine & David E. Johnson. 2001. The revolution maximally
confused. Natural Language & Linguistic Theory 19(4). 901–919.
Larson, Richard K. 1988. On the double object construction. Linguistic Inquiry 19(3). 335–
391.
Lascarides, Alex & Ann Copestake. 1999. Default representation in constraint-based
frameworks. Computational Linguistics 25(1). 55–105.
Lasnik, Howard & Mamoru Saito. 1992. Move 𝛼: conditions on its application and output
(Current Studies in Linguistics 22). Cambridge, MA: MIT Press.
Lasnik, Howard & Juan Uriagereka. 2002. On the poverty of the challenge. The Linguistic
Review 19(1–2). 147–150.
Lavoie, Benoit & Owen Rambow. 1997. RealPro–A fast, portable sentence realizer. In
Proceedings of the Conference on Applied Natural Language Processing (ANLP’97).
Le, Hong Phuong, Thi Minh Huyen Nguyen & Azim Roussanaly. 2008. Metagrammar
for Vietnamese LTAG. In Proceedings of the Ninth International Workshop on Tree Ad-
joining Grammars and Related Formalisms (TAG+9), 129–132. Tübingen.
Legate, Julie & Charles D. Yang. 2002. Empirical re-assessment of stimulus poverty ar-
guments. The Linguistic Review 19(1–2). 151–162. DOI: 10.1515/tlir.19.1-2.151.
Lehtola, Aarno. 1986. DPL: A computational method for describing grammars and mod-
elling parsers. In Fred Karlsson (ed.), Papers from the Fifth Scandinavian Conference of
Computational Linguistics, 151–159. Helsinki.
Leiss, Elisabeth. 2003. Empirische Argumente für Dependenz. In Vilmos Ágel, Ludwig
M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer & Henning
Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein internationales Hand-
buch der zeitgenössischen Forschung / An international handbook of contemporary re-
search, vol. 1 (Handbücher zur Sprach- und Kommunikationswissenschaft 25), 311–
324. Berlin: Walter de Gruyter.
Leiss, Elisabeth. 2009. Sprachphilosophie (de Gruyter Studienbuch). Berlin: Walter de
Gruyter.
Lenerz, Jürgen. 1977. Zur Abfolge nominaler Satzglieder im Deutschen (Studien zur deut-
schen Grammatik 5). Tübingen: originally Gunter Narr Verlag now Stauffenburg Ver-
lag.
Lenerz, Jürgen. 1994. Pronomenprobleme. In Brigitta Haftka (ed.), Was determiniert Wort-
stellungsvariation? Studien zu einem Interaktionsfeld von Grammatik, Pragmatik und
Sprachtypologie, 161–174. Opladen: Westdeutscher Verlag.
Lenneberg, Eric H. 1964. The capacity for language acquisition. In Jerry A. Fodor & Jer-
rold J. Katz (eds.), The structure of language, 579–603. Englewood Cliffs, NJ: Prentice-
Hall.
Lenneberg, Eric H. 1967. Biological foundations of language. New York: John Wiley &
Sons, Inc.
Levelt, Willem J. M. 1989. Speaking: from intonation to articulation (ACL-MIT Press Series
in Natural Language Processing). Cambridge, MA: MIT Press.
Levin, Beth. 1993. English verb classes and alternations: A preliminary investigation.
Chicago, IL: University of Chicago Press.
Levin, Beth & Malka Rappaport Hovav. 2005. Argument realization. Cambridge Univer-
sity Press.
Levine, Robert D. 2003. Adjunct valents, cumulative scopings and impossible descrip-
tions. In Jongbok Kim & Stephen Mark Wechsler (eds.), The proceedings of the 9th In-
ternational Conference on Head-Driven Phrase Structure Grammar, 209–232. Stanford,
CA: CSLI Publications. http://csli-publications.stanford.edu/HPSG/3/ (25 September,
2018).
Levine, Robert D. & Thomas E. Hukari. 2006. The unity of unbounded dependency con-
structions (CSLI Lecture Notes 166). Stanford, CA: CSLI Publications.
Levine, Robert D. & Walt Detmar Meurers. 2006. Head-Driven Phrase Structure Gram-
mar: Linguistic approach, formal foundations, and computational realization. In Keith
Brown (ed.), The encyclopedia of language and linguistics, 2nd edn., 237–252. Oxford:
Elsevier Science Publisher B.V. (North-Holland).
Lewis, Geoffrey L. 1967. Turkish grammar. Oxford: Clarendon Press.
Lewis, John D. & Jeffrey L. Elman. 2001. Learnability and the statistical structure of lan-
guage: Poverty of Stimulus arguments revisited. In Barbora Skarabela, Sarah Fish &
Anna H.-J. Do (eds.), Proceedings of the 26th Annual Boston University Conference on
Language Development, 359–370. http://crl.ucsd.edu/~elman/Papers/BU2001.pdf
(18 August, 2020).
Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese: A functional reference
grammar. Berkeley & Los Angeles: University of California Press.
Li, Wei. 1996. Esperanto inflection and its interface in HPSG. Working Papers of the Lin-
guistics Circle. University of Victoria. 71–80.
Lichte, Timm. 2007. An MCTAG with tuples for coherent constructions in German. In
Laura Kallmeyer, Paola Monachesi, Gerald Penn & Giorgio Satta (eds.), Proceedings of
the 12th Conference on Formal Grammar 2007. Dublin, Ireland.
Lichte, Timm & Laura Kallmeyer. 2017. Tree-Adjoining Grammar: A tree-based construc-
tionist grammar framework for natural language understanding. In Proceedings of the
AAAI 2017 Spring Symposium on Computational Construction Grammar and Natural
Language Understanding (Technical Report SS-17-02). Association for the Advance-
ment of Artificial Intelligence.
Lieb, Hans-Heinrich. 1983. Integrational linguistics: Vol. I.: General outline (Current Issues
in Linguistic Theory 17). Amsterdam: John Benjamins Publishing Co.
Lightfoot, David W. 1997. Catastrophic change and learning theory. Lingua 100(1). 171–
192.
Lin, Dekang. 1993. Principle-based parsing without overgeneration. In Proceedings of the
31st Annual Meeting of the Association for Computational Linguistics, 112–120. Colum-
bus, Ohio, USA: Association for Computational Linguistics. DOI: 10.3115/981574.981590.
Lin, Francis Y. 2017. A refutation of Universal Grammar. Zombie Lingua 193. 1–22. DOI:
10.1016/j.lingua.2017.04.003.
Link, Godehard. 1984. Hydras: On the logic of relative constructions with multiple heads.
In Fred Landmann & Frank Veltman (eds.), Varieties of formal semantics, 245–257. Dor-
drecht: Foris Publications.
Lipenkova, Janna. 2009. Serienverbkonstruktionen im Chinesischen und ihre Analyse im
Rahmen von HPSG. Institut für Sinologie, Freie Universität Berlin. (MA thesis).
Liu, Gang. 1997. Eine unifikations-basierte Grammatik für das moderne Chinesisch –
dargestellt in der HPSG. FG Sprachwissenschaft, Universität Konstanz. (Doctoral dis-
sertation). http://nbn-resolving.de/urn:nbn:de:bsz:352-opus-1917 (18 August, 2020).
Liu, Haitao. 2009. Dependency Grammar: From theory to practice. Beijing: Science Press.
Liu, Haitao & Wei Huang. 2006. Chinese Dependency Syntax for treebanking. In
Proceedings of the Twentieth Pacific Asia Conference on Language, Information and Com-
putation, 126–133. Beijing: Tsinghua University Press.
Lloré, F. Xavier. 1995. Un Método de ‘Parsing’ para Gramáticas Categoriales Multimodales.
I.C.E. de la Universidad Politécnica de Catalunya. (Doctoral dissertation).
Lobin, Henning. 1993. Koordinationssyntax als strukturales Phänomen (Studien zur Gram-
matik 46). Tübingen: Gunter Narr Verlag.
Lobin, Henning. 2003. Dependenzgrammatik und Kategorialgrammatik. In Vilmos Ágel,
Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer &
Henning Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein interna-
tionales Handbuch der zeitgenössischen Forschung / An international handbook of con-
temporary research, vol. 1 (Handbücher zur Sprach- und Kommunikationswissenschaft
25), 325–330. Berlin: Walter de Gruyter.
Löbner, Sebastian. 1986. In Sachen Nullartikel. Linguistische Berichte 101. 64–65.
Lohndal, Terje. 2012. Toward the end of argument structure. In María Cristina Cuervo
& Yves Roberge (eds.), The end of argument structure?, vol. 38 (Syntax and Semantics),
155–184. Bingley, UK: Emerald Group Publishing.
Lohnstein, Horst. 1993. Projektion und Linking: Ein prinzipienbasierter Parser fürs Deu-
tsche (Linguistische Arbeiten 287). Tübingen: Max Niemeyer Verlag.
Lohnstein, Horst. 2014. Artenvielfalt in freier Wildbahn: Generative Grammatik. In Jörg
Hagemann & Sven Staffeldt (eds.), Syntaxtheorien: Analysen im Vergleich (Stauffenburg
Einführungen 28), 165–185. Tübingen: Stauffenburg Verlag.
Longobardi, Giuseppe & Ian Roberts. 2010. Universals, diversity and change in the sci-
ence of language: Reaction to “The myth of language universals and cognitive sci-
ence”. Lingua 120(12). 2699–2703.
Lorenz, Konrad. 1970. Studies in human and animal behavior. Vol. I. Cambridge, MA:
Harvard University Press.
Lötscher, Andreas. 1985. Syntaktische Bedingungen der Topikalisierung. Deutsche Spra-
che 13(3). 207–229.
Loukam, Mourad, Amar Balla & Mohamed Tayeb Laskri. 2015. Towards an open platform
based on HPSG formalism for the Standard Arabic language. International Journal of
Speech Technology. DOI: 10.1007/s10772-015-9314-4.
Lüdeling, Anke. 2001. On particle verbs and similar constructions in German (Dissertations
in Linguistics). Stanford, CA: CSLI Publications.
Luuk, Erkki & Hendrik Luuk. 2011. The redundancy of recursion and infinity for natural
language. Cognitive Processing 12(1). 1–11.
Maas, Heinz Dieter. 1977. The Saarbrücken Automatic Translation System (SUSY). In
Eric James Coates (ed.), Proceedings of the Third European Congress on Information
Systems and Networks: Overcoming the Language Barrier, vol. 1, 585–592. München:
Verlag Dokumentation.
Maché, Jakob. 2010. Towards a compositional analysis of verbless directives in German.
Paper presented at the HPSG 2010 Conference.
Machicao y Priemer, Antonio. 2015. SpaGram: An implemented grammar fragment of
Spanish. Ms. Humboldt Universität zu Berlin. In Preparation.
MacWhinney, Brian. 1995. The CHILDES project: Tools for analyzing talk. 2nd edn. Hills-
dale, NJ: Erlbaum.
Maess, Burkhard, Stefan Koelsch, Thomas C. Gunter & Angela D. Friederici. 2001.
Musical syntax is processed in Broca’s area: An MEG study. Nature Neuroscience 4(5).
540–545.
Marantz, Alec. 1984. On the nature of grammatical relations (Linguistic Inquiry Mono-
graphs 10). Cambridge, MA: MIT Press.
Marantz, Alec. 1997. No escape from syntax: Don’t try morphological analysis in the
privacy of your own lexicon. U. Penn Working Papers in Linguistics 4(2). 201–225.
http://www.ling.upenn.edu/papers/v4.2-contents.html (18 August, 2020).
Marantz, Alec. 2005. Generative linguistics within the cognitive neuroscience of lan-
guage. The Linguistic Review 22(2–4). 429–445.
Marcus, Gary F. 1993. Negative evidence in language acquisition. Cognition 46(1). 53–85.
Marcus, Gary F. & Simon E. Fisher. 2003. FOXP2 in focus: What can genes tell us about
speech and language? TRENDS in Cognitive Sciences 7(6). 257–262.
Marcus, Mitchell P. 1980. A theory of syntactic recognition for natural language. London,
England/Cambridge, MA: MIT Press.
Marimon, Montserrat. 2013. The Spanish DELPH-IN grammar. Language Resources and
Evaluation 47(2). 371–397. DOI: 10.1007/s10579-012-9199-7.
Marshall, Ian & Éva Sáfár. 2004. Sign Language generation in an ALE HPSG. In Stefan
Müller (ed.), Proceedings of the 11th International Conference on Head-Driven Phrase
Structure Grammar, Center for Computational Linguistics, Katholieke Universiteit Leu-
ven, 189–201. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/
HPSG/2004/ (18 August, 2020).
Marslen-Wilson, William D. 1975. Sentence perception as an interactive parallel process.
Science 189(4198). 226–228. DOI: 10.1126/science.189.4198.226.
Martorell, Jordi. 2018. Merging generative linguistics and psycholinguistics. Frontiers in
Psychology 9(2283). 1–5. DOI: 10.3389/fpsyg.2018.02283.
Masuichi, Hiroshi & Tomoko Ohkuma. 2003. Constructing a practical Japanese parser
based on Lexical-Functional Grammar. Journal of Natural Language Processing 10.
79–109. (In Japanese).
Masum, Mahmudul Hasan, Muhammad Sadiqul Islam, M. Sohel Rahman & Reaz Ahmed.
2012. HPSG analysis of type-based Arabic nominal declension. In The 13th Interna-
tional Arab Conference, 272–279.
Mayo, Bruce. 1997. Die Konstanzer LFG-Umgebung. Arbeitspapier 82 des Fachbereichs
Sprachwissenschaft der Universität Konstanz. Universität Konstanz.
Mayo, Bruce. 1999. A computational model of derivational morphology. Universität Ham-
burg. (Doctoral dissertation). http://www.sub.uni-hamburg.de/opus/volltexte/1999/
386/ (18 August, 2020).
McCawley, James D. 1981. The syntax and semantics of English relative clauses. Lingua
53(2). 99–149. DOI: 10.1016/0024-3841(81)90014-0.
http://www.sciencedirect.com/science/article/pii/0024384181900140.
Meinunger, André. 2000. Syntactic aspects of topic and comment (Linguistik Aktuell/Lin-
guistics Today 38). Amsterdam: John Benjamins Publishing Co.
Meisel, Jürgen M. 1995. Parameters in acquisition. In Paul Fletcher & Brian MacWhinney
(eds.), The handbook of child language, 10–35. Oxford: Blackwell Publishers Ltd. DOI:
10.1111/b.9780631203124.1996.00002.x.
Meisel, Jürgen M. 2013. Sensitive phases in successive language acquisition: The critical
period hypothesis revisited. In The Cambridge handbook of Biolinguistics (Cambridge
Handbooks in Language and Linguistics), 69–85. Cambridge, UK: Cambridge Univer-
sity Press. DOI: 10.1017/CBO9780511980435.
Mel’čuk, Igor A. 1964. Avtomatičeskij sintaksičeskij analiz. Vol. 1. Novosibirsk: Izda-
tel´stvo SO AN SSSR.
Mel’čuk, Igor A. 1981. Meaning-Text Models: A recent trend in Soviet linguistics. Annual
Review of Anthropology 10. 27–62.
Mel’čuk, Igor A. 1988. Dependency Syntax: Theory and practice (SUNY Series in Linguis-
tics). Albany, NY: SUNY Press.
Mel’čuk, Igor A. 2003. Levels of dependency description: Concepts and problems. In
Vilmos Ágel, Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen
Heringer & Henning Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein
internationales Handbuch der zeitgenössischen Forschung / An international handbook
of contemporary research, vol. 1 (Handbücher zur Sprach- und Kommunikationswis-
senschaft 25), 188–230. Berlin: Walter de Gruyter.
Melnik, Nurit. 2007. From “hand-written” to computationally implemented HPSG theo-
ries. Research on Language and Computation 5(2). 199–236.
Mensching, Guido & Eva-Maria Remberger. 2011. Syntactic variation and change in Ro-
mance: A Minimalist approach. In Peter Siemund (ed.), Linguistic universals and lan-
guage variation (Trends in Linguistics. Studies and Monographs 231), 361–403. Berlin:
Mouton de Gruyter.
Menzel, Wolfgang. 1998. Constraint satisfaction for robust parsing of spoken language.
Journal of Experimental & Theoretical Artificial Intelligence 10(1). 77–89.
Menzel, Wolfgang & Ingo Schröder. 1998a. Constraint-based diagnosis for intelligent
language tutoring systems. In Proceedings of the ITF & KNOWS Conference at the IFIP
’98 Congress. Wien/Budapest.
Menzel, Wolfgang & Ingo Schröder. 1998b. Decision procedures for Dependency Parsing
using graded constraints. In Alain Polguère & Sylvain Kahane (eds.), Processing of
dependency-based grammars: Proceedings of the workshop at COLING-ACL’98, 78–87.
Association for Computational Linguistics.
http://www.aclweb.org/anthology/W98-0509 (18 August, 2020).
Meurer, Paul. 2009. A computational grammar for Georgian. In Peter Bosch, David
Gabelaia & Jérôme Lang (eds.), Logic, language, and computation: 7th International
Tbilisi Symposium on Logic, Language, and Computation, TbiLLC 2007, Tbilisi, Geor-
gia, October 2007, revised selected papers (Lecture Notes in Artificial Intelligence 5422),
1–15. Berlin: Springer Verlag.
Meurers, Walt Detmar. 1994. On implementing an HPSG theory. In Erhard W. Hinrichs,
Walt Detmar Meurers & Tsuneko Nakazawa (eds.), Partial-VP and split-NP topicaliza-
tion in German – An HPSG analysis and its implementation (Arbeitspapiere des SFB 340
No. 58), 47–155. Tübingen: Eberhard-Karls-Universität. http://www.sfs.uni-tuebingen.
de/~dm/papers/on-implementing.html (18 August, 2020).
Meurers, Walt Detmar. 1999a. German partial-VP fronting revisited. In Gert Webelhuth,
Jean-Pierre Koenig & Andreas Kathol (eds.), Lexical and Constructional aspects of lin-
guistic explanation (Studies in Constraint-Based Lexicalism 1), 129–144. Stanford, CA:
CSLI Publications.
Meurers, Walt Detmar. 1999b. Lexical generalizations in the syntax of German non-finite
constructions. Tübingen: Eberhard-Karls-Universität. (Doctoral dissertation).
Meurers, Walt Detmar. 1999c. Raising spirits (and assigning them case). Groninger Arbei-
ten zur Germanistischen Linguistik (GAGL) 43. 173–226. http://www.sfs.uni-tuebingen.
de/~dm/papers/gagl99.html (18 August, 2020).
Meurers, Walt Detmar. 2000. Lexical generalizations in the syntax of German non-finite
constructions. Arbeitspapiere des SFB 340 No. 145. Tübingen: Eberhard-Karls-Univer-
sität. http://www.sfs.uni-tuebingen.de/~dm/papers/diss.html (18 August, 2020).
Meurers, Walt Detmar. 2001. On expressing lexical generalizations in HPSG. Nordic Jour-
nal of Linguistics 24(2). 161–217.
Meurers, Walt Detmar, Kordula De Kuthy & Vanessa Metcalf. 2003. Modularity of gram-
matical constraints in HPSG-based grammar implementations. In Emily M. Bender,
Daniel P. Flickinger, Frederik Fouvry & Melanie Siegel (eds.), Proceedings of the ESS-
LLI 2003 Workshop “Ideas and Strategies for Multilingual Grammar Development”, 83–
90. Vienna, Austria. http://www.sfs.uni-tuebingen.de/~dm/papers/meurers-dekuthy-
metcalf-03.html (22 February, 2008).
Meurers, Walt Detmar & Stefan Müller. 2009. Corpora and syntax. In Anke Lüdeling &
Merja Kytö (eds.), Corpus linguistics: An international handbook, vol. 29 (Handbücher
zur Sprach- und Kommunikationswissenschaft), chap. 42, 920–933. Berlin: Mouton de
Gruyter.
Meurers, Walt Detmar, Gerald Penn & Frank Richter. 2002. A web-based instructional
platform for constraint-based grammar formalisms and parsing. In Dragomir Radev
& Chris Brew (eds.), Effective tools and methodologies for teaching NLP and CL, 18–25.
Proceedings of the Workshop held at 40th Annual Meeting of the Association for Com-
putational Linguistics. Philadelphia, PA. Association for Computational Linguistics.
Micelli, Vanessa. 2012. Field topology and information structure: A case study for Ger-
man constituent order. In Luc Steels (ed.), Computational issues in Fluid Construction
Grammar (Lecture Notes in Computer Science 7249), 178–211. Berlin: Springer Verlag.
Michaelis, Jens. 2001. On formal properties of Minimalist Grammars. Universität Potsdam.
(Doctoral dissertation).
Michaelis, Laura A. 2006. Construction Grammar. In Keith Brown (ed.), The encyclopedia
of language and linguistics, 2nd edn., 73–84. Oxford: Elsevier Science Publisher B.V.
(North-Holland).
Michaelis, Laura A. & Josef Ruppenhofer. 2001. Beyond alternations: A Constructional
model of the German applicative pattern (Stanford Monographs in Linguistics). Stan-
ford, CA: CSLI Publications.
Miller, George A. & Kathryn Ojemann McKean. 1964. A chronometric study of some
relations between sentences. Quarterly Journal of Experimental Psychology 16(4). 297–
308.
Miller, Philip H. & Ivan A. Sag. 1997. French clitic movement without clitics or movement.
Natural Language & Linguistic Theory 15(3). 573–639.
Mittendorf, Ingo & Louisa Sadler. 2005. Numerals, nouns and number in Welsh NPs. In
Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2005 conference,
294–312. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/
10/ (18 August, 2020).
Miyao, Yusuke, Takashi Ninomiya & Jun’ichi Tsujii. 2005. Corpus-oriented grammar de-
velopment for acquiring a Head-Driven Phrase Structure Grammar from the Penn
Treebank. In Keh-Yih Su, Oi Yee Kwong, Jun’ichi Tsujii & Jong-Hyeok Lee (eds.), Natural
language processing IJCNLP 2004 (Lecture Notes in Artificial Intelligence 3248), 684–
693. Berlin: Springer Verlag.
Miyao, Yusuke & Jun’ichi Tsujii. 2008. Feature forest models for probabilistic HPSG pars-
ing. Computational Linguistics 34(1). 35–80.
Moeljadi, David, Francis Bond & Sanghoun Song. 2015. Building an HPSG-based Indone-
sian resource grammar. In Emily M. Bender, Lori Levin, Stefan Müller, Yannick Par-
mentier & Aarne Ranta (eds.), Proceedings of the Grammar Engineering Across Frame-
works (GEAF) Workshop, 9–16. The Association for Computational Linguistics.
Moens, Marc, Jo Calder, Ewan Klein, Mike Reape & Henk Zeevat. 1989. Expressing gener-
alizations in unification-based grammar formalisms. In Harold Somers & Mary McGee
Wood (eds.), Proceedings of the Fourth Conference of the European Chapter of the Asso-
ciation for Computational Linguistics, 174–181. Manchester, England: Association for
Computational Linguistics.
Monachesi, Paola. 1998. Italian restructuring verbs: A lexical analysis. In Erhard W. Hin-
richs, Andreas Kathol & Tsuneko Nakazawa (eds.), Complex predicates in nonderiva-
tional syntax (Syntax and Semantics 30), 313–368. San Diego: Academic Press.
Montague, Richard. 1974. Formal philosophy. New Haven: Yale University Press.
Moortgat, Michael. 1989. Categorical investigations: Logical and linguistic aspects
of the Lambek Calculus (Groningen Amsterdam Studies in Semantics 9). Dor-
drecht/Cinnaminson, U.S.A.: Foris Publications.
Moortgat, Michael. 2011. Categorial type logics. In Johan F. A. K. van Benthem & G. B. Al-
ice ter Meulen (eds.), Handbook of logic and language, 2nd edn., 95–179. Amsterdam:
Elsevier.
Moot, Richard. 2002. Proof nets for linguistic analysis. University of Utrecht. (Doctoral
dissertation).
Morgan, James L. 1989. Learnability considerations and the nature of trigger experiences
in language acquisition. Behavioral and Brain Sciences 12(2). 352–353.
Morin, Yves Ch. 1973. A computer tested Transformational Grammar of French. Linguis-
tics 116(11). 49–114.
Morrill, Glyn. 1995. Discontinuity in Categorial Grammar. Linguistics and Philosophy
18(2). 175–219.
Morrill, Glyn. 2012. CatLog: A Categorial parser/theorem-prover. In Logical aspects of
computational linguistics: System demonstrations, 13–16. Nantes, France: University of
Nantes.
Morrill, Glyn. 2017. Parsing Logical Grammar: CatLog3. In Roussanka Loukanova &
Kristina Liefke (eds.), Proceedings of the workshop on logic and algorithms in compu-
tational linguistics 2017, LACompLing2017, DiVA, 107–131. Stockholm: Stockholm Uni-
versity.
Morrill, Glyn V. 1994. Type Logical Grammars: Categorial logic of signs. Dordrecht: Kluwer
Academic Publishers.
Müller, Gereon. 1996a. A constraint on remnant movement. Natural Language & Linguis-
tic Theory 14(2). 355–407.
Müller, Gereon. 1996b. On extraposition and successive cyclicity. In Uli Lutz & Jürgen
Pafel (eds.), On extraction and extraposition in German (Linguistik Aktuell/Linguistics
Today 11), 213–243. Amsterdam: John Benjamins Publishing Co.
Müller, Gereon. 1998. Incomplete category fronting: A derivational approach to remnant
movement in German (Studies in Natural Language and Linguistic Theory 42). Dor-
drecht: Kluwer Academic Publishers.
Müller, Gereon. 2000. Elemente der optimalitätstheoretischen Syntax (Stauffenburg Lin-
guistik 20). Tübingen: Stauffenburg Verlag.
Müller, Gereon. 2009a. There are no Constructions. Handout Ringvorlesung: Algorithmen
und Muster: Strukturen in der Sprache. Freie Universität Berlin, 20. Mai.
https://hpsg.hu-berlin.de/Events/Ring2009/gmueller.pdf (30 June, 2019).
Müller, Gereon. 2011. Regeln oder Konstruktionen? Von verblosen Direktiven zur sequen-
tiellen Nominalreduplikation. In Stefan Engelberg, Anke Holler & Kristel Proost (eds.),
Sprachliches Wissen zwischen Lexikon und Grammatik (Institut für Deutsche Sprache,
Jahrbuch 2010), 211–249. Berlin: de Gruyter.
Müller, Gereon. 2014a. Syntactic buffers. Linguistische Arbeitsberichte 91. Institut für
Linguistik, Universität Leipzig. http://www.uni-leipzig.de/~muellerg/mu765.pdf
(18 August, 2020).
Müller, Natascha & Beate Riemer. 1998. Generative Syntax der romanischen Sprachen:
Französisch, Italienisch, Portugiesisch, Spanisch (Stauffenburg Einführungen 17). Tü-
bingen: Stauffenburg Verlag.
Müller, Stefan. 1995. Scrambling in German – Extraction into the Mittelfeld. In Benjamin
K. T’sou & Tom Bong Yeung Lai (eds.), Proceedings of the Tenth Pacific Asia Conference
on Language, Information and Computation, 79–83. City University of Hong Kong.
Müller, Stefan. 1996c. The Babel-System: An HPSG fragment for German, a parser, and
a dialogue component. In Proceedings of the Fourth International Conference on the
Practical Application of Prolog, 263–277. London.
Müller, Stefan. 1996d. Yet another paper about partial verb phrase fronting in German. In
Jun-ichi Tsujii (ed.), Proceedings of COLING-96: 16th International Conference on Com-
putational Linguistics (COLING96). Copenhagen, Denmark, August 5–9, 1996, 800–805.
Copenhagen, Denmark: Association for Computational Linguistics.
Müller, Stefan. 1999a. An HPSG-analysis for free relative clauses in German. Grammars
2(1). 53–105. DOI: 10.1023/A:1004564801304.
Müller, Stefan. 1999b. Deutsche Syntax deklarativ: Head-Driven Phrase Structure Grammar
für das Deutsche (Linguistische Arbeiten 394). Tübingen: Max Niemeyer Verlag.
Müller, Stefan. 1999c. Parsing of an HPSG grammar for German: word order domains
and discontinuous constituents. In Jost Gippert & Peter Olivier (eds.), Multilinguale
Corpora: Codierung, Strukturierung, Analyse. 11. Jahrestagung der Gesellschaft für Lin-
guistische Datenverarbeitung, 292–303. Prag: enigma corporation.
Müller, Stefan. 1999d. Restricting discontinuity. In Proceedings of the 5th Natural Lan-
guage Processing Pacific Rim Symposium 1999 (NLPRS’99), 85–90. Peking.
Müller, Stefan. 2001. Case in German – towards an HPSG analysis. In Walt Detmar Meu-
rers & Tibor Kiss (eds.), Constraint-based approaches to Germanic syntax (Studies in
Constraint-Based Lexicalism 7), 217–255. Stanford, CA: CSLI Publications.
Müller, Stefan. 2002a. Complex predicates: Verbal complexes, resultative constructions, and
particle verbs in German (Studies in Constraint-Based Lexicalism 13). Stanford, CA:
CSLI Publications.
Müller, Stefan. 2002b. Multiple frontings in German. In Gerhard Jäger, Paola Monachesi,
Gerald Penn & Shuly Wintner (eds.), Proceedings of Formal Grammar 2002, 113–124.
Trento.
Müller, Stefan. 2002c. Syntax or morphology: German particle verbs revisited. In Nicole
Dehé, Ray S. Jackendoff, Andrew McIntyre & Silke Urban (eds.), Verb-particle explo-
rations (Interface Explorations 1), 119–139. Berlin: Mouton de Gruyter.
Müller, Stefan. 2003a. Mehrfache Vorfeldbesetzung. Deutsche Sprache 31(1). 29–62.
Müller, Stefan. 2003b. Object-to-subject-raising and lexical rule: An analysis of the Ger-
man passive. In Stefan Müller (ed.), Proceedings of the 10th International Conference on
Head-Driven Phrase Structure Grammar, Michigan State University, East Lansing, 278–
297. Stanford, CA: CSLI Publications.
Müller, Stefan. 2003c. Solving the bracketing paradox: An analysis of the morphol-
ogy of German particle verbs. Journal of Linguistics 39(2). 275–325.
DOI: 10.1017/S0022226703002032.
Müller, Stefan. 2004a. An analysis of depictive secondary predicates in German without
discontinuous constituents. In Stefan Müller (ed.), Proceedings of the 11th International
Conference on Head-Driven Phrase Structure Grammar, Center for Computational Lin-
guistics, Katholieke Universiteit Leuven, 202–222. Stanford, CA: CSLI Publications.
Müller, Stefan. 2004b. An HPSG analysis of German depictive secondary predicates. In
Lawrence S. Moss & Richard T. Oehrle (eds.), Proceedings of the joint meeting of the 6th
Conference on Formal Grammar and the 7th Conference on Mathematics of Language
(Electronic Notes in Theoretical Computer Science 53), 233–245. Helsinki: Elsevier
Science Publisher B.V. (North-Holland). DOI: 10.1016/S1571-0661(05)82585-X.
Müller, Stefan. 2004c. Complex NPs, subjacency, and extraposition. Snippets 8. 10–11.
Müller, Stefan. 2004d. Continuous or discontinuous constituents? A comparison between
syntactic analyses for constituent order and their processing systems. Research on Lan-
guage and Computation, Special Issue on Linguistic Theory and Grammar Implementa-
tion 2(2). 209–257. DOI: 10.1023/B:ROLC.0000016785.49729.d7.
Müller, Stefan. 2005a. Resultative Constructions: Syntax, world knowledge, and colloca-
tional restrictions: Review of Hans C. Boas: A Constructional approach to resultatives.
Studies in Language 29(3). 651–681.
Müller, Stefan. 2005b. Zur Analyse der deutschen Satzstruktur. Linguistische Berichte 201.
3–39.
Müller, Stefan. 2005c. Zur Analyse der scheinbar mehrfachen Vorfeldbesetzung. Lingui-
stische Berichte 203. 297–330.
Müller, Stefan. 2006. Phrasal or lexical Constructions? Language 82(4). 850–883. DOI:
10.1353/lan.2006.0213.
Müller, Stefan. 2007a. Head-Driven Phrase Structure Grammar: Eine Einführung. 1st edn.
(Stauffenburg Einführungen 17). Tübingen: Stauffenburg Verlag.
Müller, Stefan. 2007b. Phrasal or lexical Constructions: Some comments on underspec-
ification of constituent order, compositionality, and control. In Stefan Müller (ed.),
Proceedings of the 14th International Conference on Head-Driven Phrase Structure Gram-
mar, 373–393. Stanford, CA: CSLI Publications.
Müller, Stefan. 2007c. Qualitative Korpusanalyse für die Grammatiktheorie: Introspek-
tion vs. Korpus. In Gisela Zifonun & Werner Kallmeyer (eds.), Sprachkorpora – Daten-
mengen und Erkenntnisfortschritt (Institut für Deutsche Sprache Jahrbuch 2006), 70–
90. Berlin: Walter de Gruyter.
Müller, Stefan. 2007d. The Grammix CD Rom: A software collection for developing typed
feature structure grammars. In Tracy Holloway King & Emily M. Bender (eds.), Grammar
Engineering across Frameworks 2007 (Studies in Computational Linguistics ONLINE),
259–266. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/GEAF/2007/
(18 August, 2020).
Müller, Stefan. 2008. Depictive secondary predicates in German and English. In
Christoph Schroeder, Gerd Hentschel & Winfried Boeder (eds.), Secondary predicates
in Eastern European languages and beyond (Studia Slavica Oldenburgensia 16), 255–
273. Oldenburg: BIS-Verlag.
Müller, Stefan. 2009b. A Head-Driven Phrase Structure Grammar for Maltese. In Bernard
Comrie, Ray Fabri, Beth Hume, Manwel Mifsud, Thomas Stolz & Martine Vanhove
(eds.), Introducing Maltese linguistics: Papers from the 1st International Conference on
Maltese Linguistics (Bremen/Germany, 18–20 October, 2007) (Studies in Language Com-
panion Series 113), 83–112. Amsterdam: John Benjamins Publishing Co.
Müller, Stefan. 2009c. On predication. In Stefan Müller (ed.), Proceedings of the 16th Inter-
national Conference on Head-Driven Phrase Structure Grammar, University of Göttingen,
Germany, 213–233. Stanford, CA: CSLI Publications.
Müller, Stefan. 2010a. Grammatiktheorie (Stauffenburg Einführungen 20). Tübingen:
Stauffenburg Verlag.
Müller, Stefan. 2010b. Persian complex predicates and the limits of inheritance-based
analyses. Journal of Linguistics 46(3). 601–655. DOI: 10.1017/S0022226709990284.
Müller, Stefan. 2012a. A personal note on open access in linguistics. Journal of Language
Modelling (1). 9–39. DOI: 10.15398/jlm.v0i1.52.
Müller, Stefan. 2012b. On the copula, specificational constructions and type shifting. Ms.
Freie Universität Berlin.
Müller, Stefan. 2013a. Grammatiktheorie. 2nd edn. (Stauffenburg Einführungen 20). Tü-
bingen: Stauffenburg Verlag.
Müller, Stefan. 2013b. The CoreGram project: A brief overview and motivation. In Denys
Duchier & Yannick Parmentier (eds.), Proceedings of the workshop on high-level method-
ologies for grammar engineering (HMGE 2013), Düsseldorf , 93–104.
Müller, Stefan. 2013c. Unifying everything: Some remarks on Simpler Syntax, Construction
Grammar, Minimalism and HPSG. Language 89(4). 920–950. DOI: 10.1353/lan.2013.0061.
Müller, Stefan. 2014b. Artenvielfalt und Head-Driven Phrase Structure Grammar. In Jörg
Hagemann & Sven Staffeldt (eds.), Syntaxtheorien: Analysen im Vergleich (Stauffenburg
Einführungen 28), 187–233. Tübingen: Stauffenburg Verlag.
Müller, Stefan. 2014c. Elliptical constructions, multiple frontings, and surface-based
syntax. In Paola Monachesi, Gerhard Jäger, Gerald Penn & Shuly Wintner (eds.),
Proceedings of Formal Grammar 2004, Nancy, 91–109. Stanford, CA: CSLI Publications.
Müller, Stefan. 2014d. Kernigkeit: Anmerkungen zur Kern-Peripherie-Unterscheidung.
In Antonio Machicao y Priemer, Andreas Nolda & Athina Sioupi (eds.), Zwischen
Kern und Peripherie (studia grammatica 76), 25–39. Berlin: de Gruyter. DOI: 10.1524/
9783050065335.
Müller, Stefan. 2015a. HPSG – A synopsis. In Tibor Kiss & Artemis Alexiadou (eds.),
Syntax – Theory and analysis: An international handbook (Handbooks of Linguistics
and Communication Science 42.2), 937–973. Berlin: Walter de Gruyter. DOI: 10.1515/
9783110363708-004.
Müller, Stefan. 2015b. Satztypen: Lexikalisch oder/und phrasal. In Rita Finkbeiner & Jörg
Meibauer (eds.), Satztypen und Konstruktionen im Deutschen (Linguistik – Impulse &
Tendenzen 65), 72–105. Berlin: de Gruyter. DOI: 10.1515/9783110423112-004.
Müller, Stefan. 2015c. The CoreGram project: Theoretical linguistics, theory development
and verification. Journal of Language Modelling 3(1). 21–86. DOI: 10.15398/jlm.v3i1.91.
Müller, Stefan. 2016. Grammatical theory: From Transformational Grammar to constraint-
based approaches (Textbooks in Language Sciences 1). Berlin: Language Science Press.
DOI: 10.17169/langsci.b25.167.
Müller, Stefan. 2017a. German clause structure: An analysis with special consideration of
so-called multiple fronting (Empirically Oriented Theoretical Morphology and Syntax).
Revise and resubmit. Berlin: Language Science Press.
Müller, Stefan. 2017b. Head-Driven Phrase Structure Grammar, Sign-Based Construc-
tion Grammar, and Fluid Construction Grammar: Commonalities and differences. Con-
structions and Frames 9(1). 139–174. DOI: 10.1075/cf.9.1.05mul.
Müller, Stefan. 2018a. A lexicalist account of argument structure: Template-based phrasal
LFG approaches and a lexical HPSG alternative (Conceptual Foundations of Language
Science 2). Berlin: Language Science Press. DOI: 10.5281/zenodo.1441351.
Müller, Stefan. 2018b. The end of lexicalism as we know it? Language 94(1). e54–e66.
DOI: 10.1353/lan.2018.0014.
Müller, Stefan. 2019a. Complex predicates: Structure, potential structure and underspec-
ification. Linguistic Issues in Language Technology 16(3). 2–10.
Müller, Stefan. 2019b. Evaluating theories: Counting nodes and the question of con-
stituency. Language Under Discussion 5(1). 52–67. DOI: 10.31885/lud.5.1.226.
Müller, Stefan. 2019c. Germanic syntax. Ms. Humboldt Universität zu Berlin, to be submitted
to Language Science Press. Berlin. https://hpsg.hu-berlin.de/~stefan/Pub/germanic.html
(30 June, 2019).
Müller, Stefan. 2020a. Constituent order. In Stefan Müller, Anne Abeillé, Robert D. Bors-
ley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar: The handbook
(Empirically Oriented Theoretical Morphology and Syntax). To appear. Berlin: Lan-
guage Science Press.
Müller, Stefan. 2020b. HPSG and Construction Grammar. In Stefan Müller, Anne Abeillé,
Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar:
The handbook (Empirically Oriented Theoretical Morphology and Syntax). To appear.
Berlin: Language Science Press.
Müller, Stefan, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.). 2020. Head-
Driven Phrase Structure Grammar: The handbook (Empirically Oriented Theoretical
Morphology and Syntax). To appear. Berlin: Language Science Press.
Müller, Stefan & Masood Ghayoomi. 2010. PerGram: A TRALE implementation of an
HPSG fragment of Persian. In Proceedings of 2010 IEEE International Multiconference
on Computer Science and Information Technology – Computational Linguistics Applications
(CLA’10). Wisła, Poland, 18–20 October 2010, vol. 5, 461–467. Polish Information
Processing Society.
Müller, Stefan & Martin Haspelmath. 2013. Language Science Press: A publication model
for open-access books in linguistics. Grant Proposal to the DFG.
Müller, Stefan & Walter Kasper. 2000. HPSG analysis of German. In Wolfgang Wahlster
(ed.), Verbmobil: foundations of speech-to-speech translation (Artificial Intelligence),
238–253. Berlin: Springer Verlag.
Müller, Stefan & Janna Lipenkova. 2009. Serial verb constructions in Chinese: An HPSG
account. In Stefan Müller (ed.), Proceedings of the 16th International Conference on
Head-Driven Phrase Structure Grammar, University of Göttingen, Germany, 234–254.
Stanford, CA: CSLI Publications.
Müürisep, Kaili, Tiina Puolakainen, Kadri Muischnek, Mare Koit, Tiit Roosmaa & Heli
Uibo. 2003. A new language for Constraint Grammar: Estonian. In International Con-
ference Recent Advances in Natural Language Processing, 304–310.
Muysken, Pieter. 1982. Parametrizing the notion of “head”. Journal of Linguistic Research
2. 57–75.
Mykowiecka, Agnieszka, Małgorzata Marciniak, Adam Przepiórkowski & Anna Kupść.
2003. An implementation of a Generative Grammar of Polish. In Peter Kosta, Joanna
Błaszczak, Jens Frasek, Ljudmila Geist & Marzena Żygis (eds.), Investigations into formal
Slavic linguistics: Contributions of the Fourth European Conference on Formal Description
of Slavic Languages – FDSL IV held at Potsdam University, November 28–30,
2001, 271–285. Frankfurt am Main: Peter Lang.
Nanni, Debbie L. & Justine T. Stillings. 1978. Three remarks on pied piping. Linguistic
Inquiry 9(2). 310–318.
Naumann, Sven. 1987. Ein einfacher Parser für generalisierte Phrasenstrukturgram-
matiken. Zeitschrift für Sprachwissenschaft 6(2). 206–226.
Naumann, Sven. 1988. Generalisierte Phrasenstrukturgrammatik: Parsingstrategien,
Regelorganisation und Unifikation (Linguistische Arbeiten 212). Tübingen: Max Nie-
meyer Verlag.
Neeleman, Ad. 1994. Complex predicates. Utrecht: Onderzoeksinstituut voor Taal en
Spraak (OTS). (Doctoral dissertation).
Nelimarkka, Esa, Harri Jäppinen & Aarno Lehtola. 1984. Two-way finite automata and
Dependency Grammar: A parsing method for inflectional free word order languages.
In Yorick Wilks (ed.), Proceedings of the 10th International Conference on Computational
Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics,
389–392. Stanford University, CA: Association for Computational Linguistics.
Nerbonne, John. 1986a. ‘Phantoms’ and German fronting: Poltergeist constituents? Lin-
guistics 24(5). 857–870. DOI: 10.1515/ling.1986.24.5.857.
Nerbonne, John. 1986b. A phrase-structure grammar for German passives. Linguistics
24(5). 907–938.
Nerbonne, John. 1993. A feature-based syntax/semantics interface. Annals of Mathemat-
ics and Artificial Intelligence 8(1–2). Special issue on Mathematics of Language edited
by Alexis Manaster-Ramer and Wlodek Zadrozny, selected from the 2nd Conference
on Mathematics of Language. Also published as DFKI Research Report RR-92-42, 107–
132.
Nerbonne, John, Klaus Netter & Carl Pollard (eds.). 1994. German in Head-Driven Phrase
Structure Grammar (CSLI Lecture Notes 46). Stanford, CA: CSLI Publications.
Netter, Klaus. 1991. Clause union phenomena and complex predicates in German. DYANA
Report, Deliverable R1.1.B. University of Edinburgh.
Netter, Klaus. 1992. On Non-Head Non-Movement: An HPSG Treatment of Finite Verb
Position in German. In Günther Görz (ed.), Konvens 92. 1. Konferenz „Verarbeitung na-
türlicher Sprache“. Nürnberg 7.–9. Oktober 1992 (Informatik aktuell), 218–227. Berlin:
Springer Verlag.
Netter, Klaus. 1993. Architecture and coverage of the DISCO Grammar. In Stephan Buse-
mann & Karin Harbusch (eds.), DFKI Workshop on Natural Language Systems: Re-
Usability and Modularity, October 23 (DFKI Document D-93-03), 1–10. Saarbrücken,
Germany: DFKI.
Netter, Klaus. 1994. Towards a theory of functional heads: German nominal phrases.
In John Nerbonne, Klaus Netter & Carl Pollard (eds.), German in Head-Driven Phrase
Structure Grammar (CSLI Lecture Notes 46), 297–340. Stanford, CA: CSLI Publications.
Netter, Klaus. 1996. Functional categories in an HPSG for German. Saarbrücken: Univer-
sität des Saarlandes. (Dissertation).
Netter, Klaus. 1998. Functional categories in an HPSG for German (Saarbrücken Dis-
sertations in Computational Linguistics and Language Technology 3). Saarbrücken:
Deutsches Forschungszentrum für Künstliche Intelligenz and Universität des Saarlan-
des.
Neville, Anne & Patrizia Paggio. 2004. Developing a Danish grammar in the GRASP
project: A construction-based approach to topology and extraction in Danish. In
Lawrence S. Moss & Richard T. Oehrle (eds.), Proceedings of the joint meeting of the
6th Conference on Formal Grammar and the 7th Conference on Mathematics of Language
(Electronic Notes in Theoretical Computer Science 53), 246–259. Helsinki: Elsevier Sci-
ence Publisher B.V. (North-Holland).
Nevins, Andrew Ira, David Pesetsky & Cilene Rodrigues. 2009. Pirahã exceptionality: A
reassessment. Language 85(2). 355–404. DOI: 10.1353/lan.0.0107.
Newmeyer, Frederick J. 2004a. Against a parameter-setting approach to language varia-
tion. Linguistic Variation Yearbook 4. 181–234.
Newmeyer, Frederick J. 2004b. Typological evidence and Universal Grammar. Studies in
Language 28(3). 527–548.
Newmeyer, Frederick J. 2005. Possible and probable languages: A Generative perspective
on linguistic typology. Oxford: Oxford University Press.
Newmeyer, Frederick J. 2010. On comparative concepts and descriptive categories: A
reply to Haspelmath. Language 86(3). 688–695.
Newport, Elissa L. 1990. Maturational constraints on language learning. Cognitive Science
14(1). 11–28.
Ng, Say Kiat. 1997. A double-specifier account of Chinese NPs using Head-Driven Phrase
Structure Grammar. University of Edinburgh, Department of Linguistics. (MSc Speech
and Language Processing).
Nivre, Joakim. 2003. An efficient algorithm for projective dependency parsing. In Gertjan
van Noord (ed.), Proceedings of the 8th International Workshop on Parsing Technologies
(IWPT 03). Nancy.
Nolda, Andreas. 2007. Die Thema-Integration: Syntax und Semantik der gespaltenen Top-
ikalisierung im Deutschen (Studien zur deutschen Grammatik 72). Tübingen: Stauffen-
burg Verlag.
Noonan, Michael. 1994. A tale of two passives in Irish. In Barbara Fox & Paul J. Hopper
(eds.), Voice: form and function (Typological Studies in Language 27), 279–311. Amster-
dam: John Benjamins Publishing Co.
Nordgård, Torbjørn. 1994. E-Parser: An implementation of a deterministic GB-related
parsing system. Computers and the Humanities 28(4–5). 259–272.
Nordlinger, Rachel. 1998. Constructive case: Evidence from Australian languages (Disser-
tations in Linguistics). Stanford, CA: CSLI Publications.
Nowak, Martin A., Natalia L. Komarova & Partha Niyogi. 2001. Evolution of Universal
Grammar. Science 291(5501). 114–118.
Nozohoor-Farshi, R. 1986. On formalizations of Marcus’ parser. In Makoto Nagao (ed.),
Proceedings of COLING 86, 533–535. University of Bonn: Association for Computa-
tional Linguistics.
Nozohoor-Farshi, R. 1987. Context-freeness of the language accepted by Marcus’ parser.
In Candy Sidner (ed.), 25th Annual Meeting of the Association for Computational Lin-
guistics, 117–122. Stanford, CA: Association for Computational Linguistics.
Nunberg, Geoffrey. 1995. Transfers of meaning. Journal of Semantics 12(2). 109–132.
Nunberg, Geoffrey, Ivan A. Sag & Thomas Wasow. 1994. Idioms. Language 70(3). 491–
538.
Nunes, Jairo. 2004. Linearization of chains and Sideward Movement (Linguistic Inquiry
Monographs 43). Cambridge, MA: MIT Press.
Nykiel, Joanna & Jong-Bok Kim. 2020. Ellipsis. In Stefan Müller, Anne Abeillé, Robert D.
Borsley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar: The hand-
book (Empirically Oriented Theoretical Morphology and Syntax). To appear. Berlin:
Language Science Press.
O’Donovan, Ruth, Michael Burke, Aoife Cahill, Josef van Genabith & Andy Way. 2005.
Large-scale induction and evaluation of lexical resources from the Penn-II and Penn-
III Treebanks. Computational Linguistics 31(3). 328–365.
O’Neill, Michael & Randall Wood. 2012. The grammar of happiness. Essential Media &
Entertainment / Smithsonian Networks. https://daneverettbooks.com/media/movies/
(10 December, 2019).
Ochs, Elinor. 1982. Talking to children in Western Samoa. Language in Society 11(1).
77–104.
Ochs, Elinor & Bambi B. Schieffelin. 1985. Language acquisition and socialization: Three
developmental stories. In Richard A. Shweder & Robert A. LeVine (eds.), Culture the-
ory: Essays in mind, self and emotion, 276–320. Cambridge, UK: Cambridge University
Press.
Oepen, Stephan & Daniel P. Flickinger. 1998. Towards systematic grammar profiling:
Test suite technology ten years after. Journal of Computer Speech and Language 12(4).
(Special Issue on Evaluation), 411–436. http://www.delph-in.net/itsdb/publications/
profiling.ps.gz (18 August, 2020).
Oliva, Karel. 1992. Word order constraints in binary branching syntactic structures. CLAUS-
Report 20. Saarbrücken: Universität des Saarlandes.
Oliva, Karel. 2003. Dependency, valency and Head-Driven Phrase-Structure Grammar. In
Vilmos Ágel, Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen
Heringer & Henning Lobin (eds.), Dependenz und Valenz / Dependency and valency: Ein
internationales Handbuch der zeitgenössischen Forschung / An international handbook
Osborne, Timothy & Thomas M. Groß. 2012. Constructions are catenae: Construction
Grammar meets Dependency Grammar. Cognitive Linguistics 23(1). 165–216.
Osborne, Timothy, Michael Putnam & Thomas M. Groß. 2011. Bare Phrase Structure,
label-less trees, and specifier-less syntax: Is Minimalism becoming a Dependency
Grammar? The Linguistic Review 28(3). 315–364.
Osenova, Petya. 2010a. Bulgarian Resource Grammar – efficient and realistic (BURGER).
Tech. rep. LingoLab, CSLI Stanford.
http://www.bultreebank.org/BURGER/BURGER3.pdf (18 August, 2020).
Osenova, Petya. 2010b. Bulgarian Resource Grammar: modeling Bulgarian in HPSG. Saar-
brücken: VDM Verlag Dr. Müller.
Osenova, Petya. 2011. Localizing a core HPSG-based grammar for Bulgarian. In
Hanna Hedeland, Thomas Schmidt & Kai Wörner (eds.), Multilingual resources
and multilingual applications: Proceedings of the Conference of the German Society
for Computational Linguistics and Language Technology (GSCL) 2011 (Arbeiten zur
Mehrsprachigkeit/Working Papers in Multilingualism, Folge B/Series B 96), 175–182.
Hamburg: Universität Hamburg.
Ott, Dennis. 2011. A note on free relative clauses in the theory of Phases. Linguistic In-
quiry 42(1). 183–192.
Özkaragöz, İnci. 1986. Monoclausal double passives in Turkish. In Dan I. Slobin & Karl
Zimmer (eds.), Studies in Turkish linguistics (Typological Studies in Language 8), 77–
91. Amsterdam: John Benjamins Publishing Co.
Packard, Woodley. 2015. Full forest treebanking. University of Washington. (MA thesis).
Pafel, Jürgen. 1993. Ein Überblick über die Extraktion aus Nominalphrasen im Deutschen.
In Franz-Josef d’Avis, Sigrid Beck, Uli Lutz, Jürgen Pafel & Susanne Trissler (eds.),
Extraktion im Deutschen I (Arbeitspapiere des SFB 340 No. 34), 191–245. Tübingen:
Eberhard-Karls-Universität Tübingen.
Pafel, Jürgen. 2011. Einführung in die Syntax: Grundlagen – Strukturen – Theorien.
Stuttgart, Weimar: Verlag J. B. Metzler.
Paggio, Patrizia. 2005. Representing information structure in a formal grammar of Dan-
ish. In Takashi Washio, Akito Sakurai, Katsuto Nakajima, Hideaki Takeda, Satoshi Tojo
& Makoto Yokoo (eds.), New frontiers in artificial intelligence: Joint JSAI 2005 Workshop
post-proceedings (Lecture Notes in Computer Science 4012), 93–102. Berlin: Springer
Verlag. DOI: 10.1007/11780496.
Parmentier, Yannick, Laura Kallmeyer, Wolfgang Maier, Timm Lichte & Johannes
Dellert. 2008. TuLiPA: A syntax-semantics parsing environment for mildly context-sensitive
formalisms. In Proceedings of the Ninth International Workshop on Tree Adjoining
Grammars and Related Formalisms (TAG+9), 121–128. Tübingen.
https://www.aclweb.org/anthology/W08-2316.pdf (18 August, 2020).
Partee, Barbara H. 1987. Noun phrase interpretation and type-shifting principles. In
Jeroen A. G. Groenendijk, Dick de Jongh & Martin J. B. Stokhof (eds.), Studies in
Discourse Representation Theory and the theory of generalized quantifiers (Groningen-
Amsterdam Studies in Semantics 8), 115–143. Dordrecht: Foris Publications.
Patejuk, Agnieszka & Adam Przepiórkowski. 2012. Towards an LFG parser for Polish: An
exercise in parasitic grammar development. In Proceedings of the Eighth International
Conference on Language Resources and Evaluation, LREC 2012, 3849–3852. Istanbul,
Turkey.
Paul, Hermann. 1919. Deutsche Grammatik. Teil IV: Syntax. Vol. 3. 2nd unchanged edition
1968, Tübingen: Max Niemeyer Verlag. Halle an der Saale: Max Niemeyer Verlag.
Paul, Soma. 2004. An HPSG account of Bangla compound verbs with LKB implementation.
Hyderabad, India: CALTS, University of Hyderabad, India. (Doctoral dissertation).
Penn, Gerald. 2004. Balancing clarity and efficiency in typed feature logic through de-
laying. In Donia Scott (ed.), Proceedings of the 42nd Meeting of the Association for Com-
putational Linguistics (ACL’04), main volume, 239–246. Barcelona, Spain.
Penn, Gerald & Bob Carpenter. 1999. ALE for speech: A translation prototype. In Géza
Gordos (ed.), Proceedings of the 6th Conference on Speech Communication and Technol-
ogy (EUROSPEECH). Budapest, Hungary.
Perlmutter, David M. 1978. Impersonal passives and the Unaccusative Hypothesis. In
Proceedings of the 4th Annual Meeting of the Berkeley Linguistics Society, 157–189. Berkeley
Linguistics Society. https://escholarship.org/uc/item/73h0s91v (26 April, 2018).
Perlmutter, David M. (ed.). 1983. Studies in Relational Grammar. Vol. 1. Chicago, IL: The
University of Chicago Press.
Perlmutter, David M. (ed.). 1984. Studies in Relational Grammar. Vol. 2. Chicago, IL: The
University of Chicago Press.
Perlmutter, David M. & John Robert Ross. 1970. Relative clauses with split antecedents.
Linguistic Inquiry 1(3). 350.
Pesetsky, David. 1996. Zero syntax: Experiencers and cascades. Cambridge, MA: MIT Press.
Peters, Stanley & R. W. Ritchie. 1973. On the generative power of Transformational Gram-
mar. Information Sciences 6(C). 49–83. DOI: 10.1016/0020-0255(73)90027-3.
Petrick, Stanley Roy. 1965. A recognition procedure for Transformational Grammars. Mas-
sachusetts Institute of Technology. Dept. of Modern Languages. (Doctoral disserta-
tion). http://hdl.handle.net/1721.1/13013 (18 August, 2020).
Phillips, Colin. 2003. Linear order and constituency. Linguistic Inquiry 34(1). 37–90. DOI:
10.1162/002438903763255922.
Phillips, John D. 1992. A computational representation for Generalised Phrase Structure
Grammars. Linguistics and Philosophy 15(3). 255–287.
Phillips, John D. & Henry S. Thompson. 1985. GPSGP – A parser for Generalized Phrase
Structure Grammar. Linguistics 23(2). 245–261.
Piattelli-Palmarini, Massimo (ed.). 1980. Language and learning: The debate between Jean
Piaget and Noam Chomsky. Cambridge: Harvard University Press.
Pickering, Martin & Guy Barry. 1993. Dependency Categorial Grammar and coordina-
tion. Linguistics 31(5). 855–902.
Pienemann, Manfred. 2005. An introduction to Processability Theory. In Manfred Pienemann
(ed.), Cross-linguistic aspects of Processability Theory, 1–60. Amsterdam: John Benjamins
Publishing Co.
Piñango, Maria Mercedes, Jennifer Mack & Ray S. Jackendoff. 2006. Semantic combi-
natorial processes in argument structure: evidence from light-verbs. In Zhenya Antić,
Charles B. Chang, Emily Cibelli, Jisup Hong, Michael J. Houser, Clare S. Sandy, Maziar
Toosarvandani & Yao Yao (eds.), Proceedings of the 32nd Annual Meeting of the Berkeley
Linguistics Society: Theoretical approaches to argument structure, vol. 32. Berkeley, CA:
Berkeley Linguistics Society.
Pineda, Luis Alberto & Iván V. Meza. 2005a. A computational model of the Spanish clitic
system. In Alexander Gelbukh (ed.), Computational linguistics and intelligent language
processing, 73–82. Berlin: Springer Verlag.
Pineda, Luis Alberto & Iván V. Meza. 2005b. The Spanish pronominal clitic system. Proce-
samiento del Lenguaje Natural 34. 67–103.
Pinker, Steven. 1984. Learnability and cognition: The acquisition of argument structure.
London/Cambridge, MA: MIT Press.
Pinker, Steven. 1994. The language instinct: How the mind creates language. New York:
William Morrow.
Pinker, Steven & Ray S. Jackendoff. 2005. The faculty of language: What’s special about
it? Cognition 95(2). 201–236.
Pittner, Karin. 1995. Regeln für die Bildung von freien Relativsätzen: Eine Antwort an
Oddleif Leirbukt. Deutsch als Fremdsprache 32(4). 195–200.
Plank, Frans & Elena Filimonova. 2000. The universals archive: A brief introduction for
prospective users. Sprachtypologie und Universalienforschung 53(1). 109–123.
Plath, Warren J. 1973. Transformational Grammar and transformational parsing in the
Request System. In A. Zampolli & N. Calzolari (eds.), COLING 1973: computational and
mathematical linguistics: proceedings of the international conference on computational
linguistics, vol. 2. https://www.aclweb.org/anthology/C73-2028.
Poletto, Cecilia. 2000. The higher functional field: Evidence from Northern Italian Dialects.
Oxford: Oxford University Press.
Pollard, Carl. 1994. Toward a unified account of passive in German. In John Nerbonne,
Klaus Netter & Carl Pollard (eds.), German in Head-Driven Phrase Structure Grammar
(CSLI Lecture Notes 46), 273–296. Stanford, CA: CSLI Publications.
Pollard, Carl. 1997. The nature of constraint-based grammar. Linguistic Research 15.
http://isli.khu.ac.kr/journal/content/data/15/1.pdf (18 August, 2020).
Pollard, Carl & Ivan A. Sag. 1987. Information-based syntax and semantics (CSLI Lecture
Notes 13). Stanford, CA: CSLI Publications.
Pollard, Carl & Ivan A. Sag. 1992. Anaphors in English and the scope of Binding Theory.
Linguistic Inquiry 23(2). 261–303.
Pollard, Carl & Ivan A. Sag. 1994. Head-Driven Phrase Structure Grammar (Studies in
Contemporary Linguistics). Chicago: The University of Chicago Press.
Pollard, Carl J. 1988. Categorial Grammar and Phrase Structure Grammar: An excursion
on the syntax-semantics frontier. In Richard Oehrle, Emmon Bach & Deirdre Wheeler
(eds.), Categorial Grammars and natural language structures, 391–415. Dordrecht: D.
Reidel Publishing Company.
Pollard, Carl J. 1996. On head non-movement. In Harry Bunt & Arthur van Horck (eds.),
Discontinuous constituency (Natural Language Processing 6), 279–305. Published ver-
sion of a Ms. dated January 1990. Berlin: Mouton de Gruyter.
Pollard, Carl J. 1999. Strong generative capacity in HPSG. In Gert Webelhuth, Jean-Pierre
Koenig & Andreas Kathol (eds.), Lexical and Constructional aspects of linguistic expla-
nation (Studies in Constraint-Based Lexicalism 1), 281–298. Stanford, CA: CSLI Publi-
cations.
Pollard, Carl J. & Andrew M. Moshier. 1990. Unifying partial descriptions of sets. In Philip
P. Hanson (ed.), Information, language and cognition (Vancouver Studies in Cognitive
Science 1), 285–322. Vancouver: University of British Columbia Press.
Pollard, Carl Jesse. 1984. Generalized Phrase Structure Grammars, Head Grammars, and
natural language. Stanford University. (Doctoral dissertation).
Pollock, Jean-Yves. 1989. Verb movement, Universal Grammar and the structure of IP.
Linguistic Inquiry 20(3). 365–424.
Popowich, Fred & Carl Vogel. 1991. A logic based implementation of Head-Driven Phrase
Structure Grammar. In Charles Grant Brown & Gregers Koch (eds.), Natural Language
Understanding and Logic Programming, III. The 3rd International Workshop, Stockholm,
Sweden, 23–25 Jan., 1991, 227–246. Amsterdam: Elsevier, North-Holland.
Porzel, Robert, Vanessa Micelli, Hidir Aras & Hans-Peter Zorn. 2006. Tying the knot:
Ground entities, descriptions and information objects for Construction-based infor-
mation extraction. In Proceedings of the OntoLex Workshop at LREC, May 2006. Genoa,
Italy, 35–40.
Postal, Paul M. 2004. Skeptical linguistic essays. Oxford: Oxford University Press.
Postal, Paul M. 2009. The incoherence of Chomsky’s ‘Biolinguistic’ ontology. Biolinguis-
tics 3(1). 104–123.
Postal, Paul M. & Geoffrey K. Pullum. 1986. Misgovernment. Linguistic Inquiry 17(1). 104–
110.
Prince, Alan & Paul Smolensky. 1993. Optimality Theory: Constraint interaction in Gen-
erative Grammar. RuCCS Technical Report 2. Center for Cognitive Science, Rutgers
University, Piscataway, N.J., & Computer Science Department, University of Colorado,
Boulder. http://roa.rutgers.edu/files/537-0802/537-0802-PRINCE-0-0.PDF (18 August,
2020).
Przepiórkowski, Adam. 1999a. Case assignment and the complement-adjunct dichotomy: A
non-configurational constraint-based approach. Eberhard-Karls-Universität Tübingen.
(Doctoral dissertation). https://publikationen.uni-tuebingen.de/xmlui/handle/10900/
46147 (18 August, 2020).
Przepiórkowski, Adam. 1999b. On case assignment and “adjuncts as complements”. In
Gert Webelhuth, Jean-Pierre Koenig & Andreas Kathol (eds.), Lexical and Construc-
tional aspects of linguistic explanation (Studies in Constraint-Based Lexicalism 1), 231–
245. Stanford, CA: CSLI Publications.
Przepiórkowski, Adam & Anna Kupść. 2006. HPSG for Slavicists. Glossos 8. 1–68.
Przepiórkowski, Adam, Anna Kupść, Małgorzata Marciniak & Agnieszka Mykowiecka.
2002. Formalny opis języka polskiego: Teoria i implementacja. Warsaw: Akademicka
Oficyna Wydawnicza EXIT.
Pullum, Geoffrey K. 1977. Word order universals and grammatical relations. In Peter Cole
& Jerrold M. Sadock (eds.), Grammatical relations (Syntax and Semantics 8), 249–277.
New York, NY: Academic Press.
Pullum, Geoffrey K. 1982. Free word order and phrase structure rules. In James Pustejovsky
& Peter Sells (eds.), Proceedings of the 12th Annual Meeting of the Northeast Linguistic
Society, 209–220. Amherst: Graduate Linguistics Student Association.
Pullum, Geoffrey K. 1983. How many possible human languages are there? Linguistic
Inquiry 14(3). 447–467.
Pullum, Geoffrey K. 1984. Stalking the perfect journal. Natural Language & Linguistic
Theory 2(2). 261–267.
Pullum, Geoffrey K. 1985. Assuming some version of X-bar Theory. In Papers from the
21st Annual Meeting of the Chicago Linguistic Society, 323–353.
Pullum, Geoffrey K. 1986. Footloose and context-free. Natural Language & Linguistic The-
ory 4(3). 409–414.
Pullum, Geoffrey K. 1988. Citation etiquette beyond thunderdome. Natural Language &
Linguistic Theory 6(4). 579–588.
Pullum, Geoffrey K. 1989a. Formal linguistics meets the Boojum. Natural Language &
Linguistic Theory 7(1). 137–143. DOI: 10.1007/BF00141350.
Pullum, Geoffrey K. 1989b. The incident of the node vortex problem. Natural Language
& Linguistic Theory 7(3). 473–479.
Pullum, Geoffrey K. 1991. The great Eskimo vocabulary hoax and other irreverent essays
on the study of language. Chicago, IL: The University of Chicago Press.
Pullum, Geoffrey K. 1996. Learnability, hyperlearning, and the Poverty of the Stimulus.
In J. Johnson, M. L. Juge & J. L. Moxley (eds.), Proceedings of the 22nd Annual Meeting of
the Berkeley Linguistics Society: General session and parasession on the role of learnability
in grammatical theory, 498–513. Berkeley, CA: Berkeley Linguistics Society.
https://journals.linguisticsociety.org/proceedings/index.php/BLS/article/viewFile/1336/1120
(18 August, 2020).
Pullum, Geoffrey K. 2003. Learnability: Mathematical aspects. In William J. Frawley (ed.),
Oxford international encyclopedia of linguistics, 2nd edn., 431–434. Oxford: Oxford Uni-
versity Press.
Pullum, Geoffrey K. 2007. The evolution of model-theoretic frameworks in linguistics. In
James Rogers & Stephan Kepser (eds.), Model-theoretic syntax at 10 – Proceedings of the
ESSLLI 2007 MTS@10 Workshop, August 13–17, 1–10. Dublin: Trinity College Dublin.
Pullum, Geoffrey K. 2009. Response to Anderson. Language 85(2). 245–247.
Pullum, Geoffrey K. 2013. The central question in comparative syntactic metatheory.
Mind and Language 28(4). 492–521.
Pullum, Geoffrey K. & Kyle Rawlins. 2007. Argument or no argument? Linguistics and
Philosophy 30(2). 277–287.
Pullum, Geoffrey K. & Barbara C. Scholz. 2001. On the distinction between generative-
enumerative and model-theoretic syntactic frameworks. In Philippe de Groote, Glyn
Morrill & Christian Retoré (eds.), Logical Aspects of Computational Linguistics: 4th International
Conference (Lecture Notes in Computer Science 2099), 17–43. Berlin: Springer
Verlag. DOI: 10.1007/3-540-48199-0_2.
Pullum, Geoffrey K. & Barbara C. Scholz. 2002. Empirical assessment of stimulus poverty
arguments. The Linguistic Review 19(1–2). 9–50. DOI: 10.1515/tlir.19.1-2.9.
Pullum, Geoffrey K. & Barbara C. Scholz. 2010. Recursion and the infinitude claim. In
Harry van der Hulst (ed.), Recursion in human language (Studies in Generative Gram-
mar 104), 113–138. Berlin: Mouton de Gruyter.
Pulman, Stephen G. 1985. A parser that doesn’t. In Maghi King (ed.), Proceedings of the 2nd
European Meeting of the Association for Computational Linguistics, 128–135. Geneva:
Association for Computational Linguistics.
https://www.aclweb.org/anthology/events/eacl-1985/ (18 August, 2020).
Pulvermüller, Friedemann. 2003. The neuroscience of language: On brain circuits of words
and serial order. Cambridge, UK: Cambridge University Press.
Pulvermüller, Friedemann. 2010. Brain embodiment of syntax and grammar: discrete
combinatorial mechanisms spelt out in neuronal circuits. Brain & Language 112(3).
167–179.
Pulvermüller, Friedemann, Bert Cappelle & Yury Shtyrov. 2013. Brain basis of meaning,
words, constructions, and grammar. In Thomas Hoffmann & Graeme Trousdale (eds.),
The Oxford handbook of Construction Grammar (Oxford Handbooks), 397–416. Oxford:
Oxford University Press.
Quaglia, Stefano. 2014. On the syntax of some apparent spatial particles in Italian. In
Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 2014 conference,
503–523. Stanford, CA: CSLI Publications.
Radford, Andrew. 1990. Syntactic theory and the acquisition of English syntax. Cambridge,
MA: Blackwell Publishers Ltd.
Radford, Andrew. 1997. Syntactic theory and the structure of English: A Minimalist ap-
proach (Cambridge Textbooks in Linguistics). Cambridge, UK: Cambridge University
Press.
Rákosi, György, Tibor Laczkó & Gábor Csernyi. 2011. On English phrasal verbs and their
Hungarian counterparts: From the perspective of a computational linguistic project.
Argumentum 7. 80–89.
Rambow, Owen. 1994. Formal and computational aspects of natural language syntax. Uni-
versity of Pennsylvania. (Doctoral dissertation).
Ramchand, Gillian. 2005. Post-Davidsonianism. Theoretical Linguistics 31(3). 359–373.
DOI: 10.1515/thli.2005.31.3.359.
Randriamasimanana, Charles. 2006. Simple sentences in Malagasy. In Henry Y. Chang,
Lillian M. Huang & Dah-ah Ho (eds.), Streams converging into an ocean: Festschrift in
honor of Professor Paul Jen-kuei Li on his 70th birthday, 71–96. Taipei, Taiwan: Institute
of Linguistics, Academia Sinica.
Raposo, Eduardo & Juan Uriagereka. 1990. Long-distance case assignment. Linguistic In-
quiry 21(4). 505–537.
Rappaport, Malka. 1983. On the nature of derived nominals. In Lori S. Levin, Malka Rap-
paport & Annie Zaenen (eds.), Papers in Lexical Functional Grammar, 113–42. Indiana:
Indiana University Linguistics Club.
Rauh, Gisa. 2016. Linguistic categories and the syntax-semantics interface: Evaluating
competing approaches. In Jens Fleischhauer, Anja Latrouite & Rainer Osswald (eds.),
Explorations of the syntax-semantics interface (Studies in Language and Cognition 3),
15–55. Düsseldorf: düsseldorf university press.
Reape, Mike. 1991. Word order variation in Germanic and parsing. DYANA Report Deliv-
erable R1.1.C. University of Edinburgh.
Reape, Mike. 1992. A formal theory of word order: A case study in West Germanic. Univer-
sity of Edinburgh. (Doctoral dissertation).
Reape, Mike. 1994. Domain union and word order variation in German. In John Ner-
bonne, Klaus Netter & Carl Pollard (eds.), German in Head-Driven Phrase Structure
Grammar (CSLI Lecture Notes 46), 151–198. Stanford, CA: CSLI Publications.
Reape, Mike. 2000. Formalisation and abstraction in linguistic theory II: Toward a radical
Linearisation Theory of German. unpublished paper.
Redington, Martin, Nick Chater & Steven Finch. 1998. Distributional information: A pow-
erful cue for acquiring syntactic categories. Cognitive Science 22(4). 425–469.
Reis, Marga. 1974. Syntaktische Hauptsatzprivilegien und das Problem der deutschen
Wortstellung. Zeitschrift für Germanistische Linguistik 2(3). 299–327.
Reis, Marga. 1980. On justifying topological frames: ‘Positional field’ and the order of
nonverbal constituents in German. Documentation et Recherche en Linguistique Alle-
mande Contemporaine. Revue de Linguistique 22/23. 59–85.
Reis, Marga. 1982. Zum Subjektbegriff im Deutschen. In Werner Abraham (ed.),
Satzglieder im Deutschen – Vorschläge zur syntaktischen, semantischen und pragmatis-
chen Fundierung (Studien zur deutschen Grammatik 15), 171–211. Tübingen: originally
Gunter Narr Verlag now Stauffenburg Verlag.
Remberger, Eva-Maria. 2009. Null subjects, expletives and locatives in Sardinian. In
Georg A. Kaiser & Eva-Maria Remberger (eds.), Proceedings of the workshop Null-
Subjects, Expletives, and Locatives in Romance (Arbeitspapier 123), 231–261. Konstanz:
Fachbereich Sprachwissenschaft, Universität Konstanz.
Resnik, Philip. 1992. Probabilistic Tree-Adjoining Grammar as a framework for statistical
natural language processing. In Antonio Zampolli (ed.), 14th International Conference
on Computational Linguistics (COLING ’92), August 23–28, 418–424. Nantes, France:
Association for Computational Linguistics.
Reyle, Uwe. 1993. Dealing with ambiguities by underspecification: Construction, representation
and deduction. Journal of Semantics 10(2). 123–179.
Richards, Marc. 2015. Minimalism. In Tibor Kiss & Artemis Alexiadou (eds.), Syntax –
theory and analysis: An international handbook, vol. 42.3 (Handbooks of Linguistics
and Communication Science), 803–839. Berlin: Mouton de Gruyter. DOI: 10.1515/9783110363685.
Richter, Frank. 2004. A mathematical formalism for linguistic theories with an application
in Head-Driven Phrase Structure Grammar. Universität Tübingen. (Phil. Dissertation
(2000)). https://publikationen.uni-tuebingen.de/xmlui/handle/10900/46230
(18 August, 2020).
Richter, Frank. 2007. Closer to the truth: A new model theory for HPSG. In James Rogers
& Stephan Kepser (eds.), Model-theoretic syntax at 10 – Proceedings of the ESSLLI 2007
MTS@10 Workshop, August 13–17, 101–110. Dublin: Trinity College Dublin.
Richter, Frank. 2020. Formal background. In Stefan Müller, Anne Abeillé, Robert D. Bors-
ley & Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar: The handbook
(Empirically Oriented Theoretical Morphology and Syntax). To appear. Berlin: Lan-
guage Science Press.
Richter, Frank & Manfred Sailer. 1999a. A lexicalist collocation analysis of sentential
negation in French. In Valia Kordoni (ed.), Tübingen studies in Head-Driven Phrase
Structure Grammar (Arbeitspapiere des SFB 340, No. 132, Volume 1), 231–300. Tübing-
en: Eberhard-Karls-Universität Tübingen.
Richter, Frank & Manfred Sailer. 1999b. Lexicalizing the left periphery of German fi-
nite sentences. In Valia Kordoni (ed.), Tübingen studies in Head-Driven Phrase Struc-
ture Grammar (Arbeitspapiere des SFB 340, No. 132, Volume 1), 116–154. Tübingen:
Eberhard-Karls-Universität Tübingen.
Richter, Frank & Manfred Sailer. 2004. Basic concepts of Lexical Resource Semantics.
In Arnold Beckmann & Norbert Preining (eds.), ESSLLI 2003 – Course material I (Col-
legium Logicum 5), 87–143. Wien: Kurt Gödel Society.
Richter, Frank & Manfred Sailer. 2009. Phraseological clauses as Constructions in HPSG.
In Stefan Müller (ed.), Proceedings of the 16th International Conference on Head-Driven
Phrase Structure Grammar, University of Göttingen, Germany, 297–317. Stanford, CA:
CSLI Publications. http://csli-publications.stanford.edu/HPSG/2009/ (18 August,
2020).
Riehemann, Susanne. 1993. Word formation in lexical type hierarchies: A case study of
bar-adjectives in German. Also published as SfS-Report-02-93, Seminar für Sprachwis-
senschaft, University of Tübingen. Eberhard-Karls-Universität Tübingen. (MA thesis).
Riehemann, Susanne Z. 1998. Type-based derivational morphology. Journal of Compara-
tive Germanic Linguistics 2(1). 49–77. DOI: 10.1023/A:1009746617055.
Riemsdijk, Henk van. 1978. A case study in syntactic markedness: The binding nature of
prepositional phrases. Lisse: The Peter de Ridder Press.
Riezler, Stefan, Tracy Holloway King, Ronald M. Kaplan, Richard Crouch, John T.
Maxwell III & Mark Johnson. 2002. Parsing the Wall Street Journal using a Lexical-
Functional Grammar and discriminative estimation techniques. In Pierre Isabelle (ed.),
40th Annual Meeting of the Association for Computational Linguistics: Proceedings of the
conference, 271–278. University of Pennsylvania, Philadelphia: Association for Compu-
tational Linguistics. https://www.aclweb.org/anthology/events/acl-2002/ (18 August,
2020).
Rizzi, Luigi. 1982. Violations of the wh island constraint and the Subjacency Condition.
In Luigi Rizzi (ed.), Issues in Italian syntax (Studies in Generative Grammar 11), 49–76.
Dordrecht: Foris Publications.
Rizzi, Luigi. 1986. Null objects in Italian and the theory of pro. Linguistic Inquiry 17(3).
501–577.
Rizzi, Luigi. 1997. The fine structure of the left periphery. In Liliane Haegeman (ed.),
Elements of grammar, 281–337. Dordrecht: Kluwer Academic Publishers.
Rizzi, Luigi. 2009a. Language variation and universals: Some notes on N. Evans and S. C.
Levinson (2009) “The myth of language universals: Language diversity and its impor-
tance for cognitive science”. In Paola Cotticelli-Kurras & Alessandra Tomaselli (eds.),
La Grammatica tra storia e teoria. Studi in onore di Giorgio Graffi, 153–162. Alessandria:
Edizioni dell’Orso.
Rizzi, Luigi. 2009b. The discovery of language invariance and variation, and its relevance
for the cognitive sciences. The Behavioral and Brain Sciences 32(5). 467–468.
Rizzi, Luigi. 2014. Syntactic Cartography and the Syntacticisation of Scope-Discourse
Semantics. In Anne Reboul (ed.), Mind, Values, and Metaphysics: Philosophical Essays
in Honor of Kevin Mulligan, vol. 2, 517–533. Cham: Springer International Publishing.
DOI: 10.1007/978-3-319-05146-8_30.
Roberts, Ian F. & Anders Holmberg. 2005. On the role of parameters in Universal Gram-
mar: A reply to Newmeyer. In Hans Broekhuis, N. Corver, Riny Huybregts, Ursula
Kleinhenz & Jan Koster (eds.), Organizing grammar: Linguistic studies in honor of
Henk van Riemsdijk (Studies in Generative Grammar 86), 538–553. Berlin: Mouton
de Gruyter. DOI: 10.1515/9783110892994.538.
Robins, Robert Henry. 1997. A short history of linguistics. 4th edn. (Longman Linguistics
Library). London: Routledge.
Rogers, James. 1994. Obtaining trees from their descriptions: An application to Tree-
Adjoining Grammars. Computational Intelligence 10(4). 401–421.
Rogers, James. 1997. “Grammarless” Phrase Structure Grammar. Linguistics and Philosophy
20. 721–746.
Rogers, James. 1998. A descriptive approach to language-theoretic complexity (Studies in
Logic, Language and Information). Stanford, CA: CSLI Publications.
Rohrer, Christian. 1996. Fakultativ kohärente Infinitkonstruktionen im Deutschen und
deren Behandlung in der Lexikalisch Funktionalen Grammatik. In Gisela Harras &
Manfred Bierwisch (eds.), Wenn die Semantik arbeitet: Klaus Baumgärtner zum 65. Ge-
burtstag, 89–108. Tübingen: Max Niemeyer Verlag.
Rohrer, Christian & Martin Forst. 2006. Improving coverage and parsing quality of a
large-scale LFG for German. In Proceedings of the Language Resources and Evaluation
Conference (LREC-2006). Genoa, Italy.
Ross, John Robert. 1967. Constraints on variables in syntax. Reproduced by the Indiana
University Linguistics Club and later published as Ross (1986). Cambridge, MA: MIT.
(Doctoral dissertation). http://files.eric.ed.gov/fulltext/ED016965.pdf (18 August,
2020).
Ross, John Robert. 1986. Infinite syntax! (Language and Being 5). Norwood, NJ: Ablex
Publishing Corporation.
Rothkegel, Annely. 1976. Valenzgrammatik (Linguistische Arbeiten 19). Saarbrücken,
Germany: Sonderforschungsbereich Elektronische Sprachforschung, Universität des
Saarlandes.
Sabel, Joachim. 1999. Das Passiv im Deutschen: Derivationale Ökonomie vs. optionale
Bewegung. Linguistische Berichte 177. 87–112.
Sáfár, Éva & John Glauert. 2010. Sign Language HPSG. In Proceedings of the 4th Workshop
on the Representation and Processing of Sign Languages: Corpora and Sign Language
Technologies, LREC 2010, 22–23 May 2010, Malta, 204–207.
Sáfár, Éva & Ian Marshall. 2002. Sign language translation via DRT and HPSG. In Alexan-
der Gelbukh (ed.), Computational linguistics and intelligent text processing: Third Inter-
national Conference, CICLing 2002 Mexico City, Mexico, February 17–23, 2002 Proceed-
ings (Lecture Notes in Computer Science 2276), 58–68. Berlin: Springer Verlag.
Safir, Kenneth J. 1985. Missing subjects in German. In Jindřich Toman (ed.), Studies in
German grammar (Studies in Generative Grammar 21), 193–229. Dordrecht: Foris Pub-
lications.
Sag, Ivan A. 1983. On parasitic gaps. Linguistics and Philosophy 6(1). 35–45. DOI: 10.1007/
BF00868089.
Sag, Ivan A. 1997. English relative clause constructions. Journal of Linguistics 33(2). 431–
484. DOI: 10.1017/S002222679700652X.
Sag, Ivan A. 2000. Another argument against Wh-trace. Jorge Hankamer Webfest. https:
//babel.ucsc.edu/jorgewebfest/sag.html (18 August, 2020).
Sag, Ivan A. 2007. Remarks on locality. In Stefan Müller (ed.), Proceedings of the 14th
International Conference on Head-Driven Phrase Structure Grammar, 394–414. Stanford,
CA: CSLI Publications. http://csli-publications.stanford.edu/HPSG/2007/ (18 August,
2020).
Sag, Ivan A. 2010. English filler-gap constructions. Language 86(3). 486–545.
Sag, Ivan A. 2012. Sign-Based Construction Grammar: An informal synopsis. In Hans C.
Boas & Ivan A. Sag (eds.), Sign-Based Construction Grammar (CSLI Lecture Notes 193),
69–202. Stanford, CA: CSLI Publications.
Sag, Ivan A., Hans C. Boas & Paul Kay. 2012. Introducing Sign-Based Construction Gram-
mar. In Hans C. Boas & Ivan A. Sag (eds.), Sign-Based Construction Grammar (CSLI
Lecture Notes 193), 1–29. Stanford, CA: CSLI Publications.
Sag, Ivan A., Rui Chaves, Anne Abeillé, Bruno Estigarribia, Frank Van Eynde, Dan Flick-
inger, Paul Kay, Laura A. Michaelis-Cummings, Stefan Müller, Geoffrey K. Pullum &
Tom Wasow. 2020. Lessons from the English auxiliary system. Journal of Linguistics
56(1). 87–155. DOI: 10.1017/S002222671800052X.
Sag, Ivan A., Philip Hofmeister & Neal Snider. 2007. Processing complexity in subjacency
violations: The Complex Noun Phrase Constraint. In Malcolm Elliott, James Kirby,
Osamu Sawada, Eleni Staraki & Suwon Yoon (eds.), Proceedings of the 43rd Annual
Meeting of the Chicago Linguistic Society, 215–229. Chicago, IL: Chicago Linguistic
Society.
Sag, Ivan A. & Carl Pollard. 1991. An integrated theory of complement control. Language
67(1). 63–113.
Sag, Ivan A. & Thomas Wasow. 2011. Performance-compatible competence grammar.
In Robert D. Borsley & Kersti Börjars (eds.), Non-transformational syntax: Formal and
explicit models of grammar: A guide to current models, 359–377. Oxford, UK/Cambridge,
MA: Blackwell Publishers Ltd.
Sag, Ivan A. & Thomas Wasow. 2015. Flexible processing and the design of grammar.
Journal of Psycholinguistic Research 44(1). 47–63.
Sag, Ivan A., Thomas Wasow & Emily M. Bender. 2003. Syntactic theory: A formal intro-
duction. 2nd edn. (CSLI Lecture Notes 152). Stanford, CA: CSLI Publications.
Sailer, Manfred. 2000. Combinatorial semantics and idiomatic expressions in Head-
Driven Phrase Structure Grammar. Eberhard-Karls-Universität Tübingen. (Disserta-
tion). https://publikationen.uni-tuebingen.de/xmlui/handle/10900/46191 (18 August,
2020).
Samarin, William J. 1984. Socioprogrammed linguistics. The Behavioral and Brain Sciences
7(2). 206–207.
Sampson, Geoffrey. 1989. Language acquisition: growth or learning? Philosophical Papers
18(3). 203–240.
Samvelian, Pollet. 2007. A (phrasal) affix analysis of the Persian Ezafe. Journal of Linguis-
tics 43(3). 605–645. DOI: 10.1017/S0022226707004781.
Sarkar, Anoop & Aravind K. Joshi. 1996. Coordination in Tree Adjoining Grammars:
formalization and implementation. In Jun-ichi Tsuji (ed.), Proceedings of COLING-96.
16th International Conference on Computational Linguistics (COLING96). Copenhagen,
Denmark, August 5–9, 1996, 610–615. Copenhagen, Denmark: Association for Compu-
tational Linguistics.
Sato, Yo. 2006. Constrained free word order parsing with Lexicalised Linearisation Gram-
mar. In Proceedings of 9th Annual CLUK Research Colloquium. Open University, UK.
Sato, Yo. 2008. Implementing Head-Driven Linearisation Grammar. King’s College Lon-
don. (Doctoral dissertation).
Sauerland, Uli & Paul Elbourne. 2002. Total reconstruction, PF movement, and deriva-
tional order. Linguistic Inquiry 33(2). 283–319.
Savin, Harris B. & Ellen Perchonock. 1965. Grammatical structure and the immediate
recall of English sentences. Journal of Verbal Learning and Verbal Behavior 4(5). 348–
353.
Schabes, Yves, Anne Abeillé & Aravind K. Joshi. 1988. Parsing strategies with ‘lexicalized’
grammars: Application to Tree Adjoining Grammars. Technical Report MS-CIS-88-65.
University of Pennsylvania Department of Computer & Information Science.
Schein, Barry. 1993. Plurals and events (Current Studies in Linguistics 23). Cambridge,
MA: MIT Press.
Scherpenisse, Wim. 1986. The connection between base structure and linearization restric-
tions in German and Dutch (Europäische Hochschulschriften, Reihe XXI, Linguistik
47). Frankfurt/M.: Peter Lang.
Schluter, Natalie & Josef van Genabith. 2009. Dependency parsing resources for French:
Converting acquired Lexical Functional Grammar f-structure annotations and parsing
f-structures directly. In Kristiina Jokinen & Eckhard Bick (eds.), Nodalida 2009 confer-
ence proceedings, 166–173.
Schmidt, Paul, Sibylle Rieder & Axel Theofilidis. 1996. Final documentation of the German
LS-GRAM lingware. Deliverable DC-WP6e (German). Saarbrücken: IAI.
Schmidt, Paul, Axel Theofilidis, Sibylle Rieder & Thierry Declerck. 1996. Lean for-
malisms, linguistic theory, and applications: grammar development in ALEP. In Jun-
ichi Tsuji (ed.), Proceedings of COLING-96. 16th International Conference on Compu-
tational Linguistics (COLING96). Copenhagen, Denmark, August 5–9, 1996, 286–291.
Copenhagen, Denmark: Association for Computational Linguistics. DOI: 10.3115/992628.992679.
Scholz, Barbara C. & Geoffrey K. Pullum. 2002. Searching for arguments to support lin-
guistic nativism. The Linguistic Review 19(1–2). 185–223. DOI: 10.1515/tlir.19.1-2.185.
Schubert, K. 1987. Metataxis: Contrastive Dependency Syntax for machine translation. Dor-
drecht: Foris Publications.
Schumacher, Helmut, Jacqueline Kubczak, Renate Schmidt & Vera de Ruiter. 2004.
VALBU – Valenzwörterbuch deutscher Verben. Tübingen: Gunter Narr Verlag.
Schütz, Jörg. 1996. The ALEP formalism in a nutshell. Tech. rep. Saarbrücken: IAI.
Schwarze, Christoph & Leonel de Alencar. 2016. Lexikalisch-funktionale Grammatik: Eine
Einführung am Beispiel des Französischen, mit computerlinguistischer Implementierung
(Stauffenburg Einführungen 30). Tübingen: Stauffenburg Verlag.
Seiss, Melanie & Rachel Nordlinger. 2012. An electronic dictionary and translation sys-
tem for Murrinh-Patha. The EUROCALL Review: Proceedings of the EUROCALL 2011
Conference 20(1). 135–138.
Sengupta, Probal & B. B. Chaudhuri. 1997. A delayed syntactic-encoding-based LFG pars-
ing strategy for an Indian language—Bangla. Computational Linguistics 23(2). 345–351.
Seuren, Pieter A. M. 1984. The Bioprogram Hypothesis: facts and fancy. The Behavioral
and Brain Sciences 7(2). 208–209.
Seuren, Pieter A. M. 2004. Chomsky’s Minimalism. Oxford: Oxford University Press.
Shieber, Stuart M. 1985. Evidence against the context-freeness of natural language. Lin-
guistics and Philosophy 8(3). 333–343. DOI: 10.1007/BF00630917.
Shieber, Stuart M. 1986. An introduction to unification-based approaches to grammar (CSLI
Lecture Notes 4). Stanford, CA: CSLI Publications.
Shieber, Stuart M. & Mark Johnson. 1993. Variations on incremental interpretation. Jour-
nal of Psycholinguistic Research 22(2). 287–318.
Shieber, Stuart M., Hans Uszkoreit, Fernando Pereira, Jane Robinson & Mabry Tyson.
1983. The formalism and implementation of PATR-II. In Research on interactive acqui-
sition and use of knowledge, 39–79. Menlo Park, CA: Artificial Intelligence Center, SRI
International.
Shtyrov, Yury, Elina Pihko & Friedemann Pulvermüller. 2005. Determinants of domi-
nance: is language laterality explained by physical or linguistic features of speech?
Neuroimage 27(1). 37–47. DOI: 10.1016/j.neuroimage.2005.02.003.
Siegel, Melanie. 2000. HPSG analysis of Japanese. In Wolfgang Wahlster (ed.), Verbmobil:
Foundations of speech-to-speech translation (Artificial Intelligence), 264–279. Berlin:
Springer Verlag.
Siegel, Melanie & Emily M. Bender. 2002. Efficient deep processing of Japanese. In
Proceedings of the 3rd Workshop on Asian Language Resources and International Stan-
dardization at the 19th International Conference on Computational Linguistics. Taipei,
Taiwan. http://www.aclweb.org/anthology/W02-1210 (18 August, 2020).
Siegel, Melanie, Emily M. Bender & Francis Bond. 2016. Jacy: An implemented grammar
of Japanese (CSLI Studies in Computational Linguistics). Stanford, CA: CSLI Publica-
tions.
Simov, Kiril, Petya Osenova, Alexander Simov & Milen Kouylekov. 2004. Design and
implementation of the Bulgarian HPSG-based treebank. Research on Language and
Computation 2(4). 495–522.
Simpson, Jane. 1983. Resultatives. In Lori S. Levin, Malka Rappaport & Annie Zaenen
(eds.), Papers in Lexical Functional Grammar, 143–157. Reprint: Simpson (2005b). Indi-
ana: Indiana University Linguistics Club.
Simpson, Jane. 2005a. Depictives in English and Warlpiri. In Nikolaus P. Himmelmann
& Eva Schultze-Berndt (eds.), Secondary predication and adverbial modification: The
typology of depictives, 69–106. Oxford: Oxford University Press.
Simpson, Jane. 2005b. Resultatives. In Miriam Butt & Tracy Holloway King (eds.), Lexical
semantics in LFG, 149–161. Stanford, CA: CSLI Publications.
Singleton, Jenny L. & Elissa L. Newport. 2004. When learners surpass their models: The
acquisition of American Sign Language from inconsistent input. Cognitive Psychology
49(4). 370–407.
Slayden, Glenn C. 2012. Array TFS storage for unification grammars. University of Wash-
ington. (MA thesis).
Sleator, Daniel D. K. & Davy Temperley. 1991. Parsing English with a Link Grammar.
CMU-CS-TR-91-126. School of Computer Science, Carnegie Mellon University.
Smith, Carlota S. 1970. Jespersen’s “move and change” class and causative verbs in En-
glish. In Bert Peeters (ed.), The lexicon–encyclopedia interface, 101–109. Amsterdam:
Elsevier.
Smith, Carlota S. 1972. On causative verbs and derived nominals in English. Linguistic
Inquiry 3(1). 136–138.
Snyder, William. 2001. On the nature of syntactic variation: Evidence from complex pred-
icates and complex word-formation. Language 77(2). 324–342.
Soehn, Jan-Philipp & Manfred Sailer. 2008. At first blush on tenterhooks: About selec-
tional restrictions imposed by nonheads. In Gerhard Jäger, Paola Monachesi, Gerald
Penn & Shuly Wintner (eds.), Proceedings of Formal Grammar 2003, Vienna, Austria,
149–161. Stanford, CA: CSLI Publications.
http://csli-publications.stanford.edu/FG/2003/soehn.pdf (18 August, 2020).
Son, Minjeong. 2007. Directionality and resultativity: The cross-linguistic correlation
revisited. Tromsø University Working Papers on Language & Linguistics 34. 126–164.
http://hdl.handle.net/10037/3191 (18 August, 2020).
Steedman, Mark. 1989a. Constituency and coordination in a Combinatory Grammar. In
Mark R. Baltin & Anthony S. Kroch (eds.), Alternative conceptions of phrase structure,
201–231. Chicago/London: The University of Chicago Press.
Steedman, Mark. 1991. Structure and intonation. Language 67(2). 260–296. DOI: 10.2307/
415107.
Steedman, Mark. 1997. Surface structure and interpretation (Linguistic Inquiry Mono-
graphs 30). Cambridge, MA: MIT Press.
Steedman, Mark. 2000. The syntactic process (Language, Speech, and Communication).
Cambridge, MA: MIT Press.
Steedman, Mark. 2011. Romantics and revolutionaries. Linguistic Issues in Language Tech-
nology 6(11). Special Issue on Interaction of Linguistics and Computational Linguistics,
1–20. http://journals.linguisticsociety.org/elanguage/lilt/article/view/2587.html
(18 August, 2020).
Steedman, Mark & Jason Baldridge. 2006. Combinatory Categorial Grammar. In Keith
Brown (ed.), Encyclopedia of language and linguistics, 2nd edn., 610–621. Oxford: Else-
vier.
Steedman, Mark J. 1989b. Grammar, interpretation, and processing from the lexicon. In
William Marslen-Wilson (ed.), Lexical representation and process, 463–504. Cambridge,
MA: The MIT Press. DOI: 10.7551/mitpress/4213.003.0022.
Steels, Luc. 2003. Evolving grounded communication for robots. Trends in Cognitive Sciences
7(7). 308–312.
Steels, Luc (ed.). 2011. Design patterns in Fluid Construction Grammar (Constructional
Approaches to Language 11). Amsterdam: John Benjamins Publishing Co.
Steels, Luc (ed.). 2012. Computational issues in Fluid Construction Grammar (Lecture
Notes in Computer Science 7249). Berlin: Springer Verlag.
Steels, Luc. 2013. Fluid Construction Grammar. In Thomas Hoffmann & Graeme Trous-
dale (eds.), The Oxford handbook of Construction Grammar (Oxford Handbooks), 153–
167. Oxford: Oxford University Press.
Steels, Luc. 2015. The Talking Heads experiment: Origins of words and meanings (Compu-
tational Models of Language Evolution 1). Berlin: Language Science Press.
Steels, Luc & Joachim De Beule. 2006. A (very) brief introduction to Fluid Construction
Grammar. Paper presented at the Third International Workshop on Scalable Natural
Language Understanding (ScaNaLU 2006) June 8, 2006, following HLT/NAACL, New
York City.
Steels, Luc & Remi van Trijp. 2011. How to make Construction Grammars fluid and ro-
bust. In Luc Steels (ed.), Design patterns in Fluid Construction Grammar (Construc-
tional Approaches to Language 11), 301–330. Amsterdam: John Benjamins Publishing
Co. DOI: 10.1075/cal.11.
Stefanowitsch, Anatol. 2008. Negative entrenchment: A usage-based approach to nega-
tive evidence. Cognitive Linguistics 19(3). 513–531.
Stefanowitsch, Anatol & Kerstin Fischer (eds.). 2008. Konstruktionsgrammatik II: Von
der Konstruktion zur Grammatik (Stauffenburg Linguistik 47). Tübingen: Stauffenburg
Verlag.
Stefanowitsch, Anatol & Stephan Th. Gries. 2009. Corpora and grammar. In Anke Lüdel-
ing & Merja Kytö (eds.), Corpus linguistics: An international handbook, vol. 29 (Handbü-
cher zur Sprach- und Kommunikationswissenschaft), chap. 43, 933–952. Berlin: Mou-
ton de Gruyter.
Sternefeld, Wolfgang. 1985a. Deutsch ohne grammatische Funktionen: Ein Beitrag zur
Rektions- und Bindungstheorie. Linguistische Berichte 99. 394–439.
Sternefeld, Wolfgang. 1985b. On case and binding theory. In Jindřich Toman (ed.), Studies
in German grammar (Studies in Generative Grammar 21), 231–285. Dordrecht: Foris
Publications.
Sternefeld, Wolfgang. 1995. Voice phrases and their specifiers. FAS Papers in Linguistics
3. 48–85.
Sternefeld, Wolfgang. 2006. Syntax: Eine morphologisch motivierte generative Beschrei-
bung des Deutschen (Stauffenburg Linguistik 31). Tübingen: Stauffenburg Verlag.
Sternefeld, Wolfgang & Frank Richter. 2012. Wo stehen wir in der Grammatiktheorie? —
Bemerkungen anläßlich eines Buchs von Stefan Müller. Zeitschrift für Sprachwissen-
schaft 31(2). 263–291. DOI: 10.1515/zfs-2012-0010.
Stiebels, Barbara. 1996. Lexikalische Argumente und Adjunkte: Zum semantischen Beitrag
verbaler Präfixe und Partikeln (studia grammatica 39). Berlin: Akademie Verlag.
Stowell, Timothy. 1981. Origins of phrase structure. MIT. (Doctoral dissertation).
http://hdl.handle.net/1721.1/15626 (18 August, 2020).
Strunk, Jan & Neal Snider. 2013. Subclausal locality constraints on relative clause extra-
position. In Gert Webelhuth, Manfred Sailer & Heike Walker (eds.), Rightward move-
ment in a comparative perspective (Linguistik Aktuell/Linguistics Today 200), 99–143.
Amsterdam: John Benjamins Publishing Co.
Suchsland, Peter. 1997. Syntax-Theorie: Ein zusammengefaßter Zugang (Konzepte der
Sprach- und Literaturwissenschaft 55). Deutsche Bearbeitung von Borsley (1991) durch
Peter Suchsland. Tübingen: Max Niemeyer Verlag.
Sulger, Sebastian. 2009. Irish clefting and information-structure. In Miriam Butt & Tracy
Holloway King (eds.), Proceedings of the LFG 2009 conference, 562–582. Stanford, CA:
CSLI Publications. http://csli-publications.stanford.edu/LFG/14/ (18 August, 2020).
Sulger, Sebastian. 2010. Analytic and synthetic verb forms in Irish – An agreement-based
implementation in LFG. In Manfred Pinkal, Ines Rehbein, Sabine Schulte im Walde &
Angelika Storrer (eds.), Semantic approaches in natural language processing: Proceed-
ings of the Conference on Natural Language Processing 2010, 169–173. Saarbrücken:
Saarland University Press (universaar).
Svenonius, Peter. 2004. Slavic prefixes inside and outside VP. Nordlyd. Special Issue on
Slavic Prefixes 32(2). 205–253.
Takami, Ken-ichi. 1988. Preposition stranding: Arguments against syntactic analyses and
an alternative functional explanation. Lingua 76(4). 299–335.
Tanenhaus, Michael K., Michael J. Spivey-Knowlton, Kathleen M. Eberhard & Julie C.
Sedivy. 1995. Integration of visual and linguistic information in spoken language com-
prehension. Science 268(5217). 1632–1634. DOI: 10.1126/science.7777863.
Tanenhaus, Michael K., Michael J. Spivey-Knowlton, Kathleen M. Eberhard & Julie C. Se-
divy. 1996. Using eye movements to study spoken language comprehension: Evidence
for visually mediated incremental interpretation. In Toshio Inui & James L. McClel-
land (eds.), Information integration in perception and communication (Attention and
Performance XVI), 457–478. Cambridge, MA: MIT Press.
ten Hacken, Pius. 2007. Chomskyan linguistics and its competitors. London: Equinox Pub-
lishing Ltd.
Tesnière, Lucien. 1959. Eléments de syntaxe structurale. Paris: Librairie C. Klincksieck.
Tesnière, Lucien. 1980. Grundzüge der strukturalen Syntax. Translated by Ulrich Engel.
Stuttgart: Klett-Cotta.
Tesnière, Lucien. 2015. Elements of structural syntax. Translated by Timothy Osborne
and Sylvain Kahane. Amsterdam: John Benjamins Publishing Co.
Thiersch, Craig L. 1978. Topics in German syntax. M.I.T. (Dissertation). http://hdl.handle.
net/1721.1/16327 (18 August, 2020).
Thompson, Henry S. 1982. Handling metarules in a parser for GPSG. D.A.I. Research 175.
University of Edinburgh.
Timberlake, Alan. 1982. The impersonal passive in Lithuanian. In Monica Macaulay, Orin
D. Gensler, Claudia Brugmann, Inese Čivkulis, Amy Dahlstrom, Katherine Krile &
Rob Sturm (eds.), Proceedings of the Eighth Annual Meeting of the Berkeley Linguistics
Society, 508–524. Berkeley: University of California.
Toivonen, Ida. 2013. English benefactive NPs. In Miriam Butt & Tracy Holloway King
(eds.), Proceedings of the LFG 2013 conference, 503–523. Stanford, CA: CSLI Publica-
tions.
Tomasello, Michael. 1995. Language is not an instinct. Cognitive Development 10(1). 131–
156.
Tomasello, Michael. 2000. Do young children have adult syntactic competence? Cogni-
tion 74(3). 209–253.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language ac-
quisition. Cambridge, MA: Harvard University Press.
Tomasello, Michael. 2005. Beyond formalities: The case of language acquisition. The Lin-
guistic Review 22(2–4). 183–197.
Tomasello, Michael. 2006a. Acquiring linguistic constructions. In Deanna Kuhn & Robert
Siegler (eds.), Handbook of child psychology, 6th edn., vol. 2. New York: John Wiley &
Sons, Inc.
Tomasello, Michael. 2006b. Construction Grammar for kids. Constructions Special Vol-
ume 1. https://journals.linguisticsociety.org/elanguage/constructions/article/view/26.
html (18 August, 2020).
Tomasello, Michael. 2006c. Konstruktionsgrammatik und früher Erstspracherwerb. In
Kerstin Fischer & Anatol Stefanowitsch (eds.), Konstruktionsgrammatik: Von der An-
wendung zur Theorie (Stauffenburg Linguistik 40), 19–37. Tübingen: Stauffenburg Ver-
lag.
Tomasello, Michael. 2009. Universal Grammar is dead. The Behavioral and Brain Sciences
32(5). 470–471.
Tomasello, Michael, Nameera Akhtar, Kelly Dodson & Laura Rekau. 1997. Differential
productivity in young children’s use of nouns and verbs. Journal of Child Language
24(2). 373–387.
Tomasello, Michael, Malinda Carpenter, Josep Call, Tanya Behne & Henrike Moll. 2005.
Understanding and sharing intentions: The origins of cultural cognition. The Behav-
ioral and Brain Sciences 28(5). 675–735.
Torr, John & Edward P. Stabler. 2016. Coordination in Minimalist Grammars: Excorpora-
tion and across the board (head) movement. In.
Torr, John, Milos Stanojevic, Mark Steedman & Shay B. Cohen. 2019. Wide-coverage
neural A* parsing for Minimalist Grammars. In Anna Korhonen, David Traum & Lluís
Màrquez (eds.), Proceedings of the 57th annual meeting of the association for computa-
tional linguistics, 2486–2505. Florence, Italy: Association for Computational Linguis-
tics. DOI: 10.18653/v1/P19-1238.
Travis, Lisa. 1984. Parameters and effects of word order variation. Cambridge, MA: M.I.T.
(Dissertation).
Trinh, Tue. 2019. The edginess of silence: A study on chain linearization (studia grammatica
84). Berlin: de Gruyter. DOI: 10.1515/9783110637465.
Trinh, Tue H. 2011. Edges and linearization. Massachusetts Institute of Technology. (Doc-
toral dissertation). https://dspace.mit.edu/handle/1721.1/68523 (9 May, 2019).
Trosterud, Trond. 2009. A Constraint Grammar for Faroese. In Eckhard Bick, Kristin Ha-
gen, Kaili Müürisep & Trond Trosterud (eds.), Constraint Grammar and robust parsing:
Proceedings of the NODALIDA 2009 workshop, vol. 8 (NEALT Proceedings Series 8), 1–
7. Tartu: Tartu University Library.
Tseng, Jesse (ed.). 2000. Aspekte eines HPSG-Fragments des Deutschen (Arbeitspapiere
des SFB 340 No. 156). Tübingen: Eberhard-Karls-Universität Tübingen. http://www.
sfs.uni-tuebingen.de/sfb/reports/berichte/156/156abs.html (18 August, 2020).
Tseng, Jesse L. 2003. LKB grammar implementation: French and beyond. In Emily M.
Bender, Daniel P. Flickinger, Frederik Fouvry & Melanie Siegel (eds.), Proceedings of the
ESSLLI 2003 Workshop “Ideas and Strategies for Multilingual Grammar Development”,
91–97. Vienna, Austria. https://hal.inria.fr/inria-00099472/document (18 August,
2020).
Tseng, Jesse L. 2007. English prepositional passive constructions. In Stefan Müller (ed.),
Proceedings of the 14th International Conference on Head-Driven Phrase Structure Gram-
mar, 271–286. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/
HPSG/2007/ (18 August, 2020).
Umemoto, Hiroshi. 2006. Implementing a Japanese semantic parser based on glue ap-
proach. In Proceedings of The 20th Pacific Asia Conference on Language, Information
and Computation, 418–425. https://www.aclweb.org/anthology/Y06-1061 (18 August,
2020).
Uszkoreit, Hans. 1986a. Categorial Unification Grammars. In Makoto Nagao (ed.),
Proceedings of COLING 86, 187–194. University of Bonn: Association for Computa-
tional Linguistics. http://aclweb.org/anthology/C86-1045 (18 August, 2020).
Uszkoreit, Hans. 1986b. Linear precedence in discontinuous constituents: Complex fronting
in German. Report No. CSLI-86-47. Stanford, CA: Center for the Study of Language &
Information.
Uszkoreit, Hans. 1987. Word order and constituent structure in German (CSLI Lecture
Notes 8). Stanford, CA: CSLI Publications.
Uszkoreit, Hans. 1990. Extraposition and adjunct attachment in Categorial Unification
Grammar. In Werner Bahner, Joachim Schildt & Dieter Viehweger (eds.), Proceedings
of the Fourteenth International Congress of Linguists, Berlin/GDR, August 10–15, 1987,
vol. 3, 2331–2336. Berlin: Akademie Verlag.
Uszkoreit, Hans, Rolf Backofen, Stephan Busemann, Abdel Kader Diagne, Elizabeth A.
Hinkelman, Walter Kasper, Bernd Kiefer, Hans-Ulrich Krieger, Klaus Netter, Günter
Neumann, Stephan Oepen & Stephen P. Spackman. 1994. DISCO—An HPSG-based
NLP system and its application for appointment scheduling. In Makoto Nagao (ed.),
Proceedings of COLING 94, 436–440. Kyoto, Japan: Association for Computational Lin-
guistics.
Uszkoreit, Hans, Rolf Backofen, Jo Calder, Joanne Capstick, Luca Dini, Jochen Dörre, Gre-
gor Erbach, Dominique Estival, Suresh Manandhar, Anne-Marie Mineur & Stephan
Oepen. 1996. The EAGLES formalisms working group: Final report Expert Advisory
Group on Language Engineering Standards. Technical Report LRE 61–100. http://www.
coli.uni-sb.de/publikationen/softcopies/Uszkoreit:1996:EFW.pdf (18 August, 2020).
Vaidya, Ashwini, Owen Rambow & Martha Palmer. 2019. Syntactic composition and se-
lectional preferences in Hindi light verb constructions. Linguistic Issues in Language
Technology 16(1).
Valian, Virginia. 1991. Syntactic subjects in the early speech of American and Italian
children. Cognition 40(1–2). 21–81.
Van Eynde, Frank. 2015. Sign-Based Construction Grammar: A guided tour. Journal of
Linguistics. DOI: 10.1017/S0022226715000341.
Van Langendonck, Willy. 1994. Determiners as heads? Cognitive Linguistics 5. 243–259.
Van Valin, Robert D. Jr. (ed.). 1993. Advances in Role and Reference Grammar. Amsterdam:
John Benjamins Publishing Co.
Van Valin, Robert D. Jr. 1998. The acquisition of wh-questions and the mechanisms of lan-
guage acquisition. In Michael Tomasello (ed.), The new psychology of language: Cogni-
tive and functional approaches to language structure, 221–249. Hillsdale, NJ: Lawrence
Erlbaum.
Vancoppenolle, Jean, Eric Tabbert, Gerlof Bouma & Manfred Stede. 2011. A German
grammar for generation in Open CCG. In Hanna Hedeland, Thomas Schmidt & Kai
Wörner (eds.), Multilingual resources and multilingual applications: Proceedings of the
Conference of the German Society for Computational Linguistics and Language Technol-
ogy (GSCL) 2011 (Arbeiten zur Mehrsprachigkeit/Working Papers in Multilingualism,
Folge B/Series B 96), 145–150. Hamburg: Universität Hamburg.
van Noord, Gertjan & Gosse Bouma. 1994. The scope of adjuncts and the processing of
lexical rules. In Makoto Nagao (ed.), Proceedings of COLING 94, 250–256. Kyoto, Japan:
Association for Computational Linguistics.
van Trijp, Remi. 2011. A design pattern for argument structure constructions. In Luc
Steels (ed.), Design patterns in Fluid Construction Grammar (Constructional Ap-
proaches to Language 11), 115–145. Amsterdam: John Benjamins Publishing Co. DOI:
10.1075/cal.11.07tri.
van Trijp, Remi. 2013. A comparison between Fluid Construction Grammar and Sign-
Based Construction Grammar. Constructions and Frames 5(1). 88–116.
van Trijp, Remi. 2014. Long-distance dependencies without filler-gaps: A cognitive-
functional alternative in Fluid Construction Grammar. Language and Cognition 6(2).
242–270.
Vargha-Khadem, Faraneh, Kate E. Watkins, Katie Alcock, Paul Fletcher & Richard Pass-
ingham. 1995. Praxic and nonverbal cognitive deficits in a large family with a ge-
netically transmitted speech and language disorder. In Proceedings of the National
Academy of Sciences of the United States of America, vol. 92, 930–933.
Vasishth, Shravan & Richard L. Lewis. 2006. Human language processing: Symbolic mod-
els. In Keith Brown (ed.), The encyclopedia of language and linguistics, 2nd edn., vol. 5,
410–419. Oxford: Elsevier Science Publisher B.V. (North-Holland).
Vasishth, Shravan, Katja Suckow, Richard L. Lewis & Sabine Kern. 2010. Short-term for-
getting in sentence comprehension: Crosslinguistic evidence from verb-final struc-
tures. Language and Cognitive Processes 25(4). 533–567.
Vater, Heinz. 2010. Strukturalismus und generative Grammatik in Deutschland. In Hans-
Harald Müller, Marcel Lepper & Andreas Gardt (eds.), Strukturalismus in Deutschland:
Literatur- und Sprachwissenschaft 1910–1975 (Marbacher Schriften. Neue Folge 5), 125–
160. Göttingen: Wallstein Verlag.
Veenstra, Mettina Jolanda Arnoldina. 1998. Formalizing the Minimalist Program. Rijksuni-
versiteit Groningen. (Ph.D. thesis).
Vennemann, Theo & Ray Harlow. 1977. Categorial Grammar and consistent basic VX
serialization. Theoretical Linguistics 4(1–3). 227–254. DOI: 10.1515/thli.1977.4.1-3.227.
Verhagen, Arie. 2010. What do you think is the proper place of recursion? Conceptual and
empirical issues. In Harry van der Hulst (ed.), Recursion in human language (Studies
in Generative Grammar 104), 93–110. Berlin: Mouton de Gruyter.
Verspoor, Cornelia Maria. 1997. Contextually-dependent lexical semantics. University of
Edinburgh. (Doctoral dissertation).
Vierhuff, Tilman, Bernd Hildebrandt & Hans-Jürgen Eikmeyer. 2003. Effiziente Verar-
beitung deutscher Konstituentenstellung mit der Combinatorial Categorial Grammar.
Linguistische Berichte 194. 213–237.
Vijay-Shanker, K. & Aravind K. Joshi. 1988. Feature structures based Tree Adjoining
Grammars. In Dénes Vargha (ed.), Proceedings of COLING 88, vol. 1, 714–719. Univer-
sity of Budapest: Association for Computational Linguistics. http://www.aclweb.org/
anthology/C88-2147 (18 August, 2020).
Villavicencio, Aline. 2002. The acquisition of a unification-based Generalised Categorial
Grammar. UCAM-CL-TR-533. University of Cambridge Computer Laboratory.
Vogel, Ralf. 2001. Case conflict in German free relative constructions: An Optimality
Theoretic treatment. In Gereon Müller & Wolfgang Sternefeld (eds.), Competition in
syntax, 341–375. Berlin: Mouton de Gruyter.
Vogel, Ralf & Markus Steinbach. 1998. The dative – An oblique case. Linguistische Berichte
173. 65–91.
Volk, Martin. 1988. Parsing German with GPSG: The problem of separable-prefix verbs.
University of Georgia. (MA thesis).
von Stechow, Arnim. 1979. Deutsche Wortstellung und Montague-Grammatik. In Jür-
gen M. Meisel & Martin D. Pam (eds.), Linear order and Generative theory, 317–490.
Amsterdam: John Benjamins Publishing Co.
von Stechow, Arnim. 1989. Distinguo: Eine Antwort auf Dieter Wunderlich. Linguistische
Berichte 122. 330–339.
von Stechow, Arnim. 1996. The different readings of wieder ‘again’: A structural account.
Journal of Semantics 13(2). 87–138.
von Stechow, Arnim & Wolfgang Sternefeld. 1988. Bausteine syntaktischen Wissens: Ein
Lehrbuch der Generativen Grammatik. Opladen/Wiesbaden: Westdeutscher Verlag.
Voutilainen, Atro, Juha Heikkilä & Arto Anttila. 1992. Constraint Grammar of English: A
performance-oriented introduction (Publications of the Department of General Linguis-
tics 21). Helsinki: University of Helsinki.
Wada, Hajime & Nicholas Asher. 1986. BUILDRS: An implementation of DR Theory and
LFG. In Makoto Nagao (ed.), Proceedings of COLING 86, 540–545. University of Bonn:
Association for Computational Linguistics.
Wahlster, Wolfgang (ed.). 2000. Verbmobil: Foundations of speech-to-speech translation
(Artificial Intelligence). Berlin: Springer Verlag.
Walther, Markus. 1999. Deklarative prosodische Morphologie: Constraint-basierte Analy-
sen und Computermodelle zum Finnischen und Tigrinya (Linguistische Arbeiten 399).
Tübingen: Max Niemeyer Verlag.
Wasow, Thomas. 2020. Processing. In Stefan Müller, Anne Abeillé, Robert D. Borsley &
Jean-Pierre Koenig (eds.), Head-Driven Phrase Structure Grammar: The handbook (Em-
pirically Oriented Theoretical Morphology and Syntax). To appear. Berlin: Language
Science Press.
Webelhuth, Gert. 1985. German is configurational. The Linguistic Review 4(3). 203–246.
Webelhuth, Gert. 1990. Diagnostics for structure. In Günther Grewendorf & Wolfgang
Sternefeld (eds.), Scrambling and Barriers (Linguistik Aktuell/Linguistics Today 5), 41–
75. Amsterdam: John Benjamins Publishing Co.
Webelhuth, Gert. 1995. X-bar Theory and Case Theory. In Gert Webelhuth (ed.), Gov-
ernment and Binding Theory and the Minimalist Program: Principles and Parameters in
syntactic theory (Generative Syntax), 15–95. Oxford, UK & Cambridge, USA: Blackwell
Publishers Ltd.
Webelhuth, Gert. 2011. Paradigmenwechsel rückwärts: Die Renaissance der gramma-
tischen Konstruktion. In Stefan Engelberg, Anke Holler & Kristel Proost (eds.),
Sprachliches Wissen zwischen Lexikon und Grammatik (Institut für Deutsche Sprache,
Jahrbuch 2010), 149–180. Berlin: de Gruyter.
Weber, Heinz J. 1997. Dependenzgrammatik: Ein interaktives Arbeitsbuch. 2nd edn. (Narr
Studienbücher). Tübingen: Gunter Narr Verlag.
Wechsler, Stephen. 1995. The semantic basis of argument structure (Dissertations in Lin-
guistics). Stanford, CA: CSLI Publications.
Wechsler, Stephen. 1997. Resultative predicates and control. In Ralph C. Blight & Michelle
J. Moosally (eds.), Texas Linguistic Forum 38: The syntax and semantics of predication:
Proceedings of the 1997 Texas Linguistics Society Conference, 307–321. Austin, Texas:
University of Texas, Department of Linguistics.
Wechsler, Stephen. 2005. What is right and wrong about little v. In Grammar and
beyond—Essays in honour of Lars Hellan, 179–195. Oslo, Norway: Novus Press.
Wechsler, Stephen. 2008a. A diachronic account of English deverbal nominals. In Charles
B. Chang & Hannah J. Haynie (eds.), Proceedings of the 26th West Coast Conference on
Formal Linguistics, 498–506. Somerville, MA: Cascadilla Proceedings Project.
Wechsler, Stephen. 2008b. Dualist syntax. In Stefan Müller (ed.), Proceedings of the 15th In-
ternational Conference on Head-Driven Phrase Structure Grammar, 294–304. Stanford,
CA: CSLI Publications. http://csli-publications.stanford.edu/HPSG/2008/ (18 August,
2020).
Wechsler, Stephen & Ash Asudeh. 2020. HPSG and Lexical Functional Grammar. In Ste-
fan Müller, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.), Head-Driven
Phrase Structure Grammar: The handbook (Empirically Oriented Theoretical Morphol-
ogy and Syntax). To appear. Berlin: Language Science Press.
Wechsler, Stephen, Jean-Pierre Koenig & Anthony Davis. 2020. Argument structure and
linking. In Stefan Müller, Anne Abeillé, Robert D. Borsley & Jean-Pierre Koenig (eds.),
Head-Driven Phrase Structure Grammar: The handbook (Empirically Oriented Theoret-
ical Morphology and Syntax). To appear. Berlin: Language Science Press.
Wechsler, Stephen & Bokyung Noh. 2001. On resultative predicates and clauses: Parallels
between Korean and English. Language Sciences 23(4). 391–423.
Wechsler, Stephen Mark. 1991. Argument structure and linking. Stanford University. (Doc-
toral dissertation).
Wegener, Heide. 1985. Der Dativ im heutigen Deutsch (Studien zur deutschen Grammatik
28). Tübingen: originally Gunter Narr Verlag now Stauffenburg Verlag.
Weir, Morton W. 1964. Developmental changes in problem-solving strategies. Psycholog-
ical Review 71(6). 473–490.
Weissgerber, Monika. 1983. Valenz und Kongruenzbeziehungen: Ein Modell zur Vereindeu-
tigung von Verben in der maschinellen Analyse und Übersetzung. Frankfurt a. M.: Peter
Lang.
Weisweber, Wilhelm. 1987. Ein Dominanz-Chart-Parser für generalisierte Phrasenstruktur-
grammatiken. KIT-Report 45. Berlin: Technische Universität Berlin.
Weisweber, Wilhelm & Susanne Preuss. 1992. Direct parsing with metarules. In Antonio
Zampolli (ed.), 14th International Conference on Computational Linguistics (COLING
’92), August 23–28, 1111–1115. Nantes, France: Association for Computational Linguis-
tics.
Welke, Klaus. 1988. Einführung in die Valenz- und Kasustheorie. Leipzig: Bibliographi-
sches Institut.
Welke, Klaus. 2009. Konstruktionsvererbung, Valenzvererbung und die Reichweite von
Konstruktionen. Zeitschrift für Germanistische Linguistik 37(3). 514–543.
Welke, Klaus. 2011. Valenzgrammatik des Deutschen: Eine Einführung (De Gruyter
Studium). Berlin: de Gruyter. DOI: 10.1515/9783110254198.
Welke, Klaus. 2019. Konstruktionsgrammatik des Deutschen: Ein sprachgebrauchsbezo-
gener Ansatz (Linguistik – Impulse & Tendenzen 77). Berlin: de Gruyter.
Wells, Rulon S. 1947. Immediate constituents. Language 23(2). 81–117.
Werner, Edeltraud. 1993. Translationstheorie und Dependenzmodell: Kritik und Reinter-
pretation des Ansatzes von Lucien Tesnière (Kultur und Erkenntnis: Schriften der
Philosophischen Fakultät der Heinrich-Heine-Universität Düsseldorf 10). Tübingen:
Francke Verlag.
Wetta, Andrew C. 2011. A Construction-based cross-linguistic analysis of V2 word order.
In Stefan Müller (ed.), Proceedings of the 18th International Conference on Head-Driven
Phrase Structure Grammar, University of Washington, 248–268. Stanford, CA: CSLI Pub-
lications. http://csli-publications.stanford.edu/HPSG/2011/ (18 August, 2020).
Wexler, Kenneth. 1998. Very early parameter setting and the unique checking constraint:
A new explanation of the optional infinitive stage. Lingua 106(1–4). 23–79.
Wexler, Kenneth & Peter W. Culicover. 1980. Formal principles of language acquisition.
Cambridge, MA/London: MIT Press.
Weydt, Harald. 1972. „Unendlicher Gebrauch von endlichen Mitteln“: Mißverständnisse
um ein linguistisches Theorem. Poetica 5(3/4). 249–267.
Wharton, R. M. 1974. Approximate language identification. Information and Control
26(3). 236–255.
White, Mike & Jason Baldridge. 2003. Adapting chart realization to CCG. In Ehud Reiter,
Helmut Horacek & Kees van Deemter (eds.), Proceedings of the 9th European Workshop
on Natural Language Generation (ENLG-2003) at EACL 2003, 119–126.
Wijnen, Frank, Masja Kempen & Steven Gillis. 2001. Root infinitives in Dutch early child
language: An effect of input? Journal of Child Language 28(3). 629–660.
Wiklund, Anna-Lena, Gunnar Hrafn Hrafnbjargarson, Kristine Bentzen & Þorbjörg
Hróarsdóttir. 2007. Rethinking Scandinavian verb movement. Journal of Comparative
Germanic Linguistics 10(3). 203–233.
Wilcock, Graham. 2001. Towards a discourse-oriented representation of information
structure in HPSG. In 13th Nordic Conference on Computational Linguistics, Uppsala,
Sweden. http://www.ling.helsinki.fi/~gwilcock/Pubs/2001/Nodalida-01.pdf (18 August,
2020).
Wilcock, Graham. 2005. Information structure and Minimal Recursion Semantics. In
Antti Arppe, Lauri Carlson, Krister Lindén, Jussi Piitulainen, Mickael Suominen,
Martti Vainio, Hanna Westerlund & Anssi Yli-Jyrä (eds.), Inquiries into words, con-
straints and contexts: Festschrift for Kimmo Koskenniemi on his 60th birthday (CSLI
Studies in Computational Linguistics ONLINE), 268–277. Stanford, CA: CSLI Publica-
tions.
Wilder, Chris. 1991. Small clauses and related objects. Groninger Arbeiten zur Germanisti-
schen Linguistik 34. 215–236.
Williams, Edwin. 1984. Grammatical relations. Linguistic Inquiry 15(4). 639–673.
Winkler, Susanne. 1997. Focus and secondary predication (Studies in Generative Grammar
43). Berlin, New York: Mouton de Gruyter.
Wittenberg, Eva, Ray S. Jackendoff, Gina Kuperberg, Martin Paczynski, Jesse Snedeker
& Heike Wiese. 2014. The processing and representation of light verb constructions.
In Asaf Bachrach, Isabelle Roy & Linnaea Stockall (eds.), Structuring the argument
(Language Faculty and Beyond 10), 61–80. Amsterdam: John Benjamins Publishing
Co.
Wittenberg, Eva & Maria Mercedes Piñango. 2011. Processing light verb constructions.
The Mental Lexicon 6(3). 393–413. DOI: 10.1075/ml.6.3.03wit.
Wöllstein, Angelika. 2010. Topologisches Satzmodell (Kurze Einführungen in die Germa-
nistische Linguistik 8). Heidelberg: Universitätsverlag Winter.
Wunderlich, Dieter. 1987. Vermeide Pronomen – Vermeide leere Kategorien. Studium
Linguistik 21. 36–44.
Wunderlich, Dieter. 1989. Arnim von Stechow, das Nichts und die Lexikalisten. Lingui-
stische Berichte 122. 321–333.
Wunderlich, Dieter. 1992. CAUSE and the structure of verbs. Arbeiten des SFB 282 No. 36.
Düsseldorf/Wuppertal: Heinrich Heine Uni/BUGH.
Wunderlich, Dieter. 1997. Argument extension by lexical adjunction. Journal of Semantics
14(2). 95–142. DOI: 10.1093/jos/14.2.95.
Wunderlich, Dieter. 2004. Why assume UG? Studies in Language 28(3). 615–641.
Wunderlich, Dieter. 2008. Spekulationen zum Anfang von Sprache. Zeitschrift für Sprach-
wissenschaft 27(2). 229–265.
Wurmbrand, Susanne. 2003a. Infinitives: Restructuring and clause structure (Studies in
Generative Grammar 55). Berlin: Mouton de Gruyter.
Wurmbrand, Susanne. 2003b. Long passive (corpus search results). Ms. University of Con-
necticut.
XTAG Research Group. 2001. A lexicalized Tree Adjoining Grammar for English. Tech.
rep. Philadelphia: Institute for Research in Cognitive Science. ftp://ftp.cis.upenn.edu/
pub/xtag/release-2.24.2001/tech-report.pdf (23 December, 2015).
Yamada, Hiroyasu & Yuji Matsumoto. 2003. Statistical dependency analysis with sup-
port vector machines. In Gertjan van Noord (ed.), Proceedings of the 8th International
Workshop on Parsing Technologies (IWPT 03). Nancy.
Yamada, Jeni. 1981. Evidence for the independence of language and cognition: Case study
of a “hyperlinguistic” adolescent. UCLA Working Papers in Cognitive Linguistics 3.
University of California, Los Angeles. 121–160.
Yampol, Todd & Lauri Karttunen. 1990. An efficient implementation of PATR for Catego-
rial Unification Grammar. In Hans Karlgren (ed.), COLING-90: Papers presented to the
13th International Conference on Computational Linguistics, 419–424. Helsinki: Associ-
ation for Computational Linguistics.
Yang, Charles D. 2004. Universal Grammar, statistics or both? Trends in Cognitive Sci-
ences 8(10). 451–456. DOI: 10.1016/j.tics.2004.08.006.
Yang, Chunlei & Dan Flickinger. 2014. ManGO: Grammar engineering for deep linguistic
processing. New Technology of Library and Information Service 30(3). 57–64.
Yasukawa, Hideki. 1984. LFG System in Prolog. In Yorick Wilks (ed.), Proceedings of the
10th International Conference on Computational Linguistics and 22nd Annual Meeting
of the Association for Computational Linguistics, 358–361. Stanford University, CA: As-
sociation for Computational Linguistics.
Yip, Moira, Joan Maling & Ray S. Jackendoff. 1987. Case in tiers. Language 63(2). 217–250.
DOI: 10.2307/415655.
Yoshinaga, Naoki, Yusuke Miyao, Kentaro Torisawa & Jun’ichi Tsujii. 2001. Resource
sharing amongst HPSG and LTAG communities by a method of grammar conversion
between FB-LTAG and HPSG. In Proceedings of ACL/EACL workshop on Sharing Tools
and Resources for Research and Education, 39–46. Toulouse, France.
Zaenen, Annie & Ronald M. Kaplan. 1995. Formal devices for linguistic generalizations:
West Germanic word order in LFG. In Mary Dalrymple, Ronald M. Kaplan, John T.
Maxwell III & Annie Zaenen (eds.), Formal issues in Lexical-Functional Grammar (CSLI
Lecture Notes 47), 215–239. Stanford, CA: CSLI Publications.
Zaenen, Annie & Ronald M. Kaplan. 2002. Subsumption and equality: German partial
fronting in LFG. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG
2002 conference. Stanford, CA: CSLI Publications. http://csli-publications.stanford.edu/LFG/7/
(18 August, 2020).
Zaenen, Annie, Joan Maling & Höskuldur Thráinsson. 1985. Case and grammatical func-
tions: The Icelandic passive. Natural Language & Linguistic Theory 3(4). 441–483. DOI:
10.1007/BF00133285.
Zappa, Frank. 1986. Does humor belong in music? EMI Music Germany GmbH & Co. KG.
Ziem, Alexander & Alexander Lasch. 2013. Konstruktionsgrammatik: Konzepte und Grund-
lagen gebrauchsbasierter Ansätze (Germanistische Arbeitshefte 44). de Gruyter.
Zucchi, Alessandro. 1993. The language of propositions and events: issues in the syntax
and the semantics of nominalization (Studies in Linguistics and Philosophy 51). Berlin:
Springer Verlag.
Zwart, C. Jan-Wouter. 1994. Dutch is head-initial. The Linguistic Review 11(3–4). 377–406.
Zweigenbaum, Pierre. 1991. Un analyseur pour grammaires lexicales-fonctionnelles. TA
Informations 32(2). 19–34.
Zwicky, Arnold M., Joyce Friedman, Barbara C. Hall & Donald E. Walker. 1965. The
MITRE syntactic analysis procedure for Transformational Grammars. In Proceedings
– FALL Joint Computer Conference, 317–326. DOI: 10.1109/AFIPS.1965.108.
Name index
Abbott, Barbara, 608
Abeillé, Anne, xvii, 121, 162, 163, 178, 179, 417, 418, 563, 564, 653, 683
Abney, Steven P., 29, 118–120, 128, 489, 505, 521, 541, 703
Abraham, Werner, 12, 153
Abzianidze, Lasha, 266
Ackerman, Farrell, 118, 538, 688, 703
Adams, Marianne, 467
Ades, Anthony E., 526
Adger, David, x, xv, 127, 132–134, 136–142, 148, 166, 173, 180, 560, 591, 593
Ágel, Vilmos, 367, 415
Aguado-Orea, Javier, 497, 542
Ahmed, Mohamed Ben, 417
Ahmed, Reaz, 265
Ajdukiewicz, Kazimierz, 160, 163, 245
Alqurashi, Abdulrahman, 570
Alsina, Alex, 37, 314, 456, 599, 604, 611
Altmann, Hans, 47
Ambridge, Ben, 465–467, 492, 497, 503, 548
Anderson, John M., 367
Anderson, Stephen R., 231
Andrews, Avery, 222
Aoun, Joseph, 120
Arad Greshler, Tali, 265, 266
Arends, Jacques, 481
Arka, I Wayan, 222
Arnold, Doug, 569
Arnold, Jennifer E., 525
Asher, Nicholas, 221
Askedal, John Ole, 43
Asudeh, Ash, xv, 230, 242, 244, 308, 314, 599, 611–614, 617, 621, 679, 685
Attardi, Giuseppe, 368
Attia, Mohammed A., 221
Avgustinova, Tania, 267

Bach, Emmon, 62, 102
Backofen, Rolf, 265
Bahrani, Mohammad, 183
Baker, Carl Lee, 484, 490
Baker, Mark C., 452, 456, 460, 532
Baldridge, Jason, x, 163, 245, 246, 251⁵, 253, 257, 263, 299, 578
Balla, Amar, 265
Ballweg, Joachim, 550
Baltin, Mark, 460, 461
Bangalore, Srinivas, 415
Bannard, Colin, 315, 568
Bar-Hillel, Yehoshua, 569, 571
Bargmann, Sascha, 413, 678
Barry, Guy, 263
Bartsch, Renate, 104, 161
Barwise, Jon, 282
Baschung, K., 245
Bates, Elizabeth A., 477, 479, 507
Baumgärtner, Klaus, 367, 371, 401
Bausewein, Karin, 157, 158, 288
Bayer, Josef, 101, 117
Beavers, John, 162, 163, 245, 305, 341, 654, 658
Bech, Gunnar, 47, 261, 423, 436
Becker, Tilman, 96, 423, 425, 434
Beermann, Dorothee, 266
Beghelli, Filippo, 146
Behrens, Heike, 365, 477, 546, 655
Bellugi, Ursula, 483
Bender, Emily, 333, 336, 562
Bender, Emily M., x, 6, 27, 119, 266–268, 283, 309, 329, 330, 527, 562, 565, 570
Bentzen, Kristine, 146
Bergen, Benjamin K., 96, 313, 324, 325, 340, 342, 343, 544, 617, 639
Berman, Judith, 37, 75, 101, 102, 221, 233–236, 238, 241, 244, 433, 456, 458
Berwick, Robert C., 160, 172, 176, 491, 533, 534, 549
Bes, G. G., 245
Bever, Thomas G., 522
183, 270, 313, 321, 357, 434, 445, 449, 451, 452, 454, 456, 458, 460, 468–471, 473, 475–477, 480, 482, 484, 488, 489, 491, 504, 505, 519, 522, 525, 527, 528, 531, 537, 539–541, 549, 556, 570, 610, 618, 627–629, 701, 739, 740
Chouinard, Michelle M., 505
Christiansen, Morten, 467
Chrupala, Grzegorz, 222
Chung, Chan, 683
Chung, Sandra, 467
Church, Kenneth, 703
Cinque, Guglielmo, 145, 147, 150, 152, 175, 468, 469, 475, 698
Citko, Barbara, 157, 159
Clark, Alexander, 504
Clark, Eve V., 505
Clark, Herbert H., 524
Clark, Stephen, 245, 246
Clément, Lionel, 221, 318, 640
Clifton, Charles Jr., 521, 528
Coch, Jose, 369
Cohen, Shay B., 177
Cole, Jennifer, 120, 521
Comrie, Bernard, 158, 163, 288
Cook, Philippa, x, 17, 38, 154, 349, 397
Cook, Philippa Helen, 221, 466
Cooper, Robin, 282
Coopmans, Peter, 569
Copestake, Ann, 177, 265, 266, 274, 284, 317, 329, 358, 361, 363, 422, 578, 585, 603, 623
Corluy, A., 245
Correa, Nelson, 120
Costa, Francisco, 266
Covington, Michael A., 369, 400
Crabbé, Benoit, 417
Crain, Stephen, 458, 459, 467, 496, 503, 526
Cramer, Bart, 267
Crocker, Matthew Walter, 120–123
Croft, William, 313, 314, 456, 469, 544, 599, 600, 605, 641, 643, 656, 679, 695
Crouch, Richard, 221
Crysmann, Berthold, 17, 104, 162, 163, 266, 269, 305, 341, 355, 654, 660
Csernyi, Gábor, 222
Culicover, Peter W., 83, 161, 166, 458, 476, 523, 540, 571, 599, 645, 646, 658
Culy, Christopher, 203, 417, 475, 549
Curran, James, 245
Curtiss, Susan, 479

Da Sylva, Lyne, 183
Dąbrowska, Ewa, 313, 365, 482, 484, 495, 543, 599
Dahl, Östen, 401, 410, 468
Dahllöf, Mats, 265, 266
Dale, Robert, 221
Dalrymple, Mary, 37, 101, 222, 223, 227–230, 234, 242, 244, 308, 314, 375, 456, 459, 556, 570, 576, 611, 679, 685, 775
Davidson, Donald, 623
Davis, Anthony R., 284, 318, 640
de Alencar, Leonel, 221, 222, 244
de Alencar, Leonel Figueiredo, 221, 222
De Beule, Joachim, 314
De Kuthy, Kordula, 14, 17, 119, 171, 266, 325, 349, 466, 550, 551
de Saussure, Ferdinand, 3, 471
Declerck, Thierry, 265
Dellert, Johannes, 417
Delmonte, Rodolfo, 221, 222
Demberg, Vera, 357
Demske, Ulrike, 29
den Besten, Hans, 117, 161
Deppermann, Arnulf, 365, 485, 614
Derbyshire, Desmond C., 473
Devlin, Keith, 282
Dhonnchadha, E. Uí, 369
Diagne, Abdel Kader, 265
Diesing, Molly, 101
Dini, Luca, 265
Dione, Cheikh Mouhamadou Bamba, 222
Dipper, Stefanie, 221
Donati, Caterina, 156
Donohue, Cathryn, 305
Doran, Christine, 417
Doran, Robert W., 120
Dorna, Michael, 265
Dörre, Jochen, 221, 265
Dowty, David, 246
Dowty, David R., 6, 92, 161, 249, 347, 577, 583, 606, 612, 623, 628, 632, 652
Dras, Mark, 221
Drellishak, Scott, 267
Drosdowski, Günther, 25, 41
Dryer, Matthew S., 103, 452, 453, 695
Dürscheid, Christa, 47, 117, 550
Dyvik, Helge, xi, 222

Egg, Markus, 577, 579
Eichinger, Ludwig M., 415
Eikmeyer, Hans-Jürgen, 245
Eisele, Andreas, 221
Eisenberg, Peter, x, 13, 23, 25, 28, 33, 34, 37, 42, 47, 48, 64, 72, 161, 270, 488
Elbourne, Paul, 116, 176, 529, 701
Ellefson, Michelle R., 467
Elman, Jeffrey L., 479, 481, 482, 492, 496, 507
Embick, David, 623
Emerson, Guy, 268
Emirkanian, Louisette, 183
Engdahl, Elisabet, 349
Engel, Ulrich, 367, 370, 372, 373, 375, 376, 380, 406, 415, 551, 570
Epstein, Samuel David, 160, 172
Erbach, Gregor, 265
Ernst, Thomas, 152, 153
Eroms, Hans-Werner, 96, 245, 367, 372, 375–377, 379–381, 387, 393, 394, 412, 415, 570
Erteschik-Shir, Nomi, 464–467
Estigarribia, Bruno, 497
Estival, Dominique, 265
Evang, Kilian, 417
Evans, Nicholas, 453, 456, 459, 468, 469, 475, 699
Evans, Roger, 183
Everett, Daniel L., 472, 474
Evers, Arnold, 117, 119, 637
Evert, Stefan, x

Faaß, Gertrud, 222
Fabregas, Antonio, 159
Falk, Yehuda N., 122
Fan, Zhenzhen, 265, 266
Fang, Ji, 222
Fanselow, Gisbert, x, 83, 100, 110, 115–117, 119, 120, 125, 153, 170, 171, 245, 253, 299, 301, 398, 438, 456, 459, 466, 480, 521, 550, 595, 699
Feldhaus, Anke, 102
Feldman, Jerome, 486
Felix, Sascha W., 83, 100, 110, 125
Filimonova, Elena, 451
Fillmore, Charles J., 68, 79, 122, 172, 313–317, 325–328, 340, 495, 564, 571, 603, 605, 640
Fischer, Ingrid, 563
Fischer, Kerstin, 365
Fischer, Klaus, 367
Fisher, Simon E., 482–484
Fitch, W. Tecumseh, 86, 145, 445, 460, 469, 471–473, 475, 476, 482, 484, 504
Flickinger, Dan, 265, 266
Flickinger, Daniel P., 119, 177, 210, 266, 267, 274, 284, 309, 317, 329, 333, 336, 347, 422, 562, 578, 586, 616
Fodor, Janet Dean, 520², 532, 533, 537, 538, 540
Fodor, Jerry A., 522
Fokkens, Antske, 266, 267
Fong, Sandiway, x, 121, 176
Fordham, Andrew, 121–123
Forst, Martin, 37, 221, 456
Fortmann, Christian, 550
Fourquet, Jean, 102
Fouvry, Frederik, 267
Fox Tree, Jean E., 524
Fraj, Fériel Ben, 417
Frank, Anette, 102, 221, 578, 757
Frank, Robert, 417
Franks, Steven, 494, 531
Frazier, Lyn, 520, 528
Freidin, Robert, 120, 183, 460, 699
Freudenthal, Daniel, 497, 541–543
Frey, Werner, 114, 115, 117, 123, 153, 160, 167, 221, 227, 308, 550
Fried, Mirjam, 313, 315
Friederici, Angela D., 479, 482
Friedman, Joyce, 120
Fries, Norbert, 288
Fukui, Naoki, 101
Fukumochi, Yasutomo, 369
Futrell, Richard, 472
Hays, David G., 368, 369, 371, 401
Heinecke, Johannes, 369
Heinz, Wolfgang, 109, 243, 631
Helbig, Gerhard, 367
Hellan, Lars, 29, 266, 703
Hellwig, Peter, 367, 369, 374, 375, 400, 408, 409, 415, 549, 701
Her, One-Soon, 221, 222
Heringer, Hans Jürgen, 96, 368, 371, 377, 380, 381, 401, 415
Herring, Joshua, 176
Herzig Sheinfux, Livnat, 265, 266
Higginbotham, James, 309, 566
Higinbotham, Dan, 221, 222
Hildebrandt, Bernd, 245
Hinkelman, Elizabeth A., 265
Hinrichs, Erhard W., 119, 171, 178, 266, 291, 309, 376, 435, 585, 594, 597, 637, 683
Hinterhölzl, Roland, 153
Hoberg, Ursula, 436, 593
Hockenmaier, Julia, 245, 246
Hockett, Charles F., 473
Hockey, Beth Ann, 417
Hoeksema, Jack, 648
Hoekstra, Teun, 270
Hoffman, Beryl Ann, 245, 253, 299
Hofman, Ute, 47
Hofmeister, Philip, 467
Höhle, Tilman N., 43, 51, 101, 103, 114, 253, 269, 293, 515, 594, 769, 770
Holler, Anke, 274
Holmberg, Anders, 532
Hornstein, Norbert, 128, 151, 175, 180, 468, 471, 472, 477
Hrafnbjargarson, Gunnar Hrafn, 146
Hróarsdóttir, Þorbjörg, 146
Huang, Wei, 369
Huddleston, Rodney, 261
Hudson Kam, Carla L., 481
Hudson, Carla L., 481
Hurford, James R., 477, 496
Hurskainen, Arvi, 369

Ibbotson, Paul, 697
Ichiyama, Shunji, 369
Imrényi, András, 367
Ingram, David, 543
Iordanskaja, L., 369
Islam, Md. Asfaqul, 265
Islam, Muhammad Sadiqul, 265

Jackendoff, Ray S., 75, 83, 94, 96, 119, 144, 161, 163, 166, 290, 314, 357, 365, 413, 453, 458, 469, 476, 496, 523, 527, 529, 530, 540, 571, 585, 599, 611, 645, 646, 648, 658, 668, 677, 678, 688
Jacobs, Joachim, 102, 196, 252, 365, 667, 688
Jacobson, Pauline, 196, 204, 295
Jaeggli, Osvaldo A., 590
Jäger, Gerhard, 577
Jäppinen, H., 369
Jäppinen, Harri, 369
Johannessen, Janne Bondi, 369
Johnson, David E., 144
Johnson, Jacqueline S., 478, 479
Johnson, Kent, 487, 488
Johnson, Mark, 122, 200, 202, 205, 219, 221, 222, 357, 441, 526–528, 549
Johnson, Mark H., 479, 507
Jones, Wendy, 483
Joshi, Aravind K., 96, 341, 415, 417, 418, 420, 422, 423, 425, 427–429, 434–436, 439–441, 632, 653
Jungen, Oliver, x

Kahane, Sylvain, x, 367, 369, 372, 374, 381, 387, 391, 392, 402, 412, 420
Kallmeyer, Laura, x, 315, 318, 417, 418, 420, 422, 439, 441, 570, 639–641, 679
Hudson, Richard, x, 29, 367–371, 375–377, Kamp, Hans, 227
381–384, 387, 392, 406, 407, 409, Kanerva, Jonni M., 232, 233
410, 412, 414, 415, 477 Kaplan, Ronald M., 123, 125, 221, 222, 226,
Hukari, Thomas E., 569, 570 231, 234, 235, 238, 240, 242, 308,
Humboldt, Wilhelm von, 470 326, 456, 472, 511, 549, 570, 583,
Hunze, Rudolf, 221 685, 775
Karimi-Doostan, Gholamhossein, 148
Lareau, François, 221 Lohnstein, Horst, x, 97, 113, 121, 123, 125, 522,
Larson, Richard K., 111, 133, 160, 166, 569, 589 569, 590
Lascarides, Alex, 585 Longobardi, Giuseppe, 477
Lasch, Alexander, 643 Lorenz, Konrad, 478
Laskri, Mohamed Tayeb, 265 Lötscher, Andreas, 565
Lasnik, Howard, 493, 528 Loukam, Mourad, 265
Lavoie, Benoit, 369 Lüdeling, Anke, x, 662
Le, Hong Phuong, 418 Luuk, Erkki, 471, 472, 476
Lee-Goldmann, Russell R., 79 Luuk, Hendrik, 471, 472, 476
Legate, Julie, 493–495
Lehtola, A., 369 Maas, Heinz Dieter, 369
Lehtola, Aarno, 369 Maché, Jakob, x, 669, 673
Leiss, Elisabeth, 4, 406, 519, 567 Machicao y Priemer, Antonio, 266, 311, 692
Lenerz, Jürgen, 12, 112, 117, 119, 142, 593 Mack, Jennifer, 658
Lenneberg, Eric H., 478, 479 Mackie, Lisa, 222
Levelt, Willem J. M., 244 MacWhinney, Brian, 491
Levin, Beth, 632 Maess, Burkhard, 482
Levine, Robert D., x, 144, 311, 358, 361, 388, Maier, Wolfgang, 417
389, 569, 570 Maling, Joan, 119, 193, 233, 290, 630
Levinson, Stephen C., 453, 456, 459, 468, 469, Malouf, Robert, 170, 266, 286, 305, 324, 351,
475, 699 383, 389, 569, 570, 584, 645, 685
Levy, Leon S., 417, 441 Manandhar, Suresh, 265
Lewin, Ian, 120–122 Manshadi, Mehdi Hafezi, 183
Lewis, Geoffrey L., 321, 640 Marantz, Alec, 155, 522, 525, 623, 625, 627,
Lewis, John D., 492, 496 628, 669
Lewis, Richard L., 521 Marciniak, Małgorzata, 266
Li, Charles N., 673 Marcus, Gary F., 482–484, 505
Li, Wei, 266 Marcus, Mitchell P., 120, 121
Liakata, Maria, 222 Marimon, Montserrat, 267
Lichte, Timm, x, 315, 417, 425, 439, 679 Marshall, Ian, 267
Lichtenberger, Liz, 483 Marslen-Wilson, William D., 524
Lieb, Hans-Heinrich, x Martinet, André, 470²⁵
Lieven, Elena, 315, 568 Martner, Theodore S., 120
Lieven, Elena V. M., 548 Martorell, Jordi, 4
Lightfoot, David W., 120, 541 Masuichi, Hiroshi, 222
Lin, Dekang, 121, 123 Masum, Mahmudul Hasan, 265
Lin, Francis Y., 4 Matiasek, Johannes, 109, 243, 631
Link, Godehard, 455 Matsumoto, Yuji, 368
Lipenkova, Janna, 266, 365, 674, 692 Maxwell III, John T., 221, 775
Liu, Gang, 266 Mayo, Bruce, 221, 222
Liu, Haitao, x, 369 McCawley, James D., 467
Lloré, F. Xavier, 245 McCloskey, James, 467
Lobin, Henning, 263, 368, 415 Mchombo, Sam A., 231, 612
Löbner, Sebastian, 569 McIntyre, Andrew, x
Lødrup, Helge, 37, 456 McKean, Kathryn Ojemann, 521
Lohndal, Terje, 623, 624, 632 Meinunger, André, 97, 117, 142, 146, 583
Meisel, Jürgen M., 453, 478, 480, 531, 548
Mel’čuk, Igor A., 367–369, 375, 409 177–182, 219, 231, 243, 251, 261,
Melnik, Nurit, 265, 266 265–267, 273, 276, 283, 287, 288,
Mensching, Guido, x, xi, 539 290–292, 295, 297, 298, 306–309,
Menzel, Wolfgang, 369 311, 313–315, 320–322, 326, 331,
Metcalf, Vanessa, 266 336, 338, 339, 344–346, 349, 355,
Meurer, Paul, 221, 222 356, 358, 361–363, 365, 397–399,
Meurers, Walt Detmar, 17, 102, 119, 171, 265, 404, 412, 413, 428, 435, 438, 449,
266, 289–291, 297, 309, 311, 347, 456, 459, 461, 463, 471, 475, 516,
358, 361, 405, 436, 463, 562, 565, 517, 544, 550, 562, 565, 569, 583–
585, 649, 683, 703 586, 594, 600, 602, 605, 608, 612–
Meza, Iván V., 267 615, 617, 620, 621, 623, 627, 631,
Micelli, Vanessa, 315, 322 636, 637, 642, 643, 645–650, 656–
Michaelis, Jens, x, 164, 167, 549, 550 662, 673, 674, 679, 681, 683, 685,
Michaelis, Laura A., x, 243, 314, 318, 328, 329, 686, 690, 692, 697, 703, 707, 718
365, 571, 640, 701 Muraki, Kazunori, 368, 369
Michelson, Karin, 362, 694 Musso, Mariacristina, 482, 697
Miller, George A., 434, 519, 521 Müürisep, Kaili, 369
Miller, Philip H., 683 Muysken, Pieter, 76, 162
Mineur, Anne-Marie, 265 Mykowiecka, Agnieszka, 266
Mistica, Meladel, 222
Mittendorf, Ingo, 222 Nakayama, Mineharu, 496, 503
Miyao, Yusuke, 267, 440 Nakazawa, Tsuneko, 119, 171, 178, 291, 309,
Moeljadi, David, 266 376, 435, 585, 594, 597, 637, 683
Moens, Marc, 702 Nanni, Debbie L., 261
Mohanan, KP, 37, 456 Nasr, Alexis, 392
Mohanan, Tara, 37, 456 Naumann, Sven, 183
Mok, Eva, 314 Nederhof, Mark-Jan, 363
Momma, Stefan, 221 Neeleman, Ad, 314
Monachesi, Paola, 683 Nelimarkka, Esa, 369
Montague, Richard, 190 Nerbonne, John, 200, 202, 282, 307, 322, 456,
Moortgat, Michael, 246 578, 614, 641
Moot, Richard, 245 Netter, Klaus, 29, 81, 102, 103, 217, 251, 252,
Morgan, James L., 488 265, 266, 307, 435, 440, 696, 706
Morin, Yves Ch., 120 Neu, Julia, 266
Morrill, Glyn, 245, 259 Neumann, Günter, 265
Morrill, Glyn V., 246 Neville, Anne, 265
Moshier, Andrew M., 320 Nevins, Andrew Ira, 472
Motazedi, Yasaman, 221 Newmeyer, Frederick J., 453, 454, 460, 469,
Muischnek, Kadri, 369 477, 532, 540, 541, 569, 696
Mukai, Kuniaki, 282 Newport, Elissa L., 478, 479, 481
Müller, Gereon, x, 94, 117, 119, 143, 171, 174, Ng, Say Kiat, 266
417, 669–671, 678 Nguyen, Thi Minh Huyen, 418
Müller, Max, 476 Niño, María-Eugenia, 222
Müller, Natascha, 478 Nivre, Joakim, 368
Müller, Stefan, ix, xiii, xv–xvii, 6, 15, 29, 34, Niyogi, Partha, 4, 470, 532–534
42, 102, 110, 119, 148, 155, 157, 158, Noh, Bokyung, 585, 612, 637
160, 163, 164, 167, 168, 170–173, Nøklestad, Anders, 369
Popowich, Fred, 265 Richter, Frank, x, 181, 205, 219, 265, 282, 305,
Porzel, Robert, 340 331, 333, 335, 339, 358, 361, 392,
Postal, Paul M., 120, 472, 569 472, 562, 566, 699
Poulson, Laurie, 267 Rieder, Sibylle, 265
Preuss, Susanne, 183, 193, 195 Riemer, Beate, 478
Prince, Alan, x Riemsdijk, Henk van, 95, 469
Przepiórkowski, Adam, 119, 222, 266, 290, Riezler, Stefan, 221, 222
309, 311, 562, 565 Ritchie, R. W., 86
Pullum, Geoffrey K., x, xiii, 4, 77, 86, 93, 120, Rizzi, Luigi, 145, 146, 148, 150, 153, 175, 458,
158, 163, 183, 184, 186, 187, 190, 191, 460, 468, 469, 475, 531, 698
195, 203, 204, 288, 321, 417, 454, Roberts, Ian, 477
471–473, 475, 484, 486–488, 490, Roberts, Ian F., 532
492, 495, 498, 506, 508–513, 517, Robins, Robert Henry, x
531, 570 Robinson, Jane, 340
Pulman, Stephen G., 526 Rodrigues, Cilene, 472
Pulvermüller, Friedemann, 4, 496, 508, 658, Rogers, James, 123, 472, 511
659, 661, 662 Rohrer, Christian, 221, 222
Puolakainen, Tiina, 369 Romero, Maribel, 441
Putnam, Michael, 158, 415 Roosmaa, Tiit, 369
Rosén, Victoria, 222
Quaglia, Stefano, 222 Ross, John Robert, 119, 178, 201, 258, 352, 455,
460, 464, 809
Radford, Andrew, 150, 180, 541 Roth, Sebastian, 222
Rahman, M. Sohel, 265 Rothkegel, Annely, 368
Rahman, Md. Mizanur, 265 Roussanaly, Azim, 418
Rákosi, György, 222 Rowland, Caroline F., 497, 503
Rambow, Owen, 96, 221, 369, 392, 415, 417, Ruppenhofer, Josef, 243, 314, 318, 328, 571,
418, 423, 425, 428, 429, 432–434, 640, 701
438, 439, 441, 570, 681
Ramchand, Gillian, 586 Sabel, Joachim, 550
Randriamasimanana, Charles, 222 Sadler, Louisa, 222
Raposo, Eduardo, 494, 531 Sáfár, Éva, 267
Rappaport Hovav, Malka, 632 Safir, Kenneth J., 569
Rappaport, Malka, 628 Sag, Ivan A., x, xv, xvi, 27, 29, 120, 122, 153,
Rauh, Gisa, 155 156, 162, 163, 169–172, 177, 178,
Rawlins, Kyle, 475 183, 184, 186, 190, 191, 195, 204,
Reape, Mike, 96, 178, 179, 221, 298, 305, 325, 205, 210, 217, 258, 260, 265, 266,
341, 349, 356, 363, 550, 660 269–271, 274, 276, 282–284, 286,
Redington, Martin, 504 299, 302, 305, 307, 314, 317, 320,
Reis, Marga, 35, 36, 43, 49, 51, 102, 104, 117, 321, 324, 329, 330, 336–341, 349,
457 351, 357, 360, 364, 365, 383, 389,
Remberger, Eva-Maria, 494, 539 402, 411–413, 422, 467, 485, 495,
Resnik, Philip, 418 497, 509, 521, 524, 526, 527, 529,
Reyle, Uwe, 221, 227, 578 530, 538, 540, 556, 561, 562, 564,
Rhomieux, Russell, 79 565, 569, 570, 578, 584, 612, 616,
Richards, Marc, 129, 144, 522, 525 626, 627, 631, 642, 645, 649, 654,
657, 660, 662, 674–676, 683, 685
Language index
Palauan, 305
Persian, 266, 267, 452, 683, 692
Pirahã, 472
Polish, 222, 266
Portuguese, 222, 266, 368, 369
Proto-Uralic, 473
Romance, 683
Russian, 236¹⁰, 267, 368, 369
Sahaptin, 267
Sami, 370
sign language, 480–481
American (ASL), 479, 481, 532
British, 267
French, 267
German, 267
Greek, 267
South African, 267
Slavic, 367
Sorbian
Lower, 494, 531¹
Upper, 494, 531¹
Sotho, Northern, 222
Spanish, 154, 222, 267, 305, 325, 368, 369, 479,
542–543, 692
Straits Salish, 468
Swahili, 369
Swedish, 368, 370, 464, 612²⁰, 621
Swiss German, 475
Vietnamese, 418
Subject index
core grammar, 93, 313, 531, 539 454, 457, 458, 556, 569–586, 663,
CoreGram, 267 675
corpus, 6 PRO, see PRO
corpus annotation, 368 empty head, 668, 677
corpus linguistics, 506, 703 endocentricity, 95
coverb, 453, 469 entity, 190
creole language, 480–481 epsilon, 571
critical period, 478–479 epsilon production, 68
cycle escape hatch, 462
in feature description, 213, 297, 343, event, 283
403 evidence
transformational, 465, 523, 559 negative
indirect, 505
D-structure, 87, 128, 144, 308, 528 evokes operator, 343
declarative clause, 105 Exceptional Case Marking (ECM), 560
deep structure, see D-structure experiencer, 30, 92, 233
Deep Structure, 308 explanatory adequacy, 449
Definite Clause Grammar (DCG), 81, 219 Extended Projection Principle (EPP), 456,
deletion, 522 537
DELPH-IN, 265 external argument, 93
dependency, 477 extraction, 389, 460, 464–467, 570, 620
Dependency Categorial Grammar, 263 from adjuncts, 550
Dependency Grammar (DG), 96, 117, 173, from specifier, 550
196, 199¹⁴, 441, 519¹, 654⁴⁹, 660⁵³ island, 464
Dependency Unification Grammar (DUG), subject, 531
409, 549, 701 extraction path marking, 305, 351, 383
depictive predicate, 288¹⁵, 309²⁶, 565–567 extraposition, 104¹⁴, 147¹², 172, 399, 435,
derivation, 27, 123, 322, 690 460–464
derivation tree, 420
Derivational Theory of Complexity (DTC), f-structure, 193², 223–227, 556, 559, 564
521³, 521–524 Faculty of Language
descriptive adequacy, 449 in the Broad Sense (FLB), 476
determiner, 24, 53 in the Narrow Sense (FLN), 477
as head, 29 feature
directive, 667 ADJ, 224, 230
disjunction, 210–211, 237 ARG-ST, 272, 556, 584
Distributed Morphology, 669 COMPS, 270, 330
do-Support, 543 COMP, 224
dominance, 54 CONT, 643
immediate, 54, 187, 269, 324 DAUGHTERS, 169
DF, 241
economy DSL, 292, 295, 302
transderivational, 144 EVOKES, 343
elementary tree, 418 FOCUS, 224, 239
ellipsis, 288¹⁵, 485³¹, 522, 527¹², 556, 569, 668 GEN, 282
empty element, 15⁸, 68, 113, 154, 160, 166, 170, HEAD-DTR, 273
297, 307, 314, 325, 424, 434, 439, HEAD, 278, 338–339
441, 456, 459, 472, 485³¹, 495, language acquisition, 118, 134, 246, 314, 449,
511, 514, 521, 523, 527, 529¹⁴, 530, 460¹¹, 555
538, 549, 556–558, 561, 562, 565, language evolution, 314
570, 572, 578, 583, 586⁵, 623, 654, learnability, 487
660⁵³, 681, 688, 692, 701, 703 learning theory, 498
Constructional, 336 left associativity, 247
Heavy-NP-Shift, 325 lexeme, 26
Hole Semantics, 578 Lexical Decomposition Grammar, 314
hydra clause, 455 Lexical Functional Grammar (LFG), 37, 38,
hypotenuse, 343 75, 101¹¹, 125, 154, 173, 177, 180,
193², 196, 202, 219, 292, 306–309,
iambus, 535 314, 325, 375, 384, 433, 456, 457,
Icelandic, 290¹⁶ 472, 509, 511, 521, 521³, 527, 529¹⁴,
ID/LP grammar, 187, 235⁸, 269, 422 530, 549, 556, 557, 561, 562, 564,
identification in the limit, 485–488 570, 574, 576, 604, 614²³, 648, 654,
ideophone, 469 688, 701, 703
idiom, 340, 539, 562–567, 616²⁵, 642 lexical integrity, 231, 314, 612, 615, 662
imperative, 20, 35 Lexical Mapping Theory (LMT), 232–234,
implication, 275, 280 292
index, 154 Lexical Resource Semantics (LRS), 282⁹
indicative, 20 lexical rule, 195, 251, 287–586
infinitude, 4¹, 469–477 verb-initial position, 295–297
inflection, 19–21, 88, 690 lexicon, 88–281
inflectional class, 21, 58 linear logic, 228
information structure, 107, 153–154, 223, linear precedence, 187, 269, 324
349, 467, 614²² Linear Precedence Rule, 188
inheritance, 242, 284, 316, 317, 322, 614, 641– linearization rule, 170, 231, 342
643, 677, 702 Linguistic Knowledge Builder (LKB), 265
default, 123, 410, 642 Link Grammar, 368
multiple, 210, 428 linking, 93, 232–234, 272, 283–284, 317–322,
instrument, 233 374
Integrational Linguistics, x list, 207, 713
interface, 357 difference, 713
interjection, 22, 23 local maximum, 534
interrogative clause, 47–49, 274, 674–677 locality, 153, 274, 329–336, 340, 420, 439,
intervention, 138 558–567
introspection, 703 of matching, 137
inversion, 569 locative alternation, 656
IQ, 483 Logical Form (LF), 88, 90–91, 308, 441
long-distance dependency, 105–109, 197–
Kleene star, 192, 234 200, 237–241, 254–257, 295¹⁷,
300–306, 325–326, 329, 379–384,
label, 154–160, 169 428
𝜆-abstraction, 61 lowering, 100
𝜆-conversion, 61 LP-rule, 188
language
formal, 120, 485 macaque, 479
verb-particle, 5
verbal complex, 119, 637
36
Verbmobil, 177, 267, 361
visual perception, 476
Vorfeld, 179
Grammatical theory
This book introduces formal grammar theories that play a role in current linguistic theorizing
(Phrase Structure Grammar, Transformational Grammar/Government & Binding, Generalized Phrase
Structure Grammar, Lexical Functional Grammar, Categorial Grammar, Head-Driven Phrase Structure
Grammar, Construction Grammar, Tree Adjoining Grammar). The key assumptions are explained and it
is shown how the respective theory treats arguments and adjuncts, the active/passive alternation,
local reorderings, verb placement, and fronting of constituents over long distances. The analyses
are explained with German as the object language.
The second part of the book compares these approaches with respect to their predictions regarding
language acquisition and psycholinguistic plausibility. The nativism hypothesis, which assumes that
humans possess genetically determined innate language-specific knowledge, is critically examined
and alternative models of language acquisition are discussed. The second part then addresses
controversial issues of current theory building such as the question of whether flat or binary
branching structures are more appropriate, the question of whether constructions should be treated
on the phrasal or the lexical level, and the question of whether abstract, non-visible entities
should play a role in syntactic analyses. It is shown that the analyses suggested in the respective
frameworks are often translatable into each other. The book closes with a chapter showing how
properties common to all languages or to certain classes of languages can be captured.
ISBN 978-3-96110-273-0