Patterns of alignment in verb agreement*
Balthasar Bickel, Giorgio Iemmolo, Taras Zakharko, & Alena Witzlack-Makarevich
University of Zürich
1
Siewierska’s Problem
A highly productive inquiry in typology concerns the alignment of argument roles, especially the identical vs. different treatment of the three core roles S, A, and P by the rules of case assignment and agreement marking. With regard to case marking, determining alignment is straightforward: one can simply check which argumental NPs are assigned the same case markers.
With regard to agreement, the issue is more complex. Whereas argumental NPs exist independently of case marking, agreement consists of two components: (i) whether or not it exists (i.e. whether certain argument features, like person, number or gender, show up at all in the verb morphology), and (ii) if agreement exists, how its markers align roles. In many cases, the answers to these questions are still straightforward and one can easily observe that the agreement markers of, e.g., Latin show accusative alignment.
However, when expanding the typological scope, one often runs into what we call here “Siewierska’s Problem”: argument marking in agreement is often complex and does not allow simple answers. As a matter of fact, the analysis of an agreement system as being primarily ergative, accusative or neutral heavily depends on which criteria one employs. As Siewierska (2003) notes in her seminal paper on the determination of the alignment of agreement in ditransitive constructions, in some instances the consideration of different criteria gives rise to conflicting classifications, i.e. the criteria may not converge in identifying a unique alignment type. Siewierska (2003:342) considers the following four criteria that apply to the determination of the alignment of agreement:¹
* Earlier versions of this paper were presented at the Anna Siewierska Memorial Workshop in Leipzig, April 27,
2012, and at the conference “Syntax of the World’s Languages IV” in Dubrovnik, October 1-4, 2012. We thank the
audiences for helpful comments and questions. We are also grateful for very useful comments and suggestions on
a first draft by Dik Bakker and Martin Haspelmath. Author contributions: B.B., G.I. and A.W.-M. conceived and
designed the study and all contributed to the writing. B.B. conducted the statistical analysis. All authors were
involved in discussion and interpretation of the results. G.I. and A.W.-M. contributed to data analysis and coded
agreement data. T.Z. did most of the data extraction and aggregation work. We thank Lennart Bierkandt and
Kevin Bätscher for help in data collection and encoding.
¹ A further criterion, not considered by Siewierska (2003), concerns the host(s) of agreement marker(s), i.e. auxiliaries, lexical verbs, etc. We will not consider this criterion here either.
REVISED DRAFT – March 15, 2013
1. Trigger Potential: which argument(s) do and which do not trigger agreement marking (i.e.
does agreement exist at all)?
2. Form: which argument(s) are covered by the markers with the same phonological form?
3. Position: which arguments trigger agreement in the same position relative to the verbal
stem and/or relative to each other (e.g. pre, post, etc.)?
4. Conditions: which arguments trigger agreement under the same condition?
As observed by Siewierska, these four factors typically converge in establishing an overall agreement pattern, as e.g. in German in (1) below, where all the criteria listed above give a consistent alignment pattern. In terms of Trigger Potential, German displays accusative alignment: only S and A trigger agreement. When we take into consideration the Form and Position criteria, we see that they comply with the Trigger Potential characterization: with respect to the Form criterion, the system is consistently accusative, with S and A marked differently from P, since P is never overtly marked in German verb agreement.² Likewise, with regard to the Position criterion, we have again S=A≠P, since agreement is realized by means of an overt suffix only for S and A.
(1)
German
a. Ich schlaf-e.
   1sNOM sleep-1sS/A
   ‘I sleep.’
b. Du schläf-st.
   2sNOM sleep-2sS/A
   ‘You sleep.’
c. Er schläf-t.
   3sM.NOM sleep-3sS/A
   ‘He sleeps.’
d. Ich seh-e sie.
   1sNOM see-1sS/A 3sF.ACC
   ‘I see her.’
e. Du sieh-st mich.
   2sNOM see-2sS/A 1sACC
   ‘You see me.’
f. Er sieh-t dich.
   3sM.NOM see-3sS/A 2sACC
   ‘He sees you.’
However, in many other languages these criteria diverge in defining the alignment of agreement, thus giving rise to discrepancies. The situation can be illustrated with English: most English verbs in the present indicative are marked with the suffix -s when the subject is third person singular and are unmarked otherwise, as in (2):
(2)
a. They like sailing.
b. He like-s sailing.
With respect to the Trigger Potential criterion, the English present indicative agreement system
can be characterized as exhibiting accusative alignment. However, when the distribution of zero
versus overt agreement markers is taken into account (i.e. the Form criterion), S/A is marked
² Here and in the remainder of the paper, we simplify. We only consider default lexical classes and do not discuss
deviating valency classes such as experiencer verbs. Also see below on this point.
differently from P only in the third person singular, whereas the alignment is neutral (S=A=P) in the rest of the paradigm, as none of the argument roles triggers an overt agreement marker.
More complex discrepancies arise in systems with multiple markers per argument. An illustration of such a system comes from the imperfective agreement paradigm found in Tirmaga (Surmic; Bryant 1999), which has three slots of agreement marking: one prefix and two suffix slots. Table 1 shows the paradigms separately for each of the three roles S, A, and P.

Person   S                      A                      P
         pf    sf1   sf2        pf    sf1   sf2        pf    sf1   sf2
1s       k-    ∅     -i         k-    ∅     -i         ∅     -aɲ   ∅
1pi      k-    ∅     ∅          k-    ∅     ∅          ∅     -ey   ∅
1pe      k-    ∅     -(G)o      k-    ∅     -(G)o      ∅     -ey   ∅
2s       ∅     ∅     -i         ∅     ∅     -i         ∅     -aɲ   ∅
2p       ∅     ∅     -(G)o      ∅     ∅     -(G)o      ∅     -oŋ   ∅
3s       ∅     ∅     ∅          ∅     ∅     ∅          ∅     ∅     ∅
3p       ∅     ∅     -(G)ɛ      ∅     ∅     -(G)ɛ      ∅     ∅     ∅

Table 1: Agreement paradigms for S, A, and P in the Tirmaga Imperfective aspect (pf = prefix slot, sf1 = first suffix slot, sf2 = second suffix slot; ∅ = no overt marker)
The application of the criteria to the Tirmaga paradigm provides conflicting evidence on the alignment
pattern. When one considers the Trigger Potential criterion, the resulting alignment is neutral, since all three roles S, A, and P display some kind of agreement marking, at least in part of the system. With regard to the Position criterion, Tirmaga shows accusative alignment, since S and A are marked in the prefix (‘pf’) slot and in the second suffix (‘sf2’) slot, as opposed to the markers for P, which occupy the first suffix (‘sf1’) slot. Under the Form criterion, finally, one considers the phonological shape of individual markers and asks which argument roles are marked by identical vs. distinct markers. Unlike the other criteria, the Form criterion does not establish a unique alignment pattern in the Tirmaga paradigm: the prefix position shows accusative alignment in the first person (S and A are marked with k- and thus differently from P), whereas other persons have zero exponence which covers all roles alike, thereby constituting neutral alignment. In the first suffix slot, non-third person arguments are accusatively aligned due to the suffixes -aɲ, -ey, -oŋ, whereas the absence of overt markers for the third person arguments establishes neutral alignment. In the final suffix slot, there are again a number of markers (-i, -(G)o, -(G)ɛ) which establish accusative alignment, whereas arguments of those referential categories which have zero exponents for all three argument roles (i.e. the first person plural inclusive and the third person singular) align neutrally. The alignment patterns established on the basis of these three criteria and the observed discrepancies are summarized in (3).
(3)
Tirmaga agreement alignment
a. Trigger Potential: S=A=P
b. Form: S=A≠P, S=A=P
c. Position: S=A≠P
With the exception of Siewierska (2003), discrepancies like these have received little attention in the typological literature or in the description of individual languages. This paper intends to explore the distribution and influence of such discrepancies in the determination of the alignment in agreement systems, focusing specifically on discrepancies between alignments in terms of Trigger Potentials and alignments in terms of Form. We explore two research questions:
1. How frequent and how strong are these discrepancies cross-linguistically?
2. Do these discrepancies have an impact on our generalizations about the distribution of
alignment systems?
We begin by describing the database used for this study and then address these questions in
turn.
2
Data, analysis and coding methods
We surveyed 260 languages and coded their agreement systems for alignment patterns as part of the AUTOTYP database of grammatical relations.³
Unlike Siewierska (2003), whose focus was on person agreement only, we also considered instances of gender, number and honorificity agreement. To keep our dataset manageable in size, however, we treated gender-differentiating agreement markers as if they were just one marker, i.e. we did not track the difference between, for example, third person masculine vs. feminine agreement, but simply third person gender agreement. We considered a particular person-number-gender combination as overtly marked if it is overtly marked for at least one gender.
Also departing from Siewierska, we only looked at grammatical agreement in the sense of
Bickel & Nichols (2007), i.e. we only coded verbal markers of argument properties that can in
principle co-occur with a coreferential noun phrase in the same clause (regardless of whether
this co-occurrence is frequent or rare in discourse). Grammatical agreement in this sense corresponds to what Siewierska (2004) treats as the union of syntactic and ambiguous agreement.
Cliticized or incorporated pronouns that cannot co-occur with co-referential noun phrases were
not analyzed as instances of agreement.
For coding alignments, we considered only the coding of S, A, and P argument roles and excluded arguments of ditransitive verbs from our present purview. S, A, and P are defined by numerical valency and semantic entailment properties of lexical predicates, following earlier
³ The dataset used in this study is available for download at http://www.spw.uzh.ch/autotyp/available.html
proposals of ours (Bickel & Nichols 2009, Bickel et al. 2010, Bickel 2011a, Witzlack-Makarevich
2011). We furthermore limited our attention to lexical predicates that qualify as open, default classes of their language and excluded predicates with non-canonical agreement patterns, other special behavior, or lexical constraints of any kind.
We analyzed the alignment of agreement systems under the two criteria of (i) Trigger Potential, i.e. which argument(s) trigger(s) agreement; and (ii) identity of Morphological Marking, which implies identity of both phonological form and morphological slot. The formulation of the second criterion is similar to Siewierska’s Form and Position criteria but departs from her original proposal in so far as we took into consideration individual slots in which given phonological forms appear in the string of morphemes, rather than a binary prefix vs. suffix distinction.
The two criteria basically equate the Trigger Potential with syntax and Morphological Marking with morphology, allowing us to frame the question in terms of possible discrepancies between how argument roles are aligned in agreement syntax as opposed to agreement morphology. Agreement syntax in this sense refers to whether or not the verb – or more generally, any predicate complex that heads a clause – registers features contained in S, A or P and therefore systematically interacts with these arguments. If a specific argument does not trigger agreement at all (e.g. P arguments in German), this means that the verb does not interact with this argument at all in the syntax. Such questions of verb-argument interaction are fundamental for the organization of syntax, typically requiring specific modeling in formal theories.
This conceptualization of Trigger Potentials and Morphological Marking as two dimensions of agreement does not match traditional grammar, where they are not kept separate. For data like those from Tirmaga in Table 1, one would traditionally focus on the form and position of markers and argue that the paradigms show (mostly) accusative alignment. The fact that all three arguments behave alike in triggering agreement would not be considered an interesting fact. For other languages, however, traditional grammar would focus precisely on triggering behavior and not consider form and position criteria. For German, for example, one would traditionally say that only S and A arguments trigger agreement; one would not say that German is accusatively aligned because S and A have overt agreement markers whereas P shows zero markers. Applying different criteria in Tirmaga and in German is typologically inconsistent, as Siewierska has noted.
Furthermore, it is essential to keep apart cases (i) where an argument has a Trigger Potential but the morphology happens to be zero in a specific category (such as third person singular in Tirmaga) and (ii) where an argument never triggers agreement (like German P arguments). In type (i), the grammar of the verb has to check for the presence of specific features in all arguments, and as a result, the verb enters a specific morphosyntactic relationship with all arguments. The same morphosyntactic relationship does not exist between the verb morphology and arguments that never trigger verb agreement, i.e. in type (ii). In other words, there is a fundamental difference between accusative alignment in a language like Tirmaga and accusative alignment
An alternative approach would be to take into account just phonological properties, abstracted, if possible, across
positions. While possible and interesting, we leave the exploration of this alternative for another occasion.
in a language like German, and this difference can only be captured by following Siewierska’s innovation and considering Trigger Potentials independently of Morphological Marking.
Trigger Potential is a notion that is uniquely tied to agreement: it is only for agreement that it makes sense to ask whether there exists a specific syntactic relationship between the verb and features of a specific set of arguments. There is no equivalent of this in case assignment: the syntactic relationship that is marked by case exists independently of case assignment, as argumental NPs always bear a syntactic relationship to the predicate since they are assigned a semantic role by it. The relationship is not established by the presence of case morphology, and so one would not say that P arguments in, say, Thai bear no syntactic relation to the verb just because there is no case marking. Instead, case morphology can be said to mark the existing relationship. As a result of this, the absence of case morphology is equivalent to zero marking and not to the absence of syntactic relationships. Therefore, in contrast to agreement marking, case marking can be fully determined by considering Morphological Marking; the Trigger Potential has no role to play here.
When looking at Morphological Marking in agreement, we considered which roles trigger
overt agreement morphology per referential category (i.e. per every person/number combination) in every relevant morphological slot in the predicate. Consider the data in (4) from the
Uto-Aztecan language Pipil:
(4)
Pipil (Uto-Aztecan language; Campbell 1985)
a. ni-panu
   1sS/A-pass
   ‘I pass’
b. ni-mits-ita-k
   1sS/A-2sP-see-PST
   ‘I saw you’
c. ti-nech-ita-k
   2sS/A-1sP-see-PST
   ‘You saw me’
d. panu
   [3S/A-]pass
   ‘he passes’
e. ki-neki
   [3S/A-]3sP-want
   ‘he wants it’
f. ni-k-neki
   1sS/A-3sP-want
   ‘I want it’
If we consider the morphological realization of agreement in the first prefix slot in Pipil, we observe S=A≠P alignment for the first person singular: there is ni- ‘1sS/A’ for S in (4a) and for A in (4b), but zero exponence for the first person singular P role in this slot, as (4c) shows; first person singular P is instead marked in the second slot (nech- in 4c). The situation is identical for the first person plural and for the second person. However, when we consider the morphological marking of the third person within the first prefix slot, we observe that the three roles behave alike (S=A=P) in that none of them shows up with an overt morphological trace in this slot (be it as a dedicated marker or as a portmanteau affix, cf. 4d-4f). The markers in the first prefix slot here only register first person (4f). This is different for the second prefix position, filled by mits- in (4b), ki- in (4e) and k- in (4f). Here one obtains S=A≠P alignment, since the markers that appear in this slot encode the P argument, as opposed to S and A, which leave no overt morphological trace in this slot.
The situation is again different in the suffix position. Here we have neutral alignment for singular arguments, since this category never results in overt morphology across all persons.
For plural arguments, however, there is an opposition between overt marking of S and A (cf. -t
in 5a and 5b) vs. no marking for P (5c), again across all persons:
(5)
Pipil (Uto-Aztecan language; Campbell 1985)
a. panu-t
   [3S/A-]pass-pS/A
   ‘they pass (S)’
b. tech-ita-ke-t
   1pP-see-PST-pS/A
   ‘they saw us (A)’
c. ni-kin-ita-k
   1sS/A-3pP-see-PST[-pP]
   ‘I saw them (P)’
The example of Pipil also shows that alignment can differ across referential categories. In the first prefix we get S=A=P for the third person and S=A≠P elsewhere; in the suffix slot, we get S=A=P in the singular and S=A≠P in the plural. The second prefix slot, by contrast, shows consistent S=A≠P alignment for all referential categories.
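The slot-by-slot reasoning applied to Pipil can be sketched in a few lines of Python. This is only an illustration, not the procedure implemented in the AUTOTYP database: the function name `alignment` and the dictionary layout are assumptions made here, and the marker inventory is restricted to forms cited in (4) and (5).

```python
from typing import Optional

def alignment(s: Optional[str], a: Optional[str], p: Optional[str]) -> str:
    """Classify how one slot treats S, A and P for one referential category.

    Each argument is the overt marker found in that slot, or None for
    zero exponence. (Hypothetical helper, for illustration only.)
    """
    if s == a == p:
        return "S=A=P"   # neutral, including the case where all three are zero
    if s == a:
        return "S=A≠P"   # accusative
    if s == p:
        return "S=P≠A"   # ergative
    if a == p:
        return "S≠A=P"   # horizontal
    return "S≠A≠P"       # tripartite

# (S, A, P) markers per referential category, as described in the text.
slots = {
    "first prefix":  {"1s": ("ni-", "ni-", None), "3s": (None, None, None)},
    "second prefix": {"2s": (None, None, "mits-")},
    "suffix":        {"singular": (None, None, None), "plural": ("-t", "-t", None)},
}

for slot, categories in slots.items():
    for category, (s, a, p) in categories.items():
        print(slot, category, alignment(s, a, p))
```

Applied to these entries, the sketch returns S=A≠P for the first person singular and S=A=P for the third person in the first prefix slot, in line with the description above.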
In case a language has multiple allomorphs of agreement markers (e.g. conditioned by inflectional classes), we proceeded as follows: morphologically overt allomorphs were encoded as the same marker for the present purposes. If one of the allomorphs has zero exponence, we considered the size and productivity of individual inflectional classes. Only the major pattern of marking – either in terms of the number of inflectional classes or, where the information is available, in terms of the class size – was considered. For instance, for Latvian three conjugation classes with several subclasses are differentiated. Class II (also referred to as “long”) and the overwhelming majority of verbs in Class I (called “short”) have zero exponence for the second person singular present, whereas the verbs of Class III (“mixed”) use the suffix -i in this context. As the most productive and numerous class is Class II, the exemplar paradigm selected for Latvian has no overt marker in the second person singular present (cf. Holst 2001, Mathiassen 1997, Nau 1998).
For easy data entry, we only coded overt markers. The distribution and semantics of zero exponents were then automatically inferred with the help of an ancillary database that tracks all referential features that an agreement system is sensitive to. Thus, in the case of the Pipil first prefix slot, zero exponence of S/A agreement for third person forms is not explicitly coded in the database, but it can be inferred from the list of the referential types of Pipil, which includes three persons and two numbers. The same holds for the singular arguments in the suffix slot.
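The inference of zero exponents from a list of referential types can be sketched as follows. This is a minimal illustration under our own assumptions, not the actual AUTOTYP implementation: the function `fill_zero_exponents` and the data layout are hypothetical, and the example is limited to the singular forms of the first prefix slot that are attested in (4), since the plural prefixes are not cited in the text.

```python
from itertools import product

ROLES = ("S", "A", "P")

def fill_zero_exponents(overt, categories, roles=ROLES):
    """Expand a list of overt markers into a complete paradigm.

    Every (category, role) cell without an overt entry is filled with None,
    standing for zero exponence. (Hypothetical helper, for illustration only.)
    """
    return {(cat, role): overt.get((cat, role))
            for cat, role in product(categories, roles)}

# Only overt markers are entered by hand (Pipil first prefix slot, singular).
overt_first_prefix = {
    ("1s", "S"): "ni-", ("1s", "A"): "ni-",
    ("2s", "S"): "ti-", ("2s", "A"): "ti-",
}

# Referential types come from a separate list, here three persons in the singular.
singular_categories = [person + "s" for person in ("1", "2", "3")]

paradigm = fill_zero_exponents(overt_first_prefix, singular_categories)
for (category, role), marker in paradigm.items():
    print(category, role, marker if marker is not None else "zero")
```

The third person cells are never entered by hand; they come out as zero only because the list of referential types says that a third person exists.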
Since agreement systems sometimes undergo splits conditioned by temporal-aspectual properties of the clause (e.g. past vs. non-past, perfective vs. imperfective), we tracked the effects of these conditions in the database and considered the affected alignment patterns as individual datapoints. We refer to these patterns as constituting agreement ‘systems’ within a language in the following. The database thus contains a total of 289 systems from 260 languages.
3
Does it make a difference?
There are many languages where the alignment of Trigger Potentials deviates from the alignment of Morphological Marking. The extent of such discrepancies can be quantified by counting how often Morphological Marking shows alignment that is identical to the alignment of
All data processing, analysis and visualization was done in R (R Development Core Team 2012), with the added
packages lattice (Sarkar 2010) and vcd (Meyer et al. 2009).
the Trigger Potential. In the English present tense, for example, one marker (-s) is identical with the alignment of the Trigger Potential (which is S=A≠P) and one marker (zero) differs from it (being S=A=P), resulting in an identical alignment proportion of .5 for this system. The histogram in Figure 1 shows the frequency of identical alignment proportions binned into ten intervals running from [0,.1] to (.9,1]. The rightmost interval consists almost completely of systems with no discrepancy at all (111 systems with an identical alignment proportion of 1, compared to 2 systems with a proportion between .9 and 1); the leftmost interval contains 19 systems with no identical alignment at all and 46 systems with identical alignment proportions greater than 0 and smaller than or equal to .1.
Figure 1: Histogram of the proportions of identical alignment between agreement morphology and trigger potentials in each system (N = 289). Y-axis: number of systems.
In total, almost two thirds (N = 178) of the 289 systems in our database show at least some kind of discrepancy between alignments in terms of Trigger Potentials and alignments in terms of Morphological Marking. The histogram furthermore shows that discrepancies tend to be severe: 43% (N = 125) show an identical alignment value below (or equal to) .5. These findings suggest that Siewierska’s Problem is a serious one. It is imperative that typologies of alignment in agreement be clear on whether they refer to trigger potential or to agreement morphology and apply criteria consistently across languages. The two ways of looking at alignment differ substantially. While this is an important insight with many practical consequences for typology’s day-to-day business, the theoretically more pressing question concerns the source and consequences of such discrepancies between syntax and morphology. We take up this issue in the following.
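The identical alignment proportion and its binning can be made concrete with a short Python sketch. The function names are our own, and the code is not the R code used for the study; it only illustrates the calculation for the English example and rechecks the counts reported above. Exact fractions are used so that values on an interval boundary (such as .5) fall into the intended bin.

```python
import math
from fractions import Fraction

def identical_proportion(marker_alignments, trigger_alignment):
    """Share of markers whose alignment equals the Trigger Potential alignment."""
    same = sum(1 for m in marker_alignments if m == trigger_alignment)
    return Fraction(same, len(marker_alignments))

def bin_index(proportion, n_bins=10):
    """Index of the interval [0,.1], (.1,.2], ..., (.9,1] holding the proportion.

    Expects an exact value (int or Fraction) to avoid rounding at bin edges.
    """
    if proportion == 0:
        return 0
    return math.ceil(proportion * n_bins) - 1

# English present tense: -s aligns S=A≠P, the zero marker aligns S=A=P,
# and the Trigger Potential alignment is S=A≠P.
english = identical_proportion(["S=A≠P", "S=A=P"], "S=A≠P")
print(english, bin_index(english))

# Counts reported in the text.
total, no_discrepancy = 289, 111
with_discrepancy = total - no_discrepancy
print(with_discrepancy, round(with_discrepancy / total, 2))
print(round(125 / total, 2))
```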
4
Sources of the discrepancies
Two causes of discrepancies are trivial. First, if a referential type, e.g. third person singular, is
always zero-marked (i.e. in any role) in a particular slot, its alignment is neutral, while overt
markers can be distributed both according to neutral as well as according to any other alignment
pattern. Second, tripartite alignment (S≠A≠P) is logically possible only with Morphological
Marking. Trigger Potentials can never have this type of alignment: if all roles trigger agreement,
this leads to neutral (S=A=P) alignment, no matter how diverse the morphological shapes and positions may be; if only a subset triggers agreement, this leads to accusative (S=A≠P), ergative (S=P≠A) or horizontal (S≠A=P) alignment, again regardless of the morphological structure. This situation can be illustrated with the morphology of second person agreement in the Mayan language Ch’orti’:
(6)
Ch’orti’ (Mayan; Quizar 1994)
a. i-wayan.
   2sS-sleep
   ‘you sleep (S)’
b. a-ira-en.
   2sA-see-1sP
   ‘you see me (A)’
c. in-ira-et.
   1sA-see-2sP
   ‘I see you (P)’
In the incompletive aspect there are two dedicated markers for the second person singular S (6a) and A arguments (6b). The P argument is not marked with a prefix, but with a suffix instead (6c). Thus, although the individual markers are different for the three argument roles S, A, and P, in terms of Trigger Potential the alignment is neutral, since all three argument roles equally trigger agreement.
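Why Trigger Potentials can never be tripartite follows from the fact that they sort roles into at most two groups: those that trigger agreement and those that do not. The following Python sketch (with a hypothetical helper `trigger_alignment`, introduced here only for illustration) enumerates all eight possible sets of triggering roles and confirms that S≠A≠P never arises.

```python
from itertools import combinations

ROLES = ("S", "A", "P")

def trigger_alignment(triggers):
    """Alignment by Trigger Potential.

    Roles are grouped only by whether they trigger agreement at all,
    so at most two groups can be distinguished.
    """
    s, a, p = (role in triggers for role in ROLES)
    if s == a == p:
        return "S=A=P"   # all roles trigger agreement (or none does)
    if s == a:
        return "S=A≠P"
    if s == p:
        return "S=P≠A"
    return "S≠A=P"       # with two truth values, A and P must pattern together here

# Every possible set of triggering roles, from the empty set to {S, A, P}.
all_subsets = [set(c) for n in range(len(ROLES) + 1)
               for c in combinations(ROLES, n)]
observed = {trigger_alignment(t) for t in all_subsets}
print(sorted(observed))

# Ch'orti' second person singular: all three roles trigger agreement,
# although the three markers in (6) are distinct in form.
print(trigger_alignment({"S", "A", "P"}), len({"i-", "a-", "-et"}))
```

German, where only S and A trigger agreement, comes out as S=A≠P under the same function.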
Excluding all instances of zero exponence and of tripartite alignment in morphology brings down the number of systems with at least one discrepancy to 122 (42%) out of 289 systems (from 178 or 62%, cf. above). These remaining discrepancies are empirical observations, and not logically derivable from how alignment is defined. In other words, it could well be the case that languages would tend to favor similar alignments in the morphology as in the syntax, perhaps in response to iconicity principles. In that case, we would expect, for example, that neutral alignment in the syntax would tend to go together with neutral alignment in the morphology, so that we would find neutral markers in most morphological slots. Systems like this are apparently rare. What comes closest corresponds to what is sometimes called hierarchical agreement. A case in point is agreement prefixes in Plains Cree. Here, categories like second person trigger agreement in all three roles, and these roles receive exactly the same morphological marking (the prefix ki-):
(7)
Plains Cree (Algonquian; Dahlstrom 1991)
a. ki-pimipahtā-n.
   2-run-sSAP
   ‘you (sg) run (S)’
b. ki-pēhtaw-i-n.
   2-hear-2→1-sSAP
   ‘you (sg) hear me (A)’
c. ki-pēhtaw-iti-n.
   2-hear-1→2-sSAP
   ‘I hear you (sg) (P)’
But this seems to be very strongly disfavored worldwide and markers tend to differentiate roles, thus leading to discrepancies.
Discrepancies can arise independently in every slot of the agreement morphology and in every referential category: while in Cree the alignment of the prefix slot is identical to the alignment of the Trigger Potential for the first and second person, the suffixes show various discrepancies. Consider, for example, the distribution of the second person plural suffix -nāwāw in one of the suffix slots (suffix slot 5):
(8)
a. ki-pimipahtā-nāwāw.
   2-run-2p
   ‘you (pl.) run (S)’
b. ki-wāpam-i-nān.
   2-see-2→1-1p
   ‘you (pl.) see us (A)’
c. ki-wāpam-iti-nāwāw.
   2-see-1→2-2p
   ‘I see you (pl.) (P)’
Whereas the S and P arguments of this referential type are marked with -nāwāw, as in (8a) and (8c), the A argument of the same referential type is not marked in this slot; instead we find a first person suffix -nān (8b). This results in ergative alignment.
In general, each agreement category in each slot allows for maximally four ways in which overt morphology can align roles (S=A=P, S=A≠P, S≠A=P, S=P≠A) if we exclude tripartite alignment (following the reasoning above). Therefore, the range of logically possible opportunities for discrepancies rises with the number of agreement categories and agreement slots. For instance, Jero (Opgenort 2005) has 11 referential categories for the S argument (three person categories, three number categories and an inclusive vs. exclusive distinction in the first person of both dual and plural). The marking of the A argument of each of these 11 types can be conditioned by the P argument, which is again of one of these 11 types (e.g. A of the first person singular when acting on the second person singular P, A of the first person singular when acting on the second person plural P, etc.). In the same fashion, the marking of the P argument across all 11 referential types varies with respect to the A argument and its referential types. To calculate alignment we take an S argument of a particular referential type and compare it with the A argument of the same referential type under one of the 11 conditions and with the P argument of the same referential type under one of the 11 conditions (Witzlack-Makarevich 2011, Witzlack-Makarevich et al. 2011). This results in 11³ alignment statements per agreement slot. Jero has 3 slots relevant for agreement, and the number of alignment statements for each of them is theoretically 11³, that is, 11³ × 3 = 3993 alignment statements in total. The actual number of alignment statements is, however, somewhat lower than this number of combinatorial possibilities, as particular referential categories or referential category combinations are non-existent or belong to a different (e.g. reflexive) paradigm. Nevertheless, there is still a very large space of opportunity for discrepancies, easily extending into several thousands when there are many categories and a complex system of morphological slots.
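The size of this opportunity space for Jero can be verified with a brief calculation. The sketch below is only an arithmetic check in Python with variable names of our own choosing; it enumerates one alignment statement for each combination of referential type, condition on the A marking, and condition on the P marking.

```python
from itertools import product

N_TYPES = 11   # referential categories of Jero, as reported above
N_SLOTS = 3    # agreement slots

# One statement per (referential type, A condition, P condition).
statements_per_slot = sum(1 for _ in product(range(N_TYPES), repeat=3))
total_statements = statements_per_slot * N_SLOTS

print(statements_per_slot, total_statements)
```

This reproduces the theoretical maximum of 11³ = 1331 statements per slot and 3993 in total; the actual number is lower for the reasons given above.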
Interestingly, languages seem to exploit these possibilities to a substantial extent: Figure 2 plots the proportion of discrepancies, i.e. alignment statements that differ between Morphological Marking and Trigger Potential, per system against the number of category/slot combinations that are distinguished by that system. The data are limited to nontrivial cases of non-identical alignments, i.e. following the reasoning above, we consider here only overt morphology and exclude tripartite alignment. The plot suggests that the opportunity space for discrepancies becomes heavily, and often fully, exploited with systems that contain more than 6 categories (67% discrepancies with 7 categories in 6 systems, 34% with 8 categories in 17 systems, 88% with 9 categories in 8 systems etc.). Systems with fewer categories tend to show alignments that match the alignment of agreement trigger potentials either completely (displayed in the graph as thin horizontal lines at 0% with systems of 1, 2, 4 or 5 categories) or to a large extent (12.5% discrepancies with 3 categories in 8 systems, 14% with 6 categories in 14 systems).
See Witzlack-Makarevich et al. (2011) on deriving basic alignment types from systems with hierarchical and co-argument-conditioned alignment.
Note that a language like English counts as having 1 agreement category in the non-past (third person singular), i.e. we counted the number of overtly marked categories, not the number of feature values in oppositions.
Figure 2: Proportion of alignment discrepancies in overt agreement morphology vs. agreement trigger potentials (y-axis) in correlation with the number of category/slot combinations defined per agreement system (x-axis, plotted on a log₁₀ scale). Bar width is proportional to the count of systems (from the total of N = 289) within each given number of category/slot combinations.
It is not immediately clear why languages exploit the opportunity space for discrepancies so strongly. One possibility is that complex morphological systems may have developed through repeated accretion of freshly grammaticalized markers, each giving rise to new alignment patterns somewhere in the system. For example, if a language develops P agreement based on accusatively-marked pronouns, one expects the morphology to keep the emerging agreement markers separate and in a different position from older agreement markers. The result would be neutral alignment in terms of trigger potentials, but S=A≠P alignment in the morphological structure for this position. This is a plausible scenario and can be observed, for example, throughout Romance. The question whether this is a universally valid scenario, however, must be left for detailed research on the extent to which agreement systems reflect layered grammaticalization of case-marked pronouns. For now, we conclude that richer paradigms lead to more discrepancies and that 7 categories represent the critical threshold for this.
5
Implications for typological generalizations
Another question that arises from our findings concerns the kinds of alignment where discrepancies are concentrated. Table 2 gives an overview of the distribution of alignment types in overt Morphological Marking and among Trigger Potentials, again excluding tripartite alignment. The strongest deviation, alone accounting for 51% of the total χ²-deviation (284.41), comes from the increased proportion of neutral alignments among agreement Trigger Potentials (with 41% as compared to 14% in the morphology). While these discrepancies are not logically necessary, they reflect the widespread pattern in agreement systems illustrated by the Tirmaga, Pipil and Ch’orti’ examples above: although there is agreement morphology for all three arguments, the morphology makes distinctions, mostly aligning A with S.
REVISED DRAFT – March 15, 2013
         Morphological Marking   Trigger Potential
S=A=P    0.14                    0.41
S=A≠P    0.37                    0.55
S=P≠A    0.21                    0.03
S≠A=P    0.28                    0.01

Table 2: Proportion of alignments in overt morphology compared to
trigger potentials, excluding tripartite alignment (N = 289)
The flip side of this is a heavily increased proportion of ergative and S≠A=P alignments in
Morphological Marking (together 49% vs. 4% in Trigger Potentials). This could potentially
challenge the relatively well-established principle that verb agreement is strongly biased against
S≠A alignment patterns (e.g. Siewierska 2004). Given the discrepancies we noted above, it is
possible that such an anti-ergative bias only holds for relatively simple agreement systems
where discrepancies are more limited (cf. Figure 2).
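The 49% vs. 4% contrast can be read directly off Table 2. As a minimal sketch (the dictionary layout and function name are illustrative assumptions, not part of the study's tooling), summing the two rows that separate S from A gives the two figures:

```python
# Proportions from Table 2 as (Morphological Marking, Trigger Potential).
# The data structure and names are illustrative assumptions.
TABLE2 = {
    "S=A=P": (0.14, 0.41),
    "S=A≠P": (0.37, 0.55),
    "S=P≠A": (0.21, 0.03),
    "S≠A=P": (0.28, 0.01),
}

def share(alignments, column):
    """Sum the proportions of the given alignment types in one column (0 or 1)."""
    return round(sum(TABLE2[a][column] for a in alignments), 2)

non_sa = ["S=P≠A", "S≠A=P"]        # alignment types that separate S from A
morph_non_sa = share(non_sa, 0)    # share in Morphological Marking
trigger_non_sa = share(non_sa, 1)  # share in Trigger Potential
print(morph_non_sa, trigger_non_sa)
```

Each column of the table sums to 1, so the remaining share in each column is what falls under S=A(=P).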
Figure 3 appears to confirm this suspicion since more complex systems (to the right on the
graph) indeed tend to have a lower proportion of S=A(=P) alignments, i.e. more S≠A patterns.
Decreased S=A(=P) proportions are less common among simpler systems (to the left of the
graph), where the only notable exception consists of a few radically ergative systems with one
single agreement category (e.g. gender agreement in Nakh-Daghestanian, represented here by
5 systems).
Figure 3: Proportion of S=A(=P) alignments in overt agreement morphology (y-axis) in
correlation with the number of category/slot combinations defined per agreement system (x-axis,
plotted on a log₁₀ scale). Bar width is proportional to the count of systems (from the total of
N = 289) within each given number of category/slot combinations.
The only other cases in our database are ergative agreement in Nias (Austronesian) and in Hurrian, and S-only
agreement in Tuvaluan (Austronesian), which results in S≠A=P alignment.
However, as shown by the thin bar widths on the right-hand side of Figure 3, more complex
systems are much rarer than simpler systems (at least in our database, but we believe this to
be fairly representative of worldwide distributions). Also, they tend to be concentrated in only
a few families: in our database of 289 systems, there are only 4 families (Algonquian, Nilotic,
Tacanan and the Kiranti group of Sino-Tibetan) and the family-level isolate Ainu which contain
at least one system that is complex in the sense that it contains at least 60 category/slot
combinations. When one surveys the proportions of S=A(=P) alignments in these systems (see
the Appendix for a complete list), one notices that they hardly ever fall below 50%. This reflects
a general trend, also found in families with members showing moderate complexity: Table 3
lists the mean proportions of S=A(=P) (and, if applicable, standard deviations) for all families
where this mean is below 1. There are only seven further families that have mean proportions
of S=A(=P) below or equal to 0.5, i.e. families that show a possible trend favoring ergative
alignments. Nakh-Daghestanian and Algonquian are the only families in the table where this
trend is relatively compact and suggestive of a family-wide feature. At the same time, these
two families show relatively complex systems (with between 8 and over 10,000 category/slot
combinations). The other families in Table 3 with mean proportions below or equal to 0.5 either
show large standard deviations (Mayan, Macro-Ge) or are represented only by single members
(Hurrian, Zuni, Muskogean).
Family             N (systems)  N (cat./slot comb.)  µ     std. dev.
Hurrian            1            1                    0.00  -
Nakh-Daghestanian  5            (1, 1)               0.00  0.00
Zuni               1            6                    0.00  -
Mayan              11           (8, 17)              0.39  0.49
Algonquian         10           (2266, 10047)        0.40  0.09
Macro-Ge           2            (4, 6)               0.50  0.71
Muskogean          1            8                    0.50  -
Kiranti            29           (457, 1889)          0.63  0.10
Tacanan            1            66                   0.68  -
Ainu               1            85                   0.75  -
Sepik              5            (1, 9)               0.80  0.45
Austronesian       16           (1, 14)              0.81  0.40
Nilotic            5            (1, 210)             0.87  0.18
Indo-European      1            (1, 13)              0.95  0.22

Table 3: Mean proportions µ of S=A(=P) in overt morphology below
1 in families, ordered by proportions. N (cat./slot comb.) shows the range
of the number of category/slot combinations across all members of the
family in our database
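The family-level summaries in Table 3 can be recomputed from the per-system proportions listed in the Appendix. The following is a minimal sketch for Algonquian, assuming that µ is the arithmetic mean and that the standard deviation is the sample standard deviation (n − 1 in the denominator); the text does not state which estimator was used, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Pr(S=A) for the ten Algonquian systems, copied from the Appendix
# (Arapaho, Atikamekw, Blackfoot, Cheyenne, Plains Cree, Menomini,
#  Micmac, Munsee, Eastern Ojibwa, Passamaquoddy).
algonquian = [0.287, 0.450, 0.578, 0.338, 0.517,
              0.372, 0.353, 0.381, 0.350, 0.421]

mu = mean(algonquian)    # family mean of S=A(=P) proportions
sd = stdev(algonquian)   # sample standard deviation (assumed estimator)

print(round(mu, 2), round(sd, 2))
```

Under these assumptions the rounded values agree with the Algonquian row of Table 3 (0.40 and 0.09). The population standard deviation would round to 0.08 instead, which is why the sample estimator is assumed here.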
60 is a reasonable threshold for calling a system ‘complex’ because there is a natural gap in Figure 3 between
systems with up to 30 and systems with more than 60 category/slot combinations.
This suggests that decreased S=A(=P) proportions are limited to only a few families and are
hardly ever a dominant trait of entire families. Given this, we expect that paradigm complexity
has little impact on the universal trend towards S=A alignment in agreement morphology, i.e.
that the correlation noted in Figure 3 only reflects effects in very few languages and systems
and is not a robust principle of typology. To test this hypothesis, we applied Bickel’s (2011b, in
press) Family Bias Method to our data. This method estimates statistical signals for diachronic
biases from their expected synchronic results: if S=A alignments outnumber S≠A alignments
significantly (under binomial testing) in a family, a change towards S=A alignments in this
family was more likely than a change away from it (either because the proto-paradigm(s) showed
S=A, which then hardly ever got lost, or because S=A was not there and then it was innovated
early or often in the family). If there is no significant synchronic preference, by contrast, no
signal can be inferred because, in this case, there was either no diachronic bias towards a
particular structure, or the difference in biases was too small to leave a signal, or the family is too
young to allow a signal to show up. Using extrapolation methods, signals for diachronic biases
can also be estimated for isolates and small families.¹⁰
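The core decision step for a single large family can be sketched as an exact binomial test. This is a simplified illustration under stated assumptions, not the implementation in the familybias R package: the function names, the significance level of 0.05, the null probability of 0.5, and the example counts are all hypothetical, and the extrapolation to isolates and small families is omitted.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))

def family_bias(n_sa, n_non_sa, alpha=0.05):
    """Classify one family from its counts of S=A and S≠A systems.
    alpha = 0.05 is an assumed significance level."""
    p_value = binom_two_sided_p(n_sa, n_sa + n_non_sa)
    if p_value >= alpha:
        return "no bias"
    return "bias towards S=A" if n_sa > n_non_sa else "bias towards S≠A"

# Hypothetical counts, for illustration only.
print(family_bias(27, 2))
print(family_bias(3, 2))
print(family_bias(1, 9))
```

With these hypothetical counts, a family with 27 S=A and 2 S≠A systems yields a signal towards S=A, whereas a family with 3 vs. 2 systems yields no signal, mirroring the point that small or evenly split families do not allow an inference.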
In order to find out whether paradigm complexity has an effect on diachronic biases towards
or against S=A alignments in agreement morphology, families were grouped into simple
(between 1 and 5 categories), moderately complex (between 6 and 30 categories or category
combinations) and highly complex (above 60 categories or category combinations). The choice
of cut-off points is arbitrary but it is based on the fact that 6 categories is the first point (after
1) at which S=A proportions fall below 1.0 in Figure 3 and that, as noted earlier, there is a gap
between systems with up to 30 and systems with more than 60 category combinations.¹¹
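The grouping can be written down as a small classification function. This is a sketch only: the function name is hypothetical, and returning None for counts between 31 and 60 is an assumption made here because the text reports a gap in that range and assigns it to no group.

```python
def complexity_class(n_combinations):
    """Map a number of category/slot combinations to a complexity group,
    using the cut-off points given in the text (1-5, 6-30, above 60)."""
    if n_combinations < 1:
        raise ValueError("a system needs at least one category")
    if n_combinations <= 5:
        return "simple"
    if n_combinations <= 30:
        return "moderately complex"
    if n_combinations > 60:
        return "highly complex"
    return None  # 31-60: not covered by any group (assumed handling)

# Category/slot counts taken from Table 3.
for family, n in [("Hurrian", 1), ("Muskogean", 8), ("Tacanan", 66), ("Ainu", 85)]:
    print(family, complexity_class(n))
```

Families whose members span several groups would need to be split before classification, which this sketch does not attempt.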
Figure 4 summarizes the results. Almost all families are diachronically biased towards
S=A(=P) alignments in their agreement morphology, and this preference is observed to a
comparable extent across degrees of paradigm complexity.¹² The summary figure also includes the
results of a separate analysis of diachronic biases in trigger potentials (rightmost bar), and the
preference for S=A(=P) alignment is in the same ballpark here as well. We can conclude that
agreement systems strongly prefer S=A(=P) alignments in both Morphological Marking and
Trigger Potential. Deviations from this are limited to a few groups and languages with high
(such as Algonquian) and moderate complexity (such as Mayan).
¹⁰ The method is implemented in and available as an R package (Zakharko & Bickel 2011). We used the method
with the default settings of the package.
¹¹ When families were diverse with regard to these categories of complexity (e.g. Indo-European or Austronesian,
cf. the range of category counts in Table 3), we split the family into smaller groups that fell consistently into one
or the other group. Whenever possible, such groups were based on known genealogical subgroups, as defined in
Nichols & Bickel (2009).
¹² A likelihood ratio χ² test comparing a loglinear model with vs. without an interaction between bias direction ×
complexity type suggests independence: Δχ² = 2.29, df = 2, p = .32.
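For a two-way table, comparing loglinear models with and without the interaction term amounts to a likelihood-ratio (G) test of independence. The sketch below computes that statistic and, for df = 2, the exact p-value, since the χ² survival function with two degrees of freedom is exp(−x/2). The function names and the example counts are hypothetical; the 2 × 3 layout (two bias directions by three complexity types) is inferred from df = 2 and is an assumption.

```python
from math import exp, log

def g_statistic(table):
    """Likelihood-ratio statistic G = 2 * sum(obs * ln(obs / expected))
    for a two-way contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing
                expected = row_totals[i] * col_totals[j] / n
                g += 2.0 * observed * log(observed / expected)
    return g

def chi2_p_value_df2(x):
    """Upper-tail probability of a chi-square variate with df = 2."""
    return exp(-x / 2.0)

# p-value for the statistic reported in the footnote.
p_reported = chi2_p_value_df2(2.29)
print(round(p_reported, 2))

# Hypothetical counts with identical proportions in both rows,
# i.e. a table that shows perfect independence.
print(g_statistic([[10, 20, 30], [1, 2, 3]]))
```

The first call recovers the reported p = .32 from the statistic 2.29; the second shows that G is zero when bias direction and complexity type are exactly independent.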
[Figure 4. Columns: simple, moderately complex, highly complex, trigger potentials. Tile
categories: no bias, bias towards S≠A, bias towards S=A.]

Figure 4: Proportion of estimated diachronic family biases towards S=A(=P)
vs. S≠A alignments in Morphological Marking across different degrees of
paradigm complexity, and among agreement Trigger Potentials. Tile sizes
are proportional to frequencies (Meyer et al. 2009); a small circle indicates
zero counts.
6
Conclusions
Siewierska (2003) raised an important issue for typologies of alignment. Looking at alignment
patterns in agreement systems in terms of the type of roles that can trigger agreement in the
syntax (i.e. Trigger Potential) leads to very different characterizations than when one examines
alignment patterns for specific agreement markers in specific morphological positions (i.e.
Morphological Marking). Discrepancies are in fact severe, and it is imperative that typology
carefully distinguish between different notions of alignment in agreement systems. Some of the
sources of these discrepancies are trivial and have to do with the logic of determining
alignments. However, we also observed (Section 4) that a substantial proportion of discrepancies is
empirical in nature: agreement morphology could in principle be more in line with agreement
syntax. At present it is not clear to us why morphological systems should exploit the possibility
for discrepancies as strongly as they do, but we suspect that this has to do with the complex
histories of grammaticalizing layer after layer in agreement systems. Such a scenario would
explain why discrepancies become stronger the more complex paradigms are in terms of the
number of referential categories and category combinations they are sensitive to.
While the study of discrepancies that Siewierska called for gives new insights into possible
historical scenarios on how alignment patterns have developed in agreement systems, it could
in principle challenge received universal principles on preferred alignments in such systems.
As we showed in Section 5, however, confounding effects are severely limited: there are only
very few language families in the world where there seems to have been a bias away from S=A
and towards S≠A alignments, and this is true regardless of whether one looks at agreement
syntax or agreement morphology. There is a slight preference for S≠A alignments in more
complex paradigms, but it is only in a handful of language families that this is a significant and
diachronically relevant trend (e.g. in Algonquian). In all other families, there is a very strong
overall bias towards S=A, even when paradigms are exceedingly complex, as, for instance, in
Kiranti.
Appendix: Proportion of S=A(=P) in overt morphology and number of category/slot combinations per system in families where at least one system has more than 60 combinations
Family      Language             System         Pr(S=A)  N (category/slot comb.)
Ainu        Ainu                 -              0.753    85
Algonquian  Arapaho              INDEP          0.287    2495
Algonquian  Atikamekw            INDEP.NPST     0.450    8117
Algonquian  Blackfoot            INDEP          0.578    2817
Algonquian  Cheyenne             INDEP.PRS.IND  0.338    3354
Algonquian  Cree (Plains)        INDEP          0.517    8845
Algonquian  Menomini             INDEP          0.372    2266
Algonquian  Micmac               INDEP.IND      0.353    10047
Algonquian  Munsee               INDEP.NPST     0.381    2467
Algonquian  Ojibwa (Eastern)     INDEP.PRS      0.350    3391
Algonquian  Passamaquoddy        INDEP.PRS      0.421    2726
Kiranti     Athpare              IND            0.676    1761
Kiranti     Bahing               NPST.IND       0.626    951
Kiranti     Bahing               PST.IND        0.540    859
Kiranti     Bantawa              IND            0.589    1621
Kiranti     Belhare              IND            0.721    1772
Kiranti     Camling              IND            0.503    1247
Kiranti     Chintang             NPST.IND       0.658    1822
Kiranti     Dumi                 NPST.IND       0.479    1057
Kiranti     Hayu                 NPST.IND       0.519    682
Kiranti     Hayu                 PST.IND        0.665    811
Kiranti     Jero                 IND            0.536    941
Kiranti     Koyi                 NPST.IND       0.602    850
Kiranti     Koyi                 PST.IND        0.605    901
Kiranti     Kulung               NPST.IND       0.633    922
Kiranti     Kulung               PST.IND        0.606    903
Kiranti     Kõic                 NPST.IND       1.000    471
Kiranti     Kõic                 PST.IND        0.794    457
Kiranti     Limbu                NPST.IND       0.640    1889
Kiranti     Limbu                PST.IND        0.642    1847
Kiranti     Lohorung             NPST.IND       0.592    1238
Kiranti     Lohorung             PST.IND        0.592    1238
Kiranti     Old Thulung (Mukli)  NPST.IND       0.626    1052
Kiranti     Old Thulung (Mukli)  PST.IND        0.625    989
Kiranti     Puma                 NPST.IND       0.656    1668
Kiranti     Thulung (Mukli)      NPST           0.708    1016
Kiranti     Thulung (Mukli)      PST            0.701    973
Kiranti     Wambule              IND            0.535    1071
Kiranti     Yakkha               IND            0.656    1606
Kiranti     Yamphu               IND            0.670    1391
Nilotic     Nandi                NONSNPST       1.000    10
Nilotic     Nandi                SNPST          1.000    8
Nilotic     Teso                 -              0.742    186
Nilotic     Turkana              -              0.619    210
Tacanan     Reyesano             -              0.682    66
References
Bickel, Balthasar. 2011a. Grammatical relations typology. In Jae Jung Song (ed.), The Oxford Handbook of
Language Typology, Oxford: Oxford University Press.
Bickel, Balthasar. 2011b. Statistical modeling of language universals. Linguistic Typology 15. 401–414.
Bickel, Balthasar. in press. Distributional biases in language families. In Balthasar Bickel, Lenore A.
Grenoble, David A. Peterson & Alan Timberlake (eds.), Language typology and historical contingency:
studies in honor of Johanna Nichols, Amsterdam: Benjamins [pre-print available at
http://www.spw.uzh.ch/bickel-files/papers/stability.fsjn.2011bickelrevised.pdf].
Bickel, Balthasar & Johanna Nichols. 2007. Inflectional morphology. In Timothy Shopen (ed.), Language
typology and syntactic description, 169–240. Cambridge: Cambridge University Press (Revised second
edition).
Bickel, Balthasar & Johanna Nichols. 2009. Case marking and alignment. In Andrej Malchukov & Andrew
Spencer (eds.), The Oxford Handbook of Case, 304–321. Oxford: Oxford University Press.
Bickel, Balthasar, Manoj Rai, Netra Paudyal, Goma Banjade, Toya Nath Bhatta, Martin Gaenszle, Elena
Lieven, Iccha Purna Rai, Novel K. Rai & Sabine Stoll. 2010. Ditransitives and three-argument verbs in
Chintang and Belhare (Southeastern Kiranti). In Andrej Malchukov, Martin Haspelmath & Bernard
Comrie (eds.), Studies in Ditransitive Constructions. A Comparative Handbook, 382–408. Berlin: Mouton
de Gruyter.
Bryant, Michael Grayson. 1999. Aspects of Tirmaga grammar: University of Texas at Arlington dissertation.
Campbell, Lyle. 1985. The Pipil language of El Salvador. Berlin: Mouton de Gruyter.
Dahlstrom, Amy. 1991. Plains Cree Morphosyntax. Garland Publishing.
Holst, Jan Henrik. 2001. Lettische Grammatik. Hamburg: Helmut Buske Verlag.
Mathiassen, Terje. 1997. A Short Grammar of Latvian. Columbus, Ohio: Slavica Publishers.
Meyer, David, Achim Zeileis & Kurt Hornik. 2009. vcd: visualizing categorical data. R package,
http://www.R-project.org/.
Nau, Nicole. 1998. Latvian, vol. 217 Languages of the World/Materials. München: Lincom Europa.
Nichols, Johanna & Balthasar Bickel. 2009. The genealogy and geography database: 2009 release.
Electronic database, http://www.uzh.ch/spw/autotyp.
Opgenort, J. R. 2005. A Grammar of Jero, vol. 5/3 Brill’s Tibetan Studies Library: Languages of the Greater
Himalayan Region. E. J. Brill.
Quizar, Robin. 1994. Motion verbs in Ch’orti’. Funcion 15–16.
R Development Core Team. 2012. R: a language and environment for statistical computing. Vienna: R
Foundation for Statistical Computing, http://www.r-project.org.
Sarkar, Deepayan. 2010. lattice: Lattice Graphics. R package version 0.18-8.
http://CRAN.R-project.org/package=lattice.
Siewierska, Anna. 2003. Person agreement and the determination of alignment. Transactions of the
Philological Society 101. 339–370.
Siewierska, Anna. 2004. Person. Cambridge: Cambridge University Press.
Witzlack-Makarevich, Alena. 2011. Typological variations in grammatical relations: University of Leipzig
dissertation.
Witzlack-Makarevich, Alena, Lennart Bierkandt, Taras Zakharko & Balthasar Bickel. 2011. Decomposing
hierarchical alignment: participant scenarios as conditions on alignment. 44th Annual Meeting of the
Societas Linguistica Europaea, Logroño, September 8.
Zakharko, Taras & Balthasar Bickel. 2011. familybias: Family bias estimation. R package,
http://www.spw.uzh.ch/software.html.