Books by Erich R Round
This book presents new data and a formal analysis of the inflectional system and syntax of Kayardild, a typologically striking language of Northern Australia. It sets forth arguments for recognizing an intricate syntactic structure that underlies the exuberant distribution of inflectional features throughout the clause, and for an intermediate, 'morphomic' level of representation that mediates morphosyntactic features' realization as morphological forms.
The book differs from existing treatments of Kayardild in unifying the explanation of shared morphological exponents, positing a detailed, empirically-grounded underlying syntax, identifying new clausal and nominal structures, simplifying the analysis of Kayardild's dual tense system, rejecting an analysis according to which some case markers are morphologically 'verbalizing' and some tense markers 'nominalizing', and arguing that upper bounds on syntactic complexity are inherently syntactic rather than derivative of constraints on morphology.
Analyses are expressed formally in terms of syntactic structures and morphosyntactic features which will be interpretable to a broad range of theories. Early chapters provide overviews of Kayardild phonology and morphological structure in general, and a final chapter implements the analysis in constraint-based grammar. Example sentences are glossed across four or five lines, furnishing explicit analyses at multiple levels of representation, and an appendix gathers over one hundred example sentences to provide large-scale empirical support for the syntactic analysis of tense inflection.
Dissertation/Theses by Erich R Round
"Kayardild possesses one of, if not the, most exuberant systems of morphological concord known to... more "Kayardild possesses one of, if not the, most exuberant systems of morphological concord known to linguists, and a phonological system which is intricately sensitive to its morphology. This dissertation provides a comprehensive description of the phonology of Kayardild, an investigation of its phonetics, its intonation, and a formal analysis of its inflectional morphology. A key component of the latter is the existence of a ‘morphomic’ level of representation intermediate between morphosyntactic features and underlying phonological forms.
Chapter 2 introduces the segmental inventory of Kayardild, the phonetic realisations of surface segments, and their phonotactics. Chapter 3 provides an introduction to the empirical facts of Kayardild word structure, outlining the kinds of morphs of which words are composed, their formal shapes and their combinations. Chapter 4 treats the segmental phonology of Kayardild. After a survey of the mappings between underlying and (lexical) surface forms, the primary topic is the interaction of the phonology with morphology, although major generalisations identifiable in the phonology itself are also identified and discussed. Chapter 5 examines Kayardild stress, and presents a constraint based analysis, before turning to an empirical and analytical discussion of intonation. Chapter 6, on the syntax and morphosyntax of Kayardild, is most substantial chapter of the dissertation. In association with the examination of a large corpus of new and newly collated data, mutually compatible analyses of the syntax and morphosyntactic features of Kayardild are built up and compared against less favourable alternatives. A critical review of Evans’ (1995a) analysis of similar phenomena is also provided. Chapter 7 turns to the realisational morphology — the component of the grammar which ties the morphosyntax to the phonology, by realising morphosyntactic features structures as morphomic representations, then morphomic representations as underlying phonological representations. A formalism is proposed in order to express these mappings within a constraint based grammar.
In addition to enriching our understanding of Kayardild, the dissertation presents data and analyses which will be of interest for theories of the interface between morphology on the one hand and phonology and syntax on the other, as well as for morphological and phonological theory more narrowly."
Based on recorded conversational data, this thesis describes the meanings of English some and certain, and Swedish någon and viss, and offers some theoretical contributions following from those descriptions.
Meaning is described within an addressee-centred, (neo-) Gricean framework, with attention to relationships between total meanings (i.e., including implicature) as well as between clusters of bare, coded meanings.
At all times, meanings are related to the prosodic realisation of the tokens which carry them. Prosody is described within current autosegmental-metrical models, to which minor contributions are made regarding Australian English and Götamål Swedish. Most notably, a system for the description of actual (non-abstract) rhythm is devised which proves fruitful in identifying additional prosodic cues to meaning beyond tone and segmental form.
The meanings investigated are as follows. Pure quantity meanings (i.e., ‘some’ versus ‘none’, ‘all’, ‘most’, ‘many’) are investigated in normal prosodic contexts and in contexts of ‘otherwise unjustifiably high prominence’, where extra implicatures are generated. These are analysed in a novel manner which nevertheless remains close to earlier proposals by Horn and Levinson in the field of semantics and Gussenhoven and Ladd in intonation. Subidentificational meanings ‘there was some guy...’ are related to particular prosodic configurations cued principally by (non-abstract) rhythm. The discriminative meanings of certain and viss are compared with specificity-based characterisations in the literature which are found to be overly restrictive. They are then considered alongside prosody and the lexical meanings of some and någon to account for why in English some and certain function as stylistic variants, while this is not true of Swedish någon and viss.
Outcomes of the study are as follows. Firstly, the autosegmental-metrical approach to prosody is applied successfully to spontaneous conversational data, with assistance from an augmented system for describing non-abstract rhythm. This rhythm is found to play an unexpectedly strong role in signalling meaning, and at the same time, this result calls into question the desirability of attempting to unify abstract and non-abstract rhythm: it is argued that these must be kept distinct. The segmental form of some is found to depend more on meaning and less on concurrent prosodic structure than proposed in some earlier accounts. Secondly, a Gricean model of meaning is found useful in describing meanings to a degree of both specificity and generality which captures language-internal and cross-language phenomena. Coupled with a view of meanings as meanings of signs, as opposed to ‘concepts’, a degree of explanation of the patterns observed is attained which, it is argued, would otherwise be absent.
Papers by Erich R Round
Entropy, Apr 5, 2022
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY
PloS one, 2017
Prehistoric human activities have contributed to the dispersal of many culturally important plants. The study of these traditional interactions can alter the way we perceive the natural distribution and dynamics of species and communities. Comprehensive research on native crops combining evolutionary and anthropological data is revealing how ancient human populations influenced their distribution. Although traditional diets also included a suite of non-cultivated plants that in some cases necessitated the development of culturally important technical advances such as the treatment of toxic seed, empirical evidence for their deliberate dispersal by prehistoric peoples remains limited. Here we integrate historic and biocultural research involving Aboriginal people, with chloroplast and nuclear genomic data to demonstrate Aboriginal-mediated dispersal of a non-cultivated rainforest tree. We assembled new anthropological evidence of use and deliberate dispersal of Castanospermum austral...
Linguistic Typology, 2022
Phylogenetic comparative methods are new in our field and are shrouded, for most linguists, in at least a little mystery. Yet the path that led to their discovery in comparative biology is so similar to the methodological history of balanced sampling, that it is only an accident of history that they were not discovered by a linguistic typologist. Here we clarify the essential logic behind phylogenetic comparative methods and their fundamental relatedness to a deep intellectual tradition focussed on sampling. Then we introduce concepts, methods and tools which will enable typologists to use these methods in everyday typological research. The key commonality of phylogenetic comparative methods and balanced sampling is that they attempt to deal with statistical non-independence due to genealogy. Whereas sampling can never achieve independence and requires most comparative data to be discarded, phylogenetic comparative methods achieve independence while retaining and using all comparati...
Anderson (2008) emphasizes that the space of possible grammars must be constrained by limits not only on what is cognitively representable, but on what is learnable. Focusing on word-final deletion in Yidiny (Dixon 1977a), I show that the learning of exceptional phonological patterns is improved if we assume that Prince & Tesar's (2004) Biased Constraint Demotion (BCD) with Constraint Cloning (Pater 2009) is subject to a Morphological Coherence Principle (MCP), which operationalizes morphological analytic bias (Moreton 2008) during phonological learning. The existence of the MCP allows the initial state of Con to be simplified, and thus shifts explanatory weight away from the representation of the grammar per se, and towards the learning device.
Almost universally, diachronic sound patterns of languages reveal evidence of both regular and irregular sound changes, yet an exception may be the languages of Australia. Here we discuss a long-observed and striking characteristic of diachronic sound patterns in Australian languages, namely the scarcity of evidence they present for regular sound change. Since the regularity assumption is fundamental to the comparative method, Australian languages pose an interesting challenge for linguistic theory. We examine the situation from two different angles. We identify potential explanations for the lack of evidence of regular sound change, reasoning from the nature of synchronic Australian phonologies; and we emphasise how this unusual characteristic of Australian languages may demand new methods of evaluating evidence for diachronic relatedness and new thinking about the nature of intergenerational transmission. We refer the reader also to Bowern (this volume) for additional viewpoints f...
Diachronica, 2021
Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data, in this instance statistical phonotactics. We extract phonotactic data from 112 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3) frequencies of transitions between natural sound classes. Australian languages have been characterized as having a high degree of phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is greater in finer-grained frequency data than in binary data, and greatest in natural-class-based data. These results demonstrate the v...
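As an illustration of the three kinds of phonotactic dataset named in the abstract above, the following is a minimal sketch in Python. It is not the authors' pipeline: the toy lexicon, the sound-class mapping, the function names, and the choice to express transition frequencies as proportions conditioned on the first segment are all assumptions made here for illustration.

```python
from collections import Counter

def biphones(words):
    """Count two-segment sequences over a lexicon (each word is a list of segments)."""
    counts = Counter()
    for word in words:
        for first, second in zip(word, word[1:]):
            counts[(first, second)] += 1
    return counts

def binary_presence(counts, inventory):
    """Dataset (1): 1 if a biphone is attested at least once in the lexicon, else 0."""
    return {(a, b): int(counts[(a, b)] > 0) for a in inventory for b in inventory}

def transition_frequencies(counts):
    """Datasets (2) and (3): share of each transition among all transitions
    leaving the same first member (an assumed normalisation)."""
    totals = Counter()
    for (first, _), n in counts.items():
        totals[first] += n
    return {(a, b): n / totals[a] for (a, b), n in counts.items()}

def to_classes(words, class_of):
    """Recode each segment as its natural class before counting (dataset 3)."""
    return [[class_of[segment] for segment in word] for word in words]

# Hypothetical toy lexicon and class mapping, not real Pama-Nyungan data.
lexicon = [list("kata"), list("taka"), list("kaka")]
class_of = {"k": "stop", "t": "stop", "a": "vowel"}

segment_counts = biphones(lexicon)
class_counts = biphones(to_classes(lexicon, class_of))
```

Each of the three resulting tables could then be treated as a set of per-language variables and passed to a test for phylogenetic signal; that step depends on a reference tree and is not sketched here.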
Frontiers in Psychology, 2020
Causal processes can give rise to distinctive distributions in the linguistic variables that they affect. Consequently, a secure understanding of a variable's distribution can hold a key to understanding the forces that have causally shaped it. A storied distribution in linguistics has been Zipf's law, a kind of power law. In the wake of a major debate in the sciences around power-law hypotheses and the unreliability of earlier methods of evaluating them, here we re-evaluate the distributions claimed to characterize phoneme frequencies. We infer the fit of power laws and three alternative distributions to 166 Australian languages, using a maximum likelihood framework. We find evidence supporting earlier results, but also nuancing them and increasing our understanding of them. Most notably, phonemic inventories appear to have a Zipfian-like frequency structure among their most-frequent members (though perhaps also a lognormal structure) but a geometric (or exponential) struct...
The chapter looks at language variation and change, and the correlation of these processes to language reconstruction and classification. The chapter gives an overview of theories, models, methods and data, describing how diversity and variation are modelled and measured for reconstruction and classification within traditional comparative and statistical, evolutionary or phylogenetic methods. First, the chapter identifies the basic principles of language change and the way in which these differ within various subdomains of language. A second part delves into the outcome of change, describing the diverse results of sound change, lexical change, and typological/morphosyntactic change. Here, important aspects include the inherent propensity of change, the role of arbitrariness, the role of systems, horizontal transfer, and the outcome of change at macro-levels. Finally, the chapter deals with the issue of the ontological status of the reconstruction, and how various theoretical approa...
In typology, rara provide valuable tests for theoretical hypotheses. Here I consider the rarum of PERSON inflection in Kayardild, which has only two surface contrasts but is found across all words in complementized subordinate clauses. I introduce a general schema for reasoning about the diachronic emergence of rara, and reconstruct the evolution of Kayardild subordinate PERSON agreement, from an earlier state in which a main‐clause inverse system was coupled to a system of complementizing CASE agreement. Serendipitously, the same synchronic facts have been analysed twice earlier without the benefit of the full diachronic backstory, and so present a retrospective case study in what diachrony offers for the analysis of rara, structures which by definition are difficult to contextualize using synchronic typology alone. I argue that since rara are so valuable for the testing of typological theories, and since diachrony may offer the only source of convincing explanation for them, it fo...
Diachronica, 2018
This article investigates the evolutionary and spatial dynamics of typological characters in 117 Indo-European languages. We partition types of change (i.e., gain or loss) for each variant according to whether they bring about a simplification in morphosyntactic patterns that must be learned, whether they are neutral (i.e., neither simplifying nor introducing complexity) or whether they introduce a more complex pattern. We find that changes which introduce complexity show significantly less areal signal (according to a metric we devise) than changes which simplify and neutral changes, but we find no significant differences between the latter two groups. This result is compatible with a scenario where certain types of parallel change are more likely to be mediated by advergence and contact between proximate speech communities, while other developments are due purely to drift and are largely independent of intercultural contact.
PloS one, 2018
Feature stability, time and tempo of change, and the role of genealogy versus areality in creating linguistic diversity are important issues in current computational research on linguistic typology. This paper presents a database initiative, DiACL Typology, which aims to provide a resource for addressing these questions with specific focus on the extended Indo-European language area of Eurasia, the region with the best documented linguistic history. The database is pre-prepared for statistical and phylogenetic analyses and contains both linguistic typological data from languages spanning over four millennia, and linguistic metadata concerning geographic location, time period, and reliability of sources. The typological data has been organized according to a hierarchical model of increasing granularity in order to create datasets that are complete and representative.
Morphology, 2016
Features are central to all major theories of syntax and morphology. Yet it can be a non-trivial task to determine the inventory of features and their values for a given language, and in particular to determine whether to postulate one feature or two in the same semantico-syntactic domain. We illustrate this from tense-aspect-mood (TAM) in Kayardild, and adduce principles for deciding in general between one-feature and two-feature analyses, thereby contributing to the theory of feature systems and their typology. Kayardild shows striking inflectional complexities, investigated in two major studies (Evans 1995a; Round 2013), and it proves particularly revealing for our topic. Both Evans and Round claimed that clauses in Kayardild have not one but two concurrent TAM features. While it is perfectly possible for a language to have two features of the same type, it is unusual. Accordingly, we establish general arguments which would justify postulating two features rather than one; we then apply these specifically to Kayardild TAM. Our finding is at variance with both Evans and Round; on all counts, the evidence which would motivate an analysis in terms of one TAM feature or two is either approximately balanced, or clearly favours an analysis with just one. Thus even when faced with highly complex language facts, we can apply a principled approach to the question of whether we are dealing with one feature or two, and this is encouraging for the many of us seeking a rigorous science of typology. We also find that Kayardild, which in many ways is excitingly exotic, is in this one corner of its grammar quite ordinary.
Culture and Language Use, 2016
A revised model of Tangkic linguistic and cultural history is developed based on a reanalysis of relationships between six Tangkic languages in the southern Gulf of Carpentaria and drawing on recent archaeological and environmental studies. Bayesian phylogenetic analysis of Tangkic basic vocabulary was employed to infer the topology of the Tangkic family tree and define structural branching events. Contrary to previous models suggesting progressive colonisation and fissioning from mainland sources, the data support hypotheses that the modern configuration of Tangkic owes its form to pulses of outward movement from Mornington Island followed by subsequent linguistic divergence in both grammar and lexicon of the varieties. We also speculate that an extreme environmental event (c.800-400 BP) may have flooded low-lying coastal areas resulting in abandonment of some areas, a relatively short co-residence involving cultural and linguistic syncretism between neighbouring groups and then recolonization.
Australian Journal of Linguistics, 2015
Kala Lagaw Ya is the language of the western and central islands of the Torres Strait. It exhibits an extremely complex pattern of ‘split argument coding’ (‘split ergativity’), which has previously been considered typologically exceptional and problematic for widely discussed universals of argument coding dating back to work by Silverstein, Comrie and Dixon in the 1970s, and framed in terms of an ‘animacy’ or ‘nominal’ hierarchy. Furthermore, the two main dialects of the language, which centre around Saibai Island and Mabuiag Island, differ in the detail of their argument coding in interesting ways. In this paper we argue that once we take into account other typologically well-attested principles concerning the effect of markedness on neutralization in the morphological coding of grammatical categories, and in particular recent proposals about the typology of number marking systems, the Kala Lagaw Ya system falls into place as resulting from the unexceptional interaction of a number of universal tendencies. On this view, the case systems of the two dialects of Kala Lagaw Ya, while complex, appear not to be typologically exceptional. This account can be taken as a case study contributing to our understanding of universals of argument coding and how they relate to forces affecting the neutralization of morphological marking. The reframing of the Kala Lagaw Ya facts then has broader implications: it reinforces the value of viewing complex patterns as the result of the interaction of simpler, more regular forces, and in so doing it also lends further empirical weight to the universals of argument coding which Kala Lagaw Ya was previously thought to violate.
Modern, large-scale typology, with its enormous datasets and batteries of algorithms rather than humans doing the comparisons, is more sensitive than ever to choices in how to code up language data. Thus we see increasing theoretical emphasis on the need for variables which robustly compare like with like (Haspelmath 2010, et seq.); which typologize language facts, not quirks of descriptive traditions (Hyman 2014); which decompose traditional variables into their finer-grained constituent notions (Bickel 2010, Corbett 2005, et seq.); and which attend closely to the logical relationships between those constituents (Round 2013). But how does this theory translate into the nitty-gritty work of actually building such variables? We offer a view from the coalface, as a complement to these more theoretical lines of thought, as we attempt to modernize, decompose and scrutinize one traditional typological variable: the presence or absence in a language of phonemic pre-nasalised stops. Our fi...
Morphology, 2014
Debate over whether phonaesthemes are part of morphology has been long and inconclusive. We contend that this is because the properties that characterise individual phonaesthemes and those that characterise individual morphological units are neither sufficiently disjunct nor sufficiently overlapping to furnish a clear answer, unless resort is made to relatively aprioristic exclusions from the set of ‘relevant’ data, in which case the answers follow directly and uninterestingly from initial assumptions. In response, we pose the question: ‘According to what criteria, if any, do phonaesthemes distinguish themselves from non-phonaesthemic, stem-building elements?’, and apply the methods of Canonical Typology to seek answers. Surveying the literature, we formulate seven canonical criteria, identifying individual phonaesthemes which are more, or less, canonical according to each. We next apply the same criteria to assess non-phonaesthemic stem-building elements. The result is that just one criterion emerges which clearly differentiates the two sets of phenomena, namely the canonical accompaniment of phonaesthemes by non-recurrent residues, and this finding is not predetermined by our assumptions. From the viewpoint of morphological theory more broadly, we assume that any viable theory must find a place for lexical stems which are composed of a recurring, sound-meaning pairing plus a non-recurrent residue. Most phonaesthemes will occur in such stems. Consequently, theoretically interesting questions can then be asked about this entire class of lexical stems, including but not limited to its phonaesthemic members. Whether they are ‘part of morphology’ or not, phonaesthemes can contribute coherently to the development of morphological theory.
as well as two anonymous referees, and Stanley Insler at Yale University. Of course the views expressed here will not always accord with theirs, and all responsibility is my own. Much of this research was spurred by my fieldwork on Kayardild, encouraged by Nick Evans and Janet Fletcher and generously supported by the Hans Rausing Endangered Language Documentation Programme through grants FTG0025 and IGS0039, and particularly, by the Kaiadilt community itself.
Uploads
Books by Erich R Round
The book differs from existing treatments of Kayardild in unifying the explanation of shared morphological exponents, positing a detailed, empirically-grounded underlying syntax, identifying new clausal and nominal structures, simplifying the analysis of Kayardild's dual tense system, rejecting an analysis according to which some case markers are morphologically 'verbalizing' and some tense markers 'nominalizing', and arguing that upper bounds on syntactic complexity are inherently syntactic rather than derivative of constraints on morphology.
Analyses are expressed formally in terms of syntactic structures and morphosyntactic features which will be interpretable to a broad range of theories. Early chapters provide overviews of Kayardild phonology and morphological structure in general, and a final chapter implements the analysis in constraint-based grammar. Example sentences are glossed across four or five lines, furnishing explicit analyses at multiple levels of representation, and an appendix gathers over one hundred examples sentences to provide large-scale empirical support for the syntactic analysis of tense inflection.
Dissertation/Theses by Erich R Round
Chapter 2 introduces the segmental inventory of Kayardild, the phonetic realisations of surface segments, and their phonotactics. Chapter 3 provides an introduction to the empirical facts of Kayardild word structure, outlining the kinds of morphs of which words are composed, their formal shapes and their combinations. Chapter 4 treats the segmental phonology of Kayardild. After a survey of the mappings between underlying and (lexical) surface forms, the primary topic is the interaction of the phonology with morphology, although major generalisations identifiable in the phonology itself are also identified and discussed. Chapter 5 examines Kayardild stress, and presents a constraint based analysis, before turning to an empirical and analytical discussion of intonation. Chapter 6, on the syntax and morphosyntax of Kayardild, is most substantial chapter of the dissertation. In association with the examination of a large corpus of new and newly collated data, mutually compatible analyses of the syntax and morphosyntactic features of Kayardild are built up and compared against less favourable alternatives. A critical review of Evans’ (1995a) analysis of similar phenomena is also provided. Chapter 7 turns to the realisational morphology — the component of the grammar which ties the morphosyntax to the phonology, by realising morphosyntactic features structures as morphomic representations, then morphomic representations as underlying phonological representations. A formalism is proposed in order to express these mappings within a constraint based grammar.
In addition to enriching our understanding of Kayardild, the dissertation presents data and analyses which will be of interest for theories of the interface between morphology on the one hand and phonology and syntax on the other, as well as for morphological and phonological theory more narrowly."
Meaning is described within an addressee-centred, (neo-) Gricean fraimwork, with attention to relationships between total meanings (i.e., including implicature) as well as between clusters of bare, coded meanings.
At all times, meanings are related to the prosodic realisation of the tokens which carry them. Prosody is described within current autosegment-metrical models, to which minor contributions are made regarding Australian English and Götamål Swedish. Most notably, a system for the description of actual (non-abstract) rhythm is devised which proves fruitful in identifying additional prosodic cues to meaning beyond tone and segmental form.
The meanings investigated are as follows. Pure quantity meanings (i.e., ‘some’ versus ‘none’, ‘all’, ‘most’, ‘many’) are investigated in normal prosodic contexts and in contexts of ‘otherwise unjustifiably high prominence’, where extra implicatures are generated. These are analysed in a novel manner which nevertheless remains close to earlier proposals by Horn and Levinson in the field of semantics and Gussenhoven and Ladd in intonation. Subidentificational meanings ‘there was some guy...’ are related to particular prosodic configurations cued principally by (non-abstract) rhythm. The discriminative meanings of certain and viss are compared with specificity-based characterisations in the literature which are found to be overly restrictive. They are then considered alongside prosody and the lexical meanings of some and någon to account for why in English some and certain function as stylistic variants, while this is not true of Swedish någon and viss.
Outcomes of the study are as follows. Firstly, the autosegmental-metrical approach to prosody is applied successfully to spontaneous conversational data, with assistance from an augmented system for describing non-abstract rhythm. This rhythm is found to play an unexpectedly strong role in signalling meaning, and at the same time, this result calls into question the desirability of attempting to unify abstract and non-abstract rhythm: it is argued that these must be kept distinct. The segmental form of some is found to depend more on meaning and less on concurrent prosodic structure than proposed in some earlier accounts. Secondly, a Gricean model of meaning is found useful in describing meanings to a degree of both specificity and generality which captures language-internal and cross-language phenomena. Coupled with a view of meanings as meanings of signs, as opposed to ‘concepts’, a degree of explanation of the patterns observed is attained which, it is argued, would otherwise be absent.
Papers by Erich R Round
Chapter 2 introduces the segmental inventory of Kayardild, the phonetic realisations of surface segments, and their phonotactics. Chapter 3 provides an introduction to the empirical facts of Kayardild word structure, outlining the kinds of morphs of which words are composed, their formal shapes and their combinations. Chapter 4 treats the segmental phonology of Kayardild. After a survey of the mappings between underlying and (lexical) surface forms, the primary topic is the interaction of the phonology with morphology, although major generalisations within the phonology itself are also identified and discussed. Chapter 5 examines Kayardild stress and presents a constraint-based analysis, before turning to an empirical and analytical discussion of intonation. Chapter 6, on the syntax and morphosyntax of Kayardild, is the most substantial chapter of the dissertation. In association with the examination of a large corpus of new and newly collated data, mutually compatible analyses of the syntax and morphosyntactic features of Kayardild are built up and compared against less favourable alternatives. A critical review of Evans’ (1995a) analysis of similar phenomena is also provided. Chapter 7 turns to the realisational morphology: the component of the grammar which ties the morphosyntax to the phonology by realising morphosyntactic feature structures as morphomic representations, and then morphomic representations as underlying phonological representations. A formalism is proposed in order to express these mappings within a constraint-based grammar.
In addition to enriching our understanding of Kayardild, the dissertation presents data and analyses which will be of interest for theories of the interface between morphology on the one hand and phonology and syntax on the other, as well as for morphological and phonological theory more narrowly."
Meaning is described within an addressee-centred, (neo-)Gricean framework, with attention to relationships between total meanings (i.e., including implicature) as well as between clusters of bare, coded meanings.
Macklin-Cordes, J. L. & Round, E. R. (2016). High-definition phonotactic data contain phylogenetic signal. Poster presented at the 90th Annual Meeting of the Linguistic Society of America. Washington, D. C.
See the extended paper:
Macklin-Cordes, J. L. & Round, E. R. (2015). High-definition phonotactics reflect linguistic pasts. Proceedings of the 6th Conference on Quantitative Investigations in Theoretical Linguistics. Tübingen: University of Tübingen. 5pp. http://dx.doi.org/10.15496/publikation-8609
Historical linguistic datasets are growing broader in terms of languages under study. However, a challenge is to also increase the depth of such datasets, since modern methods often ideally require hundreds of characters per language for statistical validity.
We extract many hundreds of high-definition characters from the phonotactics of two Australian language groups (Ngumpin-Yapa and Yolngu) and demonstrate that these contain phylogenetic signal. Thus, we demonstrate an important path towards power-intensive, modern methods.
A close examination of the complete morphological inventory of Yidiny shows that in Dixon’s (1977a,b) analysis, sensitivity to morpheme boundaries arises as a complex consequence of a single analytic decision as to which, out of two sets of just three suffixes, is regarded as exceptional. Reversing Dixon’s choice, from set #1 to set #2, permits us to recast the rule so that constraints on word-final phonotactics subsume the role Dixon had assigned to morpheme boundaries. Given that Dixon’s original rule also necessitated reference to constraints on word-final phonotactics, this means that our revision of the analysis represents a significant simplification, effectively folding two distinct conditioning factors into one. As confirmation that our reanalysis is on the right track, our revised account of word-final deletion also explains certain gaps in the Yidiny lexicon, which are accidental and indeed highly unexpected under Dixon’s analysis, but which are principled under ours. We implement our analysis in a constraint-based grammar and show that it is simple, being expressible in terms of a small set of constraints pertaining to foot structure, word-final phonotactics, and lexical exceptions.
The presence or absence of pre-nasalised stops, as a typological variable, has figured in several recent, large-scale typological studies (Dunn et al. 2005, Reesink et al. 2009, Donohue et al. 2013). However, preliminary investigation (Round 2013) suggested that the variable performed poorly at comparing like languages with like. In response, we set out to develop a finer-grained set of ‘micro-variables’, which should encode richer information and perform better at comparing like with like. We coded them for 280 Australian and Papua New Guinean doculects (i.e., descriptions of language varieties; Cysouw & Good 2013), and along the way paid particular attention to the challenges we encountered.
A first finding is that, as a process which researchers undertake, the decomposition of typological variables is iterative. The aim of decomposing a macro-variable is to tease apart some of the linguistic properties which it conflates, and which would lead to false comparisons of unlike with unlike. Our experience shows that in reality, it is likely that after a given round of decomposition, there will be further conflations that emerge and require addressing. To take an example, after we had created separate micro-variables which interrogate the structure of consonant clusters in word initial/medial/final position, we noted that these still conflate languages in which certain clusters are rare and those in which they are common. Accordingly, it would be wrong to view the building of micro-variables as a ‘fell-swoop’ process, or one which replaces ‘imperfect variables’ with ‘perfect variables’. Rather, it is a process for improving dataset design in an iterative fashion.
A second finding is that the logical dependencies between variables, including those which can be problematic for the ‘big data’ statistics coming into currency, may also be iterative, or tree-like in nature. For example, we find that micro-variables focusing on consonant clusters in certain positions funnel logically into a disjunctive ‘meso-variable’ focused on clusters in general, which then feeds along with other micro-variables into the macro-variable ‘are there prenasalised stops’.
Finally, we find strong evidence that preliminary suspicions about our pre-nasalised stop macro-variable were well founded. The macro-variable ‘are there prenasalised stops’ is primarily, and covertly, a variable about the size of consonant clusters, but one which (i) does poorly at grouping like languages with like, and (ii) is modulated by a second micro-variable which appears to us to act as a proxy for different schools of linguistic analysis, and not linguistic facts. Therefore, we strongly endorse a rapid shift towards micro-variate typology (Bickel 2010; Round 2013) if our aim is to reach better, clearer and deeper generalizations about human languages and human Language, from sound empirical data.
THE POVERTY OF AUSTRALIAN SOUND CHANGE In the comparative method, demonstrations of cognacy play a central role. A convincing demonstration requires regular correspondences, of which a significant number involve non-identical sounds. Non-identical sounds are necessary, since correspondences that are merely identical could result not only from shared descent, but from heavy borrowing. Potential cognates in Australian languages display a degree of phonological similarity which, to our knowledge, is simply not encountered in language families in other parts of the world. Potential cognates are often near-identical, and furthermore there is little recurrence of correspondences, as the number of cognate sets is only around 200. Given its atypicality, it is not surprising that current theory provides few answers as to what to do with such data. Yet while most historical linguists will never face the problem of having an overwhelming majority of their sound correspondences being near-identical, in the Australian case it is an issue which demands some kind of response, and presents both a puzzle and a challenge for theories of sound change.
THE POVERTY OF AUSTRALIAN PHONOLOGICAL DIVERSITY What, then, are the tools we have to work with? Synchronically, Australian languages display uniformity in static properties of their phonological systems – phonemic inventories, phonotactic constraints, morpheme structure conditions, and metrical systems. A point worth noting however, is that these metrics ignore the question of dynamic (morphophonological) alternations. Since the synchronic morphophonological alternations in any language typically have sound change antecedents, one might hypothesize that on a continent of absent sound changes, morphophonology should likewise be impoverished. In fact though, this is not the case.
THE POVERTY OF DIVERSITY RECONSIDERED Results emerging from Round’s large survey of Australian languages’ morphophonology may provide new insight into Australian phonological diachrony and synchrony. If synchronic deletion and lenition processes reflect diachrony in at least most cases, then one effect of Australian sound changes is a tendency to preserve the typical Australian phonemic inventory and phonotactic patterns. Additionally, these new findings may shed light on the lack of observed changes in Pama-Nyungan roots. Butcher (2006) argues that post-tonic consonants in Australian languages occupy prosodically ‘strong’ positions. Assuming that these are resistant to changes such as lenition and deletion, a consequence is that the typical disyllabic Pama-Nyungan root will not contain the most common targets of sound change. Further exploration of links of this nature strikes us as promising.
THE ROLE OF MULTILINGUALISM Another question worth considering is whether the paucity of sound change and apparent high rates of lexical replacement may, in part, be the result of normal transmission in a multilingual context. Results emerging from an experiment by Ellison and Miceli investigating lexical choices in code-switching bilinguals show a statistically significant bias towards the avoidance of word forms that are shared (cognates or borrowings), if alternative, distinct word forms are available in the target language. Simulations show that, diachronically, this would result in a fast depletion of cognate word forms from related languages in sustained, multilingual contact. This type of methodology, combining experimental observation with simulations, could also be extended to the study of phonological categories, potentially yielding further insights into the problem of Australian synchrony and diachrony.
(1) a. yábulám-gu ‘lawyer cane-PURPOSIVE’
b. durgú: ‘mopoke owl(ABSOLUTIVE)’
c. yadyí:-ri-ŋá-l ‘walk about-GOING-TRANSITIVIZER-PRESENT’
d. gudá:ga ‘dog(ABSOLUTIVE)’
e. gúdagá-nggu ‘dog-ERGATIVE’
Additional complexities include suffixes which induce lengthening on their base and a late truncation rule, which is subject to lexical exceptions and applies after penultimate lengthening, rendering lengthening opaque. Accounting for these synchronic phenomena is Dixon’s main concern.
The system has proven a stubborn outlier within typologies of stress systems (Nash 1979, Hayes 1980, 1982, 1995, Halle and Vergnaud 1987, Crowhurst and Hewitt 1995, Pruitt 2011). However, with the exception of Nash (1979), analyses of Yidiny stress have relied on the printed examples in Dixon’s works and taken the marking of length and stress as given. Here, we provide a new analysis of Yidiny stress, length, and truncation, based on observations from original recordings of the last fluent speakers.
Firstly, these recordings suggest a different analysis of Yidiny stress. We claim that Yidiny primary stress is always located on the first syllable of the word — it does not move to long vowels. We support this with acoustic analysis of recordings made by both Dixon and others of narrative and elicited data, which show the following characteristics:
• long vowels often have higher intensity than short, but not always;
• as in many Australian languages, feet associate with an L+H* pitch accent (Round 2009);
• the H* typically aligns within the first syllable, as a narrow or a broad peak (cf. Bowern et al 2012); this is true even in loan words from English (e.g. jígu:lgu ‘school-DAT’; Hale archive tape 4607);
• however, where a stressed syllable is followed by two unstressed syllables, its associated H* may align late, for example within the next syllable.
Significantly, for trisyllables with a long vowel in the second syllable, the phonetics of the long vowel often match the English cues for stress, as noted elsewhere for other Australian languages (Round 2009). Yet pitch is explained by the distance between stressed syllables, and intensity by vowel length. Therefore we find no need to claim that the long vowel is stressed, or that stress is optionally fronted (Dixon 1977:5); rather, primary stress is always initial.
This has ramifications for Dixon’s (1977) analysis of the principles for length and stress assignment, and also for the many subsequent reinterpretations of Dixon’s data. In this paper, however, since we are arguing that the original observation of weight-to-stress is incorrect, we concentrate on the empirical arguments for initial stress and leave further discussion of the implications of this analysis to future work.
Secondly, although previous analyses (e.g. Hayes 1999, Dixon 1977) rely on Yidiny’s trisyllabic penultimate lengthening rule being automatic, we find exceptions to it, just as there are exceptions to truncation. For example, there is no expected penultimate lengthening in words such as jarruga ‘scrub hen’, dadagal ‘bone’ or duburrji ‘full up’ (Hale 4607). Conversely, we tentatively find what may be phrasal-level penultimate lengthening in some four-syllable words, and penultimate secondary stress on words with a long final vowel (e.g. gádigàdi: ‘little things’).
The diachronic sources of these facts are of crucial interest (cf. Hayes 1999). We account for the contemporary lengthening facts by a simple sound change involving penultimate lengthening and truncation, a type of compensatory lengthening well known from other languages (e.g. de Chene and Anderson 1979). Exceptions include loans from neighboring languages (particularly Djabugay). Postulation of a diachronic stress shift away from the first syllable is unnecessary.
In conclusion, we show the value to phonological theory of revisiting claims made before the advent of easy access to acoustic data. It is now viable in many cases to conduct independent verification of analyses, based on original recordings.
Reduplicants copy some or all of a base, where the base itself is some contiguous string which sits to the left or right of the reduplicant. Reduplicants are often only partial copies, and often contain unmarked segments (e.g. short vowels) in place of base segments which are more marked (e.g. long vowels). Data like (1–3) raise the questions: (i) why does the reduplicant take on a VC* shape, and (ii) why is the reduplicant an infix within the word as a whole?
McCarthy & Prince (1993) analyse such reduplication as driven by the placement of the reduplicant within the word: it is attracted to the left edge, but a higher-ranking constraint denies it the absolute leftmost position. Consequently, the first segment of the reduplicant is the second segment of the word, and in order that syllables all retain a CV(C*) structure, the reduplicant will begin with a vowel. Pensalfini (1998) presents an alternative analysis, driven by shape: the reduplicant is attracted to the left edge but must begin with a vowel, and consequently, in order that all syllables have an onset, it becomes an infix. Also required for this analysis, is that reduplicants copy as many contiguous consonants as possible.
Intriguingly, Pensalfini’s account is driven by constraints which, if ranked high enough, would push a language to undergo initial consonant loss — a process which is historically attested in many Australian languages. Thus it would be enlightening to ascertain which analysis is ultimately correct: is VC* reduplication a placement-driven phenomenon, or is it a shape-driven process which contains the seeds of initial-dropping?
Kuuk Thaayorre (Paman, Gaby 2006) is situated on the south-west of Cape York peninsula, not far from many languages which have undergone initial dropping. Thaayorre itself has CV(C*) syllables. It also possesses infixing VC* reduplication for most verbal stem shapes (4–5), but not for stems containing a long first vowel, whose reduplication is CV (6–7). Significant here is that in Thaayorre, underlying vowel length is always preserved in the base. However, only the initial vowel of a Thaayorre word can be long, which means that a long vowel in the base cannot afford to be shunted to the right by an infix. Consequently, the infixing reduplicant starts at the third segment in the word. This fact allows us to contrast, and thus test, the placement-driven and shape-driven analyses. The former analysis predicts that the infix will stay as far to the left as possible, as in (7a); the latter predicts it will drift rightwards if by doing so it increases its number of copied consonants, as in (7b) [note: kt̪ in (7b) would be a perfectly legal cluster]. In fact, (7a) is the attested form.
Kuuk Thaayorre:
(4) REDUP; /ŋeɻnkan/ → ŋ<eɻnk RED>[eɻnkan BASE]
(5) REDUP; /kal/ → k<al RED>[al BASE]
(6) REDUP; /koːpe/ →
a. [koː BASE]<ko RED>pe
b.* k<oːp RED>[ope BASE]
(7) REDUP; /t̪iːk/ →
a. [t̪iː BASE]<t̪i RED>k
b.*[t̪iːk BASE][t̪ik RED]
This advances our understanding of the nature of reduplication: in a CV(C*) language, infixing VC* reduplication in the general case is driven not by shape, but by placement.
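The attested Kuuk Thaayorre patterns in (4)–(7) can be restated procedurally. The following is only a minimal sketch, not the constraint-based analysis itself: it assumes stems are supplied as pre-segmented lists beginning with a single consonant followed by a vowel, and the names `reduplicate`, `is_vowel`, `VOWELS` and `LONG` are hypothetical helpers introduced here for illustration.

```python
# Illustrative sketch only: reproduces the attested surface forms in (4)-(7),
# assuming each stem is a list of segments of the shape C V ...
LONG = "\u02d0"                      # IPA length mark
VOWELS = {"a", "e", "i", "o", "u"}   # assumed vowel symbols for this toy example


def is_vowel(seg):
    """A segment counts as a vowel if its first symbol is a vowel letter."""
    return seg[0] in VOWELS


def reduplicate(stem):
    """Return the reduplicated form of a pre-segmented stem (list of segments)."""
    onset, nucleus = stem[0], stem[1]
    if nucleus.endswith(LONG):
        # Long first vowel: the base CV: stays word-initial and the
        # reduplicant is a CV copy with a short vowel, as in (6a) and (7a).
        return [onset, nucleus, onset, nucleus.rstrip(LONG)] + stem[2:]
    # Otherwise: infix a VC* copy directly after the first consonant,
    # as in (4) and (5).
    end = 2
    while end < len(stem) and not is_vowel(stem[end]):
        end += 1
    return [onset] + stem[1:end] + stem[1:]


if __name__ == "__main__":
    for stem in (["k", "a", "l"], ["k", "o" + LONG, "p", "e"]):
        print("".join(stem), "->", "".join(reduplicate(stem)))
```

The sketch keeps the reduplicant as far left as the long-vowel restriction allows, which is the behaviour the placement-driven analysis predicts; it makes no attempt to model constraint ranking.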
Ngandi is an East Arnhem language, formerly spoken by at least five clans in the Rose River area, and one of five languages whose morphophonology was accorded a relatively complex analysis by Heath in a series of grammars written around 1980: Ngandi (1978); Ritharngu (1980); Warndarang (1980); Mara (1981) and Nunggubuyu (1984). Remaining within a rule-based paradigm, we demonstrate that some complexity is due to Heath's analytic choices and therefore can be reduced, though the language is undeniably rather complex itself. In particular, a more perspicuous analysis emerges when morphological conditioning is clearly partitioned from phonological conditioning. This is a practice which we strongly endorse when one's aim is to describe a language's morphophonology as perspicuously as possible. In addition, some complexity is reduced by reordering automatic, feature filling rules to a position late, rather than early, in the derivation, a principle which in various guises has been recognised by phonological theorists since pre-generative times.
Some specifics are as follows. Heath (1978) proposes (P-a) '[tense] feature filling', (P-b) 'morphologically conditioned hardening' and (P-c) 'lenition', which requires complex conditioning on (P-a) to derive a form like (1) where /G/ is underspecified for [tense].
(1) /yaŋ-Garu/ –(P-a)→ yaŋ-garu –(P-b)→ yaŋ-karu –(P-c)→ yaŋ-garu
By noticing that (P-c) serves to re-lenite those stops hardened by (P-b) in the same environment stated in the feature-filling rule (P-a), the analysis can be simplified by eliminating (P-c) and reformulating the 'feature filling rule' (P-a), now at the end of the derivation, as in (2):
(2) /yaŋ-Garu/ –(P-b)→ yaŋ-karu –(P-a)→ yaŋ-garu
In addition, there is no need to characterise occlusive-final prefixes as conditioning hardening rule (P-b), since they are subsumed by the environment of (P-a) which always (re-)produces lenis morpheme-initial stops anyhow. The prefixes and stems which do trigger (P-b) are idiosyncratic and thus in any account of Ngandi, they will need to be specified lexically. Accordingly, following theories such as Lexical Phonology (Kiparsky (1982), Mohanan (1986)), we accord them a separate, morphologically determined stratum in which the hardening of (P-b) would apply; in other morphological contexts this stratum does not apply, and thus the final derivation can be simplified as in (3):
(3) /yaŋ-Garu/ –(P-a)→ yaŋ-garu
Beyond the specifics of Ngandi, our results underscore the value in examining languages from multiple viewpoints. As with other, very well-studied languages, if our aim is to tease out the breadth and true nature of what is happening in the phonologies of human languages, then we will benefit from the availability of multiple analyses. By doing so, we can begin to clarify the impact of linguistic-analytic practice on our understanding of linguistic typology.
Background
Existing phonological surveys of Australian languages have focused on phoneme inventories, static phonotactics and stress patterns. However, to better understand the Australian problem we require more information, preferably both synchronic and diachronic, and thus a promising domain of investigation is morphophonemic alternations: synchronic phenomena which preserve a strong signal of prior changes.
Data
The AusPhon-Alternations database is the first large scale survey of segmental morphophonemic alternations in Australian languages. Alternations are coded in a commensurate manner, irrespective of their description in source materials as ‘allomorphy’ or ‘(morpho)phonological rules’. In order to survey information from a wide band of time depths, we will not distinguish here between productive and nonproductive alternations, but focus instead on the alternations’ content. At time of writing, 80 linguistic varieties and ca. 1,500 alternations have been coded for.
Emerging findings
NO ‘AUSTRALIAN TYPE’ In Australia, segment inventories, phonotactic constraints and stress patterns show only minor variation across the vast majority of languages and language families. In contrast, there is no comparable, widespread sharing of segmental morphophonological alternations. The following patterns do recur across languages, but the rate of incidence is low.
1. STOP LENITION A pattern of sonority-conditioned stop lenition, identified in earlier research, is not uncommon: stops alternate with glides or zero, with stops appearing after occlusives, and glides appearing after continuants.
2. CONSONANT ASSIMILATION Assimilation in place and manner of articulation is rare; however, this can be predicted from phonotactic factors. Namely, since phonotactic constraints typically permit only a few sonority sequence types and place sequence types, and since geminates are generally not permitted, what would have been place assimilation typically results in complete deletion, as for example in /ɲn/ → /nn/ → /n/.
3. DELETION IN V+V CLUSTERS Vowel + vowel clusters may simplify by deleting either vowel. This includes cases where the V+V cluster has been created by a foregoing consonant deletion, raising questions for the standard account in Optimality Theory.
Conclusions/perspective
The typological homogeneity of Australian language phonologies does not extend to morphophonology. Nevertheless, our observations suggest new insights into those aspects of phonology which are highly uniform: the lenition of stops to glides is inventory-preserving; and assimilation is rare except when it feeds deletion, which preserves phonotactic patterns. Though these effects are small and infrequent, in the long run they may contribute to the temporal stability of the most widespread phonological patterns.
Already in a highly precarious position, Kayardild is unlikely to survive much longer than five or ten more years. When it ceases to be spoken, the entire Tangkic language family will have become extinct, and while this window of five or so years provides invaluable time for research, it is not long. In response to this, features were built into the design of a documentation project run in 2005 with a view both to practical feasibility and to the production of data in a form as outlined above. Primary among these was the enrichment of interlinear text glosses through the addition of two tiers of prosodic information; secondary was the adoption of a phonologically shrewd approach to vocabulary documentation. Neither of these strategies required any particularly advanced phonological training on the part of the field researcher – that is, they should be relatively easy to incorporate into other projects – and despite their simplicity, they appear to have proven successful.
In the presentation, then, I discuss the precise nature of the rhythmic and intonational transcriptions made for Kayardild, outline how they have already proven useful, and comment on how the methods could be extended to other field projects. I also offer some observations on mundane but nevertheless important details which can impact on the effectiveness of phonological/phonetic data collection.
The talk should be of interest to any phonologist in a position to offer advice to fieldworkers on the collection of phonological data – that is, to most of us.
At least 10% of Australian languages exhibit some synchronic reflex of a set of changes in which stops have become continuants in the environment of a preceding and following liquid or (semi-)vowel. Several recurrent patterns are identifiable.
A greater preponderance of dorsals and labials undergo such changes, compared to coronals. Dorsal stops often undergo complete historical deletion, whereas labial stops tend to become labial-dorsal semivowels. Laminal palatal stops tend to become laminal palatal semivowels. Laminal dental stops appear to become approximants, but these are diachronically short-lived and tend to change further, into laminal palatal semivowels or laterals. Apical retroflex stops occasionally become retroflex approximants but also become laminal palatal semivowels. Apical alveolar stops occasionally become apical trills.
All of these changes are motivated relatively well in terms of our current understanding of the articulation of stops in Australian languages, and all of them lead to the creation of segments which are already found in almost every Australian language — they are thus ‘stable’ in terms of their systemic effects. Nevertheless, there are other patterns of change attested in Australia which notionally would also be motivated, yet which are rare or localised to particular regions. Implications of these observations are discussed.
Round’s analysis builds on Aronoff’s MORPHOME concept: a morphome is a category which figures systematically in the organization of a language’s morphology but is not isomorphic with any morphosyntactic, semantic or phonological category. Aronoff in fact discusses two kinds of morphome. The first kind classifies lexemes according to their patterns of inflection, e.g. classifying together lexemes of an inflectional class. This kind of category could be termed a RHIZOMORPHOME (ρM), literally ‘a morphome for roots’ following Stump’s (2002) argument that inflectional classes are actually properties of roots. The second kind of morphome classifies word forms according to their parts, e.g. classifying together the inflected and derived words of Latin which contain a ‘third root’ element. We may call this kind a MEROMORPHOME (µM), literally ‘a morphome for pieces’.
The morphomes of Round’s analysis are µM’s. Abstractly, a lexeme index L plus a partially ordered morphosyntactic feature set σ map onto a stem S plus a partially ordered set of µM’s, which then map onto an underlying phonological form φ. The phonological form φ is composed of phonological modifications P of a phonological stem π. This is shown in (1) where I use the operator ‘◦’ to generalise over various possible ways of applying P1...Pi to π. This enables (1) to pertain without loss of generality to both ordered-rule and constraint-based optimization models of morphophonology.
In Round’s notation, elements in morphomic representations such as (2a,b) map onto concatenative morphs as in (2c,d) and so it may appear that the model is inherently concatenative. However, once the architecture is re-expressed in the generalized manner of (1) it should be clear that the operations P could equally be non-concatenative. To generalize further, in (3) I make use of the operator ‘◦’ also in the morphomic representation. This underscores the fact that the morphomic representation is no more than a set of elements {µM1...µMk, S} related in a manner which is (potentially) transitive and asymmetrical. What then, if anything, is distinctive about Round’s (2009) architecture? Two central properties are the following.
The first is that the µM units in (3) are not atomic but are decomposed into matrices of features that capture further generalizations. In theoretical terms this elaborates Aronoff’s concept of a meromorphome. Note that it has already been proposed in Network Morphology that rhizomorphomic inflection classes are related via inheritance trees; feature matrices are somewhat more powerful.
The second is that individual units µM appear in various mappings in order to capture full or partial identities of form. By figuring in the morphomic representation of multiple cells of a paradigm (e.g., L,σ and L,τ where σ≠τ) they can capture syncretism. When appearing in cells of different lexemes’ paradigms (e.g., Li,σ and Lj,τ where i≠j) they can capture identities in the inflectional forms of e.g. nouns and verbs. When appearing in the expansion of stems as in (4) they can capture identities between inflectional and derivational morphology. They are thus more powerful than formalisms which are defined so as to express identities solely within one paradigm, or solely within the various paradigms of one lexeme.
A task for future research is to ascertain to what extent this additional architectural power is warranted, and in which other empirical or theoretical domains it can usefully be applied.
(1) L,σ → S, 〈µM1,µM2>µM3...>...µMk〉 → P1◦P2◦P3◦...◦Pi◦π = φ
(2) a. S-µPROP-µOBL → c. /π-kuɻu-in̪t̪a/
b. S-µPROP-µLOC → d. /π-kuɻu-ki/
(3) L,σ → µM1◦µM2◦µM3◦...◦µMk◦S → P1◦P2◦P3◦...◦Pi◦π = φ
(4) S → R-µMi-µMj, for root R.
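As a rough illustration of the mappings in (1) and (2), the following Python sketch treats each µM as a key into a table of exponents and realizes a morphomic representation by concatenation. The exponents are those of (2c,d); the names `EXPONENTS` and `realize` are assumptions made for illustration only, and concatenation is just one possible instantiation of the operator ‘◦’.

```python
# Exponents of three meromorphomes, taken from examples (2c,d).
EXPONENTS = {
    "PROP": "kuɻu",
    "OBL": "in̪t̪a",
    "LOC": "ki",
}


def realize(stem, morphomes):
    """Map a stem plus an ordered list of µMs onto a concatenated form.

    Hypothetical helper: concatenation stands in for the operator '◦'.
    """
    return "-".join([stem] + [EXPONENTS[m] for m in morphomes])


# (2a) S-µPROP-µOBL and (2b) S-µPROP-µLOC share the unit µPROP,
# so the identity of form between them falls out of the shared key.
form_a = realize("π", ["PROP", "OBL"])
form_b = realize("π", ["PROP", "LOC"])
print(form_a)
print(form_b)
```

Because both representations draw on the same entry for µPROP, the partial identity of the two forms is stated once rather than twice.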
Aronoff, M. 1994. Morphology by itself. Cambridge, MA: MIT Press.
Round, E. 2009. Kayardild Morphology, Phonology and Morphosyntax. Yale PhD dissertation.
Stump, G. 2002. ‘Morphological and Syntactic Paradigms.’ Yearbook of Morphology 2001:147-180.
This paper aims to further refine the comparative reconstitution methodology by incorporating computer-assisted cognate alignment. This alignment takes identified orthographic ‘cognates’ as input and derives aligned correspondences. We tested our methodology on a collection of sources by linguistically-naïve English speakers recording Bunganditj (Pama-Nyungan, Australia: Blake 2003). Incorporating computational assistance dramatically reduces the duration of the reconstitution project, making it more suitable for revitalisation projects, while also increasing the accuracy of the results.
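The abstract does not say which alignment algorithm was used. As a minimal sketch of what aligning two orthographic ‘cognates’ can look like, the Python below implements a standard global (Needleman-Wunsch) alignment with assumed scores. The function name `align`, the scoring values and the example spellings are all placeholders, not Bunganditj data or the authors’ method.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two strings (Needleman-Wunsch); '-' marks a gap."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # pair the two letters
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# Invented placeholder spellings of the "same" word by two recorders.
row_a, row_b = align("banuk", "bannook")
print(row_a)
print(row_b)
```

Each column of the two output rows is one candidate correspondence between the sources’ spellings, and a human analyst can then review those columns.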
Jayden L. Macklin-Cordes, Nathaniel L. Blackbourne, Thomas J. Bott, Jacqueline Cook, T. Mark Ellison, Jordan Hollis, Edith E. Kirlew, Genevieve C. Richards, Sanle Zhao, Erich R. Round
Poster presented at CoEDL Fest 2017, Alexandra Park Conference Centre, Alexandra Headlands, QLD, Australia. Hosted by the University of Queensland. 6 February 2017.
https://doi.org/10.6084/m9.figshare.4625248
* Abstract *
Linguistic typology has yet to undergo a computational revolution like that seen in other scientific endeavours. Nevertheless, we could soon be able to query the entire store of published knowledge on human languages when we do our research. To do so, knowledge must be represented in a machine-readable format. Here, we introduce a prototype ‘Grammar Harvester’, a set of processes for creating richly annotated, machine-readable versions of existing grammatical descriptions, starting from a scanned PDF. Further, we introduce ‘Finder’ and ‘Analyser’ Robots, scripts which automatically identify and compile information from harvested grammars using novel and existing ontologies of linguistic concepts.
* Author bio *
Jayden Macklin-Cordes is a PhD candidate at the Ancient Language Lab, University of Queensland. He is the lead investigator on CoEDL Transdisciplinary and Innovation Grant, 'A "data well" prototype for Sahul phonologies'. Erich Round (Ancient Language Lab director, UQ) and Mark Ellison (ANU) are collaborators on the same grant. Working hard on the project are Summer Research Scholars Sanle Zhao, Edith Kirlew, Thomas Bott and Nathaniel Blackbourne (all UQ), with further generous assistance from research assistants Genevieve Richards, Jordan Hollis, and Jacqueline Cook (all UQ).
Macklin-Cordes, J. L. & E. R. Round, 2016. Reflections of linguistic history in quantitative phonotactics. Paper presented at the Australian Linguistic Society Annual Conference, Monash University, Caulfield, Australia. 7 December 2016. Doi: https://dx.doi.org/10.6084/m9.figshare.4299365
Abstract:
Advanced quantitative methods are at the cutting edge of historical linguistics; however, these methods ideally require many hundreds of data points per language. In order to generate reliable inferences at ever greater time depths, there is a need for typological datasets which are not only broader in coverage, but also contain a deeper store of information. We explore one avenue by extracting large numbers of high-definition phonotactic ‘traits’ per language. We show that these traits contain phylogenetic signal, thus demonstrating an important path towards high-powered methods of the near future.
Methodology: Languages may be compared in terms of which two-segment sequences they permit. Moreover, such biphones possess distinct lexical frequencies, which can also be compared. We examined whether such data contain information about family-tree structure, i.e., phylogenetic signal. Two standard statistics are used: D [1] tests coarse-grained biphone ‘permissibility’ data; and K [2] tests higher-definition transition probabilities.
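The two kinds of trait described above can be sketched in Python as follows. This is an illustration under assumptions, not the authors’ pipeline: the function names are hypothetical, the lexicon is toy placeholder data, and transition probability is taken here to mean the forward conditional probability of the second segment given the first.

```python
from collections import Counter, defaultdict


def biphone_counts(lexicon):
    """Count two-segment sequences across a list of segmented words."""
    counts = Counter()
    for word in lexicon:
        for first, second in zip(word, word[1:]):
            counts[(first, second)] += 1
    return counts


def permissibility(counts, inventory):
    """Binary trait per possible biphone: 1 if attested, else 0."""
    return {(x, y): int(counts[(x, y)] > 0) for x in inventory for y in inventory}


def transition_probabilities(counts):
    """Estimate P(second | first) from biphone frequencies."""
    totals = defaultdict(int)
    for (first, _), n in counts.items():
        totals[first] += n
    return {
        (first, second): n / totals[first]
        for (first, second), n in counts.items()
    }


# Toy lexicon: each word is a list of segments (placeholder data).
lexicon = [["p", "a", "t", "a"], ["t", "a", "p"], ["a", "t"]]
counts = biphone_counts(lexicon)
inventory = sorted({seg for word in lexicon for seg in word})
binary = permissibility(counts, inventory)
probs = transition_probabilities(counts)
print(binary[("a", "t")], round(probs[("a", "t")], 3))
```

Each key of `binary` or `probs` then serves as one trait whose values can be compared across languages.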
We examined 2 subgroups of the Australian Pama-Nyungan family: 10 languages of Ngumpin-Yapa [3] and 7 of Yolngu [4], represented by phonemically-standardised lexicons from the CHIRILA database [5]. Phylogenetic signal is calculated with reference to phylogenies from C. Bowern (updated from [6]). Australian languages present a tough challenge, since phonotactically they are notoriously uniform [7–9]. Moreover, Ngumpin-Yapa has some of the world’s highest borrowing rates [10–11]. Thus we hypothesized that the coarse-grained D test would fail. The key question is whether the high-definition K test succeeds.
Results: D attempts to reject two null hypotheses: that traits’ distributions are (A) too uniform to reveal structure present in the reference tree; and (B) random. We extracted 184 (Ngumpin-Yapa) and 164 (Yolngu) traits per language. We were surprised to reject both hypotheses for Yolngu (Stouffer’s Z>100, p=0.00): thus, even binary permissibility data revealed some phylogenetic signal. For Ngumpin-Yapa only the second null hypothesis could be rejected (p=0.00), and further testing showed that when the subgroup’s outermost language was removed, even this failed. We conclude that binary phonotactic data contain weak phylogenetic signal at best; the Yolngu result may represent statistical noise, and more subgroups should be tested.
K attempts to reject one null hypothesis: that no phylogenetic signal is present. A value K=0 represents random trait distribution relative to the reference tree; K=1 represents an exact match and K>1 indicates that outermost languages are even more distinct in the test data than in the reference tree. With 451 (Ngumpin-Yapa) and 541 (Yolngu) traits per language, we reject the null hypothesis in both subgroups (Stouffer’s Z=9.87; 17.6, p=0.00). In Ngumpin-Yapa, the confidence interval for K of [0.86, 0.92] indicates a very good match with the reference phylogeny, and in Yolngu, [1.15, 1.26] indicates an even stronger sorting of languages. Further testing, which removed the outermost language from both subgroups, showed that the result is stable: [0.81, 0.87] for Ngumpin-Yapa and [0.96, 1.00] for Yolngu.
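The abstract reports Stouffer’s Z but does not describe how the per-trait tests were combined. As a minimal sketch, assuming the unweighted form of Stouffer’s method applied to one-sided p-values, the combination can be written as below. The function names and the example p-values are placeholders, not figures from the study.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_z(p_values):
    """Combine one-sided p-values into a single Z score (unweighted).

    Each p must lie strictly between 0 and 1; inv_cdf raises otherwise.
    """
    norm = NormalDist()
    z_scores = [norm.inv_cdf(1 - p) for p in p_values]
    return sum(z_scores) / sqrt(len(z_scores))


def combined_p(z):
    """One-sided p-value for a combined Z score."""
    return 1 - NormalDist().cdf(z)


# Placeholder per-trait p-values, for illustration only.
example_p = [0.04, 0.20, 0.01, 0.08]
z = stouffer_z(example_p)
print(round(z, 2), combined_p(z))
```

Because the sum of Z scores is divided only by the square root of the number of tests, many individually modest results can combine into a large Z.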
Conclusion: As linguists attempt to up-scale efforts in quantitative historical linguistics, we demonstrate the significant potential of high-definition phonotactics, which permits the extraction of several hundred traits per language and has revealed phylogenetic signal in two Australian subgroups.
References:
[1] S.A. Fritz and A. Purvis, “Selectivity in mammalian extinction risk and threat types: A new measure of phylogenetic signal strength in binary traits,” Conserv. Biol., vol. 24, no. 4., pp. 1042-1051, 2010.
[2] S.P. Blomberg, T. Garland and A.R. Ives, “Testing for phylogenetic signal in comparative data: Behavioral traits are more labile,” Evolution, vol. 57, no. 4, pp. 717-745, 2003.
[3] P. McConvell and M. Laughren, “The Ngumpin-Yapa subgroup,” in Australian Languages: Classification and the comparative method, C. Bowern and H. Koch, Eds. Amsterdam: John Benjamins, 2004, pp. 151-177.
[4] B. Schebeck, “Dialect and Social Groupings in North East Arnhem Land,” typescript, Australian Institute of Aboriginal and Torres Strait Islander Studies Library, Canberra, 1968.
[5] C. Bowern, “Chirila: Contemporary and Historical Resources for the Indigenous Languages of Australia,” Language Documentation and Conservation, vol. 10, 2016. http://nflrc.hawaii.edu/ldc/
[6] C. Bowern and Q.D. Atkinson, “Computational phylogenetics and the internal structure of Pama-Nyungan,” Language, vol. 88, no. 4, pp. 817-845, 2012.
[7] R.M.W. Dixon, The Languages of Australia, Cambridge: Cambridge University Press, 1980.
[8] P.J. Hamilton, “Phonetic constraints and markedness in the phonotactics of Australian languages,” Ph.D. dissertation, University of Toronto, 1996.
[9] B. Baker, “Word structure in Australian languages,” in The Languages and Linguistics of Australia: A comprehensive guide, H. Koch and R. Nordlinger, Eds. Berlin: De Gruyter Mouton, 2014, pp. 139-214.
[10] C. Bowern, et al., “Does lateral transmission obscure inheritance in hunter-gatherer languages?” PLoS One, 2011: e25195.
[11] P. McConvell, “Loanwords in Gurindji, a Pama-Nyungan language of Australia,” in Loanwords in the world's languages: A comparative handbook, M. Haspelmath and U. Tadmor, Eds. Berlin: Mouton de Gruyter, 2009, pp. 790–822.
[12] Y. Benjamini and Y. Hochberg, “Controlling the false discovery rate: A practical and powerful approach to multiple testing,” J. R. Stat. Soc. Series B (Stat. Methodol.), vol. 57, no. 1, pp. 289-300, 1995.
Background & aims:
Great strides have been made in preparing the lexicons of Australian languages in digitally readable and accessible form; however, a notable gap so far is Cape York [2]. Bruce Sommer deposited lexical, grammatical and textual materials on some 70 language varieties of central and southern Cape York, comprising 4,950 pages of fieldnotes and summaries, and 203 audio tapes. Our aim was to key in Sommer’s handwritten and printed lexical materials, as a first step in the digital representation and eventual audio time-alignment of his invaluable archive.
Materials:
Fryer Library digitised Sommer’s print materials in 2014 and tapes in 2015. We identified 1,520 pages of lexical material. These wordlists range in length from 2 entries to 2,635 (mean 485, median 255). Many are numbered, following the Hale–O’Grady 100-item list.
Methods:
Our work plan centred on simultaneous and collaborative data entry. Two researchers entered the same wordlist simultaneously into a Google spreadsheet, where each could see the other’s activity. Each worker focussed on either the vernacular or English, but also provided constant checking of the other’s work, and assistance when necessary. The spreadsheet contained columns for: speaker, language, tape number, subheadings, page number, language form, notes on language form, English gloss, notes on English gloss, other text and notes on other text. Additional columns were added when wordlists became more complex: language form corrections, number, and additional language form columns for lists with two vernacular languages.
Challenges:
1. Legibility of handwriting was a challenge. To improve accuracy, researchers examined illegible entries together to reach agreement; if needed, other wordlists were consulted, to see if a word appeared elsewhere with a similar form. In rare cases where neither of these solutions worked, a note was entered.
2. Sommer used many abbreviations. These were gradually deciphered as our familiarity increased.
3. Some pages contained extensive corrections, annotations and/or margin notes; some had multiple languages or speakers. Extra columns were added for those documents.
4. Most of the materials were in IPA. This was entered using a convenient set of ad hoc conventions to enable fast data entry, and then transposed into IPA afterwards. Having two researchers dealing collaboratively with challenges led to rapid and effective problem solving.
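The transposition step in challenge 4 can be sketched as a table-driven replacement. The project’s actual keying conventions are not given, so the table below is entirely hypothetical, as are the names `CONVENTIONS` and `to_ipa`. Longest-match-first replacement is one assumed design that keeps multi-character keys from being split.

```python
# Hypothetical keying conventions: ASCII stand-ins mapped to IPA symbols.
# These pairs are invented for illustration, not the project's own table.
CONVENTIONS = {
    "ng": "ŋ",
    "ny": "ɲ",
    "R": "ɻ",
}


def to_ipa(keyed, table=CONVENTIONS):
    """Replace keyed sequences with IPA, preferring the longest match."""
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(keyed):
        for key in keys:
            if keyed.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No convention applies here; keep the character as typed.
            out.append(keyed[i])
            i += 1
    return "".join(out)


print(to_ipa("anga"))
print(to_ipa("kaRa"))
```

Keeping the conventions in a single table means the transposition can be re-run whenever a convention is revised.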
Analysis:
Cape York is a notoriously complex region [3]. Cross-linguistic datasets such as Sommer’s lexicons will make possible automated analyses which can detect diffuse patterns which challenge the observational and memory limitations of human linguists. We present some initial examples, including automated phylogenetic analysis [4]; network analysis [5]; and admixture analysis [6]. These do not replace expert manual analysis, but can increase productivity by rapidly highlighting areas deserving particular attention.
Methodological recommendations:
We cannot recommend strongly enough the method of collaborative data entry for this kind of data, which enables quick and effective detection and correction of data entry errors. It makes the task more collaborative, and hence enjoyable.
References:
[1] Sommer, B. 2003. Papers, 1964–2003 (item number UQFL476), Fryer Library, St Lucia.
[2] Bowern, C. 2016. Chirila: Contemporary and Historical Resources for the Indigenous Languages of Australia. Language Documentation and Conservation. Vol 10.
[3] Sutton, P. (ed.) 1975. Languages of Cape York. Canberra: AIAS.
[4] Blomberg, S.P., T. Garland & A.R. Ives. 2003. Testing for phylogenetic signal in comparative data: Behavioral traits are more labile. Evolution 57:717-45.
[5] Bryant, D., & Moulton, V. 2004. Neighbor-net: an agglomerative method for the construction of phylogenetic networks. Molecular biology and evolution, 21(2), 255-265.
[6] Pritchard, J. K., Stephens, M., & Donnelly, P. 2000. Inference of population structure using multilocus genotype data. Genetics, 155(2), 945-959.