Publications

Displaying 1 - 7 of 7
  • Levshina, N. (2021). Cross-linguistic trade-offs and causal relationships between cues to grammatical subject and object, and the problem of efficiency-related explanations. Frontiers in Psychology, 12: 648200. doi:10.3389/fpsyg.2021.648200.

    Abstract

    Cross-linguistic studies focus on inverse correlations (trade-offs) between linguistic variables that reflect different cues to linguistic meanings. For example, if a language has no case marking, it is likely to rely on word order as a cue for identification of grammatical roles. Such inverse correlations are interpreted as manifestations of language users’ tendency to use language efficiently. The present study argues that this interpretation is problematic. Linguistic variables, such as the presence of case, or flexibility of word order, are aggregate properties, which do not represent the use of linguistic cues in context directly. Still, such variables can be useful for circumscribing the potential role of communicative efficiency in language evolution, if we move from cross-linguistic trade-offs to multivariate causal networks. This idea is illustrated by a case study of linguistic variables related to four types of Subject and Object cues: case marking, rigid word order of Subject and Object, tight semantics and verb-medial order. The variables are obtained from online language corpora in thirty languages, annotated with the Universal Dependencies. The causal model suggests that the relationships between the variables can be explained predominantly by sociolinguistic factors, leaving little space for a potential impact of efficient linguistic behavior.
  • Levshina, N. (2021). Conditional inference trees and random forests. In M. Paquot, & T. Gries (Eds.), Practical Handbook of Corpus Linguistics (pp. 611-643). New York: Springer.
  • Levshina, N., & Moran, S. (2021). Efficiency in human languages: Corpus evidence for universal principles. Linguistics Vanguard, 7(s3): 20200081. doi:10.1515/lingvan-2020-0081.

    Abstract

    Over the last few years, there has been a growing interest in communicative efficiency. It has been argued that language users act efficiently, saving effort for processing and articulation, and that language structure and use reflect this tendency. The emergence of new corpus data has brought to life numerous studies on efficient language use in the lexicon, in morphosyntax, and in discourse and phonology in different languages. In this introductory paper, we discuss communicative efficiency in human languages, focusing on evidence of efficient language use found in multilingual corpora. The evidence suggests that efficiency is a universal feature of human language. We provide an overview of different manifestations of efficiency on different levels of language structure, and we discuss the major questions and findings so far, some of which are addressed for the first time in the contributions in this special collection.
  • Levshina, N., & Moran, S. (Eds.). (2021). Efficiency in human languages: Corpus evidence for universal principles [Special Issue]. Linguistics Vanguard, 7(s3).
  • Levshina, N. (2021). Communicative efficiency and differential case marking: A reverse-engineering approach. Linguistics Vanguard, 7(s3): 20190087. doi:10.1515/lingvan-2019-0087.
  • Levshina, N. (2020). How tight is your language? A semantic typology based on Mutual Information. In K. Evang, L. Kallmeyer, R. Ehren, S. Petitjean, E. Seyffarth, & D. Seddah (Eds.), Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (pp. 70-78). Düsseldorf, Germany: Association for Computational Linguistics. doi:10.18653/v1/2020.tlt-1.7.

    Abstract

    Languages differ in the degree of semantic flexibility of their syntactic roles. For example, Eng-
    lish and Indonesian are considered more flexible with regard to the semantics of subjects,
    whereas German and Japanese are less flexible. In Hawkins’ classification, more flexible lan-
    guages are said to have a loose fit, and less flexible ones are those that have a tight fit. This
    classification has been based on manual inspection of example sentences. The present paper
    proposes a new, quantitative approach to deriving the measures of looseness and tightness from
    corpora. We use corpora of online news from the Leipzig Corpora Collection in thirty typolog-
    ically and genealogically diverse languages and parse them syntactically with the help of the
    Universal Dependencies annotation software. Next, we compute Mutual Information scores for
    each language using the matrices of lexical lemmas and four syntactic dependencies (intransi-
    tive subjects, transitive subject, objects and obliques). The new approach allows us not only to
    reproduce the results of previous investigations, but also to extend the typology to new lan-
    guages. We also demonstrate that verb-final languages tend to have a tighter relationship be-
    tween lexemes and syntactic roles, which helps language users to recognize thematic roles early
    during comprehension.

    Additional information

    full text via ACL website
  • Levshina, N. (2020). Efficient trade-offs as explanations in functional linguistics: some problems and an alternative proposal. Revista da Abralin, 19(3), 50-78. doi:10.25189/rabralin.v19i3.1728.

    Abstract

    The notion of efficient trade-offs is frequently used in functional linguis-tics in order to explain language use and structure. In this paper I argue that this notion is more confusing than enlightening. Not every negative correlation between parameters represents a real trade-off. Moreover, trade-offs are usually reported between pairs of variables, without taking into account the role of other factors. These and other theoretical issues are illustrated in a case study of linguistic cues used in expressing “who did what to whom”: case marking, rigid word order and medial verb posi-tion. The data are taken from the Universal Dependencies corpora in 30 languages and annotated corpora of online news from the Leipzig Corpora collection. We find that not all cues are correlated negatively, which ques-tions the assumption of language as a zero-sum game. Moreover, the cor-relations between pairs of variables change when we incorporate the third variable. Finally, the relationships between the variables are not always bi-directional. The study also presents a causal model, which can serve as a more appropriate alternative to trade-offs.

Share this page