Publications


Levshina, N. (2020). How tight is your language? A semantic typology based on Mutual Information. In K. Evang, L. Kallmeyer, R. Ehren, S. Petitjean, E. Seyffarth, & D. Seddah (Eds.), Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (pp. 70-78). Düsseldorf, Germany: Association for Computational Linguistics. doi:10.18653/v1/2020.tlt-1.7.

Abstract
Languages differ in the degree of semantic flexibility of their syntactic roles. For example, English and Indonesian are considered more flexible with regard to the semantics of subjects, whereas German and Japanese are less flexible. In Hawkins’ classification, more flexible languages are said to have a loose fit, and less flexible ones are those that have a tight fit. This classification has been based on manual inspection of example sentences. The present paper proposes a new, quantitative approach to deriving the measures of looseness and tightness from corpora. We use corpora of online news from the Leipzig Corpora Collection in thirty typologically and genealogically diverse languages and parse them syntactically with the help of the Universal Dependencies annotation software. Next, we compute Mutual Information scores for each language using the matrices of lexical lemmas and four syntactic dependencies (intransitive subjects, transitive subjects, objects and obliques). The new approach allows us not only to reproduce the results of previous investigations, but also to extend the typology to new languages. We also demonstrate that verb-final languages tend to have a tighter relationship between lexemes and syntactic roles, which helps language users to recognize thematic roles early during comprehension.

Additional information
full text via ACL website

Levshina, N. (2020). Efficient trade-offs as explanations in functional linguistics: some problems and an alternative proposal. Revista da Abralin, 19(3), 50-78. doi:10.25189/rabralin.v19i3.1728.

Abstract
The notion of efficient trade-offs is frequently used in functional linguistics in order to explain language use and structure. In this paper I argue that this notion is more confusing than enlightening. Not every negative correlation between parameters represents a real trade-off. Moreover, trade-offs are usually reported between pairs of variables, without taking into account the role of other factors. These and other theoretical issues are illustrated in a case study of linguistic cues used in expressing “who did what to whom”: case marking, rigid word order and medial verb position. The data are taken from the Universal Dependencies corpora in 30 languages and annotated corpora of online news from the Leipzig Corpora Collection. We find that not all cues are correlated negatively, which questions the assumption of language as a zero-sum game. Moreover, the correlations between pairs of variables change when we incorporate the third variable. Finally, the relationships between the variables are not always bi-directional. The study also presents a causal model, which can serve as a more appropriate alternative to trade-offs.

Levshina, N. (2018). Probabilistic grammar and constructional predictability: Bayesian generalized additive models of help. Glossa: a journal of general linguistics, 3(1): 55. doi:10.5334/gjgl.294.

Abstract
The present study investigates the construction with help followed by the bare or to-infinitive in seven varieties of web-based English from Australia, Ghana, Great Britain, Hong Kong, India, Jamaica and the USA. In addition to various factors known from the literature, such as register, minimization of cognitive complexity and avoidance of identity (horror aequi), it studies the effect of predictability of the infinitive given help and the other way round on the language user’s choice between the constructional variants. These probabilistic constraints are tested in a series of Bayesian generalized additive mixed-effects regression models. The results demonstrate that the to-infinitive is particularly frequent in contexts with low predictability, or, in information-theoretic terms, with high information content. This tendency is interpreted as communicatively efficient behaviour, when more predictable units of discourse get less formal marking, and less predictable ones get more formal marking. However, the strength, shape and directionality of predictability effects exhibit variation across the countries, which demonstrates the importance of the cross-lectal perspective in research on communicative efficiency and other universal functional principles.
