Quantitative computational syntax: some initial results

Bibliographic Details
Main author: Paola Merlo
Format: article
Language: EN
Published: Accademia University Press 2016
Subjects:
H
Online access: https://doaj.org/article/c790c8aaaffa463aa6d63477c4142415
Description
Summary: In the computational study of human intelligence, the language sciences are in the unique position of resting both on sophisticated theories and representations and on large amounts of observational data available for many languages. In this paper, we discuss some recent results, where large-scale, data-intensive computational modelling techniques are used to address fundamental linguistic questions on the quantitative properties of abstract grammatical representations. Specifically, we present a programme of research exemplified in three case studies to identify the causes of frequency differentials. In the area of word order, we discuss work that investigates whether typological and corpus frequencies are systematically correlated with abstract syntactic structures and with higher-level structural principles of minimisation and efficiency. In the area of verb meaning, corpus-based computational models are discussed that investigate how frequencies are correlated with well-known lexical effects in causative alternations and morphological marking. The large corpus-based, cross-linguistic component of the work and the abstract grammatical hypotheses on word order and verb meaning contribute new empirical and computational evidence to the important debate on language variation, its extent and its limits, and illustrate how to bring corpus-based computational methodology to bear on theoretical syntactic issues. In so doing, we help reduce the current gap between theoretical and computational linguistics.