Abstract
Here we present an expanded version of bdproto, a database comprising phonological inventory data from 257 ancient and reconstructed languages. These data were extracted from historical linguistic reconstructions and brought together into a single unified, normalized, accessible, and Unicode-compliant language resource. The dataset is publicly available, and we aim to engage language scientists doing research on language change and language evolution. Furthermore, we identify a hitherto undiscussed temporal bias that complicates the simple comparison of ancient and reconstructed languages with present-day languages. Due to the sparsity of the data and the absence of statistical and computational methods that can adequately handle this bias, we instead directly target rates of change within and across families, thereby providing a case study to highlight bdproto’s research viability. Using phylogenetic comparative methods and high-resolution language family trees, we investigate whether consonantal and vocalic systems differ in their rates of change over the last 10,000 years. In light of the compilation of bdproto and the findings of our case study, we discuss the challenges involved in comparing the sound systems of reconstructed languages with modern-day languages.
Original language | American English |
---|---|
Pages (from-to) | 79-103 |
Number of pages | 25 |
Journal | Language Resources and Evaluation |
Volume | 55 |
Issue number | 1 |
State | Published - Mar 2021 |
Bibliographical note
Funding Information: Open access funding provided by Max Planck Society. Thanks to Egidio Marsico, Sebastien Flavier, Ian Maddieson, and Joël Brogniart, for data collection, curation, and for making openly available the original bdproto data. Balthasar Bickel and Paul Widmer provided the ancient languages data. Special thanks to Hanan Amouyal, Patrick Haller, Gali Katsir, Júda Ronén, Lilja Sæbø, Layla Schwartz, and Yoav Yosef for recent data entry. This work was inspired by Swiss National Science Foundation grant/award number PCEFP1_186841 (Steven Moran, PI).
Publisher Copyright:
© 2020, The Author(s).
Keywords
- Historical linguistics
- Language evolution
- Phonological inventories
- Phylogenetics