DoReCo

DoReCo version 2.0 was published on 12 December 2024! After two years of intense work, this update brings new datasets, substantial improvements to the consistency of annotations and metadata, and a myriad of smaller changes and bug fixes.

DoReCo 2.0 hosts annotated speech data from 53 low-resource and endangered languages from all inhabited continents, inviting cross-linguistic research into phonetics, phonology, and morphology.

DoReCo (Language Documentation Reference Corpus) is jointly edited by Frank Seifart, Ludger Paschen, and Matt Stave. The bulk of the update from v.1.2 to v.2.0 was developed within the AIRAL project at the Leibniz-Centre General Linguistics (ZAS).

Check out the corpus website at https://doreco.huma-num.fr!

AIRAL

AIRAL was a research project funded by the German Research Foundation (DFG) and hosted at ZAS Berlin between 2022 and 2025. AIRAL (Acoustic Insights into the Root-Affix asymmetry across Languages) aimed to shed light on the acoustic properties of roots and affixes in a worldwide sample of 40 languages. The project drew upon a combination of morphological and phonetic time alignments provided by the DoReCo corpus, making it possible to study the effects of morphological structure on fine phonetic detail (duration, spectral properties) across languages with vastly different sound inventories and levels of morphological synthesis.

The main outcomes of AIRAL were:

Publication of DoReCo 2.0 in December 2024
Study on fine differences in acoustic duration between homophonous morphs, published in Journal of Linguistics in 2025
Study on wordhood and prosodic detachability of affixes, published in Linguistic Typology in 2025
Handbook article on phonetic corpora, published in Oxford Research Encyclopedia of Linguistics in 2025
Study on word-initial consonants, published in Nature Human Behaviour in 2024 (in collaboration with a team of co-authors led by Frederic Blum)
Study on cross-linguistic differences and commonalities in speech rhythm, currently under review (in collaboration with a team of co-authors led by Lara S. Burchardt)

The AIRAL team consisted of:

Ludger Paschen (Principal Investigator, ZAS Berlin)
Aleksandr Schamberger (Research Assistant)
Bruno Behling (Research Assistant)
Michelle Elizabeth Throssell Balagué (Research Assistant)

Collaborators:

Susanne Fuchs (ZAS Berlin)
Christoph Draxler (BAS, LMU Munich)
Matt Stave (CNRS)
Rachid Ridouane (Université Paris III - Sorbonne Nouvelle)
Peter M. Arkadiev (University of Potsdam)