DoReCo was a French-German collaborative project that brought together spoken language corpora from 51 languages, extracted from documentation of small and often endangered languages. The resource is intended for cross-linguistic research on phonetics, morphology, and other topics related to spoken language. The project ran from March 2019 to August 2022 and resulted in the openly accessible DoReCo database.

On the DoReCo website, you can explore the 51 datasets, download most annotation and audio files, along with metadata, free of charge and without registration, and find basic guidance on using the DoReCo data. If you have further questions or comments, you can contact us by email or use our GitHub issue tracker.