Through the COLaF project (Corpus et Outils pour les Langues de France, Corpus and Tools for the Languages of France), Inria aims to contribute to the development of free corpora and tools for French and other languages of France, in close collaboration with academic and institutional partners.
The scope of COLaF includes both:
COLaF aims to cover French and the languages of France in all their diversity:
Activity within the project notably covers: the acquisition and structuring of texts from non-textual sources (books, audio recordings, etc.); the classification of large volumes of text by language and linguistic variety (in close connection with the OSCAR project); and the development of annotation and transformation models (translation, normalisation, speech synthesis, sign language generation) serving the development of corpora and the exploitation of newly created resources.
COLaF is an Inria DEFI led by Benoît Sagot (head of the ALMAnaCH project team) and Slim Ouni (head of the MULTISPEECH project team).