Results | COLaF Bibliography

Boula de Mareüil, P., Vernier, F., & Rilliard, A. (2017). Enregistrements et transcriptions pour un atlas sonore des langues régionales de France. Géolinguistique, 17, 23–48. https://hal.science/hal-01719532

Bernhard, D., Ligozat, A.-L., Bras, M., Martin, F., Vergez-Couret, M., Erhart, P., Sibille, J., Todirascu, A., Boula de Mareüil, P., & Huck, D. (2021). Collecting and annotating corpora for three under-resourced languages of France: Methodological issues. Language Documentation & Conservation, 15, 316–357. https://hal.science/hal-03273196

Corral, A., Leturia, I., Séguier, A., Barret, M., Dazéas, B., Boula de Mareüil, P., & Quint, N. (2020). Neural Text-to-Speech Synthesis for an Under-Resourced Language in a Diglossic Environment: the Case of Gascon Occitan. In D. Beermann, L. Besacier, S. Sakti, & C. Soria (Eds.), Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL) (pp. 53–60). European Language Resources Association. https://aclanthology.org/2020.sltu-1.8

Occitan is a minority language spoken in Southern France, some Alpine Valleys of Italy, and the Val d'Aran in Spain, for which language and speech technologies have only very recently started to be developed. This paper describes the first project to design a Text-to-Speech synthesis system for one of its main regional varieties, namely Gascon. We used a state-of-the-art deep neural network approach, the Tacotron2-WaveGlow system. However, we faced two additional challenges: on the one hand, we wanted to test whether good-quality results could be obtained with fewer recording hours than are usually reported for such systems; on the other hand, we needed to achieve a standard, non-Occitan pronunciation of French proper names, so we had to record French words and test phoneme-based approaches. The evaluation carried out over the various systems and approaches developed shows promising results, with near production-ready quality. It has also allowed us to identify the phenomena for which flaws or drops in quality occur, pointing to directions for future work to improve the quality of the current system and to build new systems for other language varieties and voices.

Knyazeva, E., Boula de Mareüil, P., & Vernier, F. (2022). Aesop’s fable “The North Wind and the Sun” Used as a Rosetta Stone to Extract and Map Spoken Words in Under-resourced Languages. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, & S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp. 2072–2079). European Language Resources Association. https://aclanthology.org/2022.lrec-1.223

This paper describes a method of semi-automatic word spotting in minority languages, based on a single Aesop fable, “The North Wind and the Sun”, translated into Romance languages/dialects from Hexagonal (i.e. Metropolitan) France and languages from French Polynesia. The first task consisted of finding out how a dozen words such as “wind” and “sun” were translated in over 200 versions collected in the field, taking advantage of orthographic similarity, word position and context. Occurrences of the translations were then extracted from the phone-aligned recordings. The results were judged accurate in 96–97% of cases, both on the development corpus and on a test set of unseen data. Corrected alignments were then mapped and basemaps were drawn to make various linguistic phenomena immediately visible. The paper exemplifies how regular expressions may be used for this purpose. The final result, which takes the form of an online speaking atlas (enriching the https://atlas.limsi.fr website), enables us to illustrate lexical, morphological or phonetic variation.

Boula de Mareüil, P., Rilliard, A., & Vernier, F. (2018). A Speaking Atlas of the Regional Languages of France. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, & T. Tokunaga (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA). https://aclanthology.org/L18-1652
