Full bibliography
Aesop's fable “The North Wind and the Sun” Used as a Rosetta Stone to Extract and Map Spoken Words in Under-resourced Languages
Resource type
Conference Paper
Authors/contributors
- Knyazeva, Elena (Author)
- Boula de Mareüil, Philippe (Author)
- Vernier, Frédéric (Author)
- Calzolari, Nicoletta (Editor)
- Béchet, Frédéric (Editor)
- Blache, Philippe (Editor)
- Choukri, Khalid (Editor)
- Cieri, Christopher (Editor)
- Declerck, Thierry (Editor)
- Goggi, Sara (Editor)
- Isahara, Hitoshi (Editor)
- Maegaard, Bente (Editor)
- Mariani, Joseph (Editor)
- Mazo, Hélène (Editor)
- Odijk, Jan (Editor)
- Piperidis, Stelios (Editor)
Title
Aesop's fable “The North Wind and the Sun” Used as a Rosetta Stone to Extract and Map Spoken Words in Under-resourced Languages
Abstract
This paper describes a method of semi-automatic word spotting in minority languages, using one and the same Aesop fable, “The North Wind and the Sun”, translated into Romance languages/dialects from Hexagonal (i.e. Metropolitan) France and languages from French Polynesia. The first task consisted of finding out how a dozen words such as “wind” and “sun” were translated in over 200 versions collected in the field, taking advantage of orthographic similarity, word position and context. Occurrences of the translations were then extracted from the phone-aligned recordings. The results were judged accurate in 96–97% of cases, both on the development corpus and on a test set of unseen data. Corrected alignments were then mapped and basemaps were drawn to make various linguistic phenomena immediately visible. The paper exemplifies how regular expressions may be used for this purpose. The final result, which takes the form of an online speaking atlas (enriching the https://atlas.limsi.fr website), enables us to illustrate lexical, morphological or phonetic variation.
Date
2022-06
Proceedings Title
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Conference Name
LREC 2022
Place
Marseille, France
Publisher
European Language Resources Association
Pages
2072–2079
Accessed
02/08/2024 13:58
Library Catalog
ACLWeb
Reference
Knyazeva, E., Boula de Mareüil, P., & Vernier, F. (2022). Aesop’s fable “The North Wind and the Sun” Used as a Rosetta Stone to Extract and Map Spoken Words in Under-resourced Languages. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, & S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp. 2072–2079). European Language Resources Association. https://aclanthology.org/2022.lrec-1.223