Production and training of a POS tagger and lemmatizer for Occitan. COLaF hosts and maintains Deucalion, an API for lemmatization, and Pyrrha, a web app for post-correcting lemmatized and morphosyntactically tagged corpora.
Multi-document diachronic layout analysis dataset
Documentation and validation schema to encode textual data in XML-TEI for COLaF