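dc.contributor.author Sprugnoli, Rachele
dc.contributor.author Pellegrini, Matteo
dc.contributor.author Cecchini, Flavio Massimiliano
dc.contributor.author Passarotti, Marco
dc.date.accessioned 2021-03-09T10:26:37Z
dc.date.available 2021-03-09T10:26:37Z
dc.date.issued 2020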
dc.description Training and gold test data released in EvaLatin 2020, the evaluation campaign of NLP tools for Latin. The two shared tasks proposed in EvaLatin 2020, i.e. Lemmatization and Part-of-Speech tagging, were aimed at fostering research in the field of language technologies for Classical languages. The shared dataset consists of texts taken from the Perseus Digital Library, processed with UDPipe models and then manually corrected by Latin experts. The training set includes only prose texts by Classical authors. The test set includes prose texts by the same authors represented in the training set, as well as poetry and texts from the Medieval period.
dc.language.iso lat
dc.publisher CIRCSE Research Centre, Università Cattolica del Sacro Cuore
dc.relation info:eu-repo/grantAgreement/EC/H2020/769994
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.label PUB
dc.subject Latin
dc.subject POS tagging
dc.subject Lemmatization
dc.title EvaLatin 2020: data
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding OPEN
contact.person Rachele Sprugnoli Università Cattolica del Sacro Cuore
sponsor European Union EC/H2020/769994 LiLa - Linking Latin. Building a Knowledge Base of Linguistic Resources for Latin euFunds info:eu-repo/grantAgreement/EC/H2020/769994
341,419 tokens, 16 files
files.size 3061734
files.count 1

 Files in this item

2.92 MB
Dataset of EvaLatin 2020
 File Preview  
  • EvaLatin-dataset
    • test_gold_data
      • Cicero_InCatilinam_GOLD.conllu (489 kB)
      • Seneca_DeVitaBeata_GOLD.conllu (283 kB)
      • Seneca_DeProvidentia_GOLD.conllu (160 kB)
      • Plinius_Epistulae_10_GOLD.conllu (395 kB)
      • Tacitus_Agricola_GOLD.conllu (273 kB)
      • Caesar_BellumCivile1_GOLD.conllu (444 kB)
      • Tacitus_Germania_GOLD.conllu (221 kB)
      • Horatius-Carmina_GOLD.conllu (524 kB)
      • SummaContraGentiles_IV_GOLD.conllu (451 kB)
    • training_data
      • Caesar_BellumCivile_LiberII.conllu (258 kB)
      • Caesar_BellumGallicum.conllu (1 MB)
      • Pliny_Younger_Epistulae_1-8.conllu (1 MB)
      • Seneca_DeClementia.conllu (320 kB)
      • Seneca_DeBeneficiis.conllu (1 MB)
      • Cicero_Philippica.conllu (2 MB)
      • Tacitus_Historiae.conllu (2 MB)
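
The files listed above carry the .conllu extension, so they can presumably be read as standard CoNLL-U: tab-separated columns (ID, FORM, LEMMA, UPOS, ...), comment lines starting with "#", and a blank line between sentences. The following is a minimal sketch under that assumption, not tooling shipped with the dataset. The names read_conllu and upos_counts are hypothetical helpers, and the sample sentence is hand-written for illustration rather than copied from the corpus.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, str, str]  # (form, lemma, upos)


def read_conllu(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield one list of (form, lemma, upos) tuples per sentence.

    Assumes standard CoNLL-U layout; only the first four columns are used.
    """
    sentence: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment or metadata
        cols = line.split("\t")
        if len(cols) < 4:
            continue  # not a token line in the assumed layout
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        sentence.append((cols[1], cols[2], cols[3]))
    if sentence:
        yield sentence  # file did not end with a blank line


def upos_counts(sentences: Iterable[List[Token]]) -> Counter:
    """Count how often each UPOS tag occurs."""
    return Counter(upos for sent in sentences for _, _, upos in sent)


# Hand-written illustrative sample, NOT taken from the dataset.
SAMPLE = "\n".join(
    [
        "# sent_id = illustrative-1",
        "\t".join(["1", "Gallia", "Gallia", "PROPN"] + ["_"] * 6),
        "\t".join(["2", "est", "sum", "AUX"] + ["_"] * 6),
        "\t".join(["3", "divisa", "divido", "VERB"] + ["_"] * 6),
        "",
    ]
)

sentences = list(read_conllu(SAMPLE.splitlines()))
counts = upos_counts(sentences)
```

To run the same helpers on a real file, one would pass an open file handle instead of the sample, for example open("training_data/Caesar_BellumGallicum.conllu", encoding="utf-8"), assuming the archive has been unpacked and the working directory is its EvaLatin-dataset folder.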
