Please use the following text to cite this item:
Mauri, Caterina; Ballarè, Silvia and Zucchini, Eleonora, 2024, KIParla - ParlaBO transcripts, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", http://hdl.handle.net/20.500.11752/OPEN-2126
| dc.contributor.author | Mauri, Caterina |
| dc.contributor.author | Ballarè, Silvia |
| dc.contributor.author | Zucchini, Eleonora |
| dc.date.accessioned | 2026-02-04T09:33:25Z |
| dc.date.available | 2026-02-04T09:33:25Z |
| dc.date.issued | 2024-10-31 |
| dc.description | The ParlaBO corpus is part of the larger KIParla collection, which can be freely queried through the NoSketch Engine interface. The ParlaBO corpus was compiled within the framework of the “DiverSIta – Diversity in spoken Italian” project, funded by the Italian Ministry of University and Research (MUR) (PRIN 2022 PNRR Call). It was also supported by the Project PNRR PE5: CHANGES – Cultural Heritage Active Innovation for Next-Gen Sustainable Society (Spoke 3: Digital libraries, archives and philology. WP5: Languages and their legacies in oral digital archives: synchronic interdisciplinary perspectives on multilingualism, language minorities, dialects and cultural contact in Italy). It consists of over 65 hours of spoken data collected in Bologna and its province through semi-structured interviews. The interviews, conducted between 2021 and 2024, involved more than 150 speakers with different origins, ages, education levels, and occupations and covered a variety of topics (study, work, leisure activities, retirement, memories of the past, life in the city, traditions, local customs, etc.). The transcriptions have been anonymized. Overall, the module is made up of 86 conversations and includes 155 speakers. This repository contains: • metadata for both speakers (occupation, gender, age, origin, L1, educational achievement) and conversations (collection point, year, languages used), in the metadata subfolder; • descriptions of the transcription conventions used for this module; • for each conversation: an .eaf file in the eaf/ folder (time-aligned Jefferson-style transcription), a .txt file in the linear-jefferson/ folder (linearized Jefferson-style transcription), a .txt file in the linear-orthographic/ folder (linearized transcription retaining only orthographic words), and a .tsv file in the tsv/ folder (tokenised version of the transcription). More information can be found in the README.md file. 
Due to GDPR restrictions, pseudo-anonymized audio files (MP3) are available under a restricted-access license. To request access, please contact the corpus coordinators through the KIParla website and follow the provided procedure. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. |
| dc.identifier.uri | http://hdl.handle.net/20.500.11752/OPEN-2126 |
| dc.language.iso | ita |
| dc.publisher | Alma Mater Studiorum – Università di Bologna |
| dc.relation.isreferencedby | http://ceur-ws.org/Vol-2481/ |
| dc.relation.isreferencedby | https://doi.org/10.60760/unibo/parlabo |
| dc.relation.replaces | http://hdl.handle.net/20.500.11752/OPEN-1050 |
| dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| dc.rights.label | PUB |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ |
| dc.source.uri | https://kiparla.it/parlabo/ |
| dc.subject | human-human spoken dialogues |
| dc.subject | semi-structured interviews |
| dc.subject | spontaneous speech |
| dc.subject | spoken Italian |
| dc.title | KIParla - ParlaBO transcripts |
| dc.type | corpus |
| local.contact.person | Caterina Mauri caterina.mauri@unibo.it Alma Mater Studiorum – Università di Bologna |
| local.contact.person | Ludovica Pannitto ellepannitto@gmail.com Alma Mater Studiorum – Università di Bologna |
| local.contact.person | Silvia Ballarè silvia.ballare@unibo.it Alma Mater Studiorum – Università di Bologna |
| local.demo.uri | https://kiparla.it/en/search/ |
| local.files.count | 4 |
| local.files.size | 18534074 |
| local.has.files | yes |
| local.language.name | Italian |
| local.size.info | 640624 tokens |
| local.size.info | 85 texts |
| local.size.info | 65 hours |
| local.sponsor | nationalFunds PRIN 2022 PNRR n. P2022RFR8T Unione Europea – NextGenerationEU a valere sul Piano Nazionale di Ripresa e Resilienza (PNRR) – Missione 4 Istruzione e ricerca DiverSIta – Diversity in spoken Italian |
| local.sponsor | nationalFunds CHANGES MUR - PNRR PE5 CHANGES – Cultural Heritage Active Innovation for Next-Gen Sustainable Society. Spoke 3: Digital libraries, archives and philology. WP5: Languages and their legacies in oral digital archives: synchronic interdisciplinary perspectives on multilingualism, language minorities, dialects and cultural contact in Italy |
| metashare.ResourceInfo#ContentInfo.mediaType | text |
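The dc.description field above names one folder per transcription format (eaf/, linear-jefferson/, linear-orthographic/, tsv/) plus a metadata subfolder. As a minimal sketch, the following Python function groups the entries of a downloaded archive by those folder names. Where the folders sit inside the zip (for example, under a top-level directory) is an assumption, so the function matches the folder name anywhere in the path; `group_by_folder` and `KNOWN_FOLDERS` are hypothetical names introduced only for this illustration.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Folder names taken from the item description; their exact position
# inside the archive is an assumption, so any path component may match.
KNOWN_FOLDERS = ("eaf", "linear-jefferson", "linear-orthographic", "tsv", "metadata")


def group_by_folder(archive):
    """Map each known folder name to the archive entries stored under it.

    `archive` is a path or a binary file-like object. Entries that sit
    under none of the known folders are collected under "other".
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            parents = PurePosixPath(name).parts[:-1]
            folder = next((p for p in parents if p in KNOWN_FOLDERS), "other")
            groups[folder].append(name)
    return dict(groups)
```

Calling `group_by_folder("ParlaBO-transcripts-v1.1.0.zip")` on a local copy of the archive would then give a quick overview of how many files each format contributes; the README.md file remains the authoritative description of the layout.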
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Files in this item
| Name | ParlaBO-transcripts-v1.1.0.zip |
| Size | 17.64 MB |
| Format | application/zip |
| Description | Zip |
| MD5 | 34c220db9ef992bfa127cb5389987533 |

| Name | README.md |
| Size | 10.72 KB |
| Format | application/octet-stream |
| Description | Unknown |
| MD5 | 6d1d61fe021a451c01dce45be6cde9ed |

| Name | CITATION.cff |
| Size | 3.57 KB |
| Format | application/octet-stream |
| Description | Unknown |
| MD5 | 5f353b5260ae03a73324bab01ed28028 |

| Name | LICENSE |
| Size | 20.36 KB |
| Format | application/octet-stream |
| Description | Unknown |
| MD5 | 5d4469701edbc9ee68ddc28a92aa7167 |
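
The MD5 values listed for the four files can be used to check that a download arrived intact. The sketch below assumes the files were saved under their listed names in a single directory; `md5_of` and `verify_downloads` are hypothetical helper names, not part of any tool provided by the repository.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "ParlaBO-transcripts-v1.1.0.zip": "34c220db9ef992bfa127cb5389987533",
    "README.md": "6d1d61fe021a451c01dce45be6cde9ed",
    "CITATION.cff": "5f353b5260ae03a73324bab01ed28028",
    "LICENSE": "5d4469701edbc9ee68ddc28a92aa7167",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Compare each listed file in `directory` against its expected MD5.

    Returns a dict mapping file name to True (match), False (mismatch),
    or None (file not found in the directory).
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results
```

A call such as `verify_downloads(".")` reports which of the four files match. MD5 is adequate here for detecting corrupted transfers, which is the purpose the listing suggests; it is not a safeguard against deliberate tampering.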


