KIParla - ParlaTO transcripts
Please use the following text to cite this item:
Ballarè, Silvia and Cerruti, Massimo, 2020, KIParla - ParlaTO transcripts, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", http://hdl.handle.net/20.500.11752/OPEN-1051
Date issued
2020-06-01
Size
502754 tokens, 67 texts, 48 hours
Description
The ParlaTO corpus is part of the larger KIParla collection (www.kiparla.it), which can be freely queried through the NoSketch Engine interface.
The ParlaTO corpus was funded by the CRT Foundation ("ParlaTO - Corpus del Parlato di Torino" project).
It consists of about 50 hours of interactions collected in Turin and its province through semi-structured interviews. The interviews, conducted between 2018 and 2020, involved 88 speakers with different origins, ages, education levels, and types of occupation, and addressed personal life experiences in the city (study, work, leisure activities, retirement, memories of the past, etc.). The transcriptions have been anonymized.
Overall, the module is made up of 68 conversations and includes 100 speakers.
This repository contains:
• metadata for both speakers (occupation, gender, age, origin, L1, educational achievement) and conversations (collection point, year, languages used), in the metadata subfolder
• descriptions of the set of transcription conventions used for this module
• for each conversation, four files:
  - an .eaf file in the eaf/ folder (time-aligned Jefferson-style transcription)
  - a .txt file in the linear-jefferson/ folder (linearized Jefferson-style transcription)
  - a .txt file in the linear-orthographic/ folder (linearized transcription retaining only orthographic words)
  - a .tsv file in the tsv/ folder (tokenised version of the transcription)
More information can be found in the README.md file.
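The folder layout described above can be explored without unpacking the archive. The following is a minimal sketch, not part of the corpus tooling: it assumes that the files belonging to one conversation share the same base name across folders, and that the folders may sit under some top-level directory inside the zip. Both points are assumptions to check against README.md. The function name group_by_conversation is hypothetical.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Folder names taken from the repository description above.
TRANSCRIPT_FOLDERS = ("eaf", "linear-jefferson", "linear-orthographic", "tsv")


def group_by_conversation(zip_path):
    """Map each conversation id (file base name) to {folder: archive member}.

    Assumption: the base name of a file identifies its conversation and is
    identical in every folder. Files outside the four transcript folders
    (metadata, README, and so on) are ignored.
    """
    groups = defaultdict(dict)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            path = PurePosixPath(name)
            # The archive may have an unknown top-level prefix, so search the
            # parent directories (innermost first) for a known folder name.
            folder = next(
                (part for part in reversed(path.parts[:-1])
                 if part in TRANSCRIPT_FOLDERS),
                None,
            )
            if folder is not None:
                groups[path.stem][folder] = name
    return dict(groups)
```

Calling group_by_conversation("ParlaTO_transcripts.zip") returns a dictionary that makes it easy to spot conversations missing one of the four formats, for example by filtering for entries with fewer than four keys.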
Due to GDPR restrictions, pseudo-anonymized audio files (MP3) are available under a restricted-access license. To request access, please contact the corpus coordinators through the KIParla website and follow the provided procedure.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Acknowledgement
Cassa di Risparmio di Torino
Project code: Erogazioni Ordinarie 2018, II tornata. Project ID: ID63411
Project name: ParlaTO - Corpus del parlato di Torino
This item is Publicly Available and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Files in this item
- Name: README.md; Size: 10.26 KB; Format: application/octet-stream; Description: Unknown; MD5: 9d75ef56b849defd7eddf6c6e6be2dbe
- Name: ParlaTO_transcripts.zip; Size: 11.07 MB; Format: application/zip; Description: Zip; MD5: dc87e7154a3877d147c6d47efba0fd65
- Name: LICENSE; Size: 20.36 KB; Format: application/octet-stream; Description: Unknown; MD5: 5d4469701edbc9ee68ddc28a92aa7167
- Name: CITATION.cff; Size: 3.97 KB; Format: application/octet-stream; Description: Unknown; MD5: 9dd3fededf5bcb042fe964474e3f5e50
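The MD5 values in the file listing above can be used to check that a download is intact. The following is a minimal sketch using only the Python standard library. The checksums are copied from the listing, while the helper names md5_of and verify are hypothetical, and the local paths passed to them are whatever the files were saved as.

```python
import hashlib

# MD5 values copied from the file listing of this item.
EXPECTED_MD5 = {
    "README.md": "9d75ef56b849defd7eddf6c6e6be2dbe",
    "ParlaTO_transcripts.zip": "dc87e7154a3877d147c6d47efba0fd65",
    "LICENSE": "5d4469701edbc9ee68ddc28a92aa7167",
    "CITATION.cff": "9dd3fededf5bcb042fe964474e3f5e50",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Return True if the file at `path` matches the listed MD5 for `name`."""
    return md5_of(path) == EXPECTED_MD5[name]
```

For example, verify("ParlaTO_transcripts.zip", "ParlaTO_transcripts.zip") should return True for an uncorrupted download. MD5 is adequate here for detecting transfer errors, not for guarding against deliberate tampering.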

