KIParla - ParlaTO transcripts

Please use the following text to cite this item:
Ballarè, Silvia and Cerruti, Massimo, 2020, KIParla - ParlaTO transcripts, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", http://hdl.handle.net/20.500.11752/OPEN-1051
Date issued
2020-06-01
Size
502754 tokens, 67 texts, 48 hours
Language(s)
Description
The ParlaTO corpus is part of the larger KIParla collection (www.kiparla.it), which can be freely queried through the NoSketch Engine interface. The ParlaTO corpus was funded by the CRT Foundation ("ParlaTO - Corpus del Parlato di Torino" project). It consists of about 50 hours of interactions collected in Turin and its province through semi-structured interviews. The interviews, conducted between 2018 and 2020, involved 88 speakers with different origins, ages, education levels, and types of occupation, and addressed personal life experiences in the city (study, work, leisure activities, retirement, memories of the past, etc.). The transcriptions have been anonymized. Overall, the module is made up of 68 conversations and includes 100 speakers.

This repository contains:
• metadata for both speakers (occupation, gender, age, origin, L1, educational achievement) and conversations (collection point, year, languages used), in the metadata subfolder
• descriptions of the set of transcription conventions used for this module
• for each conversation: an .eaf file in the eaf/ folder (time-aligned Jefferson-style transcription); a .txt file in the linear-jefferson/ folder (linearized Jefferson-style transcription); a .txt file in the linear-orthographic/ folder (linearized transcription retaining only orthographic words); a .tsv file in the tsv/ folder (tokenised version of the transcription)

More information can be found in the README.md file.

Due to GDPR restrictions, pseudo-anonymized audio files (MP3) are available under a restricted-access license. To request access, please contact the corpus coordinators through the KIParla website and follow the provided procedure.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
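The per-conversation layout described above can be checked after download with a short script. The following is a minimal sketch, not part of the corpus tooling: the four folder names and extensions come from the description, while the assumptions that all files for one conversation share the same file stem and that the archive may wrap the folders in a root directory are illustrative only. The function names are hypothetical.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Folder names and extensions as listed in the description. The exact layout
# inside the archive (e.g. an extra top-level directory) is an assumption.
EXPECTED = {
    "eaf": ".eaf",
    "linear-jefferson": ".txt",
    "linear-orthographic": ".txt",
    "tsv": ".tsv",
}


def group_by_conversation(member_names):
    """Map each file stem to the set of expected folders it appears in.

    Assumes (hypothetically) that one conversation uses the same file stem
    in every folder.
    """
    found = defaultdict(set)
    for name in member_names:
        path = PurePosixPath(name)
        # Use the immediate parent folder so a wrapping root directory is tolerated.
        folder = path.parent.name
        if EXPECTED.get(folder) == path.suffix.lower():
            found[path.stem].add(folder)
    return dict(found)


def incomplete_conversations(member_names):
    """Return {stem: [missing folders]} for stems lacking any of the four formats."""
    result = {}
    for stem, folders in group_by_conversation(member_names).items():
        missing = sorted(set(EXPECTED) - folders)
        if missing:
            result[stem] = missing
    return result


def members_of(zip_path):
    """List the member names of a zip archive such as ParlaTO_transcripts.zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()
```

Calling `incomplete_conversations(members_of("ParlaTO_transcripts.zip"))` on a local copy should return an empty dictionary if every conversation is present in all four formats and the naming assumption holds; a non-empty result may simply mean the assumption is wrong, so the README.md remains the reference.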
Acknowledgement
Files in this item
Name
README.md
Size
10.26 KB
Format
application/octet-stream
Description
Unknown
MD5
9d75ef56b849defd7eddf6c6e6be2dbe
Name
ParlaTO_transcripts.zip
Size
11.07 MB
Format
application/zip
Description
Zip
MD5
dc87e7154a3877d147c6d47efba0fd65
Name
LICENSE
Size
20.36 KB
Format
application/octet-stream
Description
Unknown
MD5
5d4469701edbc9ee68ddc28a92aa7167
Name
CITATION.cff
Size
3.97 KB
Format
application/octet-stream
Description
Unknown
MD5
9dd3fededf5bcb042fe964474e3f5e50
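The MD5 values listed for each file can be used to confirm that a download is complete and uncorrupted. The sketch below assumes the files have been saved locally under their listed names; the helper names `md5_of` and `matches_published` are hypothetical and not part of any repository tooling.

```python
import hashlib

# MD5 values copied from the file listing above.
PUBLISHED_MD5 = {
    "README.md": "9d75ef56b849defd7eddf6c6e6be2dbe",
    "ParlaTO_transcripts.zip": "dc87e7154a3877d147c6d47efba0fd65",
    "LICENSE": "5d4469701edbc9ee68ddc28a92aa7167",
    "CITATION.cff": "9dd3fededf5bcb042fe964474e3f5e50",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, name):
    """Compare a local file's digest with the value published for `name`."""
    return md5_of(path) == PUBLISHED_MD5[name]
```

For example, `matches_published("ParlaTO_transcripts.zip", "ParlaTO_transcripts.zip")` should return `True` for an intact copy. MD5 is suitable here only as a check against accidental corruption, not as a security guarantee.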