KIParla - KIP transcripts

Please use the following text to cite this item:
Ballarè, Silvia; Goria, Eugenio and Mauri, Caterina, 2019, KIParla - KIP transcripts, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", http://hdl.handle.net/20.500.11752/OPEN-1048
Date issued
2019-06-04
Size
605306 tokens, 121 texts, 70 hours
Language(s)
Description
The KIP corpus is part of the larger KIParla collection (www.kiparla.it), which can be freely queried through the NoSketch Engine interface. The KIP corpus was compiled within the framework of the LEAdhoC project – Linguistic Expression of Ad Hoc Categories, funded by the Italian Ministry of Education, University and Research (MIUR) under the SIR 2016 call.

It consists of approximately 70 hours of spoken data collected at the Universities of Bologna and Turin. The interactions, recorded between 2016 and 2019, involved over 180 speakers, including university students and professors from various regions of Italy, and took place in five different types of communicative situations: lessons, exams, office hours, semi-structured interviews, and free conversations (among students). The transcriptions have been anonymized. Overall, the module is made up of 121 conversations and includes 184 speakers.

This repository contains:
- metadata for both speakers (age, origin, occupation, gender) and conversations (type of interaction), in the metadata subfolder
- descriptions of the set of transcription conventions used for this module (Transcription conventions)
- transcripts of the recorded conversations in the following formats:
  - .eaf file in eaf/ folder (time-aligned Jefferson-style transcriptions)
  - .txt file in linear-jefferson/ folder (linearized Jefferson-style transcription)
  - .txt file in linear-orthographic/ folder (linearized transcription retaining only orthographic words)
  - .tsv file in tsv/ folder (tokenised version of the transcription)

More information can be found in the README.md file.

Due to GDPR restrictions, pseudo-anonymized audio files (MP3) are available under a restricted-access license. To request access, please contact the corpus coordinators through the KIParla website and follow the provided procedure.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
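As a quick orientation after downloading, the folder layout described above can be checked with a short script. The following is only a minimal sketch in Python: it assumes the archive is the KIP_transcripts.zip file listed on this page and that the folders named in the description (eaf/, linear-jefferson/, linear-orthographic/, tsv/, metadata) appear somewhere in each member path, possibly under a root folder. The function names are hypothetical and not part of the corpus distribution.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Folder names taken from the description; where exactly they sit inside
# the archive (top level or nested under a root folder) is an assumption.
KNOWN_FOLDERS = ("eaf", "linear-jefferson", "linear-orthographic", "tsv", "metadata")


def count_by_folder(member_names):
    """Count archive members per known folder, wherever it occurs in the path."""
    counts = Counter()
    for name in member_names:
        if name.endswith("/"):  # skip directory entries
            continue
        parts = PurePosixPath(name).parts[:-1]  # directory components only
        folder = next((p for p in parts if p in KNOWN_FOLDERS), "(other)")
        counts[folder] += 1
    return counts


def summarize_archive(zip_path="KIP_transcripts.zip"):
    """Open the archive and return the per-folder file counts."""
    with zipfile.ZipFile(zip_path) as archive:
        return count_by_folder(archive.namelist())
```

Calling `summarize_archive()` next to the downloaded archive returns a counter keyed by folder name. Files outside the known folders (for example a README) are grouped under "(other)".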
Acknowledgement
 Files in this item
Name
README.md
Size
10.23 KB
Format
application/octet-stream
Description
Unknown
MD5
895cfdd7665ecb2bad7063bd42cf019e
Name
LICENSE
Size
20.36 KB
Format
application/octet-stream
Description
Unknown
MD5
5d4469701edbc9ee68ddc28a92aa7167
Name
CITATION.cff
Size
3.98 KB
Format
application/octet-stream
Description
Unknown
MD5
0d34f0dc5cecc0838258f4667b504af6
Name
KIP_transcripts.zip
Size
12.94 MB
Format
application/zip
Description
Zip
MD5
a5c6fbcee173695c99d5a5c051ec3b52
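The MD5 values listed for each file can be used to check that a download is intact. The following is a minimal sketch in Python using the standard hashlib module. The expected digests are copied from the listing above. The helper names and the assumption that the files sit in the current directory under their listed names are illustrative only.

```python
import hashlib
import os

# Digests copied from the file listing on this page.
EXPECTED_MD5 = {
    "README.md": "895cfdd7665ecb2bad7063bd42cf019e",
    "LICENSE": "5d4469701edbc9ee68ddc28a92aa7167",
    "CITATION.cff": "0d34f0dc5cecc0838258f4667b504af6",
    "KIP_transcripts.zip": "a5c6fbcee173695c99d5a5c051ec3b52",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file found in `directory` to True if its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # files not downloaded are simply skipped
            results[name] = md5_of_file(path) == expected
    return results
```

A `False` entry in the result of `verify_downloads()` suggests a corrupted or outdated download. Files that are not present are left out of the result rather than reported as failures.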