KIParla - KIP transcripts
Please use the following text to cite this item:
Ballarè, Silvia; Goria, Eugenio and Mauri, Caterina, 2019, KIParla - KIP transcripts, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", http://hdl.handle.net/20.500.11752/OPEN-1048
Date issued
2019-06-04
Size
605306 tokens, 121 texts, 70 hours
Description
The KIP corpus is part of the larger KIParla collection (www.kiparla.it), which can be freely queried through the NoSketch Engine interface.
The KIP corpus was compiled within the framework of the LEAdhoC project – Linguistic Expression of Ad Hoc Categories, funded by the Italian Ministry of Education, University and Research (MIUR) under the SIR 2016 call.
It consists of approximately 70 hours of spoken data collected at the Universities of Bologna and Turin. The 121 conversations, recorded between 2016 and 2019, involved 184 speakers, including university students and professors from various regions of Italy, and took place in five types of communicative situations: lessons, exams, office hours, semi-structured interviews, and free conversations among students. The transcriptions have been anonymized.
This repository contains:
- metadata for both speakers (age, origin, occupation, gender) and conversations (type of interaction), in the metadata subfolder
- descriptions of the set of transcription conventions used for this module (Transcription conventions)
- transcripts of the recorded conversations in the following formats:
  - .eaf files in the eaf/ folder (time-aligned Jefferson-style transcriptions)
  - .txt files in the linear-jefferson/ folder (linearized Jefferson-style transcriptions)
  - .txt files in the linear-orthographic/ folder (linearized transcriptions retaining only orthographic words)
  - .tsv files in the tsv/ folder (tokenised versions of the transcriptions)
More information can be found in the README.md file.
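As an illustration of how the time-aligned .eaf transcripts could be read, here is a minimal Python sketch. It assumes only the standard ELAN annotation format (time slots in milliseconds, tiers containing alignable annotations). The function name `read_eaf_annotations` is hypothetical, and the tier naming actually used in this corpus is not described here, so the sketch simply returns every tier it finds.

```python
import xml.etree.ElementTree as ET


def read_eaf_annotations(source):
    """Return (tier_id, start_ms, end_ms, text) tuples from an ELAN .eaf file.

    `source` is a file path or a binary file object. This is a hypothetical
    helper, not part of the KIParla distribution.
    """
    root = ET.parse(source).getroot()

    # Map each time slot ID to its offset in milliseconds.
    # Slots without a TIME_VALUE (unaligned) are skipped.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }

    rows = []
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID")
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier_id, start, end, text))

    # Sort by start time so that units from different tiers interleave
    # chronologically; annotations without a start time go last.
    rows.sort(key=lambda row: (row[1] is None, row[1] or 0))
    return rows
```

Called on a file extracted from the eaf/ folder, the function yields one tuple per transcribed unit, which can then be filtered by tier or time span. How tiers map to speakers should be checked against README.md.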
Due to GDPR restrictions, pseudo-anonymized audio files (MP3) are available under a restricted-access license. To request access, please contact the corpus coordinators through the KIParla website and follow the provided procedure.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Acknowledgement
MUR - SIR 2016
Project code: RBSI14IIG0
Project name: LEAdhoC - Linguistic expression of ad hoc categories
This item is publicly available.
Files in this item
- README.md: 10.23 KB, application/octet-stream, description: Unknown, MD5 895cfdd7665ecb2bad7063bd42cf019e
- LICENSE: 20.36 KB, application/octet-stream, description: Unknown, MD5 5d4469701edbc9ee68ddc28a92aa7167
- CITATION.cff: 3.98 KB, application/octet-stream, description: Unknown, MD5 0d34f0dc5cecc0838258f4667b504af6
- KIP_transcripts.zip: 12.94 MB, application/zip, description: Zip, MD5 a5c6fbcee173695c99d5a5c051ec3b52
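The MD5 values listed for the files can be used to check that a download is intact. The following Python sketch shows one way to do this. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# MD5 listed on this page for KIP_transcripts.zip.
EXPECTED_MD5 = "a5c6fbcee173695c99d5a5c051ec3b52"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_listed_checksum("KIP_transcripts.zip")` should return True for an uncorrupted copy saved under that name in the working directory. MD5 is suitable here only as an integrity check against transfer errors, not as protection against tampering.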

