The Harmonia Corpus – A Dialogue Corpus for Automatic Analysis of Phonetic Convergence

Authors: Grazyna Demenko, Mariusz Owsianny, Jolanta Bachan
Year of publication: 2020
Subject:
Source: Human Language Technology. Challenges for Computer Science and Linguistics ISBN: 9783030665265
LCT
Description: This work presents the creation of a dialogue corpus for the analysis and formal evaluation of phonetic convergence in spoken dialogues, in both human-human and human-machine communication, with the goal of comparing dialogue features at all levels of language use. The Harmonia corpus was created within a project that aims at (1) extracting phonetic features that can be mapped onto a synthetic signal, (2) creating dialogue models applicable to human-machine interaction, and (3) evaluating convergence in practice. The following language groups were recorded for the corpus: 16 pairs of Polish speakers speaking Polish (native speech), 10 pairs of German speakers speaking German (native speech), 12 pairs of German and Polish speakers speaking Polish (non-native speech), and 10 pairs of Polish and German speakers speaking German (non-native speech). The speakers could hear but not see each other. The recording scenarios consisted of controlled, neutral and expressive tasks and yielded over 27 hours of speech. This combination of scenarios is novel and promises to provide an empirical foundation for both linguistic and computational modelling of face-to-face and human-machine dialogue.
Database: OpenAIRE