Relevant Criteria for Selection of Spoken Data: Theory Meets Practice
Authors: | Marie Kopřivová, Petra Poukarová, Zuzana Komrsková, David Lukeš |
---|---|
Year of publication: | 2019 |
Subject: |
050101 languages & linguistics; Linguistics and Language; 060101 anthropology; business.industry; Computer science; 05 social sciences; 06 humanities and the arts; computer.software_genre; Language and Linguistics; 0501 psychology and cognitive sciences; 0601 history and archaeology; Artificial intelligence; business; computer; Natural language processing; Selection (genetic algorithm) |
Source: | Journal of Linguistics/Jazykovedný časopis. 70:324-335 |
ISSN: | 1338-4287 0021-5597 |
DOI: | 10.2478/jazcas-2019-0062 |
Abstract: | The present paper seeks to review relevant criteria used in classifying speech events (SEs) from the perspective of spoken corpus design. The primary goal is to survey the landscape of possible types of spoken language, so as to assess in which directions the coverage of spoken Czech offered by Czech National Corpus corpora can be expanded in the future. We approach the problem from both theoretical and practical points of view, examining what the theoretical literature has to say as well as approaches implemented in practice by existing spoken corpora of various languages. We then synthesize the obtained information into a pragmatically motivated set of SE classification criteria which does not aspire to be universal or definitive but aims to serve as a useful guiding principle and conceptual framework for understanding and promoting SE diversity when collecting spoken data. |
Database: | OpenAIRE |