Speaker role clustering using turn features and maximum inter-cluster distances
Author: | Zhuoming Chen, Xue Zhang, Aiwu Chen, Qian Huang, Xianku Li, Xiaohui Feng, Jichen Yang, Yanxiong Li |
---|---|
Year of publication: | 2016 |
Subject: |
Clustering high-dimensional data; Fuzzy clustering; Computer science; Speech recognition; Correlation clustering; Single-linkage clustering; 020206 networking & telecommunications; Pattern recognition; 02 engineering and technology; Complete-linkage clustering; Determining the number of clusters in a data set; ComputingMethodologies_PATTERNRECOGNITION; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; Cluster analysis; k-medians clustering |
Source: | 2016 International Conference on Audio, Language and Image Processing (ICALIP). |
DOI: | 10.1109/icalip.2016.7846538 |
Description: | Speaker role clustering aims to determine the number of distinct roles and to merge the utterances of each role into one cluster in an unsupervised way, which is important for rich transcription of multi-speaker spoken documents. This paper presents an approach to role clustering using turn features and maximum inter-cluster distances. The turn features of each speaker are extracted from the audio output of speaker diarization and used as the initial clusters. In each clustering iteration, the cluster pair with the minimum distance (e.g. C_A and C_B) is merged and the number of clusters is decreased by one if the distance among the N_c - 1 clusters (after merging C_A and C_B) is larger than that among the N_c clusters (without merging C_A and C_B); otherwise, the iteration stops. Evaluated on four types of multi-speaker spoken documents, the proposed approach outperforms the previous clustering approach and comes close to the supervised approach in terms of K scores. |
Database: | OpenAIRE |
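
The merge-and-stop rule summarized in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define the turn features or how the distance among a set of clusters is computed, so the sketch assumes plain numeric feature tuples, Euclidean distance between cluster centroids, and the mean pairwise centroid distance as the overall inter-cluster distance. The names `centroid`, `overall_distance`, and `cluster_roles` are hypothetical.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Mean feature vector of a cluster (a list of equal-length tuples)."""
    n = len(cluster)
    return tuple(sum(dim) / n for dim in zip(*cluster))


def overall_distance(clusters):
    """ASSUMPTION: mean pairwise Euclidean distance between centroids.

    The abstract does not say how the distance among N_c clusters is
    measured; this is a stand-in for illustration only.
    """
    cents = [centroid(c) for c in clusters]
    pairs = list(combinations(cents, 2))
    if not pairs:  # a single cluster has no inter-cluster distance
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def cluster_roles(turn_features):
    """Greedy merging: one initial cluster per speaker, then merge the
    closest pair only while that increases the overall distance."""
    clusters = [[f] for f in turn_features]
    while len(clusters) > 1:
        # Find the cluster pair (C_A, C_B) with the minimum centroid distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: math.dist(centroid(clusters[p[0]]),
                                    centroid(clusters[p[1]])),
        )
        # Candidate partition with N_c - 1 clusters.
        merged = [c for k, c in enumerate(clusters) if k not in (i, j)]
        merged.append(clusters[i] + clusters[j])
        if overall_distance(merged) > overall_distance(clusters):
            clusters = merged  # accept the merge
        else:
            break  # merging no longer increases the distance: stop
    return clusters


# Hypothetical 2-D turn features for four speakers.
speakers = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (5.0, 9.0)]
roles = cluster_roles(speakers)
print(len(roles))  # 3
```

With these made-up features, the first two speakers are merged into one role and the iteration then stops, because a further merge would lower the assumed overall distance. Under this assumed criterion the procedure never collapses everything into a single cluster, since a single cluster has an overall distance of zero.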