Using Closed-Set Speaker Identification Score Confidence to Enhance Audio-Based Collaborative Filtering for Multiple Users
Author: | Sven Ewan Shepstone, Zheng-Hua Tan, Miklas Strøm Kristoffersen |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: | Closed set; Computer science; Collaborative filtering (CF); Speech recognition; 020206 networking & telecommunications; 02 engineering and technology; Affect (psychology); Range (mathematics); 0202 electrical engineering, electronic engineering, information engineering; Media Technology; Collaborative filtering; Speaker identification; 020201 artificial intelligence & image processing; Electrical and Electronic Engineering; confidence; Baseline (configuration management) |
Source: | Shepstone, S E, Tan, Z-H & Kristoffersen, M S 2018, 'Using Closed-Set Speaker Identification Score Confidence to Enhance Audio-Based Collaborative Filtering for Multiple Users', IEEE Transactions on Consumer Electronics, vol. 64, no. 1, pp. 11-18. https://doi.org/10.1109/TCE.2018.2811250 |
DOI: | 10.1109/TCE.2018.2811250 |
Description: | In this paper, we utilize a closed-set speaker-identification approach to convey the ratings needed for collaborative filtering-based recommendation. Instead of explicitly providing a rating for a given program, users dictate the desired rating through a speech interface after watching a movie. Because even a state-of-the-art speaker identification system can be inaccurate, one user may be mistaken for another user in the household, especially when the users exhibit similar or identical age and gender demographics. This has the undesirable effect of injecting unwanted ratings into the collaborative rating matrix and, when the users have different tastes, can result in the recommendation of undesirable items. We therefore propose a simple confidence-based heuristic that utilizes the log-likelihood scores from the speaker identification front-end. The algorithm limits the degree to which unwanted ratings negatively affect the integrity of the ratings information. Using real-speaker utterances over a range of age and gender demographics, we compare our approach against upper- and lower-bound (non-speaker-identification-based) baseline systems. Results show that, by taking the confidence of the identified users into account, we improve upon the lower-bound baseline, which unconditionally accepts ratings, by a relative 6.9%. |
Database: | OpenAIRE |
External link: |
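
The description mentions a confidence-based heuristic built on the log-likelihood scores of the speaker identification front-end, but the record does not give its details. The Python sketch below is only one plausible reading, not the authors' algorithm: it assumes confidence is the margin between the best and second-best log-likelihood score, and that a dictated rating is written to the rating matrix only when that margin reaches a threshold. The function names, the `min_margin` parameter, and its default value are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def identify_with_confidence(log_likelihoods: Dict[str, float]) -> Tuple[str, float]:
    """Return the top-scoring enrolled speaker and the score margin to the runner-up.

    Assumption: the margin between the two best log-likelihood scores is used
    as the confidence measure. The source does not specify the actual measure.
    """
    if len(log_likelihoods) < 2:
        raise ValueError("closed-set identification needs at least two enrolled speakers")
    ranked = sorted(log_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (best_user, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_user, best_score - second_score


def maybe_record_rating(
    ratings: Dict[str, Dict[str, float]],
    log_likelihoods: Dict[str, float],
    item: str,
    rating: float,
    min_margin: float = 2.0,  # hypothetical threshold, not taken from the paper
) -> Optional[str]:
    """Store the rating for the identified user only if the confidence is high enough.

    Returns the user the rating was attributed to, or None if it was rejected.
    """
    user, margin = identify_with_confidence(log_likelihoods)
    if margin < min_margin:
        # Low confidence: skip the rating rather than risk attributing it
        # to the wrong household member.
        return None
    ratings.setdefault(user, {})[item] = rating
    return user
```

With two household members, an utterance scored `{"alice": -41.0, "bob": -47.5}` would be attributed to `alice` under the default threshold, while near-tied scores such as `{"alice": -41.0, "bob": -41.3}` would be rejected. Rejecting low-confidence ratings outright is the simplest policy; down-weighting them instead would be another option under the same assumption.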