Authors: |
Bentsen T; Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark., Kressner AA; Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark., Dau T; Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark., May T; Hearing Systems Group, Department of Electrical Engineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark. |
Abstract: |
Computational speech segregation aims to automatically segregate speech from interfering noise, often by employing ideal binary mask estimation. Several studies have tried to exploit contextual information in speech to improve mask estimation accuracy using two common strategies that (1) incorporate delta features and (2) employ support vector machine (SVM)-based integration. In this study, two experiments were conducted. In Experiment I, the impact of exploiting spectro-temporal context using these strategies was investigated in stationary and six-talker noise. In Experiment II, the delta features were explored in detail and tested in a setup that considered novel noise segments of the six-talker noise. Computing delta features led to higher intelligibility than employing SVM-based integration, and intelligibility increased with the amount of spectral information exploited via the delta features. The system did not, however, generalize well to novel segments of the six-talker noise. Measured intelligibility was subsequently compared to extended short-term objective intelligibility, hit-false alarm rate, and the amount of mask clustering. None of these objective measures alone could account for measured intelligibility. The findings may have implications for the design of speech segregation systems and for the selection of a cost function that correlates with intelligibility. |