Author: |
Kobashikawa Satoshi, Yamaguchi Yoshikazu, Takahashi Satoshi, Masataki Hirokazu, Asami Taichi, Hori Takaaki |
Year of publication: |
2012 |
Subject: |
|
Source: |
SLT |
DOI: |
10.1109/slt.2012.6424209 |
Description: |
This paper proposes a technique that efficiently controls the beam width to keep computation times practical when automatically transcribing massive volumes of speech. We focus on the fact that much time is wasted recognizing poor-quality speech, which is decoded inordinately slowly, yields erroneous transcriptions, and provides no useful results. To stabilize the computation time regardless of speech quality, the proposed technique controls the beam width based on the prolonged score spread for the target speech; it formulates the score range within the width and maximizes computation efficiency by regulating that range in relation to the hypotheses' survival rate. The technique can set the width rapidly, using only monophones, before decoding begins. During decoding, it further restricts the width based on the processing speed and the remaining data time, to better handle stubborn speech. Experiments at several SNRs and on actual call-center speech confirm a reduction in computation time while matching the accuracy of existing techniques. |
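The abstract says the beam width is restricted during decoding using the processing speed and the remaining data time, but it gives no formula. The following is only a minimal sketch of one way such a restriction could work, not the authors' method. The function name `restrict_beam_width`, all of its parameters, the fixed time budget, and the assumption that decoding cost scales roughly linearly with the beam width are hypothetical and were introduced here purely for illustration.

```python
def restrict_beam_width(current_beam, elapsed_s, processed_audio_s,
                        total_audio_s, budget_s, min_beam=1.0):
    """Hypothetical beam-width cap driven by speed and remaining audio.

    Assumption (not from the paper): decoding cost is roughly
    proportional to the beam width, so shrinking the beam by a factor
    k is expected to shrink the processing time per second of audio
    by about k.
    """
    remaining_audio_s = total_audio_s - processed_audio_s
    if remaining_audio_s <= 0:
        # Nothing left to decode, so no adjustment is needed.
        return current_beam
    if processed_audio_s <= 0 or elapsed_s <= 0:
        # No speed measurement yet; keep the current width.
        return current_beam

    remaining_budget_s = budget_s - elapsed_s
    if remaining_budget_s <= 0:
        # Budget already spent: fall back to the narrowest beam.
        return min_beam

    # Processing speed so far, as a real-time factor
    # (seconds of computation per second of audio).
    observed_rtf = elapsed_s / processed_audio_s
    # Real-time factor needed to finish the rest within the budget.
    allowed_rtf = remaining_budget_s / remaining_audio_s

    # Only ever narrow the beam here; widening is left to other logic.
    scale = min(1.0, allowed_rtf / observed_rtf)
    return max(min_beam, current_beam * scale)
```

For example, with a 60-second budget for 60 seconds of audio, a decoder that has spent 30 seconds on the first 20 seconds of audio is running at twice the speed it can afford for the remainder, so this sketch halves a beam width of 200 to 100. A decoder that is on schedule keeps its width unchanged.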
Database: |
OpenAIRE |
External link: |
|