Voting experts: An unsupervised algorithm for segmenting sequences

Authors: Paul R. Cohen, Niall M. Adams, Brent Heeringa
Publication year: 2007
Subject:
Source: Intelligent Data Analysis. 11:607-625
ISSN: 1571-4128, 1088-467X
DOI: 10.3233/ida-2007-11603
Description: We describe a statistical signature of chunks and an algorithm for finding chunks. While there is no formal definition of chunks, they may be reliably identified as configurations with low internal entropy (unpredictability) and high entropy at their boundaries. We show that the log frequency of a chunk is a measure of its internal entropy. The Voting-Experts algorithm exploits this signature to find word boundaries in text from four languages and episode boundaries in the activities of a mobile robot.
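The chunk signature described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not include the voting procedure; it only counts n-grams in a toy string and measures the entropy of the symbol that follows each n-gram. The function names `ngram_stats` and `boundary_entropy` and the toy corpus are assumptions made for this illustration.

```python
import math
from collections import Counter, defaultdict


def ngram_stats(text, n):
    """Count every n-gram in text and the symbols that follow each one.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    followers = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        counts[gram] += 1
        if i + n < len(text):  # the last n-gram has no following symbol
            followers[gram][text[i + n]] += 1
    return counts, followers


def boundary_entropy(followers, gram):
    """Shannon entropy (in bits) of the next-symbol distribution after gram."""
    dist = followers.get(gram, Counter())
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in dist.values())


# Toy corpus (an assumption): four "words" that each start with "the",
# written without spaces, as in an unsegmented sequence.
text = "thecatthedogthecowthepig"

counts3, followers3 = ngram_stats(text, 3)
counts2, followers2 = ngram_stats(text, 2)

# "the" is frequent and is followed by several different symbols,
# so the entropy at its right boundary is high.
print(counts3["the"], boundary_entropy(followers3, "the"))

# "th" is always followed by "e", so the entropy after it is zero:
# this position lies inside a chunk rather than at its boundary.
print(counts2["th"], boundary_entropy(followers2, "th"))
```

In this toy string, the frequent n-gram "the" has high entropy at its right edge, while the prefix "th" has none, which matches the signature stated in the description: chunks are frequent (low internal entropy) and unpredictable at their boundaries.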
Database: OpenAIRE