Using idiolects and sociolects to improve word prediction

Author: Stoop, W.M.C.A., Bosch, A.P.J. van den
Year of publication: 2014
Subject:
Source: Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 318-327. [S.l.] : Association for Computational Linguistics
Description: Contains fulltext: 127069.pdf (Publisher's version) (Open Access). This paper describes the word prediction system Soothsayer, which predicts what a user is going to write while they are keying it in. The main innovation of Soothsayer is that its source of knowledge is not only the idiolect, the language of one individual person, but also the sociolect, the language of the social circle around that person. We use Twitter for data collection and experimentation. The idiolect models are based on individual Twitter feeds; the sociolect models are based on the tweets of a particular person together with the tweets of the people that person often communicates with. The underlying idea is that people who communicate often start to talk alike, so the language of the friends of person x can help predict what person x is going to say. The sociolect-based approach achieved the best results. For a number of users, more than 50% of the keystrokes could have been saved if they had used Soothsayer. 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2014), 26 April 2014
Database: OpenAIRE
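
The abstract above does not say how Soothsayer builds or combines its models, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes simple bigram counts, a hypothetical back-off order (idiolect first, then sociolect), and a simplified cost model in which accepting a prediction costs one keystroke and spaces are not counted. All function names and the toy tweets are invented for illustration.

```python
from collections import Counter, defaultdict

def train(texts):
    """Count how often each word follows the previous word (bigram counts)."""
    model = defaultdict(Counter)
    for text in texts:
        prev = "<s>"  # sentence-start marker
        for word in text.lower().split():
            model[prev][word] += 1
            prev = word
    return model

def predict(models, prev, prefix):
    """Try each model in order; return the most frequent word seen after
    `prev` that starts with `prefix`, or None if no model has a candidate."""
    for model in models:
        followers = model.get(prev, {})
        candidates = [(w, c) for w, c in followers.items() if w.startswith(prefix)]
        if candidates:
            # Highest count wins; ties are broken alphabetically.
            return min(candidates, key=lambda wc: (-wc[1], wc[0]))[0]
    return None

def keystrokes_saved(models, text):
    """Fraction of keystrokes saved on `text` under the assumed cost model:
    one keystroke per typed character, one keystroke to accept a prediction."""
    total = used = 0
    prev = "<s>"
    for word in text.lower().split():
        total += len(word)
        for typed in range(len(word)):
            if predict(models, prev, word[:typed]) == word:
                used += typed + 1  # characters typed so far plus the accept key
                break
        else:
            used += len(word)  # never predicted: the whole word was typed
        prev = word
    return 0.0 if total == 0 else 1 - used / total

# Toy data (invented): one tweet by the user, one by a frequent contact.
idiolect = train(["see you tomorrow"])
sociolect = train(["see you tomorrow", "see you tonight at the game"])

own_only = keystrokes_saved([idiolect], "see you tonight")
with_friends = keystrokes_saved([idiolect, sociolect], "see you tonight")
print(f"idiolect only: {own_only:.2f}, with sociolect back-off: {with_friends:.2f}")
```

In this toy setup the word "tonight" never occurs in the user's own tweet, so only the model that also includes the contact's tweet can complete it, which is the effect the abstract attributes to sociolects.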