Using idiolects and sociolects to improve word prediction
Authors: | Stoop, W.M.C.A.; Bosch, A.P.J. van den |
---|---|
Year of publication: | 2014 |
Subject: | |
Source: | Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pp. 318-327. [S.l.]: Association for Computational Linguistics |
Description: | Contains fulltext: 127069.pdf (Publisher’s version) (Open Access). This paper describes the word prediction system Soothsayer, which predicts what a user is going to write while they are typing it. The main innovation of Soothsayer is that its source of knowledge is not only the idiolect, the language of one individual person, but also the sociolect, the language of the social circle around that person. Twitter is used for data collection and experimentation. The idiolect models are based on individual Twitter feeds; the sociolect models are based on the tweets of a particular person together with the tweets of the people that person often communicates with. The underlying idea is that people who communicate often start to talk alike, so the language of the friends of person x can help predict what person x is going to say. This sociolect-based approach achieved the best results. For a number of users, more than 50% of the keystrokes could have been saved if they had used Soothsayer. Presented at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2014), 26 April 2014. |
Database: | OpenAIRE |
External link: |