RNN based machine translation and transliteration for Twitter data.

Authors: Vathsala, M. K., Holi, Ganga
Subject:
Source: International Journal of Speech Technology; Sep2020, Vol. 23 Issue 3, p499-504, 6p
Abstract: This work analyzes social media data for code-switching and transliterates it into English using a Long Short-Term Memory (LSTM) network, a special kind of recurrent neural network (RNN). TensorFlow is used to implement the LSTM. Twitter data is stored in MongoDB for easy handling and processing. The data is parsed into its fields with a Python script and cleaned using regular expressions. The LSTM model is trained on 1 M data and then used for transliteration and translation of the Twitter data. Translating and transliterating social media data makes the content available in a language understood by the majority of the population. As a result, content that is anti-social or a threat to law and order can be easily verified and blocked at the source. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
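
The abstract mentions parsing tweet fields with a Python script and cleaning the text with regular expressions, but it does not give the expressions used. The following is only a minimal sketch of what such a cleaning step might look like. The patterns, the function names `clean_tweet` and `extract_text`, and the assumption that the tweet body sits in a `text` field are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical patterns: the abstract does not list the expressions actually used.
RETWEET_RE = re.compile(r"^RT\s+@\w+:?\s*")    # leading "RT @user:" marker
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # links
MENTION_RE = re.compile(r"@\w+")               # @user mentions
WHITESPACE_RE = re.compile(r"\s+")             # runs of whitespace


def clean_tweet(text: str) -> str:
    """Strip retweet markers, URLs, and mentions; keep hashtag words."""
    text = RETWEET_RE.sub("", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the symbol
    return WHITESPACE_RE.sub(" ", text).strip()


def extract_text(tweet: dict) -> str:
    """Pull the body out of a tweet document (assumed 'text' field) and clean it."""
    return clean_tweet(tweet.get("text", ""))
```

Under these assumptions, a document fetched from the tweet store could be passed to `extract_text` before being handed to the model. Dropping only the `#` symbol rather than the whole hashtag is a design choice made here so that code-switched words used as hashtags are preserved.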