Using semantic analysis of texts for the identification of drugs with similar therapeutic effects

Authors: A. E. Tropsha, Timur I. Madzhidov, Z. Sh. Miftahutdinov, Sergey I. Nikolenko, Elena Tutubalina, R. I. Nugmanov, Ilseyar Alimova
Publication year: 2017
Source: Russian Chemical Bulletin. 66:2180-2189
ISSN: 1573-9171, 1066-5285
Description: Semantic analysis of text collections was used to identify drugs with similar therapeutic activity. Natural language processing methods were applied to analyse more than 2.5 million texts from English-language drug reviews found on patient forums and discussion boards. To obtain distributed word representations from the input data, a continuous bag-of-words (CBOW) model was used; this is one of the word2vec models designed to capture natural language semantics. This allowed a numeric vector to be assigned to each drug name. A list of pairs of drugs with similar vectors was then compiled. Analysis of this list confirmed that similar word vectors correspond either to drugs with the same active compound or to drugs with close therapeutic effects that belong to the same therapeutic group. The chemical similarity within such drug pairs was found to be low. The proposed procedure was used to visualize the chemical drug space and to search for compounds with potentially similar biological effects among drugs of different therapeutic groups.
Database: OpenAIRE
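
The pair-forming step mentioned in the abstract (listing drugs whose word vectors are similar) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the record does not state the similarity measure or the cut-off, so cosine similarity and a threshold of 0.95 are swapped in, and the drug names and three-dimensional vectors are hand-made toy values, not output of the authors' model. The helper names `cosine` and `similar_pairs` are hypothetical.

```python
import math
from itertools import combinations

# Toy vectors chosen by hand for illustration only. In the described study,
# vectors of this kind would come from a CBOW word2vec model trained on reviews.
drug_vectors = {
    "ibuprofen": [0.90, 0.10, 0.00],
    "advil": [0.88, 0.12, 0.02],
    "naproxen": [0.80, 0.30, 0.10],
    "sertraline": [0.00, 0.20, 0.95],
}


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similar_pairs(vectors, threshold):
    """Return (name_a, name_b, similarity) for every pair at or above the
    threshold, most similar first. The threshold is an assumed parameter."""
    pairs = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


pairs = similar_pairs(drug_vectors, threshold=0.95)
for a, b, score in pairs:
    print(f"{a} ~ {b}: {score:.3f}")
```

With these toy values, the brand name and its active compound end up closest, followed by pairs from the same therapeutic group, while the unrelated drug forms no pair. This mirrors the kind of result the abstract reports but does not reproduce it.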