Using semantic analysis of texts for the identification of drugs with similar therapeutic effects
Authors: | A. E. Tropsha, Timur I. Madzhidov, Z. Sh. Miftahutdinov, Sergey I. Nikolenko, Elena Tutubalina, R. I. Nugmanov, Ilseyar Alimova |
---|---|
Year of publication: | 2017 |
Subject: | 0301 basic medicine; 030226 pharmacology & pharmacy; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Drug; Chemistry; General Chemistry; Semantic analysis (machine learning); Cosine similarity; Chemical similarity; Chemical space; Cheminformatics; Word2vec; Identification (biology); Artificial intelligence; Natural language processing |
Source: | Russian Chemical Bulletin. 66:2180-2189 |
ISSN: | 1573-9171 1066-5285 |
Description: | Semantic analysis of text collections was used to identify drugs with similar therapeutic activity. Natural language processing methods were applied to analyse more than 2.5 million texts from English-language drug reviews found on patient forums and discussion boards. To obtain distributed word representations from the input data, a continuous bag-of-words (CBOW) model was used; this is one of the word2vec models designed to capture natural language semantics. The model assigned a numeric vector to each drug name, and a list of drug pairs with similar vectors was compiled. Analysis of this list confirmed that similar word vectors correspond either to drugs with the same active compound or to drugs with close therapeutic effects that belong to the same therapeutic group. The chemical similarity within such drug pairs was found to be low. The suggested procedure was used to visualize the chemical drug space and to search for compounds with potentially similar biological effects among drugs of different therapeutic groups. |
Database: | OpenAIRE |
External link: |
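
The pair-listing step mentioned in the description (comparing drug-name vectors and keeping the similar ones) can be illustrated with a minimal sketch. It assumes the word vectors have already been trained; the names `cosine_similarity`, `rank_similar_pairs`, the `threshold` value, the placeholder drug names, and the three-dimensional toy vectors are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_similar_pairs(vectors, threshold=0.9):
    """Return (name_a, name_b, score) for every pair whose cosine
    similarity is at least `threshold`, most similar pair first.

    `threshold` is an illustrative cut-off, not a value from the paper.
    """
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(vectors.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Toy vectors standing in for trained word2vec embeddings (made-up values).
toy_vectors = {
    "drug_a": [0.9, 0.1, 0.0],
    "drug_b": [0.8, 0.2, 0.1],
    "drug_c": [0.0, 0.1, 0.9],
}

similar_pairs = rank_similar_pairs(toy_vectors)
for name_a, name_b, score in similar_pairs:
    print(f"{name_a} ~ {name_b}: {score:.3f}")
```

With real embeddings the vectors would have many more dimensions, and the cut-off would need to be chosen empirically.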