Using semantic analysis of texts for the identification of drugs with similar therapeutic effects

Authors:	A. E. Tropsha, Timur I. Madzhidov, Z. Sh. Miftahutdinov, Sergey I. Nikolenko, Elena Tutubalina, R. I. Nugmanov, Ilseyar Alimova
Year of publication:	2017
Subject:	0301 basic medicine; Drug; business.industry; Chemistry; media_common.quotation_subject; Semantic analysis (machine learning); Cosine similarity; General Chemistry; Chemical similarity; computer.software_genre; 030226 pharmacology & pharmacy; Chemical space; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Cheminformatics; Word2vec; Identification (biology); Artificial intelligence; business; computer; Natural language processing; media_common
Source:	Russian Chemical Bulletin. 66:2180-2189
ISSN:	1573-9171 1066-5285
Description:	Semantic analysis of text collections was used to identify drugs with similar therapeutic activity. Natural language processing methods were applied to analyse more than 2.5 million English-language drug reviews collected from patient forums and discussion boards. To obtain distributed word representations from the input data, a continuous bag-of-words (CBOW) model was used; this is one of the word2vec models designed to capture natural language semantics. This allowed a numeric vector to be assigned to each drug name. A list of drug pairs with similar vectors was then compiled. Analysis of this list confirmed that similar word vectors correspond either to drugs with the same active compound or to drugs with close therapeutic effects that belong to the same therapeutic group. The chemical similarity within such drug pairs was found to be low. The suggested procedure was used to visualize the chemical drug space and to search for compounds with potentially similar biological effects among drugs of different therapeutic groups.
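The pairing step described in the abstract (comparing the numeric vectors assigned to drug names and listing the pairs whose vectors are similar) can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison measure, which the record lists among its subject terms. The drug names, the three-dimensional vectors, and the 0.9 threshold below are hypothetical placeholders and not data from the paper; in the study the vectors would come from a trained CBOW word2vec model. This is not the authors' implementation.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def similar_pairs(vectors, threshold):
    """List (name_a, name_b, score) for pairs scoring at or above threshold,
    sorted from most to least similar."""
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(vectors.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs


# Hypothetical toy vectors standing in for word2vec embeddings of drug names.
toy_vectors = {
    "drug_a": [0.9, 0.1, 0.0],
    "drug_b": [0.8, 0.2, 0.1],
    "drug_c": [0.0, 0.1, 0.9],
}

# The threshold is an arbitrary illustrative value, not one reported in the paper.
pairs = similar_pairs(toy_vectors, threshold=0.9)
for name_a, name_b, score in pairs:
    print(f"{name_a} ~ {name_b}: {score:.3f}")
```

With these toy values only drug_a and drug_b pass the threshold, since their vectors point in nearly the same direction while drug_c is almost orthogonal to both.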
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::bcf43a4b231b04bf4d57f45e2ea43f59 https://doi.org/10.1007/s11172-017-2000-8