SureChEMBL: a large-scale, chemically annotated patent document database
Authors: | John P. Overington, Anne Hersey, Sean A. Irvine, Nicholas T. Goncharoff, James Siddle, Nathan Dedman, Joe Pettersson, Richard Koks, Jon Chambers, George Papadatos, Anna Gaulton, Mark Davies |
---|---|
Publication year: | 2015 |
Subject: |
0301 basic medicine; Structure (mathematical logic); Database; Interface (Java); Scale (chemistry); Biology; computer.software_genre; Pipeline (software); Patents as Topic; Set (abstract data type); 03 medical and health sciences; 030104 developmental biology; Resource (project management); Pharmaceutical Preparations; Chemical science; Genetics; Data Mining; Database Issue; computer; Patent document; Databases, Chemical |
Source: | Nucleic Acids Research |
ISSN: | 1362-4962 0305-1048 |
Description: | SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature daily by an automated text- and image-mining pipeline. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus. Given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at https://www.surechembl.org/. |
Database: | OpenAIRE |
External link: |