CODA-ML: context-specific biological knowledge representation for systemic physiology analysis

Authors: Saehwan Lee, Doheon Lee, Chungsun Jeong, Mijin Kwon, Gwangmin Kim, Soorin Yim
Year of publication: 2019
Subject:
Markup language
Knowledge representation and reasoning
Physiology
Computer science
computer.internet_protocol
Association (object-oriented programming)
media_common.quotation_subject
Cell
lcsh:Computer applications to medicine. Medical informatics
computer.software_genre
Models, Biological
Biochemistry
Molecular specification
03 medical and health sciences
0302 clinical medicine
Essential biological information
Structural Biology
medicine
Humans
Function (engineering)
lcsh:QH301-705.5
Molecular Biology
Language
030304 developmental biology
media_common
0303 health sciences
Standard language
business.industry
Research
Applied Mathematics
Representation (systemics)
Biological context
Biological knowledge
Computer Science Applications
Metabolic pathway
Knowledge
medicine.anatomical_structure
lcsh:Biology (General)
030220 oncology & carcinogenesis
Context specific
lcsh:R858-859.7
Artificial intelligence
DNA microarray
business
computer
Software
XML
Natural language processing
Source: BMC Bioinformatics, Vol 20, Iss S10, Pp 45-53 (2019)
BMC Bioinformatics
ISSN: 1471-2105
DOI: 10.1186/s12859-019-2812-7
Description: Background: Computational analysis of complex diseases involving multiple organs requires integrating multiple different models into a unified model. These models are often constructed in heterogeneous formats, so integrating them requires a standard language format that can effectively represent essential biological information. However, previously introduced formats have limitations that prevent them from adequately representing this information, particularly the specifications of bio-molecules and biological contexts. Results: We defined an XML-based markup language called context-oriented directed association markup language (CODA-ML), which better represents essential biological information. CODA-ML has two major strengths: designating molecular specifications and designating biological contexts. It can cover the heterogeneous entity types involved in biological events (e.g., gene/protein, compound, cellular function, disease). Molecular entity types can carry molecular specifications that include detailed information about a molecule, from isoforms to modifications, enabling high-resolution representation of molecules. In addition, CODA-ML can distinguish biological events that vary across biological contexts such as cell types or disease conditions. In particular, it can represent inter-cellular events as well as intra-cellular events. These two strengths can resolve contradictory associations when different models are integrated into one unified model, which improves the accuracy of the model. Conclusions: With CODA-ML, diverse models such as signaling pathways, metabolic pathways, and gene regulatory pathways can be represented in a unified language format. Because CODA-ML covers heterogeneous entity types, it enables detailed description of the mechanisms of diseases or drugs from multiple perspectives (e.g., molecule, function, or disease). CODA-ML is expected to help integrate different models into one systemic model in an efficient and effective manner. The unified model can be used to perform computational analysis not only for cancer but also for other complex diseases that involve multiple organs rather than a single cell. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2812-7) contains supplementary material, which is available to authorized users.
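Illustrative note: the abstract says that attaching a biological context to each directed association lets seemingly contradictory associations coexist in one unified model. The sketch below shows that idea only in outline, using Python's standard xml.etree.ElementTree module. Every element name, attribute name, and entity name in it (association, source, target, context, cellType, modification, GENE_A) is a hypothetical placeholder chosen for illustration; none is taken from the actual CODA-ML schema, which this record does not reproduce.

```python
import xml.etree.ElementTree as ET


def make_association(source, target, effect, cell_type, modification=None):
    """Build one directed association tagged with a biological context.

    NOTE: all element and attribute names are hypothetical placeholders,
    not the real CODA-ML vocabulary.
    """
    assoc = ET.Element("association", {"effect": effect})
    src = ET.SubElement(assoc, "source", {"type": source[0]})
    src.text = source[1]
    if modification is not None:
        # Hypothetical stand-in for a "molecular specification".
        src.set("modification", modification)
    tgt = ET.SubElement(assoc, "target", {"type": target[0]})
    tgt.text = target[1]
    ET.SubElement(assoc, "context", {"cellType": cell_type})
    return assoc


def effects_by_context(root, source_name, target_name):
    """Map each context (cell type) to the effect recorded for one entity pair."""
    result = {}
    for assoc in root.iter("association"):
        same_pair = (
            assoc.findtext("source") == source_name
            and assoc.findtext("target") == target_name
        )
        if same_pair:
            result[assoc.find("context").get("cellType")] = assoc.get("effect")
    return result


# Two associations that would contradict each other without a context.
model = ET.Element("model")
model.append(
    make_association(("gene", "GENE_A"), ("function", "proliferation"),
                     "increase", "cell type X", modification="phosphorylation")
)
model.append(
    make_association(("gene", "GENE_A"), ("function", "proliferation"),
                     "decrease", "cell type Y")
)

lookup = effects_by_context(model, "GENE_A", "proliferation")
print(ET.tostring(model, encoding="unicode"))
print(lookup)
```

Keying the lookup by context is what keeps the two opposite effects from colliding; a model that stored only the entity pair would have to discard one of them.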
Database: OpenAIRE