Closha 2.0: a bio-workflow design system for massive genome data analysis on high performance cluster infrastructure.

Authors: Ko G; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Kim PG; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Yoon BH; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Kim J; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Song W; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Byeon I; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Yoon J; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea., Lee B; Korean Bioinformation Center (KOBIC), KRIBB, 125 Gwahangno, Yuseong-gu, Daejeon, 34141, Korea. bulee@kribb.re.kr., Kim YK; Department of Bio-AI Convergence, Chungnam National University, Daejeon, 34134, Korea. ykim@cnu.ac.kr.
Language: English
Source: BMC bioinformatics [BMC Bioinformatics] 2024 Nov 12; Vol. 25 (1), pp. 353. Date of Electronic Publication: 2024 Nov 12.
DOI: 10.1186/s12859-024-05963-8
Abstract: Background: The explosive growth of next-generation sequencing (NGS) data has resulted in ultra-large-scale datasets and significant computational challenges. As the cost of NGS has decreased, the amount of genomic data has surged globally. However, the cost and complexity of the computational resources required continue to be substantial barriers to leveraging big data. A promising solution to these computational challenges is cloud computing, which provides researchers with the necessary CPUs, memory, storage, and software tools.
Results: Here, we present Closha 2.0, a cloud computing service that offers a user-friendly platform for analyzing massive genomic datasets. Closha 2.0 is designed to provide a cloud-based environment that enables all genomic researchers, including those with limited or no programming experience, to easily analyze their genomic data. The new 2.0 version of Closha has more user-friendly features than the previous 1.0 version. First, the workbench features a script editor that supports Python, R, and shell script programming, enabling users to write scripts and integrate them into their pipelines. This functionality is particularly useful for downstream analysis. Second, Closha 2.0 runs on containers, which execute each tool in an independent environment. This provides a stable environment and prevents dependency issues and version conflicts among tools. Additionally, users can execute each step of a pipeline individually, allowing them to test applications at each stage and adjust parameters to achieve the desired results. We also updated GBox, a high-speed data transmission tool that facilitates the rapid transfer of large datasets.
Conclusions: The analysis pipelines on Closha 2.0 are reproducible, with all analysis parameters and inputs being permanently recorded. Closha 2.0 simplifies multi-step analysis with drag-and-drop functionality and provides a user-friendly interface for genomic scientists to obtain accurate results from NGS data. Closha 2.0 is freely available at https://www.kobic.re.kr/closha2 .
Competing Interests: Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.
(© 2024. The Author(s).)
Database: MEDLINE