Popis: |
A broad range of scientific goals and a similarly diverse set of consumers drive the informatics requirements and computing needs of the JGI. The scope of work in this area encompasses not only the informatics and analysis pipelines in support of PGF sequence production, but also the integration of data from a variety of sources and sophisticated large-scale analyses led by investigators within JGI and driven by the user science community. In laying out a forward-looking strategy, the full range of these activities needs to be examined together to build a comprehensive program that will serve as a catalyst for the DOE research community. The science landscape envisioned in the overall strategic plan calls for significantly increasing the throughput of microbial genomes sequenced to cover their phylogenetic space and building a set of finished reference plant genomes to enable DOE-relevant science. Additionally, the established impact of microbial communities on global energy cycles and their potential in remediation endeavors warrant building upon JGI's established expertise in metagenomic analysis. Not only is each of these program areas relevant and exciting in its own right, but they also can and should be undertaken in a way that allows synthesis across domains (e.g., using knowledge from the sequences of plants and of the soil in which they are grown). Both dramatic increases in the scale of genomic data collection and the synergistic potential of integrating data across domains will demand new strategies in the informatics pipeline within the JGI and in the facility's approach to computational analysis and user access to the data in aggregated form. In addition to a robust and scalable informatics infrastructure, fulfilling the strategic science goals of the JGI will require ongoing investment in usability of the data, to ensure that the data collected will be used to maximal effect.
It must be recognized that 'usability' will have a different appearance depending on the specific user base, and the JGI has several distinct classes of users that it must enable in order to be successful. For some, rapid and convenient dissemination of the sequence data will be sufficient to enable their external research. For others, JGI-hosted analysis tools and collaborative environments will be required to catalyze individual or team research. Finally, and significantly, there are genomic scientists within the JGI, often working closely with external collaborators, who rely on the ability to devise project-dependent and often very large-scale customized analyses that result in publicly available tools. A successful strategy will require effort to satisfy each of these user classes, and careful attention to economies of software reuse and extensibility. There are only a handful of sequencing facilities worldwide that operate at the scale of the JGI's Production Genomics Facility, and these are devoted almost entirely to sequencing driven by biomedical applications. The PGF therefore fulfills a unique and vital role as a resource for genomic studies of DOE relevance. Like the other large-scale facilities, JGI has been carefully following the development of 'next-generation' sequencing technologies, and clearly must continue to refresh its instrumentation as advances are made. Critical to advances in sequencing technology are the computational infrastructure advances that are required to turn raw sequence into quality data. This is one area where JGI can leverage the broader sequencing community's investment in technology development, adopting the best practices and software for sequence processing and assembly. JGI can add unique value by further developing annotation pipelines and tools that serve to build an integrated framework where the Institute's complementary science components can be viewed in a larger 'systems' perspective than is currently possible.
As technology, tools, and infrastructure advance, JGI is uniquely positioned in its ability to complement core PGF expertise with a diverse set of capabilities provided by partners within the broader community that forms the Institute. The coming decade promises radical change in the field, and the ability to quickly recognize developing areas and nimbly build appropriate partner teams will be vital to maximally capitalize on the growing base of data collection capabilities. This dynamic team science environment will require an underlying computational environment that can accommodate innovative ideas and processes. This will be key to the success of the scientific goals of the JGI.