Opportunities and challenges of text mining in materials research.

Authors: Kononova O; Department of Materials Science & Engineering, University of California, Berkeley, CA 94720, USA.; Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA., He T; Department of Materials Science & Engineering, University of California, Berkeley, CA 94720, USA.; Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA., Huo H; Department of Materials Science & Engineering, University of California, Berkeley, CA 94720, USA.; Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA., Trewartha A; Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA., Olivetti EA; Department of Materials Science & Engineering, MIT, Cambridge, MA 02139, USA., Ceder G; Department of Materials Science & Engineering, University of California, Berkeley, CA 94720, USA.; Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
Language: English
Source: IScience [iScience] 2021 Feb 06; Vol. 24 (3), pp. 102155. Date of Electronic Publication: 2021 Feb 06 (Print Publication: 2021).
DOI: 10.1016/j.isci.2021.102155
Abstract: Research publications are the major repository of scientific knowledge. However, their unstructured and highly heterogeneous format creates a significant obstacle to large-scale analysis of the information contained within. Recent progress in natural language processing (NLP) has provided a variety of tools for high-quality information extraction from unstructured text. These tools are primarily trained on non-technical text and struggle to produce accurate results when applied to scientific text, which involves specific technical terminology. In recent years, significant efforts in information retrieval have been made for biomedical and biochemical publications. For materials science, text mining (TM) methodology is still at the dawn of its development. In this review, we survey the recent progress in creating and applying TM and NLP approaches to the materials science field. This review is directed at the broad class of researchers aiming to learn the fundamentals of TM as applied to materials science publications.
Competing Interests: The authors declare no competing interests.
(© 2021 The Author(s).)
Database: MEDLINE