Adapting natural language processing for technical text.

Authors: Dima, Alden; Lukens, Sarah; Hodkiewicz, Melinda; Sexton, Thurston; Brundage, Michael P.
Subject:
Source: Applied AI Letters; Sep 2021, Vol. 2, Issue 3, pp. 1-11 (11 pages)
Abstract: Despite recent dramatic successes, natural language processing (NLP) is not ready to address a variety of real-world problems. Its reliance on large standard corpora, a training and evaluation paradigm that favors the learning of shallow heuristics, and large computational resource requirements make domain-specific application of even the most successful NLP techniques difficult. This paper proposes technical language processing (TLP), which brings engineering principles and practices to NLP specifically for the purpose of extracting actionable information from language generated by experts in their technical tasks, systems, and processes. TLP envisages NLP as a socio-technical system rather than as an algorithmic pipeline. We describe how the TLP approach to meaning and generalization differs from that of NLP, how data quantity and quality can be addressed in engineering technical domains, and the potential risks of not adapting NLP for technical use cases. Engineering problems can benefit immensely from the inclusion of knowledge from unstructured data, which is currently unavailable due to issues with out-of-the-box NLP packages. We illustrate the TLP approach by focusing on maintenance in industrial organizations as a case study. [ABSTRACT FROM AUTHOR]
Database: Complementary Index