Manufacturing domain instruction comprehension using synthetic data.

Authors: Johari, Kritika; Tong, Christopher Tay Zi; Bhardwaj, Rishabh; Subbaraju, Vigneshwaran; Kim, Jung-Jae; Tan, U.-Xuan
Subject:
Source: Visual Computer; Nov 2024, Vol. 40 Issue 11, p8189-8203, 15p
Abstract: A referring expression comprehension (REC) system localizes objects in a given image based on a natural language expression. We propose a novel approach to adapting a pre-trained REC model to the manufacturing domain. Despite significant advances in REC research, models trained on current REC datasets fail to recognize objects from specific yet important domains such as manufacturing, because domain-specific samples are absent during training. We therefore introduce a synthetic-data-based domain adaptation approach for REC. To adapt a REC model to the manufacturing domain, we generated a synthetic REC dataset, RefMD, that consists of two sub-datasets: (1) a dataset for manufacturing object classification, and (2) a dataset for REC adaptation to manufacturing. Each sub-dataset serves one step of the REC adaptation. The object classification network (visual backbone) is adapted by training ResNet50 on domain-specific labeled data, and the REC adaptation is then completed by adapting the modules of RealGIN together. The manufacturing domain-adapted model is further enhanced with the capability to handle ambiguous referring expressions through human-in-the-loop (HITL) interaction. Experiments on 3D-printed manufacturing objects demonstrate that the interactive REC model comprehends human instructions with 82% accuracy. This paper introduces an approach to domain adaptation using solely synthetic data for the specific case of the manufacturing domain; however, the proposed adaptation methodology can be applied to any other domain by following the same synthetic data generation process. [ABSTRACT FROM AUTHOR]
Database: Complementary Index