Abstract: |
As edge devices become more capable of processing images and video streams, novel deep vision applications are rapidly emerging. Lightweight neural networks have proven effective in supporting such applications, and existing solutions often adjust the model along various dimensions to meet specific application requirements. However, because applications differ in their accuracy, latency, and memory requirements, no single lightweight technique can satisfy all of these metrics. In addition, video stream processing on mobile devices often falls short of real-time performance. This paper therefore proposes LRNAS, a lightweight, real-time framework based on neural architecture search. We develop a lightweight network search algorithm that uses evolutionary strategies and multi-objective optimization to tailor network designs to diverse vision tasks. To further reduce inference latency, we design a video frame filtering strategy that uses motion vectors and inertial sensors to filter out redundant video frames. Experiments on two public datasets and one custom dataset demonstrate that LRNAS effectively improves the performance of mobile deep vision applications. |