Description: |
In this paper, we propose Dynamic Residual Convolution (DRConv), an efficient method for computing input-specific local features that addresses the limitations of dynamic convolution. DRConv uses global salient features, computed with efficient token attention, to strengthen representation power and to select appropriate kernels. To ease optimization, we split the convolution kernel into an input-agnostic kernel and an input-dependent kernel and initialize the latter to zero. Experimental results show that DRConv reduces optimization difficulty while achieving superior performance. To validate the DRConv module, we also introduce Dynamic Mobile-Former (DMF), inspired by a parallel design. DMF achieves higher accuracy than the state-of-the-art MobileFormer-508M with fewer computations. Moreover, DMF outperforms ResNet101 on COCO detection while using nearly half the computations. Our approach offers a favorable trade-off between accuracy and FLOPs, making it suitable for a range of computer vision tasks. Code is available at https://github.com/ysj9909/DMF. |
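The kernel split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it uses a 1-D signal in pure Python, replaces the token-attention global features with a plain mean, and every name in it (`conv1d`, `ResidualDynamicConv1d`, `gen_weights`) is hypothetical. It only shows why zero-initializing the input-dependent kernel makes the layer start out as an ordinary static convolution.

```python
# Toy 1-D sketch of a residual in kernel space.
# Illustrative only; all names are hypothetical, not taken from the DMF code.

def conv1d(signal, kernel):
    """Valid cross-correlation of a 1-D signal with a 1-D kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


class ResidualDynamicConv1d:
    def __init__(self, static_kernel):
        # Input-agnostic kernel: shared by every input.
        self.static_kernel = list(static_kernel)
        # Kernel-generator weights, zero-initialized so the input-dependent
        # kernel is exactly zero before any training update.
        self.gen_weights = [0.0] * len(static_kernel)

    def dynamic_kernel(self, signal):
        # Assumption: the global feature is a plain mean. The paper instead
        # computes global salient features with efficient token attention.
        global_feature = sum(signal) / len(signal)
        return [w * global_feature for w in self.gen_weights]

    def __call__(self, signal):
        delta = self.dynamic_kernel(signal)
        # Final kernel = input-agnostic kernel + input-dependent residual.
        kernel = [s + d for s, d in zip(self.static_kernel, delta)]
        return conv1d(signal, kernel)


layer = ResidualDynamicConv1d([1.0, 0.0, -1.0])
x = [1.0, 2.0, 4.0, 7.0]
out_init = layer(x)  # matches a static convolution at initialization

# Stand-in for a training update: the kernel now depends on the input.
layer.gen_weights = [0.0, 0.5, 0.0]
out_trained = layer(x)
```

With zero generator weights the layer reproduces the static convolution exactly. Once the weights move away from zero, the effective kernel shifts with each input's global feature, which is the input-specific behavior the abstract refers to.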