Author: |
Romera, Thomas, Petreto, Andrea, Lemaitre, Florian, Bouyer, Manuel, Meunier, Quentin, Lacassagne, Lionel |
Contributors: |
Architecture et Logiciels pour Systèmes Embarqués sur Puce (ALSOC), LIP6, Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), Lacassagne, Lionel |
Language: |
English |
Publication year: |
2021 |
Subject: |
|
Source: |
European Signal Processing Conference (EUSIPCO), Aug 2021, Dublin, Ireland |
Description: |
The emergence of low-power embedded Graphics Processing Units (GPUs) with high computation capabilities has enabled the integration of image processing chains in a wide variety of embedded systems. Various optimisation techniques are, however, needed to get the most out of an embedded GPU. This paper explores several optimisation methods for iterative stencil-like image processing algorithms on embedded NVIDIA GPUs using the Compute Unified Device Architecture (CUDA) API. We focus our architectural optimisations on the TV-L1 algorithm, an optical flow estimation method based on total variation (TV) regularisation and the L1 norm. It is widely used as a model for more complex optical flow estimations and appears in many recent video processing applications. In this work, we evaluate the impact of architecture-oriented optimisations on both execution time and energy consumption on several NVIDIA Jetson embedded GPU boards. Results show a speedup of up to 3× compared to state-of-the-art versions, as well as a 2.6× decrease in energy consumption. |
Database: |
OpenAIRE |
External link: |
|