Efficient motion estimation and discrete cosine transform implementation using the graphics processing units.

Authors: Agha S; Department of Electrical and Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan., Jan F; Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia., Khan HA; Department of Electrical and Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan., Kaleem M; Department of Electrical and Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan., Khan M; Department of Electrical and Computer Engineering, COMSATS University Islamabad, Islamabad, Pakistan.
Language: English
Source: PloS one [PLoS One] 2024 Aug 28; Vol. 19 (8), pp. e0307217. Date of Electronic Publication: 2024 Aug 28 (Print Publication: 2024).
DOI: 10.1371/journal.pone.0307217
Abstract: Motion estimation (ME) and the two-dimensional discrete cosine transform (2D-DCT) are both computationally expensive parts of the HEVC standard, so real-time HEVC encoding may not be free from glitches. To address this issue, this study uses graphics processing units (GPUs) to perform the ME and 2D-DCT tasks. To this end, the authors examine four levels of parallelism in ME: the frame, macroblock, search area, and sum of absolute differences (SAD) levels. For comparative analysis, the authors consider three ME algorithms: full search (FS), the test zone search (TZS) of HEVC, and hierarchical diamond search (EHDS). Similarly, two levels of parallelism (macroblock and sub-macroblock) are explored in the 2D-DCT. Notably, the multithreaded Loeffler DCT algorithm, the least computationally complex option, is used to compute the 2D-DCT. Experimental results show that the ME task for 25 frames, each of size 3840×2160 pixels, completes in 0.15 seconds on an NVIDIA GeForce GTX 1080, whereas the 2D-DCT task, together with image reconstruction and differencing, takes 0.1 seconds for the same 25 frames. Together, the ME and 2D-DCT tasks are processed in 0.25 seconds, which leaves enough room for the remaining parts of the encoder to execute within one second. With this improvement, the resulting encoder can safely be used in real-time applications.
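Illustrative note: the full search (FS) and SAD terms in the abstract can be made concrete with a minimal, CPU-only Python sketch. It is not the authors' GPU implementation; the function names, block size, search range, and synthetic frames are assumptions chosen only for illustration. The two loops inside `sad` correspond loosely to the SAD level, and the two displacement loops inside `full_search` to the search area level; how the paper maps these onto GPU threads is not stated in the abstract.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):          # every pixel difference is independent
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustively test every in-bounds displacement within +/- search_range
    and return (dx, dy, sad) for the candidate with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):      # candidates are
        for dx in range(-search_range, search_range + 1):  # independent too
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Hypothetical 8x8 frames: `cur` is `ref` shifted right by one pixel.
ref = [[(3 * x + 7 * y) % 251 for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

# Motion vector for the 4x4 block at (2, 2) with a +/- 2 pixel search range.
mv = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
print(mv)
```

For these synthetic frames the search returns a displacement of one pixel to the left with a SAD of zero, since the block in the current frame is an exact copy of the reference block one pixel to its left.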
Competing Interests: The authors have declared that no competing interests exist.
(Copyright: © 2024 Agha et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Database: MEDLINE