Algorithm optimization and hardware implementation for Merge mode in HEVC
Authors: | Zhifeng Chen, Xiuzhi Yang, Xiaohong Gao, Mingkui Zheng, Long-zhao Shi |
---|---|
Year of publication: | 2018 |
Subject: | Computer science; Cycles per instruction; Clock rate; Software engineering; Engineering and technology; Motion vector; Algorithmic efficiency; Header; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Entropy encoding; Bitwise operation; Computer hardware; Information Systems; Register-transfer level |
Source: | Journal of Real-Time Image Processing. 17:623-630 |
ISSN: | 1861-8219 1861-8200 |
DOI: | 10.1007/s11554-018-0818-4 |
Description: | Merge mode is a new tool for improving inter-frame coding efficiency in High Efficiency Video Coding (HEVC). It saves the bitrate spent on motion vectors by letting a block share the motion vector of a neighboring block. The Merge process selects a candidate motion vector by calculating the rate-distortion cost of each candidate. However, this process requires a large number of complex computations and memory accesses, which makes hardware implementation inefficient. This paper proposes a new Merge candidate decision scheme that determines the most favorable Merge candidate from the full candidate list by comparing a cost formed from the sum of absolute transformed differences (SATD) and the weighted header bits, instead of performing the complex sum of squared differences (SSD) calculation and entropy coding process used in HM16.7. Simulation results show that the performance of the proposed algorithm is close to that of HM16.7, with a BD-rate increase of only 0.22–1.21%. The hardware design uses a multilevel pipeline architecture. The weighted header bit operation is performed with a look-up table, which reduces both the complexity and the number of encoding clock cycles. The design is implemented in register-transfer level (RTL) code. Synthesis results from Design Compiler show that, compared with other architectures, the proposed architecture offers clear advantages in resource utilization and can process 1920 × 1080 video at 353 frames/s for P-slices with a clock frequency of 1057 MHz and a logic gate count of 285.2 K. |
Database: | OpenAIRE |
External link: |
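
The candidate decision described in the abstract (SATD distortion plus a header-bit cost read from a look-up table) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 4 × 4 block size, the unnormalized Hadamard transform, the per-index bit counts in `MERGE_IDX_BITS`, the fixed-point weight `lambda_q`, and all function names are assumptions chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Block = List[List[int]]

# 4x4 Hadamard matrix (symmetric, so H * X * H equals H * X * H^T).
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]

# ASSUMPTION: header bits per Merge index for a 5-candidate list,
# modeled as a truncated-unary code. Illustrative values only.
MERGE_IDX_BITS = (1, 2, 3, 4, 4)


def matmul4(a: Block, b: Block) -> Block:
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(orig: Block, pred: Block) -> int:
    """Sum of absolute Hadamard-transformed differences (unnormalized)."""
    diff = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    coeffs = matmul4(matmul4(H4, diff), H4)
    return sum(abs(c) for row in coeffs for c in row)


def build_weighted_bits_lut(lambda_q: int, shift: int = 0) -> List[int]:
    """Precompute (lambda * bits) per Merge index so that the decision loop
    needs only a table read and an addition (hypothetical fixed-point form)."""
    return [(bits * lambda_q) >> shift for bits in MERGE_IDX_BITS]


def select_merge_candidate(orig: Block,
                           predictions: Sequence[Block],
                           lut: Sequence[int]) -> Tuple[int, int]:
    """Return (best index, best cost); ties go to the lowest index."""
    best_idx, best_cost = -1, None
    for idx, pred in enumerate(predictions):
        cost = satd4x4(orig, pred) + lut[idx]
        if best_cost is None or cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost


if __name__ == "__main__":
    orig = [[10] * 4 for _ in range(4)]
    preds = [[[9] * 4 for _ in range(4)],   # slightly off, cheap index
             [[10] * 4 for _ in range(4)]]  # exact match, costlier index
    print(select_merge_candidate(orig, preds, build_weighted_bits_lut(4)))
```

With a small weight the exact-match candidate wins despite its longer index code; with a larger weight the cheaper index can win instead, which is the trade-off the weighted header bit term is meant to capture.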