Diverse Image Inpainting with Bidirectional and Autoregressive Transformers
Authors: | Jianxiong Pan, Xuansong Xie, Fangneng Zhan, Kaiwen Cui, Chunyan Miao, Feiying Ma, Shijian Lu, Yingchen Yu, Rongliang Wu |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
FOS: Computer and information sciences
Underdetermined system; business.industry; Computer science; Deep learning; Computer Vision and Pattern Recognition (cs.CV); Inpainting; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Computer Science - Computer Vision and Pattern Recognition; Pattern recognition; Inverse problem; Convolutional neural network; Autoregressive model; Artificial intelligence; Language model; business; Transformer (machine learning model) |
Source: | ACM Multimedia |
Description: | Image inpainting is an underdetermined inverse problem, which naturally allows diverse contents to fill the missing or corrupted regions realistically. Prevalent approaches using convolutional neural networks (CNNs) can synthesize visually pleasing contents, but CNNs suffer from limited receptive fields for capturing global features. With image-level attention, transformers can model long-range dependencies and generate diverse contents through autoregressive modeling of pixel-sequence distributions. However, the unidirectional attention in autoregressive transformers is suboptimal, as corrupted image regions may have arbitrary shapes with contexts from any direction. We propose BAT-Fill, an image inpainting framework built on a novel bidirectional autoregressive transformer (BAT). BAT uses transformers to learn autoregressive distributions, which naturally allows diverse generation of missing contents. In addition, it incorporates a masked language model like BERT, which enables bidirectional modeling of the contextual information of missing regions for better image completion. Extensive experiments over multiple datasets show that BAT-Fill achieves superior diversity and fidelity in image inpainting, both qualitatively and quantitatively. 11 pages, 6 figures |
Database: | OpenAIRE |
External link: |
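
The abstract contrasts unidirectional (autoregressive) attention with bidirectional, BERT-style attention over the context of a missing region. The toy sketch below illustrates only that contrast and is not the BAT-Fill implementation; the function names `causal_mask`, `bidirectional_mask`, and `visible_context` are hypothetical, and the image is assumed to be flattened into a short raster-order token sequence.

```python
def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (only j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Every position may attend to every other position."""
    return [[True] * n for _ in range(n)]


def visible_context(mask, known, query):
    """Known (uncorrupted) positions that the `query` position may attend to."""
    return [j for j in known if mask[query][j]]


# Toy sequence of 6 tokens in raster order; tokens 2 and 3 are missing (assumed).
n = 6
missing = [2, 3]
known = [j for j in range(n) if j not in missing]

# Context available when predicting the first missing token (position 2).
causal_ctx = visible_context(causal_mask(n), known, query=2)
bidir_ctx = visible_context(bidirectional_mask(n), known, query=2)

print(causal_ctx)  # [0, 1]
print(bidir_ctx)   # [0, 1, 4, 5]
```

Under the causal mask, the missing token sees only the known tokens that precede it in raster order, whereas the bidirectional mask also exposes the known tokens that follow it, which is the limitation of unidirectional attention the abstract describes for arbitrarily shaped holes.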