Showing 1 - 6 of 6 for search: '"Sone, Kazoo"'
Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection
Despite significant progress in neural abstractive summarization, recent studies have shown that the current models are prone to generating summaries that are unfaithful to the original context. To address the issue, we study contrast candidate gener…
External link:
http://arxiv.org/abs/2104.09061
Author:
Zhu, Wanrong, Qi, Yuankai, Narayana, Pradyumna, Sone, Kazoo, Basu, Sugato, Wang, Xin Eric, Wu, Qi, Eckstein, Miguel, Wang, William Yang
Vision-and-language navigation (VLN) is a multimodal task where an agent follows natural language instructions and navigates in visual environments. Multiple setups have been proposed, and researchers apply new model architectures or training techniq…
External link:
http://arxiv.org/abs/2103.16561
Author:
Zhu, Wanrong, Wang, Xin Eric, Narayana, Pradyumna, Sone, Kazoo, Basu, Sugato, Wang, William Yang
A major challenge in visually grounded language generation is to build robust benchmark datasets and models that can generalize well in real-world settings. To do this, it is critical to ensure that our evaluation protocols are correct, and benchmark…
External link:
http://arxiv.org/abs/2010.03644
Author:
Zhu, Wanrong, Wang, Xin Eric, Fu, Tsu-Jui, Yan, An, Narayana, Pradyumna, Sone, Kazoo, Basu, Sugato, Wang, William Yang
One of the most challenging topics in Natural Language Processing (NLP) is visually-grounded language understanding and reasoning. Outdoor vision-and-language navigation (VLN) is such a task where an agent follows natural language instructions and na…
External link:
http://arxiv.org/abs/2007.00229
Multi-sentence summarization is a well studied problem in NLP, while generating image descriptions for a single image is a well studied problem in Computer Vision. However, for applications such as image cluster labeling or web page summarization, su…
External link:
http://arxiv.org/abs/2006.08686
There is a recent surge of interest in cross-modal representation learning corresponding to images and text. The main challenge lies in mapping images and text to a shared latent space where the embeddings corresponding to a similar semantic concept…
External link:
http://arxiv.org/abs/1911.05978