Search results - "Hyounghun Kim"

On the Limits of Evaluating Embodied Agent Model Generalization Using Validation Sets

Authors: Hyounghun Kim, Aishwarya Padmakumar, Di Jin, Mohit Bansal, Dilek Hakkani-Tur

Natural language guided embodied task completion is a challenging problem since it requires understanding natural language instructions, aligning them with egocentric visual observations, and choosing appropriate actions to execute in the environment

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::922e36ddd218344d28b9f29117f586f7
http://arxiv.org/abs/2205.09249

Modality-Balanced Models for Visual Dialogue

Authors: Hao Tan, Hyounghun Kim, Mohit Bansal

Published in: AAAI

The Visual Dialog task requires a model to exploit both image and conversational context information to generate the next response to the dialogue. However, via manual analysis, we find that a large number of conversational questions can be answered

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1af3667e639900064666dd7eddac4ef0
https://doi.org/10.1609/aaai.v34i05.6320

CAISE: Conversational Agent for Image Search and Editing

Authors: Hyounghun Kim, Doo Soon Kim, Seunghyun Yoon, Franck Dernoncourt, Trung Bui, Mohit Bansal

Demand for image editing has been increasing as users' desire for expression is also increasing. However, for most users, image editing tools are not easy to use since the tools require certain expertise in photo effects and have complex interfaces.

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a1189fd9e6a60e2c0dfdea2b00547fb1

NDH-Full: Learning and Evaluating Navigational Agents on Full-Length Dialogue

Authors: Hyounghun Kim, Jialu Li, Mohit Bansal

Published in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::1a64c8b2753adc24e5d392082f02ebdc
https://doi.org/10.18653/v1/2021.emnlp-main.518

Continuous Language Generative Flow

Authors: Shiyue Zhang, Zineng Tang, Hyounghun Kim, Mohit Bansal

Published in: ACL/IJCNLP (1)

Recent years have witnessed various types of generative models for natural language generation (NLG), especially RNNs or transformer based sequence-to-sequence models, as well as variational autoencoder (VAE) and generative adversarial network (GAN)

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::6d55b0afc44df5412bc97d7a9a25a7a7
https://doi.org/10.18653/v1/2021.acl-long.355

ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments

Authors: Abhay Zala, Hao Tan, Hyounghun Kim, Mohit Bansal, Graham Burri

Published in: EMNLP (Findings)

For embodied agents, navigation is an important ability but not an isolated goal. Agents are also expected to perform specific tasks after reaching the target location, such as picking up objects and assembling them into a particular arrangement. We

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8f55f38b33f21b87b381a9bf38c5c55d
https://doi.org/10.18653/v1/2020.findings-emnlp.348

Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA

Authors: Zineng Tang, Hyounghun Kim, Mohit Bansal

Published in: ACL

Videos convey rich information. Dynamic spatio-temporal relationships between people/objects, and diverse multimodal events are present in a video clip. Hence, it is important to develop automated models that can accurately extract such information f

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::11dd6f963941c22946571db2a28b7543

Development of augmented‐reality applications in otolaryngology–head and neck surgery

Authors: Henry Fuchs, Hyounghun Kim, Austin S. Rose, Jan-Michael Frahm

Published in: The Laryngoscope. 129

Objectives/Hypothesis: Augmented reality (AR) allows for the addition of transparent virtual images and video to one's view of a physical environment. Our objective was to develop a head-worn, AR system for accurate, intraoperative localization of pat

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e6e536820551aaa0589c3e4023e711e5
https://doi.org/10.1002/lary.28098

Improving Visual Question Answering by Referring to Generated Paragraph Captions

Authors: Hyounghun Kim, Mohit Bansal

Published in: ACL (1)

Paragraph-style image captions describe diverse aspects of an image as opposed to the more common single-sentence captions that only provide an abstract description of the image. These paragraph captions can hence contain substantial information of t

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a73e3448b2632da1ca418d1b28e1acef
http://arxiv.org/abs/1906.06216
