Search results - "Kankanhalli, Mohan"

Report

STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting

Authors: Chai, Zenghao, Tang, Chen, Wong, Yongkang, Kankanhalli, Mohan

The creation of 4D avatars (i.e., animated 3D avatars) from text description typically uses text-to-image (T2I) diffusion models to synthesize 3D avatars in the canonical space and subsequently applies animation with target motions. However, such an

External link: http://arxiv.org/abs/2406.04629

Report

Do Vision-Language Transformers Exhibit Visual Commonsense? An Empirical Study of VCR

Authors: Li, Zhenyang, Guo, Yangyang, Wang, Kejie, Chen, Xiaolin, Nie, Liqiang, Kankanhalli, Mohan

Visual Commonsense Reasoning (VCR) calls for explanatory reasoning behind question answering over visual scenes. To achieve this goal, a model is required to provide an acceptable rationale as the reason for the predicted answers. Progress on the ben

External link: http://arxiv.org/abs/2405.16934

Report

Multi-Modal Recommendation Unlearning

Authors: Sinha, Yash, Mandal, Murari, Kankanhalli, Mohan

Unlearning methods for recommender systems (RS) have emerged to address privacy issues and concerns about legal compliance. However, evolving user preferences and content licensing issues still remain unaddressed. This is particularly true in case of

External link: http://arxiv.org/abs/2405.15328

Report

TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment

Authors: Li, Wei, Fan, Hehe, Wong, Yongkang, Kankanhalli, Mohan, Yang, Yi

Recent advancements in image understanding have benefited from the extensive use of web image-text pairs. However, video understanding remains a challenge despite the availability of substantial web video-text data. This difficulty primarily arises f

External link: http://arxiv.org/abs/2405.13911

Report

Bridging the Intent Gap: Knowledge-Enhanced Visual Generation

Authors: Cheng, Yi, Xu, Ziwei, Lin, Dongyun, Cheng, Harry, Wong, Yongkang, Sun, Ying, Lim, Joo Hwee, Kankanhalli, Mohan

For visual content generation, discrepancies between user intentions and the generated content have been a longstanding problem. This discrepancy arises from two main factors. First, user intentions are inherently complex, with subtle details not ful

External link: http://arxiv.org/abs/2405.12538

Report

DPTraj-PM: Differentially Private Trajectory Synthesis Using Prefix Tree and Markov Process

Authors: Wang, Nana, Kankanhalli, Mohan

The increasing use of GPS-enabled devices has generated a large amount of trajectory data. These data offer us vital insights to understand the movements of individuals and populations, benefiting a broad range of applications from transportation pla

External link: http://arxiv.org/abs/2404.14106

Report

MCM: Multi-condition Motion Synthesis Framework

Authors: Ling, Zeyu, Han, Bo, Wongkan, Yongkang, Lin, Han, Kankanhalli, Mohan, Geng, Weidong

Published in: International Joint Conference on Artificial Intelligence 2024

Conditional human motion synthesis (HMS) aims to generate human motion sequences that conform to specific conditions. Text and audio represent the two predominant modalities employed as HMS control conditions. While existing research has primarily fo

External link: http://arxiv.org/abs/2404.12886

Report

Cluster-based Graph Collaborative Filtering

Authors: Liu, Fan, Zhao, Shuai, Cheng, Zhiyong, Nie, Liqiang, Kankanhalli, Mohan

Graph Convolution Networks (GCNs) have significantly succeeded in learning user and item representations for recommendation systems. The core of their efficacy is the ability to explicitly exploit the collaborative signals from both the first- and hi

External link: http://arxiv.org/abs/2404.10321

Report

S3Editor: A Sparse Semantic-Disentangled Self-Training Framework for Face Video Editing

Authors: Wang, Guangzhi, Chen, Tianyi, Ghasedi, Kamran, Wu, HsiangTao, Ding, Tianyu, Nuesmeyer, Chris, Zharkov, Ilya, Kankanhalli, Mohan, Liang, Luming

Face attribute editing plays a pivotal role in various applications. However, existing methods encounter challenges in achieving high-quality results while preserving identity, editing faithfulness, and temporal consistency. These challenges are root

External link: http://arxiv.org/abs/2404.08111

Report

How to Understand Named Entities: Using Common Sense for News Captioning

Authors: Xu, Ning, Wang, Yanhui, Zhang, Tingting, Tian, Hongshuo, Kankanhalli, Mohan, Liu, An-An

News captioning aims to describe an image with its news article body as input. It greatly relies on a set of detected named entities, including real-world people, organizations, and places. This paper exploits commonsense knowledge to understand name

External link: http://arxiv.org/abs/2403.06520
