Showing 1 - 10 of 4,032 for search: '"Bin Kim"'
How does audio describe the world around us? In this work, we propose a method for generating images of visual scenes from diverse in-the-wild sounds. This cross-modal generation task is challenging due to the significant information gap between audi
External link:
http://arxiv.org/abs/2412.06209
Following the success of Large Language Models (LLMs), expanding their boundaries to new modalities represents a significant paradigm shift in multimodal understanding. Human perception is inherently multimodal, relying not only on text but also on a
External link:
http://arxiv.org/abs/2410.18325
Authors:
EunGi, Han; Hyun-Bin, Oh; Sung-Bin, Kim; Etcheberry, Corentin Nivelet; Nam, Suekyeong; Joo, Janghoon; Oh, Tae-Hyun
Speech-driven 3D facial animation has recently garnered attention due to its cost-effective usability in multimedia production. However, most current advances overlook the intelligibility of lip movements, limiting the realism of facial expressions.
External link:
http://arxiv.org/abs/2407.01034
Authors:
Sung-Bin, Kim; Chae-Yeon, Lee; Son, Gihun; Hyun-Bin, Oh; Ju, Janghoon; Nam, Suekyeong; Oh, Tae-Hyun
Recent studies in speech-driven 3D talking head generation have achieved convincing results in verbal articulations. However, generating accurate lip-syncs degrades when applied to input speech in other languages, possibly due to the lack of datasets
External link:
http://arxiv.org/abs/2406.14272
Authors:
Ha, Hyunwoo; Hyun-Bin, Oh; Jun-Seong, Kim; Byung-Ki, Kwon; Sung-Bin, Kim; Tran, Linh-Tam; Kim, Ji-Yun; Bae, Sung-Ho; Oh, Tae-Hyun
Video motion magnification is a technique to capture and amplify subtle motion in a video that is invisible to the naked eye. The deep learning-based prior work successfully demonstrates the modelling of the motion magnification problem with outstand
External link:
http://arxiv.org/abs/2403.01898
Despite recent advances in artificial intelligence, building social intelligence remains a challenge. Among social signals, laughter is one of the distinctive expressions that occur during social interactions between humans. In this work, we
External link:
http://arxiv.org/abs/2312.09818
Laughter is a unique expression, essential to affirmative social interactions of humans. Although current 3D talking head generation methods produce convincing verbal articulations, they often fail to capture the vitality and subtleties of laughter a
External link:
http://arxiv.org/abs/2311.00994
We propose NeuFace, a 3D face mesh pseudo annotation method on videos via neural re-parameterized optimization. Despite the huge progress in 3D face reconstruction methods, generating reliable 3D face labels for in-the-wild dynamic videos remains cha
External link:
http://arxiv.org/abs/2310.03205
Published in:
ICT Express, Vol 10, Iss 6, Pp 1301-1307 (2024)
In the heterogeneous network (HetNet) employing downlink non-orthogonal multiple access (NOMA), we focus on the non-convex optimization problem to optimize the spectral efficiency (SE) while the users satisfy the quality-of-service (QoS) requirement.
External link:
https://doaj.org/article/65da6be8fe8d4abea5f072f1566b0e6d
Published in:
Fisheries and Aquatic Sciences, Vol 27, Iss 11, Pp 783-790 (2024)
Skin aging is classified according to intrinsic factors, such as genetic and metabolic processes, and extrinsic factors, such as stress and exposure to ultraviolet radiation. These factors activate enzymes in the skin, such as collagenase, elastase,
External link:
https://doaj.org/article/70dcb71a422345af85374ba0d557e0fa