Showing 1 - 10 of 2,942 results for search: '"Şenocak A"'
How does audio describe the world around us? In this work, we propose a method for generating images of visual scenes from diverse in-the-wild sounds. This cross-modal generation task is challenging due to the significant information gap between audi
External link:
http://arxiv.org/abs/2412.06209
Following the success of Large Language Models (LLMs), expanding their boundaries to new modalities represents a significant paradigm shift in multimodal understanding. Human perception is inherently multimodal, relying not only on text but also on a
External link:
http://arxiv.org/abs/2410.18325
We present a non-overlapping, Schwarz-type domain decomposition method with a generalized interface condition, designed for physics-informed machine learning of partial differential equations (PDEs) in both forward and inverse contexts. Our approach
External link:
http://arxiv.org/abs/2409.13644
Author:
Senocak, Arda, Ryu, Hyeonggon, Kim, Junsik, Oh, Tae-Hyun, Pfister, Hanspeter, Chung, Joon Son
Recent studies on learning-based sound source localization have mainly focused on the localization performance perspective. However, prior work and existing benchmarks overlook a crucial aspect: cross-modal interaction, which is essential for interac
External link:
http://arxiv.org/abs/2407.13676
Transformers have rapidly overtaken CNN-based architectures as the new standard in audio classification. Transformer-based models, such as the Audio Spectrogram Transformers (AST), also inherit the fixed-size input paradigm from CNNs. However, this l
External link:
http://arxiv.org/abs/2407.08691
Transformers have rapidly become the preferred choice for audio classification, surpassing methods based on CNNs. However, Audio Spectrogram Transformers (ASTs) exhibit quadratic scaling due to self-attention. The removal of this quadratic self-atten
External link:
http://arxiv.org/abs/2406.03344
We study the spontaneous emergence of three-dimensional motion from a quiescent, pure conduction state in stably stratified, convective flow within a triangular enclosure, which eventually self-organizes into a two-dimensional steady state. This phen
External link:
http://arxiv.org/abs/2312.14887
Large-scale pre-trained image-text models demonstrate remarkable versatility across diverse tasks, benefiting from their robust representational capabilities and effective multimodal alignment. We extend the application of these models, specifically
External link:
http://arxiv.org/abs/2311.04066
Author:
Sevilay Çoruh Şenocak, Salim Yüce
Published in:
Mathematica Bohemica, Vol 149, Iss 4, Pp 549-567 (2024)
The aim of this paper is to investigate the orthogonality of vectors to each other and the Gram-Schmidt method in the Minkowski space $\mathbb{R}_2^3$. Hyperbolic cosine formulas are given for all triangle types in the Minkowski plane $\mathbb{R}_1^2
External link:
https://doaj.org/article/5233960e5ea64782bc933f54b981276e
Author:
Senocak, Arda, Ryu, Hyeonggon, Kim, Junsik, Oh, Tae-Hyun, Pfister, Hanspeter, Chung, Joon Son
Humans can easily perceive the direction of sound sources in a visual scene, termed sound source localization. Recent studies on learning-based sound source localization have mainly explored the problem from a localization perspective. However, prior
External link:
http://arxiv.org/abs/2309.10724