Search results - "Shi,Yangyang"

Report

Stochastic homogenization for two dimensional Navier--Stokes equations with random coefficients

Authors: Su, Dong; Liu, Hui; Shi, Yangyang

This paper derives the stochastic homogenization for two-dimensional Navier--Stokes equations with random coefficients. By means of the weak convergence method and the Stratonovich--Khasminskii averaging principle approach, the solution of two dimensional Na

External link: http://arxiv.org/abs/2412.12866

Report

MASV: Speaker Verification with Global and Local Context Mamba

Authors: Liu, Yang; Wan, Li; Huang, Yiteng; Sun, Ming; Shi, Yangyang; Metze, Florian

Deep learning models like Convolutional Neural Networks and transformers have shown impressive capabilities in speech verification, gaining considerable attention in the research community. However, CNN-based approaches struggle with modeling long-se

External link: http://arxiv.org/abs/2412.10989

Report

SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text

Authors: Liu, Haohe; Lan, Gael Le; Mei, Xinhao; Ni, Zhaoheng; Kumar, Anurag; Nagaraja, Varun; Wang, Wenwu; Plumbley, Mark D.; Shi, Yangyang; Chandra, Vikas

Video and audio are closely correlated modalities that humans naturally perceive together. While recent advancements have enabled the generation of audio or video from text, producing both modalities simultaneously still typically relies on either a

External link: http://arxiv.org/abs/2412.15220

Report

MFF-FTNet: Multi-scale Feature Fusion across Frequency and Temporal Domains for Time Series Forecasting

Authors: Shi, Yangyang; Ren, Qianqian; Liu, Yong; Sun, Jianguo

Time series forecasting is crucial in many fields, yet current deep learning models struggle with noise, data sparsity, and capturing complex multi-scale patterns. This paper presents MFF-FTNet, a novel framework addressing these challenges by combin

External link: http://arxiv.org/abs/2411.17382

Report

Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations

This paper presents Llama Guard 3-1B-INT4, a compact and efficient Llama Guard model, which has been open-sourced to the community during Meta Connect 2024. We demonstrate that Llama Guard 3-1B-INT4 can be deployed on resource-constrained devices, ac

External link: http://arxiv.org/abs/2411.17713

Report

Agent-as-a-Judge: Evaluate Agents with Agents

Authors: Zhuge, Mingchen; Zhao, Changsheng; Ashley, Dylan; Wang, Wenyi; Khizbullin, Dmitrii; Xiong, Yunyang; Liu, Zechun; Chang, Ernie; Krishnamoorthi, Raghuraman; Tian, Yuandong; Shi, Yangyang; Chandra, Vikas; Schmidhuber, Jürgen

Contemporary evaluation techniques are inadequate for agentic systems. These approaches either focus exclusively on final outcomes -- ignoring the step-by-step nature of agentic systems, or require excessive manual labour. To address this, we introdu

External link: http://arxiv.org/abs/2410.10934

Report

Scaling Parameter-Constrained Language Models with Quality Data

Authors: Chang, Ernie; Paltenghi, Matteo; Li, Yang; Lin, Pin-Jie; Zhao, Changsheng; Huber, Patrick; Liu, Zechun; Rabatin, Rastislav; Shi, Yangyang; Chandra, Vikas

Scaling laws in language modeling traditionally quantify training loss as a function of dataset size and model parameters, providing compute-optimal estimates but often neglecting the impact of data quality on model generalization. In this paper, we

External link: http://arxiv.org/abs/2410.03083

Report

Target-Aware Language Modeling via Granular Data Sampling

Authors: Chang, Ernie; Lin, Pin-Jie; Li, Yang; Zhao, Changsheng; Kim, Daeil; Rabatin, Rastislav; Liu, Zechun; Shi, Yangyang; Chandra, Vikas

Language model pretraining generally targets a broad range of use cases and incorporates data from diverse sources. However, there are instances where we desire a model that excels in specific areas without markedly compromising performance in other

External link: http://arxiv.org/abs/2409.14705

Report

High Fidelity Text-Guided Music Editing via Single-Stage Flow Matching

Authors: Lan, Gael Le; Shi, Bowen; Ni, Zhaoheng; Srinivasan, Sidd; Kumar, Anurag; Ellis, Brian; Kant, David; Nagaraja, Varun; Chang, Ernie; Hsu, Wei-Ning; Shi, Yangyang; Chandra, Vikas

We introduce MelodyFlow, an efficient text-controllable high-fidelity music generation and editing model. It operates on continuous latent representations from a low frame rate 48 kHz stereo variational auto encoder codec. Based on a diffusion transf

External link: http://arxiv.org/abs/2407.03648

Report

Speech ReaLLM -- Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time

Authors: Seide, Frank; Doulaty, Morrie; Shi, Yangyang; Gaur, Yashesh; Jia, Junteng; Wu, Chunyang

We introduce Speech ReaLLM, a new ASR architecture that marries "decoder-only" ASR with the RNN-T to make multimodal LLM architectures capable of real-time streaming. This is the first "decoder-only" ASR architecture designed to handle continuous aud

External link: http://arxiv.org/abs/2406.09569
