Showing 1 - 10 of 764 for search: '"Dang, Trung"'
Published in:
ACM Multimedia 2024, pages 1467-1475
Recently, transformer-based techniques incorporating superpoints have become prevalent in 3D instance segmentation. However, they often encounter an over-segmentation problem, especially noticeable with large objects. Additionally, unreliable mask pr
External link:
http://arxiv.org/abs/2411.01781
Existing zero-shot text-to-speech (TTS) systems are typically designed to process complete sentences and are constrained by the maximum duration for which they have been trained. However, in many streaming applications, texts arrive continuously in s
External link:
http://arxiv.org/abs/2410.00767
Author:
Dang, Trung; Huang, Zhiyi
We introduce a new model to study algorithm design under unreliable information, and apply this model for the problem of finding the uncorrupted maximum element of a list containing $n$ elements, among which are $k$ corrupted elements. Under our mode
External link:
http://arxiv.org/abs/2409.06014
Mamba, a State Space Model (SSM), has recently shown competitive performance to Convolutional Neural Networks (CNNs) and Transformers in Natural Language Processing and general sequence modeling. Various attempts have been made to adapt Mamba to Comp
External link:
http://arxiv.org/abs/2408.14415
LiveSpeech: Low-Latency Zero-shot Text-to-Speech via Autoregressive Modeling of Audio Discrete Codes
Prior works have demonstrated zero-shot text-to-speech by using a generative language model on audio tokens obtained via a neural audio codec. It is still challenging, however, to adapt them to low-latency scenarios. In this paper, we present LiveSpe
External link:
http://arxiv.org/abs/2406.02897
Accurate retinal vessel (RV) segmentation is a crucial step in the quantitative assessment of retinal vasculature, which is needed for the early detection of retinal diseases and other conditions. Numerous studies have been conducted to tackle the pr
External link:
http://arxiv.org/abs/2405.16815
One of the primary challenges in brain tumor segmentation arises from the uncertainty of voxels close to tumor boundaries. However, the conventional process of generating ground truth segmentation masks fails to treat such uncertainties properly. Tho
External link:
http://arxiv.org/abs/2405.16813
We study multi-buyer multi-item sequential item pricing mechanisms for revenue maximization with the goal of approximating a natural fractional relaxation -- the ex ante optimal revenue. We assume that buyers' values are subadditive but make no assum
External link:
http://arxiv.org/abs/2404.14679
Masked Autoencoders (MAEs) learn rich low-level representations from unlabeled data but require substantial labeled data to effectively adapt to downstream tasks. Conversely, Instance Discrimination (ID) emphasizes high-level semantics, offering a po
External link:
http://arxiv.org/abs/2403.09579
Optimality in Mean Estimation: Beyond Worst-Case, Beyond Sub-Gaussian, and Beyond $1+\alpha$ Moments
There is growing interest in improving our algorithmic understanding of fundamental statistical problems such as mean estimation, driven by the goal of understanding the limits of what we can extract from valuable data. The state of the art results f
External link:
http://arxiv.org/abs/2311.12784