Showing 1 - 10 of 1,955 for search: '"A, Sunayana"'
The ground state properties of strongly rotating bosons confined in an asymmetric anharmonic potential exhibit a split density distribution. However, the out-of-equilibrium dynamics of this split structure remain largely unexplored. Given that rotati
External link:
http://arxiv.org/abs/2411.06163
Author:
Campbell, Declan, Rane, Sunayana, Giallanza, Tyler, De Sabbata, Nicolò, Ghods, Kia, Joshi, Amogh, Ku, Alexander, Frankland, Steven M., Griffiths, Thomas L., Cohen, Jonathan D., Webb, Taylor W.
Recent work has documented striking heterogeneity in the performance of state-of-the-art vision language models (VLMs), including both multimodal language models and text-to-image models. These models are able to describe and generate a diverse array
External link:
http://arxiv.org/abs/2411.00238
Benchmark contamination refers to the presence of test datasets in Large Language Model (LLM) pre-training or post-training data. Contamination can lead to inflated scores on benchmarks, compromising evaluation results and making it difficult to dete
External link:
http://arxiv.org/abs/2410.16186
Large Language Models (LLMs) demonstrate exceptional capabilities in a multitude of NLP tasks. However, the efficacy of such models to languages other than English is often limited. Prior works have shown that encoder-only models such as BERT or XLM-
External link:
http://arxiv.org/abs/2410.16168
A common challenge towards the adaptability of Large Language Models (LLMs) is their ability to learn new languages over time without hampering the model's performance on languages in which the model is already proficient (usually English). Continual
External link:
http://arxiv.org/abs/2410.16006
Assessing the capabilities and limitations of large language models (LLMs) has garnered significant interest, yet the evaluation of multiple models in real-world scenarios remains rare. Multilingual evaluation often relies on translated benchmarks, w
External link:
http://arxiv.org/abs/2410.13671
Information in speech can be divided into two categories: what is being said (content) and how it is expressed (other). Current state-of-the-art (SOTA) techniques model speech at fixed segments, usually 10-25 ms, using a single embedding. Given the o
External link:
http://arxiv.org/abs/2410.11086
Speech modeling methods learn one embedding for a fixed segment of speech, typically in between 10-25 ms. The information present in speech can be divided into two categories: "what is being said" (content) and "how it is expressed" (other) and these
External link:
http://arxiv.org/abs/2408.10557
The lecture notes on "Many-body Quantum Dynamics with MCTDH-X," adapted from the 2023 Heidelberg MCTDH Summer School, provide an in-depth exploration of the Multiconfigurational Time-Dependent Hartree approach for indistinguishable particles. They se
External link:
http://arxiv.org/abs/2407.20317
Author:
Ahuja, Sanchit, Tanmay, Kumar, Chauhan, Hardik Hansrajbhai, Patra, Barun, Aggarwal, Kriti, Del Corro, Luciano, Mitra, Arindam, Dhamecha, Tejas Indulal, Awadallah, Ahmed, Choudhary, Monojit, Chaudhary, Vishrav, Sitaram, Sunayana
Despite the remarkable success of LLMs in English, there is a significant gap in performance in non-English languages. In order to address this, we introduce a novel recipe for creating a multilingual synthetic instruction tuning dataset, sPhinX, whi
External link:
http://arxiv.org/abs/2407.09879