Showing 1 - 10 of 651 for search: '"CLARK, RONALD A."'
We introduce Olympus, a new approach that transforms Multimodal Large Language Models (MLLMs) into a unified framework capable of handling a wide array of computer vision tasks. Utilizing a controller MLLM, Olympus delegates over 20 specialized tasks
External link:
http://arxiv.org/abs/2412.09612
Author:
Motwani, Sumeet Ramesh; Smith, Chandler; Das, Rocktim Jyoti; Rybchuk, Markian; Torr, Philip H. S.; Laptev, Ivan; Pizzati, Fabio; Clark, Ronald; de Witt, Christian Schroeder
Enabling effective collaboration among LLMs is a crucial step toward developing autonomous systems capable of solving complex problems. While LLMs are typically used as single-model generators, where humans critique and refine their outputs, the pote
External link:
http://arxiv.org/abs/2412.01928
The rapid proliferation of AI-manipulated or generated audio deepfakes poses serious challenges to media integrity and election security. Current AI-driven detection solutions lack explainability and underperform in real-world settings. In this paper
External link:
http://arxiv.org/abs/2410.07436
Author:
Brown, Bradley; Juravsky, Jordan; Ehrlich, Ryan; Clark, Ronald; Le, Quoc V.; Ré, Christopher; Mirhoseini, Azalia
Scaling the amount of compute used to train language models has dramatically improved their capabilities. However, when it comes to inference, we often limit the amount of compute to only one attempt per problem. Here, we explore inference compute as
External link:
http://arxiv.org/abs/2407.21787
Author:
Lin, Yuanze; Li, Yunsheng; Chen, Dongdong; Xu, Weijian; Clark, Ronald; Torr, Philip; Yuan, Lu
In recent years, multimodal large language models (MLLMs) have made significant strides by training on vast high-quality image-text datasets, enabling them to generally understand images well. However, the inherent difficulty in explicitly conveying
External link:
http://arxiv.org/abs/2407.04681
Author:
Batra, Hunar; Clark, Ronald
Continual learning aims to allow models to learn new tasks without forgetting what has been learned before. This work introduces Elastic Variational Continual Learning with Weight Consolidation (EVCL), a novel hybrid model that integrates the variati
External link:
http://arxiv.org/abs/2406.15972
We present DreamPolisher, a novel Gaussian Splatting based method with geometric guidance, tailored to learn cross-view consistency and intricate detail from textual descriptions. While recent progress on text-to-3D generation methods has been promi
External link:
http://arxiv.org/abs/2403.17237
The creation of accurate virtual models of real-world objects is imperative to robotic simulations and applications such as computer vision, artificial intelligence, and machine learning. This paper documents the different methods employed for genera
External link:
http://arxiv.org/abs/2402.11836
Although Neural Radiance Fields (NeRFs) have markedly improved novel view synthesis, accurate uncertainty quantification in their image predictions remains an open problem. The prevailing methods for estimating uncertainty, including the state-of-the
External link:
http://arxiv.org/abs/2312.02350
The best way to combine the results of deep learning with standard 3D reconstruction pipelines remains an open problem. While systems that pass the output of traditional multi-view stereo approaches to a network for regularisation or refinement curre
External link:
http://arxiv.org/abs/2207.13464