Showing 1 - 10 of 1,772 results for search: '"P. Sarto"'
Despite significant advancements in caption generation, existing evaluation metrics often fail to capture the full quality or fine-grained details of captions. This is mainly due to their reliance on non-specific human-written references or noisy pre
External link:
http://arxiv.org/abs/2410.07336
Effectively aligning with human judgment when evaluating machine-generated image captions represents a complex yet intriguing challenge. Existing evaluation metrics like CIDEr or CLIP-Score fall short in this regard as they do not take into account t
External link:
http://arxiv.org/abs/2407.20341
We develop a model based on mean-field games of competitive firms producing similar goods according to a standard AK model with a depreciation rate of capital generating pollution as a byproduct. Our analysis focuses on the widely-used cap-and-trade
External link:
http://arxiv.org/abs/2407.12754
Authors:
Del Sarto, Gianmarco, Flandoli, Franco
We develop a three-timescale framework for modelling climate change and introduce a space-heterogeneous one-dimensional energy balance model. This model, addressing temperature fluctuations from rising carbon dioxide levels and the super-greenhouse e
External link:
http://arxiv.org/abs/2406.11881
The objective of image captioning models is to bridge the gap between the visual and linguistic modalities by generating natural language descriptions that accurately reflect the content of input images. In recent years, researchers have leveraged de
External link:
http://arxiv.org/abs/2405.13127
Authors:
Caffagni, Davide, Cocchi, Federico, Moratelli, Nicholas, Sarto, Sara, Cornia, Marcella, Baraldi, Lorenzo, Cucchiara, Rita
Multimodal LLMs are the natural evolution of LLMs, and enlarge their capabilities so as to work beyond the pure textual modality. As research is being carried out to design novel architectures and vision-and-language adapters, in this paper we concen
External link:
http://arxiv.org/abs/2404.15406
Authors:
Caffagni, Davide, Cocchi, Federico, Barsellotti, Luca, Moratelli, Nicholas, Sarto, Sara, Baraldi, Lorenzo, Cornia, Marcella, Cucchiara, Rita
Connecting text and visual modalities plays an essential role in generative intelligence. For this reason, inspired by the success of large language models, significant research efforts are being devoted to the development of Multimodal Large Languag
External link:
http://arxiv.org/abs/2402.12451
Authors:
Sama, Juvert Njeck, Biancalani, Alessandro, Bottino, Alberto, Del Sarto, Daniele, Dumont, Remi, Di Giannatale, Giovanni, Ghizzo, Alain, Hayward-Schneider, Thomas, Lauber, Philipp, McMillan, Ben, Mishchenko, Alexey, Muruggapan, Moahan, Rettino, Brando, Rofman, Baruch, Vannini, Francesco, Villard, Laurent, Wang, Xin
In this work, we use the global electromagnetic and electrostatic gyrokinetic approaches to investigate the effects of zonal flows forced-driven by Alfvén modes due to their excitation by energetic particles (EPs), on the dynamics of ITG (Ion temp
External link:
http://arxiv.org/abs/2401.04501
Authors:
Gnaldi, Michela, Del Sarto, Simone
The Agenda 2030 recognises corruption as a major obstacle to sustainable development and integrates its reduction among SDG targets, in view of developing peaceful, just and strong institutions. In this paper, we propose a method to assess the validi
External link:
http://arxiv.org/abs/2309.01462
Image captioning, like many tasks involving vision and language, currently relies on Transformer-based architectures for extracting the semantics in an image and translating it into linguistically coherent descriptions. Although successful, the atten
External link:
http://arxiv.org/abs/2308.12383