Showing 1 - 10 of 50,673 results for search: '"LiQiang"'
Author:
杨利强 (yangliqiang817.ossl@sinopec.com)
Published in:
Petroleum Drilling Techniques. Sep2024, Vol. 52 Issue 5, p138-144. 7p.
Video Large Language Models (VideoLLMs) have achieved remarkable progress in video understanding. However, existing VideoLLMs often inherit the limitations of their backbone LLMs in handling long sequences, leading to challenges for long video unders
External link:
http://arxiv.org/abs/2412.20504
This technical report introduces our top-ranked solution that employs two approaches, i.e., suffix injection and projected gradient descent (PGD), to address the TiFA workshop MLLM attack challenge. Specifically, we first append the text from an incor
External link:
http://arxiv.org/abs/2412.15614
The rapid advancements in large language models (LLMs) have demonstrated their potential to accelerate scientific discovery, particularly in automating the process of research ideation. LLM-based systems have shown promise in generating hypotheses an
External link:
http://arxiv.org/abs/2412.14626
Modern software systems produce vast amounts of logs, serving as an essential resource for anomaly detection. Artificial Intelligence for IT Operations (AIOps) tools have been developed to automate the process of log-based anomaly detection for softw
External link:
http://arxiv.org/abs/2412.15445
We introduce a new task called Defeasible Visual Entailment (DVE), where the goal is to allow the modification of the entailment relationship between an image premise and a text hypothesis based on an additional update. While this concept is well-est
External link:
http://arxiv.org/abs/2412.16232
Author:
Qu, Leigang, Li, Haochuan, Wang, Wenjie, Liu, Xiang, Li, Juncheng, Nie, Liqiang, Chua, Tat-Seng
Large Multimodal Models (LMMs) have demonstrated impressive capabilities in multimodal understanding and generation, pushing forward advancements in text-to-image generation. However, achieving accurate text-image alignment for LMMs, particularly in
External link:
http://arxiv.org/abs/2412.05818
We introduce camera ray matching (CRAYM) into the joint optimization of camera poses and neural fields from multi-view images. The optimized field, referred to as a feature volume, can be "probed" by the camera rays for novel view synthesis (NVS) and
External link:
http://arxiv.org/abs/2412.01618
With the development of smart cities, the demand for continuous pedestrian navigation in large-scale urban environments has significantly increased. While global navigation satellite systems (GNSS) provide low-cost and reliable positioning services,
External link:
http://arxiv.org/abs/2411.19845
Author:
Zhou, Boyao, Zheng, Shunyuan, Tu, Hanzhang, Shao, Ruizhi, Liu, Boning, Zhang, Shengping, Nie, Liqiang, Liu, Yebin
Differentiable rendering techniques have recently shown promising results for free-viewpoint video synthesis of characters. However, such methods, either Gaussian Splatting or neural implicit rendering, typically necessitate per-subject optimization
External link:
http://arxiv.org/abs/2411.11363