Showing 1 - 10 of 556,633 results for search: '"Lan BY"'
Text-driven Image to Video Generation (TI2V) aims to generate controllable video given the first frame and corresponding textual description. The primary challenges of this task lie in two parts: (i) how to identify the target objects and ensure the
External link:
http://arxiv.org/abs/2412.10275
Author:
Kootte, B., Reiter, M. P., Andreoiu, C., Beck, S., Bergmann, J., Brunner, T., Dickel, T., Dietrich, K. A., Dilling, J., Dunling, E., Flowerdew, J., Graham, L., Gwinner, G., Hockenbery, Z., Izzo, C., Jacobs, A., Javaji, A., Klawitter, R., Lan, Y., Leistenschneider, E., Lykiardopoulou, E. M., Miskun, I., Mukul, I., Murböck, T., Paul, S. F., Plaß, W. R., Ringuette, J., Scheidenberger, C., Silwal, R., Simpson, R., Teigelhöfer, A., Thompson, R. I., Tracy, Jr., J. L., Vansteenkiste, M., Weil, R., Wieser, M. E., Will, C., Kwiatkowski, A. A.
Direct observation of proton emission with very small emission energy is often unfeasible due to the long partial half-lives associated with tunneling through the Coulomb barrier. Therefore proton emitters with very small Q-values may require masses
External link:
http://arxiv.org/abs/2412.10259
Mixed service mode docks enhance efficiency by flexibly handling both loading and unloading trucks in warehouses. However, existing research often predetermines the number and location of these docks prior to planning truck assignment and sequencing.
External link:
http://arxiv.org/abs/2412.09090
Author:
Zhang, Zheyuan, Wang, Zehong, Ma, Tianyi, Taneja, Varun Sameer, Nelson, Sofia, Le, Nhi Ha Lan, Murugesan, Keerthiram, Ju, Mingxuan, Chawla, Nitesh V, Zhang, Chuxu, Ye, Yanfang
The prevalence of unhealthy eating habits has become an increasingly concerning issue in the United States. However, major food recommendation platforms (e.g., Yelp) continue to prioritize users' dietary preferences over the healthiness of their choi
External link:
http://arxiv.org/abs/2412.08847
Author:
Liu, Jihao, Yu, Zhiding, Lan, Shiyi, Wang, Shihao, Fang, Rongyao, Kautz, Jan, Li, Hongsheng, Alvarez, Jose M.
This paper presents StreamChat, a novel approach that enhances the interaction capabilities of Large Multimodal Models (LMMs) with streaming video content. In streaming interaction scenarios, existing methods rely solely on visual information availab
External link:
http://arxiv.org/abs/2412.08646
The objective of multimodal intent recognition (MIR) is to leverage various modalities, such as text, video, and audio, to detect user intentions, which is crucial for understanding human language and context in dialogue systems. Despite advances in th
External link:
http://arxiv.org/abs/2412.08529
This study aims to achieve more precise and versatile object control in image-to-video (I2V) generation. Current methods typically represent the spatial movement of target objects with 2D trajectories, which often fail to capture user intention and f
External link:
http://arxiv.org/abs/2412.07721
Author:
Mu, Jiazuo, Yang, Fuyi, Zhang, Yanshun, Zhang, Junxiong, Luo, Yongjian, Xu, Lan, Shi, Yujiao, Yu, Jingyi, Zhang, Yingliang
We introduce CADSpotting, an efficient method for panoptic symbol spotting in large-scale architectural CAD drawings. Existing approaches struggle with the diversity of symbols, scale variations, and overlapping elements in CAD designs. CADSpotting o
External link:
http://arxiv.org/abs/2412.07377
Author:
Lan, Tian
Ocneanu's tube algebra provides a finite algorithm to compute the Drinfeld center of a fusion category. In this work we reveal the universal property underlying the tube algebra. Take a base category $\mathcal V$ which is concrete, bicomplete, and sy
External link:
http://arxiv.org/abs/2412.07198
We investigate the reasoning capabilities of large language models (LLMs) for automatically generating data-cleaning workflows. To evaluate LLMs' ability to complete data-cleaning tasks, we implemented a pipeline for LLM-based Auto Data Cleaning Work
External link:
http://arxiv.org/abs/2412.06724