Showing 1 - 10 of 1,140 results for the search: '"Zhang Jiwen"'
Author:
Zhang Jiwen
Published in:
Applied Mathematics and Nonlinear Sciences, Vol 8, Iss 2, Pp 2053-2060 (2023)
This paper presents new research on interior graphic modeling based on CAD and deep reinforcement learning models. A massive database for graphic design has been established. An optimization method is proposed based on intelligent decision making, intellig…
External link:
https://doaj.org/article/b8db5f69fcc14191b1f7a23f44458cca
In this paper, we deal with the following mixed local/nonlocal Schr\"{o}dinger equation
\begin{equation*}
\left\{
\begin{array}{ll}
- \Delta u + (-\Delta)^s u + u = u^p & \hbox{in $\mathbb{R}^n$,} \\
u > 0 & \hbox{in $\mathbb{R}^n$,} \\
\lim\limits_{|x| \to \infty} u(x) = 0 &
\end{array}
\right.
\end{equation*} …
External link:
http://arxiv.org/abs/2411.09941
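For context on the operator appearing in the record above: the fractional Laplacian $(-\Delta)^s$ with $s \in (0,1)$ is commonly defined (a standard definition, not quoted from the truncated abstract) as the principal-value integral
\begin{equation*}
(-\Delta)^s u(x) = c_{n,s}\, \mathrm{P.V.} \int_{\mathbb{R}^n} \frac{u(x) - u(y)}{|x - y|^{n + 2s}}\, dy,
\end{equation*}
where $c_{n,s} > 0$ is a normalizing constant, so the equation superposes the local diffusion $-\Delta$ with this nonlocal operator.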
This article is concerned with ``up to $C^{2, \alpha}$-regularity results'' about a mixed local-nonlocal nonlinear elliptic equation which is driven by the superposition of Laplacian and fractional Laplacian operators. First of all, an estimate on the …
External link:
http://arxiv.org/abs/2411.09930
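As a reminder of the notation in the record above (a standard definition, not part of the abstract), $C^{2,\alpha}$ regularity on a domain $\Omega$ means $u \in C^{2}(\Omega)$ with second derivatives that are H\"{o}lder continuous of exponent $\alpha \in (0,1)$, i.e.
\begin{equation*}
[D^2 u]_{C^{0,\alpha}(\Omega)} = \sup_{x \neq y \in \Omega} \frac{|D^2 u(x) - D^2 u(y)|}{|x - y|^{\alpha}} < \infty .
\end{equation*}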
We are concerned with the mixed local/nonlocal Schr\"{o}dinger equation
\begin{equation}
- \Delta u + (-\Delta)^s u + u = u^{p+1} \quad \hbox{in $\mathbb{R}^n$,}
\end{equation}
for arbitrary space dimension $n \geqslant 1$, $s \in (0,1)$, and $p \in (0, 2^* - 2)$ …
External link:
http://arxiv.org/abs/2410.19616
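The exponent $2^*$ in the record above is the critical Sobolev exponent (standard convention, stated here for readability):
\begin{equation*}
2^* = \frac{2n}{n-2} \quad \text{for } n \geqslant 3, \qquad 2^* = +\infty \quad \text{for } n = 1, 2,
\end{equation*}
so the condition $p \in (0, 2^* - 2)$ keeps the nonlinearity $u^{p+1}$ strictly in the subcritical range $p + 1 < 2^* - 1$.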
TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens
Reading dense text and locating objects within images are fundamental abilities for Large Vision-Language Models (LVLMs) tasked with advanced jobs. Previous LVLMs, including superior proprietary models like GPT-4o, have struggled to excel in both tasks …
External link:
http://arxiv.org/abs/2410.05261
While large multi-modal models (LMMs) have exhibited impressive capabilities across diverse tasks, their effectiveness in handling complex tasks has been limited by the prevailing single-step reasoning paradigm. To this end, this paper proposes VoCoT …
External link:
http://arxiv.org/abs/2405.16919
Author:
Du, Mengfei, Wu, Binhao, Zhang, Jiwen, Fan, Zhihao, Li, Zejun, Luo, Ruipu, Huang, Xuanjing, Wei, Zhongyu
Vision-and-Language navigation (VLN) requires an agent to navigate in an unseen environment by following natural language instructions. For task completion, the agent needs to align and integrate various navigation modalities, including instruction, observation …
External link:
http://arxiv.org/abs/2404.01994
Author:
Zhang, Jiwen, Wu, Jihao, Teng, Yihua, Liao, Minghui, Xu, Nuo, Xiao, Xiao, Wei, Zhongyu, Tang, Duyu
Large language models (LLMs) have led to a surge of autonomous GUI agents for smartphones, which complete a task triggered by natural language by predicting a sequence of API actions. Even though the task highly relies on past actions and visual observations …
External link:
http://arxiv.org/abs/2403.02713
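The record above describes GUI agents that predict a sequence of API actions conditioned on a natural-language task, the action history, and screen observations. A minimal illustrative sketch of such an agent loop follows; it is not the paper's method, and predict_action, the action strings, and the stop condition are hypothetical placeholders.

from dataclasses import dataclass, field
from typing import List

@dataclass
class EpisodeState:
    instruction: str                                   # natural-language task
    history: List[str] = field(default_factory=list)   # past API actions

def predict_action(state: EpisodeState, screenshot: bytes) -> str:
    # Hypothetical placeholder policy: a real agent would prompt an LLM/LVLM
    # with the instruction, the serialized action history, and the screenshot.
    return "CLICK(540, 1200)" if not state.history else "STOP"

def run_episode(instruction: str, max_steps: int = 10) -> List[str]:
    state = EpisodeState(instruction)
    for _ in range(max_steps):
        screenshot = b""                 # stand-in for a captured screen image
        action = predict_action(state, screenshot)
        state.history.append(action)     # past actions condition the next step
        if action == "STOP":
            break
    return state.history

print(run_episode("Open Settings and enable dark mode"))

The loop makes the dependence on past actions explicit: each prediction receives the full history, which is the aspect the abstract highlights.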
Author:
Li, Zejun, Wang, Ye, Du, Mengfei, Liu, Qingwen, Wu, Binhao, Zhang, Jiwen, Zhou, Chengxing, Fan, Zhihao, Fu, Jie, Chen, Jingjing, Huang, Xuanjing, Wei, Zhongyu
Recent years have witnessed remarkable progress in the development of large vision-language models (LVLMs). Benefiting from the strong language backbones and efficient cross-modal alignment strategies, LVLMs exhibit surprising capabilities to perceive …
External link:
http://arxiv.org/abs/2310.02569
Vision language decision making (VLDM) is a challenging multimodal task. The agent has to understand complex human instructions and complete compositional tasks involving environment navigation and object manipulation. However, the long action sequence …
External link:
http://arxiv.org/abs/2307.08016