Showing 1 - 10 of 171 results for search: '"Wang, Mingze"'
The widespread use of large language models (LLMs) has sparked concerns about the potential misuse of AI-generated text, as these models can produce content that closely resembles human-generated text. Current detectors for AI-generated text (AIGT) l
External link:
http://arxiv.org/abs/2406.01179
Author:
Wang, Mingze, He, Haotian, Wang, Jinbo, Wang, Zilin, Huang, Guanhua, Xiong, Feiyu, Li, Zhiyu, E, Weinan, Wu, Lei
In this work, we propose an Implicit Regularization Enhancement (IRE) framework to accelerate the discovery of flat solutions in deep learning, thereby improving generalization and convergence. Specifically, IRE decouples the dynamics of flat and sha
External link:
http://arxiv.org/abs/2405.20763
Author:
Wang, Mingze, Su, Lili, Yan, Cilin, Xu, Sheng, Yuan, Pengcheng, Jiang, Xiaolong, Zhang, Baochang
The intelligent interpretation of buildings plays a significant role in urban planning and management, macroeconomic analysis, population dynamics, etc. Remote sensing image building interpretation primarily encompasses building extraction and change
External link:
http://arxiv.org/abs/2403.07564
Symmetries exist abundantly in the loss function of neural networks. We characterize the learning dynamics of stochastic gradient descent (SGD) when exponential symmetries, a broad subclass of continuous symmetries, exist in the loss function. We est
External link:
http://arxiv.org/abs/2402.07193
Author:
Wang, Mingze, E, Weinan
We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. We investigate the mechanisms through which different components of Transformer, such as the dot-product self
External link:
http://arxiv.org/abs/2402.00522
In this work, we investigate the margin-maximization bias exhibited by gradient-based algorithms in classifying linearly separable data. We present an in-depth analysis of the specific properties of the velocity field associated with (normalized) gra
External link:
http://arxiv.org/abs/2311.14387
Author:
Wang, Mingze, Wu, Lei
In this paper, we provide a theoretical study of noise geometry for minibatch stochastic gradient descent (SGD), a phenomenon where noise aligns favorably with the geometry of local landscape. We propose two metrics, derived from analyzing how noise
External link:
http://arxiv.org/abs/2310.00692
Published in:
Aircraft Engineering and Aerospace Technology, 2024, Vol. 96, Issue 4, pp. 501-513.
External link:
http://www.emeraldinsight.com/doi/10.1108/AEAT-08-2022-0204
Real-time object detection plays a vital role in various computer vision applications. However, deploying real-time object detectors on resource-constrained platforms poses challenges due to high computational and memory requirements. This paper desc
External link:
http://arxiv.org/abs/2307.04816
Author:
Wang, Mingze, Ma, Chao
The training process of ReLU neural networks often exhibits complicated nonlinear phenomena. The nonlinearity of models and non-convexity of loss pose significant challenges for theoretical analysis. Therefore, most previous theoretical works on the
External link:
http://arxiv.org/abs/2305.12467