Showing 1 - 10 of 302 for search '"WU Jingfeng"'
Published in:
陆军军医大学学报 (Journal of Army Medical University), Vol. 45, Iss. 21, pp. 2195-2205 (2023)
Objective: To investigate the regulatory effect of nobiletin (NOB) at typical effective doses on intestinal stem cells in vivo and in vitro. Methods: After a 3D culture model of the mouse colorectal tumor cell line MC38 was constructed, the death and survival …
External link:
https://doaj.org/article/27559199d3dc444087d8a262f0ab5d78
Published in:
陆军军医大学学报 (Journal of Army Medical University), Vol. 45, Iss. 19, pp. 1995-2006 (2023)
Objective: To investigate the effects of interleukin-22 (IL-22) in acute colitis induced by dextran sulfate sodium (DSS) in mice and its underlying mechanism. Methods: Three IL-22 knockout (IL-22-/-) mice and 3 control (IL-22+/+) mice, 8-week-old male …
External link:
https://doaj.org/article/56f043d10b83446e9682e31077f6967b
In the context of Machine Learning as a Service (MLaaS) clouds, the extensive use of Large Language Models (LLMs) often requires efficient management of heavy query loads. When providing real-time inference services, several challenges arise. …
External link:
http://arxiv.org/abs/2409.14961
Cloud-native applications are increasingly popular in modern software design. Employing a microservice-based architecture in these applications is a prevalent strategy that enhances system availability and flexibility. However, cloud-native …
External link:
http://arxiv.org/abs/2409.05093
The typical training of neural networks using large-stepsize gradient descent (GD) under the logistic loss often involves two distinct phases: the empirical risk oscillates in the first phase but decreases monotonically in the second. We …
External link:
http://arxiv.org/abs/2406.08654
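The two-phase behavior described above is easy to reproduce in a few lines. A minimal simulation sketch, assuming an illustrative dataset, stepsize, and iteration budget (none of these are taken from the paper):

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

# Illustrative setup: logistic regression on linearly separable 2-D data,
# trained by full-batch gradient descent with a deliberately large stepsize.
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 2))
y = np.sign(X[:, 0])               # labels separable by the first coordinate

def risk_and_grad(w):
    margins = y * (X @ w)
    risk = np.mean(np.logaddexp(0.0, -margins))             # logistic loss
    grad = -(X * (y * expit(-margins))[:, None]).mean(axis=0)
    return risk, grad

w = np.zeros(2)
eta = 40.0                         # far above the classical stability threshold
for t in range(201):
    risk, grad = risk_and_grad(w)
    if t % 20 == 0:
        print(f"iter {t:3d}  empirical risk {risk:.4f}")
    w -= eta * grad
```

Typically the printed risk bounces around in the early iterations (the first phase) and then decreases monotonically once the margins have grown (the second phase).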
Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approximation …
External link:
http://arxiv.org/abs/2406.08466
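For orientation, a commonly reported empirical form of such a scaling law, written in LaTeX; the constants and exponents are generic placeholders rather than anything from the paper:

```latex
% Test error as a joint power law in model size N and data size D;
% a, b, c, alpha, beta are fitted, problem-dependent constants.
\mathrm{Err}(N, D) \;\approx\; a\,N^{-\alpha} + b\,D^{-\beta} + c,
\qquad \alpha, \beta > 0.
```

The tension the abstract gestures at is that the classical decomposition of test error into approximation, optimization, and generalization terms does not obviously produce a clean joint power law of this form.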
We consider gradient descent (GD) with a constant stepsize applied to logistic regression with linearly separable data, where the constant stepsize $\eta$ is so large that the loss initially oscillates. We show that GD exits this initial oscillatory phase …
External link:
http://arxiv.org/abs/2402.15926
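Written out, the standard objective and update behind this setup (notation assumed here, not copied from the paper):

```latex
% Logistic loss on linearly separable data (x_i, y_i), y_i in {-1, +1},
% minimized by constant-stepsize gradient descent.
L(w) \;=\; \frac{1}{n} \sum_{i=1}^{n} \log\!\left(1 + e^{-y_i \langle x_i,\, w \rangle}\right),
\qquad
w_{t+1} \;=\; w_t - \eta\, \nabla L(w_t).
```

Heuristically, oscillation appears once $\eta$ exceeds roughly $2/\beta$, where $\beta$ is the smoothness constant of $L$; because the logistic loss flattens as the margins $y_i \langle x_i, w \rangle$ grow on separable data, the local smoothness shrinks along the trajectory, which is consistent with GD eventually leaving the oscillatory phase.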
We study the in-context learning (ICL) ability of a Linear Transformer Block (LTB) that combines a linear attention component and a linear multi-layer perceptron (MLP) component. For ICL of linear regression with a Gaussian prior and a …
External link:
http://arxiv.org/abs/2402.14951
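A toy sketch of how a linear-attention-style readout can act as an in-context estimator for linear regression. All names, shapes, and the hand-set preconditioner are illustrative assumptions, not the LTB construction from the paper:

```python
import numpy as np

# In-context linear regression: a prompt holds n (x_i, y_i) pairs plus a
# query x_q. A linear-attention-style readout predicts
#   y_hat = x_q^T Gamma (1/n) sum_i y_i x_i,
# i.e. one preconditioned gradient step from zero. Gamma is hand-set here.
rng = np.random.default_rng(1)
d, n = 4, 200
w_star = rng.normal(size=d)                  # task vector, drawn fresh per prompt
X = rng.normal(size=(n, d))                  # context inputs
y = X @ w_star + 0.1 * rng.normal(size=n)    # noisy context labels
x_q = rng.normal(size=d)                     # query input

Gamma = np.linalg.inv(X.T @ X / n)           # illustrative preconditioner
y_hat = x_q @ (Gamma @ (X.T @ y / n))        # attention-style linear readout

print(f"prediction {y_hat:.3f} vs. target {x_q @ w_star:.3f}")
```

With Gamma set to the inverse context covariance the readout coincides with ordinary least squares, while Gamma = I gives a single gradient-descent step from zero; this is roughly the family of estimators a single linear attention layer can express.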
Accelerated stochastic gradient descent (ASGD) is a workhorse in deep learning and often achieves better generalization performance than SGD. However, existing optimization theory can only explain the faster convergence of ASGD, but cannot explain its better generalization. …
External link:
http://arxiv.org/abs/2311.14222
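For reference, a minimal sketch of a Nesterov-style accelerated stochastic update on a toy quadratic; the objective, noise model, and hyperparameters are placeholders rather than anything from the paper:

```python
import numpy as np

# Nesterov-style accelerated SGD on an ill-conditioned 2-D quadratic;
# grad() stands in for a stochastic gradient oracle.
rng = np.random.default_rng(2)
A = np.diag([10.0, 1.0])                      # Hessian of the quadratic

def grad(w):
    return A @ w + 0.1 * rng.normal(size=2)   # noisy gradient

w, v = np.array([5.0, 5.0]), np.zeros(2)
eta, mu = 0.05, 0.9                           # stepsize and momentum
for t in range(100):
    g = grad(w + mu * v)                      # look-ahead gradient
    v = mu * v - eta * g
    w = w + v

print("final iterate:", w)
```

Setting mu = 0 recovers plain SGD, which is the baseline the abstract compares against.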
Author:
Wu, Jingfeng, Zou, Difan, Chen, Zixiang, Braverman, Vladimir, Gu, Quanquan, Bartlett, Peter L.
Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities, enabling them to solve unseen tasks solely based on input contexts without adjusting model parameters. In this paper, we study ICL in one of its simplest …
External link:
http://arxiv.org/abs/2310.08391