Showing 1 - 10 of 991 for search: '"CHEN Jianfei"'
Fully quantized training (FQT) accelerates the training of deep neural networks by quantizing the activations, weights, and gradients into lower precision. To explore the ultimate limit of FQT (the lowest achievable precision), we make a first attempt …
External link:
http://arxiv.org/abs/2408.14267
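The quantization step that FQT applies to activations, weights, and gradients can be illustrated with a generic symmetric quantizer; this NumPy sketch is a textbook illustration of quantize/dequantize round-tripping, not the method of the paper above.

```python
import numpy as np

def quantize(x, bits=4):
    """Symmetric per-tensor quantization to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit
    scale = max(np.abs(x).max() / qmax, 1e-12)  # guard against all-zero input
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4)).astype(np.float32)
q, s = quantize(x, bits=4)
x_hat = dequantize(q, s)
max_err = np.abs(x - x_hat).max()  # rounding error is bounded by scale / 2
```

Lowering `bits` widens the quantization grid and grows `max_err`, which is the precision/accuracy trade-off that FQT studies push to its limit.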
The tremendous success of Large Language Models (LLMs) across various complex tasks relies heavily on their substantial scale, which raises challenges during model deployment due to their large memory consumption. Recently, numerous studies have attempted …
External link:
http://arxiv.org/abs/2407.20584
Denoising diffusion bridge models (DDBMs) are a powerful variant of diffusion models for interpolating between two arbitrary paired distributions given as endpoints. Despite their promising performance in tasks like image translation, DDBMs require a …
External link:
http://arxiv.org/abs/2405.15885
Diffusion models have been extensively used in data generation tasks and are recognized as one of the best generative models. However, their time-consuming deployment, long inference time, and requirements on large memory limit their application on m…
External link:
http://arxiv.org/abs/2404.10445
Training large transformers is slow, but recent innovations in GPU architecture give us an advantage. NVIDIA Ampere GPUs can execute a fine-grained 2:4 sparse matrix multiplication twice as fast as its dense equivalent. In light of this property, …
External link:
http://arxiv.org/abs/2404.01847
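The 2:4 pattern that Ampere's sparse tensor cores accelerate requires at most 2 nonzeros in every contiguous group of 4 values. A common way to impose it is magnitude pruning per group; this NumPy sketch shows that pattern (an illustration of the hardware constraint, not the paper's training method).

```python
import numpy as np

def prune_2_4(w):
    """Zero out the 2 smallest-magnitude entries in every contiguous
    group of 4 along each row, yielding a 2:4 sparse pattern."""
    rows, cols = w.shape
    assert cols % 4 == 0, "row length must be a multiple of 4"
    groups = w.reshape(rows, cols // 4, 4)
    drop = np.argsort(np.abs(groups), axis=-1)[..., :2]  # 2 smallest per group
    mask = np.ones(groups.shape, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (groups * mask).reshape(rows, cols)

rng = np.random.default_rng(0)
w = rng.standard_normal((2, 8))
w_sparse = prune_2_4(w)  # each group of 4 keeps exactly its 2 largest entries
```

Because the position of the 2 survivors is encoded in a small metadata index, the hardware can skip the zeroed multiplications entirely, which is where the 2x throughput comes from.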
Published in:
Di-san junyi daxue xuebao, Vol 41, Iss 5, Pp 490-496 (2019)
Objective: To compare the safety and efficacy of 3 kinds of permanent, biodegradable, and polymer-free drug-eluting stents in the treatment of unprotected left main coronary artery disease (UPLM). Methods: A total of 259 UPLM patients diagnosed by coron…
External link:
https://doaj.org/article/4ad6f2fb1b854ba58b8665660239bff0
Pretraining transformers is generally time-consuming. Fully quantized training (FQT) is a promising approach to speed up pretraining. However, most FQT methods adopt a quantize-compute-dequantize procedure, which often leads to suboptimal speedup and …
External link:
http://arxiv.org/abs/2403.12422
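The quantize-compute-dequantize procedure mentioned above can be sketched as follows; this is a generic NumPy illustration of the pattern and its overhead, not the paper's proposed alternative.

```python
import numpy as np

def qcd_matmul(a, b, bits=8):
    """Quantize-compute-dequantize matmul: quantize both inputs,
    multiply on an integer grid, then rescale the accumulator.
    The extra quantize/dequantize passes over memory are the
    overhead that can make the net speedup suboptimal."""
    qmax = 2 ** (bits - 1) - 1
    sa = np.abs(a).max() / qmax
    sb = np.abs(b).max() / qmax
    qa = np.round(a / sa).astype(np.int32)      # quantize pass
    qb = np.round(b / sb).astype(np.int32)      # quantize pass
    acc = qa @ qb                               # low-precision compute
    return acc.astype(np.float32) * (sa * sb)   # dequantize pass

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8)).astype(np.float32)
b = rng.standard_normal((8, 8)).astype(np.float32)
out = qcd_matmul(a, b)  # close to a @ b, up to quantization error
```

Only the middle line runs in low precision; the surrounding scale/round/rescale traffic is exactly what a fused low-precision pipeline tries to avoid.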
Sampling-based algorithms, which eliminate "unimportant" computations during forward and/or back propagation (BP), offer potential solutions to accelerate neural network training. However, since sampling introduces approximations to training, such …
External link:
http://arxiv.org/abs/2402.17227
Generative Adversarial Imitation Learning (GAIL) trains a generative policy to mimic a demonstrator. It uses on-policy Reinforcement Learning (RL) to optimize a reward signal derived from a GAN-like discriminator. A major drawback of GAIL is its trai…
External link:
http://arxiv.org/abs/2402.16349
Diffusion probabilistic models (DPMs) have exhibited excellent performance for high-fidelity image generation while suffering from inefficient sampling. Recent works accelerate the sampling procedure by proposing fast ODE solvers that leverage the sp…
External link:
http://arxiv.org/abs/2310.13268