Showing 1 - 10 of 222 for search: '"Li, Daliang"'
Author:
Aksitov, Renat, Miryoosefi, Sobhan, Li, Zonglin, Li, Daliang, Babayan, Sheila, Kopparapu, Kavya, Fisher, Zachary, Guo, Ruiqi, Prakash, Sushant, Srinivasan, Pranesh, Zaheer, Manzil, Yu, Felix, Kumar, Sanjiv
Answering complex natural language questions often necessitates multi-step reasoning and integrating external information. Several systems have combined knowledge retrieval with a large language model (LLM) to answer such questions. These systems, however…
External link:
http://arxiv.org/abs/2312.10003
Author:
Li, Daliang, Rawat, Ankit Singh, Zaheer, Manzil, Wang, Xin, Lukasik, Michal, Veit, Andreas, Yu, Felix, Kumar, Sanjiv
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), owing to their excellent understanding and generation abilities. Remarkably, what further sets these models apart is the massive amounts of world…
External link:
http://arxiv.org/abs/2211.05110
Author:
Wang, Yihan, Si, Si, Li, Daliang, Lukasik, Michal, Yu, Felix, Hsieh, Cho-Jui, Dhillon, Inderjit S, Kumar, Sanjiv
Pretrained large language models (LLMs) are general-purpose problem solvers applicable to a diverse set of tasks with prompts. They can be further improved towards a specific task by fine-tuning on a specialized dataset. However, fine-tuning usually…
External link:
http://arxiv.org/abs/2211.00635
Author:
Li, Zonglin, You, Chong, Bhojanapalli, Srinadh, Li, Daliang, Rawat, Ankit Singh, Reddi, Sashank J., Ye, Ke, Chern, Felix, Yu, Felix, Guo, Ruiqi, Kumar, Sanjiv
This paper studies the curious phenomenon for machine learning models with Transformer architectures that their activation maps are sparse. By activation map we refer to the intermediate output of the multi-layer perceptrons (MLPs) after a ReLU activation…
External link:
http://arxiv.org/abs/2210.06313
Published in:
In Bioorganic Chemistry, September 2024, 150
Published in:
In Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 15 December 2024, 323
Published in:
In Talanta, 1 February 2024, 268 Part 1
Author:
Bhojanapalli, Srinadh, Chakrabarti, Ayan, Glasner, Daniel, Li, Daliang, Unterthiner, Thomas, Veit, Andreas
Deep Convolutional Neural Networks (CNNs) have long been the architecture of choice for computer vision tasks. Recently, Transformer-based architectures like Vision Transformer (ViT) have matched or even surpassed ResNets for image classification. However…
External link:
http://arxiv.org/abs/2103.14586
Author:
Zhu, Chen, Rawat, Ankit Singh, Zaheer, Manzil, Bhojanapalli, Srinadh, Li, Daliang, Yu, Felix, Kumar, Sanjiv
Large Transformer models have achieved impressive performance in many natural language tasks. In particular, Transformer-based language models have been shown to have great capabilities in encoding factual knowledge in their vast amount of parameters…
External link:
http://arxiv.org/abs/2012.00363
Published in:
In LWT, 1 November 2023, 189