Showing 1 - 10 of 613 for search: '"Bhatele, A."'
Author:
Singla, Vasu, Yue, Kaiyu, Paul, Sukriti, Shirkavand, Reza, Jayawardhana, Mayuka, Ganjdanesh, Alireza, Huang, Heng, Bhatele, Abhinav, Somepalli, Gowthami, Goldstein, Tom
Training large vision-language models requires extensive, high-quality image-text pairs. Existing web-scraped datasets, however, are noisy and lack detailed image descriptions. To bridge this gap, we introduce PixelProse, a comprehensive dataset of o
External link:
http://arxiv.org/abs/2406.10328
Author:
Hans, Abhimanyu, Wen, Yuxin, Jain, Neel, Kirchenbauer, John, Kazemi, Hamid, Singhania, Prajwal, Singh, Siddharth, Somepalli, Gowthami, Geiping, Jonas, Bhatele, Abhinav, Goldstein, Tom
Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training,
External link:
http://arxiv.org/abs/2406.10209
Inference on large language models (LLMs) can be expensive in terms of the compute and memory costs involved, especially when long sequence lengths are used. In particular, the self-attention mechanism used in LLM inference contributes significantly
External link:
http://arxiv.org/abs/2406.02542
Author:
McLeish, Sean, Bansal, Arpit, Stein, Alex, Jain, Neel, Kirchenbauer, John, Bartoldson, Brian R., Kailkhura, Bhavya, Bhatele, Abhinav, Geiping, Jonas, Schwarzschild, Avi, Goldstein, Tom
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit th
External link:
http://arxiv.org/abs/2405.17399
Author:
Nichols, Daniel, Polasam, Pranav, Menon, Harshitha, Marathe, Aniruddha, Gamblin, Todd, Bhatele, Abhinav
Optimizing scientific software is a difficult task because codebases are often large and complex, and performance can depend upon several factors including the algorithm, its implementation, and hardware among others. Causes of poor performance can o
External link:
http://arxiv.org/abs/2404.18864
Author:
Davis, Joshua H., Sivaraman, Pranav, Kitson, Joy, Parasyris, Konstantinos, Menon, Harshitha, Minn, Isaac, Georgakoudis, Giorgis, Bhatele, Abhinav
Portability is critical to ensuring high productivity in developing and maintaining scientific software as the diversity in on-node hardware architectures increases. While several programming models provide portability for diverse GPU platforms, they
External link:
http://arxiv.org/abs/2402.08950
Author:
Cankur, Onur, Tomar, Aditya, Nichols, Daniel, Scully-Allison, Connor, Isaacs, Katherine E., Bhatele, Abhinav
Developing efficient parallel applications is critical to advancing scientific development but requires significant performance analysis and optimization. Performance analysis tools help developers manage the increasing complexity and scale of perfor
External link:
http://arxiv.org/abs/2401.13150
Published in:
The 33rd International Symposium on High-Performance Parallel and Distributed Computing (HPDC '24), June 3-7, 2024, Pisa, Italy. ACM, New York, NY, USA, 14 pages
Large language models are increasingly becoming a popular tool for software development. Their ability to model and generate source code has been demonstrated in a variety of contexts, including code completion, summarization, translation, and lookup
External link:
http://arxiv.org/abs/2401.12554
Parallel applications can spend a significant amount of time performing I/O on large-scale supercomputers. Fast near-compute storage accelerators called burst buffers can reduce the time a processor spends performing I/O and mitigate I/O bottlenecks.
External link:
http://arxiv.org/abs/2312.06131
Despite their better convergence properties compared to first-order optimizers, second-order optimizers for deep learning have been less popular due to their significant computational costs. The primary efficiency bottleneck in such optimizers is mat
External link:
http://arxiv.org/abs/2310.12298