Showing 1 - 10 of 4,645 for search: '"P. Satheesh"'
Author:
Akter, Syeda Nahida, Prabhumoye, Shrimai, Kamalu, John, Satheesh, Sanjeev, Nyberg, Eric, Patwary, Mostofa, Shoeybi, Mohammad, Catanzaro, Bryan
The utility of synthetic data to enhance pretraining data quality and hence to improve downstream task accuracy has been widely explored in recent large language models (LLMs). Yet, these approaches fall inadequate in complex, multi-hop and mathemati
External link:
http://arxiv.org/abs/2410.12881
Author:
Agrawal, Aakriti, Ding, Mucong, Che, Zora, Deng, Chenghao, Satheesh, Anirudh, Langford, John, Huang, Furong
How can we harness the collective capabilities of multiple Large Language Models (LLMs) to create an even more powerful model? This question forms the foundation of our research, where we propose an innovative approach to weak-to-strong (w2s) general
External link:
http://arxiv.org/abs/2410.04571
Data augmentation, a cornerstone technique in deep learning, is crucial in enhancing model performance, especially with scarce labeled data. While traditional techniques are effective, their reliance on hand-crafted methods limits their applicability
External link:
http://arxiv.org/abs/2410.02512
Author:
Paul, Kaushik, Maurya, Akash, Henry, Quentin, Sharma, Kartikey, Satheesh, Pranav, Divyajyoti, Kumar, Prayush, Mishra, Chandra Kant
We present a time-domain inspiral-merger-ringdown (IMR) waveform model ESIGMAHM constructed within a framework we named ESIGMA for coalescing binaries of spinning black holes on moderately eccentric orbits (Huerta et al. (2018) [Phys. Rev. D 97, 0240
External link:
http://arxiv.org/abs/2409.13866
Published in:
5th Asia Conference on Machine Learning and Computing (ACMLC), Bangkok, Thailand, 2022, pp. 32-40
Machine Learning (ML) methods have recently become an important built-in component of many smart agriculture platforms. In this paper, we explore a new combination of advanced ML methods for creating a smart agriculture platform where farmers could rea
External link:
http://arxiv.org/abs/2409.05174
Numerical weather prediction (NWP) models often underperform compared to simpler climatology-based precipitation forecasts in northern tropical Africa, even after statistical postprocessing. AI-based forecasting models show promise but have avoided p
External link:
http://arxiv.org/abs/2408.16349
Author:
Gosal, Gurpreet, Xu, Yishi, Ramakrishnan, Gokul, Joshi, Rituraj, Sheinin, Avraham, Zhiming, Chen, Mishra, Biswajit, Vassilieva, Natalia, Hestness, Joel, Sengupta, Neha, Sahu, Sunil Kumar, Jia, Bokang, Pandit, Onkar, Katipomu, Satheesh, Kamboj, Samta, Ghosh, Samujjwal, Pal, Rahul, Mullah, Parvez, Doraiswamy, Soundar, Chami, Mohamed El Karim, Nakov, Preslav
We present an efficient method for adapting a monolingual Large Language Model (LLM) to another language, addressing challenges of catastrophic forgetting and tokenizer limitations. We focus this study on adapting Llama 2 to Arabic. Our two-stage app
External link:
http://arxiv.org/abs/2407.12869
As language models have scaled both their number of parameters and pretraining dataset sizes, the computational cost for pretraining has become intractable except for the most well-resourced teams. This increasing cost makes it ever more important to
External link:
http://arxiv.org/abs/2407.07263
Author:
Nvidia, Adler, Bo, Agarwal, Niket, Aithal, Ashwath, Anh, Dong H., Bhattacharya, Pallab, Brundyn, Annika, Casper, Jared, Catanzaro, Bryan, Clay, Sharon, Cohen, Jonathan, Das, Sirshak, Dattagupta, Ayush, Delalleau, Olivier, Derczynski, Leon, Dong, Yi, Egert, Daniel, Evans, Ellie, Ficek, Aleksander, Fridman, Denys, Ghosh, Shaona, Ginsburg, Boris, Gitman, Igor, Grzegorzek, Tomasz, Hero, Robert, Huang, Jining, Jawa, Vibhu, Jennings, Joseph, Jhunjhunwala, Aastha, Kamalu, John, Khan, Sadaf, Kuchaiev, Oleksii, LeGresley, Patrick, Li, Hui, Liu, Jiwei, Liu, Zihan, Long, Eileen, Mahabaleshwarkar, Ameya Sunil, Majumdar, Somshubra, Maki, James, Martinez, Miguel, de Melo, Maer Rodrigues, Moshkov, Ivan, Narayanan, Deepak, Narenthiran, Sean, Navarro, Jesus, Nguyen, Phong, Nitski, Osvald, Noroozi, Vahid, Nutheti, Guruprasad, Parisien, Christopher, Parmar, Jupinder, Patwary, Mostofa, Pawelec, Krzysztof, Ping, Wei, Prabhumoye, Shrimai, Roy, Rajarshi, Saar, Trisha, Sabavat, Vasanth Rao Naik, Satheesh, Sanjeev, Scowcroft, Jane Polak, Sewall, Jason, Shamis, Pavel, Shen, Gerald, Shoeybi, Mohammad, Sizer, Dave, Smelyanskiy, Misha, Soares, Felipe, Sreedhar, Makesh Narsimhan, Su, Dan, Subramanian, Sandeep, Sun, Shengyang, Toshniwal, Shubham, Wang, Hao, Wang, Zhilin, You, Jiaxuan, Zeng, Jiaqi, Zhang, Jimmy, Zhang, Jing, Zhang, Vivienne, Zhang, Yian, Zhu, Chen
We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distri
External link:
http://arxiv.org/abs/2406.11704
Intent Management Function (IMF) is an integral part of future-generation networks. In recent years, there has been some work on AI-based IMFs that can handle conflicting intents and prioritize the global objective based on an a priori definition of the
External link:
http://arxiv.org/abs/2405.07621