Showing 1 - 10 of 10,317 for search: '"A A, Olatunji"'
Author:
Hanke, Vincent, Blanchard, Tom, Boenisch, Franziska, Olatunji, Iyiola Emmanuel, Backes, Michael, Dziedzic, Adam
While open Large Language Models (LLMs) have made significant progress, they still fall short of matching the performance of their closed, proprietary counterparts, making the latter attractive even for use on highly private data. Recently, various …
External link:
http://arxiv.org/abs/2411.05818
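The entry above concerns private adaptation of LLMs. As a generic, hedged illustration of the differential-privacy machinery such work builds on (not this paper's specific method), the PyTorch sketch below shows the core DP-SGD step: per-example gradient clipping followed by calibrated Gaussian noise. The function name and calling convention are hypothetical.

    import torch

    def dp_noisy_mean(per_example_grads, clip_norm=1.0, noise_multiplier=1.0):
        # DP-SGD-style aggregation: clip each per-example gradient to an
        # L2 bound, average, then add Gaussian noise scaled to that bound.
        clipped = []
        for g in per_example_grads:
            scale = (clip_norm / (g.norm() + 1e-12)).clamp(max=1.0)
            clipped.append(g * scale)
        mean = torch.stack(clipped).mean(dim=0)
        sigma = noise_multiplier * clip_norm / len(clipped)
        return mean + torch.randn_like(mean) * sigma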
Given the popularity of generative AI, Large Language Models (LLMs) often consume hundreds or thousands of GPUs for parallelizing and accelerating the training process. Communication overhead becomes more pronounced when training LLMs at scale. …
External link:
http://arxiv.org/abs/2409.15241
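The snippet below is a minimal sketch of the general idea this abstract names, overlapping gradient communication with computation, using only the public torch.distributed API. It assumes an already-initialized process group and a hypothetical list of gradient buckets; it is not the paper's system.

    import torch.distributed as dist

    def allreduce_buckets_async(buckets):
        # Launch all non-blocking all-reduces first so communication for one
        # bucket can overlap with work on the others, then wait and average.
        handles = [dist.all_reduce(b, op=dist.ReduceOp.SUM, async_op=True)
                   for b in buckets]
        world = dist.get_world_size()
        for b, h in zip(buckets, handles):
            h.wait()
            b.div_(world)
        return buckets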
Author:
Yao, Jinghan, Jacobs, Sam Ade, Tanaka, Masahiro, Ruwase, Olatunji, Shafi, Aamir, Subramoni, Hari, Panda, Dhabaleswar K.
Large Language Models (LLMs) with long context capabilities are integral to complex tasks in natural language processing and computational biology, such as text generation and protein sequence analysis. However, training LLMs directly on extremely long …
External link:
http://arxiv.org/abs/2408.16978
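As a hedged sketch of one common ingredient in long-context training, splitting the sequence dimension across ranks so no single GPU holds the full context, the helper below slices a token batch into per-rank chunks. It assumes the sequence length divides evenly by the world size, and it is not the paper's full pipeline.

    import torch

    def shard_sequence(input_ids, rank, world_size):
        # input_ids: (batch, seq_len). Each rank keeps one contiguous chunk
        # of the sequence; attention across chunks then needs communication.
        seq_len = input_ids.size(1)
        chunk = seq_len // world_size  # assumes divisibility for simplicity
        return input_ids[:, rank * chunk:(rank + 1) * chunk]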
Author:
Lian, Xinyu, Jacobs, Sam Ade, Kurilenko, Lev, Tanaka, Masahiro, Bekman, Stas, Ruwase, Olatunji, Zhang, Minjia
Existing checkpointing approaches seem ill-suited for distributed training, even though hardware limitations make model parallelism, i.e., sharding model state across multiple accelerators, a requirement for model scaling. Consolidating distributed model …
External link:
http://arxiv.org/abs/2406.18820
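To make the consolidation problem concrete, here is a deliberately simplified sketch that merges per-rank shards of tensor-parallel weights back into full parameters. It assumes every parameter was sharded along the same dimension, which real systems (where some states are replicated or sharded differently) do not guarantee; the names and paths are hypothetical.

    import torch

    def consolidate_shards(shard_paths, shard_dim=0):
        # Load each rank's state dict and concatenate matching tensors
        # along the sharded dimension to rebuild the unsharded parameters.
        shards = [torch.load(p, map_location="cpu") for p in shard_paths]
        return {name: torch.cat([s[name] for s in shards], dim=shard_dim)
                for name in shards[0]}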
Model checkpoints are critical Deep Learning (DL) artifacts that enable fault tolerance for training and downstream applications, such as inference. However, writing checkpoints to persistent storage, and other I/O aspects of DL training, are mostly …
External link:
http://arxiv.org/abs/2406.13768
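A common remedy the checkpointing literature discusses is decoupling the persistent write from the training loop. The sketch below is a generic pattern rather than this paper's implementation: it snapshots weights to CPU and hands the slow serialization to a background thread.

    import threading
    import torch

    def save_checkpoint_async(model, path):
        # Copy tensors to CPU synchronously (cheap relative to disk I/O),
        # then let a daemon thread perform torch.save in parallel with
        # continued training. join() the thread before relying on the file.
        cpu_state = {k: v.detach().to("cpu", copy=True)
                     for k, v in model.state_dict().items()}
        writer = threading.Thread(target=torch.save, args=(cpu_state, path),
                                  daemon=True)
        writer.start()
        return writer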
Author:
Afonja, Tejumade, Olatunji, Tobi, Ogun, Sewade, Etori, Naome A., Owodunni, Abraham, Yekini, Moshood
Recent strides in automatic speech recognition (ASR) have accelerated its application in the medical domain, where performance on accented medical named entities (NE) such as drug names, diagnoses, and lab results is largely unknown. We rigorously …
External link:
http://arxiv.org/abs/2406.12387
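As a toy illustration of NE-level ASR evaluation (not the metric this paper uses), the function below measures how many reference entities survive transcription verbatim; the entity lists and transcripts are hypothetical inputs.

    def entity_recall(reference_entities, hypothesis_text):
        # Fraction of gold entities (e.g., drug names) found verbatim in the
        # ASR output; crude, since it ignores near-miss misspellings.
        hyp = hypothesis_text.lower()
        hits = sum(1 for ent in reference_entities if ent.lower() in hyp)
        return hits / max(1, len(reference_entities))

    print(entity_recall(["metformin", "lisinopril"],
                        "patient takes metformin twice daily"))  # 0.5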
Author:
Ogun, Sewade, Owodunni, Abraham T., Olatunji, Tobi, Alese, Eniola, Oladimeji, Babatunde, Afonja, Tejumade, Olaleye, Kayode, Etori, Naome A., Adewumi, Tosin
Recent advances in speech synthesis have enabled many useful applications, such as audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from …
External link:
http://arxiv.org/abs/2406.11727
Author:
Abdin, Marah, Aneja, Jyoti, Awadalla, Hany, Awadallah, Ahmed, Awan, Ammar Ahmad, Bach, Nguyen, Bahree, Amit, Bakhtiari, Arash, Bao, Jianmin, Behl, Harkirat, Benhaim, Alon, Bilenko, Misha, Bjorck, Johan, Bubeck, Sébastien, Cai, Martin, Cai, Qin, Chaudhary, Vishrav, Chen, Dong, Chen, Dongdong, Chen, Weizhu, Chen, Yen-Chun, Chen, Yi-Ling, Cheng, Hao, Chopra, Parul, Dai, Xiyang, Dixon, Matthew, Eldan, Ronen, Fragoso, Victor, Gao, Jianfeng, Gao, Mei, Gao, Min, Garg, Amit, Del Giorno, Allie, Goswami, Abhishek, Gunasekar, Suriya, Haider, Emman, Hao, Junheng, Hewett, Russell J., Hu, Wenxiang, Huynh, Jamie, Iter, Dan, Jacobs, Sam Ade, Javaheripi, Mojan, Jin, Xin, Karampatziakis, Nikos, Kauffmann, Piero, Khademi, Mahoud, Kim, Dongwoo, Kim, Young Jin, Kurilenko, Lev, Lee, James R., Lee, Yin Tat, Li, Yuanzhi, Li, Yunsheng, Liang, Chen, Liden, Lars, Lin, Xihui, Lin, Zeqi, Liu, Ce, Liu, Liyuan, Liu, Mengchen, Liu, Weishung, Liu, Xiaodong, Luo, Chong, Madan, Piyush, Mahmoudzadeh, Ali, Majercak, David, Mazzola, Matt, Mendes, Caio César Teodoro, Mitra, Arindam, Modi, Hardik, Nguyen, Anh, Norick, Brandon, Patra, Barun, Perez-Becker, Daniel, Portet, Thomas, Pryzant, Reid, Qin, Heyang, Radmilac, Marko, Ren, Liliang, de Rosa, Gustavo, Rosset, Corby, Roy, Sambudha, Ruwase, Olatunji, Saarikivi, Olli, Saied, Amin, Salim, Adil, Santacroce, Michael, Shah, Shital, Shang, Ning, Sharma, Hiteshi, Shen, Yelong, Shukla, Swadheen, Song, Xia, Tanaka, Masahiro, Tupini, Andrea, Vaddamanu, Praneetha, Wang, Chunyu, Wang, Guanhua, Wang, Lijuan, Wang, Shuohang, Wang, Xin, Wang, Yu, Ward, Rachel, Wen, Wen, Witte, Philipp, Wu, Haiping, Wu, Xiaoxia, Wyatt, Michael, Xiao, Bin, Xu, Can, Xu, Jiahang, Xu, Weijian, Xue, Jilong, Yadav, Sonali, Yang, Fan, Yang, Jianwei, Yang, Yifan, Yang, Ziyi, Yu, Donghan, Yuan, Lu, Zhang, Chenruidong, Zhang, Cyril, Zhang, Jianwen, Zhang, Li Lyna, Zhang, Yi, Zhang, Yue, Zhang, Yunan, Zhou, Xiren
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5. …
External link:
http://arxiv.org/abs/2404.14219
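phi-3-mini was released with open weights, so it can be tried directly through the Hugging Face transformers API. The sketch below assumes the published checkpoint id microsoft/Phi-3-mini-4k-instruct and enough memory for 3.8B parameters.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Phi-3-mini-4k-instruct"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", trust_remote_code=True)

    inputs = tok("Why does training-data quality matter for small models?",
                 return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    print(tok.decode(out[0], skip_special_tokens=True))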
Although robot-assisted cardiovascular catheterization is commonly performed for the intervention of cardiovascular diseases, more studies are needed to support the procedure with automated tool segmentation. This can aid surgeons in tool tracking and visualization …
External link:
http://arxiv.org/abs/2404.07594
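To ground what "automated tool segmentation" asks of a model, here is a minimal encoder-decoder in PyTorch that predicts a per-pixel tool probability mask; it is an illustrative toy, far smaller than any architecture a paper on this topic would actually use.

    import torch
    import torch.nn as nn

    class TinySegNet(nn.Module):
        # Toy binary segmenter: downsample once, upsample once, sigmoid mask.
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                                     nn.ReLU(), nn.MaxPool2d(2))
            self.dec = nn.Sequential(nn.Upsample(scale_factor=2),
                                     nn.Conv2d(16, 1, 3, padding=1))

        def forward(self, x):
            return torch.sigmoid(self.dec(self.enc(x)))

    mask = TinySegNet()(torch.randn(1, 3, 256, 256))  # (1, 1, 256, 256)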
Author:
Zhang, Zhenyu, Chen, Runjin, Liu, Shiwei, Yao, Zhewei, Ruwase, Olatunji, Chen, Beidi, Wu, Xiaoxia, Wang, Zhangyang
This paper aims to overcome the "lost-in-the-middle" challenge of large language models (LLMs). While recent advancements have successfully enabled LLMs to perform stable language modeling with up to 4 million tokens, the persistent difficulty faced …
External link:
http://arxiv.org/abs/2403.04797
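One family of fixes that long-context work (including position-encoding approaches to lost-in-the-middle) builds on is rescaling position indices before computing rotary embeddings. The sketch below shows only that generic rescaling step with a hypothetical ratio; it is not a claim about this paper's exact mechanism.

    import torch

    def rescale_positions(seq_len, ratio):
        # Compress position indices by `ratio` so distant tokens fall within
        # a range the pretrained rotary embedding handles well; varying the
        # ratio (e.g., per attention head) yields a multi-scale scheme.
        return torch.arange(seq_len, dtype=torch.float32) / ratio

    pos = rescale_positions(8192, ratio=4.0)  # 0.00, 0.25, 0.50, ...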