Showing 1 - 10 of 15 for search: '"Kuchnik, Michael"'
Author:
Kokolis, Apostolos, Kuchnik, Michael, Hoffman, John, Kumar, Adithya, Malani, Parth, Ma, Faye, DeVito, Zachary, Sengupta, Shubho, Saladi, Kalyan, Wu, Carole-Jean
Reliability is a fundamental challenge in operating large-scale machine learning (ML) infrastructures, particularly as the scale of ML models and training clusters continues to grow. Despite decades of research on infrastructure failures, the impact…
External link:
http://arxiv.org/abs/2410.21680
Author:
Jain, Nitisha, Akhtar, Mubashara, Giner-Miguelez, Joan, Shinde, Rajat, Vanschoren, Joaquin, Vogler, Steffen, Goswami, Sujata, Rao, Yuhan, Santos, Tim, Oala, Luis, Karamousadakis, Michalis, Maskey, Manil, Marcenac, Pierre, Conforti, Costanza, Kuchnik, Michael, Aroyo, Lora, Benjelloun, Omar, Simperl, Elena
Data is critical to advancing AI technologies, yet its quality and documentation remain significant challenges, leading to adverse downstream effects (e.g., potential biases) in AI applications. This paper addresses these issues by introducing Croiss…
External link:
http://arxiv.org/abs/2407.16883
Author:
Vidgen, Bertie, Agrawal, Adarsh, Ahmed, Ahmed M., Akinwande, Victor, Al-Nuaimi, Namir, Alfaraj, Najla, Alhajjar, Elie, Aroyo, Lora, Bavalatti, Trupti, Bartolo, Max, Blili-Hamelin, Borhane, Bollacker, Kurt, Bomassani, Rishi, Boston, Marisa Ferrara, Campos, Siméon, Chakra, Kal, Chen, Canyu, Coleman, Cody, Coudert, Zacharie Delpierre, Derczynski, Leon, Dutta, Debojyoti, Eisenberg, Ian, Ezick, James, Frase, Heather, Fuller, Brian, Gandikota, Ram, Gangavarapu, Agasthya, Gangavarapu, Ananya, Gealy, James, Ghosh, Rajat, Goel, James, Gohar, Usman, Goswami, Sujata, Hale, Scott A., Hutiri, Wiebke, Imperial, Joseph Marvin, Jandial, Surgan, Judd, Nick, Juefei-Xu, Felix, Khomh, Foutse, Kailkhura, Bhavya, Kirk, Hannah Rose, Klyman, Kevin, Knotz, Chris, Kuchnik, Michael, Kumar, Shachi H., Kumar, Srijan, Lengerich, Chris, Li, Bo, Liao, Zeyi, Long, Eileen Peters, Lu, Victor, Luger, Sarah, Mai, Yifan, Mammen, Priyanka Mary, Manyeki, Kelvin, McGregor, Sean, Mehta, Virendra, Mohammed, Shafee, Moss, Emanuel, Nachman, Lama, Naganna, Dinesh Jinenhally, Nikanjam, Amin, Nushi, Besmira, Oala, Luis, Orr, Iftach, Parrish, Alicia, Patlak, Cigdem, Pietri, William, Poursabzi-Sangdeh, Forough, Presani, Eleonora, Puletti, Fabrizio, Röttger, Paul, Sahay, Saurav, Santos, Tim, Scherrer, Nino, Sebag, Alice Schoenauer, Schramowski, Patrick, Shahbazi, Abolfazl, Sharma, Vin, Shen, Xudong, Sistla, Vamsi, Tang, Leonard, Testuggine, Davide, Thangarasa, Vithursan, Watkins, Elizabeth Anne, Weiss, Rebecca, Welty, Chris, Wilbers, Tyler, Williams, Adina, Wu, Carole-Jean, Yadav, Poonam, Yang, Xianjun, Zeng, Yi, Zhang, Wenhui, Zhdanov, Fedor, Zhu, Jiacheng, Liang, Percy, Mattson, Peter, Vanschoren, Joaquin
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introdu…
External link:
http://arxiv.org/abs/2404.12241
Author:
Akhtar, Mubashara, Benjelloun, Omar, Conforti, Costanza, Gijsbers, Pieter, Giner-Miguelez, Joan, Jain, Nitisha, Kuchnik, Michael, Lhoest, Quentin, Marcenac, Pierre, Maskey, Manil, Mattson, Peter, Oala, Luis, Ruyssen, Pierre, Shinde, Rajat, Simperl, Elena, Thomas, Goeffry, Tykhonov, Slava, Vanschoren, Joaquin, van der Velde, Jos, Vogler, Steffen, Wu, Carole-Jean
Data is a critical resource for Machine Learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata format for datasets that simplifies how data is used by ML tools and frameworks. Croissant makes da…
External link:
http://arxiv.org/abs/2403.19546
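Following up on the Croissant entry above, a minimal sketch of how a Croissant-described dataset might be consumed from Python with the mlcroissant reference library; the dataset URL and record-set name are hypothetical placeholders, not taken from the paper.

    # Illustrative sketch: load Croissant metadata and iterate over records.
    # Requires: pip install mlcroissant
    import mlcroissant as mlc

    # Hypothetical URL pointing at a Croissant JSON-LD description.
    ds = mlc.Dataset(jsonld="https://example.org/my-dataset/croissant.json")

    # "default" is an assumed record-set name; real datasets define their own.
    for record in ds.records(record_set="default"):
        print(record)
        break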
Although large language models (LLMs) have been touted for their ability to generate natural-sounding text, there are growing concerns around possible negative effects of LLMs such as data memorization, bias, and inappropriate language. Unfortunately…
External link:
http://arxiv.org/abs/2211.15458
Input pipelines, which ingest and transform input data, are an essential part of training Machine Learning (ML) models. However, it is challenging to implement efficient input pipelines, as it requires reasoning about parallelism, asynchrony, and var…
External link:
http://arxiv.org/abs/2111.04131
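To make the friction the abstract above describes concrete, here is a minimal sketch of an input pipeline expressed with tf.data, one common framework, showing where parallelism (parallel map) and asynchrony (prefetching) enter; the file path and parse function are hypothetical, and this is not the paper's own system.

    # Illustrative input pipeline: read, decode, batch, and prefetch images.
    import tensorflow as tf

    def parse_example(path):
        # Hypothetical transform: decode a JPEG file and resize it.
        image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        return tf.image.resize(image, [224, 224])

    files = tf.data.Dataset.list_files("/data/train/*.jpg")  # hypothetical path
    pipeline = (
        files
        .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)  # parallelism
        .batch(32)
        .prefetch(tf.data.AUTOTUNE)  # asynchrony: overlap input with training
    )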
Deep learning accelerators efficiently train over vast and growing amounts of data, placing a newfound burden on commodity networks and storage devices. A common approach to conserve bandwidth involves resizing or compressing data prior to training.
External link:
http://arxiv.org/abs/1911.00472
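As one concrete instance of the "resize or compress data prior to training" approach mentioned in the abstract above, a minimal sketch that re-encodes images at a lower resolution and JPEG quality using Pillow; the directories and quality setting are hypothetical placeholders, and this is not the paper's own technique.

    # Illustrative preprocessing: downscale and re-compress images offline
    # so less data has to cross the network and storage path during training.
    from pathlib import Path
    from PIL import Image

    src, dst = Path("raw_images"), Path("train_images")  # hypothetical dirs
    dst.mkdir(exist_ok=True)
    for p in src.glob("*.jpg"):
        img = Image.open(p).convert("RGB")
        img.thumbnail((256, 256))                          # downscale in place
        img.save(dst / p.name, format="JPEG", quality=75)  # lossy re-encode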
Author:
Kuchnik, Michael, Smith, Virginia
Data augmentation is commonly used to encode invariances in learning methods. However, this process is often performed in an inefficient manner, as artificial examples are created by applying a number of transformations to all points in the training…
External link:
http://arxiv.org/abs/1810.05222
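For reference, a minimal sketch of the baseline the abstract above describes: artificial examples created by applying a transformation to every point in the training set; the arrays and the horizontal-flip transform are hypothetical stand-ins, not the paper's method.

    # Illustrative "augment everything" baseline with NumPy.
    import numpy as np

    X_train = np.random.rand(100, 32, 32, 3)    # hypothetical NHWC image batch
    y_train = np.random.randint(0, 10, 100)     # hypothetical labels

    X_aug = np.flip(X_train, axis=2)            # flip every image left-right
    X_full = np.concatenate([X_train, X_aug])   # originals + augmented copies
    y_full = np.concatenate([y_train, y_train]) # labels are unchanged by the flip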
Author:
Kuchnik, Michael
The field of machine learning, particularly deep learning, has witnessed tremendous recent advances due to improvements in algorithms, compute, and datasets. Systems built to support deep learning have primarily targeted computations used to produce…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::318ae7b4bedc8e1a5c1e6691d272fa0c
Academic article