Showing 1 - 10 of 60 for search: '"McDermott, Matthew B. A."'
Author:
Oufattole, Nassim, Bergamaschi, Teya, Kolo, Aleksia, Jeong, Hyewon, Gaggin, Hanna, Stultz, Collin M., McDermott, Matthew B. A.
Effective, reliable, and scalable development of machine learning (ML) solutions for structured electronic health record (EHR) data requires the ability to reliably generate high-quality baseline models for diverse supervised learning tasks in an eff
External link:
http://arxiv.org/abs/2411.00200
Author:
Steinberg, Ethan, Wornow, Michael, Bedi, Suhana, Fries, Jason Alan, McDermott, Matthew B. A., Shah, Nigam H.
The growing demand for machine learning in healthcare requires processing increasingly large electronic health record (EHR) datasets, but existing pipelines are not computationally efficient or scalable. In this paper, we introduce meds_reader, an op
External link:
http://arxiv.org/abs/2409.09095
Reproducibility remains a significant challenge in machine learning (ML) for healthcare. Datasets, model pipelines, and even task/cohort definitions are often private in this field, leading to a significant barrier in sharing, iterating, and understa
External link:
http://arxiv.org/abs/2406.19653
Author:
McDermott, Matthew B. A., Hansen, Lasse Hyldig, Zhang, Haoran, Angelotti, Giovanni, Gallifant, Jack
In machine learning (ML), a widespread adage is that the area under the precision-recall curve (AUPRC) is a superior metric for model comparison to the area under the receiver operating characteristic (AUROC) for binary classification tasks with clas
External link:
http://arxiv.org/abs/2401.06091
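The entry above compares AUPRC and AUROC for imbalanced binary classification. As a purely illustrative sketch (not taken from the paper), the Python snippet below computes both metrics with scikit-learn on a synthetic, class-imbalanced dataset; the dataset, model, and class weights are assumptions chosen only for demonstration.

# Minimal sketch: AUROC vs. AUPRC on an imbalanced binary task (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic dataset with roughly 5% positives (assumed imbalance for the demo).
X, y = make_classification(
    n_samples=20_000, n_features=20, weights=[0.95, 0.05], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

# AUROC is insensitive to class prevalence, while AUPRC (average precision)
# is anchored to the positive rate, so the two metrics can rank models differently.
print(f"AUROC: {roc_auc_score(y_te, scores):.3f}")
print(f"AUPRC: {average_precision_score(y_te, scores):.3f}")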
Generative, pre-trained transformers (GPTs, a.k.a. "Foundation Models") have reshaped natural language processing (NLP) through their versatility in diverse downstream tasks. However, their potential extends far beyond NLP. This paper provides a soft
External link:
http://arxiv.org/abs/2306.11547
Author:
McDermott, Matthew B. A.
Datasets in the machine learning for health and biomedicine domain are often noisy, irregularly sampled, only sparsely labeled, and small relative to the dimensionality of both the data and the tasks. These problems motivate the use of representa
External link:
https://hdl.handle.net/1721.1/144655
Author:
Falck, Fabian, Zhou, Yuyin, Rocheteau, Emma, Shen, Liyue, Oala, Luis, Abebe, Girmaw, Roy, Subhrajit, Pfohl, Stephen, Alsentzer, Emily, McDermott, Matthew B. A.
A collection of the accepted abstracts for the Machine Learning for Health (ML4H) symposium 2021. This index is not complete, as some accepted abstracts chose to opt out of inclusion.
External link:
http://arxiv.org/abs/2112.00179
Language model pre-training and derived methods are incredibly impactful in machine learning. However, there remains considerable uncertainty on exactly why pre-training helps improve performance for fine-tuning tasks. This is especially true when at
External link:
http://arxiv.org/abs/2103.10334
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 57-62).
Gene expression data holds th
External link:
https://hdl.handle.net/1721.1/121738
Recent developments in Natural Language Processing (NLP) demonstrate that large-scale, self-supervised pre-training can be extremely beneficial for downstream tasks. These ideas have been adapted to other domains, including the analysis of the amino
External link:
http://arxiv.org/abs/2102.00466