Showing 1 - 10 of 260 for search: '"Rangwala, Huzefa"'
Author:
Liang, Jiaming, Lei, Chuan, Qin, Xiao, Zhang, Jiani, Katsifodimos, Asterios, Faloutsos, Christos, Rangwala, Huzefa
Data-centric AI focuses on understanding and utilizing high-quality, relevant data in training machine learning (ML) models, thereby increasing the likelihood of producing accurate and useful results. Automatic feature augmentation, aiming to augment…
External link:
http://arxiv.org/abs/2406.09534
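A minimal sketch of the feature-augmentation idea in the entry above: greedily join candidate columns from an auxiliary table onto a base table and keep each column only if it improves a held-out score. All table and column names are invented; this illustrates the general idea, not the paper's algorithm.

```python
# Toy greedy feature augmentation: keep an auxiliary column only if it
# improves validation accuracy. Data, tables, and names are synthetic.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

base = pd.DataFrame({
    "customer_id": range(200),
    "spend": [i % 50 for i in range(200)],
    "label": [(i % 50) > 25 for i in range(200)],
})
aux = pd.DataFrame({
    "customer_id": range(200),
    "visits": [i % 7 for i in range(200)],         # candidate feature
    "noise": [(i * 37) % 11 for i in range(200)],  # candidate feature
})

def val_score(df: pd.DataFrame, feats: list) -> float:
    X_tr, X_va, y_tr, y_va = train_test_split(
        df[feats], df["label"], random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return accuracy_score(y_va, model.predict(X_va))

merged = base.merge(aux, on="customer_id")
kept, best = ["spend"], val_score(merged, ["spend"])
for cand in ["visits", "noise"]:
    score = val_score(merged, kept + [cand])
    if score > best:                 # keep only columns that help
        kept, best = kept + [cand], score
print(kept, best)
```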
Author:
Zheng, Da, Song, Xiang, Zhu, Qi, Zhang, Jian, Vasiloudis, Theodore, Ma, Runjie, Zhang, Houyu, Wang, Zichen, Adeshina, Soji, Nisa, Israt, Mottini, Alejandro, Cui, Qingjun, Rangwala, Huzefa, Zeng, Belinda, Faloutsos, Christos, Karypis, George
Published in:
KDD 2024
Graph machine learning (GML) is effective in many business applications. However, making GML easy to use and applicable to industry applications with massive datasets remains challenging. We developed GraphStorm, which provides an end-to-end solution…
External link:
http://arxiv.org/abs/2406.06022
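GraphStorm's actual APIs are not reproduced here. As a hedged stand-in, the sketch below shows the kind of node-classification workload such end-to-end GML frameworks automate, written as a two-layer message-passing model in plain PyTorch on a toy graph.

```python
# Generic GNN node classification on a toy graph (not GraphStorm code):
# symmetric-normalized adjacency, two message-passing layers, cross-entropy.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_nodes, in_dim, num_classes = 6, 4, 2

# Undirected toy graph as a dense adjacency matrix with self-loops.
adj = torch.eye(num_nodes)
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[u, v] = adj[v, u] = 1.0
d_inv_sqrt = adj.sum(1).rsqrt()
norm_adj = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

x = torch.randn(num_nodes, in_dim)      # node features
y = torch.tensor([0, 0, 0, 1, 1, 1])    # node labels
lin1 = torch.nn.Linear(in_dim, 8)
lin2 = torch.nn.Linear(8, num_classes)
opt = torch.optim.Adam(
    list(lin1.parameters()) + list(lin2.parameters()), lr=0.05)

for _ in range(100):
    h = torch.relu(norm_adj @ lin1(x))  # layer 1: aggregate + transform
    logits = norm_adj @ lin2(h)         # layer 2
    loss = F.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.3f}")
```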
Machine learning (ML) algorithms impact virtually every aspect of human lives and have found use across diverse sectors including healthcare, finance, and education. Often, ML algorithms have been found to exacerbate societal biases present in datasets…
External link:
http://arxiv.org/abs/2405.12372
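One standard diagnostic behind bias audits like this is the demographic parity difference: the gap in positive-prediction rates across groups. A minimal sketch with synthetic predictions and group labels:

```python
# Demographic parity difference on synthetic predictions: the gap in
# P(yhat = 1) between two groups. A value near 0 suggests parity.
import numpy as np

preds = np.array([1, 1, 0, 1, 0, 0, 1, 0, 0, 0])  # model decisions
group = np.array(list("aaaaabbbbb"))               # protected attribute

rate_a = preds[group == "a"].mean()
rate_b = preds[group == "b"].mean()
print(f"P(yhat=1 | a) = {rate_a:.2f}, P(yhat=1 | b) = {rate_b:.2f}")
print(f"demographic parity difference = {abs(rate_a - rate_b):.2f}")
```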
Author:
Kong, Kezhi, Zhang, Jiani, Shen, Zhengyuan, Srinivasan, Balasubramaniam, Lei, Chuan, Faloutsos, Christos, Rangwala, Huzefa, Karypis, George
Large Language Models (LLMs) trained on large volumes of data excel at various natural language tasks, but they cannot handle tasks requiring knowledge they were not trained on. One solution is to use a retriever that fetches relevant…
External link:
http://arxiv.org/abs/2402.14361
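The retrieval step this entry refers to can be sketched generically: embed the documents and the query, rank by cosine similarity, and prepend the top hits to the prompt. The embedding function below is a hash-based stand-in, not a real encoder.

```python
# Toy retrieval-augmented prompting: rank documents by cosine similarity
# to the query and prepend the best matches. embed() is a placeholder.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: a real system would call an encoder model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(16)
    return v / np.linalg.norm(v)

docs = [
    "GraphStorm scales GNN training to massive graphs.",
    "Diffusion models can synthesize tabular data.",
    "Peptides differ structurally from longer proteins.",
]
doc_vecs = np.stack([embed(d) for d in docs])

query = "How are GNNs trained at scale?"
scores = doc_vecs @ embed(query)        # cosine similarity (unit vectors)
top = np.argsort(scores)[::-1][:2]
prompt = ("Context:\n" + "\n".join(docs[i] for i in top)
          + f"\n\nQuestion: {query}")
print(prompt)                           # send this string to the LLM
```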
Author:
Mavromatis, Costas, Srinivasan, Balasubramaniam, Shen, Zhengyuan, Zhang, Jiani, Rangwala, Huzefa, Faloutsos, Christos, Karypis, George
Large Language Models (LLMs) can adapt to new tasks via in-context learning (ICL). ICL is efficient because it requires no parameter updates to the trained LLM, only a few annotated examples as input. In this work, we investigate…
External link:
http://arxiv.org/abs/2310.20046
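In-context learning itself needs no parameter updates, only prompt assembly. A minimal sketch with an invented sentiment task:

```python
# In-context learning as prompt construction: a few annotated examples
# plus the new input, no gradient updates to the model. Task is invented.
examples = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]
new_input = "The plot dragged, but the acting was great."

prompt = "Label the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {new_input}\nSentiment:"
print(prompt)   # pass to any instruction-following LLM for completion
```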
Author:
Zhang, Jiani, Shen, Zhengyuan, Srinivasan, Balasubramaniam, Wang, Shen, Rangwala, Huzefa, Karypis, George
Recent advances in large language models have revolutionized many sectors, including the database industry. One common challenge when dealing with large volumes of tabular data is the pervasive use of abbreviated column names, which can negatively impact…
External link:
http://arxiv.org/abs/2310.13196
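A hedged sketch of the task this entry describes: given abbreviated headers plus a few sample rows for context, assemble a prompt asking an LLM for the full column names. The table and the expected answer format are invented, and the LLM call itself is left abstract.

```python
# Build a column-name expansion prompt from abbreviated headers and
# sample rows. The DataFrame and prompt wording are illustrative only.
import pandas as pd

df = pd.DataFrame({
    "cust_nm": ["Ann", "Bo"],
    "acct_bal": [120.5, 88.0],
    "txn_dt": ["2024-01-03", "2024-01-05"],
})

prompt = (
    "Expand each abbreviated column name into a descriptive full name.\n"
    f"Columns: {list(df.columns)}\n"
    f"Sample rows: {df.head(2).to_dict(orient='records')}\n"
    "Answer as 'abbreviation -> full name', one per line."
)
print(prompt)
# A plausible (model-dependent) answer shape:
#   cust_nm -> customer_name
#   acct_bal -> account_balance
#   txn_dt -> transaction_date
```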
Author:
Zhang, Hengrui, Zhang, Jiani, Srinivasan, Balasubramaniam, Shen, Zhengyuan, Qin, Xiao, Faloutsos, Christos, Rangwala, Huzefa, Karypis, George
Recent advances in tabular data generation have greatly enhanced synthetic data quality. However, extending diffusion models to tabular data is challenging due to its intricately varied distributions and blend of data types. This paper…
External link:
http://arxiv.org/abs/2310.09656
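For the numerical columns alone, the diffusion machinery reduces to the standard Gaussian forward process q(x_t | x_0) = N(sqrt(ā_t) x_0, (1 − ā_t) I). A minimal sketch of that closed-form noising step; the mixed-type handling the entry highlights is omitted.

```python
# Closed-form Gaussian forward diffusion on numeric columns only.
# Mixed-type (categorical + numeric) handling is the hard part and is
# deliberately omitted from this sketch.
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 3))       # 4 rows, 3 numeric columns

T = 100
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)    # cumulative signal retention

def q_sample(x0: np.ndarray, t: int) -> np.ndarray:
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) x_0, (1 - a_bar_t) I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * noise

print(q_sample(x0, 10).std(), q_sample(x0, T - 1).std())  # noise grows with t
```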
Author:
Wang, Zifeng, Wang, Zichen, Srinivasan, Balasubramaniam, Ioannidis, Vassilis N., Rangwala, Huzefa, Anubhai, Rishita
Foundation models (FMs) can leverage large volumes of unlabeled data to achieve superior performance across a wide range of tasks. However, FMs developed for biomedical domains have largely remained unimodal, i.e., independently trained…
External link:
http://arxiv.org/abs/2310.03320
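One common way to couple two unimodal encoders into a multimodal model is a symmetric contrastive (CLIP-style) loss over matched pairs. A minimal sketch with random stand-in embeddings; this illustrates the general mechanism, not the paper's architecture.

```python
# Symmetric contrastive (InfoNCE-style) loss aligning two modalities.
# seq_emb / txt_emb stand in for, e.g., protein and text encoder outputs.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, dim = 8, 32
seq_emb = F.normalize(torch.randn(batch, dim), dim=-1)
txt_emb = F.normalize(torch.randn(batch, dim), dim=-1)

logits = seq_emb @ txt_emb.T / 0.07     # similarities / temperature
targets = torch.arange(batch)           # i-th sequence matches i-th text
loss = (F.cross_entropy(logits, targets)
        + F.cross_entropy(logits.T, targets)) / 2
print(loss.item())
```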
Individual-level data (microdata) that characterizes a population is essential for studying many real-world problems. However, acquiring such data is not straightforward due to cost and privacy constraints, and access is often limited to aggregated…
External link:
http://arxiv.org/abs/2212.05975
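A classic way to estimate a joint table consistent with published aggregates is iterative proportional fitting (IPF). A minimal sketch with invented marginals; this is one standard baseline for the problem, not necessarily the paper's method.

```python
# Iterative proportional fitting: rescale rows and columns in turn until
# the table matches both sets of published totals. Marginals are invented.
import numpy as np

row_totals = np.array([60.0, 40.0])        # e.g. counts by age group
col_totals = np.array([30.0, 50.0, 20.0])  # e.g. counts by income band

table = np.ones((2, 3))                    # uninformative seed
for _ in range(50):
    table *= (row_totals / table.sum(axis=1))[:, None]
    table *= (col_totals / table.sum(axis=0))[None, :]

print(table.round(2))   # row/column sums now match both marginals
```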
Representation learning for proteins has primarily focused on the global understanding of protein sequences regardless of their length. However, shorter proteins (known as peptides) take on distinct structures and functions compared to their longer counterparts…
External link:
http://arxiv.org/abs/2211.06428
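A simple classical featurization for short sequences like peptides is k-mer counting, which treats a sequence as a bag of overlapping length-k substrings. A small generic sketch, not the paper's representation method:

```python
# k-mer counts for an amino-acid sequence: overlapping substrings of
# length k as a bag-of-features. Sequence is a toy example.
from collections import Counter

def kmer_counts(seq: str, k: int = 2) -> Counter:
    """Count overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

peptide = "GLFDIVKKV"   # toy peptide
print(kmer_counts(peptide))
```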