Showing 1 - 10 of 8 557 for search: '"Constantinides A"'
The growing number of parameters and computational demands of large language models (LLMs) present significant challenges for their efficient deployment. Recently, there is an increasing interest in quantizing weights to extremely low precision while
External link:
http://arxiv.org/abs/2410.06040
Co-designing an AI Impact Assessment Report Template with AI Practitioners and AI Compliance Experts
In the evolving landscape of AI regulation, it is crucial for companies to conduct impact assessments and document their compliance through comprehensive reports. However, current reports lack grounding in regulations and often focus on specific aspe
External link:
http://arxiv.org/abs/2407.17374
Author:
Bogucka, Edyta, Constantinides, Marios, Velazquez, Julia De Miguel, Šćepanović, Sanja, Quercia, Daniele, Gvirtz, Andrés
Today's visualization tools for conveying the risks and benefits of AI technologies are largely tailored for those with technical expertise. To bridge this gap, we have developed a visualization that employs narrative patterns and interactive element
External link:
http://arxiv.org/abs/2407.15685
Translational research, especially in the fast-evolving field of Artificial Intelligence (AI), is key to converting scientific findings into practical innovations. In Responsible AI (RAI) research, translational impact is often viewed through various
External link:
http://arxiv.org/abs/2407.15647
Integrating Artificial Intelligence (AI) into mobile and wearables offers numerous benefits at individual, societal, and environmental levels. Yet, it also spotlights concerns over emerging risks. Traditional assessments of risks and benefits have be
External link:
http://arxiv.org/abs/2407.09322
A number of companies recently worked together to release the new Open Compute Project MX standard for low-precision computation, aimed at efficient neural network implementation. In this paper, we describe and evaluate the first open-source FPGA imp
External link:
http://arxiv.org/abs/2407.01475
Author:
Zhang, Zixi, Zhang, Cheng, Gao, Xitong, Mullins, Robert D., Constantinides, George A., Zhao, Yiren
Low-rank Adaptation (LoRA) has been the de-facto parameter-efficient fine-tuning technique for large language models. We present HeteroLoRA, a light-weight search algorithm that leverages zero-cost proxies to allocate the limited LoRA trainable paramet
External link:
http://arxiv.org/abs/2406.14956
Author:
Chen, Yuang, Zhang, Cheng, Gao, Xitong, Mullins, Robert D., Constantinides, George A., Zhao, Yiren
Grouped-query attention (GQA) has been widely adopted in LLMs to mitigate the complexity of multi-head attention (MHA). To transform an MHA to a GQA, neighbour queries in MHA are evenly split into groups where each group shares the value and key laye
External link:
http://arxiv.org/abs/2406.14963
Manual RTL design and optimization remains prevalent across the semiconductor industry because commercial logic and high-level synthesis tools are unable to match human designs. Our experience in industrial datapath design demonstrates that manual op
External link:
http://arxiv.org/abs/2406.12421
eGPU, a recently-reported soft GPGPU for FPGAs, has demonstrated very high clock frequencies (more than 750 MHz) and small footprint. This means that for the first time, commercial soft processors may be competitive for the kind of heavy numerical co
External link:
http://arxiv.org/abs/2406.03227