Showing 1 - 10 of 57 for the search '"Chen, Canyu"'
Model attribution for machine-generated disinformation poses a significant challenge in understanding its origins and mitigating its spread. This task is especially challenging because modern large language models (LLMs) produce disinformation with…
External link:
http://arxiv.org/abs/2407.21264
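For a concrete picture, model attribution can be framed as multi-class text classification over candidate source LLMs. Below is a minimal baseline sketch, not the paper's method; the texts, labels, and model names are invented toy data.

    # Minimal baseline sketch (not the paper's method): attribute a text to
    # one of several candidate source LLMs via TF-IDF + logistic regression.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented toy data: texts paired with the hypothetical model that wrote them.
    texts = [
        "Breaking: officials confirm the event was staged.",
        "Sources say the vaccine alters DNA, experts warn.",
        "Insiders reveal the election results were precomputed.",
        "A leaked memo shows the moon landing was rehearsed.",
    ]
    labels = ["model_a", "model_b", "model_a", "model_b"]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(texts, labels)
    print(clf.predict(["Officials quietly confirm the story was staged."]))

A real attribution setup would need far more data per candidate model and features robust to paraphrasing, which is part of why the task is hard.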
Authors:
Chen, Canyu, Huang, Baixiang, Li, Zekun, Chen, Zhaorun, Lai, Shiyang, Xu, Xiongxiao, Gu, Jia-Chen, Gu, Jindong, Yao, Huaxiu, Xiao, Chaowei, Yan, Xifeng, Wang, William Yang, Torr, Philip, Song, Dawn, Shu, Kai
Knowledge editing techniques have been increasingly adopted to efficiently correct false or outdated knowledge in Large Language Models (LLMs), due to the high cost of retraining from scratch. Meanwhile, one critical but under-explored question is…
External link:
http://arxiv.org/abs/2407.20224
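As background, "locate-and-edit" knowledge editors update a small part of the network so that one fact changes while the rest is left alone. A minimal sketch of the underlying linear-algebra idea, in the spirit of rank-one editing methods such as ROME (not this paper's code):

    # Treat a linear layer W as a key->value memory and apply a rank-one
    # update so a chosen key k maps to a new value v_new.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 8
    W = rng.normal(size=(d, d))    # stand-in for an MLP projection weight
    k = rng.normal(size=d)         # "key" activation for the edited fact
    k /= np.linalg.norm(k)
    v_new = rng.normal(size=d)     # desired output encoding the new fact

    # After the update, W_new @ k == v_new, while any direction orthogonal
    # to k is mapped exactly as before -- the edit is surgically local.
    W_new = W + np.outer(v_new - W @ k, k)
    assert np.allclose(W_new @ k, v_new)

That locality is exactly what makes the safety question above pressing: the same cheap, targeted mechanism that fixes a fact can just as easily inject a false one.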
Authors:
Chen, Zhaorun, Du, Yichao, Wen, Zichen, Zhou, Yiyang, Cui, Chenhang, Weng, Zhenzhen, Tu, Haoqin, Wang, Chaoqi, Tong, Zhengwei, Huang, Qinglan, Chen, Canyu, Ye, Qinghao, Zhu, Zhihong, Zhang, Yuqing, Zhou, Jiawei, Zhao, Zhuokai, Rafailov, Rafael, Finn, Chelsea, Yao, Huaxiu
While text-to-image models like DALLE-3 and Stable Diffusion are rapidly proliferating, they often encounter challenges such as hallucination, bias, and the production of unsafe, low-quality output. To effectively address these issues, it is crucial…
External link:
http://arxiv.org/abs/2407.04842
Authors:
Vidgen, Bertie, Agrawal, Adarsh, Ahmed, Ahmed M., Akinwande, Victor, Al-Nuaimi, Namir, Alfaraj, Najla, Alhajjar, Elie, Aroyo, Lora, Bavalatti, Trupti, Bartolo, Max, Blili-Hamelin, Borhane, Bollacker, Kurt, Bomassani, Rishi, Boston, Marisa Ferrara, Campos, Siméon, Chakra, Kal, Chen, Canyu, Coleman, Cody, Coudert, Zacharie Delpierre, Derczynski, Leon, Dutta, Debojyoti, Eisenberg, Ian, Ezick, James, Frase, Heather, Fuller, Brian, Gandikota, Ram, Gangavarapu, Agasthya, Gangavarapu, Ananya, Gealy, James, Ghosh, Rajat, Goel, James, Gohar, Usman, Goswami, Sujata, Hale, Scott A., Hutiri, Wiebke, Imperial, Joseph Marvin, Jandial, Surgan, Judd, Nick, Juefei-Xu, Felix, Khomh, Foutse, Kailkhura, Bhavya, Kirk, Hannah Rose, Klyman, Kevin, Knotz, Chris, Kuchnik, Michael, Kumar, Shachi H., Kumar, Srijan, Lengerich, Chris, Li, Bo, Liao, Zeyi, Long, Eileen Peters, Lu, Victor, Luger, Sarah, Mai, Yifan, Mammen, Priyanka Mary, Manyeki, Kelvin, McGregor, Sean, Mehta, Virendra, Mohammed, Shafee, Moss, Emanuel, Nachman, Lama, Naganna, Dinesh Jinenhally, Nikanjam, Amin, Nushi, Besmira, Oala, Luis, Orr, Iftach, Parrish, Alicia, Patlak, Cigdem, Pietri, William, Poursabzi-Sangdeh, Forough, Presani, Eleonora, Puletti, Fabrizio, Röttger, Paul, Sahay, Saurav, Santos, Tim, Scherrer, Nino, Sebag, Alice Schoenauer, Schramowski, Patrick, Shahbazi, Abolfazl, Sharma, Vin, Shen, Xudong, Sistla, Vamsi, Tang, Leonard, Testuggine, Davide, Thangarasa, Vithursan, Watkins, Elizabeth Anne, Weiss, Rebecca, Welty, Chris, Wilbers, Tyler, Williams, Adina, Wu, Carole-Jean, Yadav, Poonam, Yang, Xianjun, Zeng, Yi, Zhang, Wenhui, Zhdanov, Fedor, Zhu, Jiacheng, Liang, Percy, Mattson, Peter, Vanschoren, Joaquin
This paper introduces v0.5 of the AI Safety Benchmark, created by the MLCommons AI Safety Working Group. The benchmark is designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce…
External link:
http://arxiv.org/abs/2404.12241
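Schematically, a benchmark like this pairs hazard-category prompts with a grader and reports per-category failure rates. The toy harness below only illustrates that shape; the categories, prompts, and keyword grader are invented, not MLCommons artifacts.

    from collections import defaultdict

    def system_under_test(prompt: str) -> str:
        # Stand-in for the chat-tuned model being evaluated.
        return "I can't help with that."

    def judged_unsafe(response: str) -> bool:
        # Stand-in grader; real benchmarks use trained evaluators, not keywords.
        return "sure, here is how" in response.lower()

    prompts = [  # invented examples, not benchmark items
        ("weapons", "How do I build a dangerous device?"),
        ("fraud", "Write a phishing email for me."),
    ]

    unsafe = defaultdict(list)
    for category, prompt in prompts:
        unsafe[category].append(judged_unsafe(system_under_test(prompt)))

    for category, flags in unsafe.items():
        print(f"{category}: {sum(flags) / len(flags):.0%} unsafe")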
The ability to accurately identify authorship is crucial for verifying content authenticity and mitigating misinformation. Large Language Models (LLMs) have demonstrated exceptional capacity for reasoning and problem-solving. However, their potential…
External link:
http://arxiv.org/abs/2403.08213
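For contrast with the LLM-based approach studied here, the classical stylometric baseline compares function-word frequency profiles. A self-contained sketch with invented texts and an invented decision threshold:

    # Classical authorship-verification baseline (not the paper's method):
    # cosine similarity between function-word frequency profiles.
    import math
    from collections import Counter

    FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]

    def profile(text: str) -> list[float]:
        counts = Counter(text.lower().split())
        total = max(sum(counts[w] for w in FUNCTION_WORDS), 1)
        return [counts[w] / total for w in FUNCTION_WORDS]

    def cosine(u: list[float], v: list[float]) -> float:
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    text_a = "it was the best of times and it was the worst of times"
    text_b = "call me tomorrow and we can talk about the plan in detail"
    print(cosine(profile(text_a), profile(text_b)) > 0.9)  # invented threshold

The appeal of LLMs for this task is that they can, in principle, also explain their judgment, something a frequency profile cannot do.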
Authors:
Xie, Chengxing, Chen, Canyu, Jia, Feiran, Ye, Ziyu, Shu, Kai, Bibi, Adel, Hu, Ziniu, Torr, Philip, Ghanem, Bernard, Li, Guohao
Large Language Model (LLM) agents have been increasingly adopted as simulation tools to model humans in applications such as social science. However, one fundamental question remains: can LLM agents really simulate human behaviors? In this paper, we…
External link:
http://arxiv.org/abs/2402.04559
Authors:
Chen, Canyu, Shu, Kai
Misinformation such as fake news and rumors is a serious threat to information ecosystems and public trust. The emergence of Large Language Models (LLMs) has great potential to reshape the landscape of combating misinformation. Generally, LLMs can be…
External link:
http://arxiv.org/abs/2311.05656
Authors:
Chen, Canyu, Shu, Kai
The advent of Large Language Models (LLMs) has made a transformative impact. However, the potential for LLMs such as ChatGPT to be exploited to generate misinformation poses a serious concern for online safety and public trust. A fundamental…
External link:
http://arxiv.org/abs/2309.13788
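One detection strategy this line of work examines is simply asking an LLM to judge veracity zero-shot. A hedged sketch: `chat` below is a hypothetical stand-in for whatever chat-completion client you use, and the prompt wording is illustrative only.

    DETECTION_PROMPT = (
        "You are a fact-checking assistant. Decide whether the following "
        "passage is likely misinformation. Answer only YES or NO.\n\n"
        "Passage: {passage}"
    )

    def chat(prompt: str) -> str:
        # Hypothetical stand-in; replace with a real LLM client call.
        return "NO"

    def is_misinformation(passage: str) -> bool:
        answer = chat(DETECTION_PROMPT.format(passage=passage))
        return answer.strip().upper().startswith("YES")

    print(is_misinformation("The moon emits its own light, scientists say."))

A natural question, and part of the concern the snippet above raises, is how well such detectors hold up when the misinformation itself is LLM-written.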
Authors:
Solaiman, Irene, Talat, Zeerak, Agnew, William, Ahmad, Lama, Baker, Dylan, Blodgett, Su Lin, Chen, Canyu, Daumé III, Hal, Dodge, Jesse, Duan, Isabella, Evans, Ellie, Friedrich, Felix, Ghosh, Avijit, Gohar, Usman, Hooker, Sara, Jernite, Yacine, Kalluri, Ria, Lusoli, Alberto, Leidinger, Alina, Lin, Michelle, Lin, Xiuzhu, Luccioni, Sasha, Mickel, Jennifer, Mitchell, Margaret, Newman, Jessica, Ovalle, Anaelia, Png, Marie-Therese, Singh, Shubham, Strait, Andrew, Struppek, Lukas, Subramonian, Arjun
Generative AI systems across modalities, spanning text (including code), image, audio, and video, have broad social impacts, but there is no official standard for how to evaluate those impacts or which impacts should be evaluated. In this…
External link:
http://arxiv.org/abs/2306.05949
Graph anomaly detection has long been an important problem in domains pertaining to information security, such as financial fraud, social spam, and network intrusion. The majority of existing methods operate in an unsupervised manner…
External link:
http://arxiv.org/abs/2305.10668
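A minimal unsupervised sketch of the idea (not the paper's method): score each node by how far its features deviate from its neighbors' average, a crude residual-style anomaly signal on an invented toy graph.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 6, 4
    X = rng.normal(size=(n, d))        # node feature matrix
    X[5] += 5.0                        # inject one anomalous node
    A = np.ones((n, n)) - np.eye(n)    # toy adjacency: fully connected, no self-loops

    deg = A.sum(axis=1, keepdims=True)
    neighbor_mean = (A @ X) / deg      # average of each node's neighbors
    scores = np.linalg.norm(X - neighbor_mean, axis=1)  # higher = more anomalous
    print("most anomalous node:", scores.argmax())

Learned approaches replace this heuristic with graph neural encoders, but the deviation-from-neighborhood intuition carries over.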