Showing 1 - 10 of 38 for search: '"Francis, Rhys"'
Author:
Ward, Francis Rhys, Yang, Zejia, Jackson, Alex, Brown, Randy, Smith, Chandler, Colverd, Grace, Thomson, Louis, Douglas, Raymond, Bartak, Patrik, Rowan, Andrew
Language models (LMs) can exhibit human-like behaviour, but it is unclear how to describe this behaviour without undue anthropomorphism. We formalise a behaviourist view of LM character traits: qualities such as truthfulness, sycophancy, or coherent
External link:
http://arxiv.org/abs/2410.04272
Trustworthy capability evaluations are crucial for ensuring the safety of AI systems, and are becoming a key component of AI regulation. However, the developers of an AI system, or the AI system itself, may have incentives for evaluations to understa
External link:
http://arxiv.org/abs/2406.07358
Intention is an important and challenging concept in AI. It is important because it underlies many other concepts we care about, such as agency, manipulation, legal responsibility, and blame. However, ascribing intent to AI systems is contentious, an
External link:
http://arxiv.org/abs/2402.07221
Deceptive agents are a challenge for the safety, trustworthiness, and cooperation of AI systems. We focus on the problem that agents might deceive in order to achieve their goals (for instance, in our experiments with language models, the goal of bei
External link:
http://arxiv.org/abs/2312.01350
How to detect and mitigate deceptive AI systems is an open problem for the field of safe and trustworthy AI. We analyse two algorithms for mitigating deception: The first is based on the path-specific objectives framework where paths in the game that
External link:
http://arxiv.org/abs/2306.14816
We define a novel neuro-symbolic framework, argumentative reward learning, which combines preference-based argumentation with existing approaches to reinforcement learning from human feedback. Our method improves prior work by generalising human pref
External link:
http://arxiv.org/abs/2209.14010
Author:
Soo, Ai-Lin, Francis, Rhys
A small group of research intensive universities have been working over the last five years to better understand the characteristics of a more effective research data culture - https://www.researchdataculture.org. A research data culture is sought th
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bc8062ff878709ce6cd2edadf215aeac
As part of the eResearch New Zealand 2023 conference, the Research Data Culture Conversation and NeSI (New Zealand eScience Infrastructure) hosted a workshop seeking to translate the approach and findings from an Australian sector initiative on the c
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::75e4ff168e93f1ec1ae2acd1181d05d2
Presentation by Ai-Lin Soo, Rhys Francis and Luc Betbeder-Matibet giving an overview of Characterising Australia's Experience with Research Data at Scale.
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::50690c7e846b80c2bd8a23083ef245e9
The intention of ABLeS (the Australia BioCommons Leadership Share) is to grow and simultaneously accelerate community capacity to construct, maintain and gain insights from community-defined and developed data assets (e.g. reference genome assemblies
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1a2161b86c5368c3e01f6be93a542470