Thoughts You Can Trust? Evaluating the Faithfulness of Model-Generated Explanations and Their Effects on Human Performance
👩‍🏫 Speaker: Dr. Oana-Maria Camburu
📅 Time: 2025/11/12
🎥 Recording:
🎬 Recording will be available after the talk (2025/11/12)
📝 Abstract:
Large Language Models (LLMs) can readily generate natural language explanations, or chains of thought (CoTs), to justify their outputs. In this talk, I will first introduce methods for evaluating whether such explanations faithfully reflect the decision-making processes of the models that produce them. Second, I will present the results of a user study involving 85 clinicians and medical students diagnosing chest X-rays. The study compares the effectiveness of natural language explanations, saliency maps, and their combination in supporting clinical decision-making.
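The abstract leaves the evaluation methods to the talk itself. Purely as an illustration of the underlying problem, the sketch below shows one generic faithfulness check from the broader literature: an erasure (counterfactual) test on a toy classifier. The names `toy_model` and `erasure_faithfulness` are hypothetical stand-ins, not the methods presented in the talk; the idea is simply that if an explanation cites a feature as decisive, erasing that feature from the input should change the prediction.

```python
from dataclasses import dataclass

POSITIVE_WORDS = {"great", "excellent", "good"}
NEGATIVE_WORDS = {"bad", "awful", "poor"}

@dataclass
class Output:
    label: str
    explanation: list[str]  # words the model claims drove its decision

def toy_model(text: str) -> Output:
    """Hypothetical stand-in for an LLM: classifies sentiment and cites the words it used."""
    words = text.lower().split()
    pos = [w for w in words if w in POSITIVE_WORDS]
    neg = [w for w in words if w in NEGATIVE_WORDS]
    label = "positive" if len(pos) >= len(neg) else "negative"
    return Output(label=label, explanation=pos if label == "positive" else neg)

def erasure_faithfulness(text: str) -> float:
    """Fraction of cited words whose removal flips the prediction.

    1.0 means every word the explanation cites was actually decisive;
    low scores suggest the explanation does not reflect the decision process.
    """
    original = toy_model(text)
    if not original.explanation:
        return 1.0
    flips = 0
    for feature in original.explanation:
        # Counterfactual input: the same text with the cited word erased.
        counterfactual = " ".join(w for w in text.lower().split() if w != feature)
        if toy_model(counterfactual).label != original.label:
            flips += 1
    return flips / len(original.explanation)

print(erasure_faithfulness("the food was great but service was awful"))  # -> 1.0
```

The sample sentence scores 1.0 because removing the single cited word ("great") flips the toy prediction; an explanation that cites words whose removal changes nothing would score lower, signalling a possible gap between what the model says and what it does.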
👩‍🎓 Biography:
Oana-Maria Camburu is an Assistant Professor in the Department of Computing at Imperial College London. She was previously a Principal Research Fellow in the Department of Computer Science at University College London, where she held a Leverhulme Early Career Fellowship, and a postdoctoral researcher at the University of Oxford, where she also obtained her PhD with a thesis on Explaining Deep Neural Networks. Her main research interests lie in explainability, AI safety, and alignment.