Tuesday, 24 June 2025, 08:30 - 12:30 CEST (Central European Summer Time - Sweden)
Wojciech Samek
Fraunhofer Heinrich Hertz Institute and Technical University Berlin, Germany
Modality
Online
Target Audience
This course targets core as well as applied ML researchers. The target audience also includes industry professionals and developers who regularly train and work with deep neural networks.
Abstract
Explainable Artificial Intelligence (XAI) has made significant progress, offering various techniques to explain AI models, each designed for distinct purposes. Early XAI methods provided valuable tools for examining models by explaining individual predictions and visualizing internal model processes, such as the concepts encoded by neurons. These methods have been instrumental in detecting flawed prediction strategies, often referred to as “Clever Hans” behaviors. However, as AI models become more complex, especially with the rise of generative models like large language models (LLMs), next-generation explanation techniques are needed to provide more human-understandable and actionable insights.
This course offers a structured overview of both classical and next-generation XAI methods. It begins with a discussion of classical XAI approaches, their theoretical foundations, and the common challenges and misconceptions that characterized early research in this field. The course then transitions to cutting-edge developments, focusing on methods that provide more comprehensive, human-readable, and practical explanations. These methods help users better understand, debug, and improve AI models in real-world applications. Special attention will be given to concept-level explanations and interactive XAI techniques, which allow users to interact with machine learning models more effectively. Practical use cases will be presented to illustrate how these methods can enhance model transparency and performance.
The final part of the course addresses the growing importance of XAI in the context of Foundation Models, such as LLMs, and introduces recent methodological breakthroughs that offer deeper insights into these models. The course will also provide an outlook on future applications of XAI. Key topics include black-box model challenges, human-understandable explanations and interactive XAI, training data attribution and scalable methods for explaining global model behaviour, and actionable explanations for understanding, debugging, improving and validating LLMs and Foundation Models.
Benefits for attendees
Core machine learning researchers may be interested in the connections between the different explanation methods and in the broad set of open questions, in particular how to extend XAI to new ML algorithms. Applied ML researchers may find it valuable to understand the strong assumptions behind standard validation procedures and why interpretability can be useful to further validate their models. They may also discover new tools to analyze their data and extract insights from it.
Upon completion, attendees will:
- learn about a variety of next-generation explanation methods (attribution-based methods, concept-level XAI, recent training data attribution methods, global XAI approaches); a minimal attribution example is sketched after this list
- understand the theoretical underpinnings, relations and limitations of these methods
- learn how these XAI approaches enable the human user to interact with the ML model in a targeted manner to gain insights into its prediction strategies and to debug, validate, and improve the model
- learn about practical use cases, in particular how to apply these methods to state-of-the-art LLMs and Foundation Models
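To give a concrete flavour of attribution-based methods, the following is a minimal, illustrative gradient × input sketch for an arbitrary PyTorch classifier; the model and input are placeholders, and the attribution methods covered in the course (e.g., LRP) go well beyond this simple baseline.

```python
# Minimal gradient-x-input attribution sketch (illustrative only; the course
# covers more principled attribution methods such as LRP).
import torch
import torch.nn as nn

# Placeholder classifier and input; substitute your own trained model and data.
model = nn.Sequential(nn.Flatten(),
                      nn.Linear(3 * 32 * 32, 128), nn.ReLU(),
                      nn.Linear(128, 10))
model.eval()

x = torch.randn(1, 3, 32, 32, requires_grad=True)

logits = model(x)
target = logits.argmax(dim=1).item()   # explain the predicted class

# Backpropagate the target logit to obtain the gradient w.r.t. the input.
logits[0, target].backward()

# Gradient x input: elementwise product, summed over colour channels,
# gives a per-pixel relevance map for the predicted class.
relevance = (x * x.grad).sum(dim=1).squeeze(0).detach()
print(relevance.shape)  # torch.Size([32, 32])
```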
Course Content
The course begins by exploring "classical" XAI (Explainable AI) techniques, including their applications, theoretical foundations, and the common challenges and misconceptions that arose during the initial phase of XAI research. The second section will shift to newer advancements in the field, with a focus on next-generation XAI methods that offer more comprehensive and actionable explanations. In particular, we will present an approach that delivers more human-understandable explanations (e.g., in terms of concepts), discuss its applications, and introduce a recent toolbox implementing this approach. Furthermore, we will focus on interactive XAI approaches, which enable the human user to interact with the ML model in a targeted manner. Here we will present different use cases where XAI is used to better understand, debug, and refine AI models. The final part will cover the latest developments in XAI for Foundation Models and give an outlook on future applications of XAI.
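As a rough illustration of the concept-level idea (in the spirit of concept activation vectors; the specific approach and toolbox presented in the course may differ), the sketch below fits a linear probe separating "concept" activations from random ones and measures how sensitive a class logit is to that concept direction. The model, layer, and data are placeholders.

```python
# Rough concept-sensitivity sketch (CAV-style); illustrative only, with
# placeholder model, layer, and "concept" vs. random inputs.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder network: a feature extractor followed by a classifier head.
features = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
head = nn.Linear(64, 10)

# Placeholder activations for concept examples vs. random examples.
concept_acts = features(torch.randn(32, 3, 32, 32))
random_acts = features(torch.randn(32, 3, 32, 32))

# Fit a linear probe separating concept from random activations;
# its (normalized) weight vector serves as the concept direction.
probe = nn.Linear(64, 1)
opt = torch.optim.Adam(probe.parameters(), lr=0.05)
X = torch.cat([concept_acts, random_acts]).detach()
y = torch.cat([torch.ones(32), torch.zeros(32)])
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(probe(X).squeeze(1), y)
    loss.backward()
    opt.step()
cav = probe.weight.detach().squeeze(0)
cav = cav / cav.norm()

# Concept sensitivity for one test image and an arbitrary class (index 3):
# the directional derivative of the class logit along the concept direction.
x = torch.randn(1, 3, 32, 32)
acts = features(x).detach().requires_grad_(True)
head(acts)[0, 3].backward()
sensitivity = acts.grad.squeeze(0) @ cav
print(float(sensitivity))
```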
The topics covered are:
- Motivations: Black-box models and the “Clever Hans” effect
- Classical Explainable AI: Concepts, methods & applications
- Challenges and Common Misconceptions in XAI
- Concept-level and human-understandable explanation of the model's inference process
- Training data attribution and scalable methods for explaining global model behaviour (a rough sketch follows this list)
- Actionable explanations for LLMs (understanding, debugging, improving, validating)
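As a back-of-the-envelope sketch of the idea behind training data attribution (in the spirit of gradient-similarity methods such as TracIn; the concrete methods taught in the course may differ), one can score training examples by how strongly their loss gradients align with that of the prediction being explained. The model and data below are placeholders.

```python
# Rough gradient-similarity sketch of training data attribution (TracIn-style);
# illustrative only, with a placeholder model and data.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder
loss_fn = nn.CrossEntropyLoss()

def loss_grad(x, y):
    """Flattened gradient of the loss w.r.t. all model parameters."""
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

# Placeholder training set and a single test prediction to be explained.
train_x, train_y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
test_x, test_y = torch.randn(1, 3, 32, 32), torch.randint(0, 10, (1,))

g_test = loss_grad(test_x, test_y)

# Influence score per training example: dot product of its loss gradient
# with the test example's loss gradient. Large positive scores mark
# "proponents" of the prediction, negative scores mark "opponents".
scores = torch.stack([loss_grad(train_x[i:i + 1], train_y[i:i + 1]) @ g_test
                      for i in range(len(train_x))])
print(scores)
```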
Bio Sketch of Course instructor
Wojciech Samek is a Professor in the EECS Department at Technical University Berlin and jointly heads the AI Department at Fraunhofer Heinrich Hertz Institute. He studied computer science at Humboldt University of Berlin, Heriot-Watt University, and the University of Edinburgh, and received the Dr. rer. nat. degree with distinction from TU Berlin in 2014. He is a Fellow at BIFOLD – Berlin Institute for the Foundations of Learning and Data, the ELLIS Unit Berlin, and the DFG Research Unit DeSBi. Furthermore, he is an elected member of the IEEE MLSP Technical Committee and of Germany’s Platform for AI. He has co-authored more than 200 papers, was the leading editor of the Springer book “Explainable AI: Interpreting, Explaining and Visualizing Deep Learning” (2019), and co-editor of the open-access Springer book “xxAI – Beyond explainable AI” (2022). He has served as Program Co-Chair for IEEE MLSP’23 and as Area Chair for NAACL’21 and NeurIPS’23 and ’24, and is a recipient of multiple best paper awards, including the 2020 Pattern Recognition Best Paper Award and the 2022 Digital Signal Processing Best Paper Prize.