Webinar
Datasets through the L👀king-Glass
Datasets through the L👀king-Glass is a webinar series focusing on the data aspects of learning-based methods. Our aim is to build a community of scientists interested in understanding how the data we use affects the algorithms and society as a whole, instead of only optimizing for a performance metric. We draw inspiration from a variety of topics, such as data curation to build datasets, meta-data, shortcuts, fairness, ethics and philosophy in AI.
All previous talks that the speakers agreed to share can be found in our YouTube playlist.
Next webinar: Evaluation metrics
Date: 02 March 2026 at 3:00pm CET
Where: Zoom: Register here
Speakers:
- Evangelia Christodoulou - German Cancer Research Center (DKFZ), Germany
Title: From Metrics to Meaning: Making Evaluation Matter in Medical Imaging AI
Abstract: Recent work from the Metrics Reloaded initiative has exposed fundamental weaknesses in how performance is evaluated and reported in medical imaging AI. An analysis of MICCAI 2023 publications shows that many claimed improvements are not statistically supported and often fall within expected uncertainty given the available dataset sizes, raising the risk of false progress claims. These results highlight the need for larger benchmarks, explicit reporting of performance uncertainty, and more stringent validation practices to ensure that reported advances are robust, interpretable, and clinically meaningful.
Short bio: Evangelia Christodoulou holds a PhD in Clinical Prediction Modelling from KU Leuven, with a background in Mathematics and Biostatistics. She is a postdoctoral researcher at the German Cancer Research Center (DKFZ) in Heidelberg, where her work focuses on the validation of AI methods in biomedical imaging, with particular emphasis on performance uncertainty and dataset size effects.
- Abhishek Singh Sambyal - University of Oulu, Finland
Title: Beyond Accuracy: Understanding Calibration in Medical Image Classification
Abstract: Deep neural networks have achieved impressive performance in medical image classification, yet their confidence estimates are often poorly calibrated, limiting their reliability in clinical practice. In high-risk medical settings, accurate predictions must be accompanied by trustworthy uncertainty estimates. This work investigates how different training strategies, including fully supervised learning and rotation-based self-supervised pretraining with and without transfer learning, influence the calibration behavior of deep neural networks. A comprehensive empirical analysis across multiple medical imaging datasets reveals that self-supervised pretraining can significantly improve confidence reliability while maintaining competitive predictive performance. The findings provide practical insights into the relationship between representation learning, training dynamics, and calibration, highlighting pathways toward building more trustworthy medical AI systems.
Short bio: Abhishek Singh Sambyal is a Postdoctoral Researcher in the Intelligent Medical Systems (IMEDS) Group at the University of Oulu. He completed his PhD in Computer Science and Engineering at IIT Ropar, India. His research focuses on uncertainty quantification, confidence calibration, and improving the reliability of deep neural networks for medical imaging applications.
- Raghavendra Selvan - University of Copenhagen, Denmark
Title: Carbon footprint of Medical Image Analysis and Mitigation Strategies
Abstract: The increasing energy consumption and carbon footprint of deep learning (DL) due to growing compute requirements have become a cause for concern. With our work we hope to inform on the increasing energy costs incurred in medical image analysis. We discuss simple strategies to cut down the environmental impact that can make model selection and training processes more efficient. We also probe into the trade-off between resource consumption and performance, specifically when dealing with models that are used in critical settings such as clinics.
Short bio: Raghavendra Selvan (Raghav) is currently an Assistant Professor (Tenure-track) at the Machine Learning Section, Department of Computer Science, University of Copenhagen. His research spans sustainable machine learning, machine learning for sciences, medical image analysis, and graph neural networks. He holds a PhD from the University of Copenhagen and is affiliated with the Pioneer Center for AI (Denmark) and the pan-European AI network ELLIS. Raghav was born in Bangalore, India. He is the author of the new book “Sustainable AI”.
Previous talks:
All previous abstracts can be found here.
- S01E01 - Dr. Roxana Daneshjou (Stanford University School of Medicine, Stanford, CA, USA). 27th Feb 2023. Challenges with equipoise and fairness in AI/ML datasets in dermatology
- S01E02 - Dr. David Wen (Oxford University Clinical Academic Graduate School, University of Oxford, Oxford, UK). 27th Feb 2023. Characteristics of open access skin cancer image datasets: implications for equitable digital health
- S01E03 - Prof. Colin Fleming (Ninewells Hospital, Dundee, UK). 27th Feb 2023. Characteristics of skin lesions datasets
- S02E01 - Prof. Amber Simpson (Queen’s University, Canada). 5th June 2023. The medical segmentation decathlon
- S02E02 - Dr. Esther E. Bron (Erasmus MC - University Medical Center Rotterdam, the Netherlands). 5th June 2023. Image analysis and machine learning competitions in dementia
- S02E03 - Dr. Ujjwal Baid (University of Pennsylvania, USA). 5th June 2023. Brain tumor segmentation challenge 2023
- S03E01 - Dr. Thijs Kooi (Lunit, South Korea). 18th September 2023. Optimizing annotation cost for AI based medical image analysis
- S03E02 - Dr. Andre Pacheco (Federal University of Espírito Santo, Brazil). 18th September 2023. PAD-UFES-20: the challenges and opportunities in creating a skin lesion dataset
- S04E01 - Dr. Jessica Schrouff (Google DeepMind, UK). 4th December 2023. Detecting shortcut learning for fair medical AI
- S04E02 - Rhys Compton and Lily Zhang (New York University, USA). 4th December 2023. When more is less: Incorporating additional datasets can hurt performance by introducing spurious correlations
- S04E03 - Dr. Enzo Ferrante (CONICET, Argentina). 4th December 2023. Building and auditing a large-scale x-ray segmentation dataset with automatic annotations: Navigating fairness without ground-truth
- S05E01 - Hubert Dariusz Zając and Natalia-Rozalia Avlona (University of Copenhagen, Denmark). 25th March 2024. Ground Truth Or Dare: Factors Affecting The Creation Of Medical Datasets For Training AI
- S05E02 - Dr. Annika Reinke (DKFZ, Germany). 25th March 2024. Why your Dataset Matters: Choosing the Right Metrics for Biomedical Image Analysis
- S05E03 - Alceu Bissoto and Dr. Sandra Avila (UNICAMP, Brazil). 25th March 2024. The Performance of Transferability Metrics does not Translate to Medical Tasks
- S06E01 - Hava Chaptoukaev and Maria Zuluaga (EURECOM, France). 24th February 2025. Acquiring, curating and releasing a multi-modal dataset for stress detection: ambitions, achievements, mistakes and lessons learned
- S06E02 - Alice Jin (Massachusetts Institute of Technology, USA). 24th February 2025. Fair Multimodal Checklists for Interpretable Clinical Time Series Prediction
- S06E03 - Malih Alikhani and Resmi Ramachandranpillai (Northeastern University, USA). 24th February 2025. Towards Equity: Overcoming Fairness Challenges in Multimodal Learning
- S07E01 - Amelia Jiménez-Sánchez (IT University of Copenhagen, Denmark). 12th May 2025. In the Picture: Medical Imaging Datasets, Artifacts, and their Living Review
- S07E02 - Tiarna Lee (King’s College London, UK). 12th May 2025. Racial bias in cardiac imaging
- S07E03 - Dewinda J. Rumala (UNIVERSA AI, Switzerland). 12th May 2025. Seeing the Same Brain Twice: Data Leakage and Identity Bias in Brain MRI Analysis
- S08E01 - Mamunur Rahaman (University of New South Wales, Australia). 20th October 2025. Advancing Computational Pathology: Multimodal Datasets and Deep Learning Insights
- S08E02 - David Restrepo (CentraleSupélec, Université Paris-Saclay, France). 20th October 2025. Opening Eyes: Advancing Equitable AI Through Open Ophthalmology Data
- S08E03 - Yuki Arase (School of Computing, Institute of Science Tokyo, Japan). 20th October 2025. Japanese Medical Text Simplification Using Patient Blogs
Organizers
Amelia Jiménez-Sánchez at the Universitat de Barcelona (Spain), Théo Sourget & Veronika Cheplygina at the IT University of Copenhagen (Denmark), and Steff Groefsema at the University of Groningen (the Netherlands). This project has received funding from the Independent Research Fund Denmark - Inge Lehmann number 1134-00017B.
Newsletter
If you want to receive information about upcoming seminars, please sign up to our mailing list. We chose the GDPR-compliant Brevo (formerly Sendinblue) as our mail provider. If you have any concerns relating to our data handling, please read our privacy notice.
Please be aware that messages from many mail providers are tagged as junk, so the confirmation email might end up in your spam folder. Please check there if it does not arrive in your inbox. The sender will be PURRlab @ IT University of Copenhagen (amji @ itu.dk). Please add this sender to your contacts. If you have any problems subscribing to our mailing list, please contact Amelia.