LAUGEL Thibault
Supervision : Christophe MARSALA, Marie-Jeanne LESOT
Co-supervision : DETYNIECKI Marcin
Local Post-hoc Interpretability for Black-box Classifiers
This thesis focuses on the field of XAI (eXplainable AI), and more particularly on the local post-hoc interpretability paradigm, that is, the generation of explanations for a single prediction of a trained classifier. In particular, we study a fully agnostic context, meaning that the explanation is generated without using any knowledge about the classifier (treated as a black box) nor about the data used to train it. We identify several issues that can arise in this context and be harmful to interpretability, and for each of them we propose novel criteria and approaches to detect and characterize it. The three issues we focus on are: the risk of generating explanations that are out of distribution; the risk of generating explanations that cannot be associated with any ground-truth instance; and the risk of generating explanations that are not local enough. These risks are studied through two specific categories of interpretability approaches: counterfactual explanations and local surrogate models.
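To illustrate the counterfactual-explanation setting the abstract refers to, here is a minimal sketch of a sphere-growing search in the fully agnostic context: only a `predict` function of the black box is available, and the explanation is the closest found instance with a different prediction. The toy classifier and all function names are illustrative assumptions, not the thesis's actual implementation.

```python
import random
import math

# Hypothetical black-box classifier: in the agnostic setting, only a
# predict function is observable. A toy 2D decision rule stands in here.
def black_box_predict(x):
    return 1 if x[0] + x[1] > 1.0 else 0

def counterfactual_search(predict, x, step=0.1, n_samples=500, seed=0):
    """Naive counterfactual search: sample uniformly inside spheres of
    growing radius around x until an instance with a different
    prediction is found; return the one closest to x."""
    rng = random.Random(seed)
    y0 = predict(x)
    radius = step
    while radius < 100 * step:
        candidates = []
        for _ in range(n_samples):
            # random direction, scaled to a random length <= radius
            d = [rng.gauss(0, 1) for _ in x]
            norm = math.sqrt(sum(v * v for v in d))
            r = rng.uniform(0, radius)
            cand = [xi + r * di / norm for xi, di in zip(x, d)]
            if predict(cand) != y0:
                candidates.append(cand)
        if candidates:
            # closest differently-classified instance = counterfactual
            return min(candidates, key=lambda c: math.dist(c, x))
        radius += step
    return None

x = [0.2, 0.3]
cf = counterfactual_search(black_box_predict, x)
print(black_box_predict(x), black_box_predict(cf))  # prints "0 1"
```

The issues studied in the thesis appear directly in this sketch: nothing guarantees that the returned instance lies within the data distribution, or that it is connected to any ground-truth example.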
Defence : 07/03/2020 - 14:00 - Videoconference
Jury members :
M. Jamal Atif, Dauphine LAMSADE [rapporteur]
M. Marcin Detyniecki, AXA
Mme Fosca Giannotti, University of Pisa KDDLab / ISTI-CNR [rapporteur]
Mme Marie-Jeanne Lesot, Sorbonne Université LIP6
M. Christophe Marsala, Sorbonne Université LIP6
M. Nicolas Maudet, Sorbonne Université LIP6
M. Chris Russell, Alan Turing Institute / University of Surrey
1 PhD student (Supervision / Co-supervision)
- JEYASOTHY Adulam : interpretability of machine learning models
2018-2020 Publications
2020
- Th. Laugel : “Interprétabilité Locale Post-hoc des modèles de classification "boîtes noires"” (Local Post-hoc Interpretability for Black-box Classifiers), thesis, defended 07/03/2020, supervision : Christophe Marsala, Marie-Jeanne Lesot, co-supervision : Marcin Detyniecki, rapporteurs : Jamal Atif, Fosca Giannotti (2020)
2019
- V. Ballet, X. Renard, J. Aigrain, Th. Laugel, P. Frossard, M. Detyniecki : “Imperceptible Adversarial Attacks on Tabular Data”, NeurIPS 2019 Workshop on Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness and Privacy (Robust AI in FS 2019), Vancouver, Canada (2019)
- Th. Laugel, M.‑J. Lesot, Ch. Marsala, X. Renard, M. Detyniecki : “Unjustified Classification Regions and Counterfactual Explanations In Machine Learning”, Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2019), Lecture Notes in Computer Science, vol. 11907 (II), Würzburg, Germany, pp. 37-54 (2019)
- Th. Laugel, M.‑J. Lesot, Ch. Marsala, X. Renard, M. Detyniecki : “The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations”, Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), Macao, China, pp. 2801-2807, (International Joint Conferences on Artificial Intelligence Organization) (2019)
- Th. Laugel, M.‑J. Lesot, Ch. Marsala, M. Detyniecki : “Issues with post-hoc counterfactual explanations: a discussion”, ICML Workshop on Human in the Loop Learning (HILL 2019), Long Beach, United States (2019)
2018
- X. Renard, Th. Laugel, M.‑J. Lesot, Ch. Marsala, M. Detyniecki : “Detecting Potential Local Adversarial Examples for Human-Interpretable Defense”, Workshop on Recent Advances in Adversarial Learning (Nemesis) of the European Conference on Machine Learning and Principles of Practice of Knowledge Discovery in Databases (ECML-PKDD), Dublin, Ireland (2018)
- Th. Laugel, X. Renard, M.‑J. Lesot, Ch. Marsala, M. Detyniecki : “Defining Locality for Surrogates in Post-hoc Interpretability”, Workshop on Human Interpretability for Machine Learning (WHI) - International Conference on Machine Learning (ICML), Stockholm, Sweden (2018)
- Th. Laugel, M.‑J. Lesot, Ch. Marsala, X. Renard, M. Detyniecki : “Comparison-based Inverse Classification for Interpretability in Machine Learning”, 17th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2018), Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations, Cádiz, Spain, pp. 100-111, (Springer Verlag) (2018)