This book explains the limitations of current methods in interpretable machine learning. These methods include partial dependence plots (PDP), Accumulated Local Effects (ALE), permutation feature importance, leave-one-covariate-out (LOCO), and local interpretable model-agnostic explanations (LIME). All of these methods can be used to explain the behavior and predictions of trained machine learning models. However, the interpretation methods may not work well in the following cases:

  • if the model contains interactions between features (e.g. when a random forest is used)
  • if features are strongly correlated with each other
  • if the model does not correctly capture causal relationships
  • if the parameters of the interpretation method are set incorrectly
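To make one of the methods above concrete, permutation feature importance can be sketched in a few lines: shuffle one feature column and measure how much the model's error increases. The toy data and stand-in model below are hypothetical, used only to illustrate the mechanics:

```python
import random

random.seed(0)

# Toy data: two features; the target depends only on the first one.
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]

# Hypothetical stand-in for a trained model.
def predict(row):
    return 3.0 * row[0]

def mse(X, y):
    return sum((predict(r) - t) ** 2 for r, t in zip(X, y)) / len(y)

def permutation_importance(X, y, feature):
    """Importance = increase in error after shuffling one feature column."""
    baseline = mse(X, y)
    shuffled = [row[:] for row in X]
    col = [row[feature] for row in shuffled]
    random.shuffle(col)
    for row, v in zip(shuffled, col):
        row[feature] = v
    return mse(shuffled, y) - baseline

imp = [permutation_importance(X, y, j) for j in range(2)]
# Shuffling the first feature raises the error sharply; shuffling the
# second, which the model ignores, changes nothing.
```

Note that the shuffling step is exactly where correlated features cause trouble: permuting one feature independently of its correlated partner produces unrealistic data points, which is one of the limitations discussed in this book.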

This book is the outcome of the seminar “Limitations of Interpretable Machine Learning” which took place in summer 2019 at the Department of Statistics, LMU Munich.

Cover by @YvonneDoinel

Creative Commons License

This book is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.