It can be difficult to ascertain which variables machine learning and artificial intelligence (AI) models emphasise and how automated decisions are made. For this reason, we are developing methods for Explainable Artificial Intelligence (XAI) that give insight into the black box. This contributes to quality-assured calculations and accurate explanations.

Selecting appropriate methods for XAI

One of the central issues is the ability to provide understandable explanations of how systems rooted in machine learning and AI calculate predictions or make decisions. A myriad of methods have been developed in recent years to explain these processes, but not all are useful or correct.

NR has developed eXplego, a decision tree toolkit, to ease navigation in this environment. eXplego provides interactive guidance to developers in the process of selecting an XAI method.

The figure shows Shapley values for three different predictions. The graph is grey, and the values are shown as red, blue and green columns.
Figure caption: Shapley values for three different predictions, calculated with three separate methods. The numbers shown on the vertical axis denote how the variables contribute, positively or negatively, to predictions. The presumption of independence (the red columns) often results in different, and often misleading, explanations. The two methods that consider dependence (Ctree and VAEAC) display more consensus and are presumably more accurate. Figure: Olsen, Lars Henry Berge, Ingrid Kristine Glad, Martin Jullum, and Kjersti Aas. “Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features.” JMLR 23, no. 213 (2022): 1-51.

Counterfactual explanations and Shapley values

We have specifically worked with two classes of explanation:

  1. Counterfactual explanations
  2. Shapley values

Counterfactual explanations assess what would have to change in the input to achieve a different outcome, for instance whether the decision would change if your income were higher or lower.
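
As a minimal sketch of the idea, the Python example below searches for the smallest income change that flips the decision of a toy loan model. The model, its coefficients and the names loan_model and income_counterfactual are hypothetical and chosen only for illustration. They do not describe any specific method developed at NR.

```python
def loan_model(income_k: int, age: int) -> bool:
    """Hypothetical toy classifier: approve when a linear score is non-negative.

    income_k is income in thousands; the coefficients are made up.
    """
    score = 2 * income_k + age - 150
    return score >= 0


def income_counterfactual(model, income_k, age, step=1, max_change=200):
    """Return the smallest signed income change that flips the model's decision.

    Changes are tried in order of increasing size, so the first hit is the
    closest counterfactual on this one-dimensional grid. Returns None if no
    change within max_change flips the decision.
    """
    original = model(income_k, age)
    for k in range(1, max_change // step + 1):
        delta = k * step
        for signed in (delta, -delta):
            candidate = income_k + signed
            if candidate >= 0 and model(candidate, age) != original:
                return signed
    return None


# A rejected applicant: how much more income would lead to approval?
needed_increase = income_counterfactual(loan_model, income_k=40, age=30)

# An approved applicant: how large a drop in income would lead to rejection?
tolerated_drop = income_counterfactual(loan_model, income_k=80, age=30)
```

Realistic counterfactual methods vary several variables at once and must ensure that the suggested inputs are plausible. The sketch varies only income to keep the principle visible.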

Shapley values derive from game theory and strive to distribute a prediction fairly among the variables fed into the model, so that each variable is credited with its contribution.
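
The following Python sketch computes exact Shapley values from the standard game-theoretic definition: each variable's contribution is averaged over all subsets of the other variables. The toy model, the instance and the all-zero baseline are hypothetical. Replacing absent variables with a fixed baseline is a simplifying assumption made here for brevity, not necessarily the approach used in the work described above.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a coalition value function `value(frozenset)`."""
    n = len(players)
    result = {}
    for i in players:
        others = [p for p in players if p != i]
        phi = Fraction(0)
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of i when joining coalition s.
                phi += weight * (value(s | {i}) - value(s))
        result[i] = phi
    return result


# Hypothetical instance to explain and a hypothetical baseline.
INSTANCE = {"income": 2, "age": 4}
BASELINE = {"income": 0, "age": 0}


def toy_model(income, age):
    """Made-up model with an interaction between income and age."""
    return 3 * income + 2 * age + income * age


def value(coalition):
    """Prediction when variables in the coalition take the instance's values
    and all other variables are set to the baseline (an assumption)."""
    x = {name: (INSTANCE[name] if name in coalition else BASELINE[name])
         for name in INSTANCE}
    return toy_model(**x)


phi = shapley_values(list(INSTANCE), value)
# phi["income"] == 10 and phi["age"] == 12; together they account for the
# difference between the prediction (22) and the baseline prediction (0).
```

The cost of the exact calculation grows exponentially with the number of variables, which is why practical methods rely on approximations.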

Regardless of methodology, our biggest concern lies in the accuracy of explanations.

Conditional variables provide more accurate explanations

A significant obstacle is that the variables in a machine learning model are usually not independent, yet many well-known explanation methods conveniently presume independence. Your income will, for example, often correlate with your age. By accounting for such dependence in a realistic way, we can provide more accurate explanations of the behaviour of machine learning models. In this context, our statistical competence is a significant advantage.
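
To illustrate how much the independence assumption can matter, the Python sketch below computes Shapley values for the same prediction in two ways: once by averaging over the full data as if the variables were independent, and once by conditioning on the known variables. The ten-row dataset, the coding of age and income as two groups each, and the toy model are all hypothetical and constructed only so that age and income are correlated.

```python
from fractions import Fraction

# Hypothetical data as (age_group, income_group), each coded 0 or 1.
# Older people mostly have high income, so the variables are dependent.
DATA = [(0, 0)] * 4 + [(0, 1)] + [(1, 0)] + [(1, 1)] * 4


def toy_model(age, income):
    """Made-up model that relies mainly on income."""
    return 10 * income + 2 * age


def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))


def marginal_value(known, x):
    """Presumes independence: unknown variables are drawn from all rows,
    regardless of the values of the known variables."""
    return mean(
        toy_model(x[0] if 0 in known else row[0],
                  x[1] if 1 in known else row[1])
        for row in DATA
    )


def conditional_value(known, x):
    """Considers dependence: average only over rows that match the known
    variables, i.e. an empirical conditional expectation."""
    matching = [row for row in DATA if all(row[j] == x[j] for j in known)]
    return mean(toy_model(*row) for row in matching)


def two_feature_shapley(value, x):
    """Shapley values for exactly two variables (index 0: age, 1: income)."""
    v0, v_age, v_inc, v_all = (value(s, x) for s in [(), (0,), (1,), (0, 1)])
    phi_age = ((v_age - v0) + (v_all - v_inc)) / 2
    phi_income = ((v_inc - v0) + (v_all - v_age)) / 2
    return phi_age, phi_income


x = (1, 1)  # explain the prediction for an older person with high income
independent = two_feature_shapley(marginal_value, x)   # (1, 5)
dependent = two_feature_shapley(conditional_value, x)  # (11/5, 19/5)
```

Both variants distribute the same total of 6, the prediction of 12 minus the average prediction of 6, but they divide it differently between age and income. In real applications the conditional distributions must be estimated, for example with the Ctree and VAEAC approaches named in the figure caption, rather than read directly from matching rows.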

Current projects