The Case for Interpretable Machine Learning

There’s never been a better time to embrace interpretable machine learning. Analytica’s experts apply interpretable methods across a wide variety of programming languages to help our clients answer questions about the machine learning models we produce. Transparency and accountability have always been central concerns in machine learning. With questions about oversight now coming from new sources, including mainstream media and political leaders, practitioners must be able to justify their predictive models under intense scrutiny of the results, outcomes, and implications of their work. Pointed questions about machine learning products make for a more challenging environment, but the heart of the matter lies in the trade-offs and choices that exist for every machine learning project. This increased attention demands a bigger toolbox to address our customers’ concerns.

Briefly consider the most common trade-offs in a machine learning project. Some are more practical than technical. For example, the choice of algorithm can carry cost or time implications: cloud computing bills can skyrocket under less-than-ideal configurations, or additional development time may not be worth the gain in accuracy it delivers. These trade-offs are chiefly about hitting deadlines and controlling development costs.

Another type of practical trade-off is ethical in nature. The type of data used and the way it is gathered are often more important than the underlying prediction. Current laws forbid adverse actions by our institutions against protected classes, and for all the excitement around generative AI, each new release brings new lawsuits. A recent example is artists and writers fighting to protect their images, likenesses, and livelihoods, both in the courtroom and through organized collective bargaining. Ignoring terms of service during the data-gathering step is likely to hurt companies in the long run, no matter how much short-term publicity cutting corners generates.

A third type of trade-off, well known to practitioners, is the bias/variance trade-off. It applies to the choice of algorithm for a given problem. Stated plainly: simple models tend to generalize more reliably to data an algorithm hasn’t seen yet, while more sophisticated algorithms better capture nuances in the training data at the expense of worse generalization to new data. Practitioners often struggle with this trade-off, because complex models are necessary to accurately represent our complicated world.
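To make the trade-off concrete, here is a minimal sketch using NumPy and invented noisy data (the signal, sample sizes, and polynomial degrees are all illustrative assumptions, not from any real project): a high-degree polynomial matches the training sample almost perfectly but gives up accuracy on fresh data, while a straight line misses detail yet degrades less when moving from training to test data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented ground truth: a nonlinear signal observed with noise.
def truth(x):
    return np.sin(2 * x)

x_train = rng.uniform(0, 3, size=20)
y_train = truth(x_train) + rng.normal(0, 0.3, size=20)
x_test = rng.uniform(0, 3, size=200)
y_test = truth(x_test) + rng.normal(0, 0.3, size=200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

simple = np.polyfit(x_train, y_train, deg=1)     # high bias, low variance
flexible = np.polyfit(x_train, y_train, deg=12)  # low bias, high variance

print("train MSE:", mse(simple, x_train, y_train), mse(flexible, x_train, y_train))
print("test  MSE:", mse(simple, x_test, y_test), mse(flexible, x_test, y_test))
```

The degree-12 fit always achieves lower training error than the line (its basis contains the line), but its test error sits well above its own training error: the gap is the variance the practitioner pays for flexibility.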

What should practitioners do when tough questions arise? How are model features related to the prediction on average? Do certain features contribute to a prediction independently, or do they jointly describe complexity in an unexpected way? If an unexpected explanation rises to the top, does it indicate the model is wrong, or is the algorithm pointing to a truth that requires a closer look? Why does a specific data point receive a certain prediction? A review of interpretable machine learning techniques can help answer these questions.

As statistician and machine learning expert Christoph Molnar wrote, “A model is better interpretable than another model if its decisions are easier for a human to comprehend than decisions from the other model.” This generally means a practitioner must choose between two options: an algorithm that is inherently interpretable, or a less interpretable algorithm for which interpretability methods are available and straightforward to apply. For many of today’s most challenging problems, a simple interpretable model will not suffice. What, then, is a responsible practitioner to do?

The answer is generally to focus on model-agnostic methods, that is, methods that work with any type of algorithm. Furthermore, a practitioner should be proficient with two types of interpretive methods: global and local. Global interpretive methods answer large-scale questions about a model, such as “What are the model’s most important features?” or “How does a feature affect the prediction on average?” These questions address the entire sample of data used to fit an algorithm and speak to broad questions about the model itself. Consider the example of a FICO score. The FICO score is itself a machine learning product, and we have a general idea of how factors, such as the number of delinquent accounts, affect the overall score on average. The machine learning practitioner should be familiar with topics such as functional decomposition and partial dependence, as well as data visualizations such as accumulated local effects plots and partial dependence plots, to tell a compelling story.
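The idea behind partial dependence can be sketched in a few lines, independent of any library (the quadratic “model” and the sample below are invented for illustration): for each grid value of the feature of interest, overwrite that feature everywhere in the sample and average the model’s predictions. The resulting curve shows how the prediction moves with that feature on average.

```python
import numpy as np

rng = np.random.default_rng(1)

# A stand-in "fitted model": any callable from a feature matrix to predictions.
def model(X):
    return 2.0 * X[:, 0] + X[:, 1] ** 2  # linear in feature 0, nonlinear in feature 1

X = rng.normal(size=(500, 2))  # sample the model was (hypothetically) fit on

def partial_dependence(model, X, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    averages = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v  # overwrite the feature for every row
        averages.append(model(X_mod).mean())
    return np.array(averages)

grid = np.linspace(-2, 2, 5)
pd0 = partial_dependence(model, X, feature=0, grid=grid)
# Feature 0 enters the model linearly, so its partial dependence is a
# straight line with slope 2, shifted by the average effect of feature 1.
print(np.round(np.diff(pd0), 3))
```

This works for any prediction function, which is exactly what “model-agnostic” means; production tools such as scikit-learn’s inspection module implement the same averaging with more care for categorical features and plotting.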

Conversely, local interpretive methods focus on hyper-specific concerns. Consider the example of a FICO score once again: every person with a credit history has the right to ask which factors contribute to their specific score.

These hyper-specific questions require special techniques. The responsible practitioner should be comfortable with Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) to understand the specifics of a given prediction.
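The idea underlying SHAP can be illustrated with an exact, brute-force Shapley computation on a tiny invented model (real SHAP libraries use much faster approximations; the model, instance, and background sample here are all assumptions for the sketch). Features absent from a coalition are fixed to their background means, and each feature’s attribution is its average marginal contribution over all coalitions of the other features.

```python
import numpy as np
from itertools import combinations
from math import factorial

def model(X):
    # A toy "fitted model" with an interaction term, invented for illustration.
    return 2.0 * X[:, 0] + 3.0 * X[:, 1] - X[:, 0] * X[:, 1]

def shapley_values(model, x, background):
    """Exact Shapley values for one instance x; absent features are
    replaced by the background-sample means."""
    n = len(x)
    base = background.mean(axis=0)

    def value(coalition):
        z = base.copy()
        z[list(coalition)] = x[list(coalition)]  # present features keep x's values
        return model(z[None, :])[0]

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

rng = np.random.default_rng(2)
background = rng.normal(size=(100, 2))
x = np.array([1.0, -1.0])
phi = shapley_values(model, x, background)

# Efficiency property: attributions sum to the prediction for x minus
# the prediction at the baseline.
baseline_pred = model(background.mean(axis=0)[None, :])[0]
print(phi, phi.sum(), model(x[None, :])[0] - baseline_pred)
```

The efficiency property printed at the end is what makes Shapley-based explanations compelling for questions like “why did I receive this score?”: the individual feature contributions account exactly for the gap between this person’s prediction and the baseline.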

The IRS made headlines in September 2023 when the New York Times reported its intention to leverage machine learning to fight tax evasion more efficiently. The sheer volume of returns makes the shift to machine learning wise, and it dovetails nicely with the agency’s focus on modernized infrastructure. Its pivot to data analytics brings familiar commentary: appreciation for state-of-the-art methods coupled with concern for their proper use. An October 2023 Bloomberg Tax article by Chris Cioffi highlighted the types of questions the IRS can expect to receive: “The IRS must be careful to have checks and balances in place that ensure there’s an understanding of how the AI works, but also there’s an understanding of tax law. That extends from what kinds of data the computers are trained on, to being able to track down why a computer selected a certain filing for audit.” Analytica’s policy is to apply these techniques wherever and whenever possible. Thanks to the proliferation of interpretable methods across a wide variety of programming languages, that has never been easier. There’s never been a better time to embrace interpretable machine learning.


