
Complete SHAP Guide: Master Local and Global Model Interpretability in Python with Practical Examples

I keep thinking about the trust we place in machine learning. We use these models to approve loans, assist in medical diagnoses, and inform legal decisions. But if we can’t understand why a model makes a specific call, should we use it at all? This question led me down the path of explainable AI, and I found a powerful answer in SHAP. It’s changed how I build and present models. I want to show you how it can do the same for you.

Let’s start with a simple thought. Imagine you’re a critic scoring a film. You know the final rating, but you want to know how much each actor contributed to it. SHAP does exactly that for your model’s predictions. It fairly distributes the “credit” for an outcome among all the input features.

Here’s a basic example using a straightforward model.

import pandas as pd
import numpy as np

# Sample data: a person's loan application
data = pd.DataFrame({
    'income': [65000, 42000, 80000],
    'credit_score': [720, 650, 780],
    'loan_amount': [20000, 15000, 50000]
})

# A very simple, interpretable model (for illustration)
def simple_model(row):
    return (row['income'] * 0.001) + (row['credit_score'] * 0.1) - (row['loan_amount'] * 0.0005)

print("Predictions:", data.apply(simple_model, axis=1))

In this linear case, we can see each feature’s weight. But what happens when the model is a complex, million-tree forest? That’s where SHAP truly shines.

The core idea comes from game theory. It asks: what is the average contribution of a feature, considering every possible combination of other features? This ensures a mathematically fair distribution. The result is a number, a SHAP value, for each feature. A positive value pushes the prediction higher; a negative one pulls it lower.
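To make the game-theory idea concrete, here is a brute-force Shapley computation for the toy loan model above. The background values are hypothetical stand-ins for the dataset averages of the three sample applicants; for a linear model, each feature’s Shapley value works out to exactly weight × (value − background).

```python
import itertools
import math

# Same weights as the toy loan model above
weights = {'income': 0.001, 'credit_score': 0.1, 'loan_amount': -0.0005}

# The applicant we want to explain, and background (average) values
# used to "remove" a feature from a coalition
x = {'income': 65000, 'credit_score': 720, 'loan_amount': 20000}
background = {'income': 62333.33, 'credit_score': 716.67, 'loan_amount': 28333.33}

def value(coalition):
    """Model output when only features in `coalition` take their real
    values; everything else is held at the background value."""
    return sum(weights[f] * (x[f] if f in coalition else background[f])
               for f in weights)

def shapley(feature):
    """Exact Shapley value: weighted average of the feature's marginal
    contribution over every subset of the other features."""
    others = [f for f in weights if f != feature]
    n = len(weights)
    total = 0.0
    for k in range(len(others) + 1):
        for subset in itertools.combinations(others, k):
            w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            total += w * (value(set(subset) | {feature}) - value(subset))
    return total

for f in weights:
    print(f, round(shapley(f), 4))
```

Notice that the three values sum to exactly the gap between this applicant’s prediction and the baseline prediction — that additivity is a defining property of SHAP.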

Why is this better than just checking which feature is most important? Traditional “feature importance” might tell you that ‘income’ matters most on average. But does that help you explain why a specific application was denied? Not really. SHAP gives you that specific, local story.

Let’s move to code with a real dataset. We’ll use a public dataset on heart disease.

import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the Adult census dataset bundled with shap
X, y = shap.datasets.adult()
# display=True keeps human-readable category labels for plotting
X_display, y_display = shap.datasets.adult(display=True)

# Split and train a model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Create the SHAP explainer (TreeExplainer is fast for tree ensembles)
explainer = shap.TreeExplainer(model)
# For binary classifiers, older shap versions return a list of two
# arrays (one per class); index [1] later selects the positive class
shap_values = explainer.shap_values(X_test)

print("Model trained. SHAP values calculated.")

With just those few lines, we have a powerful explanation engine ready. What do these values actually look like for one person?

The most intuitive way to see a local explanation is the force plot. It visually shows how each feature moved the model’s output from the average prediction (the baseline) to the final prediction.

# Explain the first person in the test set
shap.initjs()  # Enables interactive visualizations in notebooks
person_index = 0
# [1] selects the positive class (list-style TreeExplainer output)
shap.force_plot(explainer.expected_value[1], shap_values[1][person_index], X_test.iloc[person_index])

The plot shows a battle of forces. Features like ‘Capital Gain’ might push the prediction strongly in one direction, while ‘Age’ might pull it back. The sum of all these pushes and pulls equals the model’s final score. Can you see how this instantly builds trust? You can point to the graph and say, “Here’s why.”
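That “sum of pushes and pulls” is a guaranteed property called local accuracy: the baseline plus a person’s SHAP values always reproduces the model’s output for that person. A quick check with made-up numbers:

```python
import numpy as np

# Hypothetical baseline and per-feature SHAP values for one person
base_value = 0.24                                  # average model output
shap_row = np.array([0.31, -0.08, 0.05, -0.02])    # one person's SHAP values

# Local accuracy: baseline + sum of SHAP values = final prediction
prediction = base_value + shap_row.sum()
print(round(prediction, 2))
```

The same check works on real output: compare `explainer.expected_value` plus a row of SHAP values against the model’s prediction for that row.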

But we also need the global view. What’s driving the model’s behavior overall? The summary plot is perfect for this.

# Global feature importance for the positive class, based on SHAP magnitudes
shap.summary_plot(shap_values[1], X_test)

This plot shows every SHAP value for every feature and every person in your dataset, so you see the full distribution. For ‘Age’, do higher values always increase the prediction? The color gradient shows the feature’s actual value. You might discover that higher ‘Age’ only increases the prediction up to a certain point, which is a critical insight.
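The ranking that orders the summary plot is simply the mean absolute SHAP value per feature. A minimal sketch with a made-up SHAP matrix:

```python
import numpy as np

# Hypothetical SHAP matrix: rows = people, columns = features
feature_names = ['Age', 'Capital Gain', 'Hours per week']
shap_matrix = np.array([
    [ 0.10, 0.50, -0.05],
    [-0.20, 0.00,  0.15],
    [ 0.30, 0.40, -0.10],
])

# Global importance = mean absolute SHAP value per feature
importance = np.abs(shap_matrix).mean(axis=0)
for name, score in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```

This is the same quantity `shap.summary_plot(..., plot_type='bar')` displays, computed by hand.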

What about interactions? Sometimes, two features combine in surprising ways. SHAP can reveal this through dependence plots.

# Check how 'Age' interacts with another feature
shap.dependence_plot('Age', shap_values[1], X_test, interaction_index='Hours per week')

This plot might show that the effect of ‘Age’ on income prediction changes dramatically depending on how many hours a person works. These are the insights that turn a good model into a useful, understood tool.
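The interaction effect is easy to see in a tiny synthetic model. Suppose, hypothetically, the model’s output is a pure product of an age term and an hours term; the exact two-feature Shapley value of the age term then depends on hours, which is exactly the pattern a dependence plot surfaces:

```python
# Hypothetical model with a pure interaction between two features
def f(a, h):
    return a * h

a_bg, h_bg = 0.0, 0.0   # background (baseline) values

def shap_age(a, h):
    """Exact two-feature Shapley value of `a`: average of its marginal
    contribution over both feature orderings."""
    return 0.5 * ((f(a, h_bg) - f(a_bg, h_bg)) + (f(a, h) - f(a_bg, h)))

# Same 'age', different 'hours': the age SHAP value changes
print(shap_age(1.0, 0.0))  # 0.0
print(shap_age(1.0, 2.0))  # 1.0
```

With no interaction (an additive model), the age SHAP value would be identical in both calls.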

A common question is: isn’t this slow for big models? It can be, but there are tricks. For tree-based models, TreeExplainer is remarkably fast. For very large datasets, you can estimate SHAP values on a sample. The key is that you don’t always need to explain every single prediction; a well-chosen sample often tells the same story.
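The sampling idea can be sketched without shap at all: a Monte Carlo permutation estimate of Shapley values converges to the exact answer, which for a linear model is weight × (value − background mean). Everything here is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model, so the exact SHAP values are known in closed form
w = np.array([0.5, -1.2, 2.0, 0.3])
X_background = rng.normal(size=(500, 4))
x = np.array([1.0, 0.5, -0.3, 2.0])

def model(X):
    return X @ w

def sampled_shap(x, X_bg, n_perm=2000):
    """Monte Carlo Shapley estimate: average marginal contribution of
    each feature over random orderings and random background rows."""
    n = len(x)
    phi = np.zeros(n)
    for _ in range(n_perm):
        order = rng.permutation(n)
        z = X_bg[rng.integers(len(X_bg))].copy()   # start from a background row
        prev = model(z[None, :])[0]
        for i in order:
            z[i] = x[i]                             # reveal feature i
            cur = model(z[None, :])[0]
            phi[i] += cur - prev
            prev = cur
    return phi / n_perm

approx = sampled_shap(x, X_background)
exact = w * (x - X_background.mean(axis=0))
print(np.round(approx, 3), np.round(exact, 3))
```

This is the spirit behind shap’s sampling-based explainers: more permutations buy tighter estimates, and a modest number is often enough.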

So, when should you use this? I use SHAP in three main scenarios. First, during model development, to debug strange behavior. If a feature you think is important has near-zero SHAP values, it’s a red flag. Second, for stakeholder reporting. A manager understands a force plot much faster than a feature importance table. Third, for compliance. You often need to provide a reason for an automated decision.

I encourage you to start simple. Pick a model you’ve already built. Calculate the SHAP values. Look at the explanation for a few correct and incorrect predictions. What do you learn? You’ll likely find a subtle bias or an unexpected pattern that improves your next iteration.
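Picking those incorrect predictions programmatically is a one-line comparison. A sketch with synthetic data and a hypothetical model setup (plain scikit-learn; shap is only needed afterwards, to explain the rows you find):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for your own dataset and model
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Indices of misclassified test rows: the predictions worth explaining first
wrong = np.where(model.predict(X_test) != y_test)[0]
print(len(wrong), "misclassified rows to inspect, e.g. indices", wrong[:3])
```

Feed a handful of these indices to a force plot and compare them against a few correct predictions of the same class.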

This journey from a black box to a clear, glass box is one of the most rewarding in data science. It bridges the gap between technical performance and real-world utility. I hope you’ll try it.

Was this walk-through helpful? Do you have a specific model you’re trying to explain? Share your thoughts or questions in the comments below. If this guide clarified SHAP for you, please consider liking and sharing it with your network. Let’s build more understandable AI together.
