Plotting individual decision trees can provide insight into the gradient boosting process for a given dataset.
In this tutorial you will discover how you can plot individual decision trees from a trained gradient boosting model using XGBoost in Python.
Let’s get started.
- Update Mar/2018: Added alternate link to download the dataset as the original appears to have been taken down.
How to Visualize Gradient Boosting Decision Trees With XGBoost in Python
Photo by Kaarina Dillabough, some rights reserved.
Plot a Single XGBoost Decision Tree
The XGBoost Python API provides a function for plotting decision trees within a trained XGBoost model.
This capability is provided by the plot_tree() function, which takes a trained model as the first argument, for example plot_tree(model).
This plots the first tree in the model (the tree at index 0).
This plot can be saved to file or shown on the screen using matplotlib and pyplot.show().
This plotting capability requires that you have the graphviz library installed.
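Before plotting, it can help to confirm that graphviz is actually reachable. The sketch below is one possible check, not part of the XGBoost API: the helper name graphviz_available() is hypothetical, and it assumes that plotting needs both the Python graphviz package and the Graphviz "dot" executable on your PATH.

```python
import importlib.util
import shutil


def graphviz_available() -> bool:
    """Return True if the Python graphviz package and the 'dot' executable are both found."""
    # The Python bindings (installed with: pip install graphviz)
    has_python_package = importlib.util.find_spec("graphviz") is not None
    # The Graphviz system binary that does the actual rendering
    has_dot_binary = shutil.which("dot") is not None
    return has_python_package and has_dot_binary


if __name__ == "__main__":
    if graphviz_available():
        print("graphviz looks available; plot_tree() should be able to render.")
    else:
        print("The graphviz package or the 'dot' executable was not found.")
```

If the check reports that something is missing, install both the Python package and the system Graphviz binaries before calling plot_tree().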
We can create an XGBoost model on the Pima Indians onset of diabetes dataset and plot the first tree in the model.
Download the dataset and place it in your current working directory.
The full code listing is provided near the end of this post.
Running the code creates a plot of the first decision tree in the model (index 0), showing the features and feature values for each split as well as the output leaf nodes.
XGBoost Plot of Single Decision Tree
You can see that variables are automatically named like f1 and f5 corresponding with the feature indices in the input array.
You can see the split decisions within each node and the different colors for left and right splits (blue and red).
The plot_tree() function takes several optional parameters that control which tree is drawn and how it is laid out.
You can plot a specific tree by passing its index to the num_trees argument. Indices start at 0, so you can plot the 5th boosted tree in the sequence with plot_tree(model, num_trees=4).
You can also change the layout of the graph to be left to right (easier to read) by setting the rankdir argument to 'LR' (left-to-right) rather than the default top to bottom (UT), for example plot_tree(model, num_trees=0, rankdir='LR').
The result of plotting the tree in the left-to-right layout is shown below.
XGBoost Plot of Single Decision Tree Left-To-Right
In this post you learned how to plot individual decision trees from a trained XGBoost gradient boosted model in Python.
Do you have any questions about plotting decision trees in XGBoost or about this post?
Ask your questions in the comments and I will do my best to answer.
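# plot decision tree
from numpy import loadtxt
from xgboost import XGBClassifier
from xgboost import plot_tree
import matplotlib.pyplot as plt
# load data (the filename is an assumption; use the name of your downloaded file)
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X (8 input columns) and y (class label)
X = dataset[:,0:8]
y = dataset[:,8]
# fit model on training data
model = XGBClassifier()
model.fit(X, y)
# plot single tree (index 0) and show it on screen
plot_tree(model)
plt.show()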