In Super Bowl LIV this Sunday, the Kansas City Chiefs will face the San Francisco 49ers in Miami, Florida. This will be the first Super Bowl appearance for the Chiefs in 50 years. After a loss in 2013, the 49ers are looking to secure their sixth Super Bowl ring, which would tie them with the New England Patriots and Pittsburgh Steelers for the league record. Beyond the history, this game also represents a classic clash of styles. Throughout this past season, the Chiefs have favored an aggressive offensive strategy, whereas the 49ers have relied on a staunch defense. The old adage, true or not, goes “defense wins championships.” So, what will we see play out on February 2nd?
Here at DataRobot, Andrew Engel (GM of Sports & Gaming) ran the models to see which team is more likely to win Super Bowl LIV. The results? We predict the Kansas City Chiefs to win Super Bowl LIV with a probability of 63.5% and by a margin of 5.5 points.
Andrew’s approach builds on the predictions made by FiveThirtyEight from Elo ratings of the teams and their quarterbacks. Starting with FiveThirtyEight’s data, Andrew sought to also take into account the respective impacts of the offense and defense of each team. He created features representing how each team’s offense or defense stacks up against the other and against the league average.
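To make the idea of these comparison features concrete, here is a minimal sketch of how such ratios could be computed. It is an assumption, not Andrew's actual feature code: only the name “team_2_def_league” appears in this post, while the other feature names, the `TeamSeason` schema, and the numbers are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class TeamSeason:
    """Per-game scoring averages for one team (hypothetical schema)."""
    points_scored: float   # average points scored per game
    points_allowed: float  # average points allowed per game


def matchup_features(team_1: TeamSeason, team_2: TeamSeason,
                     league_avg_points: float) -> dict:
    """Build ratio features comparing each unit to the league and to its opponent.

    Every point scored by one team is allowed by another, so a single league
    average serves for both the offensive and the defensive ratios.
    """
    return {
        # Above 1.0: the offense scored more than the league average.
        "team_1_off_league": team_1.points_scored / league_avg_points,
        "team_2_off_league": team_2.points_scored / league_avg_points,
        # Below 1.0: the defense allowed fewer points than the league average.
        "team_1_def_league": team_1.points_allowed / league_avg_points,
        "team_2_def_league": team_2.points_allowed / league_avg_points,
        # Head-to-head: each offense relative to the opposing defense.
        "team_1_off_vs_team_2_def": team_1.points_scored / team_2.points_allowed,
        "team_2_off_vs_team_1_def": team_2.points_scored / team_1.points_allowed,
    }


# Illustrative numbers only, not actual season statistics.
features = matchup_features(
    TeamSeason(points_scored=28.0, points_allowed=20.0),
    TeamSeason(points_scored=30.0, points_allowed=19.0),
    league_avg_points=22.0,
)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Ratios like these put offense and defense on a common scale, which is one plausible way to let a model weigh a strong offense against a strong defense.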
Andrew brought this data into the DataRobot platform, training the model on games going back to 1950 in order to predict the margin of victory. From the historical predictions of this model, Andrew then calculated its error and trained another DataRobot model to predict the error for a given match-up’s margin of victory.
Using the expected margin of victory and error, Andrew simulated 10,000 games between this season’s Kansas City Chiefs and San Francisco 49ers, including features such as their team and quarterback Elo ratings, and offensive and defensive performance. In these 10,000 simulated Super Bowls, the Kansas City Chiefs won approximately 6,350 games. That is to say, if the Super Bowl were to be played 100 times, we would predict the Chiefs to win about 64 of the games.
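The simulation step can be sketched in a few lines. This is a simplified illustration under a loud assumption: that simulated margins are drawn from a normal distribution centered on the expected margin. The post does not state the distribution or the predicted error Andrew used, so the standard deviation here is backed out from the two published numbers (a 5.5-point margin and a 63.5% win probability), which under the normal assumption implies a spread of roughly 16 points.

```python
import random
from statistics import NormalDist

EXPECTED_MARGIN = 5.5    # predicted Chiefs margin of victory (from this post)
WIN_PROBABILITY = 0.635  # reported Chiefs win probability (from this post)

# Assumption: margins are normally distributed around the expected margin.
# Then P(margin > 0) = Phi(mean / sd), so the published numbers imply
# sd = mean / Phi^-1(win probability).
implied_sd = EXPECTED_MARGIN / NormalDist().inv_cdf(WIN_PROBABILITY)


def simulate_wins(mean: float, sd: float, n_games: int = 10_000, seed: int = 0) -> int:
    """Count simulated games in which team_1's margin is positive."""
    rng = random.Random(seed)
    return sum(rng.gauss(mean, sd) > 0 for _ in range(n_games))


N_GAMES = 10_000
wins = simulate_wins(EXPECTED_MARGIN, implied_sd, n_games=N_GAMES)
win_rate = wins / N_GAMES
print(f"implied sd: {implied_sd:.1f} points")
print(f"simulated Chiefs wins: {wins} of {N_GAMES} ({win_rate:.1%})")
```

The exact win count changes with the random seed, which is why the result above is reported as approximately 6,350 rather than as an exact figure.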
Below, we’ll explore how the model is making these predictions.
Yellow rows indicate features that may be redundant with others because they share much of the same information.
The chart above ranks features by their predictive power for this particular model, using a method called permutation importance. The top four features are the team and quarterback Elo ratings, which clearly capture much of the signal. The offensive and defensive statistics also contribute, though, and they represent an area that could be explored further to improve the model. Among them, the most important feature for this model is the offensive strength of team_1.
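In general terms, permutation importance measures how much a model's error grows when the values of one feature are shuffled, which breaks that feature's relationship to the outcome. The sketch below is a minimal pure-Python illustration on toy data, not DataRobot's implementation; the function names, the toy model, and the data are all hypothetical.

```python
import random


def mean_abs_error(predict, rows, targets):
    """Average absolute difference between predictions and targets."""
    return sum(abs(predict(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Average increase in error when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_abs_error(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j across rows, leaving all other columns intact.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_abs_error(predict, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy data: the target depends only on feature 0; feature 1 is pure noise.
data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
targets = [10 * r[0] for r in rows]


def toy_model(row):
    """Hypothetical stand-in for a trained model; it ignores feature 1."""
    return 10 * row[0]


importances = permutation_importance(toy_model, rows, targets)
print(importances)
```

Shuffling the feature the toy model relies on raises its error sharply, while shuffling the ignored feature changes nothing. A ranking like the one in the chart comes from sorting features by this increase in error.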
We can explore just how offensive and defensive performance informs the outcome via partial dependence plots, like the ones below:
The margin of victory, termed “score_diff” in the model, is calculated as the score of “team_1” minus the score of “team_2” (the two labels are assigned arbitrarily for a given game). So in games that team_1 won, the margin of victory is positive, while in games that team_2 won, it is negative. For this particular prediction, we assigned the Chiefs to team_1.
In the plot above, the margin of victory for team_1 increases with stronger offensive performance in the prior season's games. The reverse is true in the plot below for the defense of team_1, as represented by points allowed.
Finally, in this partial dependence plot for team_2’s defensive performance relative to the league, keep in mind that a lower “margin of victory” means a better result for team_2, because the margin is measured from team_1’s perspective. The plot therefore shows that team_2 is likelier to win when the ratio “team_2_def_league” is lower.
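For readers unfamiliar with partial dependence, the calculation behind plots like these can be sketched as follows: force one feature to a fixed value in every row, average the model's predictions, and repeat across a grid of values. This is a minimal illustration, not DataRobot's implementation; the toy margin model, its coefficients, and the data are hypothetical.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row, keeping the rest as observed.
        modified = [r[:feature_index] + [value] + r[feature_index + 1:] for r in rows]
        curve.append(sum(predict(r) for r in modified) / len(modified))
    return curve


def toy_margin_model(row):
    """Hypothetical stand-in for the margin model (made-up coefficients).

    row[0]: team_1 points scored relative to the league average
    row[1]: team_1 points allowed relative to the league average
    """
    off_ratio, def_ratio = row
    return 20 * (off_ratio - 1.0) - 15 * (def_ratio - 1.0)


# Illustrative rows only, not real game data.
rows = [[0.9, 1.1], [1.0, 1.0], [1.2, 0.8], [1.1, 0.95]]
grid = [0.8, 1.0, 1.2]

off_curve = partial_dependence(toy_margin_model, rows, 0, grid)
def_curve = partial_dependence(toy_margin_model, rows, 1, grid)
print("offense:", off_curve)
print("defense:", def_curve)
```

In this toy setup the offense curve rises and the points-allowed curve falls, mirroring the direction of the relationships described above, though the real plots come from the trained model rather than from fixed coefficients.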
The picture formed from these partial dependence plots and feature importance definitely complicates the notion that “defense wins championships”. If anything, this analysis seems to slightly favor the opposite, which would be good news for Kansas City. Further analysis would provide deeper insights into what matters more, offense or defense.
A probability of 63.5% is not an assured victory, but in the world of sports championship predictions, that makes the Kansas City Chiefs, and their fifth-highest-scoring offense in the league this year, pretty strong favorites.
About the Authors:
Andrew Engel is General Manager for Sports and Gaming at DataRobot. He works with DataRobot customers across sports and casinos, including several Major League Baseball, National Basketball Association and National Hockey League teams. He has been working as a data scientist and leading teams of data scientists for over ten years in a wide variety of domains, from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He worked for Towson University, SAS Institute, the US Navy, Websense (now Forcepoint), Stics, and HP before joining DataRobot in February of 2016.
Sarah Khatry is a data scientist at DataRobot. Prior to joining DataRobot, Sarah worked in long-form journalism, experimental physics and the entertainment industry. She holds a B.A. in English and Physics from Dartmouth College.