A few weeks ago, one of my favorite webcomics, Math with Bad Drawings, posted a comic as a follow-up to its earlier post, “Why Baseball Statistics Matter”. I really enjoyed Ben Orlin’s explanation because it combines my passions for baseball and data science. Further, it resonates with the challenges I’ve seen many companies face as they work to incorporate data science throughout their businesses, as well as the experiences I have had working with various sports teams as they begin using DataRobot. But I don’t think that only baseball statistics matter. I think that all types of businesses have a lot to learn from all sports analytics. It’s all about using data to complement intuition so that we can avoid having our heuristics and human biases lead to poor decision making. As Daniel Kahneman once said, “Every company is a decision factory.”
To begin, saying that it took 20 years (since Moneyball) to assimilate statistics into baseball is not really true. Descriptive analytics goes back to the start of baseball, when statistics on the sport were first being accumulated. In the late 19th century, Henry Chadwick started recording common baseball stats. In the early 20th century, F.C. Lane proposed an early version of linear weights for batting. And in the late 1940s and early 1950s, Branch Rickey and Allan Roth proposed many of our advanced stats. More recently, in 1977, Bill James started publishing The Bill James Baseball Abstract, seeking more predictive or stable statistics to understand the performance of baseball players and to better forecast their future performance. With the rise of predictive analytics, baseball adopted these tools relatively easily because the framework was already in place. Additionally, the discrete nature of the action in baseball has made this effort easier to perform and easier to explain.
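For readers who have not seen linear weights before, the idea is simply to assign each batting event a run value and sum those values over a player's season. The sketch below is only an illustration of that idea: the run values, the function name `linear_weights_runs`, and the sample stat line are all made up for this example and are not Lane's published numbers or any team's actual model.

```python
# Illustrative run value per batting event.
# These numbers are assumptions chosen for the sketch, not historical weights.
RUN_VALUES = {
    "single": 0.5,
    "double": 0.75,
    "triple": 1.0,
    "home_run": 1.5,
    "walk": 0.25,
}


def linear_weights_runs(counts):
    """Sum (run value x number of events) over every weighted event.

    Events missing from `counts` are treated as zero occurrences.
    """
    return sum(value * counts.get(event, 0) for event, value in RUN_VALUES.items())


# Hypothetical season stat line for one batter.
player = {"single": 100, "double": 30, "triple": 4, "home_run": 20, "walk": 60}

print(linear_weights_runs(player))
```

The appeal of this form is that it is easy to explain: every event contributes a fixed amount, so a coach or executive can see exactly where a player's value comes from.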
All Teams and Businesses Should Embrace Sports Analytics
Analytics are also being embraced in free-flowing sports like basketball and soccer. It starts with descriptive analytics, which is an understanding of what has already happened. This answers the questions of how valuable players were and what activities influenced the outcome of past events. From here, analytics rapidly moves on to predictive analytics. This often starts with projecting future player performance, whether for evaluating and drafting amateurs, signing free agents, or engaging in trades or transfers. This inevitably leads to the collection of massive amounts of data that can be used to build models for predicting the outcome of plays or possessions. From here, the next step is to use the output of these predictive models to develop prescriptive analytics for determining the optimal strategies and tactics.
The advantage we have when we consider sports analytics as a model for how to adopt analytics into any business is that sports are all about competition. All teams are looking for an advantage and will exploit that advantage once they find it. We’ve seen this in all sports throughout the past few decades. It started in baseball, where a few teams embraced analytics and succeeded despite having significantly lower payrolls than other teams. This success forced all of the other teams to also embrace analytics in order to keep up and remain competitive. It is happening in basketball, soccer, football, and other sports. Some teams lead the way in embracing analytics, changing the way they acquire talent, play the game, and find success. Because teams can clearly see the results and determine if what they are doing is working, they can rapidly optimize their process.
What can other businesses learn from this? Why does this work for sports teams? It works because sports teams start small. They get instant feedback and find the quick wins. In baseball, analysts start by providing hitting heat maps of the strike zone, spray charts of batted balls, defensive positioning, baserunning strategies, and pitcher game plans. This provides immediate value to the organization and also wins some respect from potential critics.
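To give a sense of how small such a quick win can be, here is a minimal sketch of the data behind a strike-zone heat map: bin each pitch location into a 3x3 grid and compute the hit rate in each cell. The zone bounds, the function names, and the sample pitches are all hypothetical values invented for this example, and a real analysis would likely use far more data and a plotting library.

```python
from collections import defaultdict

# Hypothetical strike-zone bounds in feet (assumptions for this sketch).
X_MIN, X_MAX = -0.75, 0.75   # horizontal location
Z_MIN, Z_MAX = 1.5, 3.5      # height above the ground
GRID = 3                     # 3x3 grid of cells


def zone_cell(x, z):
    """Return (row, col) of the grid cell containing the pitch, or None if outside the zone."""
    if not (X_MIN <= x < X_MAX and Z_MIN <= z < Z_MAX):
        return None
    col = int((x - X_MIN) / (X_MAX - X_MIN) * GRID)
    row = int((z - Z_MIN) / (Z_MAX - Z_MIN) * GRID)
    return row, col


def hit_rate_by_cell(pitches):
    """Map each grid cell that saw at least one pitch to its fraction of hits."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for x, z, is_hit in pitches:
        cell = zone_cell(x, z)
        if cell is None:
            continue  # ignore pitches outside the zone
        totals[cell] += 1
        hits[cell] += int(is_hit)
    return {cell: hits[cell] / totals[cell] for cell in totals}


# Made-up sample: (horizontal, height, resulted_in_hit)
pitches = [
    (0.0, 2.5, True),
    (0.1, 2.4, False),
    (-0.6, 1.6, False),
    (0.6, 3.4, True),
    (1.2, 2.0, False),  # outside the zone, ignored
]

rates = hit_rate_by_cell(pitches)
for cell in sorted(rates):
    print(cell, rates[cell])
```

The resulting table of rates per cell is what gets colored in as a heat map, which is the kind of output a hitting coach can read at a glance.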
Speaking the Languages of Sports and Business
Next, results are couched in the language of the sport. Sports teams understand that the users of their analyses are not analytics professionals, but rather sports executives, scouts, coaches, and players. They can’t be convinced with numbers and graphs alone. The results need to translate insights into a language that can be used every day. To do this successfully, data scientists need to understand the sports domain and recognize that their work needs to directly impact the team itself. They are not predicting future player performance as an end in and of itself; rather, they are predicting future performance to enable the team to better allocate its dollars and buy wins. In sports, knowing how to coach a player, such as altering their mechanics or teaching them how to throw a new type of pitch, can mean the difference in whether the entire team makes the playoffs.
Understanding the importance of the components of a player’s background and biomechanical indicators, along with analyzing the text in a scouting report, can help teams better predict which players to scout and draft. Breaking down the vast amounts of newly available sensor data, pitching mechanics, training information, and biometrics can reveal hidden complexities that help identify injury risk, keeping athletes performing in top form and off the injured list.
Outside of sports, I’ve seen too many organizations forget about this. These organizations present the results in the language of data science (not in the language that a business can understand) and in a way that is not actionable. Too often, these data science organizations spend months (and sometimes years) building analytical models that recommend solutions that are not even possible for the business.
Sports teams recognize that analytics is only one voice and one perspective in the organization. Everyone has a say and not everything can be captured in an analysis. The key is in synthesizing this information into something greater than any of the individual portions. Too many data science organizations forget that while not all expert wisdom is right, not all of it is wrong. You need to work with those experts and bring them onto your side.
Sports teams move at the speed of their sport. The draft happens on a fixed date, games happen almost every day during the season, and players are dangled for trades hourly. Analysis needs to be done on time to build this synthesis and make decisions. Teams also know that any advantage they find is likely to be fleeting, so they need to find advantages, determine how to exploit them, disseminate the information, and move on to the next opportunity as rapidly as they can.
Too many organizations forget they need to also move at the speed of business. I’ve watched organizations painstakingly build analytic solutions to the right business problem only to have that organization’s business change out from under them before they can even implement the solution.
In my years of working with businesses as they build data science capabilities, I’ve seen my share of successes and failures. The successes tended to follow the above process and the failures tended to violate one or more of those practices.
About the Author:
Andrew Engel is General Manager for Sports and Gaming at DataRobot. He works with DataRobot customers across sports and casinos, including several Major League Baseball, National Basketball Association, and National Hockey League teams. He has been working as a data scientist and leading teams of data scientists for over 10 years in a wide variety of domains, from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He worked for Towson University, SAS Institute, the US Navy, Websense (now Forcepoint), Stics, and HP before joining DataRobot in February of 2016.