2019 US Open Predictions: Doubling Down on the Data

A few months ago, DataRobot simulated the Championships at Wimbledon to predict who would win. After following the fortnight of tennis, we anxiously watched the women’s and men’s finals. In the women’s final, our DataRobot model’s favorite, Serena Williams (22% chance of winning), lost decisively to our model’s fifth favorite, Simona Halep (6%). The next day, our model’s top two favorites, Novak Djokovic (39%) and Roger Federer (32%), met in an epic men’s final that saw Djokovic win his fifth Wimbledon title.

With the 2019 US Open starting, we wanted to see if we could use DataRobot to predict how this tournament will play out. Will Serena Williams bounce back? Will Simona Halep win again? Will Naomi Osaka repeat in New York? Will Novak Djokovic continue his run of dominance or will we finally see the next generation break out?

Continuing the approach we used for the Wimbledon predictions (and following the methodology of our March Madness and Stanley Cup Finals predictions), we simulated both the men’s and women’s draws for the 2019 US Open. We started with the result of every match (and set scores) for ATP and WTA tour matches from 2010 through 2018. Using this data, we built a historical dataset containing past results, current Elo scores (both overall and surface-specific) and tournament information, then used DataRobot to determine the best model and predict the probability that a player would win a set.
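As a rough illustration of the Elo features mentioned above, the sketch below keeps one overall rating and one rating per surface for each player, and updates both after every match. It uses the standard Elo formulas. The update factor `K`, the starting rating `START`, and the helper names are assumptions made for this example, not details of the dataset we built.

```python
from collections import defaultdict

K = 32          # assumed update factor; not a value taken from our dataset
START = 1500.0  # assumed rating for a player with no match history


def expected_score(rating_a, rating_b):
    """Standard Elo expectation that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=K):
    """Shift both ratings by k times the winner's 'surprise' (1 - expectation)."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# One overall table, plus one table per surface (hard, clay, grass, ...).
overall = defaultdict(lambda: START)
by_surface = defaultdict(lambda: defaultdict(lambda: START))


def record_match(winner, loser, surface):
    """Update the overall and the surface-specific ratings for one result."""
    update_elo(overall, winner, loser)
    update_elo(by_surface[surface], winner, loser)
```

Replaying match results in date order through `record_match` would give each player a pair of ratings (overall and surface-specific) as of any given match, which can then be used as model inputs alongside past results and tournament information.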

Once we had built this prediction model, we could take the draw of any tournament and simulate the results 100,000 times to find out how often each player would win with that particular draw.
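The simulation step can be sketched roughly as follows. This is a minimal illustration rather than our production code: `set_prob` is a hypothetical stand-in for the trained model (it returns the probability that the first player wins a set against the second), sets are assumed to be independent, and the draw is assumed to be a list whose length is a power of two, ordered so that adjacent players meet in the first round.

```python
import random
from collections import Counter
from math import comb


def match_win_prob(p_set, best_of):
    """Chance of winning a best-of-N match from a per-set win probability,
    assuming every set is independent."""
    need = best_of // 2 + 1  # sets required to win the match
    # Sum over k = number of sets lost before clinching the match.
    return sum(
        comb(need - 1 + k, k) * p_set**need * (1 - p_set) ** k
        for k in range(need)
    )


def simulate_draw(draw, set_prob, best_of, rng):
    """Play one single-elimination tournament and return the champion."""
    players = list(draw)
    while len(players) > 1:
        next_round = []
        for a, b in zip(players[::2], players[1::2]):
            p = match_win_prob(set_prob(a, b), best_of)
            next_round.append(a if rng.random() < p else b)
        players = next_round
    return players[0]


def title_odds(draw, set_prob, best_of=3, n_sims=100_000, seed=0):
    """Fraction of simulated tournaments won by each player in the draw."""
    rng = random.Random(seed)
    wins = Counter(
        simulate_draw(draw, set_prob, best_of, rng) for _ in range(n_sims)
    )
    return {player: wins[player] / n_sims for player in draw}
```

For a Grand Slam, `best_of` would be 5 for the men’s draw and 3 for the women’s. Calling `title_odds` on a 128-player draw returns a title probability for every player, and sorting that dictionary by value gives a ranking of favorites.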

With the draw complete, we know the 128 men and 128 women who will compete in the 2019 tournament. Based on our simulations, the ten women most likely to win the US Open are listed in the table below, with Ashleigh Barty as the favorite at a 13% chance of winning. She is followed closely by Serena Williams and Simona Halep, at 12% and 11%, respectively.

| Player | Probability of Winning the US Open |
| --- | --- |
| Ashleigh Barty | 13% |
| Serena Williams | 12% |
| Simona Halep | 11% |
| Karolina Pliskova | 8% |
| Petra Kvitová | 7% |
| Naomi Osaka | 6% |
| Victoria Azarenka | 5% |
| Elina Svitolina | 4% |
| Angelique Kerber | 3% |
| Maria Sharapova | 3% |

Similarly, the ten men most likely to win the US Open are listed in the table below, with Roger Federer as the slight favorite at a 33% chance of winning. Novak Djokovic and Rafael Nadal should be considered co-favorites, at 31% and 30%, respectively.

| Player | Probability of Winning the US Open |
| --- | --- |
| Roger Federer | 33% |
| Novak Djokovic | 31% |
| Rafael Nadal | 30% |
| Dominic Thiem | 2% |
| Kei Nishikori | 1% |
| Nick Kyrgios | 1% |
| Roberto Bautista Agut | 1% |
| Alexander Zverev | 0% |
| Kevin Anderson | 0% |
| Daniil Medvedev | 0% |

Our simulations predict a wide-open women’s US Open, with Ashleigh Barty as the slight favorite to win her second Slam over Serena Williams and Simona Halep. These three women are predicted to have similar chances of winning, with Karolina Pliskova, Petra Kvitová, Naomi Osaka, and Victoria Azarenka close behind.

On the men’s side, our simulation predicts the continued domination of the Big Three: Roger Federer is the slight favorite, but he, Novak Djokovic, and Rafael Nadal each have at least a 30% chance of winning the US Open. This leaves the rest of the players in the men’s tournament with a very small chance of taking the title.

The US Open has begun, and the world is watching. Fans of tennis are excited to watch the elite Williams, Barty, Halep, Federer, Djokovic, and Nadal square off on the hard court. Fans of betting and data science are excited to see how predictive the 100,000 simulations turn out to be; they are fed by nine seasons of ATP and WTA matches and Elo scores, and they factor in surface and more. There is a real possibility of upsets on the court and “in the cloud” alike.

 

Interested in more Sports Analytics? DataRobot works with professional teams across sports globally. Visit our Sports Analytics solutions page for more content and insights.




About the Author:

Andrew Engel is General Manager for Sports and Gaming at DataRobot. He works with DataRobot customers across sports and casinos, including several Major League Baseball, National Basketball Association, and National Hockey League teams. He has been working as a data scientist and leading teams of data scientists for over ten years in a wide variety of domains, from fraud prediction to marketing analytics. Andrew received his Ph.D. in Systems and Industrial Engineering with a focus on optimization and stochastic modeling. He worked for Towson University, SAS Institute, the US Navy, Websense (now Forcepoint), Stics, and HP before joining DataRobot in February 2016.