DataRobot’s Oscar Prediction 2019: All Eyes on Best Picture

This blog post is meant to be a fun and unique take on predicting Best Picture for the 91st Academy Awards.

On Sunday, February 24th, the biggest film awards night of the year will take place in Los Angeles, California, to honor the best films of 2018. The 91st Academy Awards (Oscars) celebrates creators and artists throughout the industry across 24 categories. The most highly anticipated award is for Best Picture. Which film will take the top honor this year? I looked at this year's nominees to predict which one is most likely to win in this category.

 

How I made the Best Picture prediction

For each Best Picture nominee and winner going back to the 1960s, I collected information such as the film's title, actors, plot, and genre from sources such as Wikipedia and IMDb. Other factors I gathered included the release date, runtime, country of origin, and language. While these features are useful, any sports bettor will tell you that it is also valuable to know what the public and the experts think. To capture that information, I added the overall IMDb rating (fan rating), critics' ratings, and the total number of nominations each film received.

DataRobot automatically built about 100 models, and I chose the most accurate of those: an ensemble of an ExtraTrees Classifier and an Elastic Net Classifier. The DataRobot platform took care of the feature engineering and building all the models. Now, let’s look at the rankings for Best Picture!
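To make the idea of an ensemble concrete, here is a minimal sketch of one common way to combine two models: averaging their predicted win probabilities and ranking the films by the result. This is an assumption for illustration only; the post does not say how DataRobot blends the two classifiers, and the film names, probabilities, and the blend_and_rank helper are all hypothetical placeholders.

```python
# Illustrative sketch only: average two models' win probabilities and rank films.
# All names and numbers below are hypothetical placeholders, not model output.

def blend_and_rank(model_a, model_b):
    """Average per-film probabilities from two models, then sort high to low."""
    blended = {
        film: (model_a[film] + model_b[film]) / 2
        for film in model_a
    }
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)

# Hypothetical outputs standing in for the two classifiers in the ensemble.
extra_trees_probs = {"Film A": 0.30, "Film B": 0.10, "Film C": 0.20}
elastic_net_probs = {"Film A": 0.20, "Film B": 0.30, "Film C": 0.10}

ranking = blend_and_rank(extra_trees_probs, elastic_net_probs)
for film, prob in ranking:
    print(f"{film}: {prob:.2f}")
```

Averaging lets a film that one model likes and the other doubts land in the middle of the ranking, which is part of why blends are often more stable than either model alone.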

 

And the winner is…

According to our model, the two films most likely to win Best Picture are Roma and A Star is Born. Predicting Roma as the film most likely to win Best Picture follows the conventional wisdom for nominees in this category, which we will discuss in this blog post. However, A Star is Born is a unique pick for this category. If you look at the Vegas odds, A Star is Born is considered a long shot. Most folks in Vegas give the film about an 8% chance to win.

So let’s use some interpretability tools to better understand why these two films are most likely to win Best Picture, as well as glean some additional fun insights about the Oscars.


 

Bringing life into the film

A crucial feature for Roma and A Star is Born is plot. Let’s dig deeper by examining a word cloud for the Plot of these two films. The redder the word, the more likely a film with that word in its plot is to win Best Picture. The bluer the word, the less likely that film is to win. The size of the word indicates how often it shows up across all the nominated movies' plots (large words appear more often than smaller words).

[Figure: word cloud for Plot, from the model "Auto-Tuned Word N-Gram Text Modeler using token occurrences - Plot"]

Based on the word cloud above, Roma hits on a core theme in its plot: “life,” which is large, bright red, and in the center of the word cloud. A Star is Born doesn’t use keywords such as “life,” “new,” or “young” in its plot. Having these words in a film's plot is associated with a better chance of winning Best Picture, which is the case for Roma, making it a likely top pick.
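The two things the word cloud encodes, how often a word appears and how it relates to winning, can be roughly approximated by hand. The sketch below counts, for each word, how many plots contain it (the "size") and what share of those plots belong to winners (a crude stand-in for the "color"). It is a simplification under stated assumptions: the actual text model learns coefficients rather than raw win rates, and the plots and the token_stats helper here are hypothetical toy examples, not real film data.

```python
from collections import Counter

# Hypothetical toy data: (plot summary, won Best Picture?). Not real films.
plots = [
    ("a young woman starts a new life", True),
    ("a detective solves a mystery", False),
    ("a family rebuilds its life", True),
    ("a young robot solves a crime", False),
]

def token_stats(rows):
    """For each word, count the plots containing it and the share that won."""
    seen, wins = Counter(), Counter()
    for text, won in rows:
        # set() records whether a token occurs in a plot, not how many times.
        for token in set(text.lower().split()):
            seen[token] += 1
            wins[token] += int(won)
    return {token: (seen[token], wins[token] / seen[token]) for token in seen}

stats = token_stats(plots)
print(stats["life"])   # (number of plots containing "life", share that won)
print(stats["young"])
```

In this toy data, "life" appears only in winning plots while "young" is split evenly, so "life" would be drawn redder even though both words are the same size.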

 

It’s all about romance

When we applied a word cloud to genre, we found that romance movies do well with the Academy. Don’t bet on the mystery, sci-fi, or comedy genres for Best Picture, since they appear on the word cloud as small blue words.

[Figure: word cloud for Genre]

Both Roma and A Star is Born fit into the romance genre. Roma is categorized as a drama, but romance appears throughout the film through family relationships and the love lives of the two maids, Cleo and Adela. A Star is Born is described as a musical romantic drama. Romance is the film’s driving force between the two main characters, Jack and Ally.

 

Prediction Explanations

When the DataRobot platform provides a prediction, it also provides an explanation. The explanation is a list of the features or variables that were important for the prediction.

 

The Roma explanations are:

[Figure: prediction explanations for Roma]

First, the critics’ Metascore rating (96) is the biggest factor increasing the prediction (indicated by the +++) for Roma to win Best Picture. This makes sense because critically acclaimed movies tend to do well at the Academy Awards. The second factor increasing the prediction is Plot, on which Roma scored well in the word cloud above. Finally, the prediction was actually reduced for this film because of the number of imdbVotes, which suggests that this movie was not popular among general movie audiences. But since critically acclaimed movies do well at the Oscars, this lower score is unlikely to hurt Roma’s chances of winning in this category.

 

Timing is Everything

Another interesting factor to investigate is the release date of a film. The graph below shows the likelihood of a film winning the Best Picture award by day of the month. You can see that films released in the first week of the month have a slightly higher chance. When we looked at the month of release, we also found that films released in months like January and August were less likely to win. This is no surprise to those working in the film industry. They call these the Dump Months, the time of year when the less valuable movies are released. The fun part of DataRobot is that you can discover these trends by looking at the data and building models.

[Figure: likelihood of winning Best Picture by release day of the month]
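For readers who want to explore release timing on their own data, the sketch below shows one simple way to derive month and first-week features from a release date and compare win rates across them. It is only an approximation of the idea: the chart in this post comes from the model, not from raw win rates, and the dates, outcomes, and win_rate_by helper below are hypothetical toy values rather than real films.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy data: (release date, won Best Picture?). Not real films.
films = [
    (date(2001, 1, 12), False),
    (date(2003, 11, 5), True),
    (date(2005, 8, 26), False),
    (date(2007, 11, 21), False),
    (date(2009, 12, 4), True),
]

def win_rate_by(rows, key):
    """Group films by a date-derived key and return the share of winners."""
    totals, wins = defaultdict(int), defaultdict(int)
    for released, won in rows:
        bucket = key(released)
        totals[bucket] += 1
        wins[bucket] += int(won)
    return {bucket: wins[bucket] / totals[bucket] for bucket in totals}

# Win rate keyed by release month (1 = January, 12 = December).
by_month = win_rate_by(films, lambda d: d.month)
# Win rate keyed by whether the film opened in the first week of its month.
by_first_week = win_rate_by(films, lambda d: d.day <= 7)

print(by_month)
print(by_first_week)
```

With a real dataset, raw rates like these are noisy for months with few nominees, so they are best read as a starting point for questions rather than as conclusions.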

 

Roll out the Red Carpet (And Predict for Yourself)

As the Academy picks winners in each category, hopefully, it will inspire you to think about what factors could have influenced each outcome. Gather up your ideas and next year, you can build your own model to get predictions and better understand the movie industry. Enjoy the Academy Awards on Sunday!  

 


About the Author:

Rajiv Shah is a data scientist at DataRobot, where he works with customers to make and implement predictions. Previously, Rajiv was part of data science teams at Caterpillar and State Farm. He enjoys data science and spends time mentoring data scientists, speaking at events, and having fun with blog posts. He has a PhD from the University of Illinois at Urbana-Champaign.