Grammy Predictions 2020: Tuning in to the Song and Record of the Year

This blog post is meant to be a fun and unique take on predicting Song of the Year and Record of the Year for the 62nd Annual Grammy Awards.

With the Grammy Awards set to air on Sunday, January 26th on CBS, many are ready to see if their favorite new songs will be awarded the iconic gramophone. Last year, history was made as Atlanta hip-hop artist Childish Gambino won for “This Is America”, making it the first rap song to win the prestigious Song of the Year award (read last year’s blog post on the 61st Annual Grammy Awards to see how we predicted this historic win). This isn’t the only recent first for the rap genre. Two years ago, Kendrick Lamar won a Pulitzer Prize for his fourth album “DAMN.”, making him the first musician outside the classical and jazz domains to win. Just as the Recording Academy continues to adapt to a rapidly developing modern era of music, analysts and data scientists alike are acclimating to new advances in machine learning, such as automated machine learning.

In last year’s blog post, we used Spotify data to predict Grammy winners as a fun experiment. Naturally, information such as how danceable or positive a song sounds represents only a small part of the criteria the Recording Academy may consider when choosing the song most deserving of the award; hence, the results should be taken with a grain of salt. Regardless, it’s always fun to demonstrate applications of machine learning to non-traditional use cases (like NHL hockey, March Madness, or even Game of Thrones), whether the prediction turns out to be correct or not.

This year, I set my sights on repeating some of the same analysis, not only for this year’s Song of the Year nominees but also for Record of the Year. For those who may not know, the key difference between these categories is that Record of the Year recognizes everyone involved in the production of the track (including the performer(s), producer(s), and engineer(s)), while Song of the Year focuses specifically on the composer(s) and songwriter(s).

 

Methodology

As before, this analysis leverages the R package spotifyr (which I highly recommend) to collect characteristics (such as the audio features) and lyrics for both nominees and winners dating back to 1959. I also add features that describe the general sentiment of the song as well as the amount of profanity present. This year, I decided to expand this a bit more to include word counts and the percentages of lyrics that contain words associated with emotions such as anger, sadness, and joy. After building hundreds of candidate models on more than 40 features, I chose the one that had the best performance on the five most recent award ceremonies.
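To make the lyric features above a little more concrete, here is a minimal Python sketch of how word counts, emotion-word percentages, and a profanity count could be computed. It is not the pipeline used for this post (which was built in R): the tiny `EMOTION_LEXICON`, the `PROFANITY` list, and the function names are all hypothetical placeholders standing in for a full published lexicon and word list.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping a word to the emotions it is associated with.
# A real analysis would load a full published emotion lexicon instead.
EMOTION_LEXICON = {
    "love": {"joy"},
    "happy": {"joy"},
    "smile": {"joy"},
    "cry": {"sadness"},
    "alone": {"sadness"},
    "lost": {"sadness"},
    "hate": {"anger"},
    "fight": {"anger"},
}

# Hypothetical placeholder list; substitute a real profanity word list.
PROFANITY = {"damn", "hell"}


def tokenize(lyrics: str) -> list[str]:
    """Lowercase the lyrics and split them into word tokens."""
    return re.findall(r"[a-z']+", lyrics.lower())


def lyric_features(lyrics: str) -> dict[str, float]:
    """Return word count, profanity count, and per-emotion counts and percentages."""
    words = tokenize(lyrics)
    total = len(words)

    # Count how many words are tagged with each emotion.
    counts = Counter()
    for word in words:
        for emotion in EMOTION_LEXICON.get(word, ()):
            counts[emotion] += 1

    features = {
        "word_count": total,
        "profanity_count": sum(w in PROFANITY for w in words),
    }
    emotions = sorted({e for tags in EMOTION_LEXICON.values() for e in tags})
    for emotion in emotions:
        features[f"{emotion}_count"] = counts[emotion]
        # Percentage of all words in the lyrics tied to this emotion.
        features[f"{emotion}_pct"] = 100.0 * counts[emotion] / total if total else 0.0
    return features


features = lyric_features("I cry alone, damn, but I love you")
```

Each song then contributes one row of such features, which can sit alongside the Spotify audio features as model inputs.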

 

Who Came Out on Top This Year?

Below are the model’s top three picks for Song of the Year and Record of the Year, along with the predicted probability for each:

Song of the Year 2020

| Artist | Song | Probability |
| --- | --- | --- |
| Lewis Capaldi | “Someone You Loved” | 35.25% |
| Billie Eilish | “bad guy” | 33.53% |
| Lana Del Rey | “Norman Fucking Rockwell” | 26.86% |

 

Record of the Year 2020

| Artist | Song | Probability |
| --- | --- | --- |
| Billie Eilish | “bad guy” | 31.10% |
| Lil Nas X | “Old Town Road” | 29.53% |
| Bon Iver | “Hey, Ma” | 19.23% |

 

The model predicts that one of the leading favorites, Billie Eilish, will win Record of the Year. She is also a top contender for Song of the Year. Lil Nas X is in the running for Record of the Year as well, with his record-setting hit “Old Town Road”. Lewis Capaldi’s “Someone You Loved” holds the top spot for Song of the Year, which makes sense as the Recording Academy typically rewards young singers with older-sounding voices.

Notably, both Lana Del Rey and Bon Iver appear to have a shot at winning according to the model, despite having fairly low odds. To understand this better, let’s explore just one of the more important factors the model relied on to make a prediction: danceability. Spotify scores each song on this measure, based on “tempo, rhythm stability, beat strength, and overall regularity”, from 0 (least danceable) to 1 (most danceable). Below is the feature effects graph for danceability associated with the DataRobot model. This plot describes what happens to the model’s predictions as you vary the feature across its range of values while leaving everything else the same.

The higher the point on the chart, the more likely a song is to win. In this case, songs have a better chance of winning, on average, if they avoid the middle ground of danceability. This lines up with the rankings: both “Norman Fucking Rockwell” and “Hey, Ma” sit on the lower end (0.218 and 0.411, respectively), while other nominees such as “bad guy” (0.701) and “Old Town Road” (0.878) sit on the higher end. Tracks that float in the middle ground, like Tanya Tucker’s “Bring My Flowers Now” (0.557) and Lady Gaga’s “Always Remember Us This Way” (0.553), did not make the cut.
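The idea behind a feature effects plot (varying one feature while leaving everything else the same, then averaging the predictions) can be sketched in a few lines of Python as a partial dependence calculation. This is only an illustration under stated assumptions, not how DataRobot computes its chart: `toy_model` is a made-up stand-in that was deliberately shaped to favor songs away from the middle of the danceability range, and the `valence` values are invented. Only the danceability values come from the post.

```python
from statistics import mean
from typing import Callable

Row = dict[str, float]


def partial_dependence(
    predict: Callable[[Row], float],
    rows: list[Row],
    feature: str,
    grid: list[float],
) -> list[tuple[float, float]]:
    """Average prediction at each grid value, with all other features unchanged."""
    curve = []
    for value in grid:
        # Overwrite only the chosen feature in a copy of every row.
        modified = [{**row, feature: value} for row in rows]
        curve.append((value, mean(predict(r) for r in modified)))
    return curve


# Hypothetical stand-in model, NOT the model from the post: it scores songs
# higher the further danceability is from 0.55, plus a small valence bump.
def toy_model(row: Row) -> float:
    score = 0.1 + 0.5 * abs(row["danceability"] - 0.55) + 0.05 * row["valence"]
    return min(max(score, 0.0), 1.0)


# Danceability values are the ones quoted in the post; valence values are made up.
songs = [
    {"danceability": 0.218, "valence": 0.2},
    {"danceability": 0.411, "valence": 0.4},
    {"danceability": 0.701, "valence": 0.6},
    {"danceability": 0.878, "valence": 0.6},
]

curve = partial_dependence(toy_model, songs, "danceability", [0.1, 0.55, 0.9])
for value, avg in curve:
    print(f"danceability={value:.2f} -> average prediction {avg:.3f}")
```

With this toy model, the averaged prediction dips at the middle grid value and rises toward either end, which is the same kind of shape described for the real chart. A fitted model would simply replace `toy_model` as the `predict` argument.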

 

Machine Learning and the Music Industry

While danceability may play a key role, it’s important to note that it’s not the only criterion for a winning song. Many other considerations are involved, some of which are hard to quantify, like the current political climate (think of last year’s Song of the Year winner), and pinpointing one specific contributing factor is nearly impossible. However, with the help of new data sources and machine learning, we can begin to better summarize empirical relationships in the data, especially in the realm of music. As for the upcoming awards ceremony, we can see that both the model and public perception agree that Billie Eilish is poised for a big night.

Tune in on the 26th to see how things turn out!

 


 

About the Author:

Taylor Larkin is a data scientist at DataRobot. Based out of Atlanta, he's currently responsible for executing data science projects as well as enabling customers to do data science work. He has worked on machine learning projects and research articles in a variety of realms including geomagnetic storm prediction, healthcare, renewable energy, sports analytics, and wine preference. Prior to joining DataRobot, Taylor graduated from The University of Alabama with a PhD in Business Analytics and an MS in Applied Statistics.