This post is meant to be a fun and unique take on predicting Song of the Year for the 61st Annual Grammy Awards.
The Grammy Awards are coming up on Sunday, February 10th on CBS. Lots of careers will be made that night. Music industry insiders often talk about a “Grammy Bounce” that results from winning the golden gramophone, with one estimate putting the bounce at a 55% increase in concert ticket sales and producer fees during the year following a Grammy win. With so much money and fame on the line, I was curious to see whether one could empirically predict a winner. I took a stab at that question with the help of DataRobot, an automated machine learning platform.
The Recording Academy has 900 voluntary members and promotes the Grammy as the music industry’s only peer-recognized accolade and highest achievement. There are currently 30 fields of recognition and 84 categories within those fields. For the purposes of this experiment, we are interested in one category: Song of the Year.
How we made the Song of the Year prediction
To predict who is most likely to win Song of the Year, I used the Spotify web API and the R package spotifyr. For each nominee and winner dating back to the first awards ceremony in 1959, I extracted the genre of the song, the amount of profanity, the general sentiment, the total word count, and various audio features derived by Spotify. For a deeper discussion of some of this information, read my previous blog post. As in that previous analysis, I took advantage of the automation of DataRobot and the power of the DataRobot R package to build 140 candidate models in just six minutes.
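To give a sense of what lyric-based features like these might look like, here is a minimal Python sketch. It is not the code behind this analysis (that was done in R with spotifyr), and it makes several assumptions: the tiny word lists are hypothetical placeholders rather than real lexicons, the function name `lyric_features` is invented for illustration, and sentiment is approximated as a simple positive-minus-negative word ratio.

```python
import re

# Hypothetical placeholder word lists, for illustration only.
# A real analysis would rely on full, published lexicons.
PROFANITY = {"damn", "hell"}
POSITIVE = {"love", "happy", "shine", "good"}
NEGATIVE = {"cry", "pain", "lonely", "bad"}


def tokenize(lyrics: str) -> list[str]:
    """Lowercase the lyrics and split them into word tokens."""
    return re.findall(r"[a-z']+", lyrics.lower())


def lyric_features(lyrics: str) -> dict[str, float]:
    """Compute word count, profanity count, and a crude sentiment score."""
    words = tokenize(lyrics)
    total = len(words)
    profanity = sum(w in PROFANITY for w in words)
    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)
    # Sentiment ranges from -1 (all negative words) to 1 (all positive words);
    # empty lyrics are scored as neutral.
    sentiment = (positive - negative) / total if total else 0.0
    return {
        "word_count": total,
        "profanity_count": profanity,
        "sentiment": sentiment,
    }


if __name__ == "__main__":
    print(lyric_features("Love, love me do. Damn the pain!"))
```

Each song then becomes one row of numbers like these, which can sit alongside the genre and Spotify audio features as inputs to a model.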
Below is a screenshot of the best model blueprint from those tested, which performed about 44% better than random guessing during my testing period (2012 to 2018):
And the winner is…
According to my analysis, the two songs most likely to win are “This Is America” by Childish Gambino and “Shallow” by Lady Gaga and Bradley Cooper. This is consistent with what some experts think, with the former representing a bolder choice for a winner and the latter a more traditional one.
Singing a different tune
With this experiment, we’re demonstrating that machine learning can not only be fun but can also have applications well beyond the traditional ones we are used to seeing in fields such as banking or insurance. The music industry could tap into its potential, studying what makes a song successful and understanding why people listen to the songs that they do. With the volume of great music being produced, having quick insights into song popularity could be another tool to help musicians and music producers to refine their expertise.
About the author:
Taylor Larkin is a Data Science Evangelist at DataRobot. Based out of Atlanta, he's responsible for teaching data science best practices by leading training sessions, developing course content for DataRobot University, and helping academic institutions integrate DataRobot into the classroom. He has worked on machine learning projects and research articles in a variety of realms including geomagnetic storm prediction, healthcare, renewable energy, sports analytics, and wine preference. Prior to joining DataRobot, Taylor graduated from The University of Alabama with a Ph.D. in Business Analytics and an MS in Applied Statistics.