Yet Another Spotify Recommender: Part 2

Glad You’re Excited for Part 2

Congrats! You’ve made it to Part 2 of my creation of Yet Another Spotify Recommender. If you haven’t read Part 1 yet, please click here. As we saw in the previous section, comparing audio files with one another took way too long, so here I will go over how I sped this up using the audio features from Spotify’s API. I will also go over how I deployed the recommender to the internet using Flask.

Spotify API’s Audio Features

Luckily, Spotify’s API has a handy call that returns the audio features of a song in numeric form, which is much easier to compare than large audio files. There are 12 audio features: danceability, energy, key, loudness, speechiness, acousticness, instrumentalness, liveness, valence, tempo, mode, and time signature. The features I decided to use were danceability, energy, loudness, speechiness, acousticness, instrumentalness, liveness, valence, tempo, and mode. By using only these features, I was able to reduce training time to 2 seconds, since only a small set of numbers per track needed to be compared rather than large matrices of each song’s frequencies. I then took the 5 most similar songs and the 5 most dissimilar songs to use as recommendations to the user. Code to use Spotify’s API through the Python wrapper Spotipy is shown below.

import os
from time import sleep
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
from tqdm import tqdm

# Set up Spotify with app credentials. Environment variables keep GitHub scrapers from getting my Spotify developer keys.
cid = os.getenv('SPOTIPY_CLIENT_ID')
secret = os.getenv('SPOTIPY_CLIENT_SECRET')
client_credentials_manager = SpotifyClientCredentials(client_id=cid,
                                                      client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

# user_df is assumed to be a DataFrame with one row per track and a 'track_uri' column
user_features_list = []
for i in tqdm(range(len(user_df))):
    user_features_list.append(sp.audio_features(user_df.iloc[i]['track_uri'])[0])
    sleep(0.02)  # brief pause between requests

Note that you will have to supply your own track URIs to get the audio features of a song, but the basic outline is shown above. You will also have to sign up for a Spotify account and create an app to get a client ID and secret. You can do so by going here.
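To make the comparison step concrete, here is a minimal sketch of how feature rows like the ones returned above could be ranked. The post does not say which similarity measure I used, so treat the min-max scaling and Euclidean distance here as assumptions, and the function names (min_max_scale, distance, rank_candidates) as hypothetical.

```python
import math

# The ten features used by the recommender, as listed above.
FEATURES = ["danceability", "energy", "loudness", "speechiness", "acousticness",
            "instrumentalness", "liveness", "valence", "tempo", "mode"]


def min_max_scale(rows):
    """Rescale each feature to [0, 1] so tempo and loudness don't dominate."""
    scaled = [dict(row) for row in rows]
    for name in FEATURES:
        values = [row[name] for row in rows]
        low, high = min(values), max(values)
        span = high - low
        for row in scaled:
            row[name] = (row[name] - low) / span if span else 0.0
    return scaled


def distance(a, b):
    """Euclidean distance between two feature dicts (an assumed metric)."""
    return math.sqrt(sum((a[name] - b[name]) ** 2 for name in FEATURES))


def rank_candidates(user_track, candidates, n=5):
    """Return (n most similar, n most dissimilar) candidate tracks."""
    scaled = min_max_scale([user_track] + candidates)
    target, rest = scaled[0], scaled[1:]
    # Candidate indices sorted from closest to farthest from the user's track.
    order = sorted(range(len(candidates)), key=lambda i: distance(target, rest[i]))
    most_similar = [candidates[i] for i in order[:n]]
    most_dissimilar = [candidates[i] for i in order[-n:][::-1]]
    return most_similar, most_dissimilar
```

Each dict in candidates can be an entry from user_features_list or from any other set of tracks fetched the same way. Because only ten numbers per track are compared, this kind of ranking stays fast even for thousands of songs.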

Deployment

Now we can talk about bringing the recommender to the internet. To do this, I wrote a custom web app in Flask with a Redis queue so that my system does not get overloaded when many users want to use the app at the same time, such as during my presentation of the project. To make the app public, I put it behind an NGINX reverse proxy to protect my IP address. I used Gunicorn because the default Flask server is a bad idea for production deployment. Below I will show the steps needed to set up a Redis queue, Gunicorn, and NGINX.
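Before the setup steps, the idea behind the queue can be sketched in a few lines. This is not the app's actual code: it uses Python's standard-library queue and a worker thread as a stand-in for Redis, and the names jobs, results, recommend, worker, and submit are all hypothetical. What it shows is the pattern itself: requests are enqueued right away and a single worker handles them one at a time.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for the Redis queue
results = {}           # job id -> finished recommendation


def recommend(track_uri):
    """Placeholder for the real (slow) recommendation function."""
    return f"recommendations for {track_uri}"


def worker():
    """Pull one job at a time so heavy work never runs concurrently."""
    while True:
        job_id, track_uri = jobs.get()
        if job_id is None:      # sentinel value: shut the worker down
            jobs.task_done()
            break
        results[job_id] = recommend(track_uri)
        jobs.task_done()


def submit(job_id, track_uri):
    """What a web request handler would do: enqueue and return immediately."""
    jobs.put((job_id, track_uri))
```

In a deployed app the queue lives in Redis and the worker runs as its own process, so the web server stays responsive no matter how many recommendations are waiting.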

To set up Redis, run the following in the command line:

sudo apt-get install tk8.5 tcl8.5
mkdir redis_setup && cd redis_setup
curl -O http://download.redis.io/redis-stable.tar.gz
tar xzvf redis-stable.tar.gz
cd redis-stable
make
make test
sudo make install   

Then run the following to start up Redis:

redis-server

Note that Redis’s default port is 6379, so if you have another program running on this port, you will need to change the port using the --port option (for example, redis-server --port 6380). You now have Redis up and running!
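If you are not sure whether something is already listening on 6379, a quick check like the one below can tell you before you start Redis. It is a small sketch using only the standard library; port_in_use is a hypothetical helper name, and it only checks the local machine by default.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("6379 in use:", port_in_use(6379))
```

Run it once before starting redis-server to spot a conflict, and once after to confirm that Redis is accepting connections.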

Now you’ll want to create a systemd service file for Gunicorn using the following command:

sudo vim /etc/systemd/system/app.service

The following is what you’ll want to add to this file:

[Unit]
Description=Gunicorn instance for capstone
After=network.target

[Service]
User=jesse
Group=www-data
WorkingDirectory=/home/jesse/dsir-1116/projects/capstone/app
Environment="PATH=/home/jesse/dsir-1116/projects/spotifyenv/bin"
Environment="LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64"
Environment="SPOTIPY_CLIENT_ID=yourspotipyclientid"
Environment="SPOTIPY_CLIENT_SECRET=yourspotipyclientsecret"
Environment="SPOTIPY_REDIRECT_URI=yourspotipyredirecturi"
ExecStart=/home/jesse/dsir-1116/projects/spotifyenv/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app

[Install]
WantedBy=multi-user.target

Be sure to change the User, WorkingDirectory, Environment variables, and ExecStart to fit your own needs. Now you can start Gunicorn and have it start after a system reboot by running the following in the command line:

sudo systemctl start app
sudo systemctl enable app
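One detail in the service file worth spelling out is the wsgi:app argument at the end of ExecStart. Gunicorn reads it as "import the module wsgi from the working directory and serve the object named app". With Flask, that object is the Flask instance. The sketch below uses a bare WSGI callable instead so it runs without any dependencies; the file contents and response text are purely illustrative.

```python
# wsgi.py (a hypothetical file placed in the WorkingDirectory)

def app(environ, start_response):
    """Bare WSGI callable; with Flask this name would be the Flask instance."""
    body = b"YASR is running\n"
    headers = [("Content-Type", "text/plain; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]
```

In a Flask project, wsgi.py would typically just import the Flask instance from wherever it is defined, so that the name app is available at module level.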

To install NGINX, run the following in the command line:

sudo apt install nginx

Now open an NGINX conf file to serve your app using the following command:

sudo vim /etc/nginx/sites-available/spotify.conf

Add the following to the file:

server {
        server_name domain.com;

        location / {
                include proxy_params;
                proxy_pass http://unix:/home/jesse/dsir-1116/projects/capstone/app/app.sock;
        }
}

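Replace domain.com with your own domain. On Debian and Ubuntu, NGINX only loads files linked into /etc/nginx/sites-enabled/, so you may also need to symlink spotify.conf there and reload NGINX. You will also want to add SSL encryption to your site to make it secure. You can do this using Certbot. To install Certbot and add SSL, run the following: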

sudo snap install core; sudo snap refresh core
sudo snap install --classic certbot
sudo ln -s /snap/bin/certbot /usr/bin/certbot
sudo certbot --nginx

This should be all that needs to be done to get the recommender deployed! If you’d like to try it out, click here.

Performance

To measure the performance of YASR, I created a MySQL database to store the scores users gave to their recommendations. I ended up with an average score of 5.66 out of 10 for similar songs and an average score of 5.16 out of 10 for dissimilar songs. One caveat, however, is that the sample size for these scores is small, so while it is likely that there is more work to do on these recommendations, we would need to collect more scores to be sure.
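For anyone who wants to collect scores the same way, here is a rough sketch of storing ratings and averaging them per recommendation type. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table name, column names, and the 0 to 10 score range are assumptions, not the schema I actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute(
    "CREATE TABLE ratings ("
    " id INTEGER PRIMARY KEY,"
    " rec_type TEXT NOT NULL CHECK (rec_type IN ('similar', 'dissimilar')),"
    " score INTEGER NOT NULL CHECK (score BETWEEN 0 AND 10))"
)


def add_rating(rec_type, score):
    """Store one user's score for one type of recommendation."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO ratings (rec_type, score) VALUES (?, ?)",
            (rec_type, score),
        )


def average_scores():
    """Return {rec_type: (average score, number of ratings)}."""
    rows = conn.execute(
        "SELECT rec_type, AVG(score), COUNT(*) FROM ratings GROUP BY rec_type"
    )
    return {rec_type: (avg, n) for rec_type, avg, n in rows}
```

Returning the count next to each average makes it easy to see how small the sample behind a score is.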

Conclusion

As we learned from building this recommender, comparing audio files is a very difficult and time-consuming process. It is likely necessary to extract numeric features from audio files rather than comparing the files themselves after some simple preprocessing, because waiting 2 days for a recommendation is not viable for most people. By using Spotify’s audio features, we were able to reach our goal of giving new artists more exposure by recommending their songs to people through a web app.

Further Steps

Below are some of the steps I can think of to improve YASR:

  • Create interaction terms between Spotify API Audio Features to see if our recommender improves
  • Use speech recognition with our Audio File modeling to see if we can use NLP to recommend songs
  • Build our own features from our Audio File analysis so we don’t have to rely on Spotify API’s audio features
  • Classify audio into genres as Spotify’s API does not include the genre of a song
  • Create another page where users can see more recommendations
  • Use users’ ratings on recommendations to recommend songs to similar users
  • Write code that takes better advantage of the GPU using TensorRT or CUDA in C/C++
  • Expand this model to Apple Music, Google Play Music, and other music streaming services
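As an illustration of the first idea in the list, interaction terms could be built as pairwise products of the existing features. This is only a sketch of one possible approach: add_interactions is a hypothetical helper, and whether such terms actually improve the recommendations is untested.

```python
from itertools import combinations


def add_interactions(track, features):
    """Return a copy of track with a product term for every pair of features."""
    out = dict(track)
    for a, b in combinations(features, 2):
        out[f"{a}*{b}"] = track[a] * track[b]
    return out
```

For the ten features used by YASR this would add 45 new columns, so the features would likely need to be rescaled before being compared.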

Special Thanks

Thanks to the following people and resources for making this project possible: