import pandas as pd

Spotify Favorites Analysis
The following DataFrame contains Spotify playlist data from the 2018 Spotify Million Playlist Dataset Challenge.
spotify = pd.read_csv('https://bcdanl.github.io/data/spotify_all.csv')

Out of all these brilliant artists, I’d like to highlight and analyze some of my favorites:
- Saint Motel
- Tame Impala
- Castlecomer
- CRX
- Two Door Cinema Club
First, let’s find which songs from each of these artists are in the larger data frame and how many times each song appears:
artist_favorites = ['Saint Motel', 'Tame Impala', 'Castlecomer', 'CRX', 'Two Door Cinema Club']
spotify_favorites = spotify[spotify['artist_name'].isin(artist_favorites)]
# Song names for each artist
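# De-duplicate on the (artist, track) pair so identically titled songs by different artists are both kept
song_names_fav = spotify_favorites.drop_duplicates(subset=['artist_name', 'track_name'], keep='first')[['artist_name', 'track_name']].sort_values('artist_name')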
song_names_fav

| | artist_name | track_name |
|---|---|---|
| 101794 | CRX | Slow Down |
| 129597 | Castlecomer | Fire Alarm |
| 21124 | Saint Motel | Cold Cold Man |
| 132905 | Saint Motel | Ace In The Hole - Live from Spotify San Francisco |
| 1235 | Saint Motel | Born Again |
| ... | ... | ... |
| 23358 | Two Door Cinema Club | You're Not Stubborn |
| 12902 | Two Door Cinema Club | What You Know - Live |
| 16457 | Two Door Cinema Club | Undercover Martyn |
| 68025 | Two Door Cinema Club | Eat That Up, Its Good For You |
| 195075 | Two Door Cinema Club | Spring |
77 rows × 2 columns
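The columns passed to `subset` matter whenever two artists share a song title. The sketch below uses made-up rows (the artist and track names are hypothetical, not taken from the Spotify data) to show how de-duplicating on the title alone can silently drop a song, while de-duplicating on the artist and title together keeps it:

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the Spotify data.
toy = pd.DataFrame({
    'artist_name': ['Artist A', 'Artist A', 'Artist B', 'Artist B'],
    'track_name': ['Intro', 'Intro', 'Intro', 'Closer'],
})

# De-duplicating on the title alone keeps a single "Intro" and drops Artist B's.
by_title = toy.drop_duplicates(subset='track_name', keep='first')

# De-duplicating on the (artist, title) pair keeps one row per artist's song.
by_artist_and_title = toy.drop_duplicates(
    subset=['artist_name', 'track_name'], keep='first'
)

print(len(by_title))             # 2
print(len(by_artist_and_title))  # 3
```

For the five artists here both approaches give 77 rows, so no titles happen to collide, but the pair-based version is the safer default.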
# Number of songs each artist has in the larger spotify DataFrame
songs_count_fav = spotify_favorites.drop_duplicates(subset=['artist_name', 'track_name'], keep='first').value_counts('artist_name').sort_values()
songs_count_fav

| artist_name | count |
|---|---|
| CRX | 1 |
| Castlecomer | 1 |
| Saint Motel | 11 |
| Two Door Cinema Club | 24 |
| Tame Impala | 40 |
# Times each individual track was listed
songs_listed_fav = spotify_favorites.value_counts(['artist_name', 'track_name']).sort_index()
songs_listed_fav

| artist_name | track_name | count |
|---|---|---|
| CRX | Slow Down | 1 |
| Castlecomer | Fire Alarm | 1 |
| Saint Motel | Ace In The Hole - Live from Spotify San Francisco | 1 |
| | Born Again | 2 |
| | Cold Cold Man | 14 |
| ... | ... | ... |
| Two Door Cinema Club | This Is The Life | 1 |
| | Undercover Martyn | 10 |
| | What You Know | 39 |
| | What You Know - Live | 1 |
| | You're Not Stubborn | 2 |
77 rows × 1 columns
Based on these counts, let’s see which songs were listed most and least often:
songs_listed_fav.nlargest(5, keep='all')

| artist_name | track_name | count |
|---|---|---|
| Two Door Cinema Club | What You Know | 39 |
| Tame Impala | The Less I Know The Better | 30 |
| | Feels Like We Only Go Backwards | 20 |
| Saint Motel | My Type | 19 |
| Two Door Cinema Club | Something Good Can Work | 19 |
songs_listed_fav.nsmallest(5, keep='all')

| artist_name | track_name | count |
|---|---|---|
| CRX | Slow Down | 1 |
| Castlecomer | Fire Alarm | 1 |
| Saint Motel | Ace In The Hole - Live from Spotify San Francisco | 1 |
| | Daydream / Wetdream / Nightmare | 1 |
| | Local Long Distance Relationship (LA2NY) | 1 |
| | Something About Us - Recorded at Spotify Studios NYC | 1 |
| | Sweet Talk | 1 |
| | You Can Be You | 1 |
| Tame Impala | 'Cause I'm A Man - HAIM Remix | 1 |
| | Desire Be Desire Go | 1 |
| | Expectation | 1 |
| | I Don't Really Mind | 1 |
| | It Is Not Meant To Be | 1 |
| | Jeremy's Storm | 1 |
| | Keep On Lying | 1 |
| | Love/Paranoia | 1 |
| | Mind Mischief - Ducktails Remix | 1 |
| | Reality In Motion | 1 |
| | Remember Me | 1 |
| | Runway Houses City Clouds | 1 |
| | Sun's Coming Up | 1 |
| | Sundown Syndrome | 1 |
| | Wander | 1 |
| | Why Won't You Make Up Your Mind? | 1 |
| Two Door Cinema Club | Cigarettes In The Theatre | 1 |
| | Gameshow | 1 |
| | Lavender | 1 |
| | Pyramid | 1 |
| | Sleep Alone | 1 |
| | Something Good Can Work - RAC Remix | 1 |
| | Something Good Can Work - The Twelves remix | 1 |
| | Spring | 1 |
| | This Is The Life | 1 |
| | What You Know - Live | 1 |
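The `nsmallest(5, keep='all')` call returns far more than five rows because `keep='all'` retains every value tied with the smallest ones selected. A minimal sketch with a made-up Series (the labels and counts are hypothetical) shows the difference from the default `keep='first'`:

```python
import pandas as pd

# Made-up counts for illustration: six items are tied at the minimum of 1.
counts = pd.Series([39, 1, 1, 1, 2, 1, 1, 1], index=list('abcdefgh'))

# Default keep='first' returns exactly n rows, breaking ties by position.
first_only = counts.nsmallest(5)

# keep='all' returns every row tied with the n-th smallest value.
with_ties = counts.nsmallest(5, keep='all')

print(len(first_only))  # 5
print(len(with_ties))   # 6
```

Keeping the ties avoids an arbitrary cut among equally ranked songs, at the cost of a longer table.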
Many tracks are listed only once (34 of the 77), which is a shame, but there are plenty of more popular tracks across these artists.
Judging by the most-listed songs, ‘Two Door Cinema Club’ and ‘Tame Impala’ appear to be the two most popular artists of the bunch.
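Individual top tracks only hint at artist-level popularity; summing the listings per artist would be a more direct check. The sketch below assumes a Series shaped like `songs_listed_fav` (a count indexed by `artist_name` and `track_name`), but the artists, songs, and counts in it are made up for illustration:

```python
import pandas as pd

# Made-up per-track listing counts, indexed the same way as songs_listed_fav.
listed = pd.Series(
    [3, 1, 5, 2],
    index=pd.MultiIndex.from_tuples(
        [
            ('Artist A', 'Song 1'),
            ('Artist A', 'Song 2'),
            ('Artist B', 'Song 3'),
            ('Artist B', 'Song 4'),
        ],
        names=['artist_name', 'track_name'],
    ),
    name='count',
)

# Sum the track-level counts within each artist, most-listed artist first.
totals = listed.groupby(level='artist_name').sum().sort_values(ascending=False)
print(totals)
```

Applied to the real data, the same `groupby(level='artist_name').sum()` call on `songs_listed_fav` would give total listings per artist.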
Going back to the original ‘spotify_favorites’ DataFrame, we can also sort the entries by song length, from longest to shortest:
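# Convert milliseconds to minutes; assign() returns a new DataFrame, which avoids a SettingWithCopyWarning on the filtered slice
spotify_favorites = spotify_favorites.assign(duration_min=spotify_favorites['duration_ms'] / 60_000)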
spotify_favorites.sort_values('duration_min', ascending=False)

| | pid | playlist_name | pos | artist_name | track_name | duration_ms | album_name | duration_min |
|---|---|---|---|---|---|---|---|---|
| 64923 | 969 | Marshall | 33 | Tame Impala | Let It Happen - Soulwax Remix | 556924 | Let It Happen | 9.282067 |
| 142147 | 999121 | .::March::. | 36 | Tame Impala | Let It Happen - Soulwax Remix | 556924 | Let It Happen | 9.282067 |
| 165421 | 999491 | Play this at my funeral | 25 | Tame Impala | Let It Happen | 467585 | Currents | 7.793083 |
| 19178 | 303 | Tame Impala | 25 | Tame Impala | Let It Happen | 467585 | Currents | 7.793083 |
| 85204 | 1276 | FIREFLY 2016 | 27 | Tame Impala | Let It Happen | 467585 | Currents | 7.793083 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 19184 | 303 | Tame Impala | 31 | Tame Impala | Disciples | 108546 | Currents | 1.809100 |
| 171295 | 999586 | Fall 2017 | 1 | Tame Impala | Disciples | 108546 | Currents | 1.809100 |
| 101795 | 1521 | yoga | 17 | Tame Impala | Disciples | 108546 | Currents | 1.809100 |
| 57271 | 849 | chill | 63 | Tame Impala | Nangs | 107533 | Currents | 1.792217 |
| 19179 | 303 | Tame Impala | 26 | Tame Impala | Nangs | 107533 | Currents | 1.792217 |
343 rows × 8 columns
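Interestingly, the five longest and five shortest entries all come from ‘Tame Impala’, most of them from the same album (Currents), with durations ranging from 9.28 minutes down to 1.79 minutes. Because the table has one row per playlist entry, the same track can appear several times among them.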
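To rank distinct songs rather than playlist entries, one option is to drop repeated tracks before sorting. The sketch below is a small illustration, not the full analysis: it copies a few track names and `duration_ms` values from the output above (all by Tame Impala, so the artist column is omitted) and keeps one row per track:

```python
import pandas as pd

# A few rows copied from the output above (track name and duration only).
rows = pd.DataFrame({
    'track_name': [
        'Let It Happen - Soulwax Remix',
        'Let It Happen - Soulwax Remix',
        'Let It Happen',
        'Let It Happen',
        'Disciples',
        'Nangs',
        'Nangs',
    ],
    'duration_ms': [556924, 556924, 467585, 467585, 108546, 107533, 107533],
})
rows['duration_min'] = rows['duration_ms'] / 60_000

# Keep one row per track so repeated playlist entries do not crowd the ranking.
unique_tracks = rows.drop_duplicates(subset='track_name')

longest = unique_tracks.nlargest(2, 'duration_min')
shortest = unique_tracks.nsmallest(2, 'duration_min')
print(longest[['track_name', 'duration_min']])
print(shortest[['track_name', 'duration_min']])
```

On the full ‘spotify_favorites’ DataFrame, de-duplicating on both `artist_name` and `track_name` would be the safer choice, since several artists are involved.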