Sentiment Analysis with Songs

Sentiment analysis is the interpretation and classification of emotions (positive, negative and neutral) within text data using natural language processing (NLP) techniques.
In this session, we are going to analyse song lyrics fetched with the R package genius, using the bing lexicon. The bing lexicon categorizes words in a binary fashion into positive and negative categories; it does not assign weighted sentiment scores.
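To make the binary labelling concrete, here is a minimal sketch using a hand-made three-word lexicon. The names toy_lexicon and toy_words are hypothetical stand-ins; the real lexicon returned by get_sentiments("bing") has the same two columns, word and sentiment.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the bing lexicon: one label per word, no scores.
toy_lexicon <- tibble(
  word      = c("love", "sweet", "hate"),
  sentiment = c("positive", "positive", "negative")
)

# Words from an imaginary lyric line, already lower-cased and split.
toy_words <- tibble(word = c("i", "love", "this", "sweet", "song", "hate"))

# inner_join keeps only the words found in the lexicon and attaches
# their label; words without a match are dropped.
labelled <- inner_join(toy_words, toy_lexicon, by = "word")
labelled
```

Only three of the six words survive the join, which is why the sentiment tables further down are much shorter than the token tables.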
First, load the libraries:
library(pacman)
p_load(tidyr, tidyverse, tidytext, forcats, genius)
Then download each song's lyrics with genius_lyrics() and run the sentiment analysis. The steps are shown below, song by song.
- Aerosmith - I Don’t Want to Miss a Thing
# get song lyrics (no leading space in the song title)
aero_smith_i_don <- genius_lyrics(artist = "Aerosmith", song = "I don't want to miss a thing")
aero_smith_i_don
## # A tibble: 59 x 3
## track_title line lyric
## <chr> <int> <chr>
## 1 I Don’t Want to Miss a T… 1 I could stay awake just to hear you breathin'
## 2 I Don’t Want to Miss a T… 2 Watch you smile while you are sleeping
## 3 I Don’t Want to Miss a T… 3 While you're far away and dreaming
## 4 I Don’t Want to Miss a T… 4 I could spend my life in this sweet surrender
## 5 I Don’t Want to Miss a T… 5 I could stay lost in this moment forever
## 6 I Don’t Want to Miss a T… 6 Where every moment spent with you is a momen…
## 7 I Don’t Want to Miss a T… 7 Don't want to close my eyes
## 8 I Don’t Want to Miss a T… 8 I don't want to fall asleep
## 9 I Don’t Want to Miss a T… 9 'Cause I'd miss you, baby
## 10 I Don’t Want to Miss a T… 10 And I don't wanna miss a thing
## # … with 49 more rows
# tidy up lyrics: one word per row
i_dont_tidy <- aero_smith_i_don %>%
  select(lyric, track_title) %>%
  unnest_tokens(word, lyric)
i_dont_tidy
## # A tibble: 390 x 2
## track_title word
## <chr> <chr>
## 1 I Don’t Want to Miss a Thing i
## 2 I Don’t Want to Miss a Thing could
## 3 I Don’t Want to Miss a Thing stay
## 4 I Don’t Want to Miss a Thing awake
## 5 I Don’t Want to Miss a Thing just
## 6 I Don’t Want to Miss a Thing to
## 7 I Don’t Want to Miss a Thing hear
## 8 I Don’t Want to Miss a Thing you
## 9 I Don’t Want to Miss a Thing breathin
## 10 I Don’t Want to Miss a Thing watch
## # … with 380 more rows
# join with sentiment lexicon
i_dont_sentiments <- i_dont_tidy %>%
  inner_join(get_sentiments("bing"), by = "word")
i_dont_sentiments
## # A tibble: 35 x 3
## track_title word sentiment
## <chr> <chr> <chr>
## 1 I Don’t Want to Miss a Thing smile positive
## 2 I Don’t Want to Miss a Thing sweet positive
## 3 I Don’t Want to Miss a Thing surrender negative
## 4 I Don’t Want to Miss a Thing lost negative
## 5 I Don’t Want to Miss a Thing treasure positive
## 6 I Don’t Want to Miss a Thing fall negative
## 7 I Don’t Want to Miss a Thing miss negative
## 8 I Don’t Want to Miss a Thing miss negative
## 9 I Don’t Want to Miss a Thing miss negative
## 10 I Don’t Want to Miss a Thing miss negative
## # … with 25 more rows
# bar graph of each word's contribution to sentiment
i_dont_sentiments %>%
  count(sentiment, word) %>%
  ungroup() %>%
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col() +
  ylab("Contribution to sentiment") +
  coord_flip()
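The plotting code relies on one small trick: negative words get their counts flipped to negative numbers, so their bars point the other way from zero. A minimal sketch of that step on made-up data (toy_sentiments is hypothetical, and arrange() stands in for the ordering that reorder() gives the plot):

```r
library(dplyr)
library(tibble)

# Hypothetical sentiment table in the same shape as the joined output above.
toy_sentiments <- tibble(
  word      = c("miss", "miss", "miss", "sweet", "smile", "lost"),
  sentiment = c("negative", "negative", "negative",
                "positive", "positive", "negative")
)

signed_counts <- toy_sentiments %>%
  count(sentiment, word) %>%                             # occurrences per word
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>% # flip negative counts
  arrange(n)                                             # most negative first
signed_counts
```

Summing the n column gives a rough net score for the song: below zero means the negative words outnumber the positive ones.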
- The Weeknd - Earned It
the_weeknd_earned_it <- genius_lyrics(artist = "The Weeknd", song = "Earned It")
the_weeknd_earned_it
## # A tibble: 51 x 3
## track_title line lyric
## <chr> <int> <chr>
## 1 Earned It 1 I'ma care for you
## 2 Earned It 2 I'ma care for you, you, you, you
## 3 Earned It 3 You make it look like it's magic (Oh yeah)
## 4 Earned It 4 'Cause I see nobody, nobody but you, you, you
## 5 Earned It 5 I'm never confused
## 6 Earned It 6 Hey, hey
## 7 Earned It 7 I'm so used to bein' used
## 8 Earned It 8 So I love when you call unexpected
## 9 Earned It 9 'Cause I hate when the moment's expected
## 10 Earned It 10 So I'ma care for you, you, you
## # … with 41 more rows
earned_it_tidy <- the_weeknd_earned_it %>%
  select(lyric, track_title) %>%
  unnest_tokens(word, lyric)
earned_it_tidy
## # A tibble: 325 x 2
## track_title word
## <chr> <chr>
## 1 Earned It i'ma
## 2 Earned It care
## 3 Earned It for
## 4 Earned It you
## 5 Earned It i'ma
## 6 Earned It care
## 7 Earned It for
## 8 Earned It you
## 9 Earned It you
## 10 Earned It you
## # … with 315 more rows
earned_it_sentiments <- earned_it_tidy %>%
  inner_join(get_sentiments("bing"), by = "word")
earned_it_sentiments
## # A tibble: 36 x 3
## track_title word sentiment
## <chr> <chr> <chr>
## 1 Earned It like positive
## 2 Earned It magic positive
## 3 Earned It confused negative
## 4 Earned It love positive
## 5 Earned It unexpected negative
## 6 Earned It hate negative
## 7 Earned It perfect positive
## 8 Earned It worth positive
## 9 Earned It work positive
## 10 Earned It love positive
## # … with 26 more rows
earned_it_sentiments %>%
  count(sentiment, word) %>%
  ungroup() %>%
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col() +
  ylab("Contribution to sentiment") +
  coord_flip()
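The tokenise-and-join steps are identical for every song, so they can be wrapped in a small helper. This is only a sketch: song_sentiments() is a hypothetical name, not a function from genius or tidytext, and it assumes the input has track_title and lyric columns like the tibbles printed above.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical helper: split lyrics into words and label them with bing.
song_sentiments <- function(lyrics) {
  lyrics %>%
    select(lyric, track_title) %>%
    unnest_tokens(word, lyric) %>%                    # one lower-cased word per row
    inner_join(get_sentiments("bing"), by = "word")   # keep only lexicon words
}

# Made-up lyrics so the example runs without downloading anything.
toy_lyrics <- tibble(
  track_title = "Toy Song",
  line        = 1:2,
  lyric       = c("I love you", "I hate this")
)

toy_result <- song_sentiments(toy_lyrics)
toy_result
```

With real data the call would look like song_sentiments(the_weeknd_earned_it), giving the same kind of table as earned_it_sentiments.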
- Pharrell Williams - Happy
pharrell_will_happy <- genius_lyrics(artist = "Pharrell Williams", song = "Happy")
pharrell_will_happy
## # A tibble: 67 x 3
## track_title line lyric
## <chr> <int> <chr>
## 1 Happy 1 <NA>
## 2 Happy 2 It might seem crazy what I'm 'bout to say
## 3 Happy 3 Sunshine she's here, you can take a break
## 4 Happy 4 I'm a hot air balloon that could go to space
## 5 Happy 5 With the air, like I don't care, baby, by the way
## 6 Happy 6 (Because I'm happy)
## 7 Happy 7 Clap along if you feel like a room without a roof
## 8 Happy 8 (Because I'm happy)
## 9 Happy 9 Clap along if you feel like happiness is the truth
## 10 Happy 10 (Because I'm happy)
## # … with 57 more rows
happy_tidy <- pharrell_will_happy %>%
  select(lyric, track_title) %>%
  unnest_tokens(word, lyric)
happy_tidy
## # A tibble: 473 x 2
## track_title word
## <chr> <chr>
## 1 Happy <NA>
## 2 Happy it
## 3 Happy might
## 4 Happy seem
## 5 Happy crazy
## 6 Happy what
## 7 Happy i'm
## 8 Happy bout
## 9 Happy to
## 10 Happy say
## # … with 463 more rows
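Note that the first lyric line of this song is NA, and it survives tokenising as an NA word. It matches nothing in the lexicon, so the join drops it anyway, but it does add a row to the token table. A minimal sketch, assuming you would rather remove such lines before tokenising (toy_lyrics is a made-up miniature of the tibble above):

```r
library(dplyr)
library(tibble)

# Made-up miniature of the lyrics tibble, with a missing first line.
toy_lyrics <- tibble(
  track_title = "Happy",
  line        = 1:3,
  lyric       = c(NA, "Because I'm happy", "Clap along")
)

# Drop lines with no lyric text before any further processing.
clean_lyrics <- toy_lyrics %>%
  filter(!is.na(lyric))
clean_lyrics
```

In the pipeline above, the filter() call would sit just before unnest_tokens().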
happy_sentiments <- happy_tidy %>%
  inner_join(get_sentiments("bing"), by = "word")
happy_sentiments
## # A tibble: 63 x 3
## track_title word sentiment
## <chr> <chr> <chr>
## 1 Happy crazy negative
## 2 Happy break negative
## 3 Happy hot positive
## 4 Happy like positive
## 5 Happy happy positive
## 6 Happy like positive
## 7 Happy happy positive
## 8 Happy like positive
## 9 Happy happiness positive
## 10 Happy happy positive
## # … with 53 more rows
happy_sentiments %>%
  count(sentiment, word) %>%
  ungroup() %>%
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col() +
  ylab("Contribution to sentiment") +
  coord_flip()
- Lana Del Rey - Summertime Sadness
Summertime_Sadness <- genius_lyrics(artist = "Lana Del Rey", song = "Summertime Sadness")
Summertime_Sadness
## # A tibble: 54 x 3
## track_title line lyric
## <chr> <int> <chr>
## 1 Summertime Sadness 1 Kiss me hard before you go
## 2 Summertime Sadness 2 Summertime sadness
## 3 Summertime Sadness 3 I just wanted you to know
## 4 Summertime Sadness 4 That, baby, you're the best
## 5 Summertime Sadness 5 I got my red dress on tonight
## 6 Summertime Sadness 6 Dancin' in the dark in the pale moonlight
## 7 Summertime Sadness 7 Done my hair up real big, beauty queen style
## 8 Summertime Sadness 8 High heels off, I'm feelin' alive
## 9 Summertime Sadness 9 Oh, my God, I feel it in the air
## 10 Summertime Sadness 10 Telephone wires above are sizzlin' like a snare
## # … with 44 more rows
Summertime_Sadness_tidy <- Summertime_Sadness %>%
  select(lyric, track_title) %>%
  unnest_tokens(word, lyric)
Summertime_Sadness_tidy
## # A tibble: 312 x 2
## track_title word
## <chr> <chr>
## 1 Summertime Sadness kiss
## 2 Summertime Sadness me
## 3 Summertime Sadness hard
## 4 Summertime Sadness before
## 5 Summertime Sadness you
## 6 Summertime Sadness go
## 7 Summertime Sadness summertime
## 8 Summertime Sadness sadness
## 9 Summertime Sadness i
## 10 Summertime Sadness just
## # … with 302 more rows
Summertime_Sadness_sentiments <- Summertime_Sadness_tidy %>%
  inner_join(get_sentiments("bing"), by = "word")
Summertime_Sadness_sentiments
## # A tibble: 39 x 3
## track_title word sentiment
## <chr> <chr> <chr>
## 1 Summertime Sadness hard negative
## 2 Summertime Sadness sadness negative
## 3 Summertime Sadness best positive
## 4 Summertime Sadness dark negative
## 5 Summertime Sadness pale negative
## 6 Summertime Sadness beauty positive
## 7 Summertime Sadness like positive
## 8 Summertime Sadness snare negative
## 9 Summertime Sadness hard negative
## 10 Summertime Sadness sadness negative
## # … with 29 more rows
Summertime_Sadness_sentiments %>%
  count(sentiment, word) %>%
  ungroup() %>%
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(word, n, fill = sentiment)) +
  geom_col() +
  ylab("Contribution to sentiment") +
  coord_flip()
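To compare songs rather than words, one option is a net score per song: the number of positive words minus the number of negative words. The sketch below uses a made-up table (toy_all and its song names are hypothetical) in the same shape as the sentiment tibbles above; it is one simple way to summarise, not the only one.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical combined sentiment table for two imaginary songs.
toy_all <- tibble(
  track_title = c("Song A", "Song A", "Song A", "Song B", "Song B"),
  word        = c("love", "sweet", "miss", "sadness", "best"),
  sentiment   = c("positive", "positive", "negative", "negative", "positive")
)

net_scores <- toy_all %>%
  count(track_title, sentiment) %>%                 # words per song and label
  pivot_wider(names_from = sentiment,
              values_from = n, values_fill = 0) %>% # one column per label
  mutate(net = positive - negative)                 # net sentiment per song
net_scores
```

With the real data, toy_all could be replaced by bind_rows(i_dont_sentiments, earned_it_sentiments, happy_sentiments, Summertime_Sadness_sentiments).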
This is a glimpse of sentiment analysis and its use. With a few steps, one can break a text down into its positive and negative words in a way the general public can interpret. Imagine the significance of sentiment analysis in social media monitoring, brand monitoring, market research and customer service.