How to make word clouds of your data

  1. Download and load your data into R.
  2. Load libraries
    1. library(tm)
    2. library(snippets)
    3. library(Snowball)
  3. You may want to create a subset of your data that contains only the chill, stress, or neutral rows
    1. chill <- subset(stresschill,stress_value>0)
    2. How would you subset for stress?
    3. How would you handle “neutral” data (stress_value equals 0)?
  4. Create a corpus from the comments
    1. comments <- Corpus(VectorSource(chill$comments))
  5. Change all the text to lowercase (in tm 0.6 and later, wrap the function as content_transformer(tolower))
    1. comments <- tm_map(comments,tolower)
  6. Remove stopwords (the, this, etc.)
    1. comments <- tm_map(comments,removeWords, stopwords())
  7. Remove punctuation
    1. comments <- tm_map(comments,removePunctuation)
  8. (Optional) Stem your document (remove endings like ing, ed, etc.)
    1. comments <- tm_map(comments,stemDocument)
  9. Create a document term matrix
    1. dtm <- DocumentTermMatrix(comments, control=list(weighting=weightBin))
  10. If you want to see a list of terms sorted by frequency (with weightBin, this is the number of comments that contain each term)
    1. head(sort(apply(dtm,2,sum),decreasing=TRUE),n=50)
  11. Paste in the word cloud method:

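make_cloud <- function(dtm, removeLessThan = 1, zoom = FALSE) {
   # Total weight of each term across all documents
   words <- apply(dtm, 2, sum)
   # Drop terms whose total is below the threshold
   words <- words[words >= removeLessThan]
   # Optionally log-scale so the largest terms dominate less
   if (zoom)
      words <- log(words)
   cloud(words, col = col.bbr(words, fit = TRUE))
}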
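
To see what removeLessThan and zoom do before drawing anything, you can try the filtering and log steps from make_cloud on a small named vector. This is only a base-R sketch: the counts are made up, and the helper name filter_counts is hypothetical (it is not part of tm or snippets).

```r
# Made-up term counts, standing in for apply(dtm, 2, sum)
words <- c(relax = 12, beach = 7, exam = 4, traffic = 2, nap = 1)

# Hypothetical helper: mirrors the first lines of make_cloud, without plotting
filter_counts <- function(words, removeLessThan = 1, zoom = FALSE) {
  words <- words[words >= removeLessThan]  # drop terms below the threshold
  if (zoom) words <- log(words)            # shrink the gap between big and small counts
  words
}

# Keep only terms that appear at least 4 times
kept <- filter_counts(words, removeLessThan = 4)
print(names(kept))

# Same filter, then log-scaled
zoomed <- filter_counts(words, removeLessThan = 4, zoom = TRUE)
print(round(zoomed, 2))
```

Because log(1) is 0, a term with a total of 1 ends up with a weight of zero when zoom=TRUE, so it makes sense to pair zoom=TRUE with a removeLessThan of 2 or more.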

  12. Make a word cloud
    1. With all the terms
      i. make_cloud(dtm)
    2. Only some terms and zoom in (if you don’t like the regular word cloud)
      i. make_cloud(dtm, removeLessThan=4, zoom=TRUE)
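
If you want to see what the cleaning and counting steps above are doing to the text, the same idea can be sketched in base R on a toy data set. This is an illustration under assumptions, not what tm does internally: the data frame stresschill_demo, its contents, and the short stopword list stop_demo are all made up for the example.

```r
# Made-up stand-in for the stresschill data frame
stresschill_demo <- data.frame(
  stress_value = c(2, 1, -1, 0, 3),
  comments = c("Relaxing at the beach!", "The beach was quiet.",
               "Stuck in traffic again", "Nothing special",
               "Reading, relaxing."),
  stringsAsFactors = FALSE
)

# Subset step: keep only the "chill" rows
chill <- subset(stresschill_demo, stress_value > 0)

# Lowercase and punctuation steps
text <- gsub("[[:punct:]]", "", tolower(chill$comments))

# Stopword step, with a tiny illustrative list (tm's stopwords() is much longer).
# setdiff also removes repeated words within a comment, which matches
# binary weighting: each term counts at most once per comment.
stop_demo <- c("the", "at", "was", "in")
tokens <- lapply(strsplit(text, "[[:space:]]+"),
                 function(w) setdiff(w, stop_demo))

# Counting step: number of comments that contain each term, highest first
doc_freq <- sort(table(unlist(tokens)), decreasing = TRUE)
print(head(doc_freq, n = 5))
```

With binary weighting, a word repeated ten times in one comment still counts once, so the cloud reflects how many comments mention a term rather than how often it is typed.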