How to make word clouds of your data
- Download and load your data into R.
- Load libraries
- library(tm)
- library(snippets)
- library(SnowballC)  # stemmer used by current tm releases; older tm releases used Snowball
- You may want to create a subset of your data that contains only the chill, stress, or neutral comments
- chill <- subset(stresschill, stress_value > 0)
- How would you subset for stress?
- How would you handle “neutral” data (stress_value equals 0)?
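One possible answer to the two questions above, as a minimal sketch on a made-up data frame. The name stresschill_demo and its values are hypothetical, and the sketch assumes that stress is recorded as a negative stress_value; check how your own data encodes it before copying this.

```r
# Toy stand-in for the stresschill data frame (values invented for illustration).
stresschill_demo <- data.frame(
  comments = c("quiet study room", "exam tomorrow", "just walking by", "nice sunny lawn"),
  stress_value = c(2, -3, 0, 1),
  stringsAsFactors = FALSE
)

# Chill: positive values, same condition as the chill subset above.
chill_demo <- subset(stresschill_demo, stress_value > 0)

# Stress: ASSUMPTION that stress is stored as negative values.
stress_demo <- subset(stresschill_demo, stress_value < 0)

# Neutral: exactly zero, so test for equality with ==.
neutral_demo <- subset(stresschill_demo, stress_value == 0)

# Row counts per subset; together they should cover every row.
c(chill = nrow(chill_demo), stress = nrow(stress_demo), neutral = nrow(neutral_demo))
```

Whichever subset you pick, pass its comments column to the corpus step in place of chill$comments.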
- Create a corpus from the comments
- comments <- Corpus(VectorSource(chill$comments))
- Change all the text to lowercase
- comments <- tm_map(comments, content_transformer(tolower))  # tm 0.6 and later need the content_transformer() wrapper
- Remove stopwords (the, this, etc.)
- comments <- tm_map(comments, removeWords, stopwords())
- Remove punctuation
- comments <- tm_map(comments,removePunctuation)
- (Optional) Stem your document (remove endings like ing, ed, etc.)
- comments <- tm_map(comments,stemDocument)
- Create a document term matrix
- dtm = DocumentTermMatrix(comments,control=list(weighting=weightBin))
- If you want to see a list of terms sorted by frequency (with weightBin, each count is the number of comments that contain the term)
- head(sort(apply(dtm,2,sum),decreasing=TRUE),n=50)
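To see what the binary weighting means for the column sums, here is a base-R sketch that builds a tiny binary document-term matrix by hand. It is an illustration of the idea only, not how tm implements it; the names docs, tokens, terms, binary_dtm and doc_counts are made up for this example.

```r
# Three toy "comments"; note that "quiet" appears twice in the first one.
docs <- c("quiet quiet library", "quiet lawn", "exam tomorrow")

# Split each comment into words.
tokens <- strsplit(docs, " ", fixed = TRUE)

# Vocabulary: every distinct word across all comments.
terms <- sort(unique(unlist(tokens)))

# Rows = comments, columns = terms; 1 if the term occurs in the comment, else 0.
binary_dtm <- t(sapply(tokens, function(tok) as.integer(terms %in% tok)))
colnames(binary_dtm) <- terms

# Same summary as in the step above: column sums, largest first.
doc_counts <- sort(apply(binary_dtm, 2, sum), decreasing = TRUE)
head(doc_counts, n = 50)
```

Here "quiet" is counted once per comment that contains it, even though it occurs more often than that in the raw text. If you would rather have raw occurrence counts, leaving out the weighting option should give tm's default term-frequency weighting.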
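- Paste in the word cloud function:
make_cloud <- function(dtm, removeLessThan=1, zoom=FALSE){
  # Total weight of each term across all documents
  words <- apply(dtm, 2, sum)
  # Drop terms whose total is below removeLessThan
  words <- words[words >= removeLessThan]
  # Optional log scale so the most common terms do not dwarf the rest
  if (zoom)
    words <- log(words)
  # cloud() and col.bbr() come from the snippets package
  cloud(words, col = col.bbr(words, fit=TRUE))
}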
- Make a word cloud
- With all the terms
i. make_cloud(dtm)
- Only some terms and zoom in (if you don’t like the regular word cloud)
i. make_cloud(dtm, removeLessThan=4, zoom=TRUE)
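The effect of removeLessThan and zoom can be checked without drawing anything. The sketch below repeats the two filtering lines from make_cloud on a made-up named vector of counts (the numbers are invented) and compares relative sizes before and after the log.

```r
# Invented term counts standing in for apply(dtm, 2, sum).
words <- c(quiet = 40, library = 12, lawn = 4, exam = 3, tomorrow = 1)

# removeLessThan = 4: keep only terms with a count of at least 4.
kept <- words[words >= 4]

# zoom = TRUE: take logs, which compresses the gap between large and small counts.
zoomed <- log(kept)

# Size of each term relative to the largest one, without and with zoom.
round(kept / max(kept), 2)
round(zoomed / max(zoomed), 2)
```

After the log, the rarer terms sit much closer to the most common one, which is why the zoomed cloud looks more even. Note that a term with a count of 1 has a log of 0, so zoom=TRUE is best combined with removeLessThan of 2 or more.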