Why is removeSparseTerms() not doing anything?
Here's the code and the results. The corpus is the text version of a single book. (R version 3.2)
docs <- tm_map(docs, stemDocument)
dtm <- DocumentTermMatrix(docs)
freq <- colSums(as.matrix(dtm))
ord <- order(freq)
freq[tail(ord)]
   one experi   will    can  lucid  dream
   287    312    363    452   1018   2413
freq[head(ord)]
  abbey abdomin    abdu abraham  absent    abus
      1       1       1       1       1       1
dim(dtm)
[1] 1 5265
dtms <- removeSparseTerms(dtm, 0.1)
dim(dtms)
[1] 1 5265
dtms <- removeSparseTerms(dtm, 0.001)
dim(dtms)
[1] 1 5265
dtms <- removeSparseTerms(dtm, 0.9)
dim(dtms)
[1] 1 5265
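
For comparison, here is a minimal sketch on a toy corpus. The three short strings are invented for illustration and are not taken from the book, and the sketch assumes the tm package is installed. It builds the same text once as three documents and once collapsed into a single document, which is the shape of my matrix above (1 row, 5265 columns).

```r
library(tm)  # assumes the tm package is installed

# Invented toy text, purely for illustration (not from the book).
texts <- c("lucid dream", "lucid dream experi", "lucid abbey")

# Case 1: three documents, so a term can be absent from some of them.
dtm_many <- DocumentTermMatrix(VCorpus(VectorSource(texts)))
dim(dtm_many)
dim(removeSparseTerms(dtm_many, 0.5))

# Case 2: the same text collapsed into one document, like the book corpus.
one_doc <- paste(texts, collapse = " ")
dtm_one <- DocumentTermMatrix(VCorpus(VectorSource(one_doc)))
dim(dtm_one)
dim(removeSparseTerms(dtm_one, 0.5))
```

As far as I understand the documentation, sparsity is measured per term as the share of documents the term is missing from. If that is right, a one-document matrix can never have a sparse term, because every term in it occurs in that one document, and the threshold would then make no difference. Splitting the book into chapters or pages before building the corpus might be what is needed, but that is a guess on my part.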