
[R] how to build TermDocMatrix in tm text mining package of R

Hi there, I think something like the following is what you want:

### R start...
# if you put your plain text files in a folder like this
my.path <- 'C:\\Documents and Settings\\tony\\Desktop\\texts\\'

# then you can construct a simple tdm like this
library(tm)
my.corpus <- Corpus(DirSource(my.path),
                    readerControl = list(reader = readPlain))
my.tdm <- TermDocMatrix(my.corpus)

# this should show how words are distributed in the first text document
my.tdm[1, ]
### R end.
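
In case it helps to see what such a matrix actually holds, here is a rough base-R sketch that builds the same kind of table by hand, with one row per document and one column per term. It does not use tm at all and is not how tm builds its matrix internally; the two toy documents and the names docs, tokens, terms and counts are made up purely for illustration.

```r
# Toy documents standing in for the plain text files (made-up data).
docs <- c(doc1 = "the cat sat on the mat",
          doc2 = "the dog chased the cat")

# Split each document into lower-case words on whitespace.
tokens <- strsplit(tolower(docs), "\\s+")

# Vocabulary: every distinct word across all documents, sorted.
terms <- sort(unique(unlist(tokens)))

# Count each vocabulary term in each document.
# vapply() returns terms x documents, so transpose to documents x terms.
counts <- t(vapply(tokens,
                   function(words) as.integer(table(factor(words, levels = terms))),
                   integer(length(terms))))
dimnames(counts) <- list(names(docs), terms)

# Word distribution of the first document (analogous to my.tdm[1, ] above).
counts[1, ]
```

A real corpus would also need punctuation stripping, stop word removal and so on, which is the part tm takes care of for you.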

By the way, there are some nice examples of using the tm package in
the latest issue of R News (Volume 8/2, October 2008), in the article
'An Introduction to Text Mining in R':
http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf

Hope that helps a little bit,
Tony Breyal
On 9 Jan, 14:21, "Kum-Hoe Hwang" <phdhw... at gmail.com> wrote: