Message-ID: <a2119083-d7e6-406f-a464-6ba07d5568b6@z28g2000prd.googlegroups.com>
Date: 2009-01-09T15:39:21Z
From: Tony
Subject: [R] how to build TermDocMatrix in tm text mining package of R
In-Reply-To: <b040cbb00901090621n652657a0nc111ab06e7e7bafa@mail.gmail.com>

Hi there, I think something like the following is what you want:

### R start...
# if you put your plain text files in a folder like this
my.path <- 'C:\\Documents and Settings\\tony\\Desktop\\texts\\'

# then you can construct a simple tdm like this
library(tm)
my.corpus <- Corpus(DirSource(my.path),
                    readerControl = list(reader = readPlain))
my.tdm <- TermDocMatrix(my.corpus)

# this should show how words are distributed in the first text document
my.tdm[1, ]
### R end.
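
If it helps to see what such a matrix holds, here is a minimal sketch in
base R only. It is not how tm builds its matrix internally; the two toy
documents and the names docs, tokens, terms and counts are made up for
illustration, and the one-row-per-document layout is assumed so that it
matches the my.tdm[1, ] call above.

```r
# Toy documents standing in for the plain text files (made-up content).
docs <- c(doc1 = "the cat sat on the mat",
          doc2 = "the dog sat on the log")

# Split each document into lower-case words on any run of non-letters.
tokens <- strsplit(tolower(docs), "[^a-z]+")

# Every distinct term that occurs anywhere in the collection.
terms <- sort(unique(unlist(tokens)))

# Count each term per document: one row per document, one column per term.
counts <- t(sapply(tokens, function(words) {
  table(factor(words, levels = terms))
}))

# Word distribution of the first document ("the" occurs twice in doc1).
counts["doc1", ]
```

Here counts["doc1", ] plays the same role as my.tdm[1, ]: a vector of term
counts for one document, with zeros for terms that only occur elsewhere.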

By the way, there are some nice examples of using the tm package in
the most recent issue of R News (Volume 8/2, October 2008), under the
section 'An Introduction to Text Mining in R':
http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf

Hope that helps a little bit,
Tony Breyal

On 9 Jan, 14:21, "Kum-Hoe Hwang" <phdhw... at gmail.com> wrote:
> Howdy Gurus
>
> I'd like to ask a question about how to build a TermDocMatrix in the tm text
> mining package.
>
> It is not clear to me how to import a plain text file and then convert that
> text file into a TermDocMatrix.
> How can I build a TermDocMatrix from a plain text document file for text
> association?
> Or are there any good manuals?
>
> Thank you in advance,
>
> --
> Kum-Hoe Hwang, Ph.D.
>
> Phone : 82-31-250-3516
> Email : phdhw... at gmail.com
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-h... at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.