creating a vector from a file
On Tue, 2011-05-31 at 16:19 +0200, heimat los wrote:
On Tue, May 31, 2011 at 4:12 PM, Matt Shotwell <matt at biostatmatt.com> wrote:
On Tue, 2011-05-31 at 15:36 +0200, heimat los wrote:
> Hello all,
> I am new to R and my question should be trivial. I need to create a word
> cloud from a txt file containing the words and their occurrence number. For
> that purpose I am using the snippets package [1].
> As can be seen at the bottom of the link, first I have to create a vector
> (is it right that words is a vector?) like below.
>
> > words <- c(apple=10, pie=14, orange=5, fruit=4)
>
> My problem is to do the same thing but create the vector from a file which
> would contain words and their occurrence number. I would be very happy if you
> could give me some hints.
How is the file formatted? Can you provide a small example?
The file format is
"video tape"=8
"object recognition"=45
"object detection"=23
"vhs tape"=2
But I can change it if needed with bash scripting.
A CSV might be more universal, but this will do.
Regards
OK. Save the above as 'words.txt', then from the R prompt:
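# Split each line at '='; the two columns are named V1 and V2 by default
words.df <- read.table("words.txt", sep="=", stringsAsFactors=FALSE)
# The counts become the vector's values, the words become its names
words.vec <- words.df$V2
names(words.vec) <- words.df$V1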
Then use words.vec with the snippets::cloud function. I wasn't able to
install the snippets package and test the cloud function, because I am
still using R 2.13.0-alpha.
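
For anyone who wants to try the steps without creating words.txt by hand, here is a minimal self-contained sketch. It writes the four sample lines from this thread to a temporary file first; the temporary path is only for illustration. The final call is untested here and assumes that snippets::cloud accepts a named numeric vector, as the help page linked in the original question suggests.

```r
# Sample data copied from the file format posted in this thread,
# written to a temporary file (the path is only for illustration)
path <- tempfile(fileext = ".txt")
writeLines(c('"video tape"=8',
             '"object recognition"=45',
             '"object detection"=23',
             '"vhs tape"=2'), path)

# Read the file, splitting each line at '='; the double quotes around
# the words are stripped by read.table's default quote handling
words.df <- read.table(path, sep = "=", stringsAsFactors = FALSE)

# Build the named vector: counts as values, words as names
words.vec <- words.df$V2
names(words.vec) <- words.df$V1
print(words.vec)

# Assumption: cloud() takes a named numeric vector; only attempted
# when the snippets package is actually installed
if (requireNamespace("snippets", quietly = TRUE)) {
  snippets::cloud(words.vec)
}
```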
read.table returns what R calls a 'data frame'; basically a collection
of records over some number of fields. It's like a matrix but different,
since fields may take values of different types. In the example above,
the data frame returned by read.table has two fields named 'V1' and
'V2', respectively. The R expression 'words.df$V2' references the 'V2'
field of words.df, which is a vector. The last expression sets names for
words.vec, by referencing the 'V1' field of words.df.
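
To see the data frame structure described above, the following sketch may help. It reads the same four sample lines through the text argument of read.table, so no file is needed, and then inspects the result. The use of setNames is an alternative not mentioned earlier in the thread; it should be equivalent to the two-step assignment.

```r
# Inline copy of the sample file; text= avoids needing a file on disk
words.df <- read.table(text = '"video tape"=8
"object recognition"=45
"object detection"=23
"vhs tape"=2', sep = "=", stringsAsFactors = FALSE)

str(words.df)            # shows the two fields, V1 and V2, and their types
is.data.frame(words.df)  # the object returned by read.table is a data frame
is.vector(words.df$V2)   # a single field extracted with $ is a plain vector

# setNames() attaches the names and returns the vector in one step
words.vec <- setNames(words.df$V2, words.df$V1)

# Individual counts can then be looked up by word
words.vec[["object recognition"]]
```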
> Moreover, to understand the format of the file to be read, I wrote the
> vector words to a file.
>
> > write(words, file="words.txt")
>
> However, the file words.txt contains only the values but not the
> names (apple, pie etc.).
>
> $ cat words.txt
> 10 14 5 4
>
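
The names are missing because write() outputs only the data values of a vector; the names are stored as an attribute and are not written. One possible way to keep both, sketched below, is to put names and values into a two-column data frame and write that as CSV. The column names 'word' and 'count' and the temporary file are illustrative choices, not something the snippets package requires.

```r
words <- c(apple = 10, pie = 14, orange = 5, fruit = 4)

# Place names and values side by side and write them as two CSV columns;
# row.names = FALSE avoids writing the names a second time
out <- tempfile(fileext = ".csv")
write.csv(data.frame(word = names(words), count = words),
          file = out, row.names = FALSE)

# Show what ended up in the file
cat(readLines(out), sep = "\n")

# Read the file back and rebuild the named vector
back <- read.csv(out, stringsAsFactors = FALSE)
words2 <- setNames(back$count, back$word)
print(words2)
```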
> It seems that I have to understand more about the data types in R.
>
> Thanks.
> PH
>
> [1] http://www.rforge.net/doc/packages/snippets/cloud.html