Creating data frame from existing data frame

3 messages · 2k3autococker, Peter Alspach, Dimitris Rizopoulos

#
I'm new to R, using it for an engineering stats class, and the first project
is focused on creating data frames and plotting graphs. So far I have
imported a set of data from a text file and saved it as a variable (using
the read.table() function). One of the columns of the data consists of
years, and I'm supposed to create a data frame that consists only of the
data from one given year (i.e., I need to scan the Year column, pick out the
specific year, and include those rows). I'm pretty sure I have to use
data.frame() to do this, but I don't know what arguments would pick out rows
from a particular year (and keep the headers of the columns intact in the
new data frame). Beyond this, I need to plot it and export it to an image
file, which I can do myself.

Can someone please tell me the arguments to data.frame() that would achieve
this (or better yet, point me to a good list of functions and arguments in
addition to this)?

Thanks for your help,

Josh
#
Josh

You want to subset the data, so try help(subset) or ?subset, and follow
the 'See Also' section, which refers to '['. You might also find the
introductory manual helpful. Most of the contributed documentation on
CRAN (the Comprehensive R Archive Network) covers this with examples.
I found Patrick Burns' 'S-poetry' particularly helpful when I started;
although I don't think he has been updating it, it is still relevant in
this context.

I hope this is more helpful than simply supplying the answer.

Peter Alspach
#
Well, first you could have a look at

?"[.data.frame"
?subset

and then check the following:

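# example data: 100 rows, years drawn at random from 2000, 2002, ..., 2008
dat <- data.frame(year = sample(seq(2000, 2008, 2), 100, replace = TRUE),
                  y = rnorm(100))

# rows from one year; both forms keep all columns and their names
subset(dat, year == 2002)
dat[dat$year == 2002, ]

# or, rows from every year after 2002
subset(dat, year > 2002)
dat[dat$year > 2002, ]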
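
For data imported with read.table(), as in the original question, the same
idea might look like the sketch below. The inline text stands in for the
text file, and the column names (Year, Load, Temp) and values are made up
for illustration; they are assumptions, not the actual class data. It also
shows one difference between the two forms that is worth knowing: when the
year column contains NA, '[' with a logical condition returns an extra row
of NAs, while subset() drops it.

```r
# Hypothetical stand-in for the imported text file (made-up names and values).
txt <- "Year Load Temp
2001 10.2 21
2002 11.5 19
2002 12.1 23
2003  9.8 NA
NA   10.0 20"

# header = TRUE takes the column names from the first line of the input.
dat <- read.table(text = txt, header = TRUE)

# subset() keeps only rows where the condition is TRUE; rows where it is NA
# are dropped. All columns and their names are kept.
one_year_a <- subset(dat, Year == 2002)

# '[' with a logical index keeps the matching rows, but an NA in the index
# produces an additional row filled with NA.
one_year_b <- dat[dat$Year == 2002, ]

# Wrapping the condition in which() removes the NA positions first, so the
# result matches subset().
one_year_c <- dat[which(dat$Year == 2002), ]

names(one_year_a)   # same column names as dat
nrow(one_year_a)    # 2
nrow(one_year_b)    # 3, because of the NA in Year
nrow(one_year_c)    # 2
```

With a real file, the first argument to read.table() would be the file name
instead of text = txt, and header = TRUE only applies if the file actually
has a header line.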


I hope it helps.

Best,
Dimitris