how to transform a data file
4 messages · pat j, Jorge I Velez, Jeff Newmiller +1 more
library(reshape2)
# sample data because you didn't provide any
dta <- as.data.frame( matrix( sample( 0:1, 100, replace=TRUE ), ncol=10 ) )
dta <- cbind( IDN=1:10, dta )
# The command you couldn't figure out
meltdta <- melt( dta, "IDN" )
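To land on the exact layout asked for (two columns named id and c, grouped by id), the melted result still needs renaming and sorting. A minimal sketch of that follow-up, assuming reshape2 is installed; the small data frame and the column name day are made up for illustration and are not from the original file:

```r
library(reshape2)

set.seed(1)  # reproducible sample data

# Hypothetical stand-in for the real file: 10 non-contiguous ids, 5 days
dta <- data.frame(IDN = sort(sample(1000:1320, 10)),
                  matrix(sample(0:1, 50, replace = TRUE), ncol = 5,
                         dimnames = list(NULL, paste0("V", 1:5))))

# Melt to long form, naming the new columns directly
long <- melt(dta, id.vars = "IDN", variable.name = "day", value.name = "c")

# melt() stacks column by column, so sort by id and then by day
long <- long[order(long$IDN, long$day), ]
names(long)[names(long) == "IDN"] <- "id"
rownames(long) <- NULL

head(long)
```

Dropping the day column afterwards (long$day <- NULL) leaves just id and c, though keeping it is probably safer since it records which day each value belongs to.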
--
Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.
pat j <pjsleuth at gmail.com> wrote:
Hello R people,

I have a data file with 101 numeric variables: one variable called IDN (the individual's unique id number, which I need to retain, and which ranges from 1000 to 1320; some numbers are obviously skipped), and V1 to V100 (each has a value of 0 or 1; these 100 variables represent sequentially ordered days and whether a characteristic was present or absent--e.g., v1 is day 1 and a "1" means the characteristic is present; v10 is day 10 and "0" means the characteristic is absent).

This may be child's play for many on this list, but how do I transform this data file to two columns, one called "id" and another column named "c" with 100 rows? I think it will end up being a 1000 row file. I've read some and think that I'm trying to "melt" my existing data. I can transpose the v1 to v100 with t(v1 to v100) but then I'm unclear on how to automatically generate 100 identical IDN's for each case and variable and then put them together.

This may be redundant, but for the sake of clarity, what I'm trying to do is get from this:

IDN V1 V2 V3 ... V100
1 0 1 0 . . . 1
2 1 1 1 . . . 0
4 0 1 0 . . . 1
.
.
100 0 1 0 . . . 1

To this:

id c
1 0
1 1
1 0
. .
. [continue 96 more times for c4 - c100]
1 1
2 1
2 1
2 1
. .
. [continue 96 more times for c4 - c100]
2 0
.
.
.
[then repeat this for the next 98 cases]
100 0 1 0 1

Thank you very much.
PJ
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi

option 3

library(reshape)
melt(d, id.vars="id")

Regards
Petr
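For anyone without reshape or reshape2 installed, base R's reshape() can do a comparable wide-to-long conversion. This is only a sketch on made-up data; the ids, the seed, and the column names day and c are assumptions for illustration:

```r
set.seed(7)  # reproducible sample data

# Hypothetical wide data: 3 ids (with a gap), 4 days
d <- data.frame(id = c(10, 11, 13),
                matrix(sample(0:1, 12, replace = TRUE), ncol = 4,
                       dimnames = list(NULL, paste0("V", 1:4))))

long <- reshape(d, direction = "long",
                idvar   = "id",
                varying = names(d)[-1],          # V1..V4 hold the repeated measure
                v.names = "c",                   # name of the stacked value column
                timevar = "day",                 # name of the day index column
                times   = seq_len(ncol(d) - 1))

# reshape() returns all ids for day 1, then day 2, ...; regroup by id
long <- long[order(long$id, long$day), ]
rownames(long) <- NULL
long
```

The argument list is wordier than melt(), but it avoids the package dependency and gives a numeric day column rather than a factor of variable names.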
Hi PJ,
Try
# some data
id <- 1:20
m <- matrix(sample(0:1, 200, TRUE), ncol = 10)
colnames(m) <- paste('V', 1:10, sep = "")
d <- data.frame(id, m)
d
# option 1
cbind(rep(d$id, each = ncol(d)-1), matrix(unlist(t(d[,-1])), ncol = 1))
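# option 2
# stack() appends the columns one after another (all of V1, then all of V2, ...),
# so the ids must repeat as a whole block: use times, not each
cbind(rep(d$id, times = ncol(d) - 1), stack(d[,-1])[,-2])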
HTH,
Jorge.-
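Whichever route is used, it is easy to pair the values with the wrong ids, so a quick check of the long result against the wide data seems worthwhile. A small base-R sketch on made-up data (the ids, the seed, and the names n_days and ok are assumptions for illustration):

```r
set.seed(42)  # reproducible sample data

# Hypothetical wide data: 4 ids (with gaps), 3 days
d <- data.frame(id = c(1, 2, 4, 7),
                matrix(sample(0:1, 12, replace = TRUE), ncol = 3,
                       dimnames = list(NULL, paste0("V", 1:3))))

n_days <- ncol(d) - 1

# t() puts one id per column; as.vector() then reads column by column,
# so the values come out grouped by id, in day order
long <- data.frame(id = rep(d$id, each = n_days),
                   c  = as.vector(t(d[, -1])))

# Sanity check: every id's block must equal its row in the wide data
ok <- vapply(d$id, function(k) {
  all(long$c[long$id == k] == unlist(d[d$id == k, -1]))
}, logical(1))
stopifnot(all(ok))
```

If the values had been stacked column by column instead (as stack() or melt() do), the same check would only pass with the ids repeated via times rather than each.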
On Mon, Nov 28, 2011 at 10:19 PM, pat j wrote:
[quoted original message trimmed]