beginner programming question
On Wed, 17 Dec 2003, Tony Plate wrote:
Another way to approach this is to first massage the data into a more regular format. This may or may not be simpler or faster than other solutions suggested.
You could also use the reshape() command to do the massaging -thomas
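
A minimal sketch of what the reshape() route could look like on the toy data from the original question; this is not necessarily the exact call Thomas had in mind. The names member and household are chosen here for illustration only, and x is rebuilt by hand so the block runs on its own.

```r
# Toy data from the original question, typed in by hand
x <- data.frame(
  rel1 = c(1, 4, 1, 4),     rel2 = c(3, 1, 4, NA),   rel3 = c(NA, 3, 4, NA),
  age0 = c(25, 35, 39, 45), age1 = c(23, 67, 40, 70),
  age2 = c(2, 34, 59, NA),  age3 = c(NA, 10, 60, NA),
  sex0 = c(1, 2, 1, 2),     sex1 = c(2, 2, 2, 2),
  sex2 = c(1, 1, 2, NA),    sex3 = c(NA, 2, 1, NA)
)

# Stack relatives 1..3 into one row per (household, member).
# "member" and "household" are illustrative names, not from the thread.
long <- reshape(x, direction = "long",
                varying = list(c("rel1", "rel2", "rel3"),
                               c("age1", "age2", "age3"),
                               c("sex1", "sex2", "sex3")),
                v.names = c("rel", "age", "sex"),
                timevar = "member", times = 1:3,
                idvar   = "household")

# Keep only the rows where the relative is the spouse (rel == 1);
# which() drops the rows where rel is NA
spouses <- long[which(long$rel == 1), ]
```

The respondent's own columns (age0, sex0) are not listed in varying, so they are carried along unchanged on every row, which is the same layout as the hand-built xx in Tony's message.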
> x <- read.table("clipboard", header=T)
> x
  rel1 rel2 rel3 age0 age1 age2 age3 sex0 sex1 sex2 sex3
1    1    3   NA   25   23    2   NA    1    2    1   NA
2    4    1    3   35   67   34   10    2    2    1    2
3    1    4    4   39   40   59   60    1    2    2    1
4    4   NA   NA   45   70   NA   NA    2    2   NA   NA
> nn <- c("rel","age0","age","sex0","sex")
> xx <- rbind("colnames<-"(x[,c("rel1","age0","age1","sex0","sex1")], nn),
+ "colnames<-"(x[,c("rel2","age0","age2","sex0","sex2")], nn),
+ "colnames<-"(x[,c("rel3","age0","age3","sex0","sex3")], nn))
> xx
   rel age0 age sex0 sex
1    1   25  23    1   2
2    4   35  67    2   2
3    1   39  40    1   2
4    4   45  70    2   2
11   3   25   2    1   1
21   1   35  34    2   1
31   4   39  59    1   2
41  NA   45  NA    2  NA
12  NA   25  NA    1  NA
22   3   35  10    2   2
32   4   39  60    1   1
42  NA   45  NA    2  NA
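
For readers unfamiliar with the form used above: "colnames<-"(df, nn) calls the replacement function directly, so it returns a renamed copy that can sit inside a larger expression. A small sketch comparing it with setNames() from the stats package, which does the same job for a data frame. The helper pick() is a hypothetical name introduced here, and x is rebuilt from the toy example so the block runs on its own.

```r
# Toy data from the original question, typed in by hand
x <- data.frame(
  rel1 = c(1, 4, 1, 4),     rel2 = c(3, 1, 4, NA),   rel3 = c(NA, 3, 4, NA),
  age0 = c(25, 35, 39, 45), age1 = c(23, 67, 40, 70),
  age2 = c(2, 34, 59, NA),  age3 = c(NA, 10, 60, NA),
  sex0 = c(1, 2, 1, 2),     sex1 = c(2, 2, 2, 2),
  sex2 = c(1, 1, 2, NA),    sex3 = c(NA, 2, 1, NA)
)
nn <- c("rel", "age0", "age", "sex0", "sex")

# Form used in the message: call the replacement function directly
xx_a <- rbind("colnames<-"(x[, c("rel1", "age0", "age1", "sex0", "sex1")], nn),
              "colnames<-"(x[, c("rel2", "age0", "age2", "sex0", "sex2")], nn),
              "colnames<-"(x[, c("rel3", "age0", "age3", "sex0", "sex3")], nn))

# Sketch of the same thing with setNames(); pick() is a hypothetical helper
pick <- function(i) {
  cols <- c(paste0("rel", i), "age0", paste0("age", i), "sex0", paste0("sex", i))
  setNames(x[, cols], nn)
}
xx_b <- do.call(rbind, lapply(1:3, pick))
```

Both forms produce the same stacked data frame; which one reads better is a matter of taste. Row names in a current R session may differ from the ones printed in the transcript.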
> rbind(subset(xx, xx$rel==1 & (xx$sex0==1 | xx$sex0==xx$sex))[,c("age0","age")],
+       subset(xx, xx$rel==1 & xx$sex==1 & xx$sex0!=xx$sex)[,c("age","age0")])
   age0 age
1    25  23
3    39  40
21   35  34
>
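
One caveat about the last step, offered as a sketch and not as a replacement for the approach above: rbind() on data frames matches columns by name, not by position, so selecting c("age","age0") in the second subset does not actually swap the two columns. That is why row 21 prints as 35 34, while the original question asks for husband first (34 35). Building the two output columns explicitly avoids the problem. The names ageh and agew come from the original question; sp and swap are illustrative names, and x is rebuilt by hand so the block runs on its own.

```r
# Toy data from the original question, typed in by hand
x <- data.frame(
  rel1 = c(1, 4, 1, 4),     rel2 = c(3, 1, 4, NA),   rel3 = c(NA, 3, 4, NA),
  age0 = c(25, 35, 39, 45), age1 = c(23, 67, 40, 70),
  age2 = c(2, 34, 59, NA),  age3 = c(NA, 10, 60, NA),
  sex0 = c(1, 2, 1, 2),     sex1 = c(2, 2, 2, 2),
  sex2 = c(1, 1, 2, NA),    sex3 = c(NA, 2, 1, NA)
)
nn <- c("rel", "age0", "age", "sex0", "sex")
xx <- rbind("colnames<-"(x[, c("rel1", "age0", "age1", "sex0", "sex1")], nn),
            "colnames<-"(x[, c("rel2", "age0", "age2", "sex0", "sex2")], nn),
            "colnames<-"(x[, c("rel3", "age0", "age3", "sex0", "sex3")], nn))

# Spouse rows only; subset() treats NA in the condition as FALSE
sp <- subset(xx, rel == 1)

# Same test as the second subset() in the message: the relative is male
# and the respondent is not, so the relative's age goes first
swap <- sp$sex == 1 & sp$sex0 != sp$sex

res <- data.frame(ageh = ifelse(swap, sp$age,  sp$age0),
                  agew = ifelse(swap, sp$age0, sp$age))
```

On the toy data this gives the three couples with the husband's age in ageh and the wife's in agew, in the row order of xx and not in household order.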
hope this helps,

Tony Plate

PS. To advanced R users: Is the above usage of the "colnames<-" function within an expression regarded as acceptable or as undesirable programming style? -- I've rarely seen it used, but it can be quite useful.

At Wednesday 09:28 PM 12/17/2003 +0200, Adrian Dusa wrote:
Hi all,

The last e-mails about beginners gave me the courage to post a question; from a beginner's perspective, there are a lot of questions that I'm tempted to ask. But I'm trying to find the answers in the documentation, in the 15 or so free books I have, or in the help archives (I have often found similar questions posted in the past).

Being a (still current) user of SPSS, I'd like to be able to do everything in R. I've learned that the best way of doing it is to struggle and find a solution no matter what, refraining from doing it with SPSS. I've become more and more aware of the almost unlimited possibilities that R offers and I'd like to switch completely to R whenever I think I'm ready.

I have a (rather theoretical) programming problem for which I have found a solution, but I feel it is a rather poor one. I wonder if there's some other (more clever) solution, using (maybe?) vectorization or subscripting.

A toy example would be:

rel1 rel2 rel3 age0 age1 age2 age3 sex0 sex1 sex2 sex3
   1    3   NA   25   23    2   NA    1    2    1   NA
   4    1    3   35   67   34   10    2    2    1    2
   1    4    4   39   40   59   60    1    2    2    1
   4   NA   NA   45   70   NA   NA    2    2   NA   NA

where rel1...3 states the kinship with the respondent (person 0): code 1 means husband/wife, code 4 means parent and code 3 means child.

I would like to get the age of husbands (code 1) in a first column and the wife's age in the second:

ageh agew
  25   23
  34   35
  39   40

My solution uses *for* loops and *if*s, checking for code 1 in each element of the first 3 columns, then checking in the last three columns for the husband's code, then putting the corresponding age into a new matrix.

I've learned that *for* loops are very slow (and indeed, with my dataset of some 2000 rows and 13 columns for kinship, it takes quite a long time). I found the "Looping" chapter in "S poetry" very useful (it did save me from *for* loops a couple of times, thanks!).

Any hints would be appreciated,
Adrian

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Adrian Dusa (adi at roda.ro)
Romanian Social Data Archive (www.roda.ro <http://www.roda.ro/> )
1, Schitu Magureanu Bd.
76625 Bucharest sector 5
Romania
Tel./Fax: +40 (21) 312.66.18\
          +40 (21) 312.02.10/ int.101
______________________________________________ R-help at stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
Thomas Lumley Assoc. Professor, Biostatistics tlumley at u.washington.edu University of Washington, Seattle