Hi,
Try:
?split()
source("http://www.openintro.org/stat/data/cdc.R")
str(cdc)
#'data.frame':   20000 obs. of  9 variables:
# $ genhlth : Factor w/ 5 levels "excellent","very good",..: 3 3 3 3 2 2 2 2 3 3 ...
# $ exerany : num  0 0 1 1 0 1 1 0 0 1 ...
# $ hlthplan: num  1 1 1 1 1 1 1 1 1 1 ...
# $ smoke100: num  0 1 1 0 0 0 0 0 1 0 ...
# $ height  : num  70 64 60 66 61 64 71 67 65 70 ...
# $ weight  : int  175 125 105 132 150 114 194 170 150 180 ...
# $ wtdesire: int  175 115 105 124 130 114 185 160 130 170 ...
# $ age     : int  77 33 49 42 55 55 31 45 27 44 ...
# $ gender  : Factor w/ 2 levels "m","f": 1 2 2 2 2 2 1 1 2 1 ...
cdc$genhlth<- as.character(cdc$genhlth)
cdclst1<- split(cdc,cdc$genhlth)
lapply(cdclst1,head,2)
#$excellent
#     genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#11 excellent       1        1        1     69    186      175  46      m
#13 excellent       1        0        1     66    185      220  21      m
#
#$fair
#   genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#12    fair       1        1        1     69    168      148  62      m
#15    fair       1        0        0     69    170      170  23      m
#
#$good
#  genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#1    good       0        1        0     70    175      175  77      m
#2    good       0        1        1     64    125      115  33      f
#
#$poor
#   genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#53    poor       1        1        1     62    140      130  64      f
#79    poor       1        1        0     63    142      120  52      f
#
#$`very good`
#    genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#5 very good       0        1        0     61    150      130  55      f
#6 very good       1        1        0     64    114      114  55      f
sapply(cdclst1,nrow)
#excellent      fair      good      poor very good
#     4657      2019      5675       677      6972
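As a quick sanity check, the per-group row counts from split() should agree with table() on the grouping column. A minimal sketch on a made-up toy data frame (`toy`, `toylst` and the values are hypothetical stand-ins, not the cdc data):

```r
# Hypothetical toy data standing in for cdc
toy <- data.frame(
  genhlth = c("good", "poor", "good", "very good", "poor", "good"),
  age     = c(77, 33, 49, 42, 55, 31),
  stringsAsFactors = FALSE
)

# One data frame per distinct value of genhlth
toylst <- split(toy, toy$genhlth)

# Row counts per group, computed two ways
counts_split <- sapply(toylst, nrow)
counts_table <- table(toy$genhlth)

# Both use the same group order, so they can be compared element-wise
all(counts_split == as.vector(counts_table))
```

If the two disagree on real data, missing values in the grouping column are a likely cause to look at first.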
cdcGood<-cdclst1[["good"]]
str(cdcGood)
#'data.frame':   5675 obs. of  9 variables:
# $ genhlth : chr  "good" "good" "good" "good" ...
# $ exerany : num  0 0 1 1 0 1 1 0 1 1 ...
# $ hlthplan: num  1 1 1 1 1 1 1 0 1 1 ...
# $ smoke100: num  0 1 1 0 1 0 1 1 1 1 ...
# $ height  : num  70 64 60 66 65 70 73 67 75 65 ...
# $ weight  : int  175 125 105 132 150 180 185 156 200 160 ...
# $ wtdesire: int  175 115 105 124 130 170 175 150 190 140 ...
# $ age     : int  77 33 49 42 27 44 79 47 43 54 ...
# $ gender  : Factor w/ 2 levels "m","f": 1 2 2 2 2 1 1 1 1 2 ...
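If you really want one variable per group rather than a list, one possible approach is list2env(). This is only a sketch on hypothetical toy data (`toy`, `toylst`, `grp_env` and the "toy_" prefix are made up for illustration); keeping the list and indexing it by name is usually easier to work with.

```r
# Hypothetical toy data standing in for cdc
toy <- data.frame(
  genhlth = c("good", "poor", "good", "very good", "poor", "good"),
  age     = c(77, 33, 49, 42, 55, 31),
  stringsAsFactors = FALSE
)
toylst <- split(toy, toy$genhlth)

# Turn group labels into valid variable names, e.g. "very good" -> "toy_very.good"
names(toylst) <- paste0("toy_", make.names(names(toylst)))

# Create one variable per group in a dedicated environment
# (use envir = globalenv() instead to create them in the workspace)
grp_env <- new.env()
list2env(toylst, envir = grp_env)

ls(grp_env)
nrow(grp_env$toy_good)
```

This works for any number of groups and any labels, since the variable names are derived from the data.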
A.K.
Hi, I am trying to figure out how to subset a bunch of data. As an example I am using the cdc data from openintro.org. In the first column, named "genhlth", there are various
options that a person could respond with, for example "good", "very good" and "poor". What I would like to do is separate the data so that everyone who answered "good" is stored in one variable and everyone who answered "poor" is in another variable.
Now I know I could just do subset(cdc, cdc$genhlth == "poor") to
get the poor group, but I would really like code that separates the data into each group, regardless of what the text or the number of groups is.
Can anyone give me a hint?