
subset with non logical rules

2 messages · arun, sisseck.net

Hi,
Try:
?split()

source("http://www.openintro.org/stat/data/cdc.R")
str(cdc)
#'data.frame':	20000 obs. of  9 variables:
# $ genhlth : Factor w/ 5 levels "excellent","very good",..: 3 3 3 3 2 2 2 2 3 3 ...
# $ exerany : num  0 0 1 1 0 1 1 0 0 1 ...
# $ hlthplan: num  1 1 1 1 1 1 1 1 1 1 ...
# $ smoke100: num  0 1 1 0 0 0 0 0 1 0 ...
# $ height  : num  70 64 60 66 61 64 71 67 65 70 ...
# $ weight  : int  175 125 105 132 150 114 194 170 150 180 ...
# $ wtdesire: int  175 115 105 124 130 114 185 160 130 170 ...
# $ age     : int  77 33 49 42 55 55 31 45 27 44 ...
# $ gender  : Factor w/ 2 levels "m","f": 1 2 2 2 2 2 1 1 2 1 ...
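# Convert the factor to character (optional; split() also accepts a factor)
cdc$genhlth <- as.character(cdc$genhlth)
# One data frame per distinct value of genhlth
cdclst1 <- split(cdc, cdc$genhlth)
# First two rows of each group
lapply(cdclst1, head, 2)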
#$excellent
#     genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#11 excellent       1        1        1     69    186      175  46      m
#13 excellent       1        0        1     66    185      220  21      m
#
#$fair
#   genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#12    fair       1        1        1     69    168      148  62      m
#15    fair       1        0        0     69    170      170  23      m
#
#$good
#  genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#1    good       0        1        0     70    175      175  77      m
#2    good       0        1        1     64    125      115  33      f
#
#$poor
#   genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#53    poor       1        1        1     62    140      130  64      f
#79    poor       1        1        0     63    142      120  52      f
#
#$`very good`
#    genhlth exerany hlthplan smoke100 height weight wtdesire age gender
#5 very good       0        1        0     61    150      130  55      f
#6 very good       1        1        0     64    114      114  55      f


sapply(cdclst1,nrow)
#excellent      fair      good      poor very good
#     4657      2019      5675       677      6972

cdcGood<-cdclst1[["good"]]
str(cdcGood)
#'data.frame':	5675 obs. of  9 variables:
# $ genhlth : chr  "good" "good" "good" "good" ...
# $ exerany : num  0 0 1 1 0 1 1 0 1 1 ...
# $ hlthplan: num  1 1 1 1 1 1 1 0 1 1 ...
# $ smoke100: num  0 1 1 0 1 0 1 1 1 1 ...
# $ height  : num  70 64 60 66 65 70 73 67 75 65 ...
# $ weight  : int  175 125 105 132 150 180 185 156 200 160 ...
# $ wtdesire: int  175 115 105 124 130 170 175 150 190 140 ...
# $ age     : int  77 33 49 42 27 44 79 47 43 54 ...
# $ gender  : Factor w/ 2 levels "m","f": 1 2 2 2 2 1 1 1 1 2 ...
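For trying this without downloading anything, here is a minimal sketch of the same split() approach on a small made-up data frame. The name `survey` and its values are assumptions for illustration only, not part of the cdc data.

```r
# Toy data standing in for cdc; the column values here are made up.
survey <- data.frame(
  genhlth = c("good", "poor", "very good", "good", "poor", "good"),
  weight  = c(175, 140, 150, 125, 142, 132),
  stringsAsFactors = FALSE
)

# One list element per distinct value of genhlth, however many there are.
groups <- split(survey, survey$genhlth)

# Group labels and the number of rows in each group.
names(groups)
sapply(groups, nrow)

# Pull out a single group by its label.
groups[["poor"]]
```

Nothing in the split() call names the groups explicitly, so the same line should work unchanged if new answer categories appear in the data.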

A.K.
> options that the persons could respond. For example "good", "very good"
> and "poor". Now what I would like to do is to separate the data so that
> everyone who answered good is stored in one variable and everyone who
> answered poor is in another variable.
> get the poor, but would really like code that would separate the data
> into each group, regardless of what the text or the number of groups
> are.
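If separate variables are really wanted rather than one list, a possible sketch is to pass the result of split() to list2env(). The toy data frame `survey` and the environment name `grp_env` are hypothetical. Keeping the groups in the list is often easier to work with, so treat this as one option rather than the recommended way.

```r
# Hypothetical toy data; not the cdc data set.
survey <- data.frame(
  genhlth = c("good", "poor", "very good", "good"),
  age     = c(77, 64, 55, 33),
  stringsAsFactors = FALSE
)

groups <- split(survey, survey$genhlth)

# "very good" is not a syntactic R name, so convert the labels first;
# make.names() turns it into "very.good".
names(groups) <- make.names(names(groups))

# Create one variable per group in a fresh environment.
# Using envir = globalenv() instead would put them in the workspace.
grp_env <- list2env(groups, envir = new.env())

ls(grp_env)
grp_env$very.good
```

The number of variables created follows the number of distinct answers in the data, so no group label has to be typed by hand.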