svytable and na's
4 messages · Thomas Lumley, Manderscheid Katharina
On Thu, May 19, 2011 at 11:41 PM, Manderscheid Katharina
<Katharina.Manderscheid at unilu.ch> wrote:
> hi,
> i am trying to work with the survey package in order to apply survey design weights. the data set i am using - ess - contains missing values.
> my question: when using svytable(~variable1+variable2, design=my.svydesign.object, na.rm=T) the resulting crosstable contains all missings although i defined the na's as such.
Could you give more details? There isn't an na.rm= argument to
svytable, so it's not surprising it has no effect, but I don't know
what you mean when you say the table "contains all missings." Perhaps
you could show us the output and say how it differs from what you
expected.
-thomas
Thomas Lumley Professor of Biostatistics University of Auckland
hi thomas
thanks for your reply. in the documentation of svytable, the argument na.rm=T is mentioned.
however, last night i figured out what went wrong in my tabulation: i had a dataset which i attached and then defined the missing values - of course they were not stored in the data set.
attach(data)
vote[vote=="Don't know"|vote=="Refusal"]<-NA
detach(data)
so they disappeared when i created a svydesign-object with weighted data and my table looked like this:
svytable(~gndr+vote, design=data.weight, na.rm=T)
           vote
gndr             Yes       No Not eligible to vote Refusal Don't know
  Male      453.8726 226.7600             154.1651  0.0000    10.8572
  Female    507.6368 302.3634             145.4426  0.0000    17.9157
  No answer   0.0000   0.0000               0.0000  0.0000     0.0000
           vote
gndr        No answer
  Male         0.0000
  Female       0.0000
  No answer    0.0000
thus the solution is to define the missings in the original data set by using data$variable before creating the svydesign-object.
data$vote[data$vote=="Don't know"|data$vote=="Refusal"]<-NA
then the svytable looks correct:
svytable(~gndr[drop=T]+vote[drop=T], design=data.weight, na.rm=T)
               vote[drop = T]
gndr[drop = T]      Yes       No Not eligible to vote
  Male         453.8726 226.7600             154.1651
  Female       507.6368 302.3634             145.4426
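The recoding step can be tried out on a small made-up data set. The sketch below is only an illustration: the data frame `dat`, its values, and the weight column `w` are invented, and base R's xtabs() with a weight on the left-hand side is used as a stand-in for svytable() so that it runs without the survey package. Calling droplevels() after the recode removes the emptied levels from the factor itself, so the formula should not need `[drop=T]`.

```r
# Toy data; `dat`, its values, and the weight column `w` are illustrative only.
dat <- data.frame(
  gndr = factor(c("Male", "Female", "Male", "Female", "Male")),
  vote = factor(c("Yes", "No", "Refusal", "Don't know", "Yes")),
  w    = c(1.5, 2.0, 0.5, 1.0, 1.0)
)

# Recode in the data frame itself, then drop the now-empty factor levels.
dat$vote[dat$vote %in% c("Don't know", "Refusal")] <- NA
dat$vote <- droplevels(dat$vote)

# Base-R weighted cross-tabulation, used here only as a stand-in for svytable().
# Rows with a missing `vote` are omitted by the default na.action.
tab <- xtabs(w ~ gndr + vote, data = dat)
print(tab)
```

Using %in% rather than == also avoids NA entries in the logical index when the column already contains missing values.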
or is there a way to define factor levels as na directly in an svydesign-object?
best and thanks
katharina
dr. katharina manderscheid
soziologisches seminar
universität luzern
kasernenplatz 3
6000 luzern 7
tel. ++41 41 228 4657
On Fri, May 20, 2011 at 7:54 PM, Manderscheid Katharina
<Katharina.Manderscheid at unilu.ch> wrote:
> hi thomas thanks for your reply. in the documentation of svytable, the argument na.rm=T is mentioned.
No, it isn't. The page says

Usage
## S3 method for class 'survey.design':
svytable(formula, design, Ntotal = NULL, round = FALSE,...)
## S3 method for class 'svyrep.design':
svytable(formula, design, Ntotal = sum(weights(design, "sampling")), round = FALSE,...)

There *is* an na.rm= argument to svychisq(), which is documented on the same help page.
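A likely reason the call with na.rm=T ran without an error is that both signatures end in `...`, which can absorb arguments a function does not declare. A minimal base-R sketch of that behaviour, using a made-up function `toy_table()` that is not part of the survey package:

```r
# Hypothetical stand-in for a function whose signature ends in `...`.
# This is NOT the survey package's code, only an illustration.
toy_table <- function(formula, data, round = FALSE, ...) {
  table(data[all.vars(formula)])
}

d <- data.frame(g = c("a", "a", "b"), v = c("x", NA, "y"))

with_flag    <- toy_table(~ g + v, d, na.rm = TRUE)  # na.rm is absorbed by `...`
without_flag <- toy_table(~ g + v, d)

# The undeclared argument changed nothing.
same_result <- identical(with_flag, without_flag)

# One way to check whether an argument is formally declared by a function.
declared <- "na.rm" %in% names(formals(toy_table))

print(same_result)
print(declared)
```

The same formals() check could be applied to a real function to see which arguments it declares by name.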
> however, last night i figured out what went wrong in my tabulation: i had a dataset which i attached and then defined the missing values - of course they were not stored in the data set.
> attach(data)
> vote[vote=="Don't know"|vote=="Refusal"]<-NA
> detach(data)
Yes, that's why attach() is usually a bad idea -- it's very easy to get confused that way.
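The confusion can be reproduced in a few lines of base R. In this sketch the data frame `dat` and its values are invented for illustration. Assigning to a variable that was found through attach() creates a new object in the current workspace and leaves the data frame column as it was.

```r
# Toy data frame; the name `dat` and its values are made up for illustration.
dat <- data.frame(vote = c("Yes", "No", "Refusal", "Don't know"),
                  stringsAsFactors = TRUE)

attach(dat)
# This assignment creates a NEW object `vote` in the current workspace;
# the column dat$vote is left untouched.
vote[vote %in% c("Don't know", "Refusal")] <- NA
detach(dat)

n_missing_after_attach <- sum(is.na(dat$vote))  # the data frame has no NAs
rm(vote)                                        # remove the stray copy

# Recoding through the data frame itself does change the stored column.
dat$vote[dat$vote %in% c("Don't know", "Refusal")] <- NA
n_missing_direct <- sum(is.na(dat$vote))

print(c(after_attach = n_missing_after_attach, direct = n_missing_direct))
```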
> or is there a way to define factor levels as na directly in an svydesign-object?
You can use update()
design <- update(design, vote=ifelse(vote %in% c("Don't know","Refusal"), NA, vote))
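One caveat when `vote` is a factor: ifelse() returns the underlying integer codes rather than a factor, so the level labels would be lost in the resulting table. The sketch below shows an alternative that keeps the labels, using replace() and droplevels() from base R. The toy values, the data frame `dat`, and the weight column `w` are assumptions made for illustration, and the survey part only runs if the package is installed.

```r
# Toy factor; the values are illustrative only.
vote <- factor(c("Yes", "No", "Refusal", "Don't know", "Yes"))

# ifelse() strips the factor attributes, so the result holds integer codes.
via_ifelse <- ifelse(vote %in% c("Don't know", "Refusal"), NA, vote)

# replace() keeps the factor; droplevels() removes the emptied levels.
via_replace <- droplevels(replace(vote, vote %in% c("Don't know", "Refusal"), NA))

print(class(via_ifelse))
print(levels(via_replace))

# The same expression inside update(), run only if survey is available.
if (requireNamespace("survey", quietly = TRUE)) {
  dat <- data.frame(
    gndr = factor(c("Male", "Female", "Male", "Female", "Male")),
    vote = vote,
    w    = c(1.5, 2.0, 0.5, 1.0, 1.0)  # made-up design weights
  )
  des <- survey::svydesign(ids = ~1, weights = ~w, data = dat)
  des <- update(des, vote = droplevels(
    replace(vote, vote %in% c("Don't know", "Refusal"), NA)))
  print(survey::svytable(~gndr + vote, design = des))
}
```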
-thomas
Thomas Lumley Professor of Biostatistics University of Auckland