Survey Package with Binary Data (no Standard Errors reported)

2 messages · Paul Jones, Thomas Lumley

Hi,

I'm trying to get standard errors for some of the variables in my data 
frame. One of the questions on my survey is whether faculty coordinate 
across curriculum to include Arts Education as subject matter. All the 
responses are coded in zeros and ones obviously. For some of the other 
variables I have a 2 for those that responded with "Don't Know".

I'm getting NA for the mean and standard error from svymean. Am I doing 
something wrong, or can the survey package not handle this type of data?

Here's my code.

 > survey <- svydesign(id=~1, data=General, strata=~Grade.Level)
Warning message:
In svydesign.default(id = ~1, data = General, strata = ~Grade.Level) :
  No weights or probabilities supplied, assuming equal probability

 > summary(survey)
Stratified Independent Sampling design (with replacement)
svydesign(id = ~1, data = General, strata = ~Grade.Level)
Probabilities:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1       1       1       1       1       1
Stratum Sizes:
           Elementary High Middle
obs               312  236    156
design.PSU        312  236    156
actual.PSU        312  236    156
Data variables:
 [1] "Grade.Level"                  "Curriculum"                 
 [3] "Field.Trips"                  "Residencies"                
 [5] "PTA.Support"                  "Community.Open.Performances"
 [7] "Visual.Arts.Attendance"       "Literary.Arts.Attendance"   
 [9] "Arts.Organization.Membership" "Arts.Essential"
          
 > svymean(~Curriculum, survey)
           mean SE
Curriculum   NA NA

???

PJ
3 days later
On Fri, 3 Apr 2009, Paul Jones wrote:

            
Are you sure you don't have any NA values in the data? If you do, the 
na.rm=TRUE option to svymean() will fix your problem. If you don't, 
something mysterious is happening. You could try svytable(~Curriculum, 
survey), which will give a tabulation and might show up what is strange 
about your data.
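
A minimal sketch of that check, assuming a made-up data frame `toy` in place of the poster's `General` (the values and the placement of the NAs are invented for illustration; only the `svydesign()` call mirrors the original post):

```r
library(survey)

# Hypothetical toy data standing in for `General`; two responses are missing
toy <- data.frame(
  Grade.Level = rep(c("Elementary", "High", "Middle"), each = 4),
  Curriculum  = c(1, 0, 1, NA, 0, 0, 1, 1, 1, NA, 0, 1)
)

# Same design as in the post; the equal-probability warning is expected
des <- suppressWarnings(
  svydesign(id = ~1, strata = ~Grade.Level, data = toy)
)

# Count missing responses before estimating anything
n_missing <- sum(is.na(toy$Curriculum))

# Without na.rm the missing values propagate into the estimate
m_na <- svymean(~Curriculum, des)

# With na.rm = TRUE the incomplete observations are dropped
m_ok <- svymean(~Curriculum, des, na.rm = TRUE)

# Weighted tabulation of the codes actually present in the variable
tab <- svytable(~Curriculum, des)

print(n_missing)
print(m_na)
print(m_ok)
print(tab)
```

If the tabulation shows codes other than 0 and 1 (for example a 2 for "Don't Know"), the mean is no longer a proportion, which may be worth checking for the other variables.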

As a separate issue, you might want to look at svyciprop() if some of your 
proportions are close to 0 or 1, to get better confidence intervals.
I don't see anything obviously wrong with your code.
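
For the svyciprop() suggestion, here is a small sketch under the same kind of assumptions: the data frame `toy2` and its variable `Rare` are hypothetical, chosen so that the proportion is close to 0, and `method = "beta"` is just one of the interval methods the function offers.

```r
library(survey)

# Hypothetical binary variable with a small proportion of ones (3 of 30)
toy2 <- data.frame(
  Grade.Level = rep(c("Elementary", "High", "Middle"), each = 10),
  Rare        = c(1, rep(0, 9), rep(0, 10), 1, 1, rep(0, 8))
)

des2 <- suppressWarnings(
  svydesign(id = ~1, strata = ~Grade.Level, data = toy2)
)

# Interval for the proportion of ones; unlike a mean +/- 2 SE interval,
# the bounds stay inside [0, 1]
ci <- svyciprop(~I(Rare == 1), des2, method = "beta")
print(ci)

# For comparison: the estimate and SE that svymean reports
print(svymean(~Rare, des2))
```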

 	-thomas
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle