
Weighted descriptives by levels of another variable

Thanks!  Using the plyr package and the approach you outlined seems to  
work well for relatively simple functions (like wtd.mean), but so far  
I haven't had much success in using it with more complex descriptive  
functions like describe {Hmisc}.  I'll take a look later, though, and  
see if I can figure out why.

At any rate, ddply() looks like it will simplify writing a function  
that will allow for weighting data and subdividing it, but still give  
comprehensive summary statistics (i.e. not just the mean or quantiles,  
but all in one).  I'll post it to the list once I have the time to  
write it up.
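
As a rough idea of what such a function might look like, here is a minimal base-R sketch. The names wtd_summary, wtd_summary_by, toy and grp are made up for illustration, and the weighted variance uses no small-sample correction, so it will not necessarily match what Hmisc reports.

```r
# Hypothetical helper: several weighted statistics for one numeric vector.
wtd_summary <- function(x, w) {
  ok <- !is.na(x) & !is.na(w)          # drop cases missing a value or a weight
  x <- x[ok]
  w <- w[ok]
  m <- sum(w * x) / sum(w)             # weighted mean
  v <- sum(w * (x - m)^2) / sum(w)     # weighted variance, no correction (assumption)
  c(n = length(x), sum_w = sum(w), mean = m, sd = sqrt(v),
    min = min(x), max = max(x))
}

# Hypothetical wrapper: apply wtd_summary within each level of a grouping variable.
wtd_summary_by <- function(data, var, wt, by) {
  pieces <- split(data, data[[by]], drop = TRUE)
  t(sapply(pieces, function(d) wtd_summary(d[[var]], d[[wt]])))
}

# Toy data, invented for this example only.
toy <- data.frame(
  obese  = c(1, 0, 1, 0, 1, 1),
  sampwt = c(1, 3, 2, 1, 1, 2),
  grp    = c("a", "a", "a", "b", "b", "b")
)
res <- wtd_summary_by(toy, "obese", "sampwt", "grp")
print(res)
```

The result is a matrix with one row per group and one column per statistic. The same helper should also work as the function handed to ddply(), which would do the splitting instead of split().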

I also took a stab at using the svyby function in the survey package,  
but received the following error message when I entered:

 > svyby(cbind(educ, age), female, svynlsy, svymean)
Error in `[.survey.design2`(design, byfactor %in% byfactor[i], ) :
   (subscript) logical subscript too long
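
One possible cause, which I have not confirmed on the real data: svyby() is normally given formulas, so that the variables are looked up inside the design object. Bare names such as female refer to whatever objects of that name sit in the workspace, and those may not have the same length as the design. A small sketch of the formula form, using made-up data (the data frame nlsy and its values are invented here, and sampwt is assumed to be the weight variable):

```r
library(survey)

# Invented toy data; variable names follow the call above, values are arbitrary.
nlsy <- data.frame(
  educ   = c(12, 16, 14, 10, 18, 12),
  age    = c(30, 41, 35, 52, 28, 47),
  female = c(0, 1, 0, 1, 1, 0),
  sampwt = c(1.5, 2, 1, 0.5, 2.5, 1)
)

# Assumed design: no clustering, weights taken from sampwt.
svynlsy <- svydesign(ids = ~1, weights = ~sampwt, data = nlsy)

# Formula interface: both the analysis variables and the grouping
# variable are found in the design rather than in the workspace.
res <- svyby(~educ + age, ~female, svynlsy, svymean)
print(res)
```

This gives one row per level of female, with the weighted means of educ and age and their standard errors.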
__________________________________________________________
In addition to using the survey package (and the svyby function), I've
found that many of the 'weighted' functions, such as wtd.mean, work well
with the plyr package.  For example,

library(Hmisc)  # wtd.mean(), cut2()
library(plyr)   # ddply()
wtdmean <- function(df) wtd.mean(df$obese, df$sampwt)
ddply(mydata, ~cut2(age, c(2, 6, 12, 16)), wtdmean)

hth, david freedman
Andrew Miles-2 wrote: