Weighted descriptives by levels of another variable

2 messages · Andrew Miles, Karl Ove Hufthammer

#
Thanks!  Using the plyr package and the approach you outlined seems to  
work well for relatively simple functions (like wtd.mean), but so far  
I haven't had much success in using it with more complex descriptive  
functions like describe {Hmisc}.  I'll take a look later, though, and  
see if I can figure out why.

At any rate, ddply() looks like it will simplify writing a function  
that will allow for weighting data and subdividing it, but still give  
comprehensive summary statistics (i.e. not just the mean or quantiles,  
but all in one).  I'll post it to the list once I have the time to  
write it up.
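
As a rough sketch of what such a function might look like: the version below uses only base R (split() and lapply()) rather than ddply(), so that it runs without extra packages. The helper name wtd_describe_by, its arguments, and the toy data are hypothetical, and the weighted standard deviation applies no small-sample correction.

```r
# Hypothetical helper: weighted summary statistics for one numeric variable,
# computed separately within each level of a grouping variable.
wtd_describe_by <- function(data, var, weight, group) {
  pieces <- split(data, data[[group]], drop = TRUE)
  rows <- lapply(names(pieces), function(level) {
    piece <- pieces[[level]]
    # Drop rows where either the value or its weight is missing.
    keep <- !is.na(piece[[var]]) & !is.na(piece[[weight]])
    x <- piece[[var]][keep]
    w <- piece[[weight]][keep]
    m <- sum(w * x) / sum(w)            # weighted mean
    v <- sum(w * (x - m)^2) / sum(w)    # weighted variance, uncorrected
    data.frame(group = level, n = length(x), sum_wt = sum(w),
               wtd_mean = m, wtd_sd = sqrt(v), min = min(x), max = max(x))
  })
  do.call(rbind, rows)
}

# Toy data; all values are made up for illustration.
toy <- data.frame(
  obese  = c(0, 1, 1, 0, 1, 0),
  sampwt = c(1, 2, 1, 3, 1, 2),
  sex    = c("f", "f", "f", "m", "m", "m")
)
result <- wtd_describe_by(toy, "obese", "sampwt", "sex")
print(result)
```

The same per-piece function could presumably be handed to ddply() instead of the split()/lapply() pair, since it returns a one-row data frame for each group.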

I also took a stab at using the svyby function in the survey package,  
but received the following error message when I entered:

 > svyby(cbind(educ, age), female, svynlsy, svymean)
Error in `[.survey.design2`(design, byfactor %in% byfactor[i], ) :
   (subscript) logical subscript too long
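
The cause of that error is not clear from the message alone. One thing that may be worth trying is svyby's formula interface, which looks the variables up inside the design object instead of in the workspace. The sketch below uses made-up toy data and a hypothetical design called toy_design, built with no clustering; the real svynlsy design will differ.

```r
library(survey)

# Toy data standing in for the real survey extract (all values are made up).
toy <- data.frame(
  educ   = c(12, 16, 14, 10, 18, 12, 13, 16),
  age    = c(25, 31, 28, 40, 35, 22, 29, 33),
  female = c(0, 1, 0, 1, 1, 0, 1, 0),
  sampwt = c(1.0, 2.0, 1.5, 0.5, 1.0, 2.5, 1.0, 0.5)
)

# Assumed design: no clustering, weights taken from the sampwt column.
toy_design <- svydesign(ids = ~1, weights = ~sampwt, data = toy)

# Both the analysis variables and the grouping factor are given as formulas,
# so they are evaluated against the data stored in the design.
by_sex <- svyby(~educ + age, ~female, toy_design, svymean)
print(by_sex)
```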
__________________________________________________________
In addition to using the survey package (and the svyby function), I've
found that many of the 'weighted' functions, such as wtd.mean, work well
with the plyr package.  For example,

library(Hmisc); library(plyr)  # wtd.mean() and cut2() are from Hmisc, ddply() from plyr
wtdmean <- function(df) wtd.mean(df$obese, df$sampwt)
ddply(mydata, ~cut2(age, c(2, 6, 12, 16)), 'wtdmean')

hth, david freedman
Andrew Miles-2 wrote:
#
On Mon, 16 Nov 2009 10:43:38 -0500 Andrew Miles <rstuff.miles at gmail.com> 
wrote:
'describe' outputs a list, not just a vector. To get the actual values 
as vectors, you have to extract them, e.g.:

describe(x)$counts
describe(x)$values
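
A small sketch of the extraction step, with weights supplied through describe's weights argument. The vectors x and w are made up, and the exact contents and types of the components may vary between Hmisc versions, so the code only inspects them rather than assuming a particular layout.

```r
library(Hmisc)

# Toy values and sampling weights (made up for illustration).
x <- c(2, 4, 4, 5, 7, 9)
w <- c(1, 2, 1, 1, 3, 2)

d <- describe(x, weights = w)

# The result is a list with class "describe"; list its components first.
print(names(d))

# Pull out the summary counts (n, missing, and so on) as a named vector.
counts <- d$counts
print(counts)
```

A function wrapped around this, returning only the extracted vector, is the kind of thing that could then be passed to ddply().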