Message-ID: <50D2FD81.1030208@gmail.com>
Date: 2012-12-20T11:58:57Z
From: Duncan Murdoch
Subject: how to make a table of summary statistics
In-Reply-To: <CAG7aifKt9Pa3reQkcP3RJnkM_bnV=fBiQV4+VM4wATF-43KyqQ@mail.gmail.com>

On 12-12-20 6:45 AM, Francesco Sarracino wrote:
> Dear R-listers,
>
> I am a newbie with R and I am struggling with something I consider very
> basic. I wish to produce a table (to import in a latex file) of summary
> statistics, but for as much as I've been looking around and trying various
> alternatives (plyr, reporttools, pastecs and Hmisc) I haven't found what I
> am looking for. Probably I am doing something wrong, but I can't figure out
> what.
> Let's make up three simple variables:
>
> var1 <- runif(1000)
> var2 <- runif(1000)
> var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
>
> and let's create a dataset out of them:
> data <- data.frame(var1, var2, var3)
>
> what I'd like to get is a table such as the following one:
>
> variable mean sd min max obs missing
> var1
> var2
> var3
>
> where for each variable, I can read in line the mean, the standard
> deviation, the min and the max values, the number of observations and the
> percentage of missing data.
> Can you advise any way to achieve it?
> Thanks a lot in advance for your kind help,

I'm not sure what you want for var3: it doesn't make sense to calculate
the mean or sd for a factor. But for the other variables, using the
tables package, you can do

library(tables)

# Note: "missing" here is the count of NAs, not a percentage.
latex( tabular( Heading("variable")*(var1 + var2) ~
                (mean + sd + min + max + (obs=length) +
                 (missing=function(x) sum(is.na(x)))),
                data=data) )
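
If it helps to check the numbers outside of tables, roughly the same columns can be computed in base R. This is only a sketch: the helper name summarise_var is made up for illustration, the seed is added just for reproducibility, and unlike the call above it drops NAs before summarising and reports "missing" as a percentage, which is what the question asked for.

```r
# Rebuild the example data from the question (seed added for reproducibility).
set.seed(1)
var1 <- runif(1000)
var2 <- runif(1000)
var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
data <- data.frame(var1, var2, var3)

# Hypothetical helper: one row of summary statistics for a numeric vector.
summarise_var <- function(x) {
  c(mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    obs     = sum(!is.na(x)),          # non-missing observations
    missing = 100 * mean(is.na(x)))    # percentage of missing values
}

# Only the numeric variables; var3 is a factor and is left out.
num_vars <- c("var1", "var2")
stats <- t(sapply(data[num_vars], summarise_var))
print(round(stats, 3))
```

The result is an ordinary numeric matrix with one row per variable, so it can be inspected directly or handed to whatever table-formatting tool you prefer.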

You might want a breakdown of the summaries by var3; you'd get that this 
way:

latex( tabular( Heading("variable")*(var1 + var2)*var3 ~
                (mean + sd + min + max + (obs=length) +
                 (missing=function(x) sum(is.na(x)))),
                data=data) )
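
For a quick look at the per-group numbers without building the LaTeX table, base R's aggregate() gives a similar breakdown. Again just a sketch: the data are rebuilt from the question with an arbitrary seed, and the object names group_means and group_sds are only illustrative. Note that the formula interface of aggregate() omits rows containing NAs by default.

```r
# Example data as in the question (seed added for reproducibility).
set.seed(1)
data <- data.frame(var1 = runif(1000),
                   var2 = runif(1000),
                   var3 = factor(rep(1:2, 500), labels = c("m", "f")))

# Mean and standard deviation of var1 and var2 within each level of var3.
group_means <- aggregate(cbind(var1, var2) ~ var3, data = data, FUN = mean)
group_sds   <- aggregate(cbind(var1, var2) ~ var3, data = data, FUN = sd)

print(group_means)
print(group_sds)
```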

Duncan Murdoch