how to make a table of summary statistics
On 12-12-20 6:45 AM, Francesco Sarracino wrote:
Dear R-listers,
I am a newbie with R and I am struggling with something I consider very
basic. I wish to produce a table of summary statistics (to import into a
LaTeX file), but despite looking around and trying various alternatives
(plyr, reporttools, pastecs and Hmisc) I haven't found what I am looking
for. I am probably doing something wrong, but I can't figure out what.
Let's make up three simple variables:
var1 <- runif(1000)
var2 <- runif(1000)
var3 <- factor(rep(1:2, 500), labels = c("m", "f"))
and let's create a dataset out of them:
data <- data.frame(var1, var2, var3)
what I'd like to get is a table such as the following one:
variable mean sd min max obs missing
var1
var2
var3
where, for each variable, the row gives the mean, the standard
deviation, the min and max values, the number of observations and the
percentage of missing data.
Can you advise any way to achieve it?
Thanks a lot in advance for your kind help,
I'm not sure what you want for var3: it doesn't make sense to calculate
the mean or sd for a factor. But for the other variables, using the
tables package, you can do
library(tables)
latex( tabular( Heading("variable")*(var1 + var2) ~
    (mean + sd + min + max + (obs=length) +
     (missing=function(x) sum(is.na(x)))), data=data) )
You might want a breakdown of the summaries by var3; you'd get that this
way:
latex( tabular( Heading("variable")*(var1 + var2)*var3 ~
    (mean + sd + min + max + (obs=length) +
     (missing=function(x) sum(is.na(x)))), data=data) )
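For comparison, here is a minimal base-R sketch of the table described in the question. It is only one possible approach, not part of the tables package. Two points it tries to address: length() counts NA entries as well, and the question asked for the percentage of missing data rather than a count. The helper name summarise_var and the injected NA values are hypothetical and used only for illustration.

```r
# Toy data as in the question; a few NAs are injected here purely for
# illustration (the original example contains no missing values).
set.seed(1)
data <- data.frame(var1 = runif(1000),
                   var2 = runif(1000),
                   var3 = factor(rep(1:2, 500), labels = c("m", "f")))
data$var1[c(3, 10, 57)] <- NA

# Hypothetical helper: summary statistics for one numeric vector,
# ignoring NAs where appropriate.
summarise_var <- function(x) {
  c(mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    obs     = sum(!is.na(x)),         # non-missing observations
    missing = 100 * mean(is.na(x)))   # percentage of missing values
}

# Keep only the numeric columns (var3 is a factor) and build a matrix
# with one row per variable and one column per statistic.
num_vars <- names(data)[sapply(data, is.numeric)]
summary_table <- t(sapply(data[num_vars], summarise_var))

print(round(summary_table, 3))
```

The result is an ordinary numeric matrix, so it can be passed to whatever LaTeX export function you prefer.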
Duncan Murdoch