transforming column of a dataframe by var- and median-functions

3 messages · Agrarimmobilien, Daniel Malter, Henrique Dallazuanna

#
Hello,

I am trying to transform a data frame like

A        B        C

1        1        2,5
2        2        NA
3        1        1,0
4        1        56
5        2        23
6        1        NA
7        2        46

into the following data frame, calculating the variance and median of
column C, grouped by B, so that the result is:

B        C(median)         D (var)
1        19,83                9
2        34,5                21


Doing this, I ran into problems with the NAs in column C.
I tried to combine the aggregate function
aggregate(C, list(B), FUN=(mean, var))

with the following functions

var(C, use="complete.obs")
median(C, rm.na=TRUE)

but it doesn't work as I want. Does anybody have an idea how to make this work?

thank you
Iksmax
#
I think tapply does the job you want:

tapply(C,B,mean,na.rm=TRUE)
tapply(C,B,var,na.rm=TRUE)
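A minimal sketch of the tapply approach applied to the sample data from the question, using median rather than mean since the median was what was asked for. The data frame name dat and the result column names are assumptions for illustration, and the decimal commas are assumed to have been converted to points already.

```r
# Sample data from the question, with decimal commas written as points
dat <- data.frame(
  A = 1:7,
  B = c(1, 2, 1, 1, 2, 1, 2),
  C = c(2.5, NA, 1.0, 56, 23, NA, 46)
)

# Extra arguments to tapply() are passed on to FUN, so na.rm = TRUE
# reaches median() and var() and NAs are dropped within each group
med <- tapply(dat$C, dat$B, median, na.rm = TRUE)
v   <- tapply(dat$C, dat$B, var, na.rm = TRUE)

# Combine the two named vectors into one data frame, one row per group
result <- data.frame(B = names(med),
                     median = as.vector(med),
                     var = as.vector(v))
print(result)
```

Note that the argument is spelled na.rm, not rm.na; a misspelled argument is handed to median() through its "..." and the NAs are then not removed.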

Cheers,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Agrarimmobilien
Sent: Sunday, March 09, 2008 4:38 PM
To: R-help at r-project.org
Subject: [R] transforming column of a dataframe by var- and median-functions

#
I think that your 'C' column is a factor (or character), because a comma
is used as the decimal separator.

One option:

dat$C <- as.numeric(gsub(",", ".", as.character(dat$C)))
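Why the as.character() step matters can be seen in a small sketch. The vector C_raw below is a hypothetical stand-in for how column C might look after import, assuming it was read as a factor.

```r
# Hypothetical column as it might look after import: decimal commas, stored as a factor
C_raw <- factor(c("2,5", NA, "1,0", "56", "23", NA, "46"))

# Pitfall: as.numeric() applied directly to a factor returns the internal
# level codes, not the numbers shown in the labels
codes <- as.numeric(C_raw)

# Converting to character first and replacing the comma gives the intended values
C_num <- as.numeric(gsub(",", ".", as.character(C_raw)))
print(C_num)
```

If the data come from a text file, the dec argument of read.table() (for example dec = ",") may avoid the problem at import time.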


library(doBy)
summaryBy(C ~ B, data=dat, FUN=c(median, var), na.rm = TRUE)
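For comparison, a base R sketch that produces a similar grouped table without the doBy package. It relies on the formula interface of aggregate(); the names dat, agg and res are assumptions for illustration.

```r
# Sample data from the question, with C already numeric
dat <- data.frame(
  B = c(1, 2, 1, 1, 2, 1, 2),
  C = c(2.5, NA, 1.0, 56, 23, NA, 46)
)

# The formula interface of aggregate() uses na.action = na.omit by default,
# so rows with NA in C are dropped before the summary function is applied
agg <- aggregate(C ~ B, data = dat,
                 FUN = function(x) c(median = median(x), var = var(x)))

# agg$C is a two-column matrix; flatten it into ordinary columns
res <- do.call(data.frame, agg)
print(res)
```

The flattened columns are named C.median and C.var, one row per value of B.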
On 09/03/2008, Agrarimmobilien <ralf.pfeiffer at agrarimmobilien.info> wrote:
--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O