reshape/aggregate

Hi
Hi all,
I apologize for this probably stupid question, but I really can't figure it out.
I have a dataframe like this:

group <- c(rep("A", 8), rep("B", 15), rep("C", 6))
time <- c(rep(1:4, 2), rep(1:5, 3), rep(1:3, 2))
value <- runif(29, 1, 10)
dfx <- data.frame(group, time, value)

I want to calculate the mean and standard deviation of all values that belong
to the same group and the same time, and end up with a dataframe with the
columns time, group, mean and sd that contains the calculated values for
every group at every time point only once (12 rows).
What is the most elegant way to do this? Oh, and I would like to avoid
renaming columns (like the _X1/_X2 created by casting with multiple
functions), if possible.
I am sure that this is pretty basic, but I have already wasted a ridiculous
amount of time on this.
See ?aggregate:

aggregate(dfx$value, list(group=dfx$group, time=dfx$time), function(x) c(mean(x), sd(x)))

The plyr package might also help you.
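
One thing to watch with the aggregate() call: when the function returns a vector, the result holds both statistics in a single matrix column rather than in two ordinary columns. A minimal sketch of one way to get plain mean and sd columns without renaming anything afterwards is below. The formula interface, the set.seed() call and the names res and flat are my own additions for illustration, not part of the original reply.

```r
# Same example data as in the question; the seed is an addition so the
# example is reproducible.
set.seed(1)
group <- c(rep("A", 8), rep("B", 15), rep("C", 6))
time  <- c(rep(1:4, 2), rep(1:5, 3), rep(1:3, 2))
value <- runif(29, 1, 10)
dfx   <- data.frame(group, time, value)

# Naming the elements of the returned vector names the columns of the
# matrix that aggregate() stores in the single result column "value".
res <- aggregate(value ~ group + time, data = dfx,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))

# Bind the grouping columns to that matrix so that mean and sd become
# ordinary data frame columns.
flat <- cbind(res[c("group", "time")], res$value)

# Sort by group, then time, and reset the row names.
flat <- flat[order(flat$group, flat$time), ]
rownames(flat) <- NULL
flat
```

With this data, flat should have one row per group/time combination and the columns group, time, mean and sd.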

Regards
Petr
Thanks,

Kai


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
The plyr solution is:

library(plyr)
ddply(dfx,  .(group, time), summarize, mean = mean(value), sd = sd(value))

Best,
Ista
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
You can use data.table:

group <- c(rep("A", 8), rep("B", 15), rep("C", 6))
time <- c(rep(1:4, 2), rep(1:5, 3), rep(1:3, 2))
value <- runif(29, 1, 10)
dfx <- data.frame(group, time, value)
library(data.table)
dfx <- data.table(dfx)
# mean and sd of value for each (group, time) combination
dfx[, list(mean = mean(value), sd = sd(value)), by = list(group, time)]
      group time     mean        sd
 [1,]     A    1 7.902432 0.8484807
 [2,]     A    2 5.583566 1.1996167
 [3,]     A    3 3.412691 1.1138794
 [4,]     A    4 7.786522 2.2367483
 [5,]     B    1 6.669257 2.1476769
 [6,]     B    2 2.902291 1.6630821
 [7,]     B    3 6.913593 0.9110182
 [8,]     B    4 4.713124 0.9521689
 [9,]     B    5 7.285824 1.5884689
[10,]     C    1 3.799665 3.7728015
[11,]     C    2 9.218785 0.9415034
[12,]     C    3 5.098077 3.5256497

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Hi
The plyr solution is:

library(plyr)
ddply(dfx,  .(group, time), summarize, mean = mean(value), sd = sd(value))

I tried to do the task with ddply, but I had difficulty understanding the
correct syntax. Maybe in the next release of plyr, summarise could be
referenced on the ddply help page.

Or something like this could be added:

To compute summary values for a data frame by the levels of a factor, pass
summarise as the function, followed by name = expression pairs:
ddply(.data, .variables, summarise, ...)
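
For anyone else puzzled by that syntax, it may help to see the split-apply-combine steps written out by hand. The following is only a rough base R sketch of the idea, not how plyr is actually implemented; the set.seed() call and the names pieces, summaries and result are hypothetical additions for illustration.

```r
# Example data from the question; the seed is an addition for reproducibility.
set.seed(42)
group <- c(rep("A", 8), rep("B", 15), rep("C", 6))
time  <- c(rep(1:4, 2), rep(1:5, 3), rep(1:3, 2))
value <- runif(29, 1, 10)
dfx   <- data.frame(group, time, value)

# Split: one data frame per (group, time) combination that actually occurs.
pieces <- split(dfx, list(dfx$group, dfx$time), drop = TRUE)

# Apply: reduce each piece to a one-row summary. This plays the role of
# summarise with the arguments mean = mean(value), sd = sd(value).
summaries <- lapply(pieces, function(piece) {
  data.frame(group = piece$group[1],
             time  = piece$time[1],
             mean  = mean(piece$value),
             sd    = sd(piece$value))
})

# Combine: stack the one-row summaries into a single data frame.
result <- do.call(rbind, summaries)
rownames(result) <- NULL
result
```

In the ddply call, the .variables argument corresponds to the split step, summarise with its name = expression pairs to the apply step, and ddply does the combining itself.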

Regards
Petr
On Wed, Aug 31, 2011 at 7:13 AM, Petr PIKAL <petr.pikal at precheza.cz> wrote: