Background:
OS: Linux Mandrake 10.1
release: R 2.0.0
editor: GNU Emacs 21.3.2
front-end: ESS 5.2.3
---------------------------------
Colleagues,
I am having some trouble extracting results from the function by(), which I
am using to average variables in a data.frame first by one factor (depth) and
then by a second factor (station). The real data.frame is quite large:
dim(data.2001)
[1] 32049 11
Here is a snippet of code:
## bin density data for each station into 1 m depth bins, containing means
data.2001.test$integer.Depth <- as.factor(round(data.2001.test$Depth,
                                                digits = 0))
attach(data.2001.test)
binned.data.2001 <- by(data.2001.test[, 5:11],
                       list(depth = integer.Depth, station = Station), mean)
and here is a snippet of the data.frame
Try the following. To keep this short, let's just take a subset
of rows, called dd. We also drop the Station levels that are not
being used, since this test only uses 2 of the 288 Station levels.
The function that we apply using by() returns a vector consisting of
the integer.Depth, the Station and the column means of columns 5 to 10.
(Asking for just the mean of those, as in your example, would take all
the numbers in all the columns passed to mean and give back a grand mean
rather than a mean per column.) Finally we rbind it all back together.
# data.2001.test is your data frame including the integer.Depth column
dd <- data.2001.test[50:60, ]
# drop the Station levels that do not occur in the subset
dd$Station <- dd$Station[drop = TRUE]
dd.bin <- by(dd, list(dd$integer.Depth, dd$Station), function(x)
  c(integer.Depth = x$integer.Depth[1], Station = x$Station[1],
    colMeans(x[, 5:10])))
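To make the grand-mean versus per-column-mean distinction and the final rbind step concrete, here is a minimal sketch on made-up data. The data frame toy and its columns Temp and Sal are hypothetical stand-ins for data.2001.test, not the real data, and the factor columns are left out of the returned vector to keep the result purely numeric.

```r
# Hypothetical toy data standing in for data.2001.test
toy <- data.frame(
  Station = factor(c("A", "A", "A", "B", "B")),
  Depth   = c(1.2, 0.9, 2.1, 1.4, 0.6),
  Temp    = c(10, 12, 14, 20, 22),
  Sal     = c(35, 36, 37, 30, 31)
)
toy$integer.Depth <- as.factor(round(toy$Depth, digits = 0))

# A grand mean pools every value; colMeans gives one mean per column
grand   <- mean(as.matrix(toy[, c("Temp", "Sal")]))
per.col <- colMeans(toy[, c("Temp", "Sal")])

# One vector of column means per (depth, station) cell
toy.bin <- by(toy, list(toy$integer.Depth, toy$Station),
              function(x) colMeans(x[, c("Temp", "Sal")]))

# Bind the per-cell vectors into a matrix, one row per non-empty cell;
# empty cells are NULL and are skipped by rbind
toy.means <- do.call("rbind", toy.bin)
print(toy.means)
```

Note that the rows of toy.means carry no depth or station labels here, which is why the function in the reply also returns integer.Depth and Station.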
On 5/29/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
On 5/29/05, McClatchie, Sam (PIRSA-SARDI)
<mcclatchie.sam at saugov.sa.gov.au> wrote:
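Here is a correction to account for the fact that the first two columns
(integer.Depth and Station) are factors, which are reduced to their integer
codes when combined with numbers into a single vector. This time, instead of
creating a vector in the function, we create a one-row data frame.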
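The one-row data frame idea can be sketched as follows. This is an illustration on made-up data, not necessarily the code that was posted; the data frame toy and its columns Temp and Sal are hypothetical stand-ins for data.2001.test.

```r
# Hypothetical toy data standing in for data.2001.test
toy <- data.frame(
  Station = factor(c("A", "A", "A", "B", "B")),
  Depth   = c(1.2, 0.9, 2.1, 1.4, 0.6),
  Temp    = c(10, 12, 14, 20, 22),
  Sal     = c(35, 36, 37, 30, 31)
)
toy$integer.Depth <- as.factor(round(toy$Depth, digits = 0))

# Return a one-row data frame per cell so the factor columns stay factors;
# t() turns the named vector of means into a 1-row matrix, which
# data.frame() expands into one column per variable
toy.bin <- by(toy, list(toy$integer.Depth, toy$Station), function(x)
  data.frame(integer.Depth = x$integer.Depth[1],
             Station       = x$Station[1],
             t(colMeans(x[, c("Temp", "Sal")]))))

# Stack the one-row data frames; empty cells (NULL) are skipped
toy.means <- do.call("rbind", toy.bin)
print(toy.means)
```

Because each piece is a data frame, integer.Depth and Station keep their labels in the result instead of turning into integer codes.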