Hi,
I have the following problem:
I have a data.frame with 36 sample sites (columns) for which I have covariates in 3 categories: Area, Month and River. Each Area consists of 3 rivers, which were sampled over 3 months. Now I want to fuse Rivers 1-3 for one area in one month, to get a data.frame with 12 columns.
I am trying to do a "for loop" (which may be a complicated solution, but I don't see an easier way), which is not working, apparently because a[,ij] or a[,c(i,j)] does not work as a way to address the matrix column that belongs to a combination of two conditions.
How can I make it work or what would be an easier solution?
Thank you for your help,
Anne
data=data.frame(matrix(1:99,nrow=5,ncol=36))
colnames(data)=c(paste("plot",1:36))
cov=data.frame(rep(1:3,12),c(rep("Jan",12),rep("Feb",12),rep("Mar",12)),rep(c(1,1,1,2,2,2,3,3,3,4,4,4),3))
dimnames(cov)=list(colnames(data),c("River","Month","Area"))
###loop###
a=matrix(nrow=dim(data)[1],ncol=length(levels(factor(cov$Month)))*length(levels(factor(cov$Area))))
for(i in 1:length(levels(factor(cov$Month))))
{
for(j in 1:length(levels(factor(cov$Area))))
{
a[,ij]=as.numeric(rowSums(data[,factor(cov$Month)==levels(factor(cov$Month))[i]&factor(cov$Area)==levels(factor(cov$Area))[j]]))
}
}
For-loop
3 messages · Anne-Christine Mupepele, PIKAL Petr, jim holtman
Hi,
r-help-bounces at r-project.org wrote on 20.12.2010 11:48:51:
I am not exactly sure what you want to do. What operation is "fuse"? If it is a sum, then with your data you can do

area <- rep(1:12, each=3)
data.t <- t(data)
aggregate(data.t, list(area), sum)
   Group.1  V1  V2  V3  V4  V5
1        1  18  21  24  27  30
2        2  63  66  69  72  75
3        3 108 111 114 117 120
4        4 153 156 159 162 165
5        5 198 201 204 207 210
6        6 243 246 249 252 255
7        7 189 192 195 198 102
8        8  36  39  42  45  48
9        9  81  84  87  90  93
10      10 126 129 132 135 138
11      11 171 174 177 180 183
12      12 216 219 222 225 228

t(aggregate(data.t, list(area), sum))
        [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
Group.1    1    2    3    4    5    6    7    8    9    10    11    12
V1        18   63  108  153  198  243  189   36   81   126   171   216
V2        21   66  111  156  201  246  192   39   84   129   174   219
V3        24   69  114  159  204  249  195   42   87   132   177   222
V4        27   72  117  162  207  252  198   45   90   135   180   225
V5        30   75  120  165  210  255  102   48   93   138   183   228

But then there is the Month value, which is not apparent from your example. Maybe

t(aggregate(data.t, list(area, data.t$Month), sum))

could do the trick, but you probably need to show us str and/or head of your real data.

Regards
Petr
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
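In the original post the Month and Area labels live in the separate cov data frame (one row per column of data), and data.t is a matrix, so data.t$Month would not work as written. Below is a minimal sketch of grouping on both covariates at once, assuming "fuse" means sum. The names grp and fused are made up for this illustration, and rep_len() is only used to build the same example data without the recycling warning that matrix(1:99, ...) gives.

```r
# Example data as in the original post (values recycle after 99)
data <- data.frame(matrix(rep_len(1:99, 5 * 36), nrow = 5, ncol = 36))
colnames(data) <- paste("plot", 1:36)
cov <- data.frame(
  River = rep(1:3, 12),
  Month = rep(c("Jan", "Feb", "Mar"), each = 12),
  Area  = rep(rep(1:4, each = 3), 3),
  row.names = colnames(data)
)

# One label per column of data, e.g. "Jan.1" (hypothetical naming scheme)
grp <- paste(cov$Month, cov$Area, sep = ".")

# rowsum() sums rows by group, so transpose first (sites become rows),
# sum within each Month/Area label, then transpose back.
# reorder = FALSE keeps the groups in order of first appearance.
fused <- t(rowsum(t(as.matrix(data)), grp, reorder = FALSE))

dim(fused)       # 5 rows, 12 columns
colnames(fused)  # "Jan.1", "Jan.2", ..., "Mar.4"
```

Because the labels come from cov rather than from column position, this should also work if the columns of data are not sorted by month and area.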
try this:

# create indexing vector to select 3 adjacent columns
indx <- sapply(seq(1, 36, 3), seq, length = 3)
# process each row of the table
ans <- t(apply(data, 1, function(.row){
    # use indx to sum up the columns
    apply(indx, 2, function(.indx){
        sum(.row[.indx])
    })
}))
ans
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]   18   63  108  153  198  243  189   36   81   126   171   216
[2,]   21   66  111  156  201  246  192   39   84   129   174   219
[3,]   24   69  114  159  204  249  195   42   87   132   177   222
[4,]   27   72  117  162  207  252  198   45   90   135   180   225
[5,]   30   75  120  165  210  255  102   48   93   138   183   228

On Mon, Dec 20, 2010 at 5:48 AM, Anne-Christine Mupepele <anne-chr.afs at web.de> wrote:
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
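For completeness, here is a sketch of how the original double loop could be made to run. The assumption is that the only problems were the undefined index ij and the missing mapping from a (month, area) pair to a single column of a. The counter k and the vector nms are hypothetical names introduced for this example, and "fuse" is again taken to mean a row sum.

```r
# Example data as in the original post (values recycle after 99)
data <- data.frame(matrix(rep_len(1:99, 5 * 36), nrow = 5, ncol = 36))
colnames(data) <- paste("plot", 1:36)
cov <- data.frame(
  River = rep(1:3, 12),
  Month = rep(c("Jan", "Feb", "Mar"), each = 12),
  Area  = rep(rep(1:4, each = 3), 3),
  row.names = colnames(data)
)

months <- unique(cov$Month)  # "Jan" "Feb" "Mar", in order of appearance
areas  <- unique(cov$Area)   # 1 2 3 4

a   <- matrix(NA_real_, nrow = nrow(data),
              ncol = length(months) * length(areas))
nms <- character(ncol(a))

k <- 0  # running column index; replaces the undefined 'ij'
for (m in months) {
  for (ar in areas) {
    k <- k + 1
    # logical vector marking the columns of data for this month and area
    sel <- cov$Month == m & cov$Area == ar
    # drop = FALSE keeps a data.frame even if only one column matches
    a[, k] <- rowSums(data[, sel, drop = FALSE])
    nms[k] <- paste(m, ar, sep = ".")
  }
}
colnames(a) <- nms
a
```

An equivalent way to get the column index without a counter would be (i - 1) * length(areas) + j when looping over integer positions i and j, as the original loop does.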