Avoiding for loops
Dimitris Rizopoulos wrote:
you could try something along these lines:

data <- data.frame(y = rnorm(100), group = rep(1:10, each = 10))
data$sum <- ave(data$y, data$group, FUN = sum)
data$norm.y <- data$y / data$sum
data

.. or even

transform(data, norm = ave(y, group, FUN = function(x) x/sum(x)))

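Applied to the column names used in the question (group, subject, V1), the same idea might look like the sketch below. The data frame here is made up to roughly match the sizes mentioned (30,000 rows in 2,500 groups); the 12 rows per group and the uniform V1 values are assumptions for illustration only, not the poster's actual data.

```r
# Hypothetical data shaped like the question: 2,500 groups of 12 rows each.
set.seed(1)
data <- data.frame(
  group   = rep(1:2500, each = 12),
  subject = rep(1:12, times = 2500),
  V1      = runif(30000)
)

# ave() returns a vector as long as V1 that holds, for every row,
# the sum of V1 over that row's group. One vectorized division then
# replaces both loops.
data$V1_norm <- data$V1 / ave(data$V1, data$group, FUN = sum)

# Sanity check: the normalized values should sum to 1 within each group.
group_totals <- tapply(data$V1_norm, data$group, sum)
stopifnot(isTRUE(all.equal(as.vector(group_totals), rep(1, 2500))))
```

Since the group sum is the same for every subject in a group, the inner loop over subjects is not needed at all; the division can be done for the whole column at once.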
I hope it helps.

Best,
Dimitris

Noah Silverman wrote:
Hi,
I'm trying to normalize some data.
My data is organized by groups. I want to normalize PER GROUP as
opposed to over the entire data set.
The current double loop that I'm using takes almost an hour to run on
about 30,000 rows of data in 2,500 groups.
I'm currently doing this:
-------------------------------------
for(group in unique(data$group)){
    sum_V1 <- sum(data$V1[data$group == group])
    for(subject in data$subject[data$group == group]){
        data$V1_norm[(data$group == group & data$subject == subject)] <-
            data$V1[(data$group == group & data$subject == subject)] / sum_V1
    }
}
-------------------------------------
Can anyone point me to a faster way to do this in R?
Thanks!
-N
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907