How to let R repeat computations over a number of variables
Uli Kleinwechter <ulikleinwechter at yahoo.com.mx> wrote in news:47DADC55.2070803 at yahoo.com.mx:
Hello, I have written a small script to read a dataset, compute some basic descriptives and write them to a file (see below). The variable "maizeseedcash", for which the statistics are calculated, is contained in the data frame agr_inputs. My question is whether there is a way to make R compute the statistics not only for maizeseedcash but also for other variables in the dataset. I thought of something like a loop that repeats the computations over a set of variables that I could specify beforehand. Is something like that possible, and if so, what would it look like? All hints are appreciated.
My first hint would be to look at ?summary or at the describe function in the Hmisc package. My second hint would be to start referring to your R objects by their correct names; in this case, use "data frame" instead of "dataset".

If summary and describe do not satisfy, then you could wrap your work into a function, say func.summ, and feed the columns to it with:
apply(agr_inputs, 2, func.summ)
Note that apply() first converts the data frame to a matrix, so this works as intended only if every column is numeric.

There are several areas where the code could be more compact. If you let "probs" be a vector, you can get all of your quantiles at once:
quantile(runif(100), probs=c(0.25, 0.5, 0.75))
      25%       50%       75%
0.2240003 0.4919313 0.7359661

The names get carried forward when appended in a vector. See:
test <- c(1,2, quantile(runif(100), probs=c(0.25, 0.5, 0.75)), 4,5)
test
                          25%       50%       75%
1.0000000 2.0000000 0.2228890 0.4978050 0.8440893 4.0000000 5.0000000

And you can reference named elements by name with named indexing:
test["25%"]
      25%
0.2228890

Or use summary:
summary(runif(100))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.003962 0.215400 0.441800 0.474600 0.735100 0.997600
summary(runif(100))["Mean"]
  Mean
0.4973

Best of luck;
David Winsemius
********
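# Redirect printed output to a text file
sink("agr_inputs.txt", append=FALSE, type="output")
# Read the data (read.csv2 expects ";" separators and "," decimals)
agr_inputs<-read.csv2("agric_inputs.csv")
attach(agr_inputs)
# Descriptive statistics for maizeseedcash.
# Note: these assignments reuse the names of base functions (min, mean, ...);
# this works, but distinct names would be safer.
min<-min(maizeseedcash)
q25<-quantile(maizeseedcash, probs=.25)
median<-quantile(maizeseedcash, probs=.50)
mean<-mean(maizeseedcash)
q75<-quantile(maizeseedcash, probs=.75)
max<-max(maizeseedcash)
var<-var(maizeseedcash)
sd<-sd(maizeseedcash)
# Coefficient of variation in percent
varcoeff<-sd/mean*100
# Collect labels and values in a two-column data frame
Measure<-c("Min","25%", "Median", "Mean", "75%", "Max", "Var", "SD",
"VarCoeff")
maizeseedcas<-c( min, q25, median, mean, q75, max, var, sd,
varcoeff)
solution<-data.frame(Measure, maizeseedcas)
print(solution)
detach(agr_inputs)
# Stop redirecting output
sink()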
******
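
The function-based approach suggested in the reply could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from the thread: only the name func.summ and the column maizeseedcash come from the messages above. The function body, the toy data, and the columns fertilizercash and region are hypothetical stand-ins, since agric_inputs.csv is not available here.

```r
# Hypothetical helper: returns the nine statistics of the original script
# as one named numeric vector.
func.summ <- function(x) {
  # names = FALSE so that the labels below are the only names attached
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  m <- mean(x)
  s <- sd(x)
  c(Min = min(x), "25%" = q[1], Median = q[2], Mean = m, "75%" = q[3],
    Max = max(x), Var = var(x), SD = s, VarCoeff = s / m * 100)
}

# Toy stand-in for the real data frame (values and extra columns are made up)
agr_inputs <- data.frame(
  maizeseedcash  = c(10, 20, 30, 40, 50),
  fertilizercash = c(5, 15, 25, 35, 45),
  region         = c("a", "b", "a", "b", "a")
)

# Specify the variables of interest beforehand, then apply the function to
# each selected column; sapply() works column by column on a data frame and
# does not convert it to a matrix first.
vars <- c("maizeseedcash", "fertilizercash")
solution <- sapply(agr_inputs[vars], func.summ)
print(solution)
```

The result is a matrix with one row per statistic and one column per variable. Selecting the columns by name before calling sapply() keeps non-numeric columns such as region out of the computation. To send the table to a file as in the original script, the print() call could be placed between sink("agr_inputs.txt") and sink().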
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.