How to let R repeat computations over a number of variables

2 messages · Uli Kleinwechter, David Winsemius

#
Hello,

I have written a small script to read a dataset, compute some basic 
descriptives and write them to a file (see below). The variable 
"maizeseedcash" of which the statistics are calculated is contained in 
the data frame agr_inputs. My question is whether there is a way to make 
R compute the statistics not only for maizeseedcash but also for other 
variables in the dataset. I was thinking of something like a loop that
repeats the computations for a set of variables that I could specify
beforehand. Is something like that possible and, if so, what would it
look like?

All hints are appreciated.

Best regards,

Uli




********
sink("agr_inputs.txt", append=FALSE, type="output")

agr_inputs<-read.csv2("agric_inputs.csv")

attach(agr_inputs)

min<-min(maizeseedcash)

q25<-quantile(maizeseedcash, probs=.25)

median<-quantile(maizeseedcash, probs=.50)

mean<-mean(maizeseedcash)

q75<-quantile(maizeseedcash, probs=.75)

max<-max(maizeseedcash)

var<-var(maizeseedcash)
             
sd<-sd(maizeseedcash)

varcoeff<-sd/mean*100

Measure<-c("Min","25%", "Median", "Mean", "75%", "Max", "Var", "SD", 
"VarCoeff")

maizeseedcas<-c( min, q25, median, mean, q75, max, var, sd, varcoeff)

solution<-data.frame(Measure, maizeseedcas)

print (solution)

detach(agr_inputs)

sink()

******
#
Uli Kleinwechter <ulikleinwechter at yahoo.com.mx> wrote in
news:47DADC55.2070803 at yahoo.com.mx:

My first hint would be to look at ?summary or at the describe function
in the Hmisc package. My second hint would be to refer to your R
objects by their correct names: in this case, use "data frame" instead
of "dataset".

If summary and describe do not suffice, you could wrap your work in a
function, say func.summ, and feed the columns to it with:

 apply(agr_inputs, 2, func.summ)
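
A minimal sketch of that approach, for illustration only. The name func.summ comes from the suggestion above, but its body, the toy data frame, and every column other than maizeseedcash are assumptions. sapply is used here in place of apply because apply first converts the data frame to a matrix, which turns every value into a character string if any column is non-numeric.

```r
# Summary statistics for one numeric vector; returns a named numeric vector.
func.summ <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.50, 0.75), na.rm = TRUE)
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  # q[1] and q[3] keep their "25%" and "75%" names automatically.
  c(Min = min(x, na.rm = TRUE), q[1], Median = unname(q[2]), Mean = m,
    q[3], Max = max(x, na.rm = TRUE), Var = var(x, na.rm = TRUE),
    SD = s, VarCoeff = s / m * 100)
}

# Toy stand-in for read.csv2("agric_inputs.csv"); the values are made up.
agr_inputs <- data.frame(
  maizeseedcash  = c(10, 20, 30, 40, 50),
  fertilizercash = c(5, 5, 10, 20, 60),        # hypothetical column
  region         = c("a", "b", "a", "b", "a")  # non-numeric, left out below
)

# Specify the variables of interest up front, then apply the function to each.
vars <- c("maizeseedcash", "fertilizercash")
solution <- sapply(agr_inputs[vars], func.summ)
print(solution)
```

The result is a matrix with one row per statistic and one column per variable. Placing the print call between sink("agr_inputs.txt") and sink(), as in the original script, would write it to the file.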

There are several areas where the code could be more compact. If you 
let "probs" be a vector, you can get all of your quantiles at once:
      25%       50%       75% 
0.2240003 0.4919313 0.7359661 
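
The call that produced this output is not preserved in the archive. A call of the following form returns all three quartiles at once; the data vector x is made up for illustration, so the numbers will differ from those shown.

```r
x <- c(10, 20, 30, 40, 50)   # illustrative data, not from the thread

# Passing a vector to probs returns one named element per probability.
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
print(q)
```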

The names get carried forward when appended in a vector. See:
                          25%       50%       75%                     
1.0000000 2.0000000 0.2228890 0.4978050 0.8440893 4.0000000 5.0000000 

And you can reference named elements by name with named indexing:
      25% 
0.2228890 
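
The commands behind the last two outputs are likewise missing from the archive. The following sketch shows the same two ideas on made-up data: names survive when a quantile result is combined with other values using c(), and a single element can then be picked out by its name.

```r
x <- c(10, 20, 30, 40, 50)   # illustrative data, not from the thread

# Unnamed values get empty names; the quantile names are carried forward.
v <- c(1, 2, quantile(x, probs = c(0.25, 0.50, 0.75)), 4, 5)
print(names(v))

# Index by name rather than by position.
print(v["25%"])
```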

Or use summary:
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.003962 0.215400 0.441800 0.474600 0.735100 0.997600 

  Mean 
0.4973 
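
As a rough illustration of the summary route (again on made-up data, since the original commands are not shown), summary returns a named vector, so individual statistics can be extracted by name in the same way:

```r
x <- c(10, 20, 30, 40, 50)   # illustrative data, not from the thread

s <- summary(x)
print(s)           # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
print(s["Mean"])   # just the mean, selected by name
```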

Best of luck;
David Winsemius