Skip to content

Automating R for Hypothesis Testing

7 messages · meredith, Rui Barradas

#
R Users-
  I have been trying to automate a manual code that I have developed for
calling in a .csv file, isolating certain rows and columns that correspond
to specified months:
something to the effect
i=name.csv
N=length(i$month)
iphos1=0
iphos2=0
isphos3=0
for i=1,N
 if month=1
    iphos1=iphos+1
    iphos1(iphos1)=i

an so on to call out the months into there own arrays (unless there is a way
I can wrap it into the next automation)

Next: I would like to run a simple linear regression combining each of the
months 1 by 1:
for instance I want to run a regression on a combined model from months 1
and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
distribution, if Chi-sq is less than the Critical value, we accept and go on
to test another set of models with both 1 and 2. If it rejects, then we
proceed to months 2 and 3.  If we move on to the second set on months 1 and
2, and the critical value is accepted, I want to print an accept or reject
and move on to months 2 and 3, until finally comparing months 12-1 at the
end.
Is there a way to loop or automate this in R?

Thanks
Meredith


--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653.html
Sent from the R help mailing list archive at Nabble.com.
#
Hello,

I'm not at all sure if I understand your problem. Does this describe it?


test first model for months 1 and 2
if test statistic less than critical value{
	test second model for months 1 and 2
	print results of the first and second tests? just one of them?
}
move on to months 2 and 3
etc, until months 12 and 1


Please post example data using dput(dataset).
Just copy it's output and paste it in your post.

And example code, what you're already doing.
(Possibly simplified)

Rui Barradas


meredith wrote
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4619195.html
Sent from the R help mailing list archive at Nabble.com.
#
dput:  http://r.789695.n4.nabble.com/file/n4620188/milwaukeephos.csv
milwaukeephos.csv 

# Feb-march
(Intercept) lffeb_march 
  -2.429890    1.172821
(Intercept)   X1feb_mar lffeb_march 
 -2.8957776  -0.5272793   1.3016303
# Test Stat Equation for Chisq
if less than CV move onto to slope modification
(Intercept) X3feb_march 
   5.342130    1.172821
(Intercept) X3feb_march X4feb_march 
  5.2936263   1.0353752   0.2407557 
# Test Stat
# Record Chisq value

Does this help?
Here lffeb_march is the combination of Feb and March log flows
and llfeb_march is the combination of Feb and March log loads
X3: lffeb_march-mean(feb_march)
X4: X1*X3

Thanks

Rui Barradas wrote
Rui Barradas wrote
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4620188.html
Sent from the R help mailing list archive at Nabble.com.
#
Hello,

Yes, it does help. Now we can see your data and what you're doing.
What follows is a suggestion on what you could do, not full solution.
(You forgot to say what X1 is, but I don't think it's important to
understand the suggestion.)
(If I'm wrong, say something.)


milwaukeephos <- read.csv("milwaukeephos.csv", header=TRUE,
stringsAsFactors=FALSE)
# list of data.frames, one per month
ls1 <- split(milwaukeephos, milwaukeephos$month)

#--------- if you want to keep the models, not needed if you don't.
#          (yoy probably don't)
modelH <- vector("list", 12)
modelHa <- vector("list", 12)
modelH2 <- vector("list", 12)
modelH2a <- vector("list", 12)
#--------- values to record, these are needed, create them beforehand.
chi_fm <- numeric(12)
chi_fms <- numeric(12)
#
seq_months <- c(1:12, 1) # wrap months around.
for(i in 1:12){
	month_this <- seq_months[i]
	month_next <- seq_months[i + 1]

	lload <- c(ls1[[month_this]]$load_kg, ls1[[month_next]]$load_kg)
	lflow <- c(ls1[[month_this]]$flow, ls1[[month_next]]$flow)
	modelH[[i]] <- lm(lload ~ lflow)
	# If you don't want to keep the models, use modelH only
	# ( without [[i]] )
	# and do the same with X1

	# rest of your code for first test goes here
	chi_fm[i] <- bfm %*% var_fm %*% (bunres_fm - bres_fm)

	# and the same for the second test
	chi_fms[i] <- ...etc...
}


Hope this helps,

Rui Barradas


meredith wrote
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4620696.html
Sent from the R help mailing list archive at Nabble.com.
#
Rui-
  Thanks this definitely helps, just one quick question. How would you code
the values of chi-fm and chi-fms to change based on the degrees of freedom
of each model H(i)?

Meredith


Rui Barradas wrote
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4623689.html
Sent from the R help mailing list archive at Nabble.com.
#
Hello,

I'm glad it helped.

As for your second question, I don't know, but I'm not very comfortable with
the way you're doing things.
Why subtract the coefficients of model 1 from model 2?
And why the dummy? Why set model 1 to zero?

Isn't it better to use anova's F? After all, it's designed for it, for the
linear model...
And if you really want/need the dummy, wouldn't a nested anova do it? (F
statistic, once again.)

anova(model1, model2)

is simple and statistically speaking seems to me much better. (I specially
don't like the subtraction bit.)

Rui Barradas

meredith wrote
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4623790.html
Sent from the R help mailing list archive at Nabble.com.
5 days later
#
Rui-
  Just a quick question. I understand your comment on using ANOVA, but
doesn't this only test for similarities of the mean. We are trying to see if
a the same model can fit for two or three months, therefore have the similar
slope and intercept. The ANOVA would only do one part of this correct, with
the F-Test?

Thanks!
Meredith

Rui Barradas wrote
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4630243.html
Sent from the R help mailing list archive at Nabble.com.