Is there any function that can be used to compare two probit models fitted to the same data?

jingjiang yan <jingjiangyan <at> gmail.com> writes:
Hi, people,
    How can we compare two probit models fitted to the same data?
    Let me use the example from "An Introduction to R":
    "Consider a small, artificial example, from Silvey (1970).

On the Aegean island of Kalythos the male inhabitants suffer from a
congenital eye disease, the effects of which become more marked with
increasing age. Samples of islander males of various ages were tested for
blindness and the results recorded. The data is shown below:

Age:         20  35  45  55  70
No. tested:  50  50  50  50  50
No. blind:    6  17  26  37  44
"

Now we can use the ages and the proportions blind to fit a probit model
and obtain its coefficients with the glm function, as is done in "An
Introduction to R".

My question is: let's say there is another potential factor, instead of age,
that affects the proportion blind, for example the height of these males.
Using their heights and the corresponding blindness counts, we can fit
another probit model.

If I want to determine which model is significantly better, which function
can I use to compare the two? In addition, how can I compare a model with
the null hypothesis (i.e. the same blindness rate for all ages/heights) to
show that the model is effective?

You can use a likelihood ratio test (i.e.
anova(model0, model1, test="Chisq")) to compare either
model to the null model (blindness is independent of
both age and height).  The age model and the height
model are non-nested and of equal complexity.
You can tell which one is *better* by comparing
log-likelihoods/deviances, but you cannot use that
comparison to test a null hypothesis. Most (but
not all) statisticians would say you can compare
non-nested models by using AIC, but you don't
get a hypothesis test/p-value in this way.

  Ben Bolker
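
As an illustration of the likelihood ratio test and AIC comparison described above, here is a minimal sketch using the Kalythos counts from the question. The height values are hypothetical (the thread gives none) and are invented only so that a second, non-nested model can be fitted; the object names m0, m_age and m_height are likewise just illustrative.

```r
# Kalythos data from "An Introduction to R"
kalythos <- data.frame(
  age    = c(20, 35, 45, 55, 70),
  tested = rep(50, 5),
  blind  = c(6, 17, 26, 37, 44)
)
# HYPOTHETICAL: mean height (cm) per group, invented for illustration only
kalythos$height <- c(172, 175, 171, 169, 166)

probit <- binomial(link = "probit")

# Response is a two-column matrix: (number blind, number not blind)
m0       <- glm(cbind(blind, tested - blind) ~ 1,      family = probit, data = kalythos)
m_age    <- glm(cbind(blind, tested - blind) ~ age,    family = probit, data = kalythos)
m_height <- glm(cbind(blind, tested - blind) ~ height, family = probit, data = kalythos)

# Likelihood ratio tests of each model against the null (nested comparisons)
lrt_age    <- anova(m0, m_age,    test = "Chisq")
lrt_height <- anova(m0, m_height, test = "Chisq")
print(lrt_age)
print(lrt_height)

# Non-nested comparison: lower AIC is preferred, but no p-value is produced
aic_tab <- AIC(m_age, m_height)
print(aic_tab)
```

Because both candidate models have two parameters, ranking them by AIC here is equivalent to ranking them by deviance.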
Hi - wouldn't it be possible to bootstrap the difference between the fit of
the 2 models?  For example, if one had a *linear* regression problem, the
following script could be used (although I'm sure that it could be
improved):

library(MASS); library(boot)
# create intercorrelated data
Sigma <- matrix(c(1,.5,.4,  .5,1,.8,  .4,.8,1), 3, 3)
Sigma
n <- 200
dframe <- as.data.frame(mvrnorm(n, rep(0, 3), Sigma))
# age and ht are predictors of 'disease'
names(dframe) <- c('disease', 'age', 'ht')
head(dframe); cor(dframe)

# bootstrap the difference between models containing the 2 predictors
model.fun <- function(data, indices) {
     dsub <- data[indices, ]   # resample rows of the data passed in by boot()
     m1se <- summary(lm(disease ~ age, data = dsub))$sigma
     m2se <- summary(lm(disease ~ ht, data = dsub))$sigma
     m1se - m2se   # difference in the residual standard errors of the 2 models
}
eye <- boot(dframe, model.fun, R = 200);  class(eye); names(eye)
# des() and an() are not base R functions; summary(as.numeric()) is used instead
summary(as.numeric(eye$t))
boot.ci(eye, conf = c(.95, .99), type = 'norm')
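
The same resampling idea can be carried over to the probit setting of the original question. The following is only a sketch under stated assumptions: the individual-level data frame `people` is simulated (hypothetical values, not the Kalythos data), and the statistic, the difference in residual deviance between the two probit fits, is one possible choice rather than an established test.

```r
library(boot)

set.seed(1)
# HYPOTHETICAL individual-level data, simulated for illustration only:
# blindness depends on age, and height is merely correlated with age.
n      <- 250
age    <- runif(n, 20, 70)
height <- 180 - 0.2 * age + rnorm(n, sd = 5)
blind  <- rbinom(n, 1, pnorm(-2 + 0.045 * age))
people <- data.frame(blind, age, height)

# Statistic: deviance of the height model minus deviance of the age model.
# Positive values mean the age model fits better on that resample.
dev_diff <- function(data, indices) {
  d <- data[indices, ]
  fit_age    <- glm(blind ~ age,    family = binomial(link = "probit"), data = d)
  fit_height <- glm(blind ~ height, family = binomial(link = "probit"), data = d)
  deviance(fit_height) - deviance(fit_age)
}

bt <- boot(people, dev_diff, R = 200)
print(bt)

# Percentile interval for the deviance difference
ci <- boot.ci(bt, conf = 0.95, type = "perc")
print(ci)
```

An interval that excludes zero suggests the ranking of the two models is stable under resampling; it describes sampling variability and is not a formal hypothesis test.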

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

View this message in context: http://www.nabble.com/Is-there-any-function-can-be-used-to-compare-two-probit-models-made-from-same-data--tp21614487p21625839.html
Sent from the R help mailing list archive at Nabble.com.
This may be a naive question, but could this be used to test two models based 
on different transformations of the dependent variable?

[...]
m1se<-summary(lm(disease ~ age, data=dsub))$sigma
m2se<-summary(lm(log(disease) ~ age, data=dsub))$sigma
[...]

or would the difference in scales render the results meaningless?

Cheers,

Dylan
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

There are a number of methods for comparing non-nested models in the 
lmtest package.
Michael Dewey
http://www.aghmed.fsnet.co.uk
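
To make the lmtest suggestion concrete, here is a minimal sketch applying two of its non-nested tests, jtest() (Davidson-MacKinnon J test) and coxtest() (Cox test), to the simulated linear-regression data from the bootstrap script. It assumes the lmtest package is installed, and the seed is arbitrary. These functions are used here on lm fits only; whether they suit the probit models of the original question is not addressed by this sketch.

```r
library(MASS)
library(lmtest)

set.seed(42)
# Same simulated, intercorrelated data as in the bootstrap script
Sigma  <- matrix(c(1, .5, .4,  .5, 1, .8,  .4, .8, 1), 3, 3)
dframe <- as.data.frame(mvrnorm(200, rep(0, 3), Sigma))
names(dframe) <- c("disease", "age", "ht")

# J test: each model is augmented with the fitted values of the other;
# a significant row indicates the rival model adds explanatory power.
jt <- jtest(disease ~ age, disease ~ ht, data = dframe)
print(jt)

# Cox test: compares the two non-nested models in both directions
ct <- coxtest(disease ~ age, disease ~ ht, data = dframe)
print(ct)
```

Both tests can reject both models, or neither, so the output is a pair of directional comparisons rather than a single verdict on which model is better.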