Is there any function can be used to compare two probit models made from same data?
5 messages · jingjiang yan, Ben Bolker, David Freedman +2 more
jingjiang yan <jingjiangyan <at> gmail.com> writes:
hi, people
How can we compare two probit models fitted to the same data?
Let me use the example used in "An Introduction to R".
"Consider a small, artificial example, from Silvey (1970).
On the Aegean island of Kalythos the male inhabitants suffer from a
congenital eye disease, the effects of which become more marked with
increasing age. Samples of islander males of various ages were tested for
blindness and the results recorded. The data is shown below:
Age: 20 35 45 55 70
No. tested: 50 50 50 50 50
No. blind: 6 17 26 37 44
"
Now, we can use the age and the proportion blind to fit a probit model
and get its coefficients with the glm function, as is done in "An
Introduction to R".
My question is: suppose there is another potential factor, instead of age,
that affects the proportion blind,
for example the height of these males. Using their height and the
corresponding blindness counts, we can fit another probit model.
If I want to determine which model is significantly better, which function
can I use to compare the two? And, in addition, how can I compare a model
with the null hypothesis (i.e. the same blindness rate for all ages/heights)
to show that the model is effective?
You can use a likelihood ratio test (i.e. anova(model1, model0)) to compare either model to the null model (blindness is independent of both age and height). The age model and height model are non-nested, and of equal complexity. You can tell which one is *better* by comparing log-likelihoods/deviances, but cannot test a null hypothesis of significance. Most (but not all) statisticians would say you can compare non-nested models by using AIC, but you don't get a hypothesis-test/p-value in this way.
Ben Bolker
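The advice above could be sketched roughly as follows. The age and blindness counts are the Kalythos figures quoted in the question; the `height` column is invented purely for illustration (no height data appears in the thread), and the names `kalythos`, `model0`, `model_age` and `model_ht` are likewise just placeholders.

```r
# Kalythos data as quoted from "An Introduction to R"
kalythos <- data.frame(
  age   = c(20, 35, 45, 55, 70),
  n     = rep(50, 5),
  blind = c(6, 17, 26, 37, 44)
)
# HYPOTHETICAL: mean height (cm) per group, made up for this sketch only
kalythos$height <- c(172, 169, 171, 168, 170)

probit <- binomial(link = "probit")

# Null model (same blindness rate everywhere) and the two one-predictor models
model0    <- glm(cbind(blind, n - blind) ~ 1,      family = probit, data = kalythos)
model_age <- glm(cbind(blind, n - blind) ~ age,    family = probit, data = kalythos)
model_ht  <- glm(cbind(blind, n - blind) ~ height, family = probit, data = kalythos)

# Likelihood ratio tests of each model against the null model
lrt_age <- anova(model0, model_age, test = "Chisq")
lrt_ht  <- anova(model0, model_ht,  test = "Chisq")
print(lrt_age)
print(lrt_ht)

# Non-nested comparison: lower AIC is preferred, but this gives no p-value
print(AIC(model_age, model_ht))
```

With real data the `height` values would of course come from the same sample as the blindness counts; the comparison of `model_age` and `model_ht` by AIC says which fits better, not whether the difference is significant.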
Hi - wouldn't it be possible to bootstrap the difference between the fit of
the 2 models? For example, if one had a *linear* regression problem, the
following script could be used (although I'm sure that it could be
improved):
library(MASS); library(boot)
# create intercorrelated data
Sigma <- matrix(c(1,.5,.4, .5,1,.8, .4,.8,1), 3, 3)
Sigma
dframe <- as.data.frame(mvrnorm(n <- 200, rep(0, 3), Sigma))
names(dframe) <- c('disease','age','ht') # age and ht are predictors of 'disease'
head(dframe); cor(dframe)
# bootstrap the difference between models containing the 2 predictors
model.fun <- function(data, indices) {
  dsub <- data[indices, ]  # use the data passed in by boot(), not the global dframe
  m1se <- summary(lm(disease ~ age, data = dsub))$sigma
  m2se <- summary(lm(disease ~ ht, data = dsub))$sigma
  m1se - m2se  # difference in the residual standard errors of the 2 models
}
eye <- boot(dframe, model.fun, R = 200); class(eye); names(eye)
# the original post had des(an(eye$t)); des() and an() are not standard R
# functions (presumably personal shortcuts), so summary() is used here instead
summary(as.numeric(eye$t))
boot.ci(eye, conf = c(.95, .99), type = 'norm')
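The script above is for a linear regression. A rough probit analogue, which is not something posted in the thread, might bootstrap the difference in residual deviance between the two models instead. Everything below is an assumption for illustration: the data are simulated, the variable names (`blind`, `age`, `ht`) are placeholders, and individual-level 0/1 outcomes are assumed so that rows can be resampled.

```r
library(boot)

set.seed(42)
n   <- 200
age <- rnorm(n)
ht  <- 0.8 * age + rnorm(n, sd = 0.6)      # ht is correlated with age
blind <- rbinom(n, 1, pnorm(0.8 * age))    # simulated so that age drives blindness
d <- data.frame(blind, age, ht)

# Statistic: deviance of the age model minus deviance of the height model,
# refitted on each bootstrap resample (negative values favour the age model)
dev_diff <- function(data, indices) {
  dsub  <- data[indices, ]
  m_age <- glm(blind ~ age, family = binomial(link = "probit"), data = dsub)
  m_ht  <- glm(blind ~ ht,  family = binomial(link = "probit"), data = dsub)
  deviance(m_age) - deviance(m_ht)
}

bs <- boot(d, dev_diff, R = 200)
print(bs)
print(boot.ci(bs, conf = 0.95, type = "perc"))
```

A percentile interval that excludes zero would suggest one model fits consistently better across resamples; it is a descriptive check rather than a formal test, and R = 200 is kept small only to match the script above.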
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
View this message in context: http://www.nabble.com/Is-there-any-function-can-be-used-to-compare-two-probit-models-made-from-same-data--tp21614487p21625839.html Sent from the R help mailing list archive at Nabble.com.
On Friday 23 January 2009, David Freedman wrote:
[...]
This may be a naive question, but could this be used to test two models based on different transformations of the dependent variable?
[...]
m1se<-summary(lm(disease ~ age, data=dsub))$sigma
m2se<-summary(lm(log(disease) ~ age, data=dsub))$sigma
[...]
or would the differences in scales render meaningless results?
Cheers,
Dylan
Dylan Beaudette Soil Resource Laboratory http://casoilresource.lawr.ucdavis.edu/ University of California at Davis 530.754.7341
1 day later
At 14:55 23/01/2009, David Freedman wrote:
Hi - wouldn't it be possible to bootstrap the difference between the fit of the 2 models? For example, if one had a *linear* regression problem, the following script could be used (although I'm sure that it could be improved):
There are a number of methods for comparing non-nested models in the lmtest package.
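As a minimal sketch of what that might look like for the linear-regression case discussed above: the lmtest package provides `jtest()` (Davidson-MacKinnon J test), `coxtest()` (Cox test) and `encomptest()` (encompassing test), which accept two fitted `lm` objects. The simulated data and the names `m_age` and `m_ht` are assumptions made for this example, and the package needs to be installed separately.

```r
# install.packages("lmtest")  # if not already installed
library(lmtest)

set.seed(1)
n   <- 200
age <- rnorm(n)
ht  <- 0.8 * age + rnorm(n, sd = 0.6)   # correlated with age
disease <- 0.5 * age + rnorm(n)         # simulated: age is the true predictor
d <- data.frame(disease, age, ht)

# Two non-nested linear models of equal complexity
m_age <- lm(disease ~ age, data = d)
m_ht  <- lm(disease ~ ht,  data = d)

jt <- jtest(m_age, m_ht)       # J test: does each model gain from the other's fitted values?
ct <- coxtest(m_age, m_ht)     # Cox test
et <- encomptest(m_age, m_ht)  # each model against the encompassing model
print(jt)
print(ct)
print(et)
```

Each test is run in both directions, so the output has one row per model; it is possible for both models, or neither, to be rejected. These calls are shown for `lm` fits only, in line with the linear example above.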
Michael Dewey http://www.aghmed.fsnet.co.uk