Question on approximations of full logistic regression model

The choice is not clear, and requires some simulations to estimate the
average absolute error of the covariance matrix estimators.
Frank
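
One way such a simulation could be set up is sketched below in base R. It is
only an illustration under assumptions that are not in the thread: the true
coefficients, the two simulated predictors, and the replication counts are all
hypothetical, and an unpenalized glm stands in for the penalized lrm fit. The
empirical covariance of the coefficient estimates across simulated datasets
serves as the reference against which the model-based and bootstrap covariance
matrices are compared.

```r
# Base-R sketch: mean absolute error of two covariance matrix estimators.
# beta, n_sim and B are hypothetical choices, kept small for run time.
set.seed(1)
n     <- 104                  # sample size quoted in the thread
beta  <- c(-1.4, 0.8, 0.5)    # assumed true intercept and slopes
n_sim <- 50                   # number of simulated datasets
B     <- 100                  # bootstrap resamples per dataset

sim_data <- function() {
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- rbinom(n, 1, plogis(beta[1] + beta[2] * x1 + beta[3] * x2))
  data.frame(y, x1, x2)
}
fit_glm <- function(d) glm(y ~ x1 + x2, family = binomial, data = d)

one_rep <- function() {
  d    <- sim_data()
  fit  <- fit_glm(d)
  # Refit on rows resampled with replacement (no cluster structure).
  boot <- replicate(B, coef(fit_glm(d[sample(n, replace = TRUE), ])))
  list(coef = coef(fit), v_model = vcov(fit), v_boot = cov(t(boot)))
}

reps     <- replicate(n_sim, one_rep(), simplify = FALSE)
# Reference: covariance of the estimates across simulated datasets.
true_cov <- cov(t(sapply(reps, `[[`, "coef")))

# Average over datasets of the mean absolute element-wise error.
mean_abs_err <- function(name)
  mean(sapply(reps, function(r) mean(abs(r[[name]] - true_cov))))

err_model <- mean_abs_err("v_model")
err_boot  <- mean_abs_err("v_boot")
print(c(model = err_model, bootstrap = err_boot))
```

Larger n_sim and B give more stable answers at the cost of run time, and the
ranking of the two estimators may change with the number of predictors and the
event rate.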
Thank you for your reply, Prof. Harrell.

I agree with you. Dropping only one variable does not actually help a lot.

I have one more question.
While analyzing this model, I found that the confidence intervals (CIs) of
some coefficients from bootstrapping (the bootcov function in the rms
package) were narrower than the CIs from the usual variance-covariance
matrix, while the CIs of other coefficients were wider. My data have no
cluster structure. I am wondering which CIs are better.
I would guess the bootstrap ones, but is that right?

I would appreciate your help in advance.
--
KH
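
The comparison described above can be reproduced on a single dataset with a
short base-R sketch. It is an illustration only: the data are simulated with
hypothetical coefficients, and glm with a hand-written row-resampling bootstrap
stands in for lrm and bootcov, so it shows the kind of side-by-side table
involved and does not reproduce the rms output.

```r
# Base-R sketch: Wald CI widths from model-based vs. bootstrap standard errors.
# The simulated data and coefficients are hypothetical.
set.seed(2)
n <- 104
d <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))
d$y <- rbinom(n, 1, plogis(-1.4 + 0.8 * d$x1 + 0.6 * d$x2))

fit      <- glm(y ~ x1 + x2, family = binomial, data = d)
se_model <- sqrt(diag(vcov(fit)))      # usual information-matrix SEs

B <- 500
boot <- replicate(B, {
  idx <- sample(n, replace = TRUE)     # resample patients with replacement
  coef(glm(y ~ x1 + x2, family = binomial, data = d[idx, ]))
})
se_boot <- apply(boot, 1, sd)          # SD of bootstrap coefficient estimates

z <- qnorm(0.975)
ci_width <- cbind(model = 2 * z * se_model, bootstrap = 2 * z * se_boot)
ci_width <- cbind(ci_width,
                  ratio = ci_width[, "bootstrap"] / ci_width[, "model"])
print(round(ci_width, 3))
```

A ratio above 1 means the bootstrap interval is wider for that coefficient and
a ratio below 1 means it is narrower. Which of the two is closer to the truth
cannot be read off a single dataset.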

(11/05/16 12:25), Frank Harrell wrote:
I think you are doing this correctly except for one thing.  The validation
and other inferential calculations should be done on the full model.  Use
the approximate model to get a simpler nomogram but not to get standard
errors.  With only dropping one variable you might consider just running
the nomogram on the entire model.
Frank

KH wrote:
Hi,
I am trying to construct a logistic regression model from my data (104
patients and 25 events). I built a full model consisting of five
predictors, using penalization from the rms package (lrm, pentrace,
etc.) because of the events-per-variable issue. Then I tried to approximate
the full model with the step-down technique, predicting L (the linear
predictor of the full model) from all of the component variables using
ordinary least squares (ols in the rms package), as follows. I would like
to know whether I am doing this correctly.

library(rms)
plogit<- predict(full.model)
full.ols<- ols(plogit ~ stenosis+x1+x2+ClinicalScore+procedure,
sigma=1)
fastbw(full.ols, aics=1e10)
  Deleted       Chi-Sq d.f. P      Residual d.f. P      AIC    R2
  stenosis       1.41  1    0.2354   1.41   1    0.2354  -0.59 0.991
  x2            16.78  1    0.0000  18.19   2    0.0001  14.19 0.882
  procedure     26.12  1    0.0000  44.31   3    0.0000  38.31 0.711
  ClinicalScore 25.75  1    0.0000  70.06   4    0.0000  62.06 0.544
  x1            83.42  1    0.0000 153.49   5    0.0000 143.49 0.000

Then I fitted an approximation to the full model using the most important
variables (retaining variables until the R^2 for predictions from the
reduced model against the original Y would drop below 0.95), that is,
dropping only "stenosis".

full.ols.approx<- ols(plogit ~ x1+x2+ClinicalScore+procedure)
full.ols.approx$stats
           n  Model L.R.        d.f.          R2           g       Sigma
104.0000000 487.9006640   4.0000000   0.9908257   1.3341718   0.1192622
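
The approximation idea used above can be sketched in base R without rms. This
is a hedged illustration: the data, the variable names x1 to x3, and the
coefficients are hypothetical, and glm and lm stand in for lrm and ols. The
full model's linear predictor is regressed on subsets of the predictors, and
R^2 measures how well each subset reproduces it.

```r
# Base-R sketch: approximate a full logistic model by least squares
# on its own linear predictor. All data here are simulated.
set.seed(3)
n <- 104
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-1.4 + 1.0 * d$x1 + 0.5 * d$x2 + 0.1 * d$x3))

full <- glm(y ~ x1 + x2 + x3, family = binomial, data = d)
d$lp <- predict(full)                  # linear predictor (logit scale)

vars <- c("x1", "x2", "x3")
# R^2 of a least-squares fit of lp on the given predictors.
r2 <- function(keep) {
  m <- lm(reformulate(keep, response = "lp"), data = d)
  1 - sum(resid(m)^2) / sum((d$lp - mean(d$lp))^2)
}

r2_full <- r2(vars)                    # all predictors: lp is reproduced
r2_drop <- sapply(vars, function(v) r2(setdiff(vars, v)))
print(round(c(full = r2_full, r2_drop), 3))
```

With all predictors kept, R^2 is 1 up to rounding, because the linear predictor
is an exact linear function of them. Each entry of r2_drop shows how much of it
survives when that one variable is removed.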

This approximate model had an R^2 against the full model of 0.99.
Therefore, I updated the original full logistic model by dropping
"stenosis" as a predictor.

full.approx.lrm<- update(full.model, ~ . -stenosis)

validate(full.model, bw=FALSE, B=1000)
           index.orig training    test optimism index.corrected    n
Dxy           0.6425   0.7017  0.6131   0.0887          0.5539 1000
R2            0.3270   0.3716  0.3335   0.0382          0.2888 1000
Intercept     0.0000   0.0000  0.0821  -0.0821          0.0821 1000
Slope         1.0000   1.0000  1.0548  -0.0548          1.0548 1000
Emax          0.0000   0.0000  0.0263   0.0263          0.0263 1000

validate(full.approx.lrm, bw=FALSE, B=1000)
           index.orig training    test optimism index.corrected    n
Dxy           0.6446   0.6891  0.6265   0.0626          0.5820 1000
R2            0.3245   0.3592  0.3428   0.0164          0.3081 1000
Intercept     0.0000   0.0000  0.1281  -0.1281          0.1281 1000
Slope         1.0000   1.0000  1.1104  -0.1104          1.1104 1000
Emax          0.0000   0.0000  0.0444   0.0444          0.0444 1000

Validation showed that this approximation was not bad.
Then I made a nomogram.

full.approx.lrm.nom<- nomogram(full.approx.lrm,
fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
plot(full.approx.lrm.nom)
I also made another nomogram using the ols model:

full.ols.approx.nom<- nomogram(full.ols.approx,
fun.at=c(0.05,0.1,0.2,0.4,0.6,0.8,0.9,0.95), fun=plogis)
plot(full.ols.approx.nom)
These two nomograms are very similar but slightly different.

My questions are:

1. Am I doing this correctly?

2. Which nomogram is correct?

I would appreciate your help in advance.

--
KH

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context:
http://r.789695.n4.nabble.com/Question-on-approximations-of-full-logistic-regression-model-tp3524294p3525372.html
Sent from the R help mailing list archive at Nabble.com.

     E-mail address
         Office: khosoda at med.kobe-u.ac.jp
	Home  : khosoda at venus.dti.ne.jp

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Question-on-approximations-of-full-logistic-regression-model-tp3524294p3526155.html
