Clogit R and Stata
The "n = 1404" vs. "Number of obs = 468" looks like the giveaway: R is fitting the model to all 1404 rows, while Stata uses only the 468 that pass the "if" condition. You are passing the subset selection logic as the 3rd positional argument, but according to the documentation, that is the weights argument. So name the arguments explicitly:

    clogit(..., data = dframe, subset = sample==1 & glb_ind=="Y")
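To see how R binds the arguments without fitting anything, one option is match.call(), which only matches a call against a function's formals. This is a minimal sketch, assuming the survival package is installed and that clogit's leading formals are formula, data, weights, subset, as the documentation referred to above describes; the formula is shortened to one covariate for readability, and dframe does not need to exist because nothing is evaluated.

```r
library(survival)

# Positional form: the selection expression is the third unnamed argument.
positional <- quote(clogit(sftpcons ~ logim + strata(stratida),
                           dframe,
                           sample == 1 & glb_ind == "Y"))

# Named form: the same expression, passed explicitly as subset.
named <- quote(clogit(sftpcons ~ logim + strata(stratida),
                      data = dframe,
                      subset = sample == 1 & glb_ind == "Y"))

# match.call() reports which formal each value is bound to.
# Nothing is evaluated here, so no model is fitted.
matched_positional <- match.call(clogit, positional)
matched_named <- match.call(clogit, named)

print(names(matched_positional))  # the third value should appear under "weights"
print(names(matched_named))       # here it should appear under "subset"
```

If the positional call shows the expression under weights rather than subset, the row selection was never applied, which would be consistent with n = 1404 in the R output.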
On Jun 7, 2013, at 18:51 , Richard Beckett wrote:
From: peter dalgaard <pdalgd at gmail.com>
To: Richard Beckett <rbeckett81 at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Friday, June 7, 2013 11:12 AM
Subject: Re: [R] Clogit R and Stata
Here is the R output:
Call:
coxph(formula = Surv(rep(1, 1404L), sftpcons) ~ sftptv2a3 + sftptv2a4 +
sftptv2a5 + sftptv2a2 + sftptv2a6 + logim + maccat + disp4cat +
strata(stratida), data = dframe, method = "exact")
n= 1404, number of events= 351
coef exp(coef) se(coef) z Pr(>|z|)
sftptv2a3 1.4552 4.2852 0.2273 6.401 1.54e-10 ***
sftptv2a4 3.1118 22.4609 0.2265 13.739 < 2e-16 ***
sftptv2a5 1.0717 2.9204 0.2522 4.249 2.15e-05 ***
sftptv2a2 0.7185 2.0514 0.3300 2.177 0.0295 *
sftptv2a6 2.7341 15.3965 0.5050 5.414 6.17e-08 ***
logim 0.7579 2.1338 0.1347 5.625 1.85e-08 ***
maccat 3.0809 21.7771 0.4005 7.693 1.43e-14 ***
disp4cat 0.7061 2.0261 0.1524 4.634 3.59e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
sftptv2a3 4.285 0.23336 2.745 6.691
sftptv2a4 22.461 0.04452 14.409 35.013
sftptv2a5 2.920 0.34241 1.781 4.788
sftptv2a2 2.051 0.48747 1.074 3.917
sftptv2a6 15.397 0.06495 5.722 41.429
logim 2.134 0.46866 1.639 2.779
maccat 21.777 0.04592 9.934 47.739
disp4cat 2.026 0.49355 1.503 2.731
Rsquare= 0.239 (max possible= 0.623 )
Likelihood ratio test= 383.2 on 8 df, p=0
Wald test = 264.7 on 8 df, p=0
Score (logrank) test = 396.2 on 8 df, p=0
And the STATA output:
Iteration 0: log likelihood = -95.537697
Iteration 1: log likelihood = -91.465581
Iteration 2: log likelihood = -91.402366
Iteration 3: log likelihood = -91.402264
Iteration 4: log likelihood = -91.402264
Conditional (fixed-effects) logistic regression Number of obs = 468
LR chi2(8) = 141.59
Prob > chi2 = 0.0000
Log likelihood = -91.402264 Pseudo R2 = 0.4365
------------------------------------------------------------------------------
sftpcons | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sftptv2a3 | 2.042827 .4741327 4.31 0.000 1.113544 2.97211
sftptv2a4 | 4.10828 .5593723 7.34 0.000 3.01193 5.204629
sftptv2a5 | 1.766492 .5585173 3.16 0.002 .6718177 2.861165
sftptv2a2 | 1.366568 .6540307 2.09 0.037 .084691 2.648444
sftptv2a6 | 2.307152 .8225835 2.80 0.005 .6949178 3.919386
logim | 1.404135 .3480976 4.03 0.000 .7218764 2.086394
maccat | 2.8423 .7008588 4.06 0.000 1.468642 4.215958
disp4cat | .6347805 .2872258 2.21 0.027 .0718283 1.197733
------------------------------------------------------------------------------
I also tried changing to method="approximate", with no noticeable change.
On Jun 7, 2013, at 15:34 , Richard Beckett wrote:
Sorry to once again write a message, but I'm once again stumped and am having no luck finding a solution anywhere else. This question requires some finesse in both R and STATA, so hopefully I will be able to get an answer here. I am much more adept in R and am trying to replicate the results of a STATA file in R. Hopefully this is a proper forum for such questions.

This is the code for the clogit in STATA:

    clogit sftpcons sftptv2a3 sftptv2a4 sftptv2a5 sftptv2a2 sftptv2a6 logim maccat disp4cat if sample==1 & glb_ind=="Y", group(stratida)

and I tried to replicate it using

    clogit1<-clogit(sftpcons~sftptv2a3+sftptv2a4+sftptv2a5+sftptv2a2+sftptv2a6+logim+maccat+disp4cat+strata(stratida), dframe, sample==1 | glb_ind=="Y")

but got different results. What did I do wrong here? I interpreted the STATA clogit as "run this logit as long as the sample is 1 and glb_ind="Y"". What should I be doing instead?
An "&" rather than "|" in the R version might help. Other than that, we're a bit short on clues unless you provide some output.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
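The difference between "&" and "|" in the row selection can be checked on a small example before refitting. This is a minimal sketch on a made-up four-row data frame (the values in toy are hypothetical; only the column names sample and glb_ind come from the thread).

```r
# Hypothetical toy data covering every combination of the two conditions.
toy <- data.frame(
  sample  = c(1, 1, 0, 0),
  glb_ind = c("Y", "N", "Y", "N")
)

# Stata's  if sample==1 & glb_ind=="Y"  keeps rows where BOTH conditions hold.
keep_and <- toy$sample == 1 & toy$glb_ind == "Y"

# The "|" used in the R attempt keeps rows where EITHER condition holds.
keep_or <- toy$sample == 1 | toy$glb_ind == "Y"

n_and <- sum(keep_and)  # rows selected with "&"
n_or  <- sum(keep_or)   # rows selected with "|"

print(c(and = n_and, or = n_or))
```

On the real data, comparing sum(sample == 1 & glb_ind == "Y") computed on dframe against Stata's "Number of obs" is a quick way to confirm that both programs are working from the same rows.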