Problem applying McNemar's - Different values in SPSS and R
10 messages · Manoj Aravind, Marc Schwartz, taby gathoni +1 more
On Dec 28, 2010, at 11:05 AM, Manoj Aravind wrote:
Hi friends,
I get different values for McNemar's test in R and SPSS. Which one should I
rely on when the p-values differ?

I came across this problem when I started learning R, having seriously given
up on SPSS and other proprietary software.

Thank you in advance.
Output in SPSS follows
Crosstab: tvs x hsc

                           hsc
                      ABN      NE     Total
 tvs   ABN  Count      40       3        43
            Row %    93.0%    7.0%    100.0%
            Col %    78.4%   30.0%     70.5%
       NE   Count      11       7        18
            Row %    61.1%   38.9%    100.0%
            Col %    21.6%   70.0%     29.5%
 Total      Count      51      10        61
            Row %    83.6%   16.4%    100.0%
            Col %   100.0%  100.0%    100.0%

Chi-Square Tests

                     Value   Exact Sig. (2-sided)
 McNemar Test                      .057(a)
 N of Valid Cases      61

 a  Binomial distribution used.
Output from R is as follows....
> tvshsc <- matrix(c(40, 11, 3, 7),
+                  nrow = 2,
+                  dimnames = list("TVS" = c("ABN", "NE"),
+                                  "HSC" = c("ABN", "NE")))
> tvshsc
     HSC
TVS   ABN NE
  ABN  40  3
  NE   11  7
> mcnemar.test(tvshsc)

        McNemar's Chi-squared test with continuity correction

data:  tvshsc
McNemar's chi-squared = 3.5, df = 1, p-value = 0.06137

Regards,
Dr. B Manoj Aravind
The SPSS test appears to be an exact test, whereas the default R function does not perform an exact test, so you are not comparing apples to apples. Try this, using the 'exact2x2' CRAN package:
> require(exact2x2)
Loading required package: exact2x2
Loading required package: exactci
> mcnemar.exact(matrix(c(40, 11, 3, 7), 2, 2))

        Exact McNemar test (with central confidence intervals)

data:  matrix(c(40, 11, 3, 7), 2, 2)
b = 3, c = 11, p-value = 0.05737
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.04885492 1.03241985
sample estimates:
odds ratio 
 0.2727273 

HTH,

Marc Schwartz
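[Editorial cross-check, not part of the original reply: the exact McNemar test depends only on the discordant cells b = 3 and c = 11 shown above. Under the null hypothesis each of the b + c = 14 discordant pairs falls into either off-diagonal cell with probability 0.5, so the exact p-value is just an exact binomial test in base R:]

```r
## Exact McNemar test via the discordant cells b = 3, c = 11:
## under H0 the 14 discordant pairs are Binomial(14, 0.5).
b <- 3
c <- 11
p_exact <- binom.test(b, b + c, p = 0.5)$p.value
round(p_exact, 5)  # 0.05737 -- the same p-value as mcnemar.exact() and SPSS
```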
Marc Schwartz <marc_schwartz at me.com> [Tue, Dec 28, 2010 at 06:30:59PM CET]:
> On Dec 28, 2010, at 11:05 AM, Manoj Aravind wrote:
>> Hi friends, I get different values for McNemar's test in R and SPSS.
>> Which one should I rely on when the p-values differ?
> [...]
> The SPSS test appears to be an exact test, whereas the default R function
> does not perform an exact test, so you are not comparing apples to apples...

Indeed, binom.test(11, 14) renders the same p-value as SPSS, whereas
mcnemar.test() uses the approximation (|a_12 - a_21| - 1)^2 / (a_21 + a_12),
with the "-1" removed if correct = FALSE. An old question of mine: is there
any reason not to use binom.test() other than historical reasons?
Johannes Hüsing                    There is something fascinating about science.
                                   One gets such wholesale returns of conjecture
mailto:johannes at huesing.name    from such a trifling investment of fact.
http://derwisch.wikidot.com        (Mark Twain, "Life on the Mississippi")
On Dec 28, 2010, at 11:47 AM, Johannes Huesing wrote:
> Marc Schwartz <marc_schwartz at me.com> [Tue, Dec 28, 2010 at 06:30:59PM CET]:
>> [...]
>> The SPSS test appears to be an exact test, whereas the default R function
>> does not perform an exact test, so you are not comparing apples to apples...
>
> Indeed, binom.test(11, 14) renders the same p-value as SPSS, whereas
> mcnemar.test() uses the approximation (|a_12 - a_21| - 1)^2 / (a_21 + a_12),
> with the "-1" removed if correct = FALSE. An old question of mine: is there
> any reason not to use binom.test() other than historical reasons?
I may be missing the context of your question, but I frequently see exact
binomial tests being used when one is comparing the presumptively known
probability of some dichotomous characteristic versus that which is observed
in an independent sample. For example, in single arm studies where one is
comparing an observed event rate against a point estimate for a presumptive
historical control.

I also see exact binomial (Clopper-Pearson) confidence intervals being used
when one wants conservative CIs, given that their nominal coverage is at
least as large as requested. That is, 95% exact CIs will be at least that
large, but in reality can tend to be well above that, depending upon various
factors. This is well documented in various papers.

I generally tend to use Wilson CIs for binomial proportions when reporting
analyses. I have my own code, but these are implemented in various R
functions, including Frank's binconf() in Hmisc.

HTH,

Marc
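[Editorial note: the approximation Johannes mentions, the continuity-corrected McNemar statistic, can be reproduced by hand; this quick base-R sketch recovers the figures from mcnemar.test() in the original post:]

```r
## Continuity-corrected McNemar statistic from the discordant cells.
b <- 3
c <- 11
stat <- (abs(b - c) - 1)^2 / (b + c)
stat                                      # 3.5, matching mcnemar.test()
pchisq(stat, df = 1, lower.tail = FALSE)  # 0.06137..., the reported p-value
```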
Marc Schwartz <marc_schwartz at me.com> [Tue, Dec 28, 2010 at 07:14:49PM CET]:
> [...]
>> An old question of mine: Is there any reason not to use binom.test()
>> other than historical reasons?

(I meant "in lieu of the McNemar approximation"; sorry if some
misunderstanding ensued.)

> I may be missing the context of your question, but I frequently see exact
> binomial tests being used when one is comparing the presumptively known
> probability of some dichotomous characteristic versus that which is
> observed in an independent sample. For example, in single arm studies
> where one is comparing an observed event rate against a point estimate
> for a presumptive historical control.

In the McNemar context (as used by SPSS) the null hypothesis is p = 0.5.

> I also see exact binomial (Clopper-Pearson) confidence intervals being
> used when one wants conservative CIs, given that their nominal coverage
> is at least as large as requested. That is, 95% exact CIs will be at
> least that large, but in reality can tend to be well above that,
> depending upon various factors. This is well documented in various
> papers.

Confidence intervals are not that regularly used in the McNemar context, as
the conditional probability "a > b given they are unequal" is not as
interpretable a quantity as the event probability in a single-arm study.

> I generally tend to use Wilson CIs for binomial proportions when
> reporting analyses. I have my own code, but these are implemented in
> various R functions, including Frank's binconf() in Hmisc.

Thanks for the hint.
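[Editorial aside: for readers without Hmisc, the Wilson interval Marc refers to takes only a few lines of base R. A sketch follows; the function name wilson_ci is mine, and Hmisc's binconf(..., method = "wilson") should give the same numbers.]

```r
## Wilson score interval for a binomial proportion x/n.
wilson_ci <- function(x, n, conf = 0.95) {
  z    <- qnorm(1 - (1 - conf) / 2)
  phat <- x / n
  mid  <- (phat + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt(phat * (1 - phat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = mid - half, upper = mid + half)
}
round(wilson_ci(11, 14), 3)  # roughly 0.524 to 0.924 for 11 successes in 14
```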
On Dec 28, 2010, at 4:13 PM, Johannes Huesing wrote:
> Marc Schwartz <marc_schwartz at me.com> [Tue, Dec 28, 2010 at 07:14:49PM CET]:
> [...]
>>> An old question of mine: Is there any reason not to use binom.test()
>>> other than historical reasons?
>
> (I meant "in lieu of the McNemar approximation"; sorry if some
> misunderstanding ensued.)

After I posted, I had a thought that this might be the case. Apologies for
the digression, then.

>> I may be missing the context of your question, but I frequently see
>> exact binomial tests being used when one is comparing the presumptively
>> known probability of some dichotomous characteristic versus that which
>> is observed in an independent sample. For example, in single arm
>> studies where one is comparing an observed event rate against a point
>> estimate for a presumptive historical control.
>
> In the McNemar context (as used by SPSS) the null hypothesis is p = 0.5.

Yes, from what I can tell from a brief Google search, it appears that there
are some software packages offering an exact variant of McNemar's that will
automatically shift to performing an exact binomial test if the sample size
is, say, <25. I rarely use exact tests in general practice (I am not
typically involved with "smallish" data sets), so I do not come across this
situation frequently.

That being said, back to your original query: if one is using these
techniques, one might find that the exact binomial test is actually being
used as noted, and therefore one should be aware of the documentation for
the package, especially if the results that are output are not clear about
the effective shift in methodology. So, historical issues notwithstanding,
the functional equivalent of binom.test() is used elsewhere in current
practice under certain conditions.

>> I also see exact binomial (Clopper-Pearson) confidence intervals being
>> used when one wants conservative CIs [...]
>
> Confidence intervals are not that regularly used in the McNemar context,
> as the conditional probability "a > b given they are unequal" is not as
> interpretable a quantity as the event probability in a single-arm study.
>
>> I generally tend to use Wilson CIs for binomial proportions when
>> reporting analyses. [...]
>
> Thanks for the hint.

Happy to help.

Regards,

Marc
On Dec 29, 2010, at 6:48 AM, Manoj Aravind wrote:
Thank you Marc :) It certainly helped me to get the exact p-value. How do I
know when to apply mcnemar.exact() and when plain mcnemar.test()? I'm a
beginner in biostatistics.

Manoj Aravind
Generally speaking, exact tests are used for "small-ish" sample sizes,
frequently when n < 100 and in many cases much lower (e.g. < 50 or < 30).
The methods tend to become computationally impractical on "larger" data
sets.

Since you are coming from SPSS, you might find this document helpful in
providing a general framework:

http://support.spss.com/productsext/spss/documentation/spssforwindows/otherdocs/SPSS%20Exact%20Tests%207.0.pdf

The document is written by Mehta and Patel of Cytel/StatXact, who are
historical advocates of the techniques.

That being said, and as I noted in my reply to Johannes, I am not typically
involved in situations where exact tests make sense, thus am probably not
the best resource. I would steer you towards various reference texts on
analyzing categorical data (e.g. Agresti) for more information.

One exception to the above comment is the use of Fisher's Exact Test (FET),
which is typically advocated as an alternative to a chi-square test when
**expected** cell counts are <5. However, much has been written in recent
times about just how conservative the FET is. One resource is:

http://www.iancampbell.co.uk/twobytwo/twobytwo.htm

Another reference is:

  How conservative is Fisher's exact test? A quantitative evaluation of the
  two-sample comparative binomial trial
  Gerald G. Crans, Jonathan J. Shuster
  Stat Med. 2008 Aug 15;27(18):3598-611.
  http://onlinelibrary.wiley.com/doi/10.1002/sim.3221/abstract

So you might want to consider those resources as arguments against using
the FET in situations that are likely more commonly observed in day-to-day
practice.

HTH,

Marc
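[Editorial illustration: to make the "expected counts < 5" rule concrete, here is a small sketch with made-up data (not from this thread) comparing the two tests in R:]

```r
## A hypothetical 2x2 table whose smallest expected cell count is below 5,
## the traditional trigger for preferring Fisher's exact test.
tab <- matrix(c(3, 9, 7, 2), nrow = 2)
suppressWarnings(chisq.test(tab))$expected  # some expected counts fall below 5
fisher.test(tab)$p.value                    # exact p-value instead
```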
Marc Schwartz <marc_schwartz at me.com> [Wed, Dec 29, 2010 at 03:28:56PM CET]:
> On Dec 29, 2010, at 6:48 AM, Manoj Aravind wrote:
>> Thank you Marc :) It certainly helped me to get the exact p-value. How
>> do I know when to apply mcnemar.exact() and when plain mcnemar.test()?
> [...]
> Generally speaking, exact tests are used for "small-ish" sample sizes,
> frequently when n < 100 and in many cases much lower (e.g. < 50 or < 30).
> The methods tend to become computationally impractical on "larger" data
> sets.

Sorry for chiming in again here, but binomial tests are computationally
cheap:

> system.time(binom.test(48000, 100000))
   user  system elapsed 
  0.072   0.000   0.077 

You are certainly correct on Fisher's Exact Test with larger tables, or
Wilcoxon's signed rank test.

> [...]
> One exception to the above comment is the use of Fisher's Exact Test
> (FET), which is typically advocated as an alternative to a chi-square
> test when **expected** cell counts are <5. However, much has been
> written in recent times about just how conservative the FET is.

That's only because people shy away from the randomized version :-)