[R-meta] Selection models from *reported p-values*
3 messages · Seetahul, Yashvin, Wolfgang Viechtbauer, James Pustejovsky

Dear R meta-analysis community,

I have a question regarding selection models based on p-values. Is it possible to fit a selection model using the p-values reported in the primary studies directly, rather than the p-values calculated from the effect size and SE? In many cases, meta-analyses require transformations, or sometimes corrections. However, if we assume that the selection process in publishing papers operates on p-values, it would make more sense to consider the p-values that are actually reported in the papers, would it not? How would one proceed to do this?

I believe the selmodel() function in metafor works with objects fitted with the rma() function, so the p-values are re-calculated from the effect size and SE. Assuming I have the reported p-values (to three decimals) for all the studies included in my meta-analysis, is it possible to test for selection of studies based on these reported p-values and then correct the effect size?

I hope my question makes sense. Thank you for your help,

Yashvin Seetahul
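[For context, a minimal sketch of the standard workflow the question refers to, using the dat.hackshaw1998 example dataset that ships with metafor (the dataset and cutpoint are chosen purely for illustration): selmodel() takes a fitted rma() object and derives the p-values it needs internally from yi/sqrt(vi), rather than from anything reported in the primary studies.]

```r
library(metafor)

# example data bundled with metafor: log odds ratios (yi) and
# sampling variances (vi) from studies on passive smoking
dat <- dat.hackshaw1998

# standard random-effects model; selmodel() requires an rma() fit
res <- rma(yi, vi, data=dat)

# step-function selection model with a cutpoint at p = .025;
# the p-values entering the weight function are one-sided Wald-type
# p-values computed internally as pnorm(yi/sqrt(vi), lower.tail=FALSE)
sel <- selmodel(res, type="stepfun", steps=c(0.025, 1))
sel  # adjusted estimate plus a likelihood ratio test for selection
```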
Dear Yashvin,

I haven't thought this all the way through, but the problem is that p-values enter the model in two different ways. There are indeed the actually observed p-values of the studies, but in the integration step (which is needed to compute the log likelihood), we also need to compute p-values. Those are not fixed, but arise from integrating over the density (assumed to be normal) of the effect size estimates. These p-values (which then enter the weight function) are computed as a function of yi/sqrt(vi).

If we use one way of computing the observed p-values and a different way of computing the p-values in this integration step, then there is a mismatch, and I am not sure about the consequences of that. So for consistency, one should also compute the p-values in the integration step in a corresponding manner, but this would be very case-, measure-, and test-specific, and trying to fine-tune this for every specific measure and way of testing it becomes extremely difficult implementation-wise.

We can see a bit of this in Iyengar and Greenhouse (1988), where the weight function is based on a t instead of a normal distribution (analogous to a z- versus a t-test). But this leads to the extra, headache-inducing complexities in their appendix.

I (and others) decided to avoid all of this by making the simplifying assumption that the p-values are always computed based on Wald-type tests of the form 'estimate / SE'. This should not be too far off in many cases, especially if the sample sizes within studies are not small. For example, the difference between pnorm(2, lower.tail=FALSE) and pt(2, df=100, lower.tail=FALSE) is of little practical consequence.

Also, selection models are really rough approximations to a much more complex data-generating mechanism anyway, so trying to fine-tune this part of the model is like taking a ruler to align something to millimeter accuracy before taking a sledgehammer to smash it. A bit like the bias correction for d-values: whether you put d=0.53 or g=0.52 into your model makes so little difference compared to all the other inaccuracies and infidelities we accept in putting together our meta-analytic datasets in the first place.

But those are just my two cents.

Best,
Wolfgang
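[Wolfgang's pnorm()/pt() comparison, and his d-versus-g point, can be checked directly in R; the df = 62 in the last line corresponds to a made-up two-group study with n = 32 per group, chosen only so the numbers match his example.]

```r
# Wald-type (normal) vs. t-based one-sided p-value for the same statistic:
pnorm(2, lower.tail=FALSE)        # ~0.0228
pt(2, df=100, lower.tail=FALSE)   # ~0.0241

# Hedges' small-sample correction factor applied to d = 0.53:
0.53 * (1 - 3 / (4 * 62 - 1))     # ~0.52
```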
Yashvin,

This is an interesting question, which highlights a potential limitation of existing meta-analytic selection models (at least those that I'm aware of). Just to add a thought to Wolfgang's response: the reason it would be difficult to modify existing selection models to work with observed p-values is that current models assume the p-value is a direct function of the effect size estimate and its standard error, and the effect size estimates are the _outcomes_ in the model. So the model implies a _distribution_ of p-values based on the data-generating process, and we need to know what that distribution is. In particular, to work with an observed p-value, we would need to know how it is functionally related to the effect size estimate, and this will depend on many details of the effect size metric, study design, and analytic methods (your method of calculating the effect size estimate and the authors' method of calculating p-values).

For some types of transformations, I think the discrepancies will be quite small (both cases below are illustrated numerically in the sketch at the end of this message):

* For example, say that an author reports a p-value for an untransformed correlation coefficient, but you meta-analyze the results based on the Fisher z-transformation. For r near zero, the SE of the untransformed coefficient is quite close to the SE of the z-transformed coefficient, so using one or the other makes hardly any difference.

* For another example, say that you apply a multiplicative reliability correction to a correlation coefficient. In this case, the SE of the corrected coefficient should also be multiplied by the reliability correction (that is, if we treat the correction as a fixed constant), so the ratio of the corrected correlation to the corrected SE equals the ratio of the uncorrected correlation to the uncorrected SE, and the p-value is the same in both cases.

Finally, here's a potentially more problematic/controversial counter-example. Say that you are meta-analyzing standardized mean differences from randomized experiments with pre-test and post-test data, and for the sake of uniformity you use a difference-in-differences estimate for the numerator. But some of the primary studies use ANCOVA for their analysis, so your effect size estimate, SE, and p-value will differ from those based on the analysis reported in the primary study. Your analysis is less precise than the primary-study analysis, so your p-value will tend to be larger than the primary-study p-value. Further, if you assume a pre/post correlation rather than inferring it from the primary-study data, this introduces a further discrepancy.

Personally, I don't have a sense of how large a discrepancy in p-values you can get in this situation. I think it's an interesting question that would be worth looking into (and perhaps carrying through to the implications for the performance of meta-analytic selection models). But pragmatically, the discrepancy could be resolved by using the information from the primary analytic approach (ANCOVA) to calculate the effect size estimate and its standard error, at least to the extent that the statistics reported in the primary study allow it.

Best,
James
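[A small numerical sketch of James's first two bullet points; n = 100, r = .10, and reliability = .80 are made-up values, and the SE formulas are standard large-sample approximations.]

```r
n <- 100
r <- 0.10

# (1) raw correlation vs. Fisher z: for r near zero, the test
# statistics (and hence the Wald-type p-values) are nearly identical
se.r <- (1 - r^2) / sqrt(n - 1)   # approximate large-sample SE of r
se.z <- 1 / sqrt(n - 3)           # SE of the Fisher z transform
2 * pnorm(abs(r / se.r), lower.tail=FALSE)         # ~0.31
2 * pnorm(abs(atanh(r) / se.z), lower.tail=FALSE)  # ~0.32

# (2) multiplicative reliability correction: dividing both the
# correlation and its SE by sqrt(reliability) leaves the test
# statistic, and therefore the p-value, unchanged
rel <- 0.80
(r / sqrt(rel)) / (se.r / sqrt(rel))  # same as r / se.r: ~1.005
r / se.r
```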