[R-meta] Calculation of p values in selmodel
This is an issue with maximum likelihood estimation of the step function selection model generally (rather than a problem with the software implementation). The step function model assumes that there are different selection probabilities for effect size estimates with p-values that fall into different intervals. For a 3-parameter model, the intervals are [0, .025] and (.025, 1], with the first interval fixed to have selection probability 1 and the second interval having selection probability lambda > 0 (an unknown parameter of the model). If there are no observed ES estimates in the first interval, then the ML estimate of lambda is infinite. If there are no observed ES estimates in the second interval, then the ML estimate of lambda is zero, outside of the parameter space.

In some of my work, I've implemented an ad hoc fix for the issue by moving the p-value threshold around so that there are at least three ES estimates in each interval. This isn't based on any principle in particular, although Jack Vevea once suggested to me that this might be the sort of thing an analyst might do just to get the model to converge. A more principled way to fix the issue would be to use penalized likelihood or Bayesian methods with an informative prior on lambda. See the publipha package (https://cran.r-project.org/package=publipha) for one implementation.

James

On Sat, Mar 16, 2024 at 10:23 PM Will Hopkins via R-sig-meta-analysis <r-sig-meta-analysis at r-project.org> wrote:
No-one has responded to this issue. It's now causing a problem in my
simulations when I am analyzing for publication bias arising from deletion
of 90% of nonsignificant study estimates and ending up with small numbers
(10-30) of included studies. See below (and attached as an easier-to-read
text file) for an example. Two of the 14 study estimates (Row 8 and 9) were
non-significant, but the original t value (tOrig) would have made them
significant in selmodel(..., type = "step", steps = (0.025)). So I processed
any observations with non-significant p values and t>1.96 by replacing the
standard error (YdelSE) with Ydelta/1.95. The resulting new t values (tNew)
are 1.95 for both those observations, whereas all the other t values are
unchanged. So they should be non-significant in selmodel, right? But I
still get this error message:
Error in selmodel.rma.uni(x, type = "step", steps = (0.025)) :
One or more intervals do not contain any observed p-values (use
'verbose=TRUE' to see which).
I must be doing something idiotic, but what? Help, please!
Oh, and thanks again to Tobias Saueressig for his help with
list-processing of the objects created by rma, selmodel and confint. My
original for-loop approach fell over when the values of the Sim variable
were not consecutive integers (for example, when I had generated the sims
and then deleted any lacking non-significant study estimates), but separate
processing of the lists as suggested by Tobias worked perfectly. It stops
working when it crashes out with the above error, but hopefully someone
will solve that problem.
Will
     Sim StudID Sex    SSize Ydelta YdelSE tOrig  tNew      pValue
   <dbl>  <dbl> <fct>  <dbl>  <dbl>  <dbl> <dbl> <dbl>       <dbl>
 1   448      1 Female    10   3.72  0.684  5.44  5.44    0.000413
 2   448      6 Female    10   3.08  0.901  3.42  3.42     0.00766
 3   448     11 Female    10   4.49  0.926  4.85  4.85    0.000906
 4   448     21 Female    28   4.95  0.777  6.37  6.37 0.000000808
 5   448     26 Female    12   3.82   1.25  3.06  3.06      0.0109
 6   448     31 Female    22   2.13  0.991  2.15  2.15      0.0433
 7   448     36 Female    10   3.27   1.13  2.89  2.89      0.0177
 8   448     10 Male      18   4.46   2.29  2.03  1.95      0.0578
 9   448     14 Male      10    3.2   1.64  1.98  1.95      0.0795
10   448     17 Male      13   4.32   1.97  2.19  2.19       0.049
11   448     30 Male      10   1.16  0.467  2.48  2.48      0.0348
12   448     38 Male      10   3.61   1.24  2.91  2.91      0.0175
13   448     39 Male      10   2.49  0.828  3.01  3.01      0.0148
14   448     40 Male      28   1.92  0.602  3.19  3.19      0.0036
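
[Editorial sketch, not part of the original message.] One way to see which interval each estimate lands in is to tally the p-values against the step boundaries by hand. The snippet below is a minimal Python illustration, independent of metafor. It assumes the p-values are one-sided and taken from a standard normal reference distribution for estimate/SE, as the original message further down describes. The helper names one_sided_p and count_per_interval are made up for this example, and the t values are the rounded tNew column from the table above.

```python
from statistics import NormalDist


def one_sided_p(t):
    """One-sided (upper-tail) p-value from a standard normal distribution."""
    return 1.0 - NormalDist().cdf(t)


def count_per_interval(t_values, steps=(0.025,)):
    """Count p-values falling in [0, s1], (s1, s2], ..., (sk, 1]."""
    cuts = list(steps) + [1.0]          # upper bound of each interval
    counts = [0] * len(cuts)
    for t in t_values:
        p = one_sided_p(t)
        for i, upper in enumerate(cuts):
            if p <= upper:              # first interval whose upper bound covers p
                counts[i] += 1
                break
    return counts


# Rounded tNew values copied from the table above (rows 1 to 14).
t_new = [5.44, 3.42, 4.85, 6.37, 3.06, 2.15, 2.89,
         1.95, 1.95, 2.19, 2.48, 2.91, 3.01, 3.19]

counts = count_per_interval(t_new)
print(counts)  # one count per interval: [0, .025], then (.025, 1]
```

For the rounded values printed in the table, this tally gives 12 estimates in [0, .025] and 2 in (.025, 1]. A zero in either position is the situation the error message describes.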
*From:* Will Hopkins <willthekiwi at gmail.com>
*Sent:* Friday, March 15, 2024 8:39 AM
*To:* 'R Special Interest Group for Meta-Analysis' <
r-sig-meta-analysis at r-project.org>
*Subject:* Calculation of p values in selmodel
According to your documentation, Wolfgang, the selection models in
selmodel are based on the p values of the study estimates, but these are
computed by assuming the study estimate divided by its standard error has a
normal distribution, whereas significance in the original studies of mean
effects of continuous variables would have been based on a t distribution.
It could make a difference when sample sizes in the original studies are
~10 or so, because some originally non-significant effects would be treated
as significant by selmodel. For example, with a sample size of 10, a mean
change has 9 degrees of freedom, so a p value of 0.080 (i.e.,
non-significant, p>0.05) in the original study will be given a p value of
0.049 (i.e., significant, p<0.05) by selmodel. Is this issue likely to make
any real difference to the performance of selmodel with meta-analyses of
realistic small-sample studies? I guess that only a small (negligible?)
proportion of p values will fall between 0.05 and 0.08, in the worst-case
scenario of a true effect close to the critical value and with only 9
degrees of freedom for the SE. If it is an issue, you could include the
SE's degrees of freedom in the rma object that gets passed to selmodel.
Will
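
[Editorial sketch, not part of the original message.] The gap between the t-based and normal-based p-values can be checked numerically. The snippet below is a minimal Python illustration using only the standard library. It integrates the Student t density with Simpson's rule rather than calling a statistics package, so the function names (t_pdf, two_sided_p_t, two_sided_p_normal) and the grid size are choices made for this example. The statistic 1.98 with 9 degrees of freedom is an illustrative value for a study with a sample size of 10.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_t(t, df, n=2000):
    """Two-sided p-value from Student's t, via Simpson's rule on [0, |t|]."""
    b = abs(t)
    if b == 0.0:
        return 1.0
    h = b / n                            # n must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3              # P(0 < T < |t|)
    return 1.0 - 2.0 * central


def two_sided_p_normal(t):
    """Two-sided p-value treating the statistic as standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


t_stat, df = 1.98, 9                     # illustrative: sample size 10, mean change
p_t = two_sided_p_t(t_stat, df)
p_z = two_sided_p_normal(t_stat)
print(f"t reference (df={df}): p = {p_t:.4f}")
print(f"normal reference:     p = {p_z:.4f}")
```

With these inputs the t-based p-value comes out near 0.08 and the normal-based one just under 0.05, which is the kind of reclassification the message is asking about.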