[R-meta] Calculation of p values in selmodel

Thanks heaps for your response, Wolfgang. I understand the problems with specifying a p value or degrees of freedom, depending on how authors have calculated the effects. The Wald-type p value is obviously right for logs of count and hazard ratios. Standardized mean differences are a real mess already, because almost all authors use the wrong SD(s) to standardize, but you can use rma(yi=..., vi=...,...) with properly calculated yi and vi, and work out the degrees of freedom (or p value) based on a t approximation to the non-central t, which should be fine for sample sizes of 10 or more. But that assertion needs checking with simulations.

Speaking of which, the simulations I am currently doing will help determine if adding the p values goes "horribly wrong". I don't know enough R yet to figure out how to implement your selmodel(..., pval=dat$pValue) with list processing. I've got functions of functions that work, thanks to Tobias Saueressig, and yesterday ChatGPT worked out how to add code that stops the error messages from corrupting the list processing.  Here's the code I am using. Can you tell me what I need to add where?

# Define a function to apply metafor::rma()
apply_rma <- function(xxx) {
  rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=~Sex -1, data=xxx, control=list(maxiter=400), method=c("REML","DL"), level=90)
}
# Apply the function to each element of dat_list using lapply
meta_results <- lapply(dat_list, apply_rma)

# Define a function to apply metafor::selmodel()
apply_selmodel <- function(x) {
  # Try processing the element
  tryCatch({
    # This code may produce an error
    #result <- metafor::selmodel(x,  type="beta")
    result <- metafor::selmodel(x,  type="step", steps=(0.025))
    return(result)
  }, error = function(e) {
    # Handle the error (e.g., print a message)
    cat("selmodel failed with this:\n", conditionMessage(e), "\n")
    # Return nothing (skip the element) when an error occurs
    return(invisible())
  })
}
# Clear the adjusted_meta list, to avoid any confusion
adjusted_meta <- list()
# Apply the function to each element of meta_results using lapply
adjusted_meta <- lapply(meta_results, apply_selmodel)

# Remove the NULLs
adjusted_meta <- adjusted_meta[sapply(adjusted_meta, function(x) !is.null(x))]

-----Original Message-----
From: R-sig-meta-analysis <r-sig-meta-analysis-bounces at r-project.org> On Behalf Of Viechtbauer, Wolfgang (NP) via R-sig-meta-analysis
Sent: Tuesday, March 19, 2024 12:37 AM
To: R Special Interest Group for Meta-Analysis <r-sig-meta-analysis at r-project.org>
Cc: Viechtbauer, Wolfgang (NP) <wolfgang.viechtbauer at maastrichtuniversity.nl>
Subject: Re: [R-meta] Calculation of p values in selmodel

This thread intermingles several issues, so let me add my 2 cents to each one:

This thread started out with Will's criticism that the selection models implemented in selmodel() compute p-values based on a standard Wald-type test of zi = yi / sqrt(vi). This issue was also recently raised in this post:

https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2024-March/005098.html

to which I and James provided the following replies:

https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2024-March/005099.html
https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2024-March/005100.html

Let me reiterate: p-values are not just computed for the observed effect sizes, but also as part of an additional step that involves integrating over the weighted density of the effect size estimates. The weights are a function of the p-values, which in turn are a function of the integrand. So, if one wants to use p-values that are not computed based on a standard Wald-type test, one doesn't just have to change how the p-values of the observed effect sizes are computed, but also how the p-values in this integration step are computed. And this is not automatically solved by supplying degrees of freedom and then using a t-distribution. What if a study author has used a Mann?Whitney U test instead of a t-test? What if they used a permutation test? The p-values are then not coming from a t-test, so those dfs are irrelevant. For other effect size measures (e.g., odds ratios), the authors may have used a Pearson chi^2 test (with or without a continuity correction) or maybe the odds ratio came from a logistic regression model with additional predictors. If one wants to use the p-values as reported, then the p-values in the integration step would have to be computed in the same manner. Implementing this for all these possibilities is essentially impossible.

Also, if we want to be so precise, then one should not assume that yi is normally distributed, but use the exact distribution for the density and the integration step. So for d it would be a scaled and shifted non-central t-distribution, for r it would be the exact distribution of a Pearson correlation coefficient, for raw proportions it would be a binomial distribution, and so on and so on -- this will be different for every single effect size measure out there. Implementing this again is essentially impossible.

Will also asked: "Is this issue likely to make any real difference to the performance of selmodel with meta-analyses of realistic small-sample studies?" An answer to this depends on many factors (what effect size measure? how many studies? what are their sample sizes? what method was used to compute the actual p-values (could be different for the different studies)? which selection model is being used? what is the true selection function?) and as far as I know, there are no simulation studies addressing this question in the first place, so the answer is: Nobody knows.

Based on this, I did a small simulation study for standardized mean differences and found no noteworthy difference between using Wald-type tests instead of the 'proper' t-tests with k=50 studies and sample sizes between 20 to 60. But there are many factors one could manipulate here so this might not generalize.

Also, as I also mentioned before, this concern is probably a relatively minor issue in comparison to the fact that selection models are really rough approximations to a much more complex data generating mechanism anyway. The moment some study authors are deviating in their behavior from the assumed selection model (which is guaranteed to be case), then this is likely to have a much greater impact on the performance of selection models.

However, based on this discussion, I have now implemented the possibility to supply p-values directly to selmodel() via the *undocumented* 'pval' argument. To illustrate with Will's dataset:

library(metafor)

dat <- structure(list(Sim = c(448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L, 448L), StudID = c(1L, 6L, 11L, 21L, 26L, 31L, 36L, 10L, 14L, 17L, 30L, 38L, 39L, 40L), Sex = c("Female", "Female", "Female", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male", "Male"), SSize = c(10L, 10L, 10L, 28L, 12L, 22L, 10L, 18L, 10L, 13L, 10L, 10L, 10L, 28L), Ydelta = c(3.72, 3.08, 4.49, 4.95, 3.82, 2.13, 3.27, 4.46, 3.2, 4.32, 1.16, 3.61, 2.49, 1.92), YdelSE = c(0.684, 0.901, 0.926, 0.777, 1.25, 0.991, 1.13, 2.29, 1.64, 1.97, 0.467, 1.24, 0.828, 0.602), tOrig = c(5.44, 3.42, 4.85, 6.37, 3.06, 2.15, 2.89, 2.03, 1.98, 2.19, 2.48, 2.91, 3.01, 3.19), tNew = c(5.44, 3.42, 4.85, 6.37, 3.06, 2.15, 2.89, 1.95, 1.95, 2.19, 2.48, 2.91, 3.01, 3.19), pValue = c(0.000413, 0.00766, 0.000906, 8.08e-07, 0.0109, 0.0433, 0.0177, 0.0578, 0.0795, 0.049, 0.0348, 0.0175, 0.0148, 0.0036)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14"))

res <- rma(Ydelta, sei=YdelSE, data=dat, method="ML") res

sel <- selmodel(res, type="stepfun", alternative="greater", steps=c(.025, 1)) sel

sel <- selmodel(res, type="stepfun", alternative="greater", steps=c(.025, 1), pval=dat$pValue) sel

To be clear, the p-values in the integration step are still computed based on the Wald-type test and this still assumes normally distributed effect size estimates. The impact of doing this are unclear and this might be horribly wrong. Use at your own discretion.

The other issue that was raised is empty intervals in step-function selection models. This is really a separate issue (even if it arose in one particular example due to different methods being used in the studies themselves and the selection model for computing the p-values). James nicely explained why empty intervals cause problems for ML estimation. In connection with this, I will illustrate this with an example (and some features of the selmodel() function):

library(metafor)

dat <- dat.hackshaw1998

res <- rma(yi, vi, data=dat, method="ML") res

sel <- selmodel(res, type="stepfun", alternative="greater", steps=c(.025, .05, 1), verbose=TRUE)

Another undocumented argument that was already implemented is the possibility to skip the interval check:

sel <- selmodel(res, type="stepfun", alternative="greater",
                steps=c(.025, .05, 1), skipintcheck=TRUE) sel

As one can see, the estimate of delta[2] drifts to 0 (which then causes issues with inverting the Hessian, which is yet another separate issue).

To illustrate the opposite:

sel <- selmodel(res, type="stepfun", alternative="greater", steps=c(.001, 1)) sel

Now delta[2] wants to drift to infinity, which it can't due to 100 being the default limit for the delta values (bounds are imposed for numerical stability issues). This can be adjusted:

sel <- selmodel(res, type="stepfun", alternative="greater", steps=c(.001, 1),
                control=list(delta.max=Inf)) sel

It doesn't quite get to infinity, but close enough I would say.

I had considered the option to automatically collapse empty intervals (or at least, optionally), but decided not to do this. I leave it up to the user to implement such steps if they like.

Best,
Wolfgang

[R-meta] Calculation of p values in selmodel

Thread (14 messages)