I have a question about whether or not to delete some of my data. I have a dataset of 56 studies with a pooled effect size of g = 1.25. However, 4 data points report extremely large effect sizes (8.15, 6.63, 4.14, and 4.10, respectively). Statistically, they would be considered outliers and removed from the dataset. But since these data points passed the inclusion and exclusion criteria, they should arguably stay in the dataset, because they met all the requirements of my selection. If we excluded the 4 data points, wouldn't that amount to misreporting the data? What should I do? Should I exclude the 4 seemingly influential cases, or keep them for a complete list of research?

*Name*: Nick Chen (Ping-Cheng, Chen)
*School*: National Taiwan Normal University (NTNU), English Department (Master)
*Email*: wow99308008 at gmail.com
*Phone number*: +886 909 663 963
[R-meta] About whether to delete the outliers from the dataset
3 messages · Nick Chen, Michael Dewey, Reza Norouzian
Dear Nick

Apart from the two options you outline (include all, exclude four), I assume you have already investigated whether these four studies share some common feature which might explain the differences.

I would suggest presenting the full analysis as your main one and then presenting the one excluding the four as a sensitivity analysis. If the scientific conclusions are unaltered then your discussion is much simpler, but if excluding them leads to a different conclusion then your discussion section needs to offer some explanation of what is going on.

I think presenting the analysis excluding the four as the main analysis is less preferable and, of course, just reporting that analysis and ignoring the four altogether is clearly wrong (I know you did not suggest that).

Michael
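The main-analysis-plus-sensitivity-analysis approach suggested above can be sketched in a few lines. This is a minimal illustration, not the poster's actual analysis: it uses plain Python with a DerSimonian-Laird random-effects estimator (an assumption; the thread does not say which estimator was used), and the effect sizes and sampling variances are hypothetical except for the four large values quoted in the question. The names `pool_dl`, `g`, `v`, and `flagged` are made up for this sketch.

```python
import math

def pool_dl(yi, vi):
    """Random-effects pooled estimate (DerSimonian-Laird) with a 95% Wald CI."""
    k = len(yi)
    w = [1.0 / v for v in vi]                                  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wj * y for wj, y in zip(w, yi)) / sw           # common-effect estimate
    q = sum(wj * (y - fixed) ** 2 for wj, y in zip(w, yi))     # Cochran's Q
    c = sw - sum(wj ** 2 for wj in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                         # between-study variance
    w_re = [1.0 / (v + tau2) for v in vi]                      # random-effects weights
    est = sum(wj * y for wj, y in zip(w_re, yi)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"est": est, "se": se, "tau2": tau2,
            "ci_lb": est - 1.96 * se, "ci_ub": est + 1.96 * se}

# HYPOTHETICAL data: only 8.15, 6.63, 4.14, 4.10 come from the question;
# the other effect sizes and all sampling variances are invented.
g = [0.42, 0.85, 1.10, 0.63, 1.35, 0.97, 8.15, 6.63, 4.14, 4.10]
v = [0.05, 0.08, 0.06, 0.04, 0.09, 0.07, 0.60, 0.45, 0.25, 0.24]
flagged = {6, 7, 8, 9}                                         # positions of the four large effects

# Main analysis: every study that met the inclusion criteria.
full = pool_dl(g, v)

# Sensitivity analysis: same model without the flagged studies.
keep = [i for i in range(len(g)) if i not in flagged]
trimmed = pool_dl([g[i] for i in keep], [v[i] for i in keep])

for label, fit in (("all studies", full), ("excluding flagged", trimmed)):
    print(f"{label:>18}: g = {fit['est']:.2f} "
          f"[{fit['ci_lb']:.2f}, {fit['ci_ub']:.2f}], tau^2 = {fit['tau2']:.2f}")
```

Reporting both rows side by side keeps the full dataset as the primary result while showing readers how much the conclusion depends on the four large effects.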
On 10/12/2023 06:32, Nick Chen via R-sig-meta-analysis wrote:
[...]
_______________________________________________ R-sig-meta-analysis mailing list @ R-sig-meta-analysis at r-project.org To manage your subscription to this mailing list, go to: https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
Dear Nick,

It may be useful to add some context to your question so that we can assist better. For example, do you have a multilevel data structure in which each study can contribute multiple rows, or have you allowed only one row per study in your dataset?

Also, can you describe your method of outlier detection? For instance, if you are using the metafor package, have you looked at the combination of cooks.distance(), hatvalues(), and rstudent() for those large effects in your meta-regression model?

Additionally, I wonder what happens to your pooled effect's standard error (or the width of the pooled effect's confidence interval [CI]) with versus without those large effects. For example, does the width of the CI decrease substantially (e.g., by ~30%) after removing those large effects, increase, or remain largely unchanged?

Finally, depending on how much this matters for your study objectives, does retaining versus removing those large effects in your meta-regression model change the statistical significance of your pooled effect at all (i.e., significant to non-significant, or vice versa)?

Reza

On Sun, Dec 10, 2023 at 12:33 AM Nick Chen via R-sig-meta-analysis <r-sig-meta-analysis at r-project.org> wrote:
[...]
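The checks Reza asks about (change in CI width, change in significance, and how much each study moves the pooled estimate) can be roughed out as follows. This is a sketch under stated assumptions, not a substitute for metafor's cooks.distance(), hatvalues(), and rstudent(): it uses a plain-Python DerSimonian-Laird estimator and a simple leave-one-out refit as a crude influence check. All data are hypothetical except the four large effect sizes quoted in the question, and the helper names (`pool_dl`, `ci_width`, `leave_one_out`, `is_significant`) are invented for illustration.

```python
import math

def pool_dl(yi, vi):
    """Random-effects pooled estimate (DerSimonian-Laird) with a 95% Wald CI."""
    w = [1.0 / v for v in vi]
    sw = sum(w)
    fixed = sum(wj * y for wj, y in zip(w, yi)) / sw
    q = sum(wj * (y - fixed) ** 2 for wj, y in zip(w, yi))
    tau2 = max(0.0, (q - (len(yi) - 1)) / (sw - sum(wj ** 2 for wj in w) / sw))
    w_re = [1.0 / (v + tau2) for v in vi]
    est = sum(wj * y for wj, y in zip(w_re, yi)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"est": est, "se": se, "ci_lb": est - 1.96 * se, "ci_ub": est + 1.96 * se}

def ci_width(fit):
    return fit["ci_ub"] - fit["ci_lb"]

def is_significant(fit):
    """Two-sided Wald test at the 5% level."""
    return abs(fit["est"] / fit["se"]) > 1.96

def leave_one_out(yi, vi):
    """Refit the model once per study, each time omitting that study."""
    return [pool_dl(yi[:i] + yi[i + 1:], vi[:i] + vi[i + 1:]) for i in range(len(yi))]

# HYPOTHETICAL data: only 8.15, 6.63, 4.14, 4.10 come from the question.
g = [0.42, 0.85, 1.10, 0.63, 1.35, 0.97, 8.15, 6.63, 4.14, 4.10]
v = [0.05, 0.08, 0.06, 0.04, 0.09, 0.07, 0.60, 0.45, 0.25, 0.24]
flagged = {6, 7, 8, 9}

full = pool_dl(g, v)
keep = [i for i in range(len(g)) if i not in flagged]
trimmed = pool_dl([g[i] for i in keep], [v[i] for i in keep])

# Percent change in CI width after removing the flagged effects (negative = narrower).
pct_change = 100.0 * (ci_width(trimmed) - ci_width(full)) / ci_width(full)
print(f"CI width: {ci_width(full):.2f} -> {ci_width(trimmed):.2f} ({pct_change:+.0f}%)")
print(f"significant with all studies: {is_significant(full)}, "
      f"without flagged: {is_significant(trimmed)}")

# Shift in the pooled estimate when each single study is dropped.
loo = leave_one_out(g, v)
for i, fit in enumerate(loo):
    print(f"drop study {i}: g = {fit['est']:.2f} (shift {fit['est'] - full['est']:+.2f})")
```

Studies whose removal shifts the estimate far more than the others are candidates for a closer look at their design and reporting, which is a separate question from whether they should be excluded.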