Dear Wolfgang,

I hope you are well these days. I had some general questions related to the data structure in mixed-effects models. We are currently working with data extracted from peer-reviewed papers as well as big data extracted from state or agency surveys. The issue is that although we are including only 2-3 agency studies, each study can generate up to 1000-9000 effect sizes due to the abundance of data they produced. Conversely, the data collected from peer-reviewed articles are much smaller than that, perhaps < 800 effect sizes combined. My co-authors want to use the agency data, but I am very concerned about whether including those data makes sense. So the questions are:

1. Is it even reasonable to consider including so few studies that will pretty much dominate the entire dataset?
2. Can common mixed-effects model approaches with a study random factor account for such a disproportionate contribution from a few studies?

It would be extremely helpful to hear your perspectives. Thank you.

Best,
JU
[R-meta] Dear Wolfgang
6 messages · Ju Lee, Michael Dewey, Wolfgang Viechtbauer +1 more
Hi JU,

This is a very interesting question. My personal sense is that one should be very cautious when working with highly imbalanced datasets like this (where some studies contribute just a few effect sizes while others contribute orders of magnitude more). The usual methods for handling dependent effects (such as multilevel meta-analysis, robust variance estimation, etc.) can do strange and wacky things in this situation. As far as I know, their performance has not been evaluated with data structures like this, and my understanding is that the usual statistical theory supporting these methods doesn't necessarily apply very well.

To make progress, I would recommend first carefully reviewing your inclusion criteria and checking whether the effect sizes from the agency studies are really comparable and aligned with the effect sizes extracted from the peer-reviewed papers. One possible scenario is that the peer-reviewed papers all report outcomes only on well-validated scales, whereas the agency studies report a kitchen sink of different outcomes, including well-validated full scales but also never-validated scales, sub-scales, single items, and assorted other oddities. In this situation, I don't see any benefit to throwing in all the sub-scales and other chaff just because they're available. Better to use stricter inclusion criteria and focus only on the sound, validated measures.

Another possible scenario is that the peer-reviewed studies report one sort of information, whereas the agency studies report a categorically different sort of information (with little or no overlap in terms of the measures used, the scale of the samples, etc.). In this case, it would seem sensible to perform separate syntheses of the peer-reviewed literature and the agency studies, then scratch your head over how to understand differences between the two bodies of evidence.

Another possible scenario is that the peer-reviewed studies are all poorly reported (and potentially selectively reported), whereas the agency studies are not just completely reported but also include information on a wider range of relevant outcomes (e.g., effects over longer follow-up times) than the peer-reviewed studies. In that case, it seems useful and potentially important to include the broader range of effects from the agency studies. However, it needs to be done with care. In particular, it will be critical to understand the sources of variation *within* the agency studies. You might investigate this by conducting preliminary analyses of the effects from each agency study *on its own*, understanding what the important within-study moderators are, and only then thinking about how to line up the evidence from the agency studies with the evidence from the peer-reviewed studies.

I'd be curious to hear more about the context you're working in, and which (if any) of these scenarios you find yourself in. I'm also very interested to hear others' perspectives on this question.

Kind Regards,
James
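The last suggestion — a preliminary multilevel analysis of each agency study on its own — could be sketched in R with metafor. This is purely an illustrative sketch, not from the thread: the column names (yi, vi, site, year, temperature), the nested random-effects structure, and the simulated toy data are all assumptions standing in for one agency study's extracted effects.

```r
# Hedged sketch: a preliminary multilevel analysis of ONE agency study on its
# own, to understand within-study moderators before combining data sources.
# All names and the toy data below are hypothetical.
library(metafor)

set.seed(42)
# stand-in for the effect sizes from a single agency study:
# 5 sites sampled in each of 10 years, one estimate per site-year
dat1 <- expand.grid(site = factor(1:5), year = factor(1:10))
dat1$temperature <- rnorm(nrow(dat1), mean = 15, sd = 2)
dat1$yi <- 0.3 + 0.05 * dat1$temperature + rnorm(nrow(dat1), sd = 0.2)
dat1$vi <- 0.04  # assumed known sampling variances

# three-level model: estimates nested in years within sites,
# with temperature as a candidate within-study moderator
res <- rma.mv(yi, vi, mods = ~ temperature,
              random = ~ 1 | site/year, data = dat1)
summary(res)
```

Fitting such a model to each agency study separately would show whether a moderator like temperature operates similarly within each one, before any attempt to line the agency evidence up with the peer-reviewed evidence.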
On Dec 9, 2020, at 12:07 PM, Ju Lee <juhyung2 at stanford.edu> wrote:
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
Dear James,

Thank you so much for these detailed thoughts and suggestions. My co-authors and I find this input extremely helpful.

Currently, we are analyzing fishery data across the US, trying to understand how habitat or environmental contexts influence fish productivity by combining both peer-reviewed and agency data. We think our current issue is most related to your first point. We have >80 peer-reviewed papers and 3-4 agency datasets. One major difference between the peer-reviewed articles and the agency data is that the agency data are collected more frequently, across larger areas and different sites, and over a longer time frame (>20 years of accumulated data). Also, the agencies indiscriminately record all catch, whereas peer-reviewed papers often report only the catch most relevant to their research question (which is also more aligned with our research question). So, after reading your comments, I envision us applying stricter rules for the agency data. We have 1000 or so effect sizes combining many peer-reviewed articles, whereas each agency dataset generates a whopping 9000-13000 effect sizes if we apply the same inclusion criteria.

Our follow-up questions are:

1. Given that we think there is still some value in including the agency data in our analysis, is it reasonable to apply different (stricter) criteria or rules just for the agency data? For example, in our peer-reviewed data, we treat samplings conducted in different areas and years as independent studies, and these effects are matched with discrete measurements for our moderator of interest (say, temperature or depth). I am wondering whether it is justifiable to apply a different protocol just for the agency data (e.g., merging and pooling across all years and sites to generate a single effect size for each fish species, or including only a randomly chosen year or site from each dataset), so that the agency data do not take over the entire dataset.

2. One option we were considering was to run our models with and without the agency data and report both. However, you pointed out that the model output (including such abnormally large studies) may not be reliable at all if there is such a huge difference in study sizes to begin with. So, my understanding is that this should not be one of our options unless we can significantly reduce the amount of agency data being incorporated?

3. Finally, you mentioned analyzing the agency data separately. I had not considered this option, but say we have two agency datasets that we combine to run our models. There would be plenty of effect sizes (coming from different sites, years, observers, and fish species) but only two levels of study (dataset 1, dataset 2). I am unsure whether this approach would make any sense, but do you have any additional thoughts on the validity of such models or approaches?

Thank you very much for your time, James. I hope my answers and follow-up questions are clear enough.

Best regards,
JU
From: James Pustejovsky <jepusto at gmail.com>
Sent: Wednesday, December 9, 2020 6:12 PM
To: Ju Lee <juhyung2 at stanford.edu>
Cc: Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl>; R meta <r-sig-meta-analysis at r-project.org>
Subject: Re: [R-meta] Dear Wolfgang
Dear Ju,

One thing which occurs to me is that if the peer-reviewed studies are on a more restricted set of species, and those are the relevant ones for your interests, then why not just use those species from the agency data? That would presumably make the sets more equal in size.

Michael
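Michael's suggestion amounts to a simple filtering step. A hedged base-R sketch, with hypothetical data frames and column names standing in for the two data sources:

```r
# Toy stand-ins for the two data sources (all names are hypothetical)
dat_peer   <- data.frame(species = c("cod", "herring"),
                         yi = c(0.2, 0.4))
dat_agency <- data.frame(species = c("cod", "herring", "sprat", "dab"),
                         yi = c(0.1, 0.3, 0.0, 0.2))

# keep only the species that the peer-reviewed studies also cover
focal_species  <- unique(dat_peer$species)
dat_agency_sub <- subset(dat_agency, species %in% focal_species)
nrow(dat_agency_sub)  # 2 of the 4 agency rows remain
```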
On 10/12/2020 22:44, Ju Lee wrote:
@Ju: I don't have immediate answers to your questions, but I just want to raise another point, since you mentioned that there are often multiple estimates for the same site over time, for many sites (some of which I suspect may be close to each other spatially), and for many species. In principle, an appropriately formulated model should be able to automatically account for these things.

Let me give a simple analogy. Suppose I want to know if group A has higher/lower blood pressure than group B. I measure the blood pressure of the individuals in group A once. In group B, I have the same number of individuals but measure their blood pressure 100 times. I obviously cannot just run a t-test here, treating the repeated measurements from group B as if there are 100 times as many people in that group. An appropriately formulated multilevel / mixed-effects model, however, will account for the (presumably) very high correlation in the repeated measurements for the same individual and effectively downweight these repeated measurements.

The same idea applies to meta-analysis. I can formulate models that allow for dependency/correlation in multiple estimates for the same site over time and that allow for spatial correlation between sites (depending on how close together they are). In fact, I was recently involved in a meta-analysis of fish abundance data where we accounted for the spatial correlation in the estimates:

Maire, A., Thierry, E., Viechtbauer, W., & Daufresne, M. (2019). Poleward shift in large-river fish communities detected with a novel meta-analysis framework. Freshwater Biology, 64(6), 1143-1156.

In this case, the estimates were slopes (for some measure over time), so we did not have to deal with temporal correlation in the estimates (except when computing the slopes and corresponding standard errors themselves). But based on this paper, I actually added spatial correlation structures to the rma.mv() function (this is in the 'devel' version). See: https://wviechtb.github.io/metafor/reference/rma.mv.html and search for "For outcomes that have a known spatial configuration".

The rma.mv() function also allows for adding random effects to account for serial/auto-correlation in estimates. This is relevant for accounting for dependency in multiple estimates from the same site over time. Inclusion of different species also raises the possibility of phylogenetic correlations; rma.mv() yet again has you covered here. One can add random effects for species with and without a corresponding correlation matrix derived from a phylogeny.

The tricky part is of course coming up with an 'appropriately formulated model'. You are stepping into cutting-edge territory here, so good luck on this one! ;)

Best,
Wolfgang
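The structures Wolfgang describes can be sketched with rma.mv(). What follows is an illustrative sketch only: the toy data, column names, and the made-up phylogenetic correlation matrix are all assumptions, and the spatial structure requires the development version of metafor, so that call is shown commented out rather than run.

```r
library(metafor)

set.seed(1)
# toy data: 4 sites x 8 years x 3 species, one estimate per combination
dat <- expand.grid(site = factor(1:4), year = 1:8,
                   species = c("sp1", "sp2", "sp3"))
dat$site_sp <- interaction(dat$site, dat$species)  # one time series per site-species
dat$yi <- rnorm(nrow(dat), mean = 0.2, sd = 0.3)
dat$vi <- 0.05

# serial (auto)correlation among the yearly estimates from the same
# site-species series ('year' must be integer-valued for struct = "AR")
res_ar <- rma.mv(yi, vi, random = ~ year | site_sp,
                 struct = "AR", data = dat)

# species random effects with a phylogenetic correlation matrix
# (here just a made-up positive-definite matrix for illustration)
phylo_cor <- matrix(c(1.0, 0.5, 0.2,
                      0.5, 1.0, 0.2,
                      0.2, 0.2, 1.0), nrow = 3,
                    dimnames = list(c("sp1", "sp2", "sp3"),
                                    c("sp1", "sp2", "sp3")))
res_ph <- rma.mv(yi, vi, random = ~ 1 | species,
                 R = list(species = phylo_cor), data = dat)

# spatial correlation between sites ('devel' version of metafor), assuming
# longitude/latitude columns and great-circle distances:
# res_sp <- rma.mv(yi, vi, random = ~ lon + lat | site,
#                  struct = "SPGAU", dist = "gcd", data = dat)
```

In a real analysis these pieces would be combined into one model (and the variance components checked carefully); the separate fits above are only meant to show the syntax for each source of dependency.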
-----Original Message----- From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Michael Dewey Sent: Friday, 11 December, 2020 12:17 To: Ju Lee; James Pustejovsky Cc: R meta Subject: Re: [R-meta] Dear Wolfgang Dear Ju One thing which occurs to me is that if the peer-reviewed studies are on a more restricted set of species and those is the relevant ones for your interests then why not just use those species from the agency data? That would presumably make the sets more equal in size. Michael On 10/12/2020 22:44, Ju Lee wrote:
Dear James, Thank you so much for these detailed thoughts and suggestions. My co-
authors and I find this input extremely helpful.
Currently, we are analyzing fishery data across the US, trying to
understand how habitat or environmental contexts influence fish productivity by combining both peer-reviewed and agency data. We think our current issue is most related to your first point. We have >80 peer-reviewed papers and 3- 4 agency dataset.
One major difference between peer-reviewed articles and agency data is
that agency data are collected more frequently, across larger areas and different sites, and over a longer time frame (> 20 yr long data accumulated). Also, they do indiscriminately record all catch, whereas peer- reviewed papers often report ones that are more relevant to their research question (one that is more aligned with our research question as well).
So, after reading your comments, I envision us applying a stricter rule
for the agency data. We have 1000 or so effect sizes combining many peer- reviewed articles, whereas each dataset generates between a whopping 9000- 13000 effect sizes if we apply the same inclusion criteria.
Our follow-up questions are: 1. Given that we think there is still some value in including the agency
data in our analysis, is it reasonable to apply different criteria or rules (more strict) just for the agency data?
For example, in our peer-reviewed data, we treat samplings conducted in
different areas and years as independent studies and as these effects are matched with discrete measurements for our moderator of interest (say temperature or depth). I am wondering if it is justifiable if we apply a different protocol just for agency data (ex. merging and pooling across all years and sites and just generate a single effect size for each fish species OR only including randomly chosen year or site from the dataset) for the sake of not taking over the entire data with agency ones.
2. One option we were considering was to run our models with and without
agency data and report both. However, you pointed out that that model output (including such an abnormally large study) may not be reliable at all if there is such a huge study size differences to begin with. So, my understanding is that this should not be one of our options unless we can significantly reduce the number of agency data being incorporated?
3. Finally, you mentioned analyzing the agency data separately. I have not
considered this option but say we have two agency datasets that we combine to run our models. There would be plenty of effect sizes (coming from different sites, years, observers, and fish species) but only two levels of study (data 1, 2). I am unsure if this approach would make any sense, but do you have any additional thoughts on the validity of such models or approaches?
Thank you very much for your time, James. I hope my answers and follow-up
questions are clear enough.
Best regards, JU
________________________________ From: James Pustejovsky <jepusto at gmail.com> Sent: Wednesday, December 9, 2020 6:12 PM To: Ju Lee <juhyung2 at stanford.edu> Cc: Viechtbauer Wolfgang (SP)
<wolfgang.viechtbauer at maastrichtuniversity.nl>; R meta <r-sig-meta- analysis at r-project.org>
Subject: Re: [R-meta] Dear Wolfgang Hi JU, This is a very interesting question. My personal sense of it is that one should be very cautious when working with such highly imbalanced datasets like this (where some studies contribute just a few effect sizes while others contribute orders of magnitude more). The usual methods for handling dependent effects (such as multi-level meta-analysis, robust variance estimation, etc.) can do strange and wacky things in this situation. As far as I know, their performance has not been evaluated with data structures like this, and my understanding is that the usual statistical theory supporting these methods doesn't necessarily apply very well. To make progress, I would recommend first carefully reviewing your inclusion criteria and checking whether the effect sizes from the agency studies are really comparable and aligned with the effect sizes extracted from the peer-reviewed papers. One possible scenario is that the peer-reviewed papers all only report outcomes on well-validated scales, whereas the agency studies report outcomes on a kitchen sink of different outcomes, including well-validated full scales but also never-validated scales, sub-scales, single items, and assorted other oddities. In this situation, I don't seem any benefit to throwing in all the sub-scales and other chaff, just because it's available. Better to use stricter inclusion criteria and just focus on the sound, validated measures. Another possible scenario is that the peer-reviewed studies report one sort of information, whereas the agency studies report a categorically different sort of information (with little or no overlap in terms of the measures used, scale of the samples, etc.). In this case, it would seem sensible to perform separate syntheses of the peer-reviewed literature and the agency studies, then scratch your head over how to understand differences between the two bodies of evidence. 
Another possible scenario is that the peer-reviewed studies are all poorly reported (and potentially selectively reported), whereas the agency studies are not just completely reported, but also include information on a wider range of relevant outcomes (e.g., effects over longer follow-up times) than the peer-reviewed stuff. In that case, it seems useful and potentially important to include the broader range of effects from the agency studies. However, it needs to be done with care. In particular, it will be critical to understand the sources of variation *within* the agency studies. You might investigate this by conducting preliminary analyses of the effects from each agency study *on its own*, understanding what the important within-study moderators are, and only then thinking about how to line up the evidence from the agency studies with the evidence from the peer reviewed studies. I'd be curious to hear more about the context you're working in, and which (if any) of these scenarios you find yourself in. I'm also very interested to hear others perspectives on this question. Kind Regards, James
On Dec 9, 2020, at 12:07 PM, Ju Lee <juhyung2 at stanford.edu> wrote:

Dear Wolfgang,

I hope you are well these days. I had some general questions related to the data structure in mixed-effect models. We are currently working with data extracted from peer-reviewed papers as well as big data extracted from state or agency surveys.

The issue we have is that although we are including only 2-3 agency studies, each study can generate up to 1000-9000 effect sizes due to the abundance of data they produced. Conversely, the data collected from peer-reviewed articles are much smaller than that, perhaps < 800 effect sizes combined. My co-authors want to use those agency data, but I am very concerned about whether including those data makes sense.

So the questions are:

1. Is it even reasonable to consider including so few studies that will pretty much dominate the entire dataset?

2. Can common mixed-effect model approaches with a study random factor account for such a disproportionate contribution from a few studies?

It would be extremely helpful to hear your perspectives.

Thank you,
Best,
JU
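[As a numeric aside on question 2 above: a base-R back-of-the-envelope calculation (all numbers simulated, not the actual data) shows how a single large study dominates under common-effect, inverse-variance weighting. A study-level random effect bounds each study's weight at roughly 1/(tau^2 + v/k) for a study with k effects of average variance v, so the imbalance is damped, but the pooled estimate is then still driven by only 2-3 studies at the study level.]

```r
set.seed(1)

## Hypothetical sampling variances of similar precision:
## 800 peer-reviewed effects vs. 9000 effects from one agency study.
vi_peer   <- runif(800,  0.05, 0.20)
vi_agency <- runif(9000, 0.05, 0.20)

## Total inverse-variance weight contributed by each source
w_peer   <- sum(1 / vi_peer)
w_agency <- sum(1 / vi_agency)

## Share of the total weight held by the single agency study
share_agency <- w_agency / (w_peer + w_agency)
round(share_agency, 2)  # ~0.92: one study carries most of the weight
```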
4 days later
Hi JU, Responses to your follow-up questions below. James
On Thu, Dec 10, 2020 at 4:44 PM Ju Lee <juhyung2 at stanford.edu> wrote:
Dear James,

Thank you so much for these detailed thoughts and suggestions. My co-authors and I find this input extremely helpful.

Currently, we are analyzing fishery data across the US, trying to understand how habitat or environmental contexts influence fish productivity by combining both peer-reviewed and agency data. We think our current issue is most related to your first point. We have >80 peer-reviewed papers and 3-4 agency datasets. One major difference between peer-reviewed articles and agency data is that agency data are collected more frequently, across larger areas and different sites, and over a longer time frame (>20 years of accumulated data). Also, they indiscriminately record all catch, whereas peer-reviewed papers often report only the catch most relevant to their research question (which is also more aligned with our research question). So, after reading your comments, I envision us applying a stricter rule for the agency data. We have 1000 or so effect sizes combining many peer-reviewed articles, whereas each agency dataset generates a whopping 9000-13000 effect sizes if we apply the same inclusion criteria.

Our follow-up questions are:

*1. Given that we think there is still some value in including the agency data in our analysis, is it reasonable to apply different (stricter) criteria or rules just for the agency data?*

*For example, in our peer-reviewed data, we treat samplings conducted in different areas and years as independent studies, and these effects are matched with discrete measurements for our moderator of interest (say, temperature or depth). I am wondering whether it is justifiable to apply a different protocol just for the agency data (e.g., merging and pooling across all years and sites to generate a single effect size for each fish species, or only including a randomly chosen year or site from the dataset) for the sake of not letting the agency data take over the entire dataset.*
As a statistician who doesn't know anything about the subject-matter, I'm afraid I don't really feel qualified to answer this question. It requires making judgements about the relevance to your research questions of the different types of sites, species, measures, time-points, etc. included in the agency data and in the peer-reviewed data. I would not recommend using a different protocol for the agency data than for the peer-reviewed data just for the sake of shrinking down the dataset. I think the thing to do is focus on which data are relevant and suitable for answering the questions you have.
2. One option we were considering was to run our models with and without the agency data and report both. However, you pointed out that model output including such abnormally large studies may not be reliable if there are such huge differences in study size to begin with. So, my understanding is that this should not be one of our options unless we can significantly reduce the number of agency effect sizes being incorporated?
It doesn't seem unreasonable to me to run your analyses with and without the agency data and report both sets of results. The problem with this approach would be figuring out how to interpret everything and draw bottom-line conclusions if the results aren't consistent. That's why I suggested running the analyses separately for the peer-reviewed data and the agency data. I think that would let you get more of a purchase on what's going on.
3. Finally, you mentioned analyzing the agency data separately. I had not considered this option, but say we have two agency datasets that we combine to run our models. There would be plenty of effect sizes (coming from different sites, years, observers, and fish species) but only two levels of study (dataset 1, dataset 2). I am unsure whether this approach would make any sense, but do you have any additional thoughts on the validity of such models or approaches?
If you model the agency data on its own, I don't think it would make sense to include study as a random effect. The results would be conditional on the available agency studies, so you'd have to interpret them accordingly. But, with so much available data, there would still be many sources of variation that could be investigated and modeled (as Wolfgang noted in his reply).
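[A minimal base-R sketch of the separate analysis described above, with study entering as a fixed moderator rather than a random effect. Everything here is simulated; 'depth' is a hypothetical within-study moderator, and lm() with inverse-variance weights is only a stand-in for a proper meta-regression so the example runs without extra packages.]

```r
set.seed(2)

## Simulated effect sizes from two agency datasets
n   <- 500
dat <- data.frame(
  study = factor(rep(c("agency1", "agency2"), each = n)),
  depth = runif(2 * n, 1, 50),       # hypothetical within-study moderator
  vi    = runif(2 * n, 0.05, 0.20)   # sampling variances
)
dat$yi <- 0.3 + 0.01 * dat$depth +
  ifelse(dat$study == "agency2", 0.15, 0) +
  rnorm(2 * n, sd = sqrt(dat$vi))

## With only two studies, 'study' is a fixed moderator, not a random
## effect; inverse-variance weights mimic a meta-regression.
fit <- lm(yi ~ depth + study, data = dat, weights = 1 / dat$vi)
coef(summary(fit))
```

[In a real analysis one would fit something like metafor::rma.mv() with mods = ~ depth + study and random effects for the sources of variation within each agency dataset (sites, years, species); the conditional interpretation noted above still applies.]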