[R-meta] Handling meta-analysis dataset with sampling variance equals zero
Dear Tzlil,

I can understand the desire to make everything consistent across outcomes, but that does seem to make life unnecessarily complicated for you. Would it be acceptable for the third analysis to recode the outcome as binary (in this sort of game nobody sprints versus somebody sprints), or would that deviate too much from your scientific question?

Michael
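A minimal sketch of that binary recoding, assuming each estimate is stored as a (mean, SD) pair in m/min. The function name is hypothetical, and the three rows are the illustrative "meta 3" values quoted further down the thread:

```python
def recode_any_sprinting(estimates):
    """Return 1 where the group mean is above zero (somebody sprinted), else 0."""
    return [1 if mean > 0 else 0 for mean, _sd in estimates]


# Illustrative rows only: (mean, SD) of distance covered above the sprint threshold.
meta3 = [(0.0, 0.0), (0.12, 0.02), (0.0, 0.0)]
flags = recode_any_sprinting(meta3)
```

The resulting 0/1 flags could then be analysed as a binary outcome, which avoids zero sampling variances altogether; whether that still answers the scientific question is the open point raised above.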
On 11/01/2022 04:47, Tzlil Shushan wrote:
Hi James,

Thank you for the reply. To answer your questions, and to try to make this more understandable:

We are interested in estimates of the distance covered (group mean and SD, expressed in m/min) during different sided games commonly used in football (soccer) training. We then explore the effect of moderating factors such as game format (e.g. number of players), other configurations (e.g. pitch size) or rules (e.g. scoring options) on these running outcomes.

Regarding the speed: yes, different studies usually use different speed zones. For example:

Study 1 has 3 buckets (14.4-19.8 km/h, 19.8-24 km/h and everything >24 km/h) and reports the group's summary statistics of distance covered in each of the three buckets.

Study 2 also has 3 buckets (13-18 km/h, 18-22 km/h and everything >22 km/h) and reports the group's summary statistics of distance covered in each of the three buckets.

The first thing we did was to calculate the overall running from the start of each zone to infinity (i.e. everything >14.4, >19.8 and >24 in study 1, and everything >13, >18 and >22 in study 2). Of note, for the means we simply added the distances across buckets. To aggregate the SDs, we either calculated or estimated their covariances (by considering their mutual relationship).

For the final datasets we have 3 different speed zones (informed by conceptual decisions related to our field), and we meta-analyse each of them separately (three independent meta-analyses). Let's say:

Meta 1 includes all the estimates >13 km/h and up to >16 km/h
Meta 2 includes all the estimates >18 km/h and up to >22 km/h
Meta 3 includes all the estimates >24 km/h

Note: estimates are the group mean and SD of the distance covered (i.e. meters per minute).

Example dataset, meta 1:

mean (m/min)   SD (m/min)   Speed (km/h)
4              0.8          >18
3              0.5          >19.8
6              0.6          >22

For the meta-analysis that includes the highest speed values (meta 3), many studies report summary statistics of mean = 0 and SD = 0 (see the example for meta 3; i.e. no distance was covered above 24 km/h). These results make sense. For this model we get a warning message about a non-positive-definite covariance in the V-matrix, and we cannot obtain heterogeneity statistics such as the Q-statistic and I^2 (as we can for the two lower-speed models). We are keen to know what would be the most reasonable solution for reporting heterogeneity in this model.

Example dataset, meta 3:

mean (m/min)   SD (m/min)   Speed (km/h)
0              0            >24
0.12           0.02         >24.8
0              0            >25

To your last question: there are many differences in game formats within and between samples, resulting in many studies reporting multiple effect sizes for the same participants. Therefore, we use a nested approach and RVE for our models while accounting for their covariances. Later on, we conduct meta-regression to test the effect of these differences on running outcomes.
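As an aside on the bucket aggregation described earlier in this message (adding bucket means, and combining bucket SDs through their covariances), here is a rough sketch. It assumes a single common correlation between every pair of buckets, which is a simplification and not necessarily what was done in the actual dataset; the function name and the numbers are hypothetical:

```python
import math


def combine_buckets(means, sds, corr):
    """Mean and SD of the summed distance across speed buckets.

    Assumes the same correlation `corr` between every pair of buckets.
    """
    total_mean = sum(means)
    total_var = sum(sd ** 2 for sd in sds)
    # Add 2 * cov(i, j) for each pair, with cov(i, j) = corr * sd_i * sd_j.
    for i in range(len(sds)):
        for j in range(i + 1, len(sds)):
            total_var += 2 * corr * sds[i] * sds[j]
    if total_var < 0:
        raise ValueError("implied variance is negative; check the correlation")
    return total_mean, math.sqrt(total_var)


# Hypothetical study with three buckets (m/min); the top bucket was never reached.
mean_all, sd_all = combine_buckets([10.0, 4.0, 1.0], [3.0, 4.0, 0.0], corr=0.0)
```

Note that a bucket with SD = 0 contributes nothing to the combined variance, so an estimate built only from unreached buckets keeps an SD of exactly zero.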
I hope this makes more sense of the project in general and our question on heterogeneity in particular.

Kind regards,

Tzlil Shushan | Sport Scientist, Physical Preparation Coach
BEd Physical Education and Exercise Science
MSc Exercise Science - High Performance Sports: Strength & Conditioning, CSCS
PhD Candidate Human Performance Science & Sports Analytics

On 11 Jan 2022 at 14:20, James Pustejovsky <jepusto at gmail.com> wrote:
Hi Tzlil,

I am trying to understand better what your outcomes are and what questions you're trying to answer. From your explanation, it sounds like you are interested in the distribution of speeds at which a player runs during a game. So if you had the raw data for one player, you might represent it as a histogram showing the amount of distance traveled during a game (rescaled as distance traveled per minute of game play) as a function of the speed:

Speed       Distance traveled (per minute of game play)
----------- ------------------------------------------------------------
26 km/h     x
25 km/h     xx
24 km/h     x
23 km/h     xxxx
22 km/h     xxxx
21 km/h     xxxxxx
20 km/h     xxx
19 km/h     xxxxxxxx
18 km/h     xxxxxxxxxx
17 km/h     xxxxxxxxxxx
16 km/h     xxxxxxxxxxxx
15 km/h     xxxxxxxxxxxxxxxx
14 km/h     xxxxxxxxxxxxxxx
[etc.]

But from your explanation, it sounds like you only have summary statistics on this distribution, such as histograms with much coarser categories than what I have represented above. Is that correct? And if so, do different studies generally use the same set of coarse speed categories? Or does every study use different categories?

Also, I'm not clear about how you end up with a mean and an SD for each of these buckets. Is the SD a summary over multiple individual participants/players? Or over multiple repetitions for an individual?

All of the above questions are just about the outcomes you and your colleagues are examining. How would you summarize your research question? Is it about how variation in game format or game rules affects the distribution of running speeds? So the "intervention" or "treatment" of interest is a comparison of different game formats? If that is correct, do the game formats vary within sample or only between samples?

James

On Mon, Jan 10, 2022 at 7:32 PM Tzlil Shushan <tzlil21092 at gmail.com> wrote:
Dear Wolfgang, James and the team,

My colleagues and I are currently conducting a meta-analysis in the area of sports. In our analysis, we meta-analyse the exposure to high-speed running and sprinting in variations of games during football (soccer) training. For example, an outcome may be the distance covered (in meters) during a 4 versus 4 players game between 14.4 and 19.8 km/h only, or the distance covered above 24.0 km/h. Hence, our main outcome is the mean, with a sampling variance derived from the SD, expressed as meters per minute of game (we use "MN" in the escalc function).
Considering that exposure to high velocity thresholds (e.g. >24 km/h) rarely happens during such games, we have many outcomes with a mean and SD of 0 (so we get a sampling variance of 0), which also yields an overall estimate that is very close to zero. In other words, almost all of the distance covered during such games is at running speeds below what is considered 'sprinting' in football.
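To make the zero explicit: for a raw mean, the usual sampling variance is SD^2 / n, so any group reporting SD = 0 gets a variance of exactly 0 regardless of sample size. A small sketch, with a hypothetical function name and hypothetical sample sizes (none are given in this thread):

```python
def mean_sampling_variance(sd, n):
    """Sampling variance of a group mean: sd^2 / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd ** 2 / n


# Hypothetical groups of 10 players each.
v_some_sprinting = mean_sampling_variance(sd=0.02, n=10)
v_no_sprinting = mean_sampling_variance(sd=0.0, n=10)
```

The second value is 0, which is what later causes trouble for anything that relies on inverse-variance weights.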
My main question is regarding heterogeneity. Whilst building the 'V-matrix' using the clubSandwich package, we get a warning message that the matrix is non-positive definite due to the 0s in it. We basically ignore this because it makes sense to have 0s, as explained. Also, I know that when we run the model we don't get the Q-statistic and cannot calculate I^2, for the same reason.
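For context on why the zeros block these statistics, here is a rough sketch of the classical Cochran's Q and Higgins' I^2 for independent estimates. This is the simple univariate version, not the multilevel computation the fitted model would use, and the function name is hypothetical. Both quantities rely on weights of 1 / variance, which are undefined when a variance is zero:

```python
def cochran_q_i2(yi, vi):
    """Cochran's Q and I^2 (in percent) from estimates yi and sampling variances vi."""
    if any(v <= 0 for v in vi):
        raise ValueError("inverse-variance weights are undefined for zero variances")
    wi = [1.0 / v for v in vi]
    pooled = sum(w * y for w, y in zip(wi, yi)) / sum(wi)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(wi, yi))
    df = len(yi) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Two hypothetical estimates with unit variances.
q, i2 = cochran_q_i2([0.0, 4.0], [1.0, 1.0])
```

Calling the same function with any variance equal to 0 raises the error, which mirrors the situation described for the highest-speed dataset.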
I've been reading some of the past discussions on this in the group; however, I wanted to make sure and ask about reporting approaches. Is there any option to get these heterogeneity statistics with our data? Alternatively, can we simply state that we don't report these because of the nature of the dataset, which includes many 0 values, and report tau only?
The main thing is that we also have datasets for lower intensities, for which we obtain all the aforementioned heterogeneity statistics, and we want a consistent reporting strategy in the paper, unless that is impossible due to the difference in outcomes across datasets.
I appreciate your help here.

Kind regards,

Tzlil Shushan | Sport Scientist, Physical Preparation Coach
BEd Physical Education and Exercise Science
MSc Exercise Science - High Performance Sports: Strength & Conditioning, CSCS
PhD Candidate Human Performance Science & Sports Analytics
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis