Dear all, I am conducting a meta-analysis of the characteristics of suicide deaths in post-mortem studies. My aim is to describe pooled proportions of key characteristics (biological sex, suicide site, race, marital status, suicide method, the proportion with substance use near death, the proportion with a psychiatric diagnosis prior to death, etc.) across the included studies. Initially, I thought that "metaprop" from the package "meta" would be enough to pool all these proportions across included studies. However, some of these variables have more than two categories (e.g. suicide method has more than 10 categories, such as hanging, firearm, poisoning, etc.), and pooling the proportion of each suicide method separately produces results that sum to more than 100% across all suicide methods. Therefore, my first question is: is it possible to pool all those proportions using "metaprop"? If yes, could anyone give an example of the code for pooling proportions in the case of suicide methods? If not, is there any other package that would allow me to pool the aggregate proportions of suicide methods? Thank you, Thiago Roza
[R-meta] Questions about the use of metaprop for the pooling of proportions
12 messages · Michael Dewey, Gerta Ruecker, Thiago Roza +1 more
Dear Thiago, What you have is compositional data, which might prove a useful search term. A common way to analyse such data is to take the ratios of the components to a reference component and then take logs. However, that is about the sum total of my knowledge of compositional data analysis, and as far as I know there is no extant R package which deals with it. Others on the list may have better ideas. For future reference, if you also post on CrossValidated, it is best to put a link in each post so people can check whether the question has already been answered in the other place. Michael
On 06/03/2022 16:36, Thiago Roza wrote:
_______________________________________________ R-sig-meta-analysis mailing list R-sig-meta-analysis at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
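The log-ratio idea Michael mentions can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the counts are invented, the helper names alr() and alr_inv() are hypothetical, and taking the last category as the reference is an arbitrary choice.

```r
# Hypothetical counts for one study (not from the thread): deaths by method
counts <- c(hanging = 80, firearm = 10, overdose = 5, other = 5)

# Additive log-ratio (alr) transform: log of each component relative to a
# reference component (here the last one, an arbitrary choice)
alr <- function(x, ref = length(x)) {
  p <- x / sum(x)            # convert counts to proportions
  log(p[-ref] / p[ref])      # log-ratios against the reference
}

# Inverse transform: maps log-ratios back to proportions that sum to 1
alr_inv <- function(z) {
  e <- c(exp(z), 1)          # the reference component has log-ratio 0
  e / sum(e)
}

z <- alr(counts)
p_back <- alr_inv(z)
sum(p_back)                  # back-transformed proportions sum to 1
```

Working on the log-ratio scale and transforming back keeps the estimated proportions inside the simplex, which is the property that separate pooling of each category does not guarantee.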
Dear Michael, Thank you for your reply! Do you think it would be possible to generate pooled proportions for at least the most commonly reported suicide method in this case? (I would organize my dataset in the following format: "suicide by hanging" vs "other method of suicide", only two categories.) Thank you, Thiago
On Mon, 7 Mar 2022 at 13:40, Michael Dewey <lists at dewey.myzen.co.uk> wrote:
-- Michael http://www.dewey.myzen.co.uk/home.html
Dear Thiago, dear Michael,
I read this thread and I still am not clear about the nature of the
data. Are these really compositional data, or simple proportions? The
difference is:
* Compositional data are characterized by lacking a denominator (no
"n", no sample size). For each study, you have only percentages that
add to 100%. Such data occur in microbiome research (percentages of
species in the microbiome).
* By contrast, proportions are given as r (number of events) and n
(sample size, i.e., number of persons/patients/trials/whatever), or
as percentages and n.
If you have proportions, you may use metaprop. If you have compositional
data, as Michael supposed, you cannot.
Best,
Gerta
On 08.03.2022 at 12:34, Thiago Roza wrote:
Dr. rer. nat. Gerta Rücker, Dipl.-Math. Guest Scientist Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg Zinkmattenstr. 6a, D-79108 Freiburg, Germany Mail: ruecker at imbi.uni-freiburg.de Homepage: https://www.uniklinik-freiburg.de/imbi-en/employees.html?imbiuser=ruecker
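Gerta's distinction between proportions and compositional data can be made concrete with a small base-R sketch. The numbers below are invented for illustration. The point is only that the usual large-sample variance of a logit-transformed proportion, 1/r + 1/(n - r), needs the event count r and the sample size n, so it cannot be computed from percentages alone.

```r
# Hypothetical studies (invented numbers): r = deaths by hanging, n = all cases
dat <- data.frame(study = c("A", "B", "C"),
                  r = c(80, 50, 33),
                  n = c(100, 120, 60))

dat$prop  <- dat$r / dat$n                      # observed proportion
dat$logit <- log(dat$r / (dat$n - dat$r))       # logit-transformed proportion
dat$var   <- 1 / dat$r + 1 / (dat$n - dat$r)    # large-sample variance of the logit

dat
```

With only the column prop available (the compositional case), neither the weights nor a binomial likelihood could be formed.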
Dear Gerta, Thank you for your reply! In my systematic review, I have several cross-sectional original studies. Each of these studies reports a sample size (n, the total number of suicide cases included in the study), and the cases are also classified by suicide method (for instance, if n is 100, 80 cases or 80% died by hanging, 10 or 10% by firearms, 5 or 5% by drug overdose, 3 or 3% by pesticides, and so on). The same applies to other variables such as biological sex, race, suicide site, etc. The idea of my analysis is to pool the proportions of several key characteristics, including suicide methods, across all included studies, so that I can report the proportions with 95% CIs in the paper. I tried using "metaprop" to pool the proportions of suicide methods. However, when I summed up the pooled proportions, the "Inverse" method gave a sum of more than 100% and the "GLMM" method gave a sum of less than 100%. That is why I was wondering whether it is possible to pool those proportions using "metaprop". If yes, is it OK for the summed pooled proportions to differ from 100%? Thank you, Thiago
On Tue, 8 Mar 2022 at 09:27, Dr. Gerta Rücker <ruecker at imbi.uni-freiburg.de> wrote:
Dear Thiago,

So you have proportions of several mutually exclusive outcomes. Of course, these are dependent because they always sum to the total number of cases in the study (corresponding to 100% in that study). Nevertheless, I don't see any reason not to pool each outcome separately using metaprop(). In fact, depending on the transformation, the resulting average proportions will not generally sum to 100%, particularly not when using no transformation at all. This raises the question of which transformation to choose. The default in metaprop() is a random intercept logistic regression model with the logit transformation.

I made an observation that I have to think about, and you may try this. If I use the default, the sum of the pooled proportions over all outcomes is indeed always 1 for the fixed effect estimate. I used code like this (here for 3 outcomes):

    library(meta)

    #### Random data ####
    out1 <- rbinom(10, 100, 0.1)
    out2 <- rbinom(10, 100, 0.5)
    out3 <- rbinom(10, 100, 0.9)
    n <- out1 + out2 + out3

    # Pool each outcome separately
    m1 <- metaprop(out1, n)
    m2 <- metaprop(out2, n)
    m3 <- metaprop(out3, n)

    # Sum of the back-transformed fixed effect estimates
    plogis(m1$TE.fixed) + plogis(m2$TE.fixed) + plogis(m3$TE.fixed)

(plogis is the inverse of the logit transformation, often called "expit": plogis(x) = exp(x)/(1 + exp(x)).)

These seem to sum to 1 for the fixed effect estimates, but not in general for the random effects estimates, only in case of small heterogeneity (which is rarely the case with proportions). I am interested to hear whether this works with your data. (And I have to prove that this holds in general ...)

Best, Gerta

On 08.03.2022 at 13:42, Thiago Roza wrote:
Dr. rer. nat. Gerta Rücker, Dipl.-Math. Guest Scientist Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg Zinkmattenstr. 6a, D-79108 Freiburg, Germany Mail: ruecker at imbi.uni-freiburg.de Homepage: https://www.uniklinik-freiburg.de/imbi-en/employees.html?imbiuser=ruecker
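Why separately pooled proportions need not add up to 100% can be seen without any meta-analysis package. The sketch below uses two invented studies with three mutually exclusive outcomes and pools each outcome by hand with classical inverse-variance weights on the logit scale. The helper pool_iv_logit() is hypothetical, and the code illustrates the weighting only; it is not meant to reproduce metaprop() output.

```r
# Two hypothetical studies (rows) with three mutually exclusive outcomes (columns)
counts <- rbind(study1 = c(80, 15, 5),
                study2 = c(50, 30, 20))
n <- rowSums(counts)                       # study sample sizes

# Classical inverse-variance pooling of one outcome on the logit scale
pool_iv_logit <- function(r, n) {
  y <- log(r / (n - r))                    # logit of each study's proportion
  w <- 1 / (1 / r + 1 / (n - r))           # inverse of the logit's variance
  plogis(sum(w * y) / sum(w))              # back-transform the weighted mean
}

p_iv  <- apply(counts, 2, pool_iv_logit, n = n)
p_tot <- colSums(counts) / sum(n)          # total events / total sample size

sum(p_iv)    # about 1.016 here: the weights differ from outcome to outcome
sum(p_tot)   # exactly 1
```

Because each outcome gets its own set of study weights, nothing forces the back-transformed estimates onto a common scale, whereas dividing total events by total sample size uses the same denominator for every outcome.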
Dear Thiago,

I found that, apparently, the result presented by the common effect model (= fixed effect model) is simply the sum of all events over all studies, divided by the total sample size (summed over all studies). You can see this by typing the following after the code in my last e-mail:

    all.equal(sum(out1)/sum(n), plogis(m1$TE.fixed))
    all.equal(sum(out2)/sum(n), plogis(m2$TE.fixed))
    all.equal(sum(out3)/sum(n), plogis(m3$TE.fixed))

This means that the method is equivalent to considering the data as a contingency table where the rows correspond to the studies and the columns to the outcomes. The meta-analytic result corresponds to the percentages in the column sums, and of course these add to 100%. In fact this is the easiest way to deal with this kind of data.

@Guido, @Wolfgang: I couldn't find this information on the metaprop or the rma.glmm help pages. Do you see any problem with interpreting Thiago's data as a contingency table? I think that, in contrast to pairwise comparison data, confounding/ecological bias is not an issue here.

Best, Gerta

On 08.03.2022 at 19:30, Dr. Gerta Rücker wrote:
Dr. rer. nat. Gerta Rücker, Dipl.-Math. Guest Scientist Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg Zinkmattenstr. 6a, D-79108 Freiburg, Germany Mail: ruecker at imbi.uni-freiburg.de Homepage: https://www.uniklinik-freiburg.de/imbi-en/employees.html?imbiuser=ruecker
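The contingency-table reading Gerta describes can be written down directly in base R. The table below is invented (three hypothetical studies, three methods); the pooled proportion of each method is its column total divided by the grand total.

```r
# Hypothetical study-by-method table of counts (invented numbers)
tab <- matrix(c(80, 10, 10,
                50, 30, 20,
                33, 15, 12),
              nrow = 3, byrow = TRUE,
              dimnames = list(study  = c("A", "B", "C"),
                              method = c("hanging", "firearm", "other")))

addmargins(tab)                    # table with row and column totals

# Pooled proportion per method: column total over grand total
pooled <- colSums(tab) / sum(tab)
pooled
sum(pooled)                        # adds to 1 by construction
```

This corresponds to the common effect estimate discussed in the thread, so it carries the same homogeneity assumption and says nothing about between-study heterogeneity.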
Hi Gerta,

Under homogeneity, we have X_i ~ Binomial(n_i, pi), in which case sum(X_i) ~ Binomial(sum(n_i), pi) and hence

    sum(out1)/sum(n)
    plogis(coef(glm(out1/n ~ 1, weights = n, family = binomial)))

or using metaprop() / rma.glmm()

    plogis(metaprop(out1, n)$TE.fixed)
    plogis(coef(rma.glmm(measure="PLO", xi=out1, ni=n, method="EE")))

are all identical. It goes to show how the logistic regression approach gives an 'exact' model, based on the exact distributional properties of binomial counts.

As for Thiago's data: I think this is fine. But essentially he has multinomial data. I recently described in a post how such data could be addressed if one would want to analyze them all simultaneously: https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2022-February/003878.html

Best, Wolfgang
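The equivalence Wolfgang describes can be checked with base R alone, and extended to several mutually exclusive outcomes at once. The sketch below simulates multinomial counts (the seed, sample sizes, and probabilities are arbitrary assumptions) and fits one intercept-only logistic regression per outcome; pool_glm() is a hypothetical helper name.

```r
set.seed(1)                                    # arbitrary seed for reproducibility

# Simulated counts: rows = 3 mutually exclusive outcomes, columns = 10 studies
counts <- rmultinom(10, size = 100, prob = c(0.1, 0.3, 0.6))
n <- colSums(counts)                           # study sample sizes

# Intercept-only logistic regression for one outcome, back-transformed
pool_glm <- function(r, n) {
  fit <- glm(cbind(r, n - r) ~ 1, family = binomial)
  unname(plogis(coef(fit)))
}

p_glm    <- apply(counts, 1, pool_glm, n = n)  # one model per outcome
p_pooled <- rowSums(counts) / sum(n)           # total events / total sample size

all.equal(p_glm, p_pooled, tolerance = 1e-6)   # agreement up to fitting tolerance
sum(p_glm)                                     # the separate estimates sum to 1
```

Each model is fitted separately, yet the estimates sum to 1 because every one of them reduces to total events divided by the same total sample size.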
-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Dr. Gerta Rücker
Sent: Tuesday, 08 March, 2022 20:30
To: Thiago Roza
Cc: r-sig-meta-analysis at r-project.org
Subject: Re: [R-meta] Questions about the use of metaprop for the pooling of proportions
Hi Wolfgang,

Thank you! Indeed, I just saw that the ML estimate under the binomial model and the assumption of homogeneity gives (sum r_i)/(sum n_i). In fact this seems equivalent to logistic regression. It probably also works under the multinomial model, but I didn't write this down. I admit that I had never thought about this :(

Best,
Gerta

On 08.03.2022 at 22:58, Viechtbauer, Wolfgang (SP) wrote:
Hi Gerta,

Under homogeneity, we have X_i ~ Binomial(n_i, pi), in which case sum(X_i) ~ Binomial(sum(n_i), pi), and hence

    sum(out1)/sum(n)
    plogis(coef(glm(out1/n ~ 1, weights = n, family = binomial)))

or, using metaprop() / rma.glmm(),

    plogis(metaprop(out1, n)$TE.fixed)
    plogis(coef(rma.glmm(measure="PLO", xi=out1, ni=n, method="EE")))

are all identical. It goes to show how the logistic regression approach gives an 'exact' model, based on the exact distributional properties of binomial counts.

As for Thiago's data: I think this is fine. But essentially he has multinomial data. I recently described in a post how such data could be addressed if one wanted to analyze them all simultaneously:

https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2022-February/003878.html

Best,
Wolfgang
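The binomial result can be written out in a few lines. The multinomial extension is only conjectured in this thread, so the second part below is the standard maximum likelihood result added as a sketch. The symbols x_i (observed count in study i), x_{ik} (count in category k of study i), and K (number of categories) are notation introduced here for illustration.

```latex
% Binomial case under homogeneity: x_i ~ Binomial(n_i, \pi)
\ell(\pi) = \sum_i \bigl[ x_i \log \pi + (n_i - x_i) \log(1 - \pi) \bigr] + \text{const}

\frac{\partial \ell}{\partial \pi}
  = \frac{\sum_i x_i}{\pi} - \frac{\sum_i (n_i - x_i)}{1 - \pi} = 0
\quad\Longrightarrow\quad
\hat{\pi} = \frac{\sum_i x_i}{\sum_i n_i}

% Multinomial case: maximise \sum_i \sum_k x_{ik} \log \pi_k
% subject to \sum_k \pi_k = 1, where \sum_k x_{ik} = n_i
\hat{\pi}_k = \frac{\sum_i x_{ik}}{\sum_i n_i},
\qquad
\sum_{k=1}^{K} \hat{\pi}_k
  = \frac{\sum_i \sum_k x_{ik}}{\sum_i n_i}
  = \frac{\sum_i n_i}{\sum_i n_i} = 1
```

Under the homogeneity assumption, the category-wise estimates therefore coincide with the column proportions of the pooled table and add up to 1.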
Dr. rer. nat. Gerta Rücker, Dipl.-Math.
Guest Scientist
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg
Zinkmattenstr. 6a, D-79108 Freiburg, Germany
Mail: ruecker at imbi.uni-freiburg.de
Homepage: https://www.uniklinik-freiburg.de/imbi-en/employees.html?imbiuser=ruecker
Dear Gerta and Wolfgang,

Thank you for the replies! The fixed effect model works just fine for my multinomial data (the sum of the proportions of all suicide methods is now 100%!). I think that in this case, I will use the random-effects model for the binomial data in metaprop and the fixed effect model for the multinomial data!

Thank you for your help!

Thiago
Happy to see other people spending time at 11pm thinking about this kind of stuff :)

If we want to be really precise, the MLE of the logit-transformed true proportion is qlogis((sum r_i)/(sum n_i)) for the logistic regression model with a logit link, but since MLEs are invariant under transformations, plogis(qlogis((sum r_i)/(sum n_i))) = (sum r_i)/(sum n_i) is the MLE of the true proportion. In fact, this is neatly demonstrated by fitting the logistic regression with an identity link (do we even call this 'logistic' regression?!?):

    coef(glm(out1/n ~ 1, weights = n, family = binomial(link = "identity")))

That all of this happens 'automagically' is really a neat feature of logistic regression.

Best,
Wolfgang
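The invariance argument can be checked numerically in base R. The sketch below uses simulated counts (an assumption for illustration, in the same style as the random data earlier in the thread) and compares the closed-form pooled proportion with the intercept-only binomial fits under the logit and identity links.

```r
# Illustrative data only: 10 studies of size 100, true proportion 0.1
set.seed(42)
n    <- rep(100, 10)
out1 <- rbinom(10, n, 0.1)

# Closed-form MLE under homogeneity
p_pooled <- sum(out1) / sum(n)

# Intercept-only fit with logit link: coefficient is on the logit scale
fit_logit <- glm(out1/n ~ 1, weights = n, family = binomial)
p_logit   <- plogis(unname(coef(fit_logit)))

# Same model with identity link: coefficient is the proportion itself
fit_ident <- glm(out1/n ~ 1, weights = n, family = binomial(link = "identity"))
p_ident   <- unname(coef(fit_ident))

# All three should agree up to the convergence tolerance of glm()
print(c(pooled = p_pooled, logit = p_logit, identity = p_ident))
```

The two fits differ only in the scale on which the intercept is estimated, which is why back-transforming the logit-scale coefficient recovers the same value as the identity-link coefficient.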
Hi all,

On 08.03.2022 at 23:19, Viechtbauer, Wolfgang (SP) wrote:
Happy to see other people spending time at 11pm thinking about this kind of stuff :)
Yes. That's typical mathematicians' behaviour.
If we want to be really precise, the MLE of the logit-transformed true proportion is qlogis((sum r_i)/(sum n_i)) for the logistic regression model with a logit link, but since MLEs are invariant under transformations, plogis(qlogis((sum r_i)/(sum n_i))) = (sum r_i)/(sum n_i) is the MLE of the true proportion. In fact, this is neatly demonstrated by fitting the logistic regression with an identity link (do we even call this 'logistic' regression?!?):

    coef(glm(out1/n ~ 1, weights = n, family = binomial(link = "identity")))

That all of this happens 'automagically' is really a neat feature of logistic regression.
Nice! Good night then :) Gerta