[R-meta] Meta-analysis of proportion differences (certain cells frequency)

8 messages · Wolfgang Viechtbauer, Jakub Ruszkowski, Dr. Gerta Rücker

#
Dear All, 

I am looking for guidance on performing a meta-analysis of differences in
proportions using R. Specifically, I am interested in analyzing differences
in the relative frequency of certain cells: for example, the difference in
lymphocyte% among all white blood cells between patients and healthy
individuals, without having information about the total white blood cell count. 

I would appreciate it if anyone could provide insights or point me in the
right direction regarding the appropriate R packages and methods for
conducting such a meta-analysis. On a related note, I came across information
suggesting that the difference between two beta distributions (likely
representing fraction values) does not follow a normal distribution:
https://blogs.sas.com/content/iml/2023/03/01/distribution-difference-beta.html.
Because of that, I am afraid that I cannot use the rma function. 

Thank you in advance for your time and support. 

Best regards, 

Jakub Ruszkowski
Department of Nephrology, Transplantology and Internal Medicine
Medical University of Gdańsk
3 days later
#
Dear Jakub,

Proportions like you are describing can be thought of as so-called 'compositional data' (i.e., data that describe to what extent some whole is composed of various subcomponents):

https://en.wikipedia.org/wiki/Compositional_data

For example, one might know that in a given person, 52% of their white blood cells are neutrophils, 36% are lymphocytes, 7% are monocytes, and the remaining 5% are other types. But without an actual count, these cannot be treated as binomial/multinomial counts and are just percentages (or proportions) of the whole.

Compositional data analysis is its own subfield in statistics, but whether the methods described there are relevant in the present context is not clear to me.

Since you mentioned the beta distribution: Yes, one could assume that a percentage/proportion like in the case above (i.e., a proportion of 0.36 of the white blood cells are lymphocytes) is beta distributed. But note that this is a proportion for a single individual. I would assume that there is such a proportion for multiple individuals within a group (e.g., patients). Then what is it that study authors would report? I would assume that they report the mean proportion (with hopefully also the SD of the individual proportions). If so, then one could basically just use methods for meta-analyzing means and mean differences.
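As a minimal sketch of what this could look like in base R: the data below are invented purely for illustration (three hypothetical studies reporting the mean and SD of the lymphocyte proportion and the group size for patients and controls), and the column names are just a convention. The sketch computes the raw mean difference and its sampling variance per study and then a simple inverse-variance (equal-effects) pooled estimate.

```r
# Hypothetical summary data (NOT from any real study): mean and SD of the
# lymphocyte proportion and group size for patients (1) and controls (2).
dat <- data.frame(
  study = c("A", "B", "C"),
  m1i = c(0.30, 0.28, 0.33), sd1i = c(0.08, 0.10, 0.09), n1i = c(40, 25, 60),
  m2i = c(0.36, 0.35, 0.36), sd2i = c(0.07, 0.09, 0.08), n2i = c(40, 30, 55)
)

# Raw mean difference (patients minus controls) and its sampling variance,
# without assuming equal variances in the two groups.
dat$yi <- dat$m1i - dat$m2i
dat$vi <- dat$sd1i^2 / dat$n1i + dat$sd2i^2 / dat$n2i

# Inverse-variance weighted (equal-effects) pooled estimate, standard error,
# and 95% confidence interval based on the normal distribution.
wi  <- 1 / dat$vi
est <- sum(wi * dat$yi) / sum(wi)
se  <- sqrt(1 / sum(wi))
ci  <- est + c(-1, 1) * qnorm(0.975) * se

round(c(estimate = est, se = se, ci.lb = ci[1], ci.ub = ci[2]), 4)
```

With the metafor package installed, the same two steps would presumably be escalc(measure="MD", ...) with the m1i, sd1i, n1i, m2i, sd2i, n2i arguments, followed by rma(yi, vi, data=dat), which fits a random-effects model by default rather than the equal-effects model sketched here.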

Best,
Wolfgang
#
Dear Wolfgang, 

thank you for your answer! Yes, I am aware of the compositional character of
the data (I wish all authors of primary studies were, too) and of the huge
limitations of any attempt to meta-analyze them. Unfortunately, I do not know
of any well-explained method to properly meta-analyze all components of the
composition simultaneously, which is why I thought about simplifying the issue
to an analysis of the differences for the main cell types of interest. 

Yes, the authors usually report the mean and SD of the proportions. I forgot
that sample means, even from beta (/Dirichlet) distributions, are approximately
normally distributed (by the central limit theorem)! Thank you very much for
clarifying that it is OK to use methods for mean differences. If some studies
report cell counts, would you rather analyze them together with studies
reporting only mean+SD [%] (using SMD) or treat them separately? 
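A small simulation can make the point about sample means concrete. This is only a sketch under assumed values: the beta shape parameters, the group size of 30, and the number of replications are arbitrary choices, and skew() is a helper defined here, not a function from any package. It compares the skewness of individual beta-distributed proportions with the skewness of means of groups of such proportions.

```r
set.seed(1)

# Assumed (hypothetical) right-skewed beta distribution for the individual
# proportions, with an assumed group size and number of simulated groups.
shape1 <- 2
shape2 <- 12
n      <- 30
reps   <- 5000

# Simple moment-based sample skewness (helper defined for this sketch).
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Individual proportions versus means of groups of n proportions.
x_ind <- rbeta(reps, shape1, shape2)
x_bar <- replicate(reps, mean(rbeta(n, shape1, shape2)))

# The skewness of the group means should be much closer to zero.
c(individual = skew(x_ind), sample_mean = skew(x_bar))
```

How well the normal approximation works in practice depends on the group sizes and on how skewed the individual proportions are, so it may be worth repeating the check with values resembling the included studies.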

Best wishes
Jakub 

#
But what exactly does an author of a study that reports cell counts actually report? Are they reporting the lymphocyte and total white blood cell count for each participant?

Best,
Wolfgang
#
Dear Wolfgang, 

I am currently preparing the protocol, so I cannot share real data yet. Nearly
all studies that I am currently aware of report the mean (+SD) proportions of
all cell subpopulations, whereas some studies also report individual
participant data (the proportion of each cell population separately for
each patient). I asked in order to be prepared for future challenges and to
better understand whether the SMD may be a good choice in this scenario. 

Best wishes
Jakub 

#
Dear Jakub,

(I only superficially followed this correspondence.)
These mean proportions (of compositional data) are all on the same scale (0-1). Then, why not take MD instead of SMD? I do not really see an indication for SMD.

Best,
Gerta


UNIVERSITÄTSKLINIKUM FREIBURG
Institute for Medical Biometry and Statistics

Dr. Gerta Rücker
Guest Scientist

Stefan-Meier-Straße 26 · 79104 Freiburg
gerta.ruecker at uniklinik-freiburg.de

https://www.uniklinik-freiburg.de/imbi-en/employees.html?imbiuser=ruecker

_______________________________________________
R-sig-meta-analysis mailing list @ R-sig-meta-analysis at r-project.org
To manage your subscription to this mailing list, go to:
https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
#
Dear Gerta, 

The question was about combining mean proportions and mean counts (in the rare
cases where counts are available) in one analysis: is it reasonable to analyze
them together (using the SMD) or not? 

Best
Jakub 

#
Dear Jakub,

You wrote about IPD that they report the "proportion of each of the cell populations separately for each of the patients". As I understand it (I may still misunderstand it), this is still a proportion, not a count. Thus you could use this IPD information to pool the proportions across all individuals within the study and then put the result into an aggregate data meta-analysis.
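A rough sketch of this step in base R, with invented individual proportions for a single hypothetical study (the values, the group labels, and the helper summ() are assumptions for illustration only): the IPD are reduced to the mean, SD, and n per group, which can then be entered into the aggregate data meta-analysis like any other study reporting mean and SD.

```r
# Hypothetical individual participant data from ONE study: the lymphocyte
# proportion of each participant (values invented for illustration).
ipd <- data.frame(
  group = rep(c("patient", "control"), times = c(6, 5)),
  prop  = c(0.28, 0.31, 0.25, 0.35, 0.30, 0.27,
            0.36, 0.38, 0.33, 0.40, 0.35)
)

# Helper for this sketch: reduce a vector of proportions to mean, SD, and n.
summ <- function(p) c(mean = mean(p), sd = sd(p), n = length(p))

pat <- summ(ipd$prop[ipd$group == "patient"])
con <- summ(ipd$prop[ipd$group == "control"])

# Study-level mean difference (patients minus controls) and its sampling
# variance, in the same form as for studies that report only mean and SD.
yi <- pat[["mean"]] - con[["mean"]]
vi <- pat[["sd"]]^2 / pat[["n"]] + con[["sd"]]^2 / con[["n"]]

rbind(patient = pat, control = con)
c(yi = yi, vi = vi)
```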

Best,
Gerta

