Dear all,

Could someone point me to a function or algorithm to generate random bivariate binomial data?

Some details about what I'm trying to do. I have a dataset of trees that were categorised as not damaged or damaged. Each tree is measured twice (once in each of two consecutive years). The trees can recover from the damage, but the data are clearly correlated, as a (un)damaged tree is more likely to stay (un)damaged. A GEE model indicates a correlation of about 0.6 between the measures on the same tree. I need to simulate similar datasets to do some power calculations.

Best regards,

Thierry

----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
Generate bivariate binomial data
4 messages · ONKELINX, Thierry, Christian Ritz, Charles C. Berry
Hi Thierry,

I think it should be possible to generate such correlated data using a mixed model approach:

1) Generate pairs of correlated linear predictor values using an ordinary linear mixed model setup (for example using rnorm() repeatedly)
2) Back-transform these values using the inverse logit (or similar link function) to obtain probabilities
3) Draw binomial responses using the probabilities from 2) (using rbinom())

You may need to use trial-and-error to get the right amount of correlation.

Christian
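The three steps above can be sketched roughly as follows. This is only an illustration, not Christian's own code: it assumes the simplest mixed model setup, a single random intercept per tree shared by both years, and the names n.trees, intercept and sd.tree as well as their values are hypothetical. The random-intercept SD is the knob to adjust by trial and error.

```r
set.seed(1)                      # for reproducibility of the sketch

n.trees   <- 1000                # hypothetical number of trees
intercept <- 0                   # hypothetical fixed effect on the logit scale
sd.tree   <- 2                   # hypothetical random-intercept SD; tune by trial and error

# 1) One random intercept per tree, shared by both years, gives correlated
#    linear predictors within a tree.
tree.effect <- rnorm(n.trees, mean = 0, sd = sd.tree)
eta <- cbind(year1 = intercept + tree.effect,
             year2 = intercept + tree.effect)

# 2) Back-transform to probabilities with the inverse logit.
p <- plogis(eta)

# 3) Draw one 0/1 response per tree and year from those probabilities.
damaged <- matrix(rbinom(length(p), size = 1, prob = p),
                  nrow = n.trees, dimnames = dimnames(p))

# Empirical within-tree correlation; adjust sd.tree until it is near the target.
cor(damaged[, "year1"], damaged[, "year2"])
```

Increasing sd.tree raises the within-tree correlation of the 0/1 responses, and decreasing it lowers the correlation. A different intercept per year would allow the damage rate to change between years.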
On Fri, 17 Apr 2009, ONKELINX, Thierry wrote:
Dear all, Could someone point me to a function or algorithm to generate random bivariate binomial data? Some details about what I'm trying to do. I have a dataset of trees that were categorised as not damaged or damaged. Each tree is measured twice (once in each of two consecutive years). The trees can recover from the damage, but the data are clearly correlated, as a (un)damaged tree is more likely to stay (un)damaged. A GEE model indicates a correlation of about 0.6 between the measures on the same tree.
The commonly used sampling models for such count data (see Bishop, Fienberg, and Holland, Discrete Multivariate Analysis, 1975) involve four parameters. There are various parameterizations. In your case, the total sample size (N), the proportion of undamaged trees at the first time (pr.undam.1), and the proportions undamaged at the second time conditional on the first (pr.undam.2.undam.1, pr.undam.2.dam.1) seem like a reasonable way to parameterize the problem for your simulation.

If you have the marginal counts and the correlation, you can transform them to the above parameterization by hacking through the algebra to find the expected 2 by 2 table of counts as a function of the latter parameters. Then

    y <- rbinom( N, 1, pr.undam.1 )
    x <- rbinom( N, 1, ifelse( y == 1, pr.undam.2.undam.1, pr.undam.2.dam.1 ) )
    table(x, y)

should get you started.

HTH, Chuck
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
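The algebra Chuck mentions can be sketched as below, assuming the reported 0.6 is read as the Pearson correlation between the two 0/1 outcomes (a GEE working correlation may not match this exactly). The helper cond.probs(), its arguments p1, p2 and rho, and all numeric values are hypothetical and are not part of the original reply.

```r
# Convert marginal probabilities and a correlation into the conditional
# probabilities used in the two-step rbinom() simulation.
# p1, p2: marginal probabilities of being undamaged in year 1 and year 2.
cond.probs <- function(p1, p2, rho) {
  # Joint probability of being undamaged in both years, obtained from
  # rho = (p11 - p1 * p2) / sqrt(p1 * (1 - p1) * p2 * (1 - p2))
  p11 <- p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))
  out <- c(pr.undam.2.undam.1 = p11 / p1,
           pr.undam.2.dam.1   = (p2 - p11) / (1 - p1))
  # Not every (p1, p2, rho) combination is attainable for binary data.
  if (any(out < 0 | out > 1)) stop("rho is not attainable for these margins")
  out
}

set.seed(1)
N          <- 5000               # hypothetical sample size
pr.undam.1 <- 0.7                # hypothetical marginal proportions
pr.undam.2 <- 0.6
cp <- cond.probs(pr.undam.1, pr.undam.2, rho = 0.6)

# Two-step simulation: year 1 first, then year 2 conditional on year 1.
y <- rbinom(N, 1, pr.undam.1)
x <- rbinom(N, 1, ifelse(y == 1, cp["pr.undam.2.undam.1"], cp["pr.undam.2.dam.1"]))
table(x, y)
cor(x, y)                        # should be close to rho for large N
```

For a power calculation, the last four lines could be wrapped in a loop or replicate() call that refits the model of interest on each simulated dataset.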
4 days later
Thank you. This was very helpful.

Thierry