Colleagues,
I need to simulate skewed data so I can run a sample size calculation.
I know the 2.5th, 25th, 50th, and 75th centiles of the data (32, 43, 48, 250).
# x = centile (percent), y = observed value at that centile
data <- matrix(c(75, 250, 50, 48, 25, 43, 2.5, 32), nrow = 4, ncol = 2, byrow = TRUE)
dimnames(data) <- list(NULL, c("x", "y"))
data
Is there a way I can use these values to generate simulations of the original data? Of course if the data were normally distributed this would be a piece of cake, but given the skewness, I don't know how to go about generating the values that would be expected from a distribution having the observed values at the four centiles.
Thank you,
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Confidentiality Statement:
This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
Simulate skewed data if 2.5, 25th 50th and 75 centile are known
3 messages · John Sorkin, Bert Gunter, Marc Schwartz
Hint: See below.
On Wednesday, August 5, 2015, John Sorkin <jsorkin at grecc.umaryland.edu> wrote:
Is there a way I can use these values to generate simulations of the
original data? Of course if the data were normally distributed this would
be a piece of cake,
Oh -- how? (A normal distribution is defined by 2 parameters. You appear to have 4.) If you can answer this question, you can probably answer the same question for skew data.
See also things like Johnson distributions, Pearson distributions, and other flexible distribution families. You should also probably move to stackexchange, as this is definitely a statistical matter. Once you decide what to do, R will have a package to do it.
Others may be able to offer better advice, so wait a bit before proceeding, though.
-- Bert
but given the skewness, I don't know how to go about generating the values that would be expected from a distribution having the observed values at the four centiles.
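Bert's point about parameter counts can be checked numerically. The sketch below uses base R only and fits a normal distribution to the four centiles by least squares on the quantile scale, relying on the fact that a normal quantile equals mu + sigma * z. Least squares is an assumed way of reconciling four constraints with two parameters, not something proposed in the thread, and the names `p`, `q`, `z`, `mu`, `sigma`, and `fitted_q` are illustrative.

```r
# Known centiles from the original post: probabilities and observed values.
p <- c(0.025, 0.25, 0.50, 0.75)
q <- c(32, 43, 48, 250)

# Standard normal quantiles at the same probabilities.
z <- qnorm(p)

# For a normal distribution, quantile = mu + sigma * z, so a straight-line
# fit of q on z gives least-squares estimates of mu (intercept) and sigma (slope).
# ASSUMPTION: least squares is just one possible criterion for an
# over-determined fit (4 centiles, 2 parameters).
fit   <- lm(q ~ z)
mu    <- unname(coef(fit)[1])
sigma <- unname(coef(fit)[2])

# Quantiles implied by the fitted normal, side by side with the observed ones.
fitted_q <- qnorm(p, mean = mu, sd = sigma)
print(data.frame(centile = 100 * p, observed = q, fitted = round(fitted_q, 1)))
```

Large gaps between the observed and fitted columns indicate that no single normal distribution reproduces all four centiles, which is the over-determination Bert describes. The same comparison can be repeated for any candidate family that has a quantile function.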
On Aug 5, 2015, at 4:21 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
See also things like Johnson distributions, Pearson distributions, and other flexible distribution families. You should also probably move to stackexchange, as this is definitely a statistical matter.
John,
Just to pick up on Bert's suggestion, there are some threads over on SE that discuss similar subject matter, one of which, due to my own curiosity, led me to:
https://cran.r-project.org/web/packages/rriskDistributions/index.html
which you may find of value.
Regards,
Marc Schwartz
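For comparison with the distribution-fitting route suggested in this thread, here is a minimal base-R sketch of a package-free alternative for the simulation asked about in the original post: inverse-transform sampling through a quantile function interpolated between the known centiles. Nothing in the thread says how the data behave below the 2.5th or above the 75th centile, so the minimum (30) and maximum (600) used here are hypothetical placeholders, as are the names `qfun` and `simulate_from_centiles`. The linear interpolation is also an assumption; it implies a flat density between neighbouring centiles.

```r
set.seed(1)  # reproducible draws

# Known centiles from the original post.
p_known <- c(0.025, 0.25, 0.50, 0.75)
q_known <- c(32, 43, 48, 250)

# ASSUMPTION: the endpoints are NOT given in the post. The minimum (30) and
# maximum (600) are hypothetical and should come from subject-matter knowledge.
p_all <- c(0, p_known, 1)
q_all <- c(30, q_known, 600)

# Piecewise-linear quantile function through the (probability, value) pairs.
qfun <- approxfun(p_all, q_all)

# Inverse-transform sampling: push uniform draws through the quantile function.
simulate_from_centiles <- function(n) qfun(runif(n))

# Draw a large sample and compare its centiles with the known ones.
sim <- simulate_from_centiles(1e5)
print(round(quantile(sim, p_known), 1))
```

By construction the simulated centiles track the four known values up to sampling error, but the upper quarter of the sample is driven entirely by the assumed maximum. A sample size calculation based on such data should be repeated over a range of plausible maxima.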