Green and Byar (1980) Prostate Cancer Data set from Andrews and Herzberg - Data
Rolf Turner wrote:
On 25/03/2009, at 10:04 AM, Frank E Harrell Jr wrote:
Ravi Varadhan wrote:
Hi, I am looking for a data set containing the information from a randomized trial evaluating the effect of DES (diethylstilbestrol) on multiple time-to-event endpoints: prostate cancer, CVD, and other causes. The original source of this data is Green and Byar (1980). This is a popular competing risks problem that has subsequently been discussed in a number of statistical papers, including Kay (1986). Does anyone have a digital version of this data set? The data are also presented in Andrews, D. F. and Herzberg, A. M. (1985), Data. Does a digital version of all the data sets in A & H exist? Thanks very much, Ravi.
An R binary dataset is at http://biostat.mc.vanderbilt.edu/Datasets
Note that there is something strange about the AP variable, with a lot of ties at some value near 1.0. I have never been able to find any documentation about this problem. If you find any, please let me know.
Out of idle curiosity I went to have a look at this data set. I had problems. (1) The given URL didn't work for me; when I clicked on it, I got an error 404. But if I went to http://biostat.mc.vanderbilt.edu I found a link to ``Datasets'', and clicking on that got me to some data sets.
Sorry that should have been DataSets not Datasets.
(2) Scrolling down to ``Byar and Green prostate cancer data'' appeared to get me to the right place. But I couldn't see any signs of any ``R binary files''.
Please look again. It's under the heading "R". Unfortunately I used the .sav suffix for save() files in the old days. The .xls file opened with no problem in OpenOffice; it has 506 rows. Frank
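Since the .sav files are described above as save() output rather than SPSS files, load() is the matching way to read them. What follows is only a minimal sketch on a stand-in object: the names toy and toy.sav are made up for illustration, and the object name(s) stored in the real download are not stated in this thread.

```r
# Stand-in object; the real file's object name(s) are not stated in the thread.
toy <- data.frame(id = 1:3, x = c(0.5, 1.0, 1.0))

# Write it with save() under a .sav suffix; the suffix has no effect on the format.
path <- file.path(tempdir(), "toy.sav")
save(toy, file = path)

# load() restores the saved objects into the given environment and returns
# their names, which is how to find out what a file of this kind contains.
env <- new.env()
restored <- load(path, envir = env)
print(restored)
str(env[[restored[1]]])

# For the downloaded file this would be, assuming it sits in the working
# directory under the hypothetical name "prostate.sav":
# restored <- load("prostate.sav"); print(restored)
```

Loading into a fresh environment rather than the workspace avoids silently overwriting an existing object of the same name.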
The available formats appear to be *.sav (SPSS?), *.sdd (???), and *.xls. (3) I downloaded the prostate.xls file O.K. But when I tried to read it in with the read.xls() function from the gdata package, I got an error to the effect
> X <- read.xls("prostate.xls")
Converting xls file to csv file... Done.
Reading csv file... Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  no lines available in input
I was able to ``open'' the prostate.xls file with the version of Excel available on my Mac, save it as a *.csv file, and then read *that* in with read.csv().
What am I missing? *Are* there ``R binary'' files lurking about that I am somehow not seeing? Why won't read.xls() work on this data set?
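The workaround described above (export the spreadsheet to CSV, then use read.csv()) could be wrapped up roughly as follows. This is a sketch only: read_exported_csv is a hypothetical helper, the file names are assumptions, and the demonstration runs on a throwaway file with made-up values rather than the real data. It does not explain why read.xls() failed.

```r
# Hypothetical helper (not from the thread): read a CSV exported from a
# spreadsheet program and report its dimensions as a quick sanity check.
read_exported_csv <- function(path) {
  if (!file.exists(path)) {
    stop("no file at ", path, "; export the spreadsheet to CSV first")
  }
  dat <- read.csv(path, stringsAsFactors = FALSE)
  message("read ", nrow(dat), " rows and ", ncol(dat), " columns")
  dat
}

# Demonstration on a throwaway file with made-up values, so the sketch
# runs without the real download.
demo_path <- file.path(tempdir(), "demo.csv")
writeLines(c("id,value", "1,0.5", "2,1.0"), demo_path)
demo <- read_exported_csv(demo_path)

# With the real export, assuming it was saved as "prostate.csv":
# X <- read_exported_csv("prostate.csv")
```

Comparing the reported row count with what the spreadsheet program shows is a cheap check that nothing was lost in the export.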
cheers,
Rolf Turner
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University