
Coagulation data by Box/Hunter/Hunter

3 messages · T. Murlidharan Nair, (Ted Harding), Kjetil Halvorsen

#
Hi!!
Has anyone used the coagulation data for statistical analysis? I managed to
get the data from the web but am unsure of the way it is supposed to be read.
I am new to R, so I am trying to familiarize myself with the statistical tools
using available data. I am appending the data I got off the web. If anyone is
aware of its use I would welcome their input.
Cheers always!!
Murli

=======================================================================
THIS IS DATAPLOT DATA FILE   BOXBLOOD.DAT
DIET EFFECT ON BLOOD COAGULATION
BOX, HUNTER & HUNTER (1978)
STATISTIC FOR EXPERIMENTERS
WILEY, PAGE 165-197 (MAIN EXAMPLE OF CHAPTER 6)
COMPLETELY RANDOMIZED DESIGN
NUMBER OF OBSERVATIONS = 24
TOTAL NUMBER OF VARIABLES PER LINE IMAGE = 3
   RESPONSE VARIABLE = BLOOD COAGULATION TIME
   FACTOR 1 = DIET (4 LEVELS)
   FACTOR 2 = RUN SEQUENCE
TO READ THIS FILE INTO DATAPLOT (AND ANALYZE)--
   SKIP 25
   READ BOXBLOOD.DAT Y X1 RUNSEQ
   CHAR X ALL
   PLOT Y X1 X1
   .
   ER; ANOVA Y X1
   .
   PLOT RES X1 X1
   PLOT RES PRED PRED
   PLOT RES RUNSEQ
   NORMAL PROBABILITY PLOT RES
 Y   X1  RUNSEQ
-----------------
62    1    20
60    1     2
63    1    11
59    1    10
63    2    12
67    2     9
71    2    15
64    2    14
65    2     4
66    2     8
68    3    16
66    3     7
71    3     1
67    3    17
68    3    13
68    3    21
56    4    23
62    4     3
60    4     6
61    4    18
63    4    22
64    4    19
63    4     5
59    4    24
#
On 19-Aug-04 T. Murlidharan Nair wrote:
[Original dataset omitted]

Hi Murli,

There are several approaches to getting started with this sort of
situation in R. The approach I usually adopt is generally as follows.

First, copy the original data to a file, say "blood.csv", and then
edit this file to remove all except the line which lists the variables
and the lines of data themselves; then edit these lines to replace
the spaces separating data items on a line with a single comma.
In this way you obtain

Y,X1,RUNSEQ
62,1,20
60,1,2
63,1,11
59,1,10
63,2,12
67,2,9
71,2,15
64,2,14
65,2,4
66,2,8
68,3,16
66,3,7
71,3,1
67,3,17
68,3,13
68,3,21
56,4,23
62,4,3
60,4,6
61,4,18
63,4,22
64,4,19
63,4,5
59,4,24

Then start R, and enter commands like

D <- read.csv("blood.csv")
Y <- D$Y; X1 <- D$X1; RUNSEQ <- D$RUNSEQ

Now R has all the data, both combined into a dataframe D and
also as separate variables. From this point on, you can do
what you like.
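
If hand-editing the file is inconvenient, read.table() can take the
whitespace-separated rows as they are. The following is only a sketch of
one way to do it: the rows are pasted in from the original post and the
column names are supplied by hand. The commented-out line at the end
assumes the unedited file has been saved locally as BOXBLOOD.DAT with
its header intact, so that the file's own "SKIP 25" hint applies.

```r
# The 24 data rows, copied from the original post.
raw <- "62    1    20
60    1     2
63    1    11
59    1    10
63    2    12
67    2     9
71    2    15
64    2    14
65    2     4
66    2     8
68    3    16
66    3     7
71    3     1
67    3    17
68    3    13
68    3    21
56    4    23
62    4     3
60    4     6
61    4    18
63    4    22
64    4    19
63    4     5
59    4    24
"

# Whitespace is read.table's default separator, so no commas are needed.
D <- read.table(text = raw, col.names = c("Y", "X1", "RUNSEQ"))
str(D)   # 24 obs. of 3 variables

# Assumption: the unedited file sits in the working directory.
# D <- read.table("BOXBLOOD.DAT", skip = 25, col.names = c("Y", "X1", "RUNSEQ"))
```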

Some examples:

  plot(X1,Y)

  ix<-(X1==1);  plot(RUNSEQ[ix],Y[ix],pch=1,xlim=c(0,25),ylim=c(40,80))
  ix<-(X1==2);points(RUNSEQ[ix],Y[ix],pch=2,col="red")
  ix<-(X1==3);points(RUNSEQ[ix],Y[ix],pch=3,col="green")
  ix<-(X1==4);points(RUNSEQ[ix],Y[ix],pch=4,col="blue")

  X1.f <- factor(X1); D.lm <- lm(Y ~ X1.f)
  summary(D.lm)

  plot(RUNSEQ, resid(D.lm))

and so on.
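
The Dataplot header in the original post lists an ANOVA followed by
several residual plots. A rough R counterpart might look like the sketch
below. The data vectors are retyped from the post so that the block runs
on its own, and the choice of anova(), fitted(), resid() and qqnorm() is
one possible mapping of those Dataplot commands, not something the data
file prescribes.

```r
# Data retyped from the post: response, diet (4 levels), run order.
Y <- c(62, 60, 63, 59,
       63, 67, 71, 64, 65, 66,
       68, 66, 71, 67, 68, 68,
       56, 62, 60, 61, 63, 64, 63, 59)
X1 <- rep(1:4, times = c(4, 6, 6, 8))
RUNSEQ <- c(20, 2, 11, 10,
            12, 9, 15, 14, 4, 8,
            16, 7, 1, 17, 13, 21,
            23, 3, 6, 18, 22, 19, 5, 24)

X1.f <- factor(X1)
D.lm <- lm(Y ~ X1.f)

# One-way ANOVA table for the diet effect (Dataplot: ANOVA Y X1).
D.aov <- anova(D.lm)
print(D.aov)

# Residual diagnostics, one panel per Dataplot PLOT command.
op <- par(mfrow = c(2, 2))
plot(X1, resid(D.lm), xlab = "Diet", ylab = "Residual")              # PLOT RES X1
plot(fitted(D.lm), resid(D.lm), xlab = "Fitted", ylab = "Residual")  # PLOT RES PRED
plot(RUNSEQ, resid(D.lm), xlab = "Run order", ylab = "Residual")     # PLOT RES RUNSEQ
qqnorm(resid(D.lm)); qqline(resid(D.lm))                             # NORMAL PROBABILITY PLOT RES
par(op)
```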

Good hunting!
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 167 1972
Date: 20-Aug-04                                       Time: 11:50:11
------------------------------ XFMail ------------------------------
#
The coagulation dataset is from chapter 6 of Box, Hunter & Hunter, and is
used to introduce comparison of more than two treatments (ANOVA). It is
supposedly coagulation times of blood extracted from animals, after
randomizing the animals to four groups and giving four different diets.

Some use in R:

coag <- matrix( scan(), 24, 3, byrow=TRUE)
   # cut and paste the 24 data rows from your post at the scan() prompt

colnames(coag) <- c("time","diet","order")
coag <- as.data.frame(coag)
coag$diet <- as.factor(coag$diet)
oneway.test(time ~ diet, data=coag, var.equal=TRUE)

        One-way analysis of means

data:  time and diet
F = 13.5714, num df = 3, denom df = 20, p-value = 4.658e-05

with(coag, stripchart(time ~ diet))
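
To see where the F value above comes from, the sums of squares can be
computed by hand. This is only an illustrative sketch: the data are
retyped from the original post, and the intermediate names (ss_between,
ss_within and so on) are made up for the example.

```r
# Coagulation times and diet labels, retyped from the post.
time <- c(62, 60, 63, 59,
          63, 67, 71, 64, 65, 66,
          68, 66, 71, 67, 68, 68,
          56, 62, 60, 61, 63, 64, 63, 59)
diet <- factor(rep(1:4, times = c(4, 6, 6, 8)))

grand_mean  <- mean(time)                  # overall mean
group_means <- tapply(time, diet, mean)    # one mean per diet
group_n     <- tapply(time, diet, length)  # animals per diet

# Between-diet and within-diet sums of squares.
ss_between <- sum(group_n * (group_means - grand_mean)^2)
ss_within  <- sum((time - ave(time, diet))^2)  # ave() gives each row its diet mean

df_between <- nlevels(diet) - 1
df_within  <- length(time) - nlevels(diet)

# Ratio of mean squares, and its upper-tail p-value.
F_stat <- (ss_between / df_between) / (ss_within / df_within)
p_val  <- pf(F_stat, df_between, df_within, lower.tail = FALSE)
round(F_stat, 4)   # 13.5714, as reported by oneway.test
```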

Kjetil Halvorsen