I have some Bernoulli data something like this:

x<-sort(runif(100,1,20))
p<-pnorm(x,10,3)
y<-as.numeric(runif(x)<p)
plot(x,y)
lines(x,p)

This plot is not very satisfactory because the ogive does not visually fit the (0,1) points very well, and also because the points tend to fall on top of one another. The second problem can be eliminated by adding vertical jitter.

However I was thinking about the following plot. Instead of plotting all the 0,1 points, divide the x axis into bins. In each bin, find the average y value. Then plot (x=average of x values in bin, y=average of 0,1 values in bin). So if I use 10 bins I have 10 points in the plot and now the y-values are proportions instead of 0/1.

Is this a plot that other people have used (refs appreciated)? If so maybe someone has code to do this. Otherwise, I am not sure of how to do this in R. Could someone help me?

Thanks very much.

Bill
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
plot of Bernoulli data
4 messages · Bill Simpson, Ben Bolker, Frank E Harrell Jr
x<-sort(runif(100,1,20))
p<-pnorm(x,10,3)
y<-as.numeric(runif(x)<p)
plot(x,y)
lines(x,p)
df<-data.frame(x,y)
aggregate(df,list(x=(x<5),(x>5)&(x<10),(x>10) & (x<15),(x>15)), FUN=mean)

gives me what I want but if anyone has a better way to collect the observations into bins I'd like to hear it. It would be nice to pass along something like

breaks<-c(5,10,15,20)

Thanks

Bill Simpson
cut()?
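
The suggestion above can be sketched roughly as follows: cut() turns a vector of breaks into a bin factor, and tapply() then gives the per-bin means. This is only an illustration under stated assumptions, not code from the thread: the seed, the extra lower break at 0 (added so that values below 5 get a bin of their own), and the names bin, xbar, phat and nbin are all made up here.

```r
set.seed(1)                            # assumption: fixed seed so the sketch is reproducible
x <- sort(runif(100, 1, 20))
p <- pnorm(x, 10, 3)
y <- as.numeric(runif(length(x)) < p)

breaks <- c(0, 5, 10, 15, 20)          # assumption: 0 added as the lower edge of the first bin
bin <- cut(x, breaks)                  # factor with levels (0,5], (5,10], (10,15], (15,20]

xbar <- tapply(x, bin, mean)           # average x within each bin
phat <- tapply(y, bin, mean)           # proportion of 1s within each bin
nbin <- table(bin)                     # number of observations per bin

plot(xbar, phat, ylim = c(0, 1),
     xlab = "x (bin mean)", ylab = "proportion of y = 1")
lines(x, p)                            # true curve for comparison
```

Note that cut() uses right-closed intervals by default, so a value exactly equal to a break falls in the bin to its left; the logical conditions in the aggregate() call drop such values altogether.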
On Tue, 2 Oct 2001, Bill Simpson wrote:
> [...]
> gives me what I want but if anyone has a better way to collect the
> observations into bins I'd like to hear it. It would be nice to pass
> along something like
> breaks<-c(5,10,15,20)
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704
The loess smoother, with outlier detection turned off,
is an excellent way to estimate the relationship
between a continuous variable and the probability
of an event, based on a binary dependent variable.
Use lowess(x,y,iter=0). I do wonder about the way in
which your data are generated, however. You
might think about
p <- whatever # and check that if you use pnorm the
# first arg to pnorm spans the right metric
y <- 1*(runif(n) <= p) # n=100 in your example
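
A minimal sketch of the lowess(x,y,iter=0) suggestion, reusing the simulation from the original post. The seed, the amount of jitter, the line types and the name sm are assumptions made for illustration only.

```r
set.seed(1)                        # assumption: fixed seed so the sketch is reproducible
n <- 100
x <- sort(runif(n, 1, 20))
p <- pnorm(x, 10, 3)
y <- 1 * (runif(n) <= p)

# iter = 0 skips the robustness iterations, so the 0/1 responses
# are not downweighted as if they were outliers
sm <- lowess(x, y, iter = 0)

plot(x, jitter(y, amount = 0.02), ylab = "y")   # small vertical jitter against overplotting
lines(sm, lty = 2)                              # smoothed estimate of P(y = 1 | x)
lines(x, p)                                     # true probability curve
```

Because the smoother fits local lines to 0/1 data, the fitted values are not forced to stay inside [0, 1] near the ends of the x range.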
Bill Simpson wrote:
> I have some Bernoulli data something like this:
>
> x<-sort(runif(100,1,20))
> p<-pnorm(x,10,3)
> y<-as.numeric(runif(x)<p)
> plot(x,y)
> lines(x,p)
> [...]
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat