Adding 95% contours around scatterplot points with ggplot2
Hi Nate,

You can make it less busy using the bins argument. This is not documented, except in the examples to stat_contour, but try

    ggplot(data=data, aes(x, y, colour=(factor(level)), fill=level)) +
      geom_point() +
      stat_density2d(bins=2)

HTH,
Ista
On Mon, Jan 28, 2013 at 2:43 PM, Nathan Miller <natemiller77 at gmail.com> wrote:
Thanks Ista,

I have played a bit with stat_density2d as well. It doesn't completely capture what I am looking for and ends up being quite busy at the same time. I'm looking for a way of helping those looking at the figure to see the broad patterns of where in the x/y space the data from different groups are distributed. Using the 95% CI type idea is so that I don't end up arbitrarily drawing circles around each set of points. I appreciate your direction though.

Nate

On Mon, Jan 28, 2013 at 10:50 AM, Ista Zahn <istazahn at gmail.com> wrote:
Hi Nathan,

This only fits some of your criteria, but have you looked at ?stat_density2d?

Best,
Ista

On Mon, Jan 28, 2013 at 12:53 PM, Nathan Miller <natemiller77 at gmail.com> wrote:
Hi all,

I have been looking for a means of adding a contour around some points in a scatterplot as a way of representing the center of density of the data. I'm imagining something like a 95% confidence estimate drawn around the data. So far I have found some code for drawing polygons around the data. These look nice, but in some cases the polygons are strongly influenced by outlying points. Does anyone have a thought on how to draw a contour which is more along the lines of a 95% confidence space?

I have provided a working example below to illustrate the drawing of the polygons. As I said, I would rather have three "ovals"/95% contours drawn around the points by "level" to capture the different density distributions without the visualization being heavily influenced by outliers.

I have looked into the code provided here from Hadley

https://groups.google.com/forum/?fromgroups=#!topic/ggplot2/85q4SQ9q3V8

using the mvtnorm package and the dmvnorm function, but haven't been able to get it to work for my data example. The calculated densities are always zero (at this step of Hadley's code:

    dgrid$dens <- dmvnorm(as.matrix(dgrid), ex_mu, ex_sigma)

)

I appreciate any assistance.

Thanks,
Nate

    library(ggplot2)
    library(plyr)  # provides ddply() and .()

    x <- c(seq(0.15, 0.4, length.out=30), seq(0.2, 0.6, length.out=30),
           seq(0.4, 0.6, length.out=30))
    y <- c(0.55, x[1:29] + 0.2*rnorm(29, 0.4, 0.3),
           x[31:60]*rnorm(30, 0.3, 0.1), x[61:90]*rnorm(30, 0.4, 0.25))
    data <- data.frame(level=c(rep(1, 30), rep(2, 30), rep(3, 30)), x=x, y=y)

    # Convex hull of each group's points
    find_hull <- function(data) data[chull(data$x, data$y), ]
    hulls <- ddply(data, .(level), find_hull)

    fig1 <- ggplot(data=data, aes(x, y, colour=(factor(level)), fill=level)) + geom_point()
    fig1 <- fig1 + geom_polygon(data=hulls, alpha=.2)
    fig1
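One way to get the "ovals" described in the question is a normal-theory data ellipse per group. The following is only a sketch, not something proposed in the thread: it assumes each group is roughly bivariate normal, and the helper name normal_ellipse is made up for illustration. The ellipse is the contour of the fitted normal density expected to enclose about 95% of the distribution, so it describes where the data lie rather than giving a confidence region for the group mean. It still depends on the sample mean and covariance, so outliers pull on it, though typically less than on a convex hull. The data are the same as in the original example, with a seed added so the result is repeatable.

```r
# Hypothetical helper (not from the thread): boundary points of the ellipse
# that contains `level` of a bivariate normal fitted to (x, y).
normal_ellipse <- function(x, y, level = 0.95, n = 100) {
  centre <- c(mean(x), mean(y))
  sigma  <- cov(cbind(x, y))
  radius <- sqrt(qchisq(level, df = 2))      # chi-square cutoff for 2 dimensions
  theta  <- seq(0, 2 * pi, length.out = n)
  unit_circle <- rbind(cos(theta), sin(theta))
  # t(chol(sigma)) maps the unit circle onto the covariance ellipse
  pts <- t(centre + radius * t(chol(sigma)) %*% unit_circle)
  data.frame(x = pts[, 1], y = pts[, 2])
}

# Data from the original post, with a seed added for reproducibility
set.seed(1)
x <- c(seq(0.15, 0.4, length.out = 30), seq(0.2, 0.6, length.out = 30),
       seq(0.4, 0.6, length.out = 30))
y <- c(0.55, x[1:29] + 0.2 * rnorm(29, 0.4, 0.3),
       x[31:60] * rnorm(30, 0.3, 0.1), x[61:90] * rnorm(30, 0.4, 0.25))
data <- data.frame(level = c(rep(1, 30), rep(2, 30), rep(3, 30)), x = x, y = y)

# One ellipse per level, stacked into a single data frame
ellipses <- do.call(rbind, lapply(split(data, data$level), function(d) {
  cbind(level = d$level[1], normal_ellipse(d$x, d$y))
}))

# Draw the points with the ellipses on top (skipped if ggplot2 is not installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  fig2 <- ggplot(data, aes(x, y, colour = factor(level))) +
    geom_point() +
    geom_path(data = ellipses)
  print(fig2)
}
```

More recent ggplot2 releases also include stat_ellipse(), so adding stat_ellipse(type = "norm", level = 0.95) to the scatterplot should give a comparable result without the helper. That layer did not exist when this thread was written.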
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.