Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: Kevin Wright [mailto:kw.stat at gmail.com]
> Sent: Thursday, January 27, 2011 10:27 AM
> To: Tal Galili
> Cc: Greg Snow; r-help at r-project.org
> Subject: Re: [R] boxplot - code for labeling outliers - any suggestions
> for improvements?
>
> My colleagues who use one of the .Net languages/libraries can make
> scatter plots that look better than R's because they have better
> spreading of the labels.
>
> If someone could spread the labels on the following graph, I would be
> impressed.
>
> plot(Sepal.Length~Sepal.Width, data=iris)
> with(iris,text(Sepal.Width, Sepal.Length, 1:nrow(iris), cex=.5))
>
> Kevin
>
>
> On Thu, Jan 27, 2011 at 9:52 AM, Tal Galili <tal.galili at gmail.com>
> wrote:
> > Thanks again for the pointer to spread.labs Greg.
> >
> > I implemented it into the function and also extended it to deal with
> > formulas so it could behave just like boxplot.
> > Code and examples are available here:
> > http://www.r-statistics.com/2011/01/how-to-label-all-the-outliers-in-a-boxplot/
> >
> > I'd be happy for any suggestions on how to improve it.
> >
> > Best,
> > Tal
> >
> >
> >
> > ----------------Contact Details:----------------
> > Contact me: Tal.Galili at gmail.com | +972-52-7275845
> > Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> > www.r-statistics.com (English)
> > ------------------------------------------------
> >
> >
> >
> >
> > On Thu, Jan 27, 2011 at 1:09 AM, Greg Snow <Greg.Snow at imail.org> wrote:
> >
> >> For the last point (cluttered text), look at spread.labels in the
> >> plotrix package and spread.labs in the TeachingDemos package (I favor
> >> the latter, but could be slightly biased as well). Doing more than what
> >> those 2 functions do becomes really complicated really fast.
> >>
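As a rough illustration of what spreading labels along one axis involves, here is a minimal base R sketch. It is not the code of spread.labs or spread.labels; the helper name spread_positions and its mindiff argument are made up for this example, and the approach (one forward pass over the sorted positions, then a shift to keep the mean position unchanged) is only one simple way to do it.

```r
# Hypothetical helper, not taken from TeachingDemos or plotrix:
# move label positions apart so that neighbours are at least `mindiff` apart.
spread_positions <- function(y, mindiff) {
  ord <- order(y)
  pos <- y[ord]
  n <- length(pos)
  if (n > 1) {
    # Forward pass: each label sits at least `mindiff` above the previous one.
    for (i in 2:n) {
      pos[i] <- max(pos[i], pos[i - 1] + mindiff)
    }
    # Shift all labels so that their mean position matches the original mean.
    pos <- pos - (mean(pos) - mean(y))
  }
  # Return the adjusted positions in the original order of `y`.
  out <- numeric(n)
  out[ord] <- pos
  out
}

# Three crowded values and one isolated value.
y <- c(1, 1.1, 1.2, 5)
label_y <- spread_positions(y, mindiff = 0.5)
```

The adjusted values could then be passed as the y coordinates to text(), with segments() drawing a line from each label back to its point. A single forward pass like this can move labels far from their points when many values are crowded together, which is part of why a general solution gets complicated quickly.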
> >>
> >>
> >> > -----Original Message-----
> >> > From: r-help-bounces at r-project.org
> >> > [mailto:r-help-bounces at r-project.org] On Behalf Of Tal Galili
> >> > Sent: Wednesday, January 26, 2011 4:05 PM
> >> > To: r-help at r-project.org
> >> > Subject: [R] boxplot - code for labeling outliers - any suggestions
> >> > for improvements?
> >> >
> >> > Hello all,
> >> > I wrote a small function to add labels for outliers in a boxplot.
> >> > This function will only work on a simple boxplot/formula command
> >> > (e.g., something like boxplot(y~x)).
> >> >
> >> > Code + example follows in this e-mail.
> >> >
> >> > I'd be happy for any suggestions on how to improve this code, for
> >> > example:
> >> >
> >> >    - Handle boxplot.matrix (which shouldn't be too hard to do)
> >> >    - Handle cases of complex formulas (e.g., boxplot(y~a*b))
> >> >    - Handle cases where there are many outliers, leading to a clutter
> >> >      of text (I have no idea how to solve this systematically)
> >> >
> >> >
> >> > Best,
> >> > Tal
> >> > ------------------------------
> >> >
> >> >
> >> > # the function
> >> > boxplot.add.outlier.text <- function(DATA, x_name, y_name,
> >> >                                      label_name)
> >> > {
> >> >   # Return the rows of one group whose y value lies beyond the whiskers.
> >> >   boxplot.outlier.data <- function(xx, y_name)
> >> >   {
> >> >     y <- xx[, y_name]
> >> >     boxplot_range <- range(boxplot.stats(y)$stats)
> >> >     ss <- (y < boxplot_range[1]) | (y > boxplot_range[2])
> >> >     return(xx[ss, ])
> >> >   }
> >> >
> >> >   require(plyr)
> >> >   # Collect the outlier rows within each level of x_name.
> >> >   outlier_df <- ddply(DATA, x_name, boxplot.outlier.data,
> >> >                       y_name = y_name)
> >> >   # head(outlier_df)
> >> >   formu <- reformulate(x_name, response = y_name)
> >> >   boxdata <- boxplot(formu, data = DATA, plot = FALSE)
> >> >   boxdata_group_name <- boxdata$names[boxdata$group]
> >> >   boxdata_outlier_df <- data.frame(group = boxdata_group_name,
> >> >                                    y = boxdata$out,
> >> >                                    x = boxdata$group)
> >> >   # Label each outlier to the right of its point (pos = 4).
> >> >   for (i in seq_len(nrow(boxdata_outlier_df)))
> >> >   {
> >> >     ss <- (outlier_df[, x_name] %in% boxdata_outlier_df[i, ]$group) &
> >> >       (outlier_df[, y_name] %in% boxdata_outlier_df[i, ]$y)
> >> >     current_label <- outlier_df[ss, label_name]
> >> >     temp_x <- boxdata_outlier_df[i, "x"]
> >> >     temp_y <- boxdata_outlier_df[i, "y"]
> >> >     text(temp_x, temp_y, current_label, pos = 4)
> >> >   }
> >> >
> >> >   list(boxdata_outlier_df = boxdata_outlier_df,
> >> >        outlier_df = outlier_df)
> >> > }
> >> >
> >> > # example:
> >> > boxplot(decrease ~ treatment, data = OrchardSprays, log = "y",
> >> >         col = "bisque")
> >> > boxplot.add.outlier.text(OrchardSprays, "treatment", "decrease",
> >> >                          "colpos")
> >> >
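For the outlier-finding step on its own, a base R version without plyr might look like the sketch below. The helper name find_boxplot_outliers is hypothetical; it applies the same whisker rule as the function above (values outside the range of boxplot.stats(y)$stats within each group) and assumes the default whisker coefficient of 1.5.

```r
# Hypothetical helper: return the rows of `data` that a default boxplot of
# y_name grouped by x_name would draw as outliers.
find_boxplot_outliers <- function(data, x_name, y_name) {
  # Row indices for each level of the grouping variable.
  groups <- split(seq_len(nrow(data)), data[[x_name]], drop = TRUE)
  rows <- lapply(groups, function(idx) {
    y <- data[[y_name]][idx]
    # Whisker ends are the first and last of the five boxplot statistics.
    whiskers <- range(boxplot.stats(y)$stats)
    # which() drops NA comparisons, so missing values are never flagged.
    idx[which(y < whiskers[1] | y > whiskers[2])]
  })
  keep <- sort(as.integer(unlist(rows, use.names = FALSE)))
  data[keep, , drop = FALSE]
}

# Same data as the example above.
out_rows <- find_boxplot_outliers(OrchardSprays, "treatment", "decrease")
```

Because the rows keep their original columns, a label column such as colpos is available directly from the result, without matching on y values afterwards.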
> >> >
> >> >
> >> >
> >> >
> >> > [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-help at r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
>
>
>
> --
> Kevin Wright