Fixed! Thanks all: RE: scatterplot to boxplot translation?

2 messages · Vining, Kelly, Bert Gunter

#
Thanks to David and Jorge - both of your helpful suggestions got me to the desired endpoint. In case anyone else has this question: I boxplotted my y variable data, but did the "cut" operation on the x variable in order to conserve the order of the y data. I see another suggestion coming in from another user that basically says this. 

So, my working line of code was:

boxplot(count$RPKM ~ cut(count$C_count, breaks=4))
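
For anyone finding this thread later, here is a minimal self-contained sketch of the same idea. The original `count` data frame was never posted, so the data below are simulated stand-ins; only the column names RPKM and C_count come from the thread, and the `bin` column is a name made up for this illustration.

```r
# Simulated stand-in for the poster's `count` data frame (an assumption:
# the real data were not posted; only the column names come from the thread).
set.seed(1)
count <- data.frame(C_count = runif(200, min = 0, max = 100))
count$RPKM <- 5 + 0.1 * count$C_count + rnorm(200)

# Bin the x variable into 4 equal-width intervals. cut() returns a factor
# whose levels are ordered along the x axis.
count$bin <- cut(count$C_count, breaks = 4)

# How many observations fall into each bin
table(count$bin)

# One box of the y variable per bin of the x variable
boxplot(RPKM ~ bin, data = count,
        xlab = "C_count (binned)", ylab = "RPKM")
```

With a single integer for `breaks`, cut() splits the observed range of x into that many equal-width intervals, so the bins can hold very different numbers of points; table() is a quick check on that.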

Much appreciation to everyone who responded...thanks for helping with a naïve question without making me feel stupid.

This discussion board is very, very good.

--Kelly V.

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Friday, December 09, 2011 11:58 AM
To: Uwe Ligges
Cc: Vining, Kelly; r-help at r-project.org
Subject: Re: [R] scatterplot to boxplot translation?
On Dec 9, 2011, at 2:50 PM, Uwe Ligges wrote:

            
In that context (having defined a cut-variable with a single-integer breaks argument), I would have thought this should work:

  boxplot(count$RPKM ~ cutRPKM)

--
David.
David Winsemius, MD
West Hartford, CT
#
Kelly:

Glad you got what you were looking for, but this whole thread begs the
question: (why) should you do this? You lose information in binning
the continuous data, of course. Perhaps your answer is that the point
scatter in the data is too noisy to clearly discern what's going on, a
legitimate response. One might then -- or in general -- consider
overlaying a fitted smooth (nonparametric) curve on the data to
reveal the "trend." There are a zillion ways to do this in R: both
lattice and ggplot have built-in capabilities to do this easily, as
does base R with ?scatter.smooth. If that's too easy, you can do it by
hand via ?lowess (or its more flexible cousin, ?loess),
smooth.spline, etc. In actuality, your binning strategy is a crude,
non-smooth version of such smoothing, so it's not that far-fetched. Or
as some of the choicer R-Help pages say, cutting and boxplotting is to
smoothing as histograms are to nonparametric density estimates.
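
As a rough illustration of the smoothing alternative, here is a minimal base-R sketch that overlays lowess and loess curves on a scatterplot. The data are simulated (the thread's data were not posted), and the variable names x, y, lo, fit and ord are made up for this example.

```r
# Simulated noisy data with a nonlinear trend (an assumption, for
# illustration only; not the data discussed in the thread).
set.seed(1)
x <- runif(200, min = 0, max = 100)
y <- 5 + 2 * sin(x / 15) + rnorm(200, sd = 0.5)

# Raw scatterplot
plot(x, y, col = "grey40", xlab = "x", ylab = "y")

# lowess() returns a list with components x (sorted) and y (smoothed),
# which lines() can draw directly.
lo <- lowess(x, y)
lines(lo, col = "red", lwd = 2)

# loess() fits a model object; fitted values come back in the original
# data order, so sort by x before drawing the curve.
fit <- loess(y ~ x)
ord <- order(x)
lines(x[ord], fitted(fit)[ord], col = "blue", lwd = 2)
```

scatter.smooth(x, y) draws the points and a loess curve in a single call, if a one-liner is all that is needed.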

Cheers,
Bert


On Fri, Dec 9, 2011 at 12:05 PM, Vining, Kelly
<Kelly.Vining at oregonstate.edu> wrote: