boxplot

11 messages · carol white, Jim Lemon, capricy gao +5 more

#
Hi,
This must be an easy question, but how do I boxplot a subset of data?

data = read.table("my_data.txt", header = TRUE)
boxplot(data$var1[data$loc == "nice"]~data$loc_type[data$loc == "nice"])
# I want to display only the boxplots for loc == "nice",
# but this also displays loc == "mice".

Thanks

Carol
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: my_data.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20130321/bd22ee3f/attachment.txt>
#
On 03/21/2013 07:40 PM, carol white wrote:
Hi Carol,
It's them old factors sneakin' up on you. Try this:

boxplot(data$var1[data$loc == "nice"]~
  as.character(data$loc_type[data$loc == "nice"]))

Jim
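
Some background on why the extra groups appear: subsetting a factor keeps all of its levels, including those that no longer occur, and boxplot() reserves a slot for every level. The sketch below uses made-up data (the values and the loc_type labels are hypothetical, since the attachment is not reproduced here) and shows droplevels() as an alternative to as.character():

```r
# Hypothetical stand-in for my_data.txt
dat <- data.frame(
  var1     = c(1.2, 2.3, 3.1, 4.8, 2.2, 3.9, 5.0, 1.7),
  loc      = c("nice", "nice", "nice", "nice", "mice", "mice", "mice", "mice"),
  loc_type = factor(c("nice_a", "nice_b", "nice_a", "nice_b",
                      "mice_a", "mice_b", "mice_a", "mice_b"))
)

nice <- dat[dat$loc == "nice", ]
levels(nice$loc_type)   # all four levels survive the subset

# Drop the levels that no longer occur in the subset
nice$loc_type <- droplevels(nice$loc_type)
levels(nice$loc_type)   # only the "nice" levels remain

boxplot(var1 ~ loc_type, data = nice)
```

Either route should work here; droplevels() keeps the column a factor, which can matter if the level order is meaningful.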
#
Your variable loc_type combines information from two variables (loc and
type). Since you are subsetting on loc, why not just plot by type?

boxplot(var1~type, data[data$loc=="nice",])

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
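
The same idea can also be written with the subset argument of the formula method of boxplot(). This is only a sketch on invented data; the loc and type columns below are assumptions about how the file might be laid out:

```r
# Hypothetical data with separate loc and type columns
dat <- data.frame(
  var1 = c(1.2, 2.3, 3.1, 4.8, 2.2, 3.9, 5.0, 1.7),
  loc  = rep(c("nice", "mice"), each = 4),
  type = rep(c("a", "b"), times = 4)
)

# subset is evaluated inside data, so loc can be used by name
bp <- boxplot(var1 ~ type, data = dat, subset = loc == "nice")

bp$names   # group labels that were plotted
bp$n       # number of observations in each group
```

Because type does not encode the location, every group still has observations after subsetting and no empty slots appear.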
1 day later
#
Hi Janh,

When you say that you have "multiple data sets of unequal sample sizes", are you speaking of the same kind of data? For example, are you speaking of data from a set of experiments where the variables measured are all the same and where, when you graph them, you expect the same x and y scales?

Or are you talking about essentially independent data sets that it makes sense to graph in a grid?


John Kane
Kingston ON Canada
#
On the subject of boxplots, I have multiple data sets of unequal sample 
sizes and was wondering what would be the most efficient way to read in 
the data and plot side-by-side boxplots, with options for controlling 
the orientation of the plots (i.e. vertical or horizontal) and the 
spacing? Your assistance is greatly appreciated, but please try to be 
explicit as I am no R expert.

Thanks Janh

Not sure without a reproducible example (please read posting guide) but 
maybe this will help:

a = sample(1:100, 30)
b = sample(1:100, 50)
c = sample(50:100,19)
d = sample(1:50, 15)
e = sample(1:50, 15)
f = sample(1:50, 25)

dev.new(width = 7, height = 3.5)  # portable alternative to x11()
op = par(mfrow = c(1,2))  # plot side by side

# horizontal
boxplot(list(a,b,c), horizontal = TRUE, main = 'Horizontal',
     xlim = c(0,9), names = c('a', 'b','c'))
boxplot(list(e, f), horizontal = TRUE, xlim = c(0,9), names = c('e', 'f'),
     at = c(7,8), add = TRUE, col ='red')

# vertical
boxplot(list(a,b,c), horizontal = FALSE, xlim = c(0,9), main = 'Vertical',
      names = c('a', 'b','c'))
boxplot(list(e, f), horizontal = FALSE, xlim = c(0,9), names = c('e', 'f'),
     at = c(7,8), add = TRUE, col ='red')
par(op)

Cheers,

Rob



___
  Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA
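
A variation on the code above, offered as a sketch rather than a definitive recipe: the unequal-length samples can be kept in one named list, with spacing controlled through at and boxwex in a single call. The seed and group names are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
groups <- list(a = sample(1:100, 30),
               b = sample(1:100, 50),
               e = sample(1:50, 15))

# boxplot() accepts a list directly; labels come from the list names.
# 'at' sets the position of each box, 'boxwex' scales the box width.
boxplot(groups, at = c(1, 2, 4), boxwex = 0.5, xlim = c(0, 5),
        horizontal = TRUE, col = c("white", "white", "red"))

# stack() converts the list to a long data frame (columns: values, ind),
# which suits the formula interface.
long <- stack(groups)
table(long$ind)
boxplot(values ~ ind, data = long)
```

Setting horizontal = FALSE in the first call gives the vertical version with the same spacing.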
#
Unless you have a really large number of wells, I'd just use the brute-force
   approach of reading in each data set with a simple read.table or
   read.csv statement like

   well1  <-  read.csv("well1.csv")

   and repeat it for each well.
   Here is a simple example that may give you an idea of how to do the
   boxplots. I have done them two ways, one using base graphics and the other
   using ggplot2. You will probably have to install the ggplot2 package: just
   issue the command install.packages("ggplot2").
   The base approach is initially a lot simpler, but in the longer term, if you
   expect to do a lot of graphing work in R, the grid packages like ggplot2 or
   lattice seem to offer a lot more control for less actual typing, especially
   if you need publication/report quality graphics.
   ##===============start code=====================
   set.seed(345)  # reproducible sample
   # create three sample data sets
   well_1 <- data.frame(arsenic = rnorm(12))
   well_2 <- data.frame(arsenic = rnorm(10))
   well_3 <- data.frame(arsenic = rnorm(15))

   wells <- rbind(well_1, well_2, well_3)  # create single data.frame

   # create an id value for each well
   well_id <- c(rep(1, nrow(well_1)), rep(2, nrow(well_2)),
                rep(3, nrow(well_3)))

   # add the well identifier
   wells <- cbind(wells, well_id)
   str(wells)  # check to see what we have

   boxplot(arsenic ~ well_id, data = wells)  # plot vertical boxplot
   boxplot(arsenic ~ well_id, data = wells,
           horizontal = TRUE, col = c("red", "green", "blue"))  # horizontal boxplot

   # vertical boxplot using ggplot2
   library(ggplot2)

   p <- ggplot(wells, aes(as.factor(well_id), arsenic)) + geom_boxplot()
   p
   # horizontal boxplot
   p1 <- p + coord_flip()
   p1

   # coloured horizontal boxplot without a legend
   p2 <- ggplot(wells, aes(as.factor(well_id), arsenic,
                           fill = as.factor(well_id))) +
     geom_boxplot() + coord_flip() +
     scale_fill_discrete(guide = "none")  # guide = FALSE is deprecated in newer ggplot2
   p2
   ##===============end code======================



   John Kane
   Kingston ON Canada
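
If the number of wells does grow, the read-and-repeat step can be looped instead. The following is a minimal sketch under stated assumptions: one CSV file per well, each with an arsenic column, all in one folder. The file names and the temporary folder are invented so the example is self-contained.

```r
# Write three small hypothetical CSV files into a temporary folder
csv_dir <- tempfile("wells_")
dir.create(csv_dir)
set.seed(345)
sizes <- c(well1 = 12, well2 = 10, well3 = 15)
for (w in names(sizes)) {
  write.csv(data.frame(arsenic = rnorm(sizes[[w]])),
            file.path(csv_dir, paste0(w, ".csv")), row.names = FALSE)
}

# Read every CSV in the folder and tag each row with its file name
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
wells <- do.call(rbind, lapply(files, function(f) {
  d <- read.csv(f)
  d$well_id <- sub("\\.csv$", "", basename(f))
  d
}))

table(wells$well_id)   # observations per well
boxplot(arsenic ~ well_id, data = wells)
```

The resulting data frame has the same long layout as wells in the code above, so the base and ggplot2 calls apply unchanged.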

   -----Original Message-----
   From: annijanh at gmail.com
   Sent: Sat, 23 Mar 2013 10:22:02 -0400
   To: jrkrideau at inbox.com
   Subject: Re: [R] boxplot

   Hello John,

   I apologize for the delayed response.  Yes I am referring to the same type
   of  data in the data sets.  For example, the arsenic concentrations in
   individual groundwater monitoring wells at a groundwater contaminated site,
   where one well may have 12 concentration measurements, another well has 10,
   etc.

   Thanks
   Janh
#
> I am going to use clara for gene expression analysis,
    > so tried to play around with the examples from R document:


    > http://127.0.0.1:10699/library/cluster/html/clara.html

    > Everything looked fine until I tried to plot the results:

    > it says:  waiting to confirm page change...

    > I waited for more than 10 min and NO plot came out...

:-) :-)  

I really had a great chuckle!!

*Who* do you think is waiting when R says
  " waiting to confirm page change..."
??

Of course, R is waiting for *you* to confirm the page change...


    > Should I wait longer? Anything wrong like this?

:-) ;-)  

... I can hardly stop chuckling away... sorry ...

    > Thanks for any input:)

You're welcome.
Martin
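
For readers who hit the same message: the prompt is controlled by the graphics device's "ask" setting, which can be switched off with devAskNewPage() from grDevices. The sketch below is illustrative only; it opens a null PDF device purely so that it runs without a screen, which is an assumption of this example and not something from the original exchange.

```r
# A null PDF device keeps the sketch runnable anywhere; in an
# interactive session you would use your usual plot window instead.
pdf(NULL)

old <- devAskNewPage(FALSE)   # stop asking; returns the previous setting
now_asking <- devAskNewPage() # query the current setting

plot(1:10)   # first page
plot(10:1)   # second page is drawn without waiting for confirmation

invisible(dev.off())
```

When the prompt is left on, confirming the page change (typically by clicking on the plot window or pressing Enter, depending on the device) lets R draw the next plot.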