boxplot help

6 messages · John Kane, Richard M. Heiberger, andyspeak +1 more

#
Hi, I'm a newbie with very wobbly coding abilities.
Tearing my hair out over getting the boxplot I want...

I have a dataset called 'eagle' which consists of year (2011 or 2012), month
(Jan - Dec), roof (TT6, TT13 or BARE) and temp (the continuous variable that
I want to plot).
So I want boxplots of the three roof treatments in every month, organised in
chronological order along the x axis, 2011 - 2012.

my code at the moment is:
which produces the graph inserted

Can anyone point me in the right direction?
Thanks  http://r.789695.n4.nabble.com/file/n4640360/boxplot_trial.jpg 



--
View this message in context: http://r.789695.n4.nabble.com/boxplot-help-tp4640360.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi Andy,

Nice plot but yes, probably not exactly what you want.  

Thanks for providing the code.  The next thing you need to do is to send us some data to go with the code.  There is a very handy function called dput(), which converts a dataset into a format that you can just copy from your R terminal and paste into an email, so that readers can work with your data or a sample of it.

If the dataset is really large then something like dput(head (mydata, 50))  will supply the first fifty rows of data for us to work with.
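
As a small sketch of what dput() does (the data frame `mydata` below is made up for illustration, not the real dataset): it prints R code that rebuilds the object, so a reader can paste that code back into R and get a working copy.

```r
# Hypothetical toy data standing in for the real dataset
mydata <- data.frame(
  Year  = c(2011L, 2011L, 2012L),
  Month = factor(c("Apr", "May", "Jan")),
  Temp  = c(10.68, 11.8, 6.3)
)

# Print code that recreates the first rows; this is what gets pasted into an email
dput(head(mydata, 2))

# Round trip: capture the printed code and evaluate it to rebuild the object
txt     <- paste(capture.output(dput(mydata)), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# Compare the rebuilt copy with the original
all.equal(mydata, rebuilt)
```

The reader on the other end only needs the printed structure(...) text, assigned to a name of their choosing.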

One of your problems seems to be that your dates are character values and need to be converted into dates for you to get the chronological date order.  See ?strptime  or perhaps install the lubridate package for help there.
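
A minimal sketch of that conversion in base R, assuming columns named Year and Month as described in the original post (the rows and the helper column names `month_num` and `dates` are made up for illustration). It looks the month up in the built-in `month.abb` vector rather than parsing with "%b", since "%b" depends on the locale.

```r
# Toy rows using the column names from the original post (values are made up)
eagle <- data.frame(
  Year  = c(2012L, 2011L, 2012L, 2011L),
  Month = factor(c("Jan", "Jul", "Apr", "Apr")),
  Temp  = c(6.3, 18.6, 11.2, 12.5)
)

# Month abbreviation -> month number, using the built-in month.abb vector
eagle$month_num <- match(as.character(eagle$Month), month.abb)

# Build a real Date for the first day of each month
eagle$dates <- as.Date(sprintf("%d-%02d-01", eagle$Year, eagle$month_num))

# Dates sort chronologically, unlike the alphabetical factor levels of Month
eagle[order(eagle$dates), ]
```

Once the column is a Date, ordering and axis labels follow calendar order rather than alphabetical order.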

Otherwise welcome to R 

Note to other readers : I think I actually got the dput syntax this time!!

John Kane
Kingston ON Canada
#
hi thanks

the dput output is...

structure(list(Year = c(2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 
2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L), Month =
structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L), .Label = c("Apr", "Aug", "Dec", "Feb", "Jan", "Jul", "Jun", 
"Mar", "May", "Nov", "Oct", "Sep"), class = "factor"), Roof =
structure(c(3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L), .Label = c("BARE", "TT13", "TT6"), class = "factor"), Temp = c(10.68, 
11.8, 12.16, 12.36, 12.5, 12.58, 12.46, 12.72, 12.31, 12.35, 
12.44, 12.59, 12.86, 13.05, 13.83, 13.92, 14.24, 14.47, 13.84, 
13.44, 12.83, 12.73, 12.47, 12.18, 12.59, 12.36, 12.09, 11.73, 
13.02, 13.16, 13.21, 13.53, 14.15, 14.67, 15.53, 15.48, 17.43, 
18.56, 22.6, 25.06, 20.36, 18.78, 15.29, 11.98, 10.23, 9.42, 
8.38, 7.37, 6.47, 6.3)), .Names = c("Year", "Month", "Roof", 
"Temp"), row.names = c(NA, 50L), class = "data.frame")

But as this is only the first 50 rows, it doesn't show that some of the data
is Jan - Jul of 2012, which I want displayed separately from the Apr - Jul 2011
data.

I think the next reply I received points me to a good solution.



#
Hello,

I'm not sure whether this is what you want, but here it goes.

dd <- structure( ...etc... )  # your dataset

# make group identifiers
ym <- paste(dd$Year, as.character(dd$Month), sep="-")

op <- par(las=2) # make labels perpendicular to axis
bp <- boxplot(Temp ~ ym, data=dd)
axis(1, at = seq_along(unique(ym)), labels = bp$names)
par(op)

If it's what you want, add color and legend.
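
One possible way to add the color and legend, sketched in base graphics under some assumptions: the data below are randomly generated, the colour choices are arbitrary, and the group labels use numeric months ("2011-11") instead of the month abbreviations above, so that the default factor ordering is already chronological.

```r
# Made-up data: three roofs, three months spanning two years
set.seed(1)
dd <- expand.grid(rep  = 1:10,
                  Roof = c("BARE", "TT13", "TT6"),
                  ym   = c("2011-11", "2011-12", "2012-01"))
dd$Temp <- rnorm(nrow(dd), mean = 12, sd = 3)

# One (arbitrary) colour per roof treatment
cols <- c(BARE = "grey70", TT13 = "forestgreen", TT6 = "steelblue")

op <- par(las = 2)  # make labels perpendicular to axis
# Roof varies fastest, so the boxes come in month groups, one box per roof
bp <- boxplot(Temp ~ Roof + ym, data = dd,
              col = unname(cols[levels(dd$Roof)]),  # recycled across months
              xlab = "", ylab = "Temp")
legend("topright", legend = names(cols), fill = cols, title = "Roof")
par(op)
```

The three colours are recycled across the month groups, so each roof keeps the same colour in every month.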

Hope this helps,

Rui Barradas

On 15-08-2012 17:52, andyspeak wrote:
#
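Okay, your first main problem is that Year and Month are stored as separate variables (Month is a factor whose levels are in alphabetical order), so R does not treat them as dates.

I think you need to convert them to a single date. Try this (it assumes an English locale, so that %b matches abbreviations like "Apr"):
eagle$dates  <-  as.Date(paste(eagle$Year, eagle$Month, "01", sep="/"), "%Y/%b/%d")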

I think this helps cure a lot of the problem.

However, a different approach to the same result, I think, is to use the ggplot2 package, which I imagine you would have to install ( install.packages("ggplot2") ).

I noticed that your dput() example only had data for part of one year, so I manufactured some data to use as an example.  However, you still need to create the dates column.  




# Data Set

mydat <- structure(list(roof = structure(c(1L, 3L, 2L, 1L, 1L, 2L, 3L, 
3L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 1L, 1L, 1L, 1L, 1L, 3L, 
1L, 3L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 1L, 2L, 3L, 2L, 1L, 2L, 
3L, 3L, 1L, 1L, 3L, 3L, 1L, 3L, 2L, 3L, 2L), .Label = c("bare", 
"tt13", "tt6"), class = "factor"), dates = structure(c(8L, 8L, 
3L, 3L, 8L, 9L, 5L, 7L, 11L, 1L, 11L, 12L, 7L, 12L, 3L, 4L, 2L, 
2L, 9L, 6L, 6L, 6L, 11L, 12L, 4L, 1L, 3L, 12L, 5L, 2L, 9L, 10L, 
11L, 6L, 1L, 11L, 5L, 9L, 4L, 9L, 9L, 2L, 12L, 1L, 11L, 9L, 2L, 
9L, 4L, 6L), .Label = c("2011-06-01", "2011-07-01", "2011-08-01", 
"2011-09-01", "2011-10-01", "2011-11-01", "2011-12-01", "2012-01-01", 
"2012-02-01", "2012-03-01", "2012-04-01", "2012-05-01"), class = "factor"), 
    temp = c(17.33, 16.92, 17.06, 17.79, 17.63, 5.16, 13.85, 
    14.3, 14.44, 11.32, 15.15, 15.04, 17.72, 10.33, 10.46, 9.25, 
    10.7, 13.72, 19.2, 9.03, 8.69, 13.41, 16.08, 19.91, 10.87, 
    14.06, 16.57, 8.66, 17.74, 15.71, 17.91, 7.26, 15.89, 22.14, 
    15.93, 20.01, 18.45, 12.34, 15.67, 13.7, 10.68, 7.2, 16.83, 
    13.99, 14.69, 16.13, 20.35, 16.89, 19.34, 15.05), year = c(2012L, 
    2012L, 2011L, 2011L, 2012L, 2012L, 2011L, 2011L, 2012L, 2011L, 
    2012L, 2012L, 2011L, 2012L, 2011L, 2011L, 2011L, 2011L, 2012L, 
    2011L, 2011L, 2011L, 2012L, 2012L, 2011L, 2011L, 2011L, 2012L, 
    2011L, 2011L, 2012L, 2012L, 2012L, 2011L, 2011L, 2012L, 2011L, 
    2012L, 2011L, 2012L, 2012L, 2011L, 2012L, 2011L, 2012L, 2012L, 
    2011L, 2012L, 2011L, 2011L)), .Names = c("roof", "dates", 
"temp", "year"), class = "data.frame", row.names = c(NA, -50L
))


# Then give this a try. It looks a bit weird as I just generated a bunch of dates.
#=================================================
library(ggplot2)
# geom_boxplot() computes the box statistics itself from the raw temp values
p  <-  ggplot( mydat , aes(as.factor(dates) , temp, fill = factor(roof))) + geom_boxplot() + 
           ylab("Temperature") + xlab("Months")  + scale_fill_hue(name="Roof\nType")  +
            facet_grid( year ~ .)
p

# ===================================================

John Kane
Kingston ON Canada