Skip to content

Generating a Special Histogram

9 messages · Dan Abner, Richard M. Heiberger, Sarah Goslee +4 more

#
Hi all,

Is anyone aware of a package, function, or general R trick that would make
generating histograms like the one in the attachment easy in R (i.e.,
without manually drawing each individual horizontal line and specifying the
coordinates for a textbox for each number)?

I need to make ~12 of these for different samples of n=25, so the manual
approach would be very painful...

Thanks,

Dan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hist1.pdf
Type: application/pdf
Size: 98591 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20170105/d76d9440/attachment.pdf>
#
I recommend the stem function.
The decimal point is 1 digit(s) to the right of the |

  12 | 57
  14 | 4902479
  16 | 1233444349
  18 | 002507
The decimal point is 1 digit(s) to the right of the |

  13 | 57
  14 | 49
  15 | 02479
  16 | 1233444
  17 | 349
  18 | 0025
  19 | 07
On Thu, Jan 5, 2017 at 11:01 AM, Dan Abner <dan.abner99 at gmail.com> wrote:
#
Hi Dan,

I'd probably start by looking at the various examples for stem and
leaf plot in R.

stem() or aplpack::stem.leaf() might help you get started, or if you
don't need the fancy boxes, be sufficient.

Sarah
On Thu, Jan 5, 2017 at 11:01 AM, Dan Abner <dan.abner99 at gmail.com> wrote:

  
    
#
On 05/01/2017 11:01 AM, Dan Abner wrote:
You can write a function to do this pretty easily.  hist(..., 
plot=FALSE) does all the calculations for you; you just need to write 
the loop to draw the boxes.  For example,

myhist <- function(x) {
   histvals <- hist(x, plot = FALSE)
   with(histvals, {
     plot(range(breaks), range(c(0, counts)), type = "n")

     for (i in seq_along(histvals$counts)) {
       keep <- (breaks[i] < x & x <= breaks[i+1]) |
               (i == 1 & x == breaks[1])
       vals <- x[keep]
       for (j in seq_along(vals)) {
         rect(breaks[i], j-1, breaks[i+1], j)
         text(mids[i], j-0.5, vals[j])
       }
     }
   })
}

x <- round(rnorm(20, mean=166, sd=4))
myhist(x)

Duncan Murdoch
#
Here's a different approach using barplot() to draw the boxes. The first line in the function sorts the values so that they are printed from lowest to highest on the histogram. If you want them in the original sequence, comment this line out. It also assumes you want intervals of 10:

set.seed(42)
wgt <- round(rnorm(45, 170, 15))

boxhist <- function(x) {
     x <- sort(x)
     obs <- 1:length(x)
     low <- floor(min(x/10))*10
     high <- ceiling(max(x/10))*10
     grp <- cut(x, breaks=c(seq(low, high, by=10)), include.lowest=TRUE)
     mat <- table(obs, grp)
     cols <- ncol(mat)
     barplot(mat, space=0, col="lightblue", xaxt="n")
     axis(1, 0:cols, seq(low, high, by=10))
     mat <- apply(mat, 2, cumsum) * mat
     xval <- apply(mat, 1, function(x) which(x > 0))
     yval <- apply(mat, 1, max)
     text(xval-.5, yval-.5, x)
}

boxhist(wgt)

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Murdoch
Sent: Thursday, January 5, 2017 10:47 AM
To: Dan Abner; r-help at r-project.org
Subject: Re: [R] Generating a Special Histogram
On 05/01/2017 11:01 AM, Dan Abner wrote:
You can write a function to do this pretty easily.  hist(..., 
plot=FALSE) does all the calculations for you; you just need to write 
the loop to draw the boxes.  For example,

myhist <- function(x) {
   histvals <- hist(x, plot = FALSE)
   with(histvals, {
     plot(range(breaks), range(c(0, counts)), type = "n")

     for (i in seq_along(histvals$counts)) {
       keep <- (breaks[i] < x & x <= breaks[i+1]) |
               (i == 1 & x == breaks[1])
       vals <- x[keep]
       for (j in seq_along(vals)) {
         rect(breaks[i], j-1, breaks[i+1], j)
         text(mids[i], j-0.5, vals[j])
       }
     }
   })
}

x <- round(rnorm(20, mean=166, sd=4))
myhist(x)

Duncan Murdoch

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BoxHist.png
Type: image/png
Size: 8407 bytes
Desc: BoxHist.png
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20170105/c6213613/attachment.png>
#
Hi Dan,
This may help if your data is in the format below:

waffle.mat<-matrix(c(rep(NA,14),137,135,rep(NA,6),144,149,
 rep(NA,3),150,152,159,157,154,
 NA,163,164,164,161,162,165,164,rep(NA,5),179,173,173,
 rep(NA,4),182,180,185,180,
 rep(NA,6),197,190,rep(NA,8)),ncol=9)
waffle.col<-matrix("lightblue",ncol=9,nrow=8)
waffle.col[is.na(waffle.mat)]<-NA
waffle.border<-matrix("blue",ncol=9,nrow=8)
waffle.border[is.na(waffle.mat)]<-NA
library(plotrix)
# use a waffle plot
color2D.matplot(waffle.mat,cellcolors=waffle.col,border=waffle.border,
 show.values=TRUE,xat=10,yat=10,xlab="",ylab="")
axis(1,at=1:8,labels=seq(130,200,by=10))
axis(2,at=1:8)
axis.break(1)

Jim
On Fri, Jan 6, 2017 at 3:01 AM, Dan Abner <dan.abner99 at gmail.com> wrote:
#
On 06/01/17 10:31, Jim Lemon wrote:
Being picky-picky-picky I would like to point out that Duncan's and 
David's functions don't *quite* reproduce the picture in the pdf file
that the OP attached, when called with the data from that picture:

egdat <- c(137,135,144,149,150,152,159,157,154,163,164,164,
            161,162,165,164,179,173,173,182,180,185,180,197,190)
myhist(egdat)
boxhist(egdat)

It's a matter of including the left or right endpoints in the bins.

Duncan's function needs to swap "<" and "<=" in the definition of "keep"
(and make a corresponding adjustment in the "|" clause, so as to look at
the last rather than the first break value).

David's function needs to set "right=FALSE" in the call to cut().

Jim's waffle plot gets it right, at the expense of needing to have the
data organised in an inconvenient form.

All that being said, all of you blokes came up with solutions that are 
far beyond my capability of producing.  Hat's off to you.

cheers,

Rolf
#
A worthy challenge, Rolf:

egdat <- c(137,135,144,149,150,152,159,157,154,163,164,164,
           161,162,165,164,179,173,173,182,180,185,180,197,190)
egcut<-cut(egdat,breaks=seq(120,210,by=10),right=FALSE)
eglist<-vector("list",9)
for(egindex in 1:9) eglist[[egindex]]<-rev(egdat[as.numeric(egcut)==egindex])
egdf<-as.data.frame(lapply(eglist,function(x) x[1:8]))
names(egdf)<-paste("V",1:9,sep="")
waffle.mat<-as.matrix(sapply(egdf,rev))

Jim
On Fri, Jan 6, 2017 at 9:21 AM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
#
On 05/01/2017 5:21 PM, Rolf Turner wrote:
Well, that's just because the attached picture was wrong :-).

Duncan