Hi all, Is anyone aware of a package, function, or general R trick that would make generating histograms like the one in the attachment easy in R (i.e., without manually drawing each individual horizontal line and specifying the coordinates for a textbox for each number)? I need to make ~12 of these for different samples of n=25, so the manual approach would be very painful... Thanks, Dan -------------- next part -------------- A non-text attachment was scrubbed... Name: hist1.pdf Type: application/pdf Size: 98591 bytes Desc: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20170105/d76d9440/attachment.pdf>
Generating a Special Histogram
9 messages · Dan Abner, Richard M. Heiberger, Sarah Goslee +4 more
I recommend the stem function.
stem(wt)
The decimal point is 1 digit(s) to the right of the | 12 | 57 14 | 4902479 16 | 1233444349 18 | 002507
stem(wt, 2)
The decimal point is 1 digit(s) to the right of the | 13 | 57 14 | 49 15 | 02479 16 | 1233444 17 | 349 18 | 0025 19 | 07
On Thu, Jan 5, 2017 at 11:01 AM, Dan Abner <dan.abner99 at gmail.com> wrote:
Hi all, Is anyone aware of a package, function, or general R trick that would make generating histograms like the one in the attachment easy in R (i.e., without manually drawing each individual horizontal line and specifying the coordinates for a textbox for each number)? I need to make ~12 of these for different samples of n=25, so the manual approach would be very painful... Thanks, Dan
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Dan, I'd probably start by looking at the various examples for stem and leaf plot in R. stem() or aplpack::stem.leaf() might help you get started, or if you don't need the fancy boxes, be sufficient. Sarah
On Thu, Jan 5, 2017 at 11:01 AM, Dan Abner <dan.abner99 at gmail.com> wrote:
Hi all, Is anyone aware of a package, function, or general R trick that would make generating histograms like the one in the attachment easy in R (i.e., without manually drawing each individual horizontal line and specifying the coordinates for a textbox for each number)? I need to make ~12 of these for different samples of n=25, so the manual approach would be very painful... Thanks, Dan
Sarah Goslee http://www.functionaldiversity.org
On 05/01/2017 11:01 AM, Dan Abner wrote:
Hi all, Is anyone aware of a package, function, or general R trick that would make generating histograms like the one in the attachment easy in R (i.e., without manually drawing each individual horizontal line and specifying the coordinates for a textbox for each number)? I need to make ~12 of these for different samples of n=25, so the manual approach would be very painful...
You can write a function to do this pretty easily. hist(...,
plot=FALSE) does all the calculations for you; you just need to write
the loop to draw the boxes. For example,
myhist <- function(x) {
histvals <- hist(x, plot = FALSE)
with(histvals, {
plot(range(breaks), range(c(0, counts)), type = "n")
for (i in seq_along(histvals$counts)) {
keep <- (breaks[i] < x & x <= breaks[i+1]) |
(i == 1 & x == breaks[1])
vals <- x[keep]
for (j in seq_along(vals)) {
rect(breaks[i], j-1, breaks[i+1], j)
text(mids[i], j-0.5, vals[j])
}
}
})
}
x <- round(rnorm(20, mean=166, sd=4))
myhist(x)
Duncan Murdoch
Here's a different approach using barplot() to draw the boxes. The first line in the function sorts the values so that they are printed from lowest to highest on the histogram. If you want them in the original sequence, comment this line out. It also assumes you want intervals of 10:
set.seed(42)
wgt <- round(rnorm(45, 170, 15))
boxhist <- function(x) {
x <- sort(x)
obs <- 1:length(x)
low <- floor(min(x/10))*10
high <- ceiling(max(x/10))*10
grp <- cut(x, breaks=c(seq(low, high, by=10)), include.lowest=TRUE)
mat <- table(obs, grp)
cols <- ncol(mat)
barplot(mat, space=0, col="lightblue", xaxt="n")
axis(1, 0:cols, seq(low, high, by=10))
mat <- apply(mat, 2, cumsum) * mat
xval <- apply(mat, 1, function(x) which(x > 0))
yval <- apply(mat, 1, max)
text(xval-.5, yval-.5, x)
}
boxhist(wgt)
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Murdoch
Sent: Thursday, January 5, 2017 10:47 AM
To: Dan Abner; r-help at r-project.org
Subject: Re: [R] Generating a Special Histogram
On 05/01/2017 11:01 AM, Dan Abner wrote:
Hi all, Is anyone aware of a package, function, or general R trick that would make generating histograms like the one in the attachment easy in R (i.e., without manually drawing each individual horizontal line and specifying the coordinates for a textbox for each number)? I need to make ~12 of these for different samples of n=25, so the manual approach would be very painful...
You can write a function to do this pretty easily. hist(...,
plot=FALSE) does all the calculations for you; you just need to write
the loop to draw the boxes. For example,
myhist <- function(x) {
histvals <- hist(x, plot = FALSE)
with(histvals, {
plot(range(breaks), range(c(0, counts)), type = "n")
for (i in seq_along(histvals$counts)) {
keep <- (breaks[i] < x & x <= breaks[i+1]) |
(i == 1 & x == breaks[1])
vals <- x[keep]
for (j in seq_along(vals)) {
rect(breaks[i], j-1, breaks[i+1], j)
text(mids[i], j-0.5, vals[j])
}
}
})
}
x <- round(rnorm(20, mean=166, sd=4))
myhist(x)
Duncan Murdoch
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BoxHist.png
Type: image/png
Size: 8407 bytes
Desc: BoxHist.png
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20170105/c6213613/attachment.png>
Hi Dan,
This may help if your data is in the format below:
waffle.mat<-matrix(c(rep(NA,14),137,135,rep(NA,6),144,149,
rep(NA,3),150,152,159,157,154,
NA,163,164,164,161,162,165,164,rep(NA,5),179,173,173,
rep(NA,4),182,180,185,180,
rep(NA,6),197,190,rep(NA,8)),ncol=9)
waffle.col<-matrix("lightblue",ncol=9,nrow=8)
waffle.col[is.na(waffle.mat)]<-NA
waffle.border<-matrix("blue",ncol=9,nrow=8)
waffle.border[is.na(waffle.mat)]<-NA
library(plotrix)
# use a waffle plot
color2D.matplot(waffle.mat,cellcolors=waffle.col,border=waffle.border,
show.values=TRUE,xat=10,yat=10,xlab="",ylab="")
axis(1,at=1:8,labels=seq(130,200,by=10))
axis(2,at=1:8)
axis.break(1)
Jim
On Fri, Jan 6, 2017 at 3:01 AM, Dan Abner <dan.abner99 at gmail.com> wrote:
Hi all, Is anyone aware of a package, function, or general R trick that would make generating histograms like the one in the attachment easy in R (i.e., without manually drawing each individual horizontal line and specifying the coordinates for a textbox for each number)? I need to make ~12 of these for different samples of n=25, so the manual approach would be very painful... Thanks, Dan
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On 06/01/17 10:31, Jim Lemon wrote:
Hi Dan,
This may help if your data is in the format below:
waffle.mat<-matrix(c(rep(NA,14),137,135,rep(NA,6),144,149,
rep(NA,3),150,152,159,157,154,
NA,163,164,164,161,162,165,164,rep(NA,5),179,173,173,
rep(NA,4),182,180,185,180,
rep(NA,6),197,190,rep(NA,8)),ncol=9)
waffle.col<-matrix("lightblue",ncol=9,nrow=8)
waffle.col[is.na(waffle.mat)]<-NA
waffle.border<-matrix("blue",ncol=9,nrow=8)
waffle.border[is.na(waffle.mat)]<-NA
library(plotrix)
# use a waffle plot
color2D.matplot(waffle.mat,cellcolors=waffle.col,border=waffle.border,
show.values=TRUE,xat=10,yat=10,xlab="",ylab="")
axis(1,at=1:8,labels=seq(130,200,by=10))
axis(2,at=1:8)
axis.break(1)
Being picky-picky-picky I would like to point out that Duncan's and
David's functions don't *quite* reproduce the picture in the pdf file
that the OP attached, when called with the data from that picture:
egdat <- c(137,135,144,149,150,152,159,157,154,163,164,164,
161,162,165,164,179,173,173,182,180,185,180,197,190)
myhist(egdat)
boxhist(egdat)
It's a matter of including the left or right endpoints in the bins.
Duncan's function needs to swap "<" and "<=" in the definition of "keep"
(and make a corresponding adjustment in the "|" clause, so as to look at
the last rather than the first break value).
David's function needs to set "right=FALSE" in the call to cut().
Jim's waffle plot gets it right, at the expense of needing to have the
data organised in an inconvenient form.
All that being said, all of you blokes came up with solutions that are
far beyond my capability of producing. Hat's off to you.
cheers,
Rolf
Technical Editor ANZJS Department of Statistics University of Auckland Phone: +64-9-373-7599 ext. 88276
A worthy challenge, Rolf:
egdat <- c(137,135,144,149,150,152,159,157,154,163,164,164,
161,162,165,164,179,173,173,182,180,185,180,197,190)
egcut<-cut(egdat,breaks=seq(120,210,by=10),right=FALSE)
eglist<-vector("list",9)
for(egindex in 1:9) eglist[[egindex]]<-rev(egdat[as.numeric(egcut)==egindex])
egdf<-as.data.frame(lapply(eglist,function(x) x[1:8]))
names(egdf)<-paste("V",1:9,sep="")
waffle.mat<-as.matrix(sapply(egdf,rev))
Jim
On Fri, Jan 6, 2017 at 9:21 AM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
On 06/01/17 10:31, Jim Lemon wrote:
Hi Dan,
This may help if your data is in the format below:
waffle.mat<-matrix(c(rep(NA,14),137,135,rep(NA,6),144,149,
rep(NA,3),150,152,159,157,154,
NA,163,164,164,161,162,165,164,rep(NA,5),179,173,173,
rep(NA,4),182,180,185,180,
rep(NA,6),197,190,rep(NA,8)),ncol=9)
waffle.col<-matrix("lightblue",ncol=9,nrow=8)
waffle.col[is.na(waffle.mat)]<-NA
waffle.border<-matrix("blue",ncol=9,nrow=8)
waffle.border[is.na(waffle.mat)]<-NA
library(plotrix)
# use a waffle plot
color2D.matplot(waffle.mat,cellcolors=waffle.col,border=waffle.border,
show.values=TRUE,xat=10,yat=10,xlab="",ylab="")
axis(1,at=1:8,labels=seq(130,200,by=10))
axis(2,at=1:8)
axis.break(1)
Being picky-picky-picky I would like to point out that Duncan's and David's
functions don't *quite* reproduce the picture in the pdf file
that the OP attached, when called with the data from that picture:
egdat <- c(137,135,144,149,150,152,159,157,154,163,164,164,
161,162,165,164,179,173,173,182,180,185,180,197,190)
myhist(egdat)
boxhist(egdat)
It's a matter of including the left or right endpoints in the bins.
Duncan's function needs to swap "<" and "<=" in the definition of "keep"
(and make a corresponding adjustment in the "|" clause, so as to look at
the last rather than the first break value).
David's function needs to set "right=FALSE" in the call to cut().
Jim's waffle plot gets it right, at the expense of needing to have the
data organised in an inconvenient form.
All that being said, all of you blokes came up with solutions that are far
beyond my capability of producing. Hat's off to you.
cheers,
Rolf
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
On 05/01/2017 5:21 PM, Rolf Turner wrote:
On 06/01/17 10:31, Jim Lemon wrote:
Hi Dan,
This may help if your data is in the format below:
waffle.mat<-matrix(c(rep(NA,14),137,135,rep(NA,6),144,149,
rep(NA,3),150,152,159,157,154,
NA,163,164,164,161,162,165,164,rep(NA,5),179,173,173,
rep(NA,4),182,180,185,180,
rep(NA,6),197,190,rep(NA,8)),ncol=9)
waffle.col<-matrix("lightblue",ncol=9,nrow=8)
waffle.col[is.na(waffle.mat)]<-NA
waffle.border<-matrix("blue",ncol=9,nrow=8)
waffle.border[is.na(waffle.mat)]<-NA
library(plotrix)
# use a waffle plot
color2D.matplot(waffle.mat,cellcolors=waffle.col,border=waffle.border,
show.values=TRUE,xat=10,yat=10,xlab="",ylab="")
axis(1,at=1:8,labels=seq(130,200,by=10))
axis(2,at=1:8)
axis.break(1)
Being picky-picky-picky I would like to point out that Duncan's and
David's functions don't *quite* reproduce the picture in the pdf file
that the OP attached, when called with the data from that picture:
egdat <- c(137,135,144,149,150,152,159,157,154,163,164,164,
161,162,165,164,179,173,173,182,180,185,180,197,190)
myhist(egdat)
boxhist(egdat)
It's a matter of including the left or right endpoints in the bins.
Duncan's function needs to swap "<" and "<=" in the definition of "keep"
(and make a corresponding adjustment in the "|" clause, so as to look at
the last rather than the first break value).
Well, that's just because the attached picture was wrong :-). Duncan
David's function needs to set "right=FALSE" in the call to cut(). Jim's waffle plot gets it right, at the expense of needing to have the data organised in an inconvenient form. All that being said, all of you blokes came up with solutions that are far beyond my capability of producing. Hat's off to you. cheers, Rolf