Labeling a range of bars in barplot?
Marc Schwartz (via MN) wrote:
On Tue, 2005-12-13 at 10:53 +0000, Dan Bolser wrote:
Hi,

I am plotting a distribution of (ordered) values as a barplot. I would like to label groups of bars together to highlight aspects of the distribution. The label for the group should be the range of values in those bars. As this is hard to describe, here is an example:

x <- rlnorm(50)*2
barplot(sort(x,decreasing=T))
y <- quantile(x, seq(0, 1, 0.2))
y
plot(diff(y))

That last plot is to highlight that I want to label lots of the small columns together, and have a few more labels for the bigger columns (more densely labeled). I guess I will have to turn out my own labels using low level plotting functions, but I am stumped as to how to perform the calculation for label placement.

I imagine drawing several line segments, one for each group of bars to be labeled together, and putting the range under each line segment as the label. Each line segment will sit under the group of bars that it covers.

Thanks for any help with the above!

Cheers, Dan.
Dan,

Here is a hint. barplot() returns the bar midpoints:

mp <- barplot(sort(x, decreasing = TRUE))
head(mp)
[,1]
[1,] 0.7
[2,] 1.9
[3,] 3.1
[4,] 4.3
[5,] 5.5
[6,] 6.7
There will be one value in 'mp' for each bar in your series.
You can then use those values along the x axis to draw your line
segments under the bars as you require, based upon the cut points you
want to highlight.
To get the center of a given group of bars, you can use:
mean(mp[start:end])
where 'start' and 'end' are the extreme bars in each of your groups.
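As a minimal sketch of that hint, the following draws one segment and one range label under a single group of bars. The group boundaries (bars 1 to 5), the seed, and the vertical offset are illustrative assumptions, not values from this thread.

```r
set.seed(1)                            # assumed seed, only for reproducibility
x <- sort(rlnorm(50) * 2, decreasing = TRUE)
mp <- barplot(x)                       # bar midpoints, one per bar

start <- 1                             # hypothetical first bar of the group
end   <- 5                             # hypothetical last bar of the group
centre <- mean(mp[start:end])          # x position of the group's centre

op <- par(xpd = TRUE)                  # allow drawing below the plot region
y.off <- -0.03 * max(x)                # assumed offset: 3% of the tallest bar
segments(mp[start], y.off, mp[end], y.off, lwd = 2, col = 2)
text(centre, y.off, pos = 1, cex = 0.6,
     labels = sprintf("%.1f - %.1f", x[end], x[start]))
par(op)                                # restore the previous settings
```

Because the bars are sorted in decreasing order, x[end] is the smallest value in the group and x[start] the largest, so the label reads low to high.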
Two other things that might be helpful. See ?cut and ?hist, noting the
output in the latter when 'plot = FALSE'.
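For illustration, here is a short sketch of what those two functions return. The data, the seed and the choice of breaks = 4 are assumptions made for the example.

```r
set.seed(1)                              # assumed seed, only for reproducibility
x <- rlnorm(50) * 2

h <- hist(x, breaks = 4, plot = FALSE)   # no plot, just the binning
h$breaks                                 # bin boundaries
h$counts                                 # number of values in each bin

# cut() assigns each value to one of the same bins;
# include.lowest = TRUE matches the hist() default
grp <- cut(x, breaks = h$breaks, include.lowest = TRUE)
table(grp)                               # should give the same counts as h$counts
```

The counts tell you how many bars fall in each value range, which is what is needed to decide where each group of bars starts and ends.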
HTH,
Thanks all for help on this question, including those who emailed me off
list.
I went with the suggestion of Marc above, because I could follow through
how to implement the code (other more complete solutions were hard for
me to 'reverse engineer').
Here is my solution in full, which I feel gives rather nice output :)
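## Approximate my data for you to try
x <- sort((runif(70) * 100)^3, decreasing = TRUE)

## Plot the barplot, removing the default label names
mp <- barplot(x, names.arg = rep("", length(x)))

## Break the data range, and count bars per break.
## 'breaks' picks the (approximate) number of labels.
## NB: using quantiles is incorrect here
my.hist <- hist(x, plot = FALSE, breaks = 4)

## Check for sanity
## points(mp[length(mp)], x[length(mp)], col = 2)

## Counts become new 'breaks'
my.new.breaks <- my.hist$counts

## Some formatting stuff
my.names <- sprintf("%.1d", my.hist$breaks)

## Prepare to add labels (xpd = TRUE allows drawing outside the plot region)
op <- par(xpd = TRUE)

i <- length(mp)  # Note we label from right to left
q <- 1           # Index of the lower break of the current group

for (j in my.new.breaks) {
    if (j == 0) {    # Empty bin: no bars to label, just advance the break index
        q <- q + 1
        next
    }
    st <- i          # Rightmost bar of the group
    en <- i - j + 1  # Leftmost bar of the group

    ## Line segment under the group
    segments(mp[st], -50000,
             mp[en], -50000, lwd = 2, col = 2)

    ## Range label, centred under the segment
    text(mean(mp[st:en]), -100000, pos = 1,
         paste(paste(my.names[q], "-", sep = " "),
               my.names[q + 1], sep = "\n"), cex = 0.6)

    i <- i - j       # Move on to the next group to the left
    q <- q + 1
}

par(op)  # Restore the previous graphics settings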
You should see that the density of labels corresponds to the range of
data (hopefully not too dense), giving more labels to regions of the
plot with bigger ranges.
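The fixed offsets (-50000 and -100000) only suit data on this particular scale. As a rough sketch, the same idea can be wrapped in a function whose offsets are proportional to the tallest bar. The function name label.bar.groups, its arguments and the 2% / 4% offsets are hypothetical choices for illustration, not part of the solution above.

```r
## Hypothetical helper: label groups of bars with the value range they cover.
## Assumes 'x' is already sorted in decreasing order.
label.bar.groups <- function(x, breaks = 4, col = 2, cex = 0.6) {
    mp <- barplot(x, names.arg = rep("", length(x)))
    h <- hist(x, plot = FALSE, breaks = breaks)
    labs <- format(h$breaks, big.mark = ",", scientific = FALSE, trim = TRUE)
    y.line <- -0.02 * max(x)   # assumed offsets, relative to the tallest bar
    y.text <- -0.04 * max(x)

    op <- par(xpd = TRUE)      # allow drawing below the plot region
    on.exit(par(op))           # restore graphics settings on exit

    i <- length(mp)            # label from right (smallest) to left (largest)
    for (q in seq_along(h$counts)) {
        j <- h$counts[q]
        if (j > 0) {           # skip empty bins
            st <- i
            en <- i - j + 1
            segments(mp[st], y.line, mp[en], y.line, lwd = 2, col = col)
            text(mean(mp[st:en]), y.text, pos = 1, cex = cex,
                 labels = paste0(labs[q], " -\n", labs[q + 1]))
            i <- i - j
        }
    }
    invisible(h$counts)        # bars per group, lowest range first
}

set.seed(42)                   # assumed seed, only for reproducibility
x <- sort((runif(70) * 100)^3, decreasing = TRUE)
counts <- label.bar.groups(x)
```

The returned counts are the number of bars in each labeled group, starting with the lowest value range (the rightmost bars).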
Marc Schwartz
Cheers, Dan.