stacked plot

3 messages · Dennis Murphy, Henri-Paul Indiogine

#
Hi!

I am trying to use ggplot2 to create a stacked bar plot.  Previously I
tried using barplot() but gave up because of problems with the
positioning of the legend and other appearance problems.   I am now
trying to learn ggplot2 and use it for all the plots that I need to
create for my dissertation.

I am able to create normal bar plots using ggplot2, but I am stumped
by the stacked bar plots.

This works:

barplot(t(file.codes), beside = FALSE)

the data.frame file.codes looks like this .....

        code.1 code.2 code.3 code.4 code.5 ....
file.1      2       0         0         5        4      ....
file.2      3       18       1         0        2      ....
....

I would like each file to be a bar, with the codes stacked within each
file. Transposing the file.codes data.frame lets barplot() do this.
I am trying to obtain the same result in ggplot2, but I
think that qplot wants the data to be like this:

file.1 code.1  2
file.1 code.2  0
file.1 code.3  0
file.1 code.4  5
file.1 code.5  4
file.2 code.1  3
file.2 code.2  18
....

I think that I need to use the package "reshape", but I am not sure
whether to use cast(), melt(), or recast() and how to set up the
function.
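
A minimal sketch of the reshaping step, assuming the reshape2 package is installed (the older reshape package has a similar melt(), with slightly different argument names). The toy file.codes below is hypothetical and only reuses the values shown above; the column names file.name, codes and value are chosen for illustration. Of the three functions, melt() is the one that goes from the wide layout to the long one:

```r
library(reshape2)  # assumption: reshape2 is available

# Hypothetical stand-in for file.codes, using the values shown in the post
file.codes <- data.frame(
  code.1 = c(2, 3),
  code.2 = c(0, 18),
  code.3 = c(0, 1),
  code.4 = c(5, 0),
  code.5 = c(4, 2),
  row.names = c("file.1", "file.2")
)

# melt() ignores row names, so copy them into an ordinary column first
file.codes$file.name <- rownames(file.codes)

# One row per (file, code) pair; every column except file.name is stacked
DF.2 <- melt(file.codes, id.vars = "file.name",
             variable.name = "codes", value.name = "value")
DF.2
```

cast() and recast() go the other way (long back to wide, or melt and cast in one step), so they are probably not needed here.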

Thanks,
Henri-Paul
#
It appears that your object is currently a matrix. Here's a toy
example to illustrate how to get a stacked bar chart in ggplot2:

library('ggplot2')
m <- matrix(1:9, ncol = 3, dimnames = list(letters[1:3], LETTERS[1:3]))
(d <- as.data.frame(as.table(m)))
  Var1 Var2 Freq
1    a    A    1
2    b    A    2
3    c    A    3
4    a    B    4
5    b    B    5
6    c    B    6
7    a    C    7
8    b    C    8
9    c    C    9

ggplot(d, aes(x = Var1, y = Freq, fill = Var2)) +
   geom_bar(position = 'stack', stat = 'identity') +
   labs(x = 'Variable 1', y = 'Frequency', fill = 'Group') +
   scale_fill_manual(values = c('A' = 'red', 'B' = 'blue', 'C' = 'green'))

This plot uses Var1 as the x-variable, Freq as the response and Var2
as the variable whose frequencies are to be stacked, distinguished by
fill color. position = 'stack' designates the stacking while stat =
'identity' indicates that the y variable Freq should be used to
represent the counts.
labs() designates the labels for each axis; the fill = label
indicates the legend title for the fill colors. Finally, the
scale_fill_manual() function is used to manually assign specific
colors to levels of the fill variable Var2. The scale_fill_manual()
code could also have been written as

... +
scale_fill_manual(breaks = levels(d$Var2), values = c('red', 'blue', 'green'))

with the same result.
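
As a quick sanity check on what position = 'stack' produces, the total height of each bar should equal the sum of Freq within each level of Var1. A small base-R sketch using the same toy matrix (the name bar.heights is introduced here for illustration only):

```r
# Same toy data as in the example above
m <- matrix(1:9, ncol = 3, dimnames = list(letters[1:3], LETTERS[1:3]))
d <- as.data.frame(as.table(m))

# Total height of each stacked bar: sum of Freq per level of Var1
bar.heights <- tapply(d$Freq, d$Var1, sum)
bar.heights
```

The segments within each bar are the individual Freq values, one per level of Var2.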

HTH,
Dennis

On Thu, Oct 20, 2011 at 10:08 PM, Henri-Paul Indiogine
<hindiogine at gmail.com> wrote:
#
Hi Dennis!

Fantastic, great, wonderful, beautiful.

I slightly changed your code to adapt it to my situation:

ggplot(DF.2, aes(x = file.name, y = value, fill = codes)) +
   geom_histogram(position = "stack", stat = "identity") +
   labs(x = "document", y = "number of codings")

#######################
file.name  codes   value
------------------------
file.1     code.1      2
file.1     code.2      0
file.1     code.3      0
file.1     code.4      5
file.1     code.5      4
file.2     code.1      3
file.2     code.2     18
 ....

There are 126 bars (file.1 -> file.126), so I should do the following:
(1) convert to a histogram with no gaps between the bars, and (2)
remove the labels at the bottom of each bar and just have
xlab="documents".

However, even with changing geom_bar to geom_histogram there are small
gaps between the bars.
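
One possible way to get both effects, offered as a sketch rather than a definitive answer: on a discrete x axis the gaps come from the default bar width being less than 1, so setting width = 1 in geom_bar() should close them, and the per-bar labels can be blanked through theme(). The DF.2 below is a hypothetical two-file stand-in built from the values shown above.

```r
library(ggplot2)

# Hypothetical long-format data with the column names used in this post
DF.2 <- data.frame(
  file.name = rep(c("file.1", "file.2"), each = 5),
  codes     = rep(paste0("code.", 1:5), times = 2),
  value     = c(2, 0, 0, 5, 4, 3, 18, 1, 0, 2)
)

p <- ggplot(DF.2, aes(x = file.name, y = value, fill = codes)) +
  # width = 1 makes each bar fill its whole slot, leaving no gap
  geom_bar(stat = "identity", position = "stack", width = 1) +
  labs(x = "documents", y = "number of codings") +
  # drop the per-bar tick labels and tick marks, keep the axis title
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())
p
```

In recent ggplot2 versions, geom_col(width = 1) is a shorthand for geom_bar(stat = "identity", width = 1).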

Thanks for your help,

Henri-Paul