Hello, I'm trying to create a line graph with a legend, but have no
success controlling the legend. Since nothing I've tried seems to work,
I must be doing something systematically wrong. Can anyone point this
out to me?
Here's my data:
> weights
# A tibble: 1,246 × 3
   Date           J     K
   <date>     <dbl> <dbl>
 1 2000-02-13   133  188
 2 2000-02-20   134  185
 3 2000-02-27   135  187
 4 2000-03-05   135  185
 5 2000-03-12    NA  184
 6 2000-03-19    NA  184.
 7 2000-03-26   136  184.
 8 2000-04-02   134  185
 9 2000-04-09   133  186
10 2000-04-16    NA  186
# ℹ 1,236 more rows
# ℹ Use `print(n = ...)` to see more rows
>
Here are my attempts. You can see some of the things I've tried in the
commented-out sections:
weights %>%
    group_by(year(Date)) %>%
    summarize(
        m_K = mean(K, na.rm = TRUE),
        m_J = mean(J, na.rm = TRUE),
    ) %>%
    ggplot(aes(x = `year(Date)`)) +
    geom_point(aes(y = m_K, color = "red")) +
    geom_smooth(aes(y = m_K, color = "red")) +
    geom_point(aes(y = m_J, color = "blue")) +
    geom_smooth(aes(y = m_J, color = "blue")) +
    guides(size = "legend",
           shape = "legend")
    ## scale_shape_discrete(name="Person",
    ##                      breaks=c("m_K", "m_J"),
    ##                      labels=c("K", "J"))
    ## theme(legend.title=element_blank())
When this runs, the blue line for "K" is above the red line for "J", as
I expect, but in the legend, the red is shown first, and labeled "blue."
I'd like to be able to create a legend where the first entry shows a
blue line and is labeled "K" and the second is red and labeled "J".
On a different but related topic, I'd welcome any advice or suggestions
on my methodology in this example. Is this the correct way to summarize
with a mean? Do I need the two sets of geom_point and geom_line clauses
to create this graph, or is there a better way?
Thanks for all your advice and guidance.
-Kevin
Newbie: Controlling legends in graphs
5 messages · Rui Barradas, Christopher W. Ryan, Kevin Zembower
At 14:24 on 12/05/2023, Kevin Zembower via R-help wrote:
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hello,
This is mainly a data reshaping problem. Instead of plotting two
variables, J and K, put the data in long format, map the column that
holds the variable names to the color aesthetic, and call each geom_*
only once. Then assign the colors you want.
As for placing K above J, note that ggplot places them by alphabetical
order unless you coerce to factor with the levels in the order you want.
Also, if you want to compute aggregate statistics for several columns,
use ?across. See the code below.
Here is a complete example. I have augmented your data set in order to
have more years to plot.
# augment the data set
library(lubridate)  # years() below comes from lubridate

weights <- "   Date           J     K
 1 2000-02-13   133  188
 2 2000-02-20   134  185
 3 2000-02-27   135  187
 4 2000-03-05   135  185
 5 2000-03-12    NA  184
 6 2000-03-19    NA  184.
 7 2000-03-26   136  184.
 8 2000-04-02   134  185
 9 2000-04-09   133  186
10 2000-04-16    NA  186"
weights <- read.table(text = weights, header = TRUE)
weights$Date <- as.Date(weights$Date)

# make ten copies of the data, shifted by 1 to 10 years,
# with random noise added to J and K
tmp <- weights
tmp <- lapply(1:10, \(y) {
  tmp$Date <- years(y) + tmp$Date
  tmp$J <- tmp$J + sample(-10:10, nrow(weights), TRUE)
  tmp$K <- tmp$K + sample(-10:10, nrow(weights), TRUE)
  tmp
})
weights <- do.call(rbind, tmp)
#---
# plot code
library(ggplot2)
library(dplyr)
library(tidyr)
library(lubridate)

weights %>%
  mutate(Year = year(Date)) %>%
  group_by(Year) %>%
  # anonymous function instead of across(J:K, mean, na.rm = TRUE):
  # passing extra arguments through across() is deprecated since dplyr 1.1.0
  summarize(across(J:K, \(x) mean(x, na.rm = TRUE))) %>%
  # now reshape the data
  pivot_longer(-Year) %>%
  # uncomment the next line if you want K
  # to show up on top in the legend
  # mutate(name = factor(name, levels = c("K", "J"))) %>%
  ggplot(aes(Year, value, color = name)) +
  geom_smooth(
    formula = y ~ x,
    method = lm,
    se = FALSE
  ) +
  geom_point() +
  scale_color_manual(values = c(J = "red", K = "blue"))
Hope this helps,
Rui Barradas
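For anyone who wants to see what the summarize-and-reshape step produces before any plotting, here is a minimal sketch on a made-up four-row data frame. The name `toy` and its numbers are hypothetical, not Kevin's data, and the code assumes dplyr 1.1.0 or later and R 4.1 or later (for the `\(x)` shorthand).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two observations per year, one missing J value
toy <- data.frame(
  Year = c(2000, 2000, 2001, 2001),
  J    = c(133, 135, NA, 137),
  K    = c(188, 186, 184, 182)
)

long <- toy %>%
  group_by(Year) %>%
  # one yearly mean per column, ignoring missing values
  summarize(across(J:K, \(x) mean(x, na.rm = TRUE))) %>%
  # wide (Year, J, K) to long (Year, name, value)
  pivot_longer(-Year) %>%
  # factor levels, not alphabetical order, now decide the legend order
  mutate(name = factor(name, levels = c("K", "J")))

print(long)
```

The result has one row per year and person, with the person in `name` and the yearly mean in `value`, which is the shape that `aes(Year, value, color = name)` expects.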
3 days later
Rui, thanks so much for your help. Your explanation and example were clear and concise. Thanks for taking the time and effort to help me. -Kevin
On 5/12/23 16:06, Rui Barradas wrote:
Hello,
This is mainly a data reshaping problem. Instead of plotting two
variables, J and K, put the data in long format, map the column that
holds the variable names to the color aesthetic, and call each geom_*
only once. Then assign the colors you want.
Hope this helps,
Rui Barradas
I'm more of a lattice guy than a ggplot guy, but perhaps this is part of the problem:
.....
    geom_point(aes(y = m_K, color = "red")) +    ##### >> you've associated "K" with the color red
    geom_smooth(aes(y = m_K, color = "red")) +
    geom_point(aes(y = m_J, color = "blue")) +   ###### >> and "J" with the color blue
    geom_smooth(aes(y = m_J, color = "blue")) +
.....
##### >> but you object about a "blue line for K" and a "red line for J"?
When this runs, the blue line for "K" is above the red line for "J", as
I expect, but in the legend, the red is shown first, and labeled "blue."
I'd like to be able to create a legend where the first entry shows a
blue line and is labeled "K" and the second is red and labeled "J".
Thanks for all your advice and guidance.
.......
-Kevin
See below.
On 5/16/23 10:52, Christopher Ryan wrote:
I'm more of a lattice guy than a ggplot guy, but perhaps this is part of
the problem:
.....
    geom_point(aes(y = m_K, color = "red")) +    ##### >> you've associated "K" with the color red
    geom_smooth(aes(y = m_K, color = "red")) +
    geom_point(aes(y = m_J, color = "blue")) +   ###### >> and "J" with the color blue
    geom_smooth(aes(y = m_J, color = "blue")) +
.....
Yes, I was confused: I had associated "K" with the color red, yet the line and points for K's data were drawn in blue, while the legend labeled them with the word "red". But I think I've got it straightened out now. Thanks for your help. -Kevin
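A likely explanation for the swapped legend: a quoted string inside aes() is treated as a data value, not as a color name. So "red" and "blue" became legend labels, and ggplot2 assigned its default palette to them in alphabetical order. With the default palette, the first label ("blue") gets a reddish color and the second ("red") a bluish one. Below is a minimal sketch that keeps the wide layout and controls the legend through scale_color_manual(). The data frame `yearly` and its numbers are made up for illustration and are not Kevin's data; geom_line() stands in for geom_smooth() to keep the example short.

```r
library(ggplot2)

# Hypothetical yearly means in wide format (made-up numbers)
yearly <- data.frame(
  Year = 2000:2003,
  m_J  = c(134, 135, 133, 136),
  m_K  = c(186, 185, 187, 184)
)

p <- ggplot(yearly, aes(x = Year)) +
  # inside aes(), "K" and "J" are legend labels, not color names
  geom_point(aes(y = m_K, color = "K")) +
  geom_line(aes(y = m_K, color = "K")) +
  geom_point(aes(y = m_J, color = "J")) +
  geom_line(aes(y = m_J, color = "J")) +
  # values: the color each label gets; breaks: the legend order
  scale_color_manual(
    name   = "Person",
    values = c(K = "blue", J = "red"),
    breaks = c("K", "J")
  )

print(p)
```

Here the legend lists K first in blue and J second in red, independent of alphabetical order. The long-format approach from Rui's reply scales better once there are more than two series.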