An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20111106/e98b2013/attachment.pl>
Correlation between matrices
4 messages · Kaiyin Zhong, Dennis Murphy
Hi:
I don't think you want to keep these objects separate; it's better to
combine everything into a data frame. Here's a variation of your
example - the x variable ends up being a mouse, but you may have
another variable that's more appropriate to plot so take this as a
starting point. One plot uses the ggplot2 package, the other uses the
lattice and latticeExtra packages.
library('ggplot2')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
'cerebellum')
mice = paste('mouse', 1:5, sep='')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
# Generate a data frame from the combinations of
# mice, regions and elem:
d <- data.frame(expand.grid(mice = mice, regions = regions,
elem = elem), y = rnorm(125))
# Create a numeric version of mice
d$mouse <- as.numeric(d$mice)
# A function to return regression coefficients
coefun <- function(df) coef(lm(y ~ mouse), data = df)
# Apply to all regions * elem combinations
coefs <- ddply(d, .(regions, elem), coefun)
names(coefs) <- c('regions', 'elem', 'b0', 'b1')
# Generate the plot using package ggplot2:
ggplot(d, aes(x = mouse, y = y)) +
geom_point(size = 2.5) +
geom_abline(data = coefs, aes(intercept = b0, slope = b1),
size = 1) +
facet_grid(elem ~ regions)
# Same plot in lattice:
library('lattice')
library('latticeExtra')
p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'),
layout = c(5, 5))
HTH,
Dennis
On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
'cerebellum')
mice = paste('mouse', 1:5, sep='')
for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
+ ? assign(n, as.data.frame(replicate(5, rnorm(5)))) + }
names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) =
row.names(Enzyme) = mice
Cu
? ? ? ? ? cortex hippocampus brain_stem ?mid_brain cerebellum mouse1 -0.5436573 -0.31486713 ?0.1039148 -0.3908665 -1.0849112 mouse2 ?1.4559136 ?1.75731752 -2.1195118 -0.9894767 ?0.3609033 mouse3 -0.6735427 -0.04666507 ?0.9641000 ?0.4683339 ?0.7419944 mouse4 ?0.6926557 -0.47820023 ?1.3560802 ?0.9967562 -1.3727874 mouse5 ?0.2371585 ?0.20031393 -1.4978517 ?0.7535148 ?0.5632443
Zn
? ? ? ? ? ?cortex hippocampus brain_stem ?mid_brain ?cerebellum mouse1 -0.66424043 ? 0.6664478 ?1.1983546 ?0.0319403 ?0.41955740 mouse2 -1.14510448 ? 1.5612235 ?0.3210821 ?0.4094753 ?1.01637466 mouse3 -0.85954416 ? 2.8275458 -0.6922565 -0.8182307 -0.06961242 mouse4 ?0.03606034 ?-0.7177256 ?0.7067217 ?0.2036655 -0.25542524 mouse5 ?0.67427572 ? 0.6171704 ?0.1044267 -1.8636174 -0.07654666
Fe
? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum mouse1 ?1.8337008 ? 2.0884261 ?0.29730413 -1.6884804 ?0.8336137 mouse2 -0.2734139 ?-0.5728439 ?0.63791556 -0.6232828 -1.1352224 mouse3 -0.4795082 ? 0.1627235 ?0.21775206 ?1.0751584 -0.5581422 mouse4 ?1.7125147 ?-0.5830600 ?1.40597896 -0.2815305 ?0.3776360 mouse5 -0.3469067 ?-0.4813120 -0.09606797 ?1.0970077 -1.1234038
Ca
? ? ? ? ? cortex hippocampus ?brain_stem ? mid_brain cerebellum mouse1 -0.7663354 ? 0.8595091 ?1.33803798 -1.17651576 ?0.8299963 mouse2 -0.7132260 ?-0.2626811 ?0.08025079 -2.40924271 ?0.7883005 mouse3 -0.7988904 ?-0.1144639 -0.65901136 ?0.42462227 ?0.7068755 mouse4 ?0.3880393 ? 0.5570068 -0.49969135 ?0.06633009 -1.3497228 mouse5 ?1.0077684 ? 0.6023264 -0.57387762 ?0.25919461 -0.9337281
Enzyme
? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum mouse1 ?1.3430936 ? 0.5335819 -0.56992947 ?1.3565803 -0.8323391 mouse2 ?1.0520850 ?-1.0201124 ?0.89600005 ?1.4719880 ?1.0854768 mouse3 -0.2802482 ? 0.6863323 -1.37483570 -0.7790174 ?0.2446761 mouse4 -0.1916415 ?-0.4566571 ?1.93365932 ?1.3493848 ?0.2130424 mouse5 -1.0349593 ?-0.1940268 -0.07216321 -0.2968288 ?1.7406905 In each anatomic region, I would like to calculate the correlation between Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and do a scatter plot with a tendency line, organizing those plots into a grid. See the image below for the desired effect: http://postimage.org/image/62brra6jn/ How can I achieve this? Thank you in advance. ? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Thank you Dennis, your tips are really helpful. I don't quite understand the lm(y~mouse) part; my intention was -- in pseudo code -- lm(y(Enzyme) ~ y(each elem)). In addition, attach(d) seems necessary before using lm(y~mouse), and since d$mouse has a length 125, while each elem for each region has a length 5, it generates the following error:
coefs = ddply(d, .(regions, elem), coefun)
Error in model.frame.default(formula = y ~ mouse, drop.unused.levels = TRUE) : variable lengths differ (found for 'mouse')
On Sun, Nov 6, 2011 at 12:53 PM, Dennis Murphy <djmuser at gmail.com> wrote:
Hi:
I don't think you want to keep these objects separate; it's better to
combine everything into a data frame. Here's a variation of your
example - the x variable ends up being a mouse, but you may have
another variable that's more appropriate to plot so take this as a
starting point. One plot uses the ggplot2 package, the other uses the
lattice and latticeExtra packages.
library('ggplot2')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
? ? ? ? ? ?'cerebellum')
mice = paste('mouse', 1:5, sep='')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
# Generate a data frame from the combinations of
# mice, regions and elem:
d <- data.frame(expand.grid(mice = mice, regions = regions,
? ? ? ? ? ? ? ? ? ? ? ? ? ?elem = elem), y = rnorm(125))
# Create a numeric version of mice
d$mouse <- as.numeric(d$mice)
# A function to return regression coefficients
coefun <- function(df) coef(lm(y ~ mouse), data = df)
# Apply to all regions * elem combinations
coefs <- ddply(d, .(regions, elem), coefun)
names(coefs) <- c('regions', 'elem', 'b0', 'b1')
# Generate the plot using package ggplot2:
ggplot(d, aes(x = mouse, y = y)) +
? geom_point(size = 2.5) +
? geom_abline(data = coefs, aes(intercept = b0, slope = b1),
? ? ? ? ? ? ? ? ? ? ? ? ? ? size = 1) +
? facet_grid(elem ~ regions)
# Same plot in lattice:
library('lattice')
library('latticeExtra')
p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'),
? ? ? ? layout = c(5, 5))
HTH,
Dennis
On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
'cerebellum')
mice = paste('mouse', 1:5, sep='')
for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
+ ? assign(n, as.data.frame(replicate(5, rnorm(5)))) + }
names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) =
row.names(Enzyme) = mice
Cu
? ? ? ? ? cortex hippocampus brain_stem ?mid_brain cerebellum mouse1 -0.5436573 -0.31486713 ?0.1039148 -0.3908665 -1.0849112 mouse2 ?1.4559136 ?1.75731752 -2.1195118 -0.9894767 ?0.3609033 mouse3 -0.6735427 -0.04666507 ?0.9641000 ?0.4683339 ?0.7419944 mouse4 ?0.6926557 -0.47820023 ?1.3560802 ?0.9967562 -1.3727874 mouse5 ?0.2371585 ?0.20031393 -1.4978517 ?0.7535148 ?0.5632443
Zn
? ? ? ? ? ?cortex hippocampus brain_stem ?mid_brain ?cerebellum mouse1 -0.66424043 ? 0.6664478 ?1.1983546 ?0.0319403 ?0.41955740 mouse2 -1.14510448 ? 1.5612235 ?0.3210821 ?0.4094753 ?1.01637466 mouse3 -0.85954416 ? 2.8275458 -0.6922565 -0.8182307 -0.06961242 mouse4 ?0.03606034 ?-0.7177256 ?0.7067217 ?0.2036655 -0.25542524 mouse5 ?0.67427572 ? 0.6171704 ?0.1044267 -1.8636174 -0.07654666
Fe
? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum mouse1 ?1.8337008 ? 2.0884261 ?0.29730413 -1.6884804 ?0.8336137 mouse2 -0.2734139 ?-0.5728439 ?0.63791556 -0.6232828 -1.1352224 mouse3 -0.4795082 ? 0.1627235 ?0.21775206 ?1.0751584 -0.5581422 mouse4 ?1.7125147 ?-0.5830600 ?1.40597896 -0.2815305 ?0.3776360 mouse5 -0.3469067 ?-0.4813120 -0.09606797 ?1.0970077 -1.1234038
Ca
? ? ? ? ? cortex hippocampus ?brain_stem ? mid_brain cerebellum mouse1 -0.7663354 ? 0.8595091 ?1.33803798 -1.17651576 ?0.8299963 mouse2 -0.7132260 ?-0.2626811 ?0.08025079 -2.40924271 ?0.7883005 mouse3 -0.7988904 ?-0.1144639 -0.65901136 ?0.42462227 ?0.7068755 mouse4 ?0.3880393 ? 0.5570068 -0.49969135 ?0.06633009 -1.3497228 mouse5 ?1.0077684 ? 0.6023264 -0.57387762 ?0.25919461 -0.9337281
Enzyme
? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum mouse1 ?1.3430936 ? 0.5335819 -0.56992947 ?1.3565803 -0.8323391 mouse2 ?1.0520850 ?-1.0201124 ?0.89600005 ?1.4719880 ?1.0854768 mouse3 -0.2802482 ? 0.6863323 -1.37483570 -0.7790174 ?0.2446761 mouse4 -0.1916415 ?-0.4566571 ?1.93365932 ?1.3493848 ?0.2130424 mouse5 -1.0349593 ?-0.1940268 -0.07216321 -0.2968288 ?1.7406905 In each anatomic region, I would like to calculate the correlation between Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and do a scatter plot with a tendency line, organizing those plots into a grid. See the image below for the desired effect: http://postimage.org/image/62brra6jn/ How can I achieve this? Thank you in advance. ? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi:
On Sat, Nov 5, 2011 at 11:06 PM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
Thank you Dennis, your tips are really helpful. I don't quite understand the lm(y~mouse) part; my intention was -- in pseudo code -- lm(y(Enzyme) ~ y(each elem)).
As I said in my first response, I didn't quite understand what you
were trying to regress so I used the mouse as a way of showing you how
the code works. I think I understand what you want now, though.
I'll create a data set in two ways: the first assumes you have the
data as constructed in your original post and the second generates
random numbers after erecting a 'scaffold' data frame. The game is to
separate the enzyme data from the element data and put them into the
final data frame as separate columns. Then the regression is easy if
that's what you need to do.
# Method 1: Generate the data as you did into separate data frames
elem0 <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
'cerebellum')
# Creates five 5 x 5 data frames with names V1-V5:
for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
assign(n, as.data.frame(replicate(5, rnorm(5))))
}
# Stack the chemical element data using melt() from
# the reshape2 package:
library('reshape2')
d1 <- rbind(melt(Cu), melt(Zn), melt(Fe), melt(Ca))
# Relabel V1 - V5 with brain region names, add a factor
# to distinguish individual elements and tack on the melted
# Enzyme data so that it repeats in each element block
d1 <- within(d1, {
variable <- factor(d1$variable, labels = regions)
elem <- factor(rep(elem0[1:4], each = 25))
Enzyme <- melt(Enzyme)[, 2]
} )
# Plot the data using lattice and latticeExtra:
library('lattice')
library('latticeExtra')
p <- xyplot(Enzyme ~ value | variable + elem,
data = d1, type = c('p', 'r'))
useOuterStrips(p)
###########################################
## Method 2: Generate the random data after setting
## up the element/region/mouse combinations
##
# Generate a data frame from the combinations of
# mice, regions and elem:
library('ggplot2')
mice <- paste('mouse', 1:5, sep = '')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
'cerebellum')
elem <- elem0[1:4]
d0 <- data.frame(expand.grid(mice = mice,
regions = regions, elem = elem))
d0 <- within(d0, {
value <- rnorm(100) # generate element values
Enzyme <- rnorm(25) # generate enzyme values
} )
# the Enzyme values are recycled through all element blocks.
# You can either adapt the lattice code above to plot d0, or you
# can do the following to get an analogous plot in ggplot2.
# It's easier to compute the slopes and intercepts and put
# them into a data frame that ggplot() can import, so that's
# what we'll do first.
# A function to return regression coefficients from a
# generic data frame. Since this function goes into ddply(),
# the argument df is a (generic) data frame and the output
# will be converted to a one-line data frame.
coefun <- function(df) coef(lm(Enzyme ~ value, data = df))
# Apply the function to all regions * elem combinations.
# Output is a data frame of coefficients corresponding to
# each region/element combination
coefs <- ddply(d0, .(regions, elem), coefun)
# Rename the columns
names(coefs) <- c('regions', 'elem', 'b0', 'b1')
# Generate the plot using package ggplot2:
ggplot(d0, aes(x = val, y = Enzyme)) +
geom_point(size = 2.5) +
geom_abline(data = coefs, aes(intercept = b0, slope = b1),
size = 1) +
xlab("") +
facet_grid(elem ~ regions)
In addition, attach(d) seems necessary before using lm(y~mouse), and since d$mouse has a length 125, while each elem for each region has a length 5, it generates the following error:
You should never need to use attach() - use the data = argument in lm() instead, where the value of data is the name of a data frame. It's always easier to use the modeling functions in R having formula interfaces with data frames.
coefs = ddply(d, .(regions, elem), coefun)
Error in model.frame.default(formula = y ~ mouse, drop.unused.levels = TRUE) : ?variable lengths differ (found for 'mouse')
You're clearly doing something here that's messing up the structure of the data. Study what the code (and its output) above are telling you, particularly if you're not familiar with plyr, lattice and/or ggplot2. Writing functions to insert into a **ply() function in plyr can be tricky. If you continue to have problems, please provide a reproducible example as you did here. HTH, Dennis
On Sun, Nov 6, 2011 at 12:53 PM, Dennis Murphy <djmuser at gmail.com> wrote:
Hi:
I don't think you want to keep these objects separate; it's better to
combine everything into a data frame. Here's a variation of your
example - the x variable ends up being a mouse, but you may have
another variable that's more appropriate to plot so take this as a
starting point. One plot uses the ggplot2 package, the other uses the
lattice and latticeExtra packages.
library('ggplot2')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
? ? ? ? ? ?'cerebellum')
mice = paste('mouse', 1:5, sep='')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
# Generate a data frame from the combinations of
# mice, regions and elem:
d <- data.frame(expand.grid(mice = mice, regions = regions,
? ? ? ? ? ? ? ? ? ? ? ? ? ?elem = elem), y = rnorm(125))
# Create a numeric version of mice
d$mouse <- as.numeric(d$mice)
# A function to return regression coefficients
coefun <- function(df) coef(lm(y ~ mouse), data = df)
# Apply to all regions * elem combinations
coefs <- ddply(d, .(regions, elem), coefun)
names(coefs) <- c('regions', 'elem', 'b0', 'b1')
# Generate the plot using package ggplot2:
ggplot(d, aes(x = mouse, y = y)) +
? geom_point(size = 2.5) +
? geom_abline(data = coefs, aes(intercept = b0, slope = b1),
? ? ? ? ? ? ? ? ? ? ? ? ? ? size = 1) +
? facet_grid(elem ~ regions)
# Same plot in lattice:
library('lattice')
library('latticeExtra')
p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'),
? ? ? ? layout = c(5, 5))
HTH,
Dennis
On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
'cerebellum')
mice = paste('mouse', 1:5, sep='')
for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
+ ? assign(n, as.data.frame(replicate(5, rnorm(5)))) + }
names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) =
row.names(Enzyme) = mice
Cu
? ? ? ? ? cortex hippocampus brain_stem ?mid_brain cerebellum mouse1 -0.5436573 -0.31486713 ?0.1039148 -0.3908665 -1.0849112 mouse2 ?1.4559136 ?1.75731752 -2.1195118 -0.9894767 ?0.3609033 mouse3 -0.6735427 -0.04666507 ?0.9641000 ?0.4683339 ?0.7419944 mouse4 ?0.6926557 -0.47820023 ?1.3560802 ?0.9967562 -1.3727874 mouse5 ?0.2371585 ?0.20031393 -1.4978517 ?0.7535148 ?0.5632443
Zn
? ? ? ? ? ?cortex hippocampus brain_stem ?mid_brain ?cerebellum mouse1 -0.66424043 ? 0.6664478 ?1.1983546 ?0.0319403 ?0.41955740 mouse2 -1.14510448 ? 1.5612235 ?0.3210821 ?0.4094753 ?1.01637466 mouse3 -0.85954416 ? 2.8275458 -0.6922565 -0.8182307 -0.06961242 mouse4 ?0.03606034 ?-0.7177256 ?0.7067217 ?0.2036655 -0.25542524 mouse5 ?0.67427572 ? 0.6171704 ?0.1044267 -1.8636174 -0.07654666
Fe
? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum mouse1 ?1.8337008 ? 2.0884261 ?0.29730413 -1.6884804 ?0.8336137 mouse2 -0.2734139 ?-0.5728439 ?0.63791556 -0.6232828 -1.1352224 mouse3 -0.4795082 ? 0.1627235 ?0.21775206 ?1.0751584 -0.5581422 mouse4 ?1.7125147 ?-0.5830600 ?1.40597896 -0.2815305 ?0.3776360 mouse5 -0.3469067 ?-0.4813120 -0.09606797 ?1.0970077 -1.1234038
Ca
? ? ? ? ? cortex hippocampus ?brain_stem ? mid_brain cerebellum mouse1 -0.7663354 ? 0.8595091 ?1.33803798 -1.17651576 ?0.8299963 mouse2 -0.7132260 ?-0.2626811 ?0.08025079 -2.40924271 ?0.7883005 mouse3 -0.7988904 ?-0.1144639 -0.65901136 ?0.42462227 ?0.7068755 mouse4 ?0.3880393 ? 0.5570068 -0.49969135 ?0.06633009 -1.3497228 mouse5 ?1.0077684 ? 0.6023264 -0.57387762 ?0.25919461 -0.9337281
Enzyme
? ? ? ? ? cortex hippocampus ?brain_stem ?mid_brain cerebellum mouse1 ?1.3430936 ? 0.5335819 -0.56992947 ?1.3565803 -0.8323391 mouse2 ?1.0520850 ?-1.0201124 ?0.89600005 ?1.4719880 ?1.0854768 mouse3 -0.2802482 ? 0.6863323 -1.37483570 -0.7790174 ?0.2446761 mouse4 -0.1916415 ?-0.4566571 ?1.93365932 ?1.3493848 ?0.2130424 mouse5 -1.0349593 ?-0.1940268 -0.07216321 -0.2968288 ?1.7406905 In each anatomic region, I would like to calculate the correlation between Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and do a scatter plot with a tendency line, organizing those plots into a grid. See the image below for the desired effect: http://postimage.org/image/62brra6jn/ How can I achieve this? Thank you in advance. ? ? ? ?[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.