Skip to content
Prev 276494 / 398506 Next

Correlation between matrices

Hi:
On Sat, Nov 5, 2011 at 11:06 PM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
As I said in my first response, I didn't quite understand what you
were trying to regress so I used the mouse as a way of showing you how
the code works. I think I understand what you want now, though.

I'll create a data set in two ways: the first assumes you have the
data as constructed in your original post and the second generates
random numbers after erecting a 'scaffold' data frame. The game is to
separate the enzyme data from the element data and put them into the
final data frame as separate columns. Then the regression is easy if
that's what you need to do.

# Method 1: Generate the data as you did into separate data frames

elem0 <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
            'cerebellum')

# Creates five 5 x 5 data frames with names V1-V5:
for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
   assign(n, as.data.frame(replicate(5, rnorm(5))))
 }

# Stack the chemical element data using melt() from
# the reshape2 package:
library('reshape2')
d1 <- rbind(melt(Cu), melt(Zn), melt(Fe), melt(Ca))
# Relabel V1 - V5 with brain region names, add a factor
# to distinguish individual elements and tack on the melted
# Enzyme data so that it repeats in each element block
d1 <- within(d1, {
        variable <- factor(d1$variable, labels = regions)
        elem <- factor(rep(elem0[1:4], each = 25))
        Enzyme <- melt(Enzyme)[, 2]
      }     )
# Plot the data using lattice and latticeExtra:
library('lattice')
library('latticeExtra')
p <- xyplot(Enzyme ~ value | variable + elem,
                   data = d1, type = c('p', 'r'))
useOuterStrips(p)

###########################################
## Method 2: Generate the random data after setting
##                 up the element/region/mouse combinations
##

# Generate a data frame from the combinations of
# mice, regions and elem:

library('ggplot2')

mice <- paste('mouse', 1:5, sep = '')
regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
            'cerebellum')
elem <- elem0[1:4]
d0 <- data.frame(expand.grid(mice = mice,
                             regions = regions, elem = elem))
d0 <- within(d0, {
        value <- rnorm(100)   # generate element values
        Enzyme <- rnorm(25)  # generate enzyme values
      }         )

# the Enzyme values are recycled through all element blocks.

# You can either adapt the lattice code above to plot d0, or you
# can do the following to get an analogous plot in ggplot2.
# It's easier to compute the slopes and intercepts and put
# them into a data frame that ggplot() can import, so that's
# what we'll do first.

# A function to return regression coefficients from a
# generic data frame. Since this function goes into ddply(),
# the argument df is a (generic) data frame and the output
# will be converted to a one-line data frame.

coefun <- function(df) coef(lm(Enzyme ~ value, data = df))

# Apply the function to all regions * elem combinations.
# Output is a data frame of coefficients corresponding to
# each region/element combination

coefs <- ddply(d0, .(regions, elem), coefun)
# Rename the columns
names(coefs) <- c('regions', 'elem', 'b0', 'b1')

# Generate the plot using package ggplot2:
ggplot(d0, aes(x = val, y = Enzyme)) +
   geom_point(size = 2.5) +
   geom_abline(data = coefs, aes(intercept = b0, slope = b1),
                             size = 1) +
   xlab("") +
   facet_grid(elem ~ regions)
You should never need to use attach() - use the data = argument in
lm() instead, where the value of data is the name of a data frame.
It's always easier to use the modeling functions in R having formula
interfaces with data frames.
You're clearly doing something here that's messing up the structure of
the data. Study what the code (and its output) above are telling you,
particularly if you're not familiar with plyr, lattice and/or ggplot2.
Writing functions to insert into a **ply() function in plyr can be
tricky. If you continue to have problems, please provide a
reproducible example as you did here.

HTH,
Dennis