Skip to content

Complex multiple t tests in a data frame with several id factors

3 messages · Kaiyin Zhong, Bert Gunter

#
I have assayed the concentrations of various metal elements in
different anatomic regions of two strains of mice. Now, for each
element, in each region, I want to do a t test to find whether there
is any difference between the two strains.

Here is what I did (using simulated data as an example):

# create the data frame
id geno region elem      conc
1   1  exp  brain   fe  8.497498
2   2  exp  brain   fe  9.280944
3   3  exp  brain   fe  9.726271
4   1   wt  brain   fe 11.556397
5   2   wt  brain   fe 10.992550
6   3   wt  brain   fe  9.711200
7   1  exp  spine   fe 11.168603
8   2  exp  spine   fe  9.331127
9   3  exp  spine   fe 11.048226
10  1   wt  spine   fe  8.480867
11  2   wt  spine   fe  8.887062
12  3   wt  spine   fe  8.329797
13  1  exp  brain   cu 10.242652
14  2  exp  brain   cu  9.865984
15  3  exp  brain   cu  9.163728
16  1   wt  brain   cu 11.099385
17  2   wt  brain   cu  9.364261
18  3   wt  brain   cu  9.718322
19  1  exp  spine   cu 10.720157
20  2  exp  spine   cu 11.505430
21  3  exp  spine   cu  9.499359
22  1   wt  spine   cu  9.855950
23  2   wt  spine   cu 10.120489
24  3   wt  spine   cu  9.526252
25  1  exp  brain   zn  9.736196
26  2  exp  brain   zn 11.938710
27  3  exp  brain   zn  9.668625
28  1   wt  brain   zn  9.961574
29  2   wt  brain   zn 10.461621
30  3   wt  brain   zn  9.873667
31  1  exp  spine   zn  9.708067
32  2  exp  spine   zn 10.109309
33  3  exp  spine   zn 10.973387
34  1   wt  spine   zn  8.406536
35  2   wt  spine   zn  7.797746
36  3   wt  spine   zn 11.127984

# use tapply to aggregate
region
elem brain     spine
  fe Numeric,6 Numeric,6
  cu Numeric,6 Numeric,6
  zn Numeric,6 Numeric,6

# check whether the order of data has been preserved after aggregation
[[1]]
[1]  8.497498  9.280944  9.726271 11.556397 10.992550  9.711200

# create an external factor for strain grouping
[1] "exp" "exp" "exp" "wt"  "wt"  "wt"

# do the t test using the grouping factor
region
elem brain  spine
  fe List,9 List,9
  cu List,9 List,9
  zn List,9 List,9

I believe I have made no mistakes so far, but I wonder is there a
better way of doing this?
#
The concentrations of the different metals within an animal are
correlated, so that doing as you suggest will almost certainly result
in nonsense P values. So I suggest you seek local statistical help or,
failing that, post on a statistical forum like stats.stackexchange.com
 .

There are  various multivariate packages -- check e.g. the ChemPhys
and Multivariate task views -- that may be pertinent, but your post
suggests that you probably need some help to use them. Ergo my
suggestion above.

Cheers,
Bert
On Sun, Dec 4, 2011 at 7:36 AM, Kaiyin Zhong <kindlychung at gmail.com> wrote:

  
    
#
Okay, thank you.
On Mon, Dec 5, 2011 at 12:01 AM, Bert Gunter <gunter.berton at gene.com> wrote: