about model.matrix

5 messages · Fix Ace, Bert Gunter, Michael Dewey

#
I tried to run the sample code from R:
dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
dd
   a b
1  1 1
2  1 2
3  1 3
4  1 4
5  2 1
6  2 2
7  2 3
8  2 4
9  3 1
10 3 2
11 3 3
12 3 4
options("contrasts")
model.matrix(~ a + b, dd)
   (Intercept) a2 a3 b2 b3 b4
1            1  0  0  0  0  0
2            1  0  0  1  0  0
3            1  0  0  0  1  0
4            1  0  0  0  0  1
5            1  1  0  0  0  0
6            1  1  0  1  0  0
7            1  1  0  0  1  0
8            1  1  0  0  0  1
9            1  0  1  0  0  0
10           1  0  1  1  0  0
11           1  0  1  0  1  0
12           1  0  1  0  0  1
When I tried to remove the intercept from the matrix, I used the following code:
model.matrix(~ 0 + a + b, dd)
   a1 a2 a3 b2 b3 b4
1   1  0  0  0  0  0
2   1  0  0  1  0  0
3   1  0  0  0  1  0
4   1  0  0  0  0  1
5   0  1  0  0  0  0
6   0  1  0  1  0  0
7   0  1  0  0  1  0
8   0  1  0  0  0  1
9   0  0  1  0  0  0
10  0  0  1  1  0  0
11  0  0  1  0  1  0
12  0  0  1  0  0  1

Here I noticed that all levels of a (a1, a2, and a3) were included. How can I include "b1" in the matrix as well, as in the matrix below?
   a1 a2 a3 b1 b2 b3 b4
1   1  0  0  1  0  0  0
2   1  0  0  0  1  0  0
3   1  0  0  0  0  1  0
4   1  0  0  0  0  0  1
5   0  1  0  1  0  0  0
6   0  1  0  0  1  0  0
7   0  1  0  0  0  1  0
8   0  1  0  0  0  0  1
9   0  0  1  1  0  0  0
10  0  0  1  0  1  0  0
11  0  0  1  0  0  1  0
12  0  0  1  0  0  0  1
#
This is really a question about statistics rather than R, but see below.
On 01/04/2015 06:28, Fix Ace wrote:
That got mangled but

In your matrix below try forming the sum of a1+a2+a3 and the sum of 
b1+b2+b3+b4. I think you will find they are linearly related.
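
The relation between the two sums can be checked directly in R. This is a minimal sketch, assuming the `dd` data frame from the original post; the seven-column matrix `X` is assembled by hand with `cbind` purely for illustration and is not something `model.matrix(~ 0 + a + b, dd)` returns by default.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# Indicator columns for every level of each factor, built separately
A <- model.matrix(~ 0 + a, dd)   # columns a1 a2 a3
B <- model.matrix(~ 0 + b, dd)   # columns b1 b2 b3 b4
X <- cbind(A, B)                 # the matrix asked for in the question

# Each observation has exactly one level of a and one level of b,
# so both sums are 1 in every row: a1+a2+a3 - (b1+b2+b3+b4) = 0
all(rowSums(A) == rowSums(B))    # TRUE

# Seven columns, but one of them is redundant
ncol(X)                          # 7
qr(X)$rank                       # 6
```

Because the rank (6) is smaller than the number of columns (7), one column carries no information that the others do not already provide, which is why the default coding drops b1.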

  
    
  
#
Thank you very much for the response. Then what does it mean? I am not a stat person, but have to use it for my project. :(
Could you please recommend some readings about it? Thanks a lot!
On Wednesday, April 1, 2015 10:58 AM, Michael Dewey <lists at dewey.myzen.co.uk> wrote:
#
This is a big topic. You might try looking for tutorials on "linear
models", with "rank" or "rank deficiency" as subtopics. One possible
book is:

http://www.amazon.com/Linear-Models-Chapman-Statistical-Science/dp/1439887330/ref=sr_1_5?s=books&ie=UTF8&qid=1427987551&sr=1-5&keywords=linear+models+in+statistics

... but there are dozens.

Better yet, consult a local statistical expert for help. Trying to
educate yourself is laudable, but may be unrealistic.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
On Thu, Apr 2, 2015 at 7:20 AM, Fix Ace <acefix at rocketmail.com> wrote:
#
You cannot have columns which are linearly dependent.
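
To see what this means in practice: the seven-column matrix can be constructed, but a model fitted to it cannot estimate all seven coefficients. The sketch below assumes the `dd` data frame from the original post and uses the `contrasts.arg` argument of `model.matrix` to request full indicator coding for b. The response `y` is made-up random data, used only to show how `lm` reacts.

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))

# Request one indicator column per level of b instead of treatment contrasts
X <- model.matrix(~ 0 + a + b, dd,
                  contrasts.arg = list(b = contrasts(dd$b, contrasts = FALSE)))
colnames(X)    # "a1" "a2" "a3" "b1" "b2" "b3" "b4"
qr(X)$rank     # 6, although X has 7 columns

# Hypothetical response, for illustration only
set.seed(1)
y <- rnorm(12)
fit <- lm(y ~ 0 + X)

# One coefficient is reported as NA because it is not estimable
sum(is.na(coef(fit)))   # 1
```

So the matrix itself is easy to obtain, but it is usable for fitting only if the redundancy is handled some other way (for example by dropping a column, which is what the default coding already does).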

Starting a project which uses statistics without having some sort of 
local statistical backup seems unprofitable.

install.packages("fortunes") # if not already done
library(fortunes)
fortune(122)
On 02/04/2015 15:20, Fix Ace wrote: