Fitting linear models

Is this homework? If so, you need to read the text and/or class notes more
carefully.

-- Bert Gunter

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Vemuri, Aparna
Sent: Monday, April 20, 2009 4:26 PM
To: r-help at r-project.org
Subject: [R] Fitting linear models

I am not sure if this is an R-users question, but since most of you here
are statisticians, I decided to give it a shot. 

I am using the lm() function in R to fit a dependent variable to a set
of 3 to 5 independent variables. For this, I used the following
commands:
model1<-lm(function=PBW~SO4+NO3+NH4)
Coefficients:
(Intercept)          SO4          NO3          NH4
    0.01323      0.01968      0.01856           NA

and
model2<-lm(function=PBW~SO4+NO3+NH4+Na+Cl)
Coefficients:
(Intercept)          SO4          NO3          NH4           Na           Cl
 -0.0006987   -0.0119750   -0.0295042    0.0842989    0.1344751           NA

In both cases, the last independent variable has a coefficient of NA in
the result. I say last variable because, when I change the order of the
variables, the coefficient changes (see below). Can anyone point me to
the reason R behaves this way? Is there any way for me to force R to use
all the variables? I checked the correlation matrices to make sure
there is no orthogonality between the variables.

Thanks
Aparna 

model1<-lm(formula = PBW ~ SO4 + NH4 +NO3)
model1
Call:
lm(formula = PBW ~ SO4 + NH4 + NO3)

Coefficients:
(Intercept)          SO4          NH4          NO3
    0.01323     -0.00430      0.06394           NA

model2<-lm(formula = PBW ~ SO4 + NO3 + Na + Cl + NH4)
model2
Call:
lm(formula = PBW ~ SO4 + NO3 + Na + Cl + NH4)

Coefficients:
(Intercept)          SO4          NO3           Na           Cl          NH4
 -0.0006987    0.0196371   -0.0050303    0.0685020    0.0427431           NA
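One situation that produces this pattern is exact linear dependence among the predictors: with its default singular.ok = TRUE, lm() reports NA for whichever redundant term it reaches last. The sketch below uses simulated data, not the data from this post. The relation NH4 = 0.375*SO4 + 0.29*NO3 is an assumption made purely for illustration (the posted model1 coefficients would be roughly consistent with something like it, but that is a guess). Pairwise correlations need not reveal a dependence of this kind, since it involves more than two variables at once.

```r
# Simulated data (NOT the poster's): NH4 is constructed as an exact linear
# combination of SO4 and NO3, so the design matrix is rank deficient.
set.seed(1)
n   <- 100
SO4 <- runif(n)
NO3 <- runif(n)
NH4 <- 0.375 * SO4 + 0.29 * NO3          # hypothetical exact dependence
PBW <- 0.01 + 0.02 * SO4 + 0.02 * NO3 + rnorm(n, sd = 0.001)

fit_a <- lm(PBW ~ SO4 + NO3 + NH4)       # NH4 entered last
fit_b <- lm(PBW ~ SO4 + NH4 + NO3)       # NO3 entered last

coef(fit_a)   # the NH4 coefficient is NA
coef(fit_b)   # now the NO3 coefficient is NA

# Both orderings describe the same fitted values.
all.equal(fitted(fit_a), fitted(fit_b))

# alias() reports which term is a linear combination of the others.
alias(fit_a)
```

If this is what is happening, the choice of which coefficient becomes NA is arbitrary, but the fitted model is the same in either ordering.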

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

I am not sure if this is an R-users question, but since most of you here
are statisticians, I decided to give it a shot.
You can omit the unnecessary preambles.

You really did not name your dependent variable "function" did you?  
Please stop that.

Just a guess, ... since you have not provided enough information to do  
otherwise, ... Are all of those variables 1/0 dummy variables? If so  
and if you want to have an output that satisfies your need for  
labeling the coefficients as you naively anticipate, then put "0+" at  
the beginning of the formula or "-1" at the end, so that the intercept  
will disappear and then all variables will get labeled as you expect.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
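The case guessed at here can be sketched as follows. The indicator variables d1, d2, d3 and the response y are hypothetical, made up for illustration. With an intercept in the model, a complete set of 1/0 indicators sums to the intercept column, so the last one comes back as NA. Removing the intercept removes that particular redundancy.

```r
# Hypothetical three-level grouping coded as three 1/0 indicator vectors.
set.seed(2)
g     <- rep(c("a", "b", "c"), each = 10)
d1    <- as.numeric(g == "a")
d2    <- as.numeric(g == "b")
d3    <- as.numeric(g == "c")
means <- c(a = 1, b = 2, c = 3)
y     <- unname(means[g]) + rnorm(30, sd = 0.1)

# With an intercept, d1 + d2 + d3 equals the intercept column,
# so the last indicator is aliased and reported as NA.
with_int <- lm(y ~ d1 + d2 + d3)
coef(with_int)

# Dropping the intercept ("0 +" at the start or "- 1" at the end)
# leaves three columns that are no longer redundant.
no_int <- lm(y ~ 0 + d1 + d2 + d3)
coef(no_int)
```

This only helps when the redundancy involves the intercept; it does nothing for predictors that are linear combinations of each other.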
Try:
model1<-lm(PBW~SO4+NO3+NH4)
Does it work?
Dimitri
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com
David,
Thanks for the suggestions. No, I did not label my dependent variable "function".

My dependent variable PBW and all the independent variables are continuous variables. It is especially troubling since the order in which I input independent variables determines whether or not it gets a coefficient. Like I already mentioned, I checked the correlation matrix and picked the variables with moderate to high correlation with the independent variable. So I guess it is not so naïve to expect a regression coefficient on all of them.

Dimitri 
model1<-lm(PBW~SO4+NO3+NH4), gives me the same result as before.

Bert:
 This is not homework. But I will remember to do my research before posting here.

Aparna 

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Monday, April 20, 2009 5:35 PM
To: Vemuri, Aparna
Cc: r-help at r-project.org
Subject: Re: [R] Fitting linear models

[...]

Aparna,

I should have been more explicit. Run ?lm . You'll see this:

"lm(formula, data, subset, weights, na.action,
   method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
   singular.ok = TRUE, contrasts = NULL, offset, ...)"

So, in addition to specifying the formula, you have to specify the
data frame in which you keep your variables. I assume they are in a
data frame? (unless for some reason you keep all variables as
separate vectors).
So, after you wrote the formula, you have to indicate the name of the
data frame, for example "MyData":

model1<-lm(PBW~SO4+NO3+NH4, MyData)

Dimitri
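For comparison, here is a small sketch of both ways of calling lm(): with a data frame passed as data, and with free-standing vectors. As far as the ?lm help page describes it, variables not found in data are taken from the environment of the formula, so both calls should give the same estimates. MyData and its contents are simulated here and are not the poster's data.

```r
# Hypothetical data frame standing in for the poster's data.
set.seed(3)
MyData <- data.frame(SO4 = runif(50), NO3 = runif(50), NH4 = runif(50))
MyData$PBW <- 0.01 + 0.02 * MyData$SO4 + 0.03 * MyData$NO3 +
  0.05 * MyData$NH4 + rnorm(50, sd = 0.01)

# Variables looked up in the data frame.
fit_df <- lm(PBW ~ SO4 + NO3 + NH4, data = MyData)

# The same variables as free-standing vectors in the workspace.
PBW <- MyData$PBW
SO4 <- MyData$SO4
NO3 <- MyData$NO3
NH4 <- MyData$NH4
fit_vec <- lm(PBW ~ SO4 + NO3 + NH4)

# Compare the two sets of estimates.
all.equal(coef(fit_df), coef(fit_vec))
```

Passing data is still a good habit because it guarantees that all variables come from the same place, but on its own it would not be expected to change which coefficients are NA.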
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com
The variables are all in separate vectors. 

-----Original Message-----
From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com] 
Sent: Tuesday, April 21, 2009 8:26 AM
To: Vemuri, Aparna
Cc: David Winsemius; r-help at r-project.org
Subject: Re: [R] Fitting linear models

[...]

Are they of the same length?
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com
Yes, they are all of the same length.

-----Original Message-----
From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com] 
Sent: Tuesday, April 21, 2009 8:32 AM
To: Vemuri, Aparna
Cc: r-help at r-project.org
Subject: Re: [R] Fitting linear models

[...]

Can we see your data to be able to replicate the error? Or maybe a
subset of data with some fake variable names?
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com

David,
Thanks for the suggestions. No, I did not label my dependent  
variable "function".
That was from my error in reading your call to lm. In my defense I am  
reasonably sure the proper assignment to arguments is lm(formula= ...)  
rather than lm(function= ...).

My dependent variable PBW and all the independent variables are
continuous variables. It is especially troubling since the order in
which I input independent variables determines whether or not it
gets a coefficient. Like I already mentioned, I checked the
correlation matrix and picked the variables with moderate to high
correlation with the independent variable. So I guess it is not so
naïve to expect a regression coefficient on all of them.

Dimitri
model1<-lm(PBW~SO4+NO3+NH4), gives me the same result as before.
Did you get the expected results with:
model1<-lm(formula=PBW~SO4+NO3+NH4+0)

You could, of course, provide either the data or the results of str()  
applied to each of the variables and then we could all stop guessing.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
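The two points in this message can be checked with a short sketch. First, `function` is a reserved word in R, so a call written as lm(function = ...) fails when it is parsed and never reaches lm(). Second, str() summarises each variable, and the rank of the model matrix shows whether any column is redundant. The vectors below are simulated stand-ins, and the redundancy in NH4 is built in deliberately as an assumption for illustration.

```r
# `function` is a reserved word, so it cannot be used as an argument name:
# the call fails at parse time, before lm() is ever run.
bad <- tryCatch(parse(text = "lm(function = PBW ~ SO4 + NO3 + NH4)"),
                error = function(e) e)
inherits(bad, "error")

# The same call with the real argument name parses without complaint.
ok <- parse(text = "lm(formula = PBW ~ SO4 + NO3 + NH4)")

# Hypothetical stand-ins for the poster's free-standing vectors.
set.seed(4)
SO4 <- runif(20)
NO3 <- runif(20)
NH4 <- SO4 + NO3          # deliberately redundant, for illustration only
PBW <- SO4 + rnorm(20)

# str() shows the type and length of each variable at a glance.
str(list(PBW = PBW, SO4 = SO4, NO3 = NO3, NH4 = NH4))

# A rank smaller than the number of columns means some coefficient
# cannot be estimated and will be reported as NA by lm().
X <- model.matrix(~ SO4 + NO3 + NH4)
c(columns = ncol(X), rank = qr(X)$rank)
```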
Attached are the first hundred rows of my data in comma-separated format.
Forcing the regression line through the origin still does not give a coefficient on the last independent variable. Also, I don't mind if there is a coefficient on the dependent axis. I just want all of the variables to have coefficients in the regression equation or at least a consistent result, irrespective of the order of input information.
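If the predictors were exactly linearly dependent on one another (an assumption, not something established in this thread), removing the intercept would not be expected to help, because the redundancy does not involve the intercept. The sketch below uses simulated data with a made-up dependence. It also shows two ways to get a consistent result: drop the redundant predictor, or ask lm() to stop with an error instead of silently returning NA.

```r
# Simulated data (NOT the attached file); the dependence is hypothetical.
set.seed(5)
SO4 <- runif(100)
NO3 <- runif(100)
NH4 <- 0.375 * SO4 + 0.29 * NO3
PBW <- 0.02 * SO4 + 0.02 * NO3 + rnorm(100, sd = 0.001)

# Removing the intercept does not remove the dependence among predictors.
fit_origin <- lm(PBW ~ 0 + SO4 + NO3 + NH4)
coef(fit_origin)                       # NH4 is still NA

# Dropping the redundant predictor gives a full set of coefficients
# and the same fitted values.
fit_reduced <- lm(PBW ~ 0 + SO4 + NO3)
all.equal(fitted(fit_origin), fitted(fit_reduced))

# singular.ok = FALSE turns the silent NA into an error.
res <- tryCatch(lm(PBW ~ 0 + SO4 + NO3 + NH4, singular.ok = FALSE),
                error = function(e) conditionMessage(e))
res
```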

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Tuesday, April 21, 2009 8:38 AM
To: Vemuri, Aparna
Cc: r-help at r-project.org
Subject: Re: [R] Fitting linear models

On Apr 21, 2009, at 11:12 AM, Vemuri, Aparna wrote:

David,
Thanks for the suggestions. No, I did not label my dependent
variable "function".
That was from my error in reading your call to lm. In my defense I am
reasonably sure the proper assignment to arguments is lm(formula= ...)
rather than lm(function= ...).

My dependent variable PBW and all the independent variables are
continuous variables. It is especially troubling since the order in
which I input independent variables determines whether or not it
gets a coefficient. Like I already mentioned, I checked the
correlation matrix and picked the variables with moderate to high
correlation with the independent variable. So I guess it is not so
naïve to expect a regression coefficient on all of them.

Dimitri,
model1<-lm(PBW~SO4+NO3+NH4) gives me the same result as before.
Did you get the expected results with:
model1<-lm(formula=PBW~SO4+NO3+NH4+0)

You could, of course, provide either the data or the results of str()  
applied to each of the variables and then we could all stop guessing.
Aparna

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Vemuri-Rhelp-sample.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090421/792096ca/attachment-0002.txt>
I am not sure what the problem is.
I found no errors:

data <- read.csv(file.choose())  # I had to change your file extension to .csv first
dim(data)
names(data)

lapply(data, function(x) sum(is.na(x)))  # count missing values per column
lm.model.1 <- lm(PBW ~ SO4 + NO3 + NH4, data = data)
lm.model.2 <- lm(PBW ~ SO4 + NH4 + NO3, data = data)
print(lm.model.1)  # Getting nice results
print(lm.model.2)  # Getting same results

# Another method (gets exactly the same results):
library(Design)  # Design has since been superseded by the rms package, which also provides ols()
ols.model.1 <- ols(PBW ~ SO4 + NO3 + NH4, data)
ols.model.2 <- ols(PBW ~ SO4 + NH4 + NO3, data)

Dimitri

Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com

I am going to take a wild stab in the dark here and suggest that 'NH4'
is perfectly correlated with, or even identical to, one of the other IVs
used in the formula.

  set.seed(1)
  PBW <- rnorm(100)
  SO4 <- rnorm(100)
  NO3 <- rnorm(100)
  NH4 <- rnorm(100)

 > lm(PBW ~ SO4 + NO3 + NH4)

Call:
lm(formula = PBW ~ SO4 + NO3 + NH4)

Coefficients:
(Intercept)          SO4          NO3          NH4
     0.11065     -0.00273      0.02096     -0.04826

Now watch:

NH4 <- NO3 * 1.5

 > lm(PBW ~ SO4 + NO3 + NH4)

Call:
lm(formula = PBW ~ SO4 + NO3 + NH4)

Coefficients:
(Intercept)          SO4          NO3          NH4
   1.084e-01   -7.871e-05    1.596e-02           NA

 > cor(cbind(SO4, NO3, NH4))
             SO4         NO3         NH4
SO4  1.00000000 -0.04953621 -0.04953621
NO3 -0.04953621  1.00000000  1.00000000
NH4 -0.04953621  1.00000000  1.00000000

I suspect that there is a collinearity problem here. Aparna, post back
with the correlation matrix of your IVs (full data set) and that
should either support or refute my theory. If it is supported and you use:

 > summary(lm(PBW ~ SO4 + NO3 + NH4))

Call:
lm(formula = PBW ~ SO4 + NO3 + NH4)

Residuals:
      Min       1Q   Median       3Q      Max
-2.30129 -0.60350  0.01765  0.58513  2.27806

Coefficients: (1 not defined because of singularities)
               Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.084e-01  9.083e-02   1.194    0.236
SO4         -7.871e-05  9.531e-02  -0.001    0.999
NO3          1.596e-02  8.827e-02   0.181    0.857
NH4                 NA         NA      NA       NA

Residual standard error: 0.9073 on 97 degrees of freedom
Multiple R-squared: 0.0003379,	Adjusted R-squared: -0.02027
F-statistic: 0.01639 on 2 and 97 DF,  p-value: 0.9837

Note the warning message about singularities for NH4.
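
A minimal sketch of how the singularity could be diagnosed directly, reusing the simulated setup above but collected in a data frame `d` (a name introduced here for illustration). It compares the rank of the design matrix with its column count, asks `alias()` which term is a linear combination of the others, and refits with the terms reordered to show that the term listed later is the one that gets dropped.

```r
set.seed(1)
d <- data.frame(PBW = rnorm(100), SO4 = rnorm(100), NO3 = rnorm(100))
d$NH4 <- d$NO3 * 1.5   # exact linear dependence, as in the example above

fit_a <- lm(PBW ~ SO4 + NO3 + NH4, data = d)
fit_b <- lm(PBW ~ SO4 + NH4 + NO3, data = d)   # same terms, different order

# Rank of the design matrix versus its number of columns
X <- model.matrix(fit_a)
c(rank = qr(X)$rank, columns = ncol(X))

# alias() reports which term is a linear combination of the others
alias(fit_a)$Complete

# Whichever of the dependent pair comes later gets NA;
# the fitted values are the same either way
coef(fit_a)
coef(fit_b)
all.equal(fitted(fit_a), fitted(fit_b))
```

This is why the NA appears to follow the "last" variable: the fit itself is unchanged, and only the choice of which redundant column to drop depends on the order.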

BTW, as an aside, picking variables for a model based upon their
correlation with the DV is not a good way to go. You might want to
pick up a copy of Frank Harrell's book "Regression Modeling Strategies":

   http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS

HTH,

Marc Schwartz
Thanks Dimitri! Following exactly what you did, I wrote all my individual variable vectors to a data frame and used lm(formula, data), and this time it works for me too.

Marc, your theory is correct. The NH4 variable is strongly correlated with one of the IVs (SO4) as well as with the DV.

          SO4       NO3       NH4       PBW
SO4    1        -0.0867    0.999     0.999
NO3   -0.0867    1        -0.0527   -0.0938
NH4    0.999    -0.0527    1         0.999
PBW    0.999    -0.0938    0.999     1

Aparna 
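
A correlation that prints as 0.999 may be either a rounded exact dependence or merely a very strong one, and lm() treats the two cases differently. The sketch below uses made-up variables (`x1`, `x_near`, `x_exact`, `y` are all hypothetical) to contrast them: a nearly collinear predictor still gets a coefficient, with inflated standard errors, while an exact multiple is reported as NA.

```r
set.seed(7)
n  <- 200
x1 <- rnorm(n)
x_near  <- x1 + rnorm(n, sd = 0.03)  # highly correlated with x1, but not identical
x_exact <- 2 * x1                    # exact multiple of x1
y <- x1 + rnorm(n)

cor(x1, x_near)                      # close to, but below, 1

fit_near  <- lm(y ~ x1 + x_near)
fit_exact <- lm(y ~ x1 + x_exact)

coef(fit_near)                       # both slopes are estimated
coef(fit_exact)                      # x_exact is reported as NA
summary(fit_near)$coefficients[, "Std. Error"]  # inflated standard errors on the slopes
```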

But if the multicollinearity is so strong, then I am wondering why it
worked in the data frame as opposed to 4 separate vectors? It should
not make any difference...
Dimitri
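
One possible explanation, which the thread does not confirm, is that the free-standing vectors in the workspace were not identical to the columns written to the file. The sketch below is purely hypothetical: `dat` stands in for the data frame read from the CSV, and the workspace `NH4` is deliberately constructed as a multiple of `SO4`. It shows how the two fits can differ and how the workspace objects could be compared against the data frame columns.

```r
# Hypothetical stand-ins: 'dat' plays the role of the data frame read from the CSV.
set.seed(3)
dat <- data.frame(SO4 = rnorm(30), NO3 = rnorm(30))
dat$NH4 <- rnorm(30)
dat$PBW <- dat$SO4 + dat$NH4 + rnorm(30, sd = 0.1)

# Workspace copies; NH4 deliberately differs from the data frame column
PBW <- dat$PBW; SO4 <- dat$SO4; NO3 <- dat$NO3
NH4 <- dat$SO4 * 0.375

# Without data=, lm() resolves the names in the calling environment
fit_vectors <- lm(PBW ~ SO4 + NO3 + NH4)
fit_frame   <- lm(PBW ~ SO4 + NO3 + NH4, data = dat)

coef(fit_vectors)   # NH4 is NA: the workspace copy is a multiple of SO4
coef(fit_frame)     # all coefficients estimated from the data frame columns

# Column-by-column comparison of workspace objects against the data frame
same <- sapply(names(dat), function(v) isTRUE(all.equal(get(v), dat[[v]])))
same
```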

Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com