An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090420/fcc49c52/attachment-0001.pl>
Fitting linear models
16 messages · Vemuri, Aparna, Bert Gunter, David Winsemius +2 more
Is this homework? If so, you need to read the text and/or class notes more carefully.
-- Bert Gunter

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Vemuri, Aparna
Sent: Monday, April 20, 2009 4:26 PM
To: r-help at r-project.org
Subject: [R] Fitting linear models

I am not sure if this is an R-users question, but since most of you here are statisticians, I decided to give it a shot. I am using the lm() function in R to fit a dependent variable to a set of 3 to 5 independent variables. For this, I used the following commands:
model1<-lm(function=PBW~SO4+NO3+NH4)
Coefficients:
(Intercept) SO4 NO3 NH4
0.01323 0.01968 0.01856 NA
and
model2<-lm(function=PBW~SO4+NO3+NH4+Na+Cl)
Coefficients:
(Intercept) SO4 NO3 NH4 Na Cl
-0.0006987 -0.0119750 -0.0295042 0.0842989 0.1344751 NA

In both cases, the last independent variable has a coefficient of NA in the result. I say last variable because, when I change the order of the variables, the coefficient changes (see below). Can anyone point me to the reason R behaves this way? Is there any way for me to force R to use all the variables? I checked the correlation matrices to make sure there is no orthogonality between the variables.

Thanks
Aparna

model1<-lm(formula = PBW ~ SO4 + NH4 +NO3)
model1
Call:
lm(formula = PBW ~ SO4 + NH4 + NO3)
Coefficients:
(Intercept) SO4 NH4 NO3
0.01323 -0.00430 0.06394 NA
model2<-lm(formula = PBW ~ SO4 + NO3 + Na +Cl +NH4)
model2
Call:
lm(formula = PBW ~ SO4 + NO3 + Na + Cl + NH4)
Coefficients:
(Intercept) SO4 NO3 Na Cl NH4
-0.0006987 0.0196371 -0.0050303 0.0685020 0.0427431 NA
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Apr 20, 2009, at 7:26 PM, Vemuri, Aparna wrote:
> I am not sure if this is an R-users question, but since most of you here are statisticians, I decided to give it a shot.
You can omit the unnecessary preambles.
You really did not name your dependent variable "function" did you? Please stop that. Just a guess, ... since you have not provided enough information to do otherwise, ... Are all of those variables 1/0 dummy variables? If so and if you want to have an output that satisfies your need for labeling the coefficients as you naively anticipate, then put "0+" at the beginning of the formula or "-1" at the end, so that the intercept will disappear and then all variables will get labeled as you expect.
David Winsemius, MD Heritage Laboratories West Hartford, CT
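A minimal sketch of the situation David is guessing at, using made-up data (the names grp, d1, d2, d3 and y are hypothetical and do not come from the original post). When 0/1 indicator columns sum to 1 in every row, together they duplicate the intercept column, so lm() reports NA for the last of them. Dropping the intercept with "0+" (or "-1") lets every indicator receive a coefficient. Whether this applies to the poster's data is only a guess at this point in the thread.

```r
# Hypothetical data: three mutually exclusive 0/1 indicators, 10 rows each.
set.seed(1)
grp <- rep(c("a", "b", "c"), each = 10)
d1 <- as.numeric(grp == "a")
d2 <- as.numeric(grp == "b")
d3 <- as.numeric(grp == "c")
y  <- 2 * d1 + 3 * d2 + 4 * d3 + rnorm(30, sd = 0.1)

# With an intercept: d1 + d2 + d3 equals the intercept column,
# so the last indicator is not estimable and comes back as NA.
fit_int <- lm(y ~ d1 + d2 + d3)
print(coef(fit_int))

# Without an intercept ("0 +" at the start, or "- 1" at the end):
# all three indicators get a coefficient.
fit_noint <- lm(y ~ 0 + d1 + d2 + d3)
print(coef(fit_noint))
```

Both fits describe the same three group means; only the parameterisation differs.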
Try:
model1<-lm(PBW~SO4+NO3+NH4)
Does it work?
Dimitri
Dimitri Liakhovitski MarketTools, Inc. Dimitri.Liakhovitski at markettools.com
David,
Thanks for the suggestions. No, I did not label my dependent variable "function". My dependent variable PBW and all the independent variables are continuous variables. It is especially troubling since the order in which I input the independent variables determines whether or not a variable gets a coefficient. Like I already mentioned, I checked the correlation matrix and picked the variables with moderate to high correlation with the independent variable. So I guess it is not so naïve to expect a regression coefficient on all of them.

Dimitri,
model1<-lm(PBW~SO4+NO3+NH4) gives me the same result as before.

Bert:
This is not homework. But I will remember to do my research before posting here.

Aparna

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: Monday, April 20, 2009 5:35 PM
To: Vemuri, Aparna
Cc: r-help at r-project.org
Subject: Re: [R] Fitting linear models
Aparna,
I should have been more explicit. Run ?lm. You'll see this:
"lm(formula, data, subset, weights, na.action, method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)"
So, in addition to specifying the formula, you have to specify the data frame in which you keep your variables. I assume they are in a data frame? (unless for some reason you keep all variables as separate vectors). So, after you write the formula, you have to indicate the name of the data frame, for example "MyData":
model1<-lm(PBW~SO4+NO3+NH4, MyData)
Dimitri
Dimitri Liakhovitski MarketTools, Inc. Dimitri.Liakhovitski at markettools.com
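A small sketch on the data argument, with made-up vectors (x1, x2, y and the simulated values are hypothetical; "MyData" follows the example name used above). According to ?lm, data is optional: variables not found in data are looked up in the environment of the formula, which is usually the workspace. Under that rule, the same model fitted from separate vectors and from a data frame should give identical coefficients.

```r
# Hypothetical vectors standing in for variables kept in the workspace.
set.seed(42)
x1 <- rnorm(20)
x2 <- rnorm(20)
y  <- 0.5 + 1.5 * x1 - 2 * x2 + rnorm(20, sd = 0.1)

# Fit 1: variables are found as separate vectors in the workspace.
fit_vec <- lm(y ~ x1 + x2)

# Fit 2: the same variables collected in a data frame and passed via data=.
MyData <- data.frame(y = y, x1 = x1, x2 = x2)
fit_df <- lm(y ~ x1 + x2, data = MyData)

# The two fits agree.
print(all.equal(coef(fit_vec), coef(fit_df)))
```

Passing data= is still good practice, because it keeps the model tied to one object, but its absence does not by itself produce NA coefficients.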
The variables are all in separate vectors.

-----Original Message-----
From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com]
Sent: Tuesday, April 21, 2009 8:26 AM
To: Vemuri, Aparna
Cc: David Winsemius; r-help at r-project.org
Subject: Re: [R] Fitting linear models
Are they of the same length?
Dimitri Liakhovitski MarketTools, Inc. Dimitri.Liakhovitski at markettools.com
Yes, they are all of the same length.

-----Original Message-----
From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com]
Sent: Tuesday, April 21, 2009 8:32 AM
To: Vemuri, Aparna
Cc: r-help at r-project.org
Subject: Re: [R] Fitting linear models
Can we see your data to be able to replicate the error? Or maybe a subset of data with some fake variable names?
Dimitri Liakhovitski MarketTools, Inc. Dimitri.Liakhovitski at markettools.com
On Apr 21, 2009, at 11:12 AM, Vemuri, Aparna wrote:
> David, Thanks for the suggestions. No, I did not label my dependent variable "function".
That was from my error in reading your call to lm. In my defense I am reasonably sure the proper assignment to arguments is lm(formula= ...) rather than lm(function= ...).
> My dependent variable PBW and all the independent variables are continuous variables. It is especially troubling since the order in which I input the independent variables determines whether or not a variable gets a coefficient. Like I already mentioned, I checked the correlation matrix and picked the variables with moderate to high correlation with the independent variable. So I guess it is not so naïve to expect a regression coefficient on all of them.
> Dimitri, model1<-lm(PBW~SO4+NO3+NH4) gives me the same result as before.
Did you get the expected results with:
model1<-lm(formula=PBW~SO4+NO3+NH4+0)
You could, of course, provide either the data or the results of str() applied to each of the variables and then we could all stop guessing.
David Winsemius, MD Heritage Laboratories West Hartford, CT
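One way to supply what David asks for, sketched with simulated stand-ins (the values below are random and hypothetical; only the variable names PBW, SO4, NO3 and NH4 come from the thread). str() gives a compact summary of each variable, and dput() on a few rows produces text that others can paste straight into R.

```r
# Hypothetical stand-ins for the poster's vectors; the real values are not shown in the thread.
set.seed(7)
PBW <- runif(100)
SO4 <- runif(100)
NO3 <- runif(100)
NH4 <- runif(100)

# Collect the separate vectors so they can be inspected together.
vars <- data.frame(PBW, SO4, NO3, NH4)

# Type, length and first few values of every variable.
str(vars)

# A short excerpt in a form that can be copied into a reply and re-created with one assignment.
dput(head(vars, 5))
```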
Attached are the first hundred rows of my data in comma-separated format. Forcing the regression line through the origin still does not give a coefficient on the last independent variable. Also, I don't mind if there is a coefficient on the dependent axis. I just want all of the variables to have coefficients in the regression equation, or at least a consistent result, irrespective of the order of input information.

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: Tuesday, April 21, 2009 8:38 AM
To: Vemuri, Aparna
Cc: r-help at r-project.org
Subject: Re: [R] Fitting linear models
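A hedged sketch of one situation that reproduces this exact pattern, using simulated data (all values, and the assumption that NH4 is an exact linear combination of SO4 and NO3, are hypothetical; the thread does not establish that the real data behave this way). With the default singular.ok = TRUE, lm() returns NA for any term that is a linear combination of the terms before it, so the NA moves when the order changes. Pairwise correlations below 1 do not rule this out, because the dependence can involve several variables at once.

```r
# Simulated data; the exact dependence of NH4 on SO4 and NO3 is an assumption for illustration.
set.seed(123)
n   <- 100
SO4 <- runif(n, 0, 5)
NO3 <- runif(n, 0, 3)
NH4 <- 0.375 * SO4 + 0.29 * NO3
PBW <- 0.01 + 0.02 * SO4 + 0.015 * NO3 + rnorm(n, sd = 0.005)

# NH4 carries no information beyond SO4 and NO3, so its coefficient is NA.
fit <- lm(PBW ~ SO4 + NO3 + NH4)
print(coef(fit))

# The design matrix has 4 columns but only rank 3.
X <- model.matrix(fit)
print(c(columns = ncol(X), rank = qr(X)$rank))

# alias() reports which term is a combination of the others.
print(alias(fit)$Complete)

# Reordering moves the NA to the new last dependent term,
# but the fitted values (and hence the model) are unchanged.
fit2 <- lm(PBW ~ SO4 + NH4 + NO3)
print(coef(fit2))
print(all.equal(fitted(fit), fitted(fit2)))
```

If the real data show the same rank deficiency, no ordering can give every variable a coefficient; the consistent result is the set of fitted values, which does not depend on the order.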
On Apr 21, 2009, at 11:12 AM, Vemuri, Aparna wrote:
David, Thanks for the suggestions. No, I did not label my dependent variable "function".
That was from my error in reading your call to lm. In my defense I am reasonably sure the proper assignment to arguments is lm(formula= ...) rather than lm(function= ...).
My dependent variable PBW and all the independent variables are continuous variables. It is especially troubling since the order in which I input independent variables determines whether or not it gets a coefficient. Like I already mentioned, I checked the correlation matrix and picked the variables with moderate to high correlation with the independent variable. So I guess it is not so naïve to expect a regression coefficient on all of them. Dimitri, model1<-lm(PBW~SO4+NO3+NH4) gives me the same result as before.
Did you get the expected results with; model1<-lm(formula=PBW~SO4+NO3+NH4+0) You could, of course, provide either the data or the results of str() applied to each of the variables and then we could all stop guessing.
Aparna
Name: Vemuri-Rhelp-sample.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090421/792096ca/attachment-0002.txt>
I am not sure what the problem is.
I found no errors:
data<-read.csv(file.choose()) # I had to change your file extension to .csv first
dim(data)
names(data)
lapply(data,function(x){sum(is.na(x))})
lm.model.1<-lm(PBW~SO4+NO3+NH4,data)
lm.model.2<-lm(PBW~SO4+NH4+NO3,data)
print(lm.model.1) # Getting nice results
print(lm.model.2) # Getting same results
# Another method (gets exactly the same results):
library(Design)
ols.model.1<-ols(PBW~SO4+NO3+NH4,data)
ols.model.2<-ols(PBW~SO4+NH4+NO3,data)
Dimitri
Dimitri Liakhovitski MarketTools, Inc. Dimitri.Liakhovitski at markettools.com
I am going to take a wild stab in the dark here and suggest that 'NH4'
is exactly correlated to or even identical to one of the other IVs
used in the formula.
set.seed(1)
PBW <- rnorm(100)
SO4 <- rnorm(100)
NO3 <- rnorm(100)
NH4 <- rnorm(100)
> lm(PBW ~ SO4 + NO3 + NH4)
Call:
lm(formula = PBW ~ SO4 + NO3 + NH4)
Coefficients:
(Intercept)          SO4          NO3          NH4
    0.11065     -0.00273      0.02096     -0.04826
Now watch:
NH4 <- NO3 * 1.5
> lm(PBW ~ SO4 + NO3 + NH4)
Call:
lm(formula = PBW ~ SO4 + NO3 + NH4)
Coefficients:
(Intercept)          SO4          NO3          NH4
  1.084e-01   -7.871e-05    1.596e-02           NA
> cor(cbind(SO4, NO3, NH4))
            SO4         NO3         NH4
SO4  1.00000000 -0.04953621 -0.04953621
NO3 -0.04953621  1.00000000  1.00000000
NH4 -0.04953621  1.00000000  1.00000000
I suspect that there is a collinearity problem here. Aparna, post back
with the correlation matrix of your IV's (full data set) and that
should either support or refute my theory. If supported and you use:
> summary(lm(PBW ~ SO4 + NO3 + NH4))
Call:
lm(formula = PBW ~ SO4 + NO3 + NH4)
Residuals:
     Min       1Q   Median       3Q      Max
-2.30129 -0.60350  0.01765  0.58513  2.27806
Coefficients: (1 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.084e-01  9.083e-02   1.194    0.236
SO4         -7.871e-05  9.531e-02  -0.001    0.999
NO3          1.596e-02  8.827e-02   0.181    0.857
NH4                 NA         NA      NA       NA
Residual standard error: 0.9073 on 97 degrees of freedom
Multiple R-squared: 0.0003379, Adjusted R-squared: -0.02027
F-statistic: 0.01639 on 2 and 97 DF, p-value: 0.9837
Note the warning message about singularities for NH4.
BTW, as an aside, picking variables for a model based upon their
correlation with the DV is not a good way to go. You might want to
pick up a copy of Frank's book "Regression Modeling Strategies":
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS
HTH,
Marc Schwartz
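The order dependence described in the original question follows from the same mechanism as Marc's example: when two terms carry the same information, lm() keeps the one it meets first and reports NA for the later one. The sketch below reuses the simulated data from the example above (not the poster's data); fit_a and fit_b are names chosen here for illustration.

```r
set.seed(1)
PBW <- rnorm(100)
SO4 <- rnorm(100)
NO3 <- rnorm(100)
NH4 <- NO3 * 1.5   # exact linear dependence, as in the example above

fit_a <- lm(PBW ~ SO4 + NO3 + NH4)  # NH4 comes after NO3
fit_b <- lm(PBW ~ SO4 + NH4 + NO3)  # NO3 comes after NH4

print(coef(fit_a))  # NH4 is NA
print(coef(fit_b))  # NO3 is NA

# The two fits describe the same model: the fitted values agree.
print(all.equal(fitted(fit_a), fitted(fit_b)))

# alias() reports which linear dependency caused the term to be dropped.
print(alias(fit_a))
```

Whichever term is dropped, the predictions are unchanged; only the labelling of the coefficients differs, so there is no way to "force" a separate estimate for both terms.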
Thanks Dimitri! Following exactly what you did, I wrote all my individual variable vectors to a data frame and used lm(formula,data) and this time it works for me too.
Marc, your theory is correct. The NH4 variable is strongly correlated with one of the IVs (SO4) as well as with the DV.
         SO4      NO3      NH4      PBW
SO4   1       -0.0867   0.999    0.999
NO3  -0.0867   1       -0.0527  -0.0938
NH4   0.999   -0.0527   1        0.999
PBW   0.999   -0.0938   0.999    1
Aparna
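A pairwise correlation matrix can miss an exact dependence that involves more than two variables. Comparing the two orderings of model1 in the original post, the printed coefficients are consistent with NH4 being an exact linear combination of the other two predictors, roughly NH4 = 0.375*SO4 + 0.29*NO3, since 0.01968 is about -0.00430 + 0.375*0.06394 and 0.01856 is about 0.29*0.06394. This is an inference from the printed output, not something the thread confirms. The sketch below uses simulated values and that assumed relationship to show a singular fit in which no pairwise correlation is close to 1.

```r
set.seed(123)
n   <- 100
SO4 <- rnorm(n, mean = 3, sd = 1)   # hypothetical values, not the poster's data
NO3 <- rnorm(n, mean = 2, sd = 1)
# Assumed dependence, for illustration only: NH4 built from the other two.
NH4 <- 0.375 * SO4 + 0.29 * NO3
PBW <- 0.02 * SO4 + 0.02 * NO3 + rnorm(n, sd = 0.01)

# No off-diagonal correlation among the predictors is close to 1 ...
cors <- cor(cbind(SO4, NO3, NH4))
print(round(cors, 3))

# ... yet the design matrix has fewer independent columns than columns.
X <- model.matrix(~ SO4 + NO3 + NH4)
print(c(rank = qr(X)$rank, columns = ncol(X)))

fit <- lm(PBW ~ SO4 + NO3 + NH4)
print(coef(fit))   # NH4 is NA
```

Comparing the rank of the model matrix with its number of columns is a more direct check for this kind of dependence than inspecting correlations one pair at a time.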
But if the multicollinearity is so strong, then I am wondering why it worked in the data frame as opposed to 4 separate vectors? It should not make any difference... Dimitri
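The thread does not settle this question. One possibility, offered here only as an assumption, is that the numbers in the data frame were not identical to the workspace vectors (for example after a round trip through a file, or because only part of the data was used). When data= is supplied, lm() takes variables from the data frame first, so the two calls can fit different numbers even though the formula is the same. The sketch below uses simulated data; dat, fit_vec and fit_df are hypothetical names.

```r
set.seed(1)
PBW <- rnorm(100)
SO4 <- rnorm(100)
NO3 <- rnorm(100)
NH4 <- NO3 * 1.5   # workspace copy: exactly collinear with NO3

# Hypothetical data frame whose NH4 column differs slightly from the vector.
dat <- data.frame(PBW, SO4, NO3, NH4 = NO3 * 1.5 + rnorm(100, sd = 0.01))

fit_vec <- lm(PBW ~ SO4 + NO3 + NH4)              # uses the workspace vectors
fit_df  <- lm(PBW ~ SO4 + NO3 + NH4, data = dat)  # uses the columns of dat

print(coef(fit_vec))  # NH4 is NA
print(coef(fit_df))   # all coefficients are estimated

# Check whether the two sources actually hold the same numbers.
print(isTRUE(all.equal(NH4, dat$NH4)))
```

If the values are truly identical, the two calls give the same result, as Dimitri expects. When the dependence is only approximate, lm() returns estimates for every term, but they tend to have large standard errors.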
Dimitri Liakhovitski MarketTools, Inc. Dimitri.Liakhovitski at markettools.com