What is the best way to get the variable names used in lm() from its results? Stumbling around I found I could get the response variable name from myMod$terms[[2]] but using myMod$terms[[1]] gives a tilde. I found the names buried in other places in the model object and in the summary of the model, but is there a more direct way, similar to using coef(myMod) to get the coefficients?
--
View this message in context: http://r.789695.n4.nabble.com/Get-variable-names-from-results-of-lm-tp4631095.html
Sent from the R help mailing list archive at Nabble.com.
Get variable names from results of lm()
8 messages · jdub, R. Michael Weylandt, Peter Ehlers +3 more
Hi,
I assume this is what you are looking for.
?variable.names()
x <- 1:20
y <- x + (x/4 - 2)^3 + rnorm(20, sd=3)
names(y) <- paste("O",x,sep=".")
ww <- rep(1,20); ww[13] <- 0
summary(lmxy <- lm(y ~ x + I(x^2)+I(x^3) + I((x-10)^2),
                   weights = ww), cor = TRUE)
variable.names(lmxy)
[1] "(Intercept)" "x"           "I(x^2)"      "I(x^3)"
A.K.
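A side note on the output above: the model formula has five terms on the right-hand side, yet variable.names() lists only four names. The sketch below suggests why, assuming the same construction of x and y (the seed and the omission of the weights are choices made here for illustration only). I((x - 10)^2) is an exact linear combination of the intercept, x and I(x^2), so lm() cannot estimate it and reports NA for that coefficient.

```r
# Illustrative reconstruction; set.seed() and the unweighted fit are assumptions.
set.seed(1)
x <- 1:20
y <- x + (x/4 - 2)^3 + rnorm(20, sd = 3)
lmxy <- lm(y ~ x + I(x^2) + I(x^3) + I((x - 10)^2))

# Coefficients that could not be estimated come back as NA.
aliased <- names(coef(lmxy))[is.na(coef(lmxy))]

# By default only the estimable columns are named; full = TRUE names all of them.
estimable <- variable.names(lmxy)
all_cols  <- variable.names(lmxy, full = TRUE)

print(aliased)
print(estimable)
print(all_cols)
```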
----- Original Message -----
From: jdub <jack at ramas.com>
To: r-help at r-project.org
Cc:
Sent: Wednesday, May 23, 2012 10:58 AM
Subject: [R] Get variable names from results of lm()
What is the best way to get the variable names used in lm() from its
results?
Stumbling around I found I could get the response variable name from
myMod$terms[[2]]
but using
myMod$terms[[1]]
gives a tilde.
I found the names buried in other places in the model object and in the
summary of the model, but is there a more direct way, similar to using
coef(myMod) to get the coefficients?
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
I think the easiest that comes to mind is simply
names(coef(myMod))
But did you look at myMod$terms[[3]]? That seems to be the RHS of the formula input (in the few cases I tried).
Best,
Michael
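As a rough illustration of the two suggestions above, here is a small sketch. The data frame `d` and its columns are made up for the example, and `myMod` merely stands in for the original poster's model.

```r
# Hypothetical data; `myMod` stands in for the poster's model object.
set.seed(1)
d <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))
myMod <- lm(y ~ x1 + x2, data = d)

coef_names <- names(coef(myMod))   # coefficient names, including the intercept
lhs <- myMod$terms[[2]]            # response, as an unevaluated symbol
rhs <- myMod$terms[[3]]            # right-hand side, as an unevaluated call

print(coef_names)
print(deparse(lhs))
print(deparse(rhs))
```

Note that the last two are language objects rather than character vectors, so deparse() (or all.vars()) is needed to turn them into strings.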
On 2012-05-23 09:55, R. Michael Weylandt wrote:
I think the easiest that comes to mind is simply names(coef(myMod)) But did you look at myMod$terms[[3]] ? That seems to be the RHS of the formula input (in the few cases I tried) Best, Michael
It depends a bit on just what the OP wants. In case one of the
predictors is a factor, say 'grp' with levels c('A','B','C'), the
coefs will include the names 'grpB' and 'grpC'. If only the
name 'grp' is wanted, one could use myMod$terms[[3]] or
equivalently formula(myMod)[[3]]. It might be instructive for the
OP to look at as.list(formula(myMod)). formula(myMod) is a
language object which has the tilde operator operate on
the LHS (component 2) and the RHS (component 3).
Peter Ehlers
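The factor case described above can be sketched as follows. The data are invented for illustration; only the name 'grp' and its levels come from the message.

```r
# Hypothetical data with a three-level factor `grp`, as in the description above.
set.seed(1)
d <- data.frame(y = rnorm(12), x = rnorm(12),
                grp = factor(rep(c("A", "B", "C"), times = 4)))
myMod <- lm(y ~ x + grp, data = d)

coef_names <- names(coef(myMod))   # the factor shows up as dummy-variable names
parts <- as.list(formula(myMod))   # components: `~`, LHS, RHS

print(coef_names)
print(parts)
```

With the default treatment contrasts, level 'A' is the baseline, which is why only 'grpB' and 'grpC' appear among the coefficient names.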
On May 23, 2012, at 1:42 PM, Peter Ehlers wrote:
<snip>
Just to throw out another solution here, the function ?all.vars is helpful:
LM <- lm(Petal.Length ~ ., data = iris)
formula(LM)
Petal.Length ~ Sepal.Length + Sepal.Width + Petal.Width + Species
all.vars(formula(LM))
[1] "Petal.Length" "Sepal.Length" "Sepal.Width"  "Petal.Width"
[5] "Species"
Regards,
Marc Schwartz
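If the response and the predictors are wanted separately, one possible variation (a sketch, not something proposed in the thread) is to apply all.vars() to each side of the formula in turn:

```r
LM <- lm(Petal.Length ~ ., data = iris)
f <- formula(LM)                 # the `.` is already expanded here

response   <- all.vars(f[[2]])   # variables on the left-hand side
predictors <- all.vars(f[[3]])   # variables on the right-hand side

print(response)
print(predictors)
```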
You might want to look at
https://stat.ethz.ch/pipermail/r-help/2012-April/310582.html
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Marc Schwartz
Sent: Wednesday, May 23, 2012 12:13 PM
To: Peter Ehlers
Cc: r-help at r-project.org; jdub
Subject: Re: [R] Get variable names from results of lm()
On May 23, 2012, at 1:42 PM, Peter Ehlers wrote:
<snip>
Hi Marc,
Just to point out some difference,
x <- 1:20
y <- x + (x/4 - 2)^3 + rnorm(20, sd=3)
names(y) <- paste("O",x,sep=".")
ww <- rep(1,20); ww[13] <- 0
summary(lmxy <- lm(y ~ x + I(x^2)+I(x^3) + I((x-10)^2),
                   weights = ww), cor = TRUE)
all.vars(formula(lmxy))
[1] "y" "x"
variable.names(lmxy)
[1] "(Intercept)" "x"           "I(x^2)"      "I(x^3)"
A.K.
----- Original Message -----
From: Marc Schwartz <marc_schwartz at me.com>
To: Peter Ehlers <ehlers at ucalgary.ca>
Cc: "r-help at r-project.org" <r-help at r-project.org>; jdub <jack at ramas.com>
Sent: Wednesday, May 23, 2012 3:12 PM
Subject: Re: [R] Get variable names from results of lm()
<snip>
On May 23, 2012, at 2:52 PM, arun wrote:
<snip>
Hi Arun,
Note that as long as the model terms are not factors (and other terms that get 'expanded'), the above will return the names of the terms, plus of course the intercept.
I suspect however, that in your example, you might want:
variable.names(lmxy, full = TRUE)
[1] "(Intercept)"   "x"             "I(x^2)"        "I(x^3)"
[5] "I((x - 10)^2)"
since the last term was dropped in your output.
Note that you would get essentially the same information from:
names(coef(lmxy))
[1] "(Intercept)"   "x"             "I(x^2)"        "I(x^3)"
[5] "I((x - 10)^2)"
again, with no factors present.
However, with factors present, consider:
LM <- lm(Sepal.Length ~ ., data = iris)
all.vars(formula(LM))
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
[5] "Species"
versus:
variable.names(LM)
[1] "(Intercept)"       "Sepal.Width"       "Petal.Length"
[4] "Petal.Width"       "Speciesversicolor" "Speciesvirginica"
That does no better than:
names(coef(LM))
[1] "(Intercept)"       "Sepal.Width"       "Petal.Length"
[4] "Petal.Width"       "Speciesversicolor" "Speciesvirginica"
This is because variable.names() is essentially getting its information from:
colnames(lmxy$qr$qr)
[1] "(Intercept)"   "x"             "I(x^2)"        "I(x^3)"
[5] "I((x - 10)^2)"
colnames(LM$qr$qr)
[1] "(Intercept)"       "Sepal.Width"       "Petal.Length"
[4] "Petal.Width"       "Speciesversicolor" "Speciesvirginica"
Other options include:
labels(terms(lmxy))
[1] "x"             "I(x^2)"        "I(x^3)"        "I((x - 10)^2)"
labels(terms(LM))
[1] "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
which gets the information from the 'term.labels' attribute of the model terms object, which is the RHS:
attr(terms(LM), "term.labels")
[1] "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
You could also use:
colnames(model.frame(LM))
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"
[5] "Species"
colnames(model.frame(lmxy))
[1] "y"             "x"             "I(x^2)"        "I(x^3)"
[5] "I((x - 10)^2)" "(weights)"
This gives slightly different information, but shows that there is more than one way to get information from an R object, depending upon needs.
Let me throw in another twist into the mix:
variable.names(lm(y ~ poly(x, 3)))
[1] "(Intercept)" "poly(x, 3)1" "poly(x, 3)2" "poly(x, 3)3"
all.vars(formula(lm(y ~ poly(x, 3))))
[1] "y" "x"
labels(terms(lm(y ~ poly(x, 3))))
[1] "poly(x, 3)"
colnames(model.frame(lm(y ~ poly(x, 3))))
[1] "y"          "poly(x, 3)"
Which output do you want? That will be dependent upon use case. One needs to be cautious in proposing a generic solution to an underlying problem that needs to be precisely defined.
Food for thought...
Regards,
Marc
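To compare the candidates on a given model in one go, something along these lines may help. The helper name_views() is hypothetical (it is not part of R or of this thread) and simply wraps the accessors discussed above.

```r
# Hypothetical helper that gathers the accessors discussed above side by side.
name_views <- function(mod) {
  list(
    coefficients = names(coef(mod)),             # expanded, with intercept
    variables    = all.vars(formula(mod)),       # raw variable names, both sides
    term_labels  = labels(terms(mod)),           # RHS terms as written
    model_frame  = colnames(model.frame(mod))    # columns of the model frame
  )
}

LM <- lm(Sepal.Length ~ ., data = iris)
views <- name_views(LM)
str(views)
```

Which element is the right one still depends on the use case, as noted above.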