Dear All,
I am using R for my research and I have two questions about it:
1) Is it possible to create a loop using a string, instead of a numeric vector? I have in mind a specific problem:
Suppose you have 2 countries, UK and USA, with one dependent (y) and one independent variable (x) for each country (that is: yUK, xUK, yUSA, xUSA), and you want to run the following regressions automatically:
for (i in c("UK","USA"))
output{i}<-summary(lm(y{i} ~ x{i}))
In other words, at the end I would like to have two objects as output: "outputUK" and "outputUSA", which contain respectively the results of the first and second regression (yUK on xUK and yUSA on xUSA).
2) in STATA there is a very nice code ("outreg") to display nicely (and as the user wants to) your regression results.
Is there anything similar in R / R contributed packages? More precisely, I am thinking of something that is close in spirit to "summary" but it is also customizable. For example, suppose you want different Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 or a different format display (i.e. without "t value" column) implemented automatically (without manually editing it every time).
Alternatively, if I were able to see it, I could modify the source code of the function "summary", but I am not able to see its (line by line) code. Any ideas?
Or maybe a customizable regression output already exists?
Thanks really a lot!
Carlo
Loop with string variable AND customizable "summary" output
14 messages · C.Rosa at lse.ac.uk, Roger Bivand, Wensui Liu +4 more
On Mon, 29 Jan 2007 C.Rosa at lse.ac.uk wrote:
Dear All,
I am using R for my research and I have two questions about it:
1) is it possible to create a loop using a string, instead of a numeric
vector? I have in mind a specific problem:
Suppose you have 2 countries: UK, and USA, one dependent (y) and one
independent variable (x) for each country (that is: yUK, xUK, yUSA,
xUSA) and you want to run automatically the following regressions:
for (i in c("UK","USA"))
output{i}<-summary(lm(y{i} ~ x{i}))
In other words, at the end I would like to have two objects as output:
"outputUK" and "outputUSA", which contain respectively the results of
the first and second regression (yUK on xUK and yUSA on xUSA).
The input data could be reshaped as y, x, country, and subset= used in the lm() call. To assign to named objects see assign(), but consider using a named list instead, assigning to a list of the required length in turn, and giving the names from the defining vector. Then you'd get output$UK, etc.
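A minimal sketch of the named-list suggestion above, assuming the data have already been reshaped into one long data frame. The data frame `dat` and its simulated values are made up for illustration and are not from the original post.

```r
# Hypothetical long-format data: one row per observation plus a country column.
set.seed(1)
dat <- data.frame(
  country = rep(c("UK", "USA"), each = 20),
  x = rnorm(40)
)
dat$y <- 1 + 2 * dat$x + rnorm(40)

countries <- c("UK", "USA")

# A list of the required length, named from the defining vector.
output <- vector("list", length(countries))
names(output) <- countries

for (i in countries) {
  # subset= keeps only the rows belonging to country i
  output[[i]] <- summary(lm(y ~ x, data = dat, subset = country == i))
}

output$UK   # summary of the UK regression

# If separate objects named outputUK and outputUSA are really wanted:
for (i in countries) {
  assign(paste("output", i, sep = ""), output[[i]])
}
```

The list form is usually easier to work with afterwards, since the results can be passed to lapply() or sapply() in one go.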
2) in STATA there is a very nice code ("outreg") to display nicely (and
as the user wants to) your regression results.
Is there anything similar in R / R contributed packages? More precisely,
I am thinking of something that is close in spirit to "summary" but it
is also customizable. For example, suppose you want different Signif.
codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 or a different
format display (i.e. without "t value" column) implemented automatically
(without manually editing it every time).
In alternative, if I was able to see it, I could modify the source code
of the function "summary", but I am not able to see its (line by line)
code. Any idea?
Use a custom function on the output object from using the summary() method on the lm object (that is on the summary.lm object). Use str() to look at the summary.lm object to see what you want.
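As a rough sketch of such a custom function: the name `compact_coef`, its arguments and the chosen cutpoints are all made up for illustration. It takes a summary.lm object, drops the "t value" column and attaches user-defined significance marks.

```r
# Hypothetical helper: compact coefficient table from a summary.lm object.
compact_coef <- function(s,
                         cutpoints = c(0, 0.01, 0.05, 0.1, 1),
                         symbols = c("***", "**", "*", "")) {
  stopifnot(inherits(s, "summary.lm"))
  co <- s$coefficients            # matrix: Estimate, Std. Error, t value, Pr(>|t|)
  p <- co[, "Pr(>|t|)"]
  # index of the interval each p-value falls into (p = 1 goes into the last one)
  idx <- findInterval(p, cutpoints, rightmost.closed = TRUE)
  data.frame(
    Estimate  = round(co[, "Estimate"], 3),
    Std.Error = round(co[, "Std. Error"], 3),
    p.value   = signif(p, 3),
    Signif    = symbols[idx],
    row.names = rownames(co)
  )
}

fit <- lm(uptake ~ conc, data = CO2)   # CO2 ships with R
str(summary(fit), max.level = 1)       # inspect what the summary.lm object holds
res <- compact_coef(summary(fit))
res
```

Because the function only reads components of the summary.lm object, it can be applied unchanged to every element of a list of summaries.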
Or may be a customizable regression output already exists? Thanks really a lot! Carlo
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no
Carlo,
try something like:
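for (i in c("UK","USA"))
{
  # y, x and country are assumed to be equal-length vectors in the workspace;
  # subset belongs inside lm(), and the comparison needs ==, not =
  summ <- summary(lm(y ~ x, subset = (country == i)))
  assign(paste('output', i, sep = ''), summ)
}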
(note: it is untested, sorry).
WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
C.Rosa wrote:
Dear All,
I am using R for my research and I have two questions about it:
1) is it possible to create a loop using a string, instead of a numeric
vector? I have in mind a specific problem:
for (i in c("UK","USA"))
output{i}<-summary(lm(y{i} ~ x{i}))
In other words, at the end I would like to have two objects as output:
"outputUK" and "outputUSA", which contain respectively the results of the
first and second regression (yUK on xUK and yUSA on xUSA).
Consider the R functions bquote, substitute, eval and parse. Several examples are given somewhere in R News (http://cran.r-project.org/doc/Rnews/). Unfortunately I don't remember exactly which issue; one of the list members sent me a link to the article several years ago, when I was studying a similar question.
C.Rosa wrote:
2) I am thinking of something that is close in spirit to "summary" but it is also customizable. For example, suppose you want different Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 or a different format display (i.e. without "t value" column) implemented automatically (without manually editing it every time). In alternative, if I was able to see it, I could modify the source code of the function "summary", but I am not able to see its (line by line) code. Any idea?
Stars and significance codes are printed with the symnum function.
To customize the summary, explore the result returned by lm. For example, run
str(outputUK)
and you will see that it is a list. You can then reference its elements with $ (say, outputUK$coeff).
R is an object-oriented language, and calls of the same generic function on different objects usually invoke different functions (methods), provided a suitable method is defined for the object's class. The R manuals contain a very good description of this mechanism. The function methods gives you a list of all defined methods. For example
methods(summary)
methods(print)
If you are working with lm results, you need to explore the function print.summary.lm
summary(outputUK)
invokes the summary.lm function, as outputUK is an object of class "lm". This function produces an object of class "summary.lm", which is then printed with the method print.summary.lm
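A short sketch of how the dispatch described above can be followed in practice, and how the source of the methods can be displayed. The model on the built-in CO2 data is only a stand-in for outputUK, and the cutpoints and symbols passed to symnum are arbitrary examples.

```r
fit <- lm(uptake ~ conc, data = CO2)   # stand-in for outputUK
class(fit)                             # "lm": summary(fit) dispatches to summary.lm
s <- summary(fit)
class(s)                               # "summary.lm": printing uses print.summary.lm

# summary.lm is exported from the stats package, so typing its name shows the code.
summary.lm

# print.summary.lm is a registered method that is not exported;
# getAnywhere() or the ::: operator still display it.
getAnywhere("print.summary.lm")
stats:::print.summary.lm

# symnum() turns p-values into symbols; here with custom cutpoints and symbols.
p <- s$coefficients[, "Pr(>|t|)"]
marks <- symnum(p, corr = FALSE, na = FALSE,
                cutpoints = c(0, 0.01, 0.05, 0.1, 1),
                symbols = c("***", "**", "*", " "))
marks
```

Copying the body of print.summary.lm into a function of your own is one way to get a customized printout without editing the output by hand each time.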
View this message in context: http://www.nabble.com/-R--Loop-with-string-variable-AND-customizable-%22summary%22-output-tf3136358.html#a8691620 Sent from the R help mailing list archive at Nabble.com.
That is
C.Rosa wrote:
for (i in c("UK","USA"))
output{i}<-summary(lm(y{i} ~ x{i}))
for (i in c("UK","USA")) {
  # 1. produce a character string containing the needed expression
  lm.txt <- paste("output", i, "<-", "lm(", "y", i, "~", "x", i, ")", sep = "")
  # 2. parse and evaluate it
  eval(parse(text = lm.txt))
}
Dear All,
Thank you very much for your help!
Carlo
Often you will find that if you arrange your data in a
desirable way in the first place everything becomes
easier. What you really want is a data frame such
as the last three columns of the builtin data frame
CO2 where Treatment corresponds to country and
the two numeric variables correspond to your y and x.
Then it's easy:
lapply(levels(CO2$Treatment), function(lev)
lm(uptake ~ conc, CO2, subset = Treatment == lev))
The only problem with the above is that the Call: in the
output does not really tell you which level of Treatment
is being used since it literally shows
"lm(uptake ~ conc, CO2, subset = Treatment == lev)"
each time. To get around this, substitute the value of lev in.
Because R uses delayed evaluation you also need to force the
evaluation of lev prior to substituting it in:
lapply(levels(CO2$Treatment), function(lev) {
  lev <- force(lev)
  eval(substitute(lm(uptake ~ conc, CO2, subset = Treatment == lev)),
    list(lev = lev))
})
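One small addition that may be useful here, sketched under the assumption that the goal is to refer to each fit by its group name (as with outputUK and outputUSA in the question): lapply returns an unnamed list, so the names have to be attached afterwards. The object names `levs` and `fits` are made up.

```r
levs <- levels(CO2$Treatment)          # "nonchilled" "chilled"

fits <- lapply(levs, function(lev)
  lm(uptake ~ conc, CO2, subset = Treatment == lev))
names(fits) <- levs                    # name each fit after its level

fits$chilled                           # refer to a fit by name
coefs <- sapply(fits, coef)            # one column of coefficients per level
coefs
```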
Now if you really want to do it the way you specified originally
try this.
Suppose we use attach to grab the variables
x1, x2, x3, x4, y1, y2, y3, y4 out of the builtin
anscombe data frame for purposes of getting
our hands on some sample data. In your case
the variables would already be in the workspace
so the attach is not needed.
Then simply reconstruct the formula in fo. You
could simply use lm(fo) but then the Call: in the
output of lm would literally read lm(fo) so it's
better to use do.call:
# next line gives the variables x1, x2, x3, x4, y1, y2, y3, y4
# from the builtin anscombe data set.
# In your case such variables would already exist.
attach(anscombe)
lapply(1:4, function(i) {
  ynm <- paste("y", i, sep = "")
  xnm <- paste("x", i, sep = "")
  fo <- as.formula(paste(ynm, "~", xnm))
  do.call("lm", list(fo))
})
detach(anscombe)
Actually, working from separate variables in the workspace is not
really a recommended way of proceeding. You would be better off
putting all your variables in a data frame and using that.
If all the variables have the same length you could use
a form such as anscombe in the first place:
lapply(1:4, function(i) {
  fo <- as.formula(paste(names(anscombe)[i+4], "~", names(anscombe)[i]))
  do.call("lm", list(fo, data = quote(anscombe)))
})
or
lapply(1:4, function(i) {
  fo <- y ~ x
  fo[[2]] <- as.name(names(anscombe)[i+4])
  fo[[3]] <- as.name(names(anscombe)[i])
  do.call("lm", list(fo, data = quote(anscombe)))
})
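Another way to build the formula from strings, offered only as a sketch, is reformulate() from the stats package, which avoids pasting a "~" by hand. The names `fits` and `set1` to `set4` are made up. Since lm is called directly here, the Call: in the output reads lm(formula = fo, data = anscombe), which is the trade-off mentioned earlier.

```r
fits <- lapply(1:4, function(i) {
  # builds y1 ~ x1, y2 ~ x2, ... from character strings
  fo <- reformulate(paste("x", i, sep = ""),
                    response = paste("y", i, sep = ""))
  lm(fo, data = anscombe)
})
names(fits) <- paste("set", 1:4, sep = "")

formula(fits$set1)          # y1 ~ x1
coefs <- sapply(fits, coef) # the four anscombe fits are nearly identical
coefs
```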
In thinking about this a bit more, here is an even shorter one, yet it does show the level in the Call output. See ?bquote
lapply(levels(CO2$Treatment), function(lev)
  eval(bquote(lm(uptake ~ conc, CO2, subset = Treatment == .(lev)))))
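To see what bquote contributes, it may help to look at the unevaluated call for a single level before it is passed to eval. This is only an illustration; the names `cl` and `fit` are made up.

```r
lev <- "chilled"

# bquote() returns the call unevaluated, with the value of lev spliced in at .(lev)
cl <- bquote(lm(uptake ~ conc, CO2, subset = Treatment == .(lev)))
cl

fit <- eval(cl)   # now run the call
fit$call          # the stored call names the level rather than 'lev'
```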
Prior answers are certainly correct, but this is where lists and lapply
shine:
result<-lapply(list(UK,USA),function(z)summary(lm(y~x,data=z)))
As in (nearly) all else, simplicity is a virtue.
If you prefer to keep the data sources as a character vector, dataNames,
result<-lapply(dataNames,function(z)summary(lm(y~x,data=get(z))))
should work.
Note: both of these are untested for the general case where they might be
used within a function, and they may not find the right object for z unless
you pay attention to scope, especially in the get() construction.
Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
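A sketch of the character-vector variant with the scope made explicit, assuming two data frames named UK and USA, each with columns x and y. The numbers below are invented purely so the example runs.

```r
# Hypothetical data frames, one per country.
UK  <- data.frame(x = 1:10,
                  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2))
USA <- data.frame(x = 1:10,
                  y = c(1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 8.2, 8.9, 10.1))

dataNames <- c("UK", "USA")

# Remember where the data frames live, so get() looks in the right place
# even if this code is later moved inside a function.
here <- environment()

# sapply() over a character vector with simplify = FALSE returns a list
# named after the elements of the vector.
result <- sapply(dataNames,
                 function(z) summary(lm(y ~ x, data = get(z, envir = here))),
                 simplify = FALSE)

names(result)
result$UK
```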
And yet one more. This one does not use eval but uses do.call, quote
and bquote instead:
lapply(levels(CO2$Treatment), function(lev) do.call("lm",
list(uptake ~ conc, quote(CO2), subset = bquote(Treatment == .(lev)))))
Or, to throw yet another couple of possibilities into the mix:
lapply(split(YourDF, YourDF$country),
       function(df) summary(lm(y ~ x, data = df)))
and:
library(nlme)
summary(lmList(y ~ x | country, YourDF))
See ?split and help(lmList, package = nlme)
HTH,
Marc Schwartz
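For a version of the split() approach that can be run as is, here is a sketch on the built-in CO2 data, with Treatment standing in for country and uptake and conc standing in for y and x. The object names are made up.

```r
# One data frame per level of Treatment, named after the levels.
by_trt <- split(CO2, CO2$Treatment)

res <- lapply(by_trt, function(df) summary(lm(uptake ~ conc, data = df)))
names(res)

# Collect the 'conc' row of each coefficient table into one matrix.
conc_tab <- t(sapply(res, function(s) s$coefficients["conc", ]))
conc_tab
```

The list returned by split() is already named, so no separate naming step is needed.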
Note that the nlme solution seems to give the same coefficients but appears to use a single error term rather than one error term per level of the conditioning variable. That would change the standard errors and various other statistics relative to the other solutions, should that matter.
summary(lmList(uptake ~ conc | Treatment, CO2))
Call:
  Model: uptake ~ conc | Treatment
   Data: CO2
Coefficients:
   (Intercept)
           Estimate Std. Error  t value     Pr(>|t|)
nonchilled 22.01916    2.46416 8.935769 1.174616e-13
chilled    16.98142    2.46416 6.891361 1.146556e-09
   conc
             Estimate  Std. Error  t value     Pr(>|t|)
nonchilled 0.01982458 0.004692544 4.224699 6.292679e-05
chilled    0.01563659 0.004692544 3.332221 1.306259e-03
Residual standard error: 8.945667 on 80 degrees of freedom
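The pooled error term can be reproduced by hand as a check. The sketch below assumes the built-in CO2 data used above, fits one lm() per Treatment level, and pools the residual sums of squares. The object names (co2, fits, pooled_sigma) are illustrative only.

```r
co2 <- as.data.frame(CO2)   # drop the groupedData classes; plain data frame

# One separate regression per level of the conditioning variable
fits <- lapply(split(co2, co2$Treatment),
               function(d) lm(uptake ~ conc, data = d))

# Each fit has its own residual standard error ...
sapply(fits, function(f) summary(f)$sigma)

# ... whereas pooling divides the total residual sum of squares
# by the total residual degrees of freedom
rss <- sapply(fits, function(f) sum(residuals(f)^2))
dfs <- sapply(fits, df.residual)
pooled_sigma <- sqrt(sum(rss) / sum(dfs))
pooled_sigma   # should agree with the "Residual standard error" reported by lmList
sum(dfs)       # residual degrees of freedom summed over the groups
```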
On 1/29/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
Or, to throw yet another couple of possibilities into the mix:
lapply(split(YourDF, YourDF$country),
       function(d) summary(lm(y ~ x, data = d)))
and:
library(nlme)
summary(lmList(y ~ x | country, YourDF))
See ?split and help(lmList, package = nlme)
HTH,
Marc Schwartz
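As a small usage sketch of the split()/lapply() approach: the data frame below is simulated, and only the column names y, x and country follow the thread. It shows one possible way to collect the per-country estimates into a single matrix afterwards.

```r
# Toy long-format data; values are simulated for illustration
set.seed(42)
YourDF <- data.frame(
  country = rep(c("UK", "USA"), each = 30),
  x       = rnorm(60)
)
YourDF$y <- ifelse(YourDF$country == "UK", 1, 3) + 2 * YourDF$x + rnorm(60)

# One summary per country; split() names the list elements by country
res <- lapply(split(YourDF, YourDF$country),
              function(d) summary(lm(y ~ x, data = d)))

# Collect intercept and slope estimates into one matrix (one column per country)
est <- sapply(res, function(s) coef(s)[, "Estimate"])
est
```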
On Mon, 2007-01-29 at 09:03 -0800, Bert Gunter wrote:
Prior answers are certainly correct, but this is where lists and lapply
shine:
result <- lapply(list(UK, USA), function(z) summary(lm(y ~ x, data = z)))
As in (nearly) all else, simplicity is a virtue.
If you prefer to keep the data sources as a character vector, dataNames,
result <- lapply(dataNames, function(z) summary(lm(y ~ x, data = get(z))))
should work.
Note: both of these are untested for the general case where they might be
used within a function and may not find the right z unless you pay attention
to scope, especially in the get() construction.
Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
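One way to address the scoping caveat is to tell get() explicitly where to look. The following is only a sketch: the helper name fit_all and its envir argument are hypothetical, and the data frames are simulated.

```r
# Hypothetical helper: fit y ~ x on each data frame named in dataNames,
# looking the names up in an explicitly supplied environment
fit_all <- function(dataNames, envir = parent.frame()) {
  force(envir)   # fix the lookup environment before entering lapply()
  res <- lapply(dataNames, function(z) {
    summary(lm(y ~ x, data = get(z, envir = envir)))
  })
  names(res) <- dataNames   # lapply() over an unnamed character vector returns an unnamed list
  res
}

# Simulated data frames for illustration
set.seed(7)
UK  <- data.frame(x = rnorm(20)); UK$y  <- 1 + UK$x  + rnorm(20)
USA <- data.frame(x = rnorm(20)); USA$y <- 2 - USA$x + rnorm(20)

result <- fit_all(c("UK", "USA"))
names(result)   # "UK" "USA"
```

Because envir defaults to the caller's frame, the helper should also find data frames that exist only inside another function that calls it.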
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of C.Rosa at lse.ac.uk
Sent: Monday, January 29, 2007 8:23 AM
To: liuwensui at gmail.com; bcarvalh at jhsph.edu; Roger.Bivand at nhh.no
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Loop with string variable AND customizable "summary" output
Dear All,
Thank you very much for your help!
Carlo
-----Original Message-----
From: Wensui Liu [mailto:liuwensui at gmail.com]
Sent: Mon 29/01/2007 15:39
To: Rosa,C
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Loop with string variable AND customizable "summary" output
Carlo,
try something like:
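for (i in c("UK", "USA"))
{
  # assumes y, x and country are vectors of equal length;
  # subset belongs inside lm(), and the comparison needs '==' rather than '='
  summ <- summary(lm(y ~ x, subset = (country == i)))
  assign(paste('output', i, sep = ''), summ)
}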
(note: it is untested, sorry).
On Mon, 2007-01-29 at 14:30 -0500, Gabor Grothendieck wrote:
<snip>
Gabor,
Thanks for noting that. There is a solution using 'pool = FALSE':
summary(lmList(uptake ~ conc | Treatment, CO2, pool = FALSE))
Call:
  Model: uptake ~ conc | Treatment
   Data: CO2
Coefficients:
   (Intercept)
           Estimate Std. Error   t value     Pr(>|t|)
nonchilled 22.01916   2.148475 10.248740 9.463480e-13
chilled    16.98142   2.743761  6.189103 2.562416e-07
   conc
             Estimate  Std. Error  t value     Pr(>|t|)
nonchilled 0.01982458 0.004091379 4.845452 1.934996e-05
chilled    0.01563659 0.005224992 2.992653 4.721873e-03
I suppose that, while subtle, this could make this approach error prone
for those who (like me in this case) miss it...
Then of course, we get down to the format of the output, etc.
:-)
Thanks Gabor,
Marc
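On the format of the output: the coefficient table returned by summary() for an lm fit is an ordinary matrix, so a custom display can be assembled from it. The sketch below is one assumed approach and not an equivalent of Stata's outreg. The helper name custom_table, the column names of its result, and the cutpoints chosen are all hypothetical.

```r
# Hypothetical helper: coefficient table without the "t value" column
# and with user-chosen significance codes (written for lm fits only)
custom_table <- function(fit,
                         cutpoints = c(0, 0.01, 0.05, 0.10, 1),
                         symbols   = c("***", "**", "*", " ")) {
  tab <- coef(summary(fit))   # columns: Estimate, Std. Error, t value, Pr(>|t|)
  p   <- tab[, "Pr(>|t|)"]
  stars <- symnum(p, corr = FALSE, na = FALSE,
                  cutpoints = cutpoints, symbols = symbols)
  data.frame(Estimate  = round(tab[, "Estimate"], 4),
             Std.Error = round(tab[, "Std. Error"], 4),
             p.value   = signif(p, 3),
             Signif    = as.character(stars),
             row.names = rownames(tab),
             stringsAsFactors = FALSE)
}

fit <- lm(uptake ~ conc, data = as.data.frame(CO2))
custom_table(fit)
```

Applied with lapply() to a list of fitted models, the same helper would give one table per country or group.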