summary many regressions
On Nov 25, 2013, at 3:35 PM, Gary Dong wrote:
Dear R users,

I have a large data set which includes data from 300 cities. I want to run a bivariate regression for each city and record the coefficient and the adjusted R-squared. For example, in the following, I have 10 cities represented by numbers from 1 to 10:

    x = cumsum(c(0, runif(999, -1, +1)))
    y = cumsum(c(0, runif(999, -1, +1)))
    city = rep(1:10, each=100)
    data <- data.frame(cbind(x, y, city))

I can manually run regressions for each city:

    fit_city1 <- lm(y ~ x, data=subset(data, data$city==1))
    summary(fit_city1)

Obviously, it is very tedious to run 300 regressions. I wonder if there is a quicker way to do this. Use a for loop? What I want to see is something like this:

    City  Coefficient  Adjusted R square
    1     -0.05        0.36
    2     -0.12        0.20
    3     -0.05        0.32
    .....
The way to get the most rapid response from this list is to post a dataset that represents the complexity of the problem. Presumably this large dataset is either a dataframe with a column of city entries or a list of dataframes. Why not post dput() applied to an extract of three of the cities and include sufficient rows to allow a regression?
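For illustration only, here is a rough sketch of both steps, assuming the data really are a single data frame with a city column as simulated in the question. The names dat, extract, fit_one and results, and the call to set.seed(), are made up for this example and do not come from the original post. It first builds a small dput() extract of three cities, then uses split() and lapply() as one possible way (not the only one) to fit y ~ x per city and collect the slope and adjusted R-squared.

```r
# Simulated data as in the question; the seed is an added assumption
# so that the example is reproducible.
set.seed(1)
x <- cumsum(c(0, runif(999, -1, 1)))
y <- cumsum(c(0, runif(999, -1, 1)))
city <- rep(1:10, each = 100)
dat <- data.frame(x, y, city)

# A small extract suitable for posting: first 5 rows of each of 3 cities.
extract <- do.call(rbind, lapply(split(dat, dat$city)[1:3], head, n = 5))
dput(extract)

# Hypothetical helper: fit y ~ x on one city's rows and return a
# one-row data frame with the slope and the adjusted R-squared.
fit_one <- function(d) {
  s <- summary(lm(y ~ x, data = d))
  data.frame(City = d$city[1],
             Coefficient = coef(s)["x", "Estimate"],
             Adj_R_squared = s$adj.r.squared)
}

# Split by city, fit each piece, and stack the one-row results.
results <- do.call(rbind, lapply(split(dat, dat$city), fit_one))
rownames(results) <- NULL
print(results)
```

With 300 cities the same code should apply unchanged, since split() produces one piece per distinct value of city.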
[[alternative HTML version deleted]]
This is a plain text list.
David Winsemius Alameda, CA, USA