Message-ID: <CAAxdm-49M0F0tzB3S-dB5_pN1yKWvx2CFYRB106oKHWAEt+m2w@mail.gmail.com>
Date: 2011-10-26T16:25:46Z
From: jim holtman
Subject: Creating data frame with residuals of a data frame
In-Reply-To: <CAMHWMgKhDWJvbm7iPhcE=YWUZLcrt0UHw8imLrD5twbacPZ9yg@mail.gmail.com>

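Try this. The names on each residual vector are the row numbers that were used in the fit, so you can use them to index into a new column that starts out as all NA: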

> age<- c(5,6,10,14,16,NA,18)
> value1<- c(30,70,40,50,NA,NA,NA)
> value2<- c(2,4,1,4,4,4,4)
> df<- data.frame(age, value1, value2)
>
> #Run linear regression to adjust for age and get residuals:
>
> lm_f <- function(x) {
+ x<- residuals(lm(data=df, formula= x ~ age))
+ }
> resid <- apply(df,2,lm_f)
> resid<- resid[-1]
> for (i in names(resid)){
+     newCol <- paste(i, 'res', sep = '')
+     df[[newCol]] <- NA  # initialize
+     df[[newCol]][as.integer(names(resid[[i]]))] <- resid[[i]]
+ }
> df
  age value1 value2  value1res   value2res
1   5     30      2 -16.945813 -0.37398374
2   6     70      4  22.906404  1.50406504
3  10     40      1  -7.684729 -1.98373984
4  14     50      4   1.724138  0.52845528
5  16     NA      4         NA  0.28455285
6  NA     NA      4         NA          NA
7  18     NA      4         NA  0.04065041
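
Another way to get the same columns, offered as a sketch rather than the only approach: fit each column directly against age with na.action = na.exclude, so that residuals() pads the dropped rows with NA and every result has one entry per row of df. The loop variable v and the "res" suffix are just illustrative names; reformulate() builds the formula from the column name.

```r
age    <- c(5, 6, 10, 14, 16, NA, 18)
value1 <- c(30, 70, 40, 50, NA, NA, NA)
value2 <- c(2, 4, 1, 4, 4, 4, 4)
df <- data.frame(age, value1, value2)

# Regress every column except age on age. With na.exclude, rows that
# contain an NA are left out of the fit, but residuals() returns NA
# for them, so the result always has nrow(df) entries.
for (v in setdiff(names(df), "age")) {
    fit <- lm(reformulate("age", response = v), data = df,
              na.action = na.exclude)
    df[[paste0(v, "res")]] <- residuals(fit)
}
df
```

Because the padded residuals already line up with the original rows, no indexing by row name is needed, and cbind would work on them as well.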


On Mon, Oct 24, 2011 at 10:23 AM, francesca casalino
<francy.casalino at gmail.com> wrote:
> Dear experts,
>
> I am trying to create a data frame from the residuals I get after
> having applied a linear regression to each column of a data frame, but
> I don't know how to create this data frame from the resulting list
> since the list has differing numbers of rows.
>
> So for example:
> age<- c(5,6,10,14,16,NA,18)
> value1<- c(30,70,40,50,NA,NA,NA)
> value2<- c(2,4,1,4,4,4,4)
> df<- data.frame(age, value1, value2)
>
> #Run linear regression to adjust for age and get residuals:
>
> lm_f <- function(x) {
> x<- residuals(lm(data=df, formula= x ~ age))
> }
> resid <- apply(df,2,lm_f)
> resid<- resid[-1]
>
> Then resid is a list with different row numbers:
>
> $value1
>          1          2          3          4
> -16.945813  22.906404  -7.684729   1.724138
>
> $value2
>           1           2           3           4           5           7
> -0.37398374  1.50406504 -1.98373984  0.52845528  0.28455285  0.04065041
>
> I am trying to get both the original variable and their residuals in
> the same data frame like this:
>
> age, value1, value2, resid_value1, resid_value2
>
> But when I try cbind or other operations I get an error message
> because they do not have the same number of rows. Can you please help
> me figure out how to solve this?
>
> Thank you.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?