
linear fit function with NA values

Hi,
I could not reproduce any error with the data you provided.
return<- read.table(text="
      ATI     AMU
-1  0.734   9.003
0   0.999   2.001
1   3.097  -1.003
2      NA      NA
3      NA   3.541
",sep="",header=TRUE)

median<- read.table(text="
      ATI     AMU
-1  3.224  -2.003
0   2.999  -1.301
1   1.3    -1.003
2   4.000   2.442
3     -10   4.511
",sep="",header=TRUE)

lapply(seq_len(ncol(return)),function(i) {lm(return[,i]~median[,i])})
[[1]]

Call:
lm(formula = return[, i] ~ median[, i])

Coefficients:
(Intercept)  median[, i]  
      4.696       -1.231  


[[2]]

Call:
lm(formula = return[, i] ~ median[, i])

Coefficients:
(Intercept)  median[, i]  
     3.3937      -0.1607  

lapply(seq_len(ncol(return)),function(i) {lm(return[,i]~median[,i],na.action=na.omit)}) # same result as above; na.omit is the default na.action
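If you want the NA handling to be explicit rather than left to lm(), here is a minimal sketch that pairs each column, keeps only the rows where both values are present, and then fits. The names ret, med and fits are my own choices for illustration (they also avoid masking the base functions return() and median()); the numbers are the ones posted in this thread.

```r
# Data as posted in the thread; 'ret' and 'med' are illustrative names
ret <- data.frame(ATI = c(0.734, 0.999, 3.097, NA, NA),
                  AMU = c(9.003, 2.001, -1.003, NA, 3.541),
                  row.names = as.character(-1:3))
med <- data.frame(ATI = c(3.224, 2.999, 1.3, 4.000, -10),
                  AMU = c(-2.003, -1.301, -1.003, 2.442, 4.511),
                  row.names = as.character(-1:3))

# One regression per column, fitted only on rows where both y and x are present
fits <- lapply(names(ret), function(nm) {
  d <- data.frame(y = ret[[nm]], x = med[[nm]])
  d <- d[complete.cases(d), ]   # drop rows with an NA in either column
  lm(y ~ x, data = d)
})
names(fits) <- names(ret)

sapply(fits, nobs)   # number of rows actually used in each fit
sapply(fits, coef)   # intercept and slope per column
```

With the default settings this should give the same coefficients as the plain lm() calls above, since lm() drops incomplete rows on its own. The explicit version just makes it easy to check how many observations went into each fit.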

sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] stringr_0.6.2  reshape2_1.2.2

loaded via a namespace (and not attached):
[1] plyr_1.8    tools_3.0.1

BTW, it is better to share the example dataset with dput() (see ?dput), so that others can recreate exactly the same object.
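As a rough illustration of what that looks like: dput() prints an R expression that rebuilds the object, including its classes and NA values, so it can be pasted straight into an email. The object name ret is just an example; the values are the ones from this thread.

```r
# Example object (illustrative name 'ret'), using the values from this thread
ret <- data.frame(ATI = c(0.734, 0.999, 3.097, NA, NA),
                  AMU = c(9.003, 2.001, -1.003, NA, 3.541),
                  row.names = as.character(-1:3))

# Prints a structure(...) expression; paste that text into your message
dput(ret)

# A reader can rebuild the object by evaluating the pasted text
txt  <- paste(capture.output(dput(ret)), collapse = "\n")
ret2 <- eval(parse(text = txt))
all.equal(ret, ret2)   # check that the round trip preserved the data
```

This way column types (numeric vs. factor vs. zoo, etc.) survive the trip, which is often where the real cause of an error is hiding.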

A.K.



----- Original Message -----
From: iza.ch1 <iza.ch1 at op.pl>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Saturday, July 27, 2013 4:46 PM
Subject: Re: Re: [R] linear fit function with NA values

Hi

Thanks for your hints. I would like to describe my problem better and give an example of the data that I use.

I am conducting an event study and need to compute abnormal returns from daily stock prices. For each stock I have returns covering a period of 8 years. For some days the data are missing, for various reasons. In the Excel file these are just empty cells, but when I convert my data to 'zoo' they become NA. I get something like this:

return


      ATI     AMU
-1  0.734   9.003
0   0.999   2.001
1   3.097  -1.003
2      NA      NA
3      NA   3.541

median
      ATI     AMU
-1  3.224  -2.003
0   2.999  -1.301
1   1.3    -1.003
2   4.000   2.442
3     -10   4.511

I want to regress the first column of return on the first column of median, and the second column of return on the second column of median. When I run
OLS<-lapply(seq_len(ncol(return)),function(i) {lm(return[,i]~median[,i])})
I get an error message. I would like the function to omit the NAs. For example, for the ATI returns it should use only the values for -1, 0, 1 and regress them against the corresponding ATI values in median, that is, only (3.224, 2.999, 1.3).

Is it possible to do this?

Thanks a lot 

On 2013-07-27 17:33:30, arun <smartpink111 at yahoo.com> wrote: