Adjusting length of series
6 messages · David Winsemius, Lekgatlhamang, lexi Setlhare, arun
On Jul 2, 2012, at 5:13 AM, Lekgatlhamang, lexi Setlhare wrote:
Hi David and AK, I have been trying to implement your suggestions since yesterday, but I ran into some challenges. I could implement David's suggestion only after some modifications. Using an abridged version of my data, I dput my dataset and then show my steps below.
Well, your initial question (why the $ referencing did not work) is now answered. This is not a data frame but rather a 'ts'-classed object, and there is no `$` method for such objects. They are really matrices with some extra attributes.

> ydata$BoBCL1
Error in ydata$BoBCL1 : $ operator is invalid for atomic vectors

As I understood it, you were able to get useful analyses using the formula methods for lm on these objects, but were just having difficulty with the "$" operator. So the answer is: don't do that.
David.
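A minimal sketch of the point above, using a small made-up two-column monthly series rather than the thread's data (the names `toy`, `a` and `b` are illustrative assumptions). It shows that `$` fails on an "mts" object, while matrix-style indexing or conversion to a data frame both give access to a single column.

```r
# Toy multivariate monthly series (values are made up for illustration)
toy <- ts(cbind(a = c(1.5, 2.5, 3.5, 4.5), b = c(10, 20, 30, 40)),
          start = c(2001, 2), frequency = 12)

# `$` is not defined for such objects, so this call raises an error;
# tryCatch() turns the error into a marker string so the script continues
dollar_result <- tryCatch(toy$b, error = function(e) "error")

# Matrix-style indexing by column name works and returns a "ts" object
b_series <- toy[, "b"]

# Converting to a data frame makes `$` available (the time index is dropped)
toy_df <- data.frame(toy)
b_vector <- toy_df$b
```

Which route is preferable probably depends on whether the time attributes are still needed downstream: indexing keeps them, the data frame does not.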
>
>> dput(ydata)
> structure(c(68.1000000000004, -34.8000000000002, 90.3999999999996,
> 54.6000000000004, -172.3, 51.8000000000002, 175, 79.8000000000002,
> -35.7000000000007, 130.5, 116.8, -67.5, 164.5, 514.8, -326.1,
> 98.4000000000005, 160.2, 53.1999999999998, 283.6, -111.6, 127.8,
> -17.3000000000002, 286.3, NA, NA, -102.900000000001, 125.2,
> -35.7999999999993,
> -226.900000000001, 224.1, 123.2, -95.1999999999998, -115.500000000001,
> 166.200000000001, -13.6999999999998, -184.3, 232, 350.3,
> -840.900000000001,
> 424.500000000001, 61.7999999999993, -107, 230.400000000001,
> -395.200000000001,
> 239.400000000001, -145.1, 303.6, NA, NA, NA, 228.1, -160.999999999999,
> -191.100000000001, 451.000000000001, -100.900000000001, -218.4,
> -20.3000000000011, 281.700000000002, -179.900000000001, -170.6,
> 416.3, 118.3, -1191.2, 1265.4, -362.700000000002, -168.799999999999,
> 337.400000000001, -625.600000000001, 634.600000000001,
> -384.500000000001,
> 448.700000000001, NA, NA, -164.457840999999, 17.0793539999995,
> 95.9767880000009, 680.238166999999, -491.348690999999, -274.694009,
> -256.332907, 469.62296, -146.431891, -41.0772019999995, -106.970104,
> 757.688263999999, -1689.214533, 2320.098952, -1446.97942, 516.384521,
> -375.277650999999, 293.867029999999, 417.845195, 278.198807,
> -968.592033999999, -314.195986, NA, NA, NA, 181.537194999999,
> 78.8974340000013, 584.261378999998, -1171.586858, 216.654681999999,
> 18.3611019999998, 725.955867, -616.054851, 105.354689000001,
> -65.8929020000005, 864.658367999999, -2446.902797, 4009.313485,
> -3767.078372, 1963.363941, -891.662171999999, 669.144680999999,
> 123.978165, -139.646388, -1246.790841, 654.396048, NA, 4937,
> 5005.1, 4970.3, 5060.7, 5115.3, 4943, 4994.8, 5169.8, 5249.6,
> 5213.9, 5344.4, 5461.2, 5393.7, 5558.2, 6073, 5746.9, 5845.3,
> 6005.5, 6058.7, 6342.3, 6230.7, 6358.5, 6341.2, 6627.5, 4187.5,
> 4296.004835, 4240.051829, 4201.178177, 4258.281313, 4995.622616,
> 5241.615228, 5212.913831, 4927.879527, 5112.468183, 5150.624948,
> 5147.704511, 5037.81397, 5685.611693, 4644.194883, 5922.877025,
> 5754.579747, 6102.66699, 6075.476582, 6342.153204, 7026.675021,
> 7989.395645, 7983.524235, 7663.456839), .Dim = c(24L, 7L), .Dimnames
> = list(
> NULL, c("DCred1", "DCred2", "DCred3", "DBoBC2", "DBoBC3",
> "CredL1", "BoBCL1")), .Tsp = c(2001.08333333333, 2003, 12
> ), class = c("mts", "ts"))
>
> NB: the NAs in the dataset emanated from lagging or differencing the
> series
>
> David's suggestion
> df<-data.frame(DCred1,DCred2,DCred3,DBoBC2,DBoBC3,CredL1,BoBCL1)
> Error in data.frame(DCred1, DCred2, DCred3, DBoBC2, DBoBC3, CredL1,
> BoBCL1) :
> arguments imply differing number of rows: 23, 22, 21, 24
>
> So I modified as follows:
> length(DCred3) # finding the minimum length of various series
> [1] 21
>
> # Then dataframe construction
> dframe <- data.frame(Dcre1=DCred1[1:21], Dcre2=DCred2[1:21], Dcre3=DCred3[1:21],
> +   Dbobc2=DBoBC2[1:21], Dbobc3=DBoBC3[1:21], CredL=CredL1[1:21], BoBCL=BoBCL1[1:21])
> # Then estimated regression
>> regCred<- lm(Dcre1~Dcre2+Dcre3+Dbobc2+Dbobc3+CredL+BoBCL,
>> data=dframe)
>> summary(regCred)
> # Worked well as shown by results below
> Call:
> lm(formula = Dcre1 ~ Dcre2 + Dcre3 + Dbobc2 + Dbobc3 + CredL +
>     BoBCL, data = dframe)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -69.516 -27.695  -8.085  13.851 107.276
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 159.32304  157.15209   1.014 0.327873
> Dcre2        -0.75527    0.17262  -4.375 0.000634 ***
> Dcre3        -0.21006    0.08656  -2.427 0.029329 *
> Dbobc2        0.05111    0.06565   0.779 0.449197
> Dbobc3        0.03106    0.03510   0.885 0.391108
> CredL        -0.10967    0.04933  -2.223 0.043177 *
> BoBCL         0.09756    0.03097   3.150 0.007087 **
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 52.3 on 14 degrees of freedom
> Multiple R-squared: 0.9331,     Adjusted R-squared: 0.9044
> F-statistic: 32.55 on 6 and 14 DF,  p-value: 1.911e-07
>
> This is good, but couldn't I code the process for my 15 variable
> model?
> Perhaps that is where the use of
> Dcr<- lapply(..., function(x) ...)
> comes in?
>
> AK, if you can spare a few minutes, please use my dput data to illustrate
> the suggestion you made. I searched for the lapply function (using
> ??lapply) but could not get a handle on how to use it in my case. My
> dput data is as shown below.
>
>          DCred1 DCred2  DCred3      DBoBC2      DBoBC3 CredL1   BoBCL1
> Feb 2001   68.1     NA      NA          NA          NA 4937.0 4187.500
> Mar 2001  -34.8 -102.9      NA  -164.45784          NA 5005.1 4296.005
> Apr 2001   90.4  125.2   228.1    17.07935   181.53719 4970.3 4240.052
> May 2001   54.6  -35.8  -161.0    95.97679    78.89743 5060.7 4201.178
> Jun 2001 -172.3 -226.9  -191.1   680.23817   584.26138 5115.3 4258.281
> Jul 2001   51.8  224.1   451.0  -491.34869 -1171.58686 4943.0 4995.623
> Aug 2001  175.0  123.2  -100.9  -274.69401   216.65468 4994.8 5241.615
> Sep 2001   79.8  -95.2  -218.4  -256.33291    18.36110 5169.8 5212.914
> Oct 2001  -35.7 -115.5   -20.3   469.62296   725.95587 5249.6 4927.880
> Nov 2001  130.5  166.2   281.7  -146.43189  -616.05485 5213.9 5112.468
> Dec 2001  116.8  -13.7  -179.9   -41.07720   105.35469 5344.4 5150.625
> Jan 2002  -67.5 -184.3  -170.6  -106.97010   -65.89290 5461.2 5147.705
> Feb 2002  164.5  232.0   416.3   757.68826   864.65837 5393.7 5037.814
> Mar 2002  514.8  350.3   118.3 -1689.21453 -2446.90280 5558.2 5685.612
> Apr 2002 -326.1 -840.9 -1191.2  2320.09895  4009.31348 6073.0 4644.195
> May 2002   98.4  424.5  1265.4 -1446.97942 -3767.07837 5746.9 5922.877
> Jun 2002  160.2   61.8  -362.7   516.38452  1963.36394 5845.3 5754.580
> Jul 2002   53.2 -107.0  -168.8  -375.27765  -891.66217 6005.5 6102.667
> Aug 2002  283.6  230.4   337.4   293.86703   669.14468 6058.7 6075.477
> Sep 2002 -111.6 -395.2  -625.6   417.84519   123.97817 6342.3 6342.153
> Oct 2002  127.8  239.4   634.6   278.19881  -139.64639 6230.7 7026.675
> Nov 2002  -17.3 -145.1  -384.5  -968.59203 -1246.79084 6358.5 7989.396
> Dec 2002  286.3  303.6   448.7  -314.19599   654.39605 6341.2 7983.524
> Jan 2003     NA     NA      NA          NA          NA 6627.5 7663.457
>
> Thanks kindly. Lexi
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT
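The quoted workaround truncates every series to `[1:21]`, which keeps the first 21 values of each vector regardless of the month they belong to. If the separate series are `ts` objects that start at different dates (an assumption, since the thread does not show how they were built), rows can end up pairing values from different months. A hedged sketch with made-up series `x`, `dx` and `d2x`, using `ts.intersect()` from the stats package to align by date instead:

```r
# Made-up monthly series; diff() shortens a series and shifts its start date
x   <- ts(c(5, 7, 4, 9, 12, 8), start = c(2001, 1), frequency = 12)
dx  <- diff(x)                    # starts Feb 2001, length 5
d2x <- diff(x, differences = 2)   # starts Mar 2001, length 4

# Positional truncation pairs values from different months:
# x[1] is Jan 2001, dx[1] is Feb 2001, d2x[1] is Mar 2001
misaligned <- data.frame(x = x[1:4], dx = dx[1:4], d2x = d2x[1:4])

# ts.intersect() keeps only the time window common to all series,
# so each row of the result refers to a single month (Mar to Jun 2001 here)
aligned <- ts.intersect(x, dx, d2x, dframe = TRUE)
```

With `dframe = TRUE` the result is a data frame that can be passed to `lm()` through its `data` argument.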
Hello,

The class of your data is not dataframe.
Suppose I call your data as ydat1

str(ydat1)
 mts [1:24, 1:7] 68.1 -34.8 90.4 54.6 -172.3 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:7] "DCred1" "DCred2" "DCred3" "DBoBC2" ...
 - attr(*, "tsp")= num [1:3] 2001 2003 12
 - attr(*, "class")= chr [1:2] "mts" "ts"

ydat2<-data.frame(ydat1)
str(ydat2)
'data.frame':    24 obs. of  7 variables:
 $ DCred1: num  68.1 -34.8 90.4 54.6 -172.3 ...
 $ DCred2: num  NA -102.9 125.2 -35.8 -226.9 ...
 $ DCred3: num  NA NA 228 -161 -191 ...
 $ DBoBC2: num  NA -164.5 17.1 96 680.2 ...
 $ DBoBC3: num  NA NA 181.5 78.9 584.3 ...
 $ CredL1: num  4937 5005 4970 5061 5115 ...
 $ BoBCL1: num  4188 4296 4240 4201 4258 ...

#Since you wanted only to do lm for these columns, I guess it doesn't really matter whether you have month and year in the dataset.

#With NAs
regCred<-lm(DCred1~DCred2+DCred3+DBoBC2+DBoBC3+CredL1+BoBCL1,data=ydat2)
summary(regCred)

Call:
lm(formula = DCred1 ~ DCred2 + DCred3 + DBoBC2 + DBoBC3 + CredL1 +
    BoBCL1, data = ydat2)

Residuals:
        Min          1Q      Median          3Q         Max
-124.988463  -33.133975    7.971083   23.607953   76.813601

Coefficients:
                 Estimate    Std. Error  t value   Pr(>|t|)
(Intercept) -538.61375718  205.91179535 -2.61575   0.020344 *
DCred2         0.96401908    0.15623660  6.17025 2.4337e-05 ***
DCred3        -0.25720355    0.08983607 -2.86303   0.012524 *
DBoBC2        -0.11222347    0.07828182 -1.43358   0.173646
DBoBC3         0.04564621    0.03825169  1.19331   0.252578
CredL1         0.18499925    0.06565456  2.81777   0.013693 *
BoBCL1        -0.07682710    0.03406916 -2.25503   0.040666 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 54.44479 on 14 degrees of freedom
  (3 observations deleted due to missingness)
Multiple R-squared: 0.9324472,    Adjusted R-squared: 0.903496
F-statistic: 32.20757 on 6 and 14 DF,  p-value: 2.046024e-07

#Without NAs
ydat3<-na.omit(ydat2)
regCred<-lm(DCred1~DCred2+DCred3+DBoBC2+DBoBC3+CredL1+BoBCL1,data=ydat3)
summary(regCred)

Call:
lm(formula = DCred1 ~ DCred2 + DCred3 + DBoBC2 + DBoBC3 + CredL1 +
    BoBCL1, data = ydat3)

Residuals:
        Min          1Q      Median          3Q         Max
-124.988463  -33.133975    7.971083   23.607953   76.813601

Coefficients:
                 Estimate    Std. Error  t value   Pr(>|t|)
(Intercept) -538.61375718  205.91179535 -2.61575   0.020344 *
DCred2         0.96401908    0.15623660  6.17025 2.4337e-05 ***
DCred3        -0.25720355    0.08983607 -2.86303   0.012524 *
DBoBC2        -0.11222347    0.07828182 -1.43358   0.173646
DBoBC3         0.04564621    0.03825169  1.19331   0.252578
CredL1         0.18499925    0.06565456  2.81777   0.013693 *
BoBCL1        -0.07682710    0.03406916 -2.25503   0.040666 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 54.44479 on 14 degrees of freedom
Multiple R-squared: 0.9324472,    Adjusted R-squared: 0.903496
F-statistic: 32.20757 on 6 and 14 DF,  p-value: 2.046024e-

#Same result

Not sure what you meant by ("This is good, but couldn't I code the process for my 15 variable model?")

A.K.
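One possible reading of the "15 variable model" question is that typing every regressor by hand is the burden. Under that assumption, and once the data are in a data frame, the formula can be written or built generically. The sketch below uses made-up data (the names `toy`, `y`, `x1` to `x3` are illustrative) and also shows why the "with NAs" and "without NAs" fits above agree: with R's default `na.action` option, `lm()` drops incomplete rows itself.

```r
set.seed(1)  # toy data only; not the thread's dataset
toy <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
toy$x2[1:2] <- NA   # mimic NAs created by lagging or differencing

# `y ~ .` regresses y on every other column of the data frame
fit_dot <- lm(y ~ ., data = toy)

# reformulate() builds the same formula from a character vector of names
rhs <- setdiff(names(toy), "y")
fit_ref <- lm(reformulate(rhs, response = "y"), data = toy)

# With the default na.action, lm() omits incomplete rows, so calling
# na.omit() on the data first gives the same coefficients
fit_omit <- lm(y ~ ., data = na.omit(toy))
```

The `reformulate()` form is handy when only a subset of columns should enter the model, since `rhs` can be any character vector of column names.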
Hi,

One more thing,

ydat1: original dataset
ydat2<-data.frame(ydat1)

#Not sure, how you did this step on original data because::
dframe<- data.frame(Dcre1=DCred1[1:21],Dcre2=DCred2[1:21],Dcre3=DCred3[1:21],
 Dbobc2=DBoBC2[1:21],Dbobc3=DBoBC3[1:21],CredL=CredL1[1:21],BoBCL=BoBCL1[1:21])

I am getting errors for that step, when I used ydat1.

head(ydat1)
[1]   68.1  -34.8   90.4   54.6 -172.3   51.8
head(ydat2)
  DCred1 DCred2 DCred3     DBoBC2      DBoBC3 CredL1   BoBCL1
1   68.1     NA     NA         NA          NA 4937.0 4187.500
2  -34.8 -102.9     NA -164.45784          NA 5005.1 4296.005
3   90.4  125.2  228.1   17.07935   181.53719 4970.3 4240.052
4   54.6  -35.8 -161.0   95.97679    78.89743 5060.7 4201.178
5 -172.3 -226.9 -191.1  680.23817   584.26138 5115.3 4258.281
6   51.8  224.1  451.0 -491.34869 -1171.58686 4943.0 4995.623

#I analyzed [1:21] again in ydat2.
dframe<-data.frame(Dcre1=ydat2$DCred1[1:21],Dcre2=ydat2$DCred2[1:21],
 Dcre3=ydat2$DCred3[1:21],Dbobc2=ydat2$DBoBC2[1:21],Dbobc3=ydat2$DBoBC3[1:21],
 CredL=ydat2$CredL1[1:21],BoBCL=ydat2$BoBCL1[1:21])

But, the results are bit different than in my earlier post, because, here the NAs are still present in different rows.  So, the observations in those rows will be deleted while it is analyzed.

regCred<- lm(Dcre1~Dcre2+Dcre3+Dbobc2+Dbobc3+CredL+BoBCL, data=dframe)
summary(regCred)

Call:
lm(formula = Dcre1 ~ Dcre2 + Dcre3 + Dbobc2 + Dbobc3 + CredL +
    BoBCL, data = dframe)

Residuals:
     Min       1Q   Median       3Q      Max
-118.687  -25.568   -5.334   35.035   69.992

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -485.42427  209.47952  -2.317 0.038958 *
Dcre2          0.95097    0.18156   5.238 0.000209 ***
Dcre3         -0.28676    0.10787  -2.658 0.020852 *
Dbobc2        -0.09512    0.09334  -1.019 0.328278
Dbobc3         0.03199    0.04933   0.648 0.528936
CredL          0.14825    0.07193   2.061 0.061645 .
BoBCL         -0.04844    0.04333  -1.118 0.285540
---

A.K.
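The difference from the earlier fit comes down to which rows survive. In the 24-row table posted in this thread, the NAs sit in the first two rows and the last row, so rows `1:21` still contain the two leading NA rows and leave out complete rows 22 and 23. A small sketch with a made-up frame that only copies that NA pattern (the names `toy`, `y` and `x` are illustrative, and the values are not the thread's):

```r
# Toy frame with NAs in the first two rows and the last row (values made up)
n <- 24
toy <- data.frame(y = seq_len(n) + 0.5, x = as.numeric(seq_len(n)))
toy$x[c(1, 2)] <- NA
toy$y[n] <- NA

# Rows 1:21 keep the two leading NA rows and discard complete rows 22 and 23
n_subset <- sum(complete.cases(toy[1:21, ]))   # complete rows in the subset

# Complete rows available in the full data
n_full <- sum(complete.cases(toy))

# na.omit() keeps exactly the complete rows, wherever they fall
kept <- na.omit(toy)
```

Here the positional subset leaves 19 usable rows while `na.omit()` keeps 21, which is the kind of gap that would produce different estimates.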
From: "Lekgatlhamang, lexi Setlhare" <lexisetlhare at yahoo.com>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Monday, July 2, 2012 11:43 AM
Subject: Re: [R] Adjusting length of series
Thanks very much A.K. I have to admit that my problem was not clearly stated, with the structure of my data provided. Now all is well.
Cheers
Lexi
From: arun <smartpink111 at yahoo.com>
To: "Lekgatlhamang, lexi Setlhare" <lexisetlhare at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Monday, July 2, 2012 4:40 PM
Subject: Re: [R] Adjusting length of series