Questions on weighted least squares
I have figured out the problem. Thanks.

Sincerely,
Yanwei Zhang
Department of Actuarial Research and Modeling
Munich Re America
Tel: 609-275-2176
Email: yzhang at munichreamerica.com

-----Original Message-----
From: Zhang Yanwei - Princeton-MRAm
Sent: Wednesday, July 23, 2008 3:32 PM
To: Zhang Yanwei - Princeton-MRAm
Cc: r-help at r-project.org
Subject: RE: [R] Questions on weighted least squares

Sorry if I did not state the question clearly. To put it another way: if the variance of the observation is proportional to the predictor, that is, var(y_i)=x_i*sigma^2, what should be specified in the "weights" argument of the lm function?

fit=lm(y~x,weights=???)

Sincerely,
Yanwei Zhang

-----Original Message-----
From: markleeds at verizon.net
Sent: Wednesday, July 23, 2008 3:00 PM
To: Zhang Yanwei - Princeton-MRAm
Subject: RE: [R] Questions on weighted least squares

I'm not sure about your whole question, but you shouldn't be normalizing the predictor. That I know. The predictors are considered "fixed", so there's no reason to normalize them, ever.
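For the follow-up question above (var(y_i)=x_i*sigma^2), here is a minimal sketch of one way to set the weights. It relies on the documented behaviour of lm, which minimizes sum(w_i * e_i^2), so the weights should be proportional to the inverse of each observation's variance, here 1/x. The simulated data, the seed, and the names sigma, fit_a and fit_b are illustrative assumptions and do not come from the thread.

```r
set.seed(1)                          # illustrative seed, not from the thread
n <- 200
x <- runif(n, 1, 10)                 # hypothetical positive predictor
sigma <- 2                           # hypothetical scale parameter
y <- 3 + 1.5 * x + rnorm(n, sd = sigma * sqrt(x))   # var(y_i) = x_i * sigma^2

# lm() minimizes sum(w * residuals^2), so w is taken proportional to 1/var(y_i)
fit_a <- lm(y ~ x, weights = 1 / x)

# sigma^2 is a common factor, so including it rescales every weight equally
fit_b <- lm(y ~ x, weights = 1 / (x * sigma^2))

# Compare the two sets of coefficients
all.equal(coef(fit_a), coef(fit_b))
```

Because only the relative sizes of the weights affect the coefficient estimates, the unknown sigma^2 does not need to be supplied; the comparison on the last line should report that the two fits agree.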
On Wed, Jul 23, 2008 at 2:49 PM, Zhang Yanwei - Princeton-MRAm wrote:
Hi all,

I ran into a problem with weighted least squares regression.

1. I simulated a Normal vector (sim1) with mean 425906 and standard deviation 40000.
2. I simulated a second Normal vector with conditional mean b1*sim1, where b1 is just a number I specified, and variance proportional to sim1. Precisely, the standard deviation is sqrt(sim1)*50.
3. Then I ran a WLS regression without the intercept term, with "weights" equal to sqrt(sim1)*50. I wonder whether I should specify the weights in this way so that each observation will have equal variance 1.
4. If step 3 is correct, it should yield the same result as normalizing the response and the predictor first by sqrt(sim1)*50 and then using the "lm" function without "weights". But the two methods yield different results.

Could someone tell me which one is the correct approach? Thanks in advance. The code and output are as follows:
b1=474186/425906
n=240
sim1=rnorm(n,425906,40000)
sim2=matrix(0,n,1)
for (i in 1:n) {
  sim2[i]=rnorm(1,sim1[i]*b1,sqrt(sim1[i])*50)
}

fit1=lm(sim2~-1+sim1,weights=sqrt(sim1)*50)
coef(fit1)
#     sim1
# 1.116028

y=sim2/(sqrt(sim1)*50)
x=sim1/(sqrt(sim1)*50)
fit2=lm(y~-1+x)
coef(fit2)
#        x
# 1.116273
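One likely explanation for the discrepancy is sketched here; the thread does not record what the poster's own resolution was. Dividing the response and predictor by s = sqrt(sim1)*50 is equivalent to weighted least squares with weights 1/s^2 (the inverse variance), not with weights s, because lm multiplies the squared residuals by the weights. The seed and the names s, fit_w, fit_t and fit_s are assumptions introduced for illustration.

```r
set.seed(123)                        # illustrative seed, not from the thread
b1 <- 474186 / 425906
n <- 240
sim1 <- rnorm(n, 425906, 40000)
s <- sqrt(sim1) * 50                 # standard deviation of each observation
sim2 <- rnorm(n, sim1 * b1, s)       # vectorized version of the loop in the post

# Inverse-variance weights: w_i = 1 / sd_i^2
fit_w <- lm(sim2 ~ -1 + sim1, weights = 1 / s^2)

# Transformed regression: divide response and predictor by sd_i, no weights
y <- sim2 / s
x <- sim1 / s
fit_t <- lm(y ~ -1 + x)

# Weights equal to the standard deviation, as in the post, for comparison
fit_s <- lm(sim2 ~ -1 + sim1, weights = s)

c(inverse_variance = unname(coef(fit_w)),
  transformed      = unname(coef(fit_t)),
  weights_sd       = unname(coef(fit_s)))
```

With inverse-variance weights the first two slopes should coincide up to rounding, while the third is the solution of a different weighted problem and will generally differ slightly. In this no-intercept case with weights proportional to 1/sim1, the weighted estimate also reduces to sum(sim2)/sum(sim1), which gives a quick hand check.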
Sincerely,
Yanwei Zhang
Department of Actuarial Research and Modeling Munich Re America
Tel: 609-275-2176
Email: yzhang at munichreamerica.com
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.