Errors-In-Variables in R
Dear Cedric,

If I understand correctly what you want to do, and if you're willing to assume that the variables are normally distributed, then you should be able to use any of the latent-variable structural-equation-modeling packages in R, such as sem, OpenMx, or lavaan. Here's an artificial example using the sem package:

------------ snip ----------
set.seed(12345)
zeta <- rnorm(1000)
y <- 1 + 2*zeta + rnorm(1000, 0, 1)
x <- zeta + rnorm(1000)
plot(x, y)
Data <- data.frame(x, y)
summary(lm(y ~ x)) # biased

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-6.6339 -1.1406  0.0299  1.1573  6.5652

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.04007    0.05514   18.86   <2e-16 ***
x            1.06089    0.04012   26.44   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.743 on 998 degrees of freedom
Multiple R-squared: 0.4119, Adjusted R-squared: 0.4113
F-statistic: 699.1 on 1 and 998 DF, p-value: < 2.2e-16
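
Why the lm() slope above is biased: x measures zeta with independent error of variance 1, so the least-squares slope estimates beta*V(zeta)/(V(zeta) + 1), which is 2*1/2 = 1 for these simulated data rather than beta = 2. The following is a minimal sketch, not part of the original message, of the corresponding method-of-moments correction. It assumes the error variance of x is known to be 1 (true here by construction); the names err_var, reliability, b_ols and b_corrected are illustrative only.

```r
# Regenerate the simulated data from the example
set.seed(12345)
zeta <- rnorm(1000)
y <- 1 + 2*zeta + rnorm(1000, 0, 1)
x <- zeta + rnorm(1000)

# Naive least-squares slope, attenuated toward zero by the error in x
b_ols <- unname(coef(lm(y ~ x))["x"])

# ASSUMPTION: the measurement-error variance of x is known to be 1
err_var <- 1

# Estimated share of var(x) that is due to the latent zeta
reliability <- (var(x) - err_var) / var(x)

# Method-of-moments correction: divide the naive slope by the reliability
b_corrected <- b_ols / reliability

print(c(ols = b_ols, reliability = reliability, corrected = b_corrected))
```

The correction is only as good as the assumed error variance; if err_var is misspecified, the corrected slope is biased as well.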
plot(x, y) # not shown
library(sem)
eqns <- specifyEquations()
1: y = alpha*Intercept + beta*zeta
2: x = 1*zeta
3: V(y) = sigma
4: V(x) = 1
5: V(zeta) = phi
6:
Read 5 items
model <- sem(eqns, data=Data, raw=TRUE, fixed.x="Intercept")
summary(model)

Model fit to raw moment matrix.

Model Chisquare = 0.2264654 Df = 1 Pr(>Chisq) = 0.6341572
AIC = 8.226465
BIC = -6.68129

Normalized Residuals
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.0000  0.1635  0.1711  0.2189  0.2564  0.4759

Parameter Estimates
       Estimate  Std Error   z value     Pr(>|z|)
alpha 1.0400668 0.05507397 18.884905 1.518098e-79 y <--- Intercept
beta  2.2553406 0.14197058 15.885971 7.926103e-57 y <--- zeta
sigma 0.6404697 0.25612060  2.500657 1.239632e-02 y <--> y
phi   0.8881856 0.08444223 10.518263 7.117323e-26 zeta <--> zeta

Iterations = 15

library(car)
linearHypothesis(model, c("alpha = 1", "beta = 2", "sigma = 1", "phi = 1")) # true parameter values

Linear hypothesis test

Hypothesis:
alpha = 1
beta = 2
sigma = 1
phi = 1

Model 1: restricted model
Model 2: model

  Res.Df Df  Chisq Pr(>Chisq)
1      5
2      1  4 3.8285     0.4297

------------ snip ----------

For other distributional assumptions, you'd have to write your own objective function but you may still be able to use sem or one of the other SEM packages.

I hope this helps,
John

-----------------------------------------------
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Cedric Sodhi
Sent: Saturday, March 02, 2013 4:56 PM
To: Rui Barradas
Cc: r-help at r-project.org
Subject: Re: [R] Errors-In-Variables in R

Perhaps it would have been clearer that this is no homework if I hadn't forgotten to say what [1] is. Sorry for that.

[1] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15225

(This is no homework but genuinely addresses the problem that R, to my knowledge, does not have models for errors in variables.)

On Sat, Mar 02, 2013 at 09:34:21PM +0000, Rui Barradas wrote:
There's a no homework policy in R-help.

Rui Barradas

On 02-03-2013 18:28, Cedric Sodhi wrote:
In reference to [1], how would you solve the following regression problem:

Given observations (X_i,Y_i) with known respective error distributions (e_X_i,e_Y_i) (say, 0-mean Gaussian with known STD), find the parameters a and b which maximize the Likelihood of

Y = a*X + b

Taking the example further, how many of the very simplified assumptions from the above example can be lifted or eased and R still has a method for finding an errors-in-variables fit?
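
One way to approach the problem as posed above, sketched under stated assumptions rather than offered as an established R facility: if the errors in X_i and Y_i are independent zero-mean Gaussians with known standard deviations, and the true X values are treated as unknown constants, then maximizing the likelihood over those true values leaves the criterion sum((Y - a*X - b)^2 / (sy^2 + a^2*sx^2)) to be minimized over a and b. The data below are simulated, and the names sx, sy, eiv_objective, a_hat and b_hat are hypothetical.

```r
# Simulated data (ASSUMPTION: illustrative values, not from the thread)
set.seed(1)
n  <- 200
xi <- runif(n, 0, 10)               # unobserved true X values
sx <- runif(n, 0.2, 0.6)            # known error SD of each X_i
sy <- runif(n, 0.3, 0.9)            # known error SD of each Y_i
X  <- xi + rnorm(n, 0, sx)
Y  <- 2*xi + 0.5 + rnorm(n, 0, sy)  # true a = 2, b = 0.5

# Negative log-likelihood (up to constants) after profiling out the true
# X values: squared residuals weighted by their "effective variance"
eiv_objective <- function(par, X, Y, sx, sy) {
  a <- par[1]
  b <- par[2]
  sum((Y - a*X - b)^2 / (sy^2 + a^2 * sx^2))
}

# Start from ordinary least squares; coef() returns (intercept, slope),
# so reverse it to get (a, b)
start <- unname(rev(coef(lm(Y ~ X))))

fit <- optim(start, eiv_objective, X = X, Y = Y, sx = sx, sy = sy,
             method = "BFGS")

a_hat <- fit$par[1]
b_hat <- fit$par[2]
print(c(a = a_hat, b = b_hat))
```

Unlike the structural model fitted with sem, this treats the true X values as fixed rather than normally distributed, and it lets the error SDs differ from one observation to the next.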

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.