Dear Consultant
I've done linear regression successfully on R a few times before. But this
time it keeps telling me:-
"Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases"
The model is:-
fm1 <- lm(TS.CM ~ AGE + SEX + HFE.Y.01 + TFC2B.01 + HFE.Y.01*TFC2B.01,
          data = IRONresults, subset = DIAG2.1D == 0)
summary(fm1)
TS.CM is a continuous variable (%s); SEX is coded 0 = women, 1 = men;
DIAG2.1D is coded 0 = non-demented, 1 = Alzheimer's disease; and the genes,
HFE.Y.01 & TFC2B.01, are coded 0 = non-carrier and 1 = carrier.
I've tried recoding the data to use 1 & 2 instead of 0 & 1, and I've
removed the rows with missing data. I've also tried putting "...lm(formula
= TS.CM ~ ...", but I always get the same error message.
What am I doing wrong?
A related question: what's the minimum number of data points for regression
analysis to work? We have only 23 cases carrying both genes out of 447, and
only 8 out of 264 in the above subset (i.e. non-demented). I seem to
remember hearing somewhere that you needed a minimum of ~30 (?), so
probably this wouldn't work anyway. Still, I'd like to know what I was
doing wrong!
Many thanks
Donald (Lehmann)
linear regression
2 messages · Donald Lehmann, Peter Dalgaard
Donald Lehmann <donald.lehmann at pharmacology.oxford.ac.uk> writes:
> Dear Consultant
> I've done linear regression successfully on R a few times before. But
> this time it keeps telling me:-
> "Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
> 0 (non-NA) cases"
> The model is:-
> fm1 <- lm(TS.CM ~ AGE + SEX + HFE.Y.01 + TFC2B.01 + HFE.Y.01*TFC2B.01,
>           data = IRONresults, subset = DIAG2.1D == 0)
> summary(fm1)
> TS.CM is a continuous variable (%s); SEX is coded 0 = women, 1 = men;
> DIAG2.1D is coded 0 = non-demented, 1 = Alzheimer's disease; and the
> genes, HFE.Y.01 & TFC2B.01, are coded 0 = non-carrier and 1 = carrier.
> I've tried recoding the data to use 1 & 2 instead of 0 & 1, and I've
> removed the rows with missing data. I've also tried putting
> "...lm(formula = TS.CM ~ ...", but I always get the same error message.
> What am I doing wrong?
You don't need to give the main effects when there's a "*" term (that's a SASism; the R equivalent is ":", and a*b == a+b+a:b by definition), but that is hardly the main problem. Could you have a look at this?

    with(IRONresults, complete.cases(TS.CM, AGE, SEX, HFE.Y.01, TFC2B.01))

If you get all FALSE, you'll know what hit you...
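
The check suggested here can be tried on made-up data. The sketch below is only an illustration: the `toy` data frame, its values, and the choice of `TFC2B.01` as the all-NA column are assumptions, not the real IRONresults data, and an all-NA column is just one of several ways to end up with zero complete cases. It counts the rows that `lm()` could use, counts missing values per variable, and confirms that the two ways of writing the formula expand to the same terms.

```r
# Toy stand-in for IRONresults (hypothetical values, not the real data).
set.seed(1)
n <- 20
toy <- data.frame(
  TS.CM    = rnorm(n, mean = 30, sd = 5),
  AGE      = round(runif(n, 60, 90)),
  SEX      = rep(0:1, length.out = n),
  HFE.Y.01 = rep(c(0, 0, 1, 1), length.out = n),
  TFC2B.01 = NA_real_,                  # assumed culprit: a column that is all NA
  DIAG2.1D = rep(c(0, 0, 1), length.out = n)
)

vars <- c("TS.CM", "AGE", "SEX", "HFE.Y.01", "TFC2B.01")

# Rows usable by lm(): no NA in any model variable, and inside the subset.
ok <- complete.cases(toy[, vars])
n_usable <- sum(ok & toy$DIAG2.1D == 0)
print(n_usable)

# Missing values per variable, to see which column is responsible.
na_per_var <- colSums(is.na(toy[, vars]))
print(na_per_var)

# With no usable rows, lm() stops with an error instead of returning a fit.
fit <- try(
  lm(TS.CM ~ AGE + SEX + HFE.Y.01 * TFC2B.01, data = toy,
     subset = DIAG2.1D == 0),
  silent = TRUE
)
print(inherits(fit, "try-error"))

# a*b expands to a + b + a:b, so the explicit main effects are redundant.
f_long  <- TS.CM ~ AGE + SEX + HFE.Y.01 + TFC2B.01 + HFE.Y.01*TFC2B.01
f_short <- TS.CM ~ AGE + SEX + HFE.Y.01 * TFC2B.01
same_terms <- identical(attr(terms(f_long), "term.labels"),
                        attr(terms(f_short), "term.labels"))
print(same_terms)
```

On real data, the per-variable NA count is usually the quickest way to see which column is emptying the model frame.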
> A related question: what's the minimum number of data points for regression analysis to work? We have only 23 cases carrying both genes out of 447, and only 8 out of 264 in the above subset (i.e. non-demented). I seem to remember hearing somewhere that you needed a minimum of ~30 (?), so probably this wouldn't work anyway. Still, I'd like to know what I was doing wrong!
Technically, you just need linearly independent predictors and more observations than parameters (incl. the intercept). Other bounds get bandied about on what should be required for a *meaningful* analysis (like "10 observations per parameter"), but these are quite heuristic and empirical in nature.
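
The two technical conditions can be checked directly on the design matrix. This is a rough sketch on invented data: the variables `g1`, `g2`, `dup` and the helper `check_design()` are hypothetical names chosen for illustration. It compares the number of observations, the number of parameters (columns of the model matrix, intercept included), and the rank of the matrix; a rank below the number of parameters means the predictors are linearly dependent.

```r
# Invented data: two 0/1 predictors plus an exact copy of one of them.
set.seed(2)
n <- 12
toy <- data.frame(
  y  = rnorm(n),
  g1 = rep(c(0, 1), each = n / 2),
  g2 = rep(c(0, 1), times = n / 2)
)
toy$dup <- toy$g1                      # linearly dependent on g1 by construction

# Design matrices: main effects plus interaction, and one with the duplicate.
X_ok  <- model.matrix(y ~ g1 * g2, data = toy)
X_bad <- model.matrix(y ~ g1 + g2 + dup, data = toy)

# Hypothetical helper: observations, parameters, and rank of a design matrix.
check_design <- function(X) {
  c(n_obs = nrow(X), n_par = ncol(X), rank = qr(X)$rank)
}

res_ok  <- check_design(X_ok)
res_bad <- check_design(X_bad)
print(res_ok)
print(res_bad)

# lm() still runs on the dependent design, but reports NA for the
# coefficient it cannot estimate.
coef_bad <- coef(lm(y ~ g1 + g2 + dup, data = toy))
print(coef_bad)
```

Passing this check only means the fit is computable; it says nothing about whether 8 double carriers are enough to estimate the interaction with useful precision.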
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)           FAX: (+45) 35327907