
Regression with few observations per factor level

On 24/10/2014, at 09:03 AM, V. Coudrain wrote:

            
Valerie,

This is a nice description of the structure of your data. When you model your data, you should use the same structure in your model. If you ignore some features of this structure, you should have good reasons for that decision, and reaching those reasons requires first analysing the data as they are structured. Collapsing these data into, say, five (or four? how?) means does not solve any of the problems with this structure -- among other things, means ignore the temporal autocorrelation structure. (The temporal autocorrelation may be a more important aspect than Year-as-a-factor if you are absolutely uninterested in random years.) With averaging, you really lose degrees of freedom and are easily lured into wrong conclusions. If you have five means, you can order them in 5! = 120 ways (and four means in 4! = 24 ways). Two of these orderings are perfectly monotone, one increasing and one decreasing (a proportion of 2/120 = 1/60 = 0.017 of all permutations of five points), and many more are nearly perfectly or "significantly" ordered and trick you into thinking that a linear regression would be a good solution. With five data points you just can't know.
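The counting argument above can be checked by brute force. The following is only a sketch under stated assumptions: it treats the means as distinct values, measures "ordering" by Spearman's rank correlation between position and rank, and uses |rho| >= 0.9 as an arbitrary illustrative cut-off for "nearly perfectly ordered" (not a significance test). The function name `ordering_counts` and the threshold are hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def ordering_counts(n, threshold=Fraction(9, 10)):
    """Enumerate all orderings of n distinct means.

    Returns (total, perfect, near), where `perfect` counts strictly
    monotone orderings (|rho| == 1) and `near` counts orderings with
    |rho| >= threshold (an assumed, illustrative cut-off).
    """
    positions = range(1, n + 1)
    perfect = 0
    near = 0
    for perm in permutations(positions):
        # Spearman's rho for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
        d2 = sum((p - q) ** 2 for p, q in zip(positions, perm))
        rho = 1 - Fraction(6 * d2, n * (n * n - 1))  # exact arithmetic
        if abs(rho) == 1:
            perfect += 1
        if abs(rho) >= threshold:
            near += 1
    return factorial(n), perfect, near


for n in (4, 5):
    total, perfect, near = ordering_counts(n)
    print(f"n={n}: {total} orderings, {perfect} perfectly ordered, "
          f"{near} with |rho| >= 0.9")
```

With five means, this enumeration finds the two perfectly ordered permutations out of 120, plus further orderings that differ from a perfect one only by swapping two neighbours. Under the assumed cut-off, random ordering alone produces an apparently strong trend more often than the 1/60 figure for perfect ordering suggests.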

Cheers, Jari Oksanen

PS. I hope this threading pleases Gav -- this certainly hurts all Outlook users.