
memory problem for R

Was your 10% sample contiguous or randomly selected from the 
entire file?  If contiguous, you might get something from, say, 
processing the file in 100 contiguous blocks, computing something like 
the mean of each 1% block (or summarizing in some other way within 
blocks), then combining the summaries and running the regression on the 
block summaries. 
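
      A minimal sketch of the block idea, under stated assumptions: the 
temporary CSV file, the column names x and y, the simulated data and the 
object names (path, block_means, fit_blocks) are all made up for 
illustration and stand in for the real file and model.  Only one block is 
held in memory at a time.

```r
# Hypothetical stand-in for the large file: 10000 rows, y linear in x.
set.seed(1)
n <- 10000
x <- seq_len(n) / n
path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = x, y = 2 + 3 * x + rnorm(n, sd = 0.5)),
          path, row.names = FALSE)

n_blocks <- 100
rows_per_block <- n / n_blocks

# Read from an open connection so each read.csv() call continues
# where the previous one stopped.
con <- file(path, open = "r")
invisible(readLines(con, n = 1))  # skip the header line

block_means <- vector("list", n_blocks)
for (i in seq_len(n_blocks)) {
  chunk <- read.csv(con, header = FALSE, nrows = rows_per_block,
                    col.names = c("x", "y"))
  block_means[[i]] <- colMeans(chunk)  # summary of one 1% block
}
close(con)

# Combine the 100 block summaries and regress on them.
block_means <- as.data.frame(do.call(rbind, block_means))
fit_blocks <- lm(y ~ x, data = block_means)
coef(fit_blocks)
```

      The loop assumes the number of rows is known and divides evenly into 
blocks; a real file would need handling for a short final block. 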

      If it was an honest random sample (e.g., selecting approximately 
10% from each 10%), then the block averaging won't work:  You have an 
inherent singularity in the structure of the data that will likely not 
permit you to estimate everything you want to estimate.  You need to 
understand that singularity / lack of estimability and decide what to do 
about it. 
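
      One way to look for such a singularity, sketched on made-up data 
(the variables a, b, c and the object names are hypothetical): compare the 
rank of the model matrix with its number of columns.  A rank below the 
column count means some coefficients cannot be estimated separately.

```r
# Made-up predictors where c is an exact linear combination of a and b.
set.seed(3)
d_rank <- data.frame(a = rnorm(40), b = rnorm(40))
d_rank$c <- 2 * d_rank$a - d_rank$b

# Design matrix the regression would use (intercept plus a, b, c).
X <- model.matrix(~ a + b + c, data = d_rank)

rank_X <- qr(X)$rank  # numerical rank of the design matrix
n_coef <- ncol(X)     # number of coefficients requested

# rank_X < n_coef signals that the model is not fully estimable.
c(rank = rank_X, columns = n_coef)
```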

      In either case, "lm(..., singular.ok=TRUE)" will at least give you an 
answer even when the model is not fully estimable. 
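
      A small illustration on simulated data (the variables x1, x2, x3 
and the object names are assumptions, not taken from the original 
problem): with an exactly collinear predictor, lm() still returns a fit, 
reports NA for the coefficient it cannot estimate, and alias() shows the 
linear dependency behind it.

```r
# Simulated data in which x3 is exactly x1 + x2.
set.seed(2)
d_sing <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d_sing$x3 <- d_sing$x1 + d_sing$x2
d_sing$y <- 1 + 2 * d_sing$x1 - d_sing$x2 + rnorm(50, sd = 0.1)

# The fit succeeds despite the singular design.
fit_sing <- lm(y ~ x1 + x2 + x3, data = d_sing, singular.ok = TRUE)

coef(fit_sing)   # the coefficient for x3 is NA
alias(fit_sing)  # expresses x3 in terms of the other columns
```

      The NA marks the term that was dropped; the remaining coefficients 
should be interpreted with that dependency in mind. 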

      hope this helps. 
      spencer graves
Yun-Fang Juan wrote: