
How many samples ACTUALLY used in regression?

On Mar 18, 2013, at 7:36 AM, Federico Calboli <f.calboli at imperial.ac.uk> wrote:

            
I don't know that this would be universal to all possible R model implementations, but it should work for those that at least abide by "certain standards"[1] regarding the internal use of ?model.frame.

When a modeling function uses 'model = TRUE' as its default (e.g. lm(), glm() and MASS::polr()), the returned model object has a component called 'model', such that:

  nrow(my.model$model)

returns the number of rows in the internally created data frame.
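
As a minimal sketch of the above: the data frame 'd' and the object names are made up for illustration, and 'na.action = na.omit' is passed explicitly rather than relying on the session default.

```r
# Hypothetical toy data: 10 rows, two of which contain an NA
d <- data.frame(
  y = c(1.2, 2.3, NA, 4.1, 5.0, 6.2, 7.1, 8.3, 9.0, 10.4),
  x = c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)
)

# model = TRUE is the default for lm(), so the model frame is retained
fit <- lm(y ~ x, data = d, na.action = na.omit)

n_supplied <- nrow(d)          # rows passed in
n_used     <- nrow(fit$model)  # rows left in the internal model frame
```

Here the two counts differ by the two incomplete rows.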

Note that 'model = TRUE' is not the default for many functions, for example Terry's coxph() in survival or Frank's lrm() in rms, so for those you would need to set it explicitly in the call.
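
To avoid depending on add-on packages, the sketch below mimics that situation with lm() and 'model = FALSE'; it is a stand-in, not a coxph() or lrm() example. The data and object names are hypothetical.

```r
# Hypothetical complete data, 5 rows
d <- data.frame(y = c(1, 3, 2, 5, 4), x = 1:5)

# Without the model frame retained, there is no 'model' component to count
fit_no_frame <- lm(y ~ x, data = d, model = FALSE)
has_frame_off <- !is.null(fit_no_frame$model)

# Asking for it explicitly brings the component back
fit_frame <- lm(y ~ x, data = d, model = TRUE)
has_frame_on <- !is.null(fit_frame$model)
n_used <- nrow(fit_frame$model)
```

When the component is absent, nrow(my.model$model) simply returns NULL rather than an error, so it is worth checking for that.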

Note also that the value of 'na.action' in the modeling function call may affect whether the number of rows in the retained 'model' data frame is really the correct value.
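
One way to cross-check the count, sketched here for lm() with 'na.action = na.omit' only, is to look at the 'na.action' component of the fit, which records the rows that were dropped. The toy data are hypothetical, and other modeling functions may not store this component in the same way.

```r
# Hypothetical toy data: 10 rows, two of which contain an NA
d <- data.frame(
  y = c(1.2, 2.3, NA, 4.1, 5.0, 6.2, 7.1, 8.3, 9.0, 10.4),
  x = c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)
)

fit <- lm(y ~ x, data = d, na.action = na.omit)

n_dropped <- length(fit$na.action)  # rows removed because of NAs
n_used    <- nrow(fit$model)        # rows retained in the model frame

# Dropped plus retained rows should account for every row supplied
accounted_for <- (n_used + n_dropped) == nrow(d)
```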

You can also use model.frame(), independently matching the arguments passed to the model function, to replicate what takes place internally in many modeling functions. The result of model.frame() will be a data frame, again subject to the same limitations as above.
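
A small sketch of the model.frame() route, again with made-up data, showing how the choice of 'na.action' changes the row count:

```r
# Hypothetical toy data: 10 rows, two of which contain an NA
d <- data.frame(
  y = c(1.2, 2.3, NA, 4.1, 5.0, 6.2, 7.1, 8.3, 9.0, 10.4),
  x = c(1, 2, 3, 4, NA, 6, 7, 8, 9, 10)
)

# Same formula, data and na.action as would be passed to the model function
mf_omit <- model.frame(y ~ x, data = d, na.action = na.omit)
n_omit  <- nrow(mf_omit)  # incomplete rows removed

# With na.pass the incomplete rows stay in the frame
mf_pass <- model.frame(y ~ x, data = d, na.action = na.pass)
n_pass  <- nrow(mf_pass)
```

With na.pass the frame still contains the incomplete rows, so its row count is not the number of usable observations.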

Regards,

Marc Schwartz

[1]: http://developer.r-project.org/model-fitting-functions.txt