delete.response leaves response in attribute dataClasses
Thanks, Bill. Counter-arguments at the end.
On Thu, Jan 5, 2012 at 3:15 PM, William Dunlap <wdunlap at tibco.com> wrote:
My feeling that everyone would index dataClasses by name was
wrong. I looked through the packages that used dataClasses
and saw code that would break if the first (response) entry
were omitted. (I didn't check to see if passing the output
of delete.response to these functions would be appropriate.)
E.g.,
file: AICcmodavg/R/predictSE.mer.r
  ##matrix with info on factors
  fact.frame <- attr(attr(orig.frame, "terms"), "dataClasses")[-1]
  ##continue if factors
  if(any(fact.frame == "factor")) {
    id.factors <- which(fact.frame == "factor")
    fact.name <- names(fact.frame)[id.factors] #identify the rows for factors
Some packages create a dataClasses attribute for a model.frame
(not its terms attribute) that does not have any names:
file: caper/R/macrocaic.R
  attr(mf, "dataClasses") <- rep("numeric", dim(termFactors)[2])
.checkMFClasses() does not throw an error for that, but it
doesn't do any real checking either.
Most users of dataClasses do pass it to .checkMFClasses() to
compare it with newdata and that doesn't care if you have extra
entries in dataClasses.
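
A minimal sketch of both points, assuming a plain lm fit on the built-in mtcars data (the model and the object names below are illustrative and are not taken from any of the packages above). It picks out predictor classes by name instead of with [-1], and then runs .checkMFClasses() in the way predict.lm does, with the response entry still present in dataClasses:

```r
# Illustrative model; mtcars ships with R.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
tt  <- terms(fit)
dc  <- attr(tt, "dataClasses")   # taken from the full terms, so it includes "mpg"

# Predictor names come from the response-free terms, not from position.
pred_vars  <- as.list(attr(delete.response(tt), "variables"))[-1L]
pred_names <- vapply(pred_vars,
                     function(v) paste(deparse(v), collapse = " "), "")

# Name-based lookup keeps working whether or not the response entry is present.
fact_names <- pred_names[dc[pred_names] == "factor"]

# Same check predict.lm performs on newdata; the extra "mpg" entry is ignored.
newdata <- mtcars[1:3, c("wt", "cyl")]
mf <- model.frame(delete.response(tt), newdata, xlev = fit$xlevels)
.checkMFClasses(dc, mf)
```

Code written this way should not care whether delete.response() drops the response from dataClasses or leaves it in.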
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
I can't understand what your point is. I agree we can work around the problem, but why should we have to?

If you confine yourself to the output of "delete.response" applied to a terms object from a regression, can you point to any package or usage that depends on leaving the response variable in the dataClasses attribute? I can't find one.

In R base, these are all the references to delete.response:

stats/R/models.R:delete.response <- function (termobj)
stats/R/lm.R: Terms <- delete.response(tt)
stats/R/lm.R: Terms <- delete.response(tt)
stats/R/ppr.R: Terms <- delete.response(object$terms)
stats/R/loess.R: as.matrix(model.frame(delete.response(terms(object)), newdata,
stats/R/dummy.coef.R: Terms <- delete.response(Terms)

I've looked it over carefully and predict.lm (in lm.R) would not be affected by the change I propose. I can't find any usage in loess.R of the dataClasses attribute.

Furthermore, I can't see how a person would use the dataClasses attribute at all, after the other markers of the response are eliminated. How is a method to find which variable is the response, after response=0?

I'm not disagreeing with you that I can work around the peculiarity that the response is left in the dataClasses attribute of the output object from delete.response. I'm just saying it is a complication that programmers should not have to put up with, because I think delete.response should delete the response from all attributes of a terms object.

pj
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
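
The workaround discussed above could be sketched roughly as follows. The wrapper name delete_response_classes is hypothetical and is not part of stats; it assumes dataClasses is a named character vector whose names match the deparsed entries of the "variables" attribute, which holds for the simple lm fit used here but is not guaranteed for every terms object.

```r
# Hypothetical wrapper (not part of stats): delete.response() plus removal of
# the response entry from the dataClasses attribute, if one is still there.
delete_response_classes <- function(termobj) {
  resp <- attr(termobj, "response")
  out  <- delete.response(termobj)
  dc   <- attr(out, "dataClasses")
  if (!is.null(dc) && !is.null(names(dc)) && !is.null(resp) && resp > 0) {
    # Name of the response as it appears in the original terms object.
    resp_name <- paste(deparse(attr(termobj, "variables")[[resp + 1L]]),
                       collapse = " ")
    # Drop by name, so nothing happens if the entry is already gone.
    attr(out, "dataClasses") <- dc[names(dc) != resp_name]
  }
  out
}

# Illustrative use on a model fitted to the built-in mtcars data.
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
tt2 <- delete_response_classes(terms(fit))
attr(tt2, "dataClasses")
```

Dropping the entry by name rather than by position means the wrapper is harmless if delete.response() itself removes the response from dataClasses.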