It really is unclear what is claimed to be a bug here. But see https://stat.ethz.ch/pipermail/r-devel/2007-May/045592.html for why the bug is not in R: your old and new data do not match. Your fit is to a category. [The problem with the web interface to R-bugs was reported last week: it is being worked on.]
On Mon, 30 Apr 2007, r.darnell at uq.edu.au wrote:
The following "issue" was found using
version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R
version.string R version 2.4.1 (2006-12-18)
and discussed on the R-downunder mailing list. I hope I have provided enough info. I tried to look at the Bugs Tracking page but got---

The system encountered a fatal error
* cannot open config file /home/sfe/r-bugs/jitterbug/R : No such file or directory
* The last error code was: No such file or directory
uid/gid=30/8

Regards
Ross Darnell

Date: Mon, 30 Apr 2007 19:25:09 +1000
From: John Maindonald <john.maindonald at anu.edu.au>
Subject: Re: [R-downunder] Beware unclass(factor)
To: Ross Darnell <r.darnell at uq.edu.au>
Cc: r-downunder at stat.auckland.ac.nz

Observe the following
z <- model.frame(cbind(moths,(20-moths)) ~sex+ doselin,data=worms)
class(z$doselin)
[1] "other"
levels(z$doselin)
[1] "1" "2" "4" "8" "16" "32"
attributes(z$doselin)
$levels
[1] "1"  "2"  "4"  "8"  "16" "32"

$class
[1] "other"

The problem surfaces in the call to model.frame() from predict.lm() when it is called by predict.glm(). This call jumps to conclusions when it takes the presence of a levels attribute as an indication that doselin is a factor. That is ironic, since it was the model.frame() call initiated by glm() that seems to have given the column doselin of the returned object the class "other". This seems to me to be a bug.

The call to unclass() does not strip the levels attribute from doselin. (This is not, I think, the bug; rather the problem is in the model matrix that is created.) The column worms$doselin does, though, have class "integer", at least as far as the function class() is concerned.

You can fix the problem by setting:

worms$doselin <- as.vector(unclass(worms$Dose))

This strips off the levels attribute.

In my view model.frame() ought to have stripped the levels attribute from the column doselin in the object that it returned. I consider that this should be reported as a bug, or at least as an undesirable feature.

John Maindonald
email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473
fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 30 Apr 2007, at 4:57 PM, Ross Darnell wrote:
Just an observation about the use of unclass() to generate codes for factors. As an example take the dataset from the MASS4 book
worms <- data.frame(sex=gl(2,6),Dose=factor(rep(2^(0:5),
2)),moths=c(1,4,9,13,18,20,0,2,6,10,12,16))
worms$doselin <- unclass(worms$Dose)
worms.glm <- glm(cbind(moths,(20-moths)) ~sex+
doselin,data=worms,family=binomial)
predict(worms.glm,new=data.frame(sex="1",doselin=6))
Error: variable 'doselin' was fitted with class "other" but class "numeric" was supplied
In addition: Warning message:
variable 'doselin' is not a factor in: model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels)
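The error above compares the class recorded for each variable when the model was fitted with the class found in the new data. A minimal sketch of how one might inspect both sides, assuming the recorded classes are available as the "dataClasses" attribute of the fitted model's terms; the names fitted_classes and newdata are illustrative only. The class reported for doselin may differ between R versions (this thread reports "other" under R 2.4.1), so no expected output is shown.

```r
# Rebuild the example from this message.
worms <- data.frame(sex = gl(2, 6),
                    Dose = factor(rep(2^(0:5), 2)),
                    moths = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16))
worms$doselin <- unclass(worms$Dose)   # integer codes, "levels" attribute kept

worms.glm <- glm(cbind(moths, 20 - moths) ~ sex + doselin,
                 data = worms, family = binomial)

# Classes recorded for each model variable at fit time.
fitted_classes <- attr(terms(worms.glm), "dataClasses")
print(fitted_classes)

# Classes of the columns in the proposed new data, for comparison.
newdata <- data.frame(sex = "1", doselin = 6)
print(sapply(newdata, class))
```

Any variable whose entry differs between the two printouts is a candidate for the mismatch that predict() complains about.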
The /doselin/ vector is "atomic" --- good enough for the glm() function but not acceptable by predict()
str(worms$doselin)
 atomic [1:12] 1 2 3 4 5 6 1 2 3 4 ...
 - attr(*, "levels")= chr [1:6] "1" "2" "4" "8" ...
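The str() output shows that unclass() leaves the "levels" attribute on the codes. A minimal sketch of the workaround John Maindonald suggests in this thread, as.vector(unclass(...)), next to as.integer(), which is assumed here to be an equivalent way of getting attribute-free codes. The names codes_unclass, codes_vector, codes_integer, new_obs and p are illustrative and do not come from the original messages.

```r
f <- factor(rep(2^(0:5), 2))

codes_unclass <- unclass(f)              # integer codes, "levels" attribute kept
codes_vector  <- as.vector(unclass(f))   # workaround suggested in this thread
codes_integer <- as.integer(f)           # assumed equivalent: plain integer codes

str(attributes(codes_unclass))           # still carries the levels
print(is.null(attributes(codes_vector))) # no attributes left
print(identical(codes_vector, codes_integer))

# Refit with attribute-free codes, then predict for a single new observation.
worms <- data.frame(sex = gl(2, 6),
                    Dose = f,
                    moths = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16))
worms$doselin <- as.integer(worms$Dose)

worms.glm <- glm(cbind(moths, 20 - moths) ~ sex + doselin,
                 data = worms, family = binomial)

# sex is given as a factor with the fitted levels so that old and new data match.
new_obs <- data.frame(sex = factor("1", levels = levels(worms$sex)),
                      doselin = 6)
p <- predict(worms.glm, newdata = new_obs, type = "response")
print(p)
```

With plain codes in both the fitting data and the new data, the two sides have the same class, so the check that raised the error has nothing to object to.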
Cheers
Ross Darnell
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595