Variable labels (was Re: [R] Reading SAS version 8 data into
"Warnes, Gregory R" wrote:
From: fharrell@virginia.edu [mailto:fharrell@virginia.edu]
[snip]
I think your code is more complex than is really needed. The problem with defaulting to deparse(...) is that multiple function pass-throughs return the wrong result:
[snip]
So I don't see a large role for the deparse(...) method.
Actually one of the reasons that I included the deparse(...) method was to create a 'drop-in' substitute for the current calls to deparse(...) that are found throughout the code and that would be backward-compatible. Having such a call will immensely simplify changing existing code.
Thanks for your reply, Greg. The "complex" code I was referring to was the eval() and as.name() parts of your code. The deparse fallback can be handy, although I have handled it on a case-by-case basis. For example, in a high-level plotting function I'll retrieve label(an argument), and if that is empty I'll use deparse(substitute(argument)).
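The label-then-deparse fallback described here might look roughly like the sketch below. The function name axis_text is hypothetical and chosen only for illustration; label() is the simple accessor quoted later in this message.

```r
# Simple label accessor, as quoted later in this thread.
label <- function(x) {
  lab <- attr(x, "label")
  if (is.null(lab)) lab <- ""
  lab
}

# Hypothetical helper: prefer the stored label, otherwise fall back
# to the expression the caller typed for the argument.
axis_text <- function(argument) {
  lab <- label(argument)
  if (lab == "") lab <- deparse(substitute(argument))
  lab
}

age <- c(34, 51, 47)
axis_text(age)                      # no label yet, so the variable name is used

attr(age, "label") <- "Age (years)"
axis_text(age)                      # the stored label now takes precedence
```

The fallback is only as good as the expression the immediate caller supplied, which is why the label is tried first.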
It's true that the variable names don't get correctly handled once you go down a layer of function calls, but that applies AFAIK to the current deparse(...) method as well.
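A minimal sketch of the pass-through problem, using hypothetical helper names (var_name, wrapper, get_label, wrapper2) that are not part of any package. It also shows why a label stored as an attribute does not suffer from the same problem: it travels with the value rather than with the calling expression.

```r
# Hypothetical helpers, for illustration only.
var_name <- function(x) deparse(substitute(x))
wrapper  <- function(y) var_name(y)

blood.pressure <- c(120, 135, 110)
var_name(blood.pressure)  # "blood.pressure"
wrapper(blood.pressure)   # "y": only the wrapper's argument name is visible

# A label attribute is attached to the object itself, so it survives
# any number of pass-throughs.
get_label <- function(x) attr(x, "label")
wrapper2  <- function(y) get_label(y)

attr(blood.pressure, "label") <- "Systolic BP (mmHg)"
wrapper2(blood.pressure)  # "Systolic BP (mmHg)"
```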
Right. That's why I try to define labels early, when a data frame is being created (e.g., in sas.get).
The Hmisc library already defines label<-, so if you are willing to use another name for your version, that would prevent confusion among Hmisc users.
I don't think there will be a problem as long as the functions do exactly the same thing. To that end, perhaps we should agree on a common set of functions and keep them in sync. I expect that your functions are better tested than mine, since they've been available for some time.
Mine are simple:
label <- function(x) {
  lab <- attr(x, "label")
  if(is.null(lab)) lab <- ""
  lab
}
#From Bill Dunlap, StatSci 15Mar95:
"label<-" <- if(!.SV4.) function(x, value)
  structure(x, label=value,
            class=c('labelled',
                    attr(x,'class')[attr(x,'class')!='labelled'])) else
  function(x, value) { # 1Nov00 for Splus 5.x, 6.x
    attr(x,'label') <- value
    x
  }
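As a usage sketch, here is a round trip through the two functions. The excerpt does not show where the .SV4. flag is set, so this sketch sidesteps it and writes out only the non-SV4 branch, on the assumption that this is the one R would use.

```r
label <- function(x) {
  lab <- attr(x, "label")
  if (is.null(lab)) lab <- ""
  lab
}

# Non-SV4 branch only (assumed to be the one that applies under R):
# store the label and put "labelled" at the front of the class vector,
# without duplicating it if it is already there.
"label<-" <- function(x, value) {
  cls <- attr(x, "class")
  structure(x, label = value,
            class = c("labelled", cls[cls != "labelled"]))
}

x <- c(5.1, 4.9, 6.3)
label(x)                      # "" because nothing has been set yet
label(x) <- "Sepal length (cm)"
label(x)                      # "Sepal length (cm)"
class(x)                      # "labelled"

label(x) <- "Sepal length"    # relabelling does not stack the class
class(x)                      # still just "labelled"
```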
For non-SV4 systems (which include R) you can see above that
putting a label on a variable also adds the class "labelled".
This is really there to handle the subsetting problem,
but I would rather get rid of it if subsetting could
respect selected attributes.
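To make the subsetting problem concrete: default subsetting of a plain vector drops the label, and a class-specific method can put it back. The method below is a minimal sketch written for this illustration, not the Hmisc implementation.

```r
# Default subsetting drops attributes other than names, dim and dimnames.
dose <- structure(c(1.5, 2.5, 3.5), label = "Dose (mg)")
attr(dose[1:2], "label")   # NULL: the label is gone

# Sketch of a subsetting method for the "labelled" class that
# restores the label and the class after the default method has run.
"[.labelled" <- function(x, ...) {
  lab <- attr(x, "label")
  cls <- oldClass(x)
  out <- NextMethod("[")
  attr(out, "label") <- lab
  class(out) <- cls
  out
}

dose2 <- structure(c(1.5, 2.5, 3.5), label = "Dose (mg)", class = "labelled")
part <- dose2[2:3]
attr(part, "label")        # "Dose (mg)"
```

The cost is the one mentioned above: a method like this is needed for every class that should carry labels.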
The problem of labels being retained after you do arithmetic on the variable is a real one, and one I've put up with for a long time with S-Plus. It would be nice if R could prevent that but that is getting tricky.
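A short illustration of the arithmetic issue, with made-up data: the result of a computation inherits the label of its input even though the label no longer describes it, so it has to be corrected by hand.

```r
weight <- structure(c(60, 72, 85), label = "Weight (kg)")

pounds <- weight * 2.2046
attr(pounds, "label")        # still "Weight (kg)", which is now wrong

lw <- log(weight)
attr(lw, "label")            # the label is carried over here too

# Manual fix: relabel (or clear with NULL) after the computation.
pounds_fixed <- pounds
attr(pounds_fixed, "label") <- "Weight (lb)"
```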
What I've wanted more generally is the ability for the user to specify a vector of attribute names in options() that would be preserved upon subsetting. That way I wouldn't have to go to the trouble of writing local versions of [.factor, etc. that carry the 'label' attribute. In my usage, 'label's are always logically carried forward for subsetting.
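A rough sketch of how such a setting could behave. Both the option name keep.attrs and the helper subset_keep are hypothetical: R has no such option, and the real proposal would need the built-in subsetting code to consult it rather than a separate function.

```r
# Hypothetical option: attribute names to carry through subsetting.
options(keep.attrs = c("label", "units"))

# Hypothetical helper standing in for attribute-aware subsetting.
subset_keep <- function(x, i) {
  keep  <- getOption("keep.attrs", character(0))
  attrs <- attributes(x)
  saved <- attrs[intersect(names(attrs), keep)]
  out <- x[i]
  for (nm in names(saved)) attr(out, nm) <- saved[[nm]]
  out
}

sbp <- structure(c(120, 135, 110), label = "Systolic BP",
                 units = "mmHg", note = "scratch")
sub <- subset_keep(sbp, 2:3)
attributes(sub)   # only label and units are carried over; note is dropped
```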
It seems to be a good idea to preserve the labels during subset operations. What are the possible cons?
If the user or a package specifies the list of attribute names to preserve (I can only think of 'label' and 'units' right now) I don't see a downside. -Frank
-Greg
Frank E Harrell Jr Prof. of Biostatistics & Statistics Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat