Numeric class and sasxport.get
Sebastien Bihorel wrote:
Ok, just so as I get that straight, is the 'labelled' class something that you created in your package or a readily available class in base R?
It's something we added for the Hmisc package. Signing off, Frank
*Sebastien Bihorel, PharmD, PhD*
PKPD Scientist
Cognigen Corp
Email: sebastien.bihorel at cognigencorp.com
Phone: (716) 633-3463 ext. 323

Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
I also realized the flaw after testing the script on various datasets... Following up on your last note: 1- Is that the reason why the class of integer and regular numeric variables is solely "labelled" following sasxport.get?
Yes. R gurus might correct me, but just creating a numeric vector doesn't create a 'hard' class, and adding your own class attribute equal to 'numeric' or 'integer' might cause a problem downstream.
2- Can class be 'soft' for other 'kind' of variables?
Not that I can recall.
3- Would you anticipate the following wrapper function to generate incompatibilities with other R functions?
I'm going to beg off on that. I'm not enough of an expert on the impact of adding such classes. Frank
library(Hmisc)  # provides sasxport.get

SASxpt.get <- function(file, force.single = TRUE,
                       method = c('read.xport', 'dataload', 'csv'),
                       formats = NULL, allow = NULL,
                       out = NULL, keep = NULL, drop = NULL, as.is = 0.5, FUN = NULL) {
  method <- match.arg(method)  # resolve to a single method before passing it on
  foo <- sasxport.get(file = file, force.single = force.single,
                      method = method,
                      formats = formats, allow = allow, out = out, keep = keep,
                      drop = drop, as.is = as.is, FUN = FUN)
  # For each variable of class "labelled" (and only "labelled"), add
  # the native class as a second class attribute
  for (i in seq_along(foo)) {
    x <- foo[[i]]
    if (identical(class(x), "labelled"))
      class(foo[[i]]) <- c(class(x), class(unclass(x)))
  }
  return(foo)
}
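To make the loop's effect concrete without an actual SAS transport file, here is a minimal base-R sketch. The vector x is a hand-built stand-in for a numeric column as returned by sasxport.get (the label text is made up), so this illustrates the class manipulation only, not Hmisc's own behavior.

```r
# Hand-built stand-in for a numeric column returned by sasxport.get;
# the label text is invented for illustration.
x <- structure(1:3, label = "Subject ID", class = "labelled")

class(x)            # only "labelled"
class(unclass(x))   # implicit class of the underlying vector

# The step performed inside the wrapper's loop
x2 <- x
class(x2) <- c(class(x), class(unclass(x)))

class(x2)                # "labelled" followed by the native class
inherits(x2, "integer")  # the native class is now visible to inherits()
oldClass(1:3)            # NULL: a plain integer vector has no class attribute
oldClass(x2)             # the explicit class attribute now includes "integer"
```

The last two lines show the difference Frank alludes to: a plain integer vector has only an implicit class, whereas the modified column carries "integer" as an explicit attribute, which is the part that might interact with other functions.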
Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
Thanks a lot Frank, One last question, though. I was tempted to remove all attributes of my variables after the sasxport.get call using
foo <- sasxport.get(...)
foo <- as.data.frame(lapply(unclass(foo),as.vector))
Since I have never worked with objects of class 'labelled', I was wondering what I would lose by removing this attribute.
Not a good idea, for many reasons including dates and other types. And the labelled type is needed if you subset the data, in order to keep the labels. Note that your original issue is related to "class" being "soft" for integers and regular numerics:
x <- 1:3
attributes(x)
NULL
class(x)
[1] "integer"
x <- runif(3)
class(x)
[1] "numeric"
attributes(x)
NULL
Frank
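Frank's first point, that stripping attributes damages dates and other types, can be checked in base R alone. The sketch below applies as.vector, as in the proposed one-liner, to a Date, a factor, and a hand-built stand-in for a labelled numeric column (the values and label text are invented for illustration).

```r
# Illustrative values only; y imitates a labelled column without needing Hmisc.
d <- as.Date(c("2009-01-15", "2009-02-01"))
f <- factor(c("low", "high", "low"))
y <- structure(c(1.5, 2.5), label = "Dose (mg)", class = "labelled")

d_stripped <- as.vector(d)
f_stripped <- as.vector(f)
y_stripped <- as.vector(y)

class(d_stripped)       # no longer "Date": only the underlying day counts remain
class(f_stripped)       # no longer "factor": the levels structure is gone
attributes(y_stripped)  # the label attribute has been dropped
```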
Frank E Harrell Jr wrote:
Sebastien.Bihorel at cognigencorp.com wrote:
The problem is actually not related to a broken command but to an attempt at operational qualification of R. A few years ago, my company developed a set of scripts for the 'operational qualification' of Splus. We are switching to R, so I am currently trying to port the scripts to R. All Splus scripts imported SAS data using the importData function, which I replaced with sasxport.get. One particular script returns the class of each variable of the imported data frame; the output must match the expected values: numeric, factor, integer, etc... The R 'translation' with sasxport.get is thus problematic. If there is no easy tweak of the function, we will probably have to remove this script from our list of 'qualification' scripts. Although it would be nice
Then my advice is to write your own wrapper function for sasxport.get that takes its output, looks for labelled variables, and adds a new class of your choosing depending on properties of the variable, making sure that you write methods needed for that class (if any). Then test your new function, not sasxport.get explicitly. Frank
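As a lighter variation on this advice for a qualification script that only needs to report classes, one could leave the imported data untouched and compute the reported class separately. The sketch below is an assumption-laden illustration: native_class is a hypothetical helper name, and the data frame is a hand-built stand-in for sasxport.get output rather than a real import.

```r
# Hypothetical helper: report a variable's class with "labelled" ignored,
# falling back to the implicit class when "labelled" is the only class.
native_class <- function(x) {
  cls <- setdiff(class(x), "labelled")
  if (length(cls) == 0L) cls <- class(unclass(x))
  paste(cls, collapse = " ")
}

# Stand-in for an imported data frame (made-up columns and values)
df <- data.frame(id = 1:3, dose = c(0.5, 1, 2), sex = factor(c("F", "M", "F")))
class(df$id)   <- "labelled"
class(df$dose) <- "labelled"
class(df$sex)  <- c("labelled", "factor")

res <- vapply(df, native_class, character(1))
res
```

Because the columns themselves are never modified, this avoids the question of how an added 'integer' or 'numeric' class attribute behaves downstream; the trade-off is that the check no longer calls class() directly on the imported variables.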
Sebastien Bihorel wrote:
Frank, It is a non-issue for me if the variables of class "labelled" (and only "labelled") can only be numerical variables (integer or numeric). Sebastien
'labelled' can apply to any type of vector. I'm not clear on the problem this causes you. Please provide a command that is broken by this behavior. Frank
Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
Dear R-users, The sasxport.get function (from the Hmisc package) automatically defines the class of imported variables. I have noticed that the class of theoretically numeric variables is simply "labelled", although character variables might end up being defined as "labelled" "Date" or "labelled" "factor". Is there a way to tell sasxport.get to define numeric variables as "labelled" "integer" or "labelled" "numeric"?
Sebastien, If that would fix a problem you're having we could look into it. Otherwise I'd tend to leave well enough alone. Frank
Thank you Sebastien
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University