
Make sure a data frame has been "run through" a function

10 messages · Ista Zahn, David Winsemius, Charles C. Berry +2 more

#
Hello,

I would like to add something to a data frame that is 1) invisible to the
user, 2) has no side effects, and 3) I can test for in a following
function. Is this possible? I am exploring classes and attributes and I
have thought about using a list (but 1 and 2 not satisfied). Any help would
be greatly appreciated.

I did not provide a reproducible example because I see this as more of an R
language question, but I will be happy to make a toy example if that would
help.

I appreciate all of the help.

kindest regards,
#
It depends on what you mean by 1). If you mean "won't annoy the user" then
yes, e.g., add something to the class attribute. If 1) means "can't be
discovered by the user" then no (at least not easily). Anything you can see
they can see.
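
A minimal sketch of the class-attribute idea, assuming a made-up marker class called "checked_df" (the name is not from the thread). It shows that the marker can be tested with inherits() and that the user can see it just as easily with class():

```r
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

# Prepend a marker class; "checked_df" is a hypothetical name.
checked <- df
class(checked) <- c("checked_df", class(checked))

# The downstream test: TRUE only for the marked object.
inherits(checked, "checked_df")
inherits(df, "checked_df")

# Nothing is hidden: the marker shows up in the class vector.
class(checked)

# The object still behaves as a data frame.
is.data.frame(checked)
```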

Best,
Ista
On Feb 20, 2017 4:21 PM, "stephen sefick" <ssefick at gmail.com> wrote:

#
Yes, I mean "won't annoy the user": it will allow them to do anything they
need to do with a data frame (write to csv, etc.), but will allow me to test
for it in a downstream function of the analysis, so that the function can
stop and present an error. Adding something to the class attribute seems like
the right thing to do. With my clarification, do you think this seems like a
sensible thing to do? Thank you for all of the help.
kindest regards,

Stephen
On Mon, Feb 20, 2017 at 5:25 PM, Ista Zahn <istazahn at gmail.com> wrote:

            

  
    
#
On Mon, 20 Feb 2017, stephen sefick wrote:

            
Depends on exactly what you mean by `invisible' and `side effects'.

You can do this (but I am not necessarily recommending this):
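add_stuff <- function(x, ...) {  # header line lost in the archive; "add_stuff" is a placeholder name
    class(x) <- c("more.stuff", class(x))
    attr(x, "stuff") <- list(...)
    x
}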
And printing and model functions will be unaffected:
a b
1 1 a
2 2 b
3 3 c
$comment
[1] "wow"

$length
[1] "3 rows"
[1] "Component “call”: target, current do not match when deparsed"
And if you need some generics to take account of the "stuff" attribute, 
you can write the methods to do that.
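
As a sketch of what such a method might look like, here is a print method for the "more.stuff" class. The constructor name add_stuff and the three-row example data are assumptions, since the input lines of the session above did not survive in the archive:

```r
# Hypothetical constructor mirroring the two assignments shown in this message.
add_stuff <- function(x, ...) {
  class(x) <- c("more.stuff", class(x))  # extra class in front of "data.frame"
  attr(x, "stuff") <- list(...)          # arbitrary payload stored as an attribute
  x
}

# S3 print method: print the data frame as usual, then name the extras.
print.more.stuff <- function(x, ...) {
  NextMethod()  # falls through to print.data.frame
  cat("stuff:", paste(names(attr(x, "stuff")), collapse = ", "), "\n")
  invisible(x)
}

plain <- data.frame(a = 1:3, b = c("a", "b", "c"))
d <- add_stuff(plain, comment = "wow", length = "3 rows")
print(d)
```

Without such a method, print() falls back to the data.frame method and the attribute stays out of sight.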

---

Another solution is to put your data.frame in a package and then have 
other objects hold the 'stuff' stuff. Once your package is loaded or 
imported, the user will have access to the data in a way that might be 
said to be `invisible' in ordinary usage.

---

But seriously, you should say *why* you want to do this. There are 
probably excellent solutions that do not involve directly altering the 
data.frame and may not involve putting together a package.

HTH,

Chuck
#
Hello All,

I am writing a package. I would like to encourage the user to look at the
data to rectify errors with function A before utilizing function B to code
these data as binary. I thought about solving this problem by adding a
"flag" in the attributes that could be used downstream in B, and have a
function that adds this "flag" if the user is convinced that everything is
okay. This would allow the user to utilize their data as is, if error
checking is not necessary. Maybe I am overthinking this. Thanks again.
kindest regards,

Stephen
On Mon, Feb 20, 2017 at 6:24 PM, Charles C. Berry <ccberry at ucsd.edu> wrote:

            

  
    
#
Still not clear what is needed but there is an `attr<-` function. You might get what you want by having function A add an attribute which is then checked by B.
#
Sorry for not being clear. I have never used S3 methods before. Below is
some R code that sketches out my idea. Is this a sensible solution?

test_data <- data.frame(a=1:10, b=1:10, c=1:10)

functionA <- function(x, impossible_genotype){
    ##some data processing
    y <- x

    ##return S3 to be able to use impossible genotype later
    class(y) <- append(class(y),"genotypes")

    attr(y, "impossible_genotype") <- impossible_genotype

    return(y)
}

test_data_genotypes <- functionA(test_data, impossible_genotype="Ref")

functionB <- function(x){
    ##stop if not pre-processed with functionA
    if(sum(class(x)=="genotypes")!=1){
        stop("Need to pre-process data with functionA")
    }

    ##use this later in functionB to
    impossible_genotype <- attributes(x)$impossible_genotype

    alleles <- c("Ref", "Alt")

    coded_genotype <- alleles[alleles!=impossible_genotype]



    return(coded_genotype)
}

##stop if not pre-processed with functionA
functionB(test_data)

##processed with functionA
functionB(test_data_genotypes)

On Tue, Feb 21, 2017 at 6:41 AM, David Winsemius <dwinsemius at comcast.net>
wrote:

  
    
#
On Tue, 21 Feb 2017, stephen sefick wrote:

            
Sure. See comments (untested) inline.

Chuck
class(y) <- c("genotypes",class(y))
if (!inherits(x, "genotypes")) {
    stop("Need to pre-process data with functionA")
}


or in functionA you could skip the class()<- and just set the
"impossible_genotype" attribute to FALSE when there are none such.

Then here test

    if (is.null(attr(x, "impossible_genotype"))) {
        stop("Need to pre-process data with functionA")
    } else {
        return(alleles)
    }
impossible_genotype <- attr(x,"impossible_genotype")
maybe `!is.element(alleles,impossible_genotype)' is safer than `!='
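
One reason is.element() may be the safer choice: `!=` returns NA or an empty result when the stored value is NA or has length zero, while is.element() gives TRUE or FALSE for every allele. A small sketch reusing the alleles vector from functionB; the helper names and the NA and zero-length inputs are illustrative assumptions:

```r
alleles <- c("Ref", "Alt")

# Helper names are made up for this comparison.
keep_neq <- function(impossible) alleles[alleles != impossible]
keep_elt <- function(impossible) alleles[!is.element(alleles, impossible)]

keep_neq("Ref")            # "Alt"
keep_elt("Ref")            # "Alt"

keep_neq(NA_character_)    # NA NA: the comparison itself is NA
keep_elt(NA_character_)    # "Ref" "Alt"

keep_neq(character(0))     # character(0): nothing is kept
keep_elt(character(0))     # "Ref" "Alt"
```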
#
Stray attributes on data.frames may or may not survive some simple
operations on the data.frame.  E.g.,
[1] TRUE
[1] TRUE
[1] FALSE
[1] TRUE
[1] TRUE
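
The commands that produced the output above did not survive in the archive. A sketch of how one might run a similar check, assuming a made-up attribute named "checked" and an arbitrary choice of operations (not necessarily the ones used in the original message). Results can depend on the operation and on the R version, so run it rather than rely on memory:

```r
d <- data.frame(a = 1:3, b = c("x", "y", "z"))
attr(d, "checked") <- TRUE   # "checked" is a hypothetical attribute name

# TRUE if the attribute is still present on the result.
has_flag <- function(x) !is.null(attr(x, "checked", exact = TRUE))

results <- c(
  original      = has_flag(d),
  row_subset    = has_flag(d[1:2, ]),
  column_subset = has_flag(d["a"]),
  add_column    = has_flag(transform(d, c = a * 2)),
  via_matrix    = has_flag(as.data.frame(as.matrix(d))),
  merged        = has_flag(merge(d, data.frame(a = 1:3, z = 4:6)))
)
print(results)
```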

I don't know if this would be an issue in your case.  If it is, you
could subclass "data.frame" and define methods so that the operations
of interest preserve or remove the attribute in the way that you
desire.
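
A minimal sketch of the subclassing approach, assuming the "genotypes" class and "impossible_genotype" attribute used earlier in the thread. It defines only a `[` method; other operations (merge, rbind, and so on) would need their own methods if they matter:

```r
# Subsetting method for the subclass: delegate to [.data.frame,
# then re-attach the attribute when the result is still a data frame.
`[.genotypes` <- function(x, ...) {
  out <- NextMethod()
  if (is.data.frame(out)) {
    attr(out, "impossible_genotype") <- attr(x, "impossible_genotype", exact = TRUE)
  }
  out
}

g <- data.frame(a = 1:3, b = 4:6)
class(g) <- c("genotypes", class(g))
attr(g, "impossible_genotype") <- "Ref"

attr(g[1:2, ], "impossible_genotype")   # rows subset, attribute re-attached
attr(g["a"], "impossible_genotype")     # columns subset, attribute re-attached
```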

Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Tue, Feb 21, 2017 at 8:30 AM, Charles C. Berry <ccberry at ucsd.edu> wrote:
3 days later
#
Update: I have decided to make use of S4 in order to solve my problem. Are
there any particular resources that might be helpful? Thank you for all of
the help.
kindest regards,

Stephen
On Tue, Feb 21, 2017 at 10:52 AM, William Dunlap <wdunlap at tibco.com> wrote: