inefficient ifelse() ?

9 messages · Henrique Dallazuanna, Dennis Murphy, Thomas Lumley +3 more

#
dear R experts---

  t <- 1:30
  f <- function(t) { cat("f for", t, "\n"); return(2*t) }
  g <- function(t) { cat("g for", t, "\n"); return(3*t) }
  s <- ifelse( t%%2==0, g(t), f(t))

shows that the ifelse function actually evaluates both f() and g() for
all values first, and presumably then just picks left or right results
based on t%%2.  uggh... wouldn't it make more sense to evaluate only
the relevant parts of each vector and then reassemble them?

/iaw
----
Ivo Welch
#
thanks, Henrique.  did you mean

    as.vector(t(mapply(function(x, f)f(x), split(t, ((t %% 2)==0)),
list(f, g))))   ?

otherwise, you get a matrix.

it's a good solution, but unfortunately I don't think this can be used
to redefine ifelse(cond,ift,iff) in a way that is transparent.  the
ift and iff functions will always be evaluated before the function
call happens, even with lazy evaluation.  :-(

I still think that it makes sense to have a smarter vectorized %if% in
a vectorized language like R.  just my 5 cents.

/iaw

----
Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com)
On Tue, Mar 1, 2011 at 2:33 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
#
On Wed, Mar 2, 2011 at 9:36 AM, ivo welch <ivo.welch at gmail.com> wrote:
Ivo,

There is no guarantee in general that f(x[c(3,5,7)]) is the same as f(x)[c(3,5,7)].


      -thomas
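
A small sketch of the point above, using base R functions chosen only for illustration (sqrt and cumsum are not from the original message). For a purely element-wise function the order of subsetting and evaluation does not matter, but for a function that looks at the whole vector it does:

```r
x   <- c(5, 1, 4, 2, 8, 3, 7)
idx <- c(3, 5, 7)

# Element-wise function: subsetting before or after gives the same values.
elementwise_same <- identical(sqrt(x[idx]), sqrt(x)[idx])   # TRUE

# Function of the whole vector: the two orders disagree.
subset_first <- cumsum(x[idx])   # running sum of 4, 8, 7  -> 4 12 19
subset_last  <- cumsum(x)[idx]   # running sum of all of x -> 10 20 30
```

This is why an ifelse() that silently evaluated its arms on subsets could change results for some functions.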
#
An ifelse-like function that only evaluated
what was needed would be fine, but it would
have to be different from ifelse itself.  The
trick is to come up with a good parameterization.

E.g., how would it deal with things like
   ifelse(is.na(x), mean(x, na.rm=TRUE), x)
or
   ifelse(x>1, log(x), runif(length(x),-1,0)) 
or
   ifelse(x>1, log(x), -seq_along(x))
Would it reject such things?  Deciding that the
x in mean(x,na.rm=TRUE) should be replaced by
x[is.na(x)] would be wrong.  Deciding that
runif(length(x),-1,0) should be replaced by runif(sum(x<=1),-1,0)
seems a bit much to expect.  Replacing seq_along(x) with
seq_len(sum(x<=1)) is wrong.  It would be better to
parameterize the new function so it wouldn't have to
think about those cases.
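
To make the first case concrete, here is a minimal sketch with made-up data (the vector x and the name `naive` are assumptions for illustration). It compares ifelse() with a naive rewrite that evaluates the "yes" arm only where the test is TRUE:

```r
x <- c(1, NA, 3, NA, 8)

# ifelse() evaluates mean() on the whole vector, which is what is wanted.
full <- ifelse(is.na(x), mean(x, na.rm = TRUE), x)
full    # 1 4 3 4 8

# Naive "evaluate only the relevant part": mean() now sees nothing but NAs.
test  <- is.na(x)
naive <- x
naive[test] <- mean(x[test], na.rm = TRUE)
naive   # 1 NaN 3 NaN 8
```

The naive version replaces the missing values with NaN, because the mean of an empty set of values is NaN.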

Would you want it to depend only on a logical
vector or perhaps also on a factor (a vectorized
switch/case function)?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
yikes.  you are asking me too much.

thanks everybody for the information.  I learned something new.

my suggestion would be for the much smarter language designers (than
I) to offer us more or less blissfully ignorant users another
vector-related construct in R.  It could perhaps be named %if% %else%,
analogous to if else (with naming inspired by %in%, and with
evaluation only of relevant parts [just as if else for scalars]), with
different outcomes in some cases, but with the advantage of typically
evaluating only half as many conditions as the ifelse() vector
construct.  %if% %else% may work only in a subset of cases, but when
it does work, it would be nice to have.  it would probably be my first
"go-to" function, with ifelse() use only as a fallback.

of course, I now know how to fix my specific issue.  I was just
surprised that my first choice, ifelse(), was not as optimized as I
had thought.

best,

/iaw
On Tue, Mar 1, 2011 at 5:13 PM, William Dunlap <wdunlap at tibco.com> wrote:
#
Try using [<- more, instead of ifelse().  I rarely find
myself really using both of the calls to [<- that ifelse
makes.  E.g., I use
   x[x==999] <- NA
instead of
   x <- ifelse(x==999, NA, x)
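
A quick check, on made-up data, that the two forms give the same result when x itself has no missing values (the names y1 and y2 are only for illustration):

```r
x <- c(3, 999, 7, 999, 1)

# ifelse() builds the result from both arms.
y1 <- ifelse(x == 999, NA, x)

# Subassignment changes only the elements that match.
y2 <- x
y2[y2 == 999] <- NA

same <- isTRUE(all.equal(y1, y2))
```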

But if you find yourself using ifelse in a certain way often,
try writing a function that only allows that case.  E.g.,
   transform2 <- function(x, test, ifTrueFunction, ifFalseFunction)
   {
       # x and test must have the same length; both arms must be functions
       stopifnot(is.logical(test), length(x) == length(test),
                 is.function(ifTrueFunction), is.function(ifFalseFunction))
       retval <- x # assume output is of same type as input
       retval[test] <- ifTrueFunction(x[test])
       retval[!test] <- ifFalseFunction(x[!test])
       retval
   }
   transform2(x, x<=0, f, g)
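
A usage sketch, applied to the example from the start of the thread. So that it runs on its own, it repeats the helper with the length check written as `==`; f and g are the original functions without the cat() calls, and the names s1 and s2 are only for illustration:

```r
transform2 <- function(x, test, ifTrueFunction, ifFalseFunction) {
    stopifnot(is.logical(test), length(x) == length(test),
              is.function(ifTrueFunction), is.function(ifFalseFunction))
    retval <- x
    retval[test]  <- ifTrueFunction(x[test])    # called on the TRUE part only
    retval[!test] <- ifFalseFunction(x[!test])  # called on the FALSE part only
    retval
}

f <- function(t) 2 * t
g <- function(t) 3 * t
t <- 1:30

s1 <- ifelse(t %% 2 == 0, g(t), f(t))     # both g and f see all 30 values
s2 <- transform2(t, t %% 2 == 0, g, f)    # g and f each see 15 values
same <- isTRUE(all.equal(s1, s2))
```

Note that a test vector containing NA would need extra handling here, since subassignment with NA in a logical subscript fails when the replacement has more than one element.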

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Hi Ivo,
It might be useful for you to study the examples below.
The key from a programming language point of view is that functions like ifelse are functions of whole vectors, not elements of vectors.  You either evaluate an argument or you don't; you don't evaluate only part of an argument.  (Somebody correct me if I'm wrong.)
As you can see from the examples, if there are no TRUEs or no FALSEs in the condition, the corresponding arms are not evaluated, but if there are some of each, both must be evaluated.  This is a property of the entire condition vector.  You can see all this if you type ifelse (not ?ifelse, just ifelse) and look at the definition.
If you want to operate on elements of vectors, you need to use subsetting, e.g.:
s = rep(NA,length(t)); b=t%%2==0; s[b]=g(t[b]); s[!b]=f(t[!b])
I agree that it might be counterintuitive for a beginner, but so is 0! = 0^0 = 1, and both follow from first principles (e.g. n! = n(n-1)!).
"Counterintuitive" is not the same as "incorrect", and "correct" is not the same as "efficient".  :)
HTH
Rex
g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
 [1]  2  6  6 12 10 18 14 24 18 30 22 36 26 42 30 48 34 54 38 60 42 66 46 72 50
[26] 78 54 84 58 90
g for 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60
 [1]   6  12  18  24  30  36  42  48  54  60  66  72  78  84  90  96 102 108 114
[20] 120 126 132 138 144 150 156 162 168 174 180
f for 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61
 [1]   6  10  14  18  22  26  30  34  38  42  46  50  54  58  62  66  70  74  78
[20]  82  86  90  94  98 102 106 110 114 118 122
g for 1 2 NA 1 2 NA 1 2 NA
f for 1 2 NA 1 2 NA 1 2 NA
[1]  2  6 NA  2  6 NA  2  6 NA
[1] NA NA NA NA NA NA NA NA NA NA
g for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
f for 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
[1]  3  4  6 12
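
The calls that produced the output above are not included in the message. The following is only a sketch of how the same behaviour can be observed, not a reconstruction of the original examples; the helper `arms_evaluated` and the shortened cat() messages are assumptions made for illustration:

```r
f <- function(t) { cat("f called\n"); 2 * t }
g <- function(t) { cat("g called\n"); 3 * t }

# Hypothetical helper: returns the messages printed while ifelse() runs,
# i.e. which of the two arms were actually evaluated for a given t.
arms_evaluated <- function(t) {
    capture.output(invisible(ifelse(t %% 2 == 0, g(t), f(t))))
}

mixed <- arms_evaluated(1:30)                  # "g called" "f called"
evens <- arms_evaluated(seq(2, 60, by = 2))    # "g called"
odds  <- arms_evaluated(seq(3, 61, by = 2))    # "f called"
allna <- arms_evaluated(rep(NA_real_, 10))     # character(0): neither arm
```

An arm is skipped only when the condition contains no element that selects it; as soon as there is one, that arm is evaluated for the whole vector.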