which.na

4 messages · Martin Maechler, Duncan Murdoch, Stavros Macrakis +1 more

#
    CAPE> ?is.na
    CAPE> x <- c(NA, 3, 4, 5, NA)
    CAPE> which(is.na(x))
    CAPE> [1] 1 5

well, of course.

But note that  which.na(.)  could be implemented to be
faster than the above (because it needs much less memory),
notably when  x  is large and has only a few NAs.
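
To make the memory point concrete, here is a minimal sketch. Note that which.na below is a hypothetical R-level stand-in for the S+ function (base R has no function of that name), and it still builds the intermediate logical vector; it only illustrates the size gap that a C-level version could avoid.

```r
# Hypothetical R-level stand-in for S+'s which.na(); not part of base R.
which.na <- function(x) which(is.na(x))

x <- c(NA, 3, 4, 5, NA)
which.na(x)

# is.na(big) is a logical vector as long as 'big', even though the
# final answer holds a single index.
big <- c(NA, numeric(1e6))
object.size(is.na(big))      # grows with length(big)
object.size(which.na(big))   # small: one integer index
```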

But this now has *REALLY*  changed into a topic belonging to
R-devel, not R-help

--> hence I've diverted the thread to there.

I have recently entertained similar thoughts, i.e. wished for R
functions that compute  
	  which( function_returning_logical(..) )
and also
	    any( function_returning_logical(..) )

directly {at the .Internal, i.e. C, level} instead of first
constructing the potentially huge logical vector.

For what functions should this happen?
I agree that  is.na() is one of them; but then, why not
    is.nan() /  is.finite() 
too?

Instead of defining a slew of such functions  
which.foo(), which.bar(), any.foo(), any.bar(), etc.,
it would be nice to have a generic interface such as

   whichApply(x, is.na)
   whichApply(x, is.nan)

   anyApply(x, is.na) 

where internally, for some functions {in a given internal
table}, the fast shortcut would be used, and for others the
interface would be equivalent to  which( thatFunction( x ) )
{or  any( thatFunction( x ) ), respectively}.
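
A rough R-level sketch of that interface follows. It is not the C-level shortcut described above and has no internal table; it approximates the memory saving by applying the predicate chunk by chunk, so only a chunk-sized logical vector exists at any time. The names whichApply and anyApply come from the proposal; the 'chunk' argument is an assumption added for this sketch, and the code targets ordinary (non-long) unnamed vectors.

```r
# Sketch only: whichApply() and anyApply() are proposed names, not
# existing R functions. 'chunk' is an assumed tuning parameter.
whichApply <- function(x, FUN, chunk = 100000L) {
  n <- length(x)
  hits <- vector("list", ceiling(n / chunk))
  k <- 0L
  start <- 1L
  while (start <= n) {
    end <- min(start + chunk - 1L, n)
    k <- k + 1L
    # Indices within the chunk, shifted back to positions in x.
    hits[[k]] <- which(FUN(x[start:end])) + (start - 1L)
    start <- end + 1L
  }
  as.integer(unlist(hits))   # also drops any names
}

anyApply <- function(x, FUN, chunk = 100000L) {
  n <- length(x)
  start <- 1L
  while (start <= n) {
    end <- min(start + chunk - 1L, n)
    # Stop at the first chunk containing a hit. Unlike any(FUN(x)),
    # NA results from FUN are treated as "no hit" here.
    if (any(FUN(x[start:end]), na.rm = TRUE)) return(TRUE)
    start <- end + 1L
  }
  FALSE
}

x <- c(NA, 3, 4, 5, NA)
whichApply(x, is.na)
anyApply(x, is.nan)
```

The chunked loop trades some R-level overhead for bounded memory; a real implementation would presumably do the scan in C without any intermediate vector.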

Martin Maechler, ETH Zurich (and R Core team)


    CAPE> Charles Annis, P.E.

    CAPE> Charles.Annis at StatisticalEngineering.com
    CAPE> phone: 561-352-9699
    CAPE> eFax:  614-455-3265
    CAPE> http://www.StatisticalEngineering.com
 

    CAPE> -----Original Message-----
    CAPE> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
    CAPE> Behalf Of Santosh
    CAPE> Sent: Thursday, March 19, 2009 9:37 PM
    CAPE> To: r-help at r-project.org
    CAPE> Subject: [R] which.na

    CAPE> Hi R- users

    CAPE> I was wondering if there is any function equivalent to which.na used in S+?

    CAPE> Thanks much in advance!

    CAPE> Regards,
    CAPE> Santosh

    CAPE> [[alternative HTML version deleted]]

    CAPE> ______________________________________________
    CAPE> R-help at r-project.org mailing list
    CAPE> https://stat.ethz.ch/mailman/listinfo/r-help
    CAPE> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    CAPE> and provide commented, minimal, self-contained, reproducible code.

#
Martin Maechler wrote:
A couple of different interfaces to the same idea:

  - which() could recognize a few thatFunction(x) calls before
    evaluating them, and do the fast internal version.  (This is hard
    because it needs to know that the user hasn't redefined is.na,
    etc.  Probably not worth doing.)

  - which() could gain a new arg, so that

       which(x, test=is.na)

    would do as your whichApply() does.
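
Both interfaces can be sketched at R level as below. This is an illustration under assumptions, not how which() works: which2 and isBaseIsNaCall are hypothetical names, and the "fast path" is only a placeholder. The helper shows why the first option is awkward: it has to inspect the unevaluated argument and then confirm that is.na, as seen from the caller, is still the base function.

```r
# Hypothetical helper: TRUE only if 'expr' is a call to is.na() and
# is.na, looked up from 'env', is still base::is.na.
isBaseIsNaCall <- function(expr, env) {
  is.call(expr) &&
    identical(expr[[1L]], as.name("is.na")) &&
    identical(eval(expr[[1L]], env), base::is.na)
}

# Hypothetical wrapper; the real which() has no 'test' argument.
which2 <- function(x, test = NULL) {
  # Second interface: the caller passes the predicate explicitly.
  if (!is.null(test)) return(which(test(x)))
  # First interface: peek at the unevaluated argument.
  if (isBaseIsNaCall(substitute(x), parent.frame())) {
    # A C-level shortcut would be dispatched here; this sketch
    # simply falls through to the ordinary evaluation below.
  }
  which(x)
}

v <- c(NA, 3, 4, 5, NA)
which2(is.na(v))
which2(v, test = is.na)
```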

Duncan Murdoch
#
There are many other useful extensions one might imagine along these lines.

For instance, we could have an argument for stopping the 'which'
calculation at the first result (or the first N results), which is
often useful (cf. any).

But I think it would be much cleaner for things like this to be done
in a compiler.  I suppose another possibility would be to have
low-level interfaces like which(a, test=xxx, count=nnn) which could be
used for optimization or for a source-to-source optimizer.
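
As a sketch of what such a low-level interface might look like, assuming the which(a, test=xxx, count=nnn) shape above: whichFirst is a hypothetical name, 'chunk' is an added assumption, and the scan is done in R chunks rather than in C. It stops as soon as 'count' hits have been collected.

```r
# Hypothetical function; 'test' and 'count' follow the signature
# floated above, 'chunk' is an assumed tuning parameter.
whichFirst <- function(a, test, count = 1L, chunk = 100000L) {
  n <- length(a)
  found <- integer(0)
  start <- 1L
  # Keep scanning only while more hits are still needed.
  while (start <= n && length(found) < count) {
    end <- min(start + chunk - 1L, n)
    found <- c(found, which(test(a[start:end])) + (start - 1L))
    start <- end + 1L
  }
  # The last chunk may have yielded more than 'count' hits in total.
  found[seq_len(min(count, length(found)))]
}

a <- c(NA, 3, 4, 5, NA)
whichFirst(a, is.na)               # first NA only
whichFirst(a, is.na, count = 2L)   # first two NAs
```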

              -s
On Fri, Mar 20, 2009 at 6:28 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
#
If you are considering emulating the S/S+ which.na, which.nan,
etc., family of functions to save space, you might also consider
the related anyMissing function (I don't know why it
isn't any.is.na to match the pattern).  anyMissing(x) returns
the same result as any(is.na(x)) or length(which.na(x)) > 0 but
stops scanning the input when it sees the first NA, saving time
and space in the common idiom of stopifnot(!any(is.na(x))).
(BTW, should R have a stopif() function to avoid that double negative?)
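
For illustration, a stopif() along those lines might look like the sketch below. stopif is a hypothetical helper, not a base R function, and its message format is an assumption. The usage line relies on anyNA(), which base R gained in version 3.1.0 and which fills the role described for anyMissing.

```r
# Hypothetical helper; base R provides stopifnot() but no stopif().
stopif <- function(cond) {
  if (isTRUE(cond)) {
    # Report the offending expression as the caller wrote it.
    expr <- paste(deparse(substitute(cond)), collapse = " ")
    stop(expr, " is TRUE", call. = FALSE)
  }
  invisible(NULL)
}

x <- c(3, 4, 5)
stopif(anyNA(x))   # passes silently: x has no NAs
```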

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com