There is pmin and pmax each taking na.rm, how about psum?

5 messages · ONKELINX, Thierry, Matt Dowle, Hadley Wickham

#
Hi,

Please consider the following :

x = c(1,3,NA,5)
y = c(2,NA,4,1)

min(x,y,na.rm=TRUE)    # ok
[1] 1
max(x,y,na.rm=TRUE)    # ok
[1] 5
sum(x,y,na.rm=TRUE)    # ok
[1] 16

pmin(x,y,na.rm=TRUE)   # ok
[1] 1 3 4 1
pmax(x,y,na.rm=TRUE)   # ok
[1] 2 3 4 5
psum(x,y,na.rm=TRUE)
[1] 3 3 4 6                             # expected result
Error: could not find function "psum"   # actual result

I realise that + is already like psum, but what about NA?

x+y
[1]  3 NA NA  6        # can't supply `na.rm=TRUE` to `+`

Is there a case to add psum? Or have I missed something?

This question survived when I asked on Stack Overflow :
http://stackoverflow.com/questions/13123638/there-is-pmin-and-pmax-each-taking-na-rm-why-no-psum

And a search of the archives found that Gabor has suggested it too, as
an aside :
http://r.789695.n4.nabble.com/How-to-do-it-without-for-loops-tp794745p794750.html

If someone from R core is willing to sponsor the idea, I am willing to
write, test and submit the code for psum, implemented in a very similar
fashion to pmin and pmax.  Or perhaps it exists already in a package
somewhere (I searched but didn't find it).

Matthew
#
Why don't you make a matrix and use colSums or rowSums?

x = c(1,3,NA,5)
y = c(2,NA,4,1)
colSums(rbind(x, y), na.rm = TRUE)


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be

-----Original message-----
From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On behalf of Matthew Dowle
Sent: Tuesday 30 October 2012 12:03
To: r-devel at r-project.org
Subject: [Rd] There is pmin and pmax each taking na.rm, how about psum?


#
Because that's inconsistent with pmin and pmax when two NAs are summed.

x = c(1,3,NA,NA,5)
y = c(2,NA,4,NA,1)
colSums(rbind(x, y), na.rm = TRUE)
[1] 3 3 4 0 6    # actual
[1] 3 3 4 NA 6   # desired

and it would be less convenient/natural (and slower) than a psum which
would call .Internal(psum(na.rm,...)) in the same way as pmin and pmax.
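
For illustration, the desired semantics can be sketched in plain R. This is only a minimal sketch: the function below is hypothetical, is not part of base R, and is built on cbind and rowSums rather than on the .Internal route proposed above. It assumes at least one argument is supplied and relies on cbind to recycle shorter vectors.

```r
# Hypothetical pure-R psum (not in base R): parallel sum with na.rm.
psum <- function(..., na.rm = FALSE) {
  m <- cbind(...)                      # one column per argument, recycled
  s <- rowSums(m, na.rm = na.rm)       # element-wise sum across arguments
  if (na.rm) {
    # Like pmin/pmax: a position stays NA when every input there is NA.
    s[rowSums(!is.na(m)) == 0] <- NA_real_
  }
  s
}

x <- c(1, 3, NA, NA, 5)
y <- c(2, NA, 4, NA, 1)
psum(x, y, na.rm = TRUE)   # 3 3 4 NA 6
psum(x, y)                 # 3 NA NA NA 6, same as x + y
```

The only difference from the colSums(rbind(x, y), na.rm = TRUE) call is the extra step that restores NA where all inputs are missing.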
#
If psum, then why not pdiff (-), pprod (*) and precip (/) ?  And
similarly, what about equivalent functions for ^, %%, %/%, &, and | ?

Hadley
#
Not pdiff because i) psum(x,-y,na.rm=TRUE) would do that and ii) diff is
quite unlike -. Yes, pprod too, but not pdiv (or precip) because
pprod(x,y^-1,na.rm=TRUE) would cover that.
I like the suggestion for the other operators, but they would not be as
useful as psum and pprod. It would probably be going too far to add those
too. Plus in ?groupGeneric, under section 3, there are 7 functions listed :

    min, max, sum, prod, range, all, any

The p* family would be extended to 2 more of those 7. It wouldn't make
sense for prange, pall or pany.  So, just psum and pprod. ^, %%, %/%, &,
and | are listed in section 2, Group "Ops", and seem different to sum and
prod in that sense.
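
As a rough sketch of what pprod might look like under the same NA rule as pmin and pmax: the function below is hypothetical, not part of base R, and written in plain R purely for illustration. It assumes at least one argument is supplied.

```r
# Hypothetical pure-R pprod (not in base R): parallel product with na.rm.
pprod <- function(..., na.rm = FALSE) {
  m <- cbind(...)                        # one column per argument, recycled
  all_na <- rowSums(!is.na(m)) == 0      # positions where every input is NA
  if (na.rm) m[is.na(m)] <- 1            # 1 is the multiplicative identity
  p <- apply(m, 1, prod)                 # element-wise product across arguments
  if (na.rm) p[all_na] <- NA_real_       # all-NA positions stay NA
  p
}

x <- c(1, 3, NA, NA, 5)
y <- c(2, NA, 4, NA, 1)
pprod(x, y, na.rm = TRUE)      # 2 3 4 NA 5
pprod(x, y^-1, na.rm = TRUE)   # 0.5 3 0.25 NA 5, division via the reciprocal
```

Note that in the division case a missing x is treated as 1, so position 3 returns 1/y rather than NA; whether that is the wanted behaviour for a "pdiv" is a design choice.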