Skip to content

prod(0, 1:1000) ; 0 * Inf etc

6 messages · Martin Maechler, Mathieu Ribatet, Bill Dunlap +2 more

#
I think most of us would expect  prod(0:1000)  to return 0, and ...


... it does.

However, many of us also expect  
  prod(x1, x2)    to be equivalent to
  prod(c(x1,x2))
the same as we can expect that for min(), max(), sum() and such
members of the "Summary" group.

Consequently, prod(0, 1:1000) should also return 0,
but as you see, it gives  NaN  which may be a bit puzzling...
The explanation is relatively simple:

1) The internal implementation uses

     prod(x1, x2) := prod(x1) * prod(x2)

   which in this case is

2)  0 * Inf    and that is not 0, but NaN;

  not necessarily because we would want that, but I think just
  because the underlying C math library does so.


I would personally like to change both behaviors,
but am currently only proposing to change  prod() such as to
return 0 in such cases.
This would be S-plus compatible, in case that matters.

Opinions?

Martin Maechler, ETH Zurich & R-core
#
I definitely do agree with you.
Basically, I see two different ways to proceed:

   1. one could first check if there are any 0 in the vector and then
      return 0 without computing the product
   2. or convert prod(x1, x2, x3) in prod(c(x1, x2, x3))

Both approaches are similar except for the pathological case where one 
vector x1 is really huge. An example:

prod(0, 1:1e25)

   1. will give 0
   2. will give an error stating that the vector c(0, 1:1e255) will be
      too large - in length I mean

Consequently, my opinion will be that approach 1 will be better - and 
maybe faster because it'll avoide useless computations.

Best,
Mathieu

Martin Maechler a ?crit :

  
    
#
On Mon, 21 Apr 2008, Mathieu Ribatet wrote:

            
That would fail for prod(0,Inf), which should return the same
thing as 0*Inf=NaN.  Similarly for prod(0,NA) and prod(0,NaN).
Scanning for all these things might well be slower than just
doing the multiplications.  Scanning also means that 0 is treated
more commutatively than other numbers.
c() can convert values of classy objects in undesirable ways.
E.g.,
   > now<-Sys.time()
   > min(now-file.info(".")$mtime, now-file.info("..")$mtime)
   Time difference of 3787.759 secs
   > min(c(now-file.info(".")$mtime, now-file.info("..")$mtime))
   [1] 1.052155

This may be considered a bug in c(), at least for class
"timediff" (and  "factor" and "ordered"), but c() removes
attributes.
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."
#
G'day Martin,

On Mon, 21 Apr 2008 18:40:43 +0200
Martin Maechler <maechler at stat.math.ethz.ch> wrote:

            
Many may also expect that prod(x) and prod(rev(x)) are equivalent.
Unfortunately, this does not hold in finite precision arithmetic:
[1] 0
[1] NA
[1] NA

It might be better to educate useRs on finite precision arithmetic
than trying to catch such situations.  Note, I am saying "better", not
"easier". :-)  

Cheers,
	
	Berwin
#
Interesting problem this.

My take on it would be that the "true" value depends
on how fast the "0" approaches 0 and how fast the "n"
approaches infinity.

Consider

f1 <- function(n){prod(1/n , seq_len(n))}
f2 <- function(n){prod(1/factorial(n) , seq_len(n))}
f3 <- function(n){prod(n^(-n) , seq_len(n))}

All these are equal to prod( "small number" , 1:"big number")

but applying these functions to an increasing sequence gives different
behaviour:

 > sapply(c(10,100,1000),f1)
[1]  3.628800e+05 9.332622e+155           Inf
 > sapply(c(10,100,1000),f2)
[1]   1   1 NaN
 > sapply(c(10,100,1000),f3)
[1] 3.628800e-04 9.332622e-43          NaN
 >


f1() tends to infinity, f2() tends to 1, and f3() tends to zero.

Figuring out the appropriate limit in cases like this is a job
for a symbolic system.

I would say the original behaviour is desirable.


rksh
On 22 Apr 2008, at 02:43, Bill Dunlap wrote:

            
--
Robin Hankin
Uncertainty Analyst and Neutral Theorist,
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743
#
Yes you're right when vectors have special values - oupppsss. However, 
for the "more commutatively behaviour of 0", to me this is not a problem 
- just a computational trick. We all know easily how to multiply by 11. 
This is just the same. What we want is that the operator is still 
commutative which (of course) is still the case.

However, you're probably right with the time consuming approach of 
checking 0 (as well as special values) within the vector. The question 
is what is the main objective? Fast computations but could lead to 
miss-calculations - in rarely cases. Or run codes a little bit slower 
while returning the appropriate values - a little bit more often.

I'm not sure users do so many times products of really huge length 
vectors though - but this is only my point of view.
Best,
Mathieu