unexpected results in comparison (x == y)

5 messages · Peter Tillmann, David Winsemius, Meyners, Michael, LAUSANNE, AppliedMathematics +2 more

#
Dear readers of the list,

I have a problem with a comparison of two values from a vector. The comparison
yields FALSE but should be TRUE. I have checked mode(), length() and
attributes(). See the following code (R 2.10.0):
-----------------------------------------------
# data vector of 66 double data
X =
matrix(c(41.41,38.1,39.22,38.1,47.29,46.82,82.46,90.11,45.24,45.74,49.96,53.40,38.20,42.65,45.41,47.92,39.82,42.02,48.17,49.47,39.67,43.89,47.55,50.05,35.75,37.41,46.13,53.64,52.18,56.30,45.15,47.13,41.57,39.08,43.39,44.73,49.38,47.00,45.67,50.53,41.08,44.22,49.28,47.83,49.48,46.04,48.37,47.00,33.96,36.30,49.40,46.44,24.40,24.79,41.55,46.26,37.43,39.88,40.63,38.64,49.92,50.19,47.88,48.61,43.73,44.18),ncol=1)
i = dim(X)

# calculating pairwise differences for each entry in X
Y = matrix(rep(0.0,i[1]*i[1]),ncol=i[1])
for (j in 1:i[1]) {
	Y[j,] = X - X[j,]
}

# getting pairwise absolute differences to vector Z and selecting only
# (xj - xk), omitting (xk - xj)
# and (xj - xj) i.e. diagonal vector
Z = rep(0, ((i[1]*i[1])-i[1])/2)
nn <- 1
for (j in 1:i[1]) {
for (k in 1:i[1]) {
	if (j > k)  {
		Z[nn] <- abs(Y[j,k])
		nn <- nn + 1
	}
}
}
nn <- nn - 1

# sorting Z ascending to ZZ, to determine double entries with same
# difference
ii <- 0
ZZ <- sort(Z)

# here the problem occurs:
ZZ[1:10]
ZZ[4]
ZZ[5]
ZZ[4] == ZZ[5]
mode(ZZ[4])
mode(ZZ[5])
length(ZZ[4])
length(ZZ[5])
attributes(ZZ[4])
attributes(ZZ[5])
-----------------------------------------------

I get:
[1] 0.00 0.00 0.01 0.02 0.02 0.02 0.04 0.04 0.04 0.05
[1] 0.02
[1] 0.02
[1] FALSE
[1] "numeric"
[1] "numeric"
[1] 1
[1] 1
NULL
NULL

Which is ok, except for ZZ[4] == ZZ[5].

Can someone please give me advice on where to look?
In real-world situations the original vector (X) will contain up to 100
entries.
#
I'm guessing that it's in the FAQ, although I have not committed its  
number to memory.

Try using all.equal() instead of "=="
FAQ 7.31:
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
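
A minimal sketch of the difference between "==" and all.equal(), using the
familiar 0.1 + 0.2 example rather than the data from the post. Note that
all.equal() returns a character description instead of FALSE when the values
differ, so it is usually wrapped in isTRUE(). The helper nearly_equal() is a
hypothetical name introduced here for element-wise use; it is not part of
base R, and its tolerance is an arbitrary choice.

```r
x <- 0.1 + 0.2
y <- 0.3

# Bit-for-bit comparison: fails because x and y differ in the last bit
exact <- (x == y)

# Comparison within a tolerance (default is about 1.5e-8)
approx <- isTRUE(all.equal(x, y))

print(exact)
print(approx)

# all.equal() compares whole objects; for an element-wise test on
# vectors, a small helper with an explicit tolerance can be used
# (hypothetical helper, tolerance chosen for illustration only)
nearly_equal <- function(a, b, tol = 1e-9) abs(a - b) < tol

res <- nearly_equal(c(0.02, 0.04), c(0.02 + 7e-15, 0.05))
print(res)
```

With the data from the post, the same idea would read
isTRUE(all.equal(ZZ[4], ZZ[5])).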
#
On 11/04/2009 11:49 PM, Peter Tillmann wrote:
Hi Peter,
This looks like there is a small difference in the two values that is 
not appearing when printed.

ZZ[4]-ZZ[5]
[1] -7.105427e-15
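
To see such hidden differences directly, it can help to print more digits
than the default of 7. The sketch below uses three pairs of values taken from
X in the original post whose differences all print as 0.02; which of them
turn out bit-identical depends on the platform's arithmetic, so no output is
shown here.

```r
# Three differences from the posted data that all display as 0.02
d <- c(49.40 - 49.38, 41.57 - 41.55, 37.43 - 37.41)

# Default printing rounds to 7 significant digits and hides the noise
print(d)

# 17 significant digits are enough to distinguish any two doubles
print(d, digits = 17)
shown <- sprintf("%.17g", d)
print(shown)

# Pairwise gaps between the three results; any nonzero entry
# explains a FALSE from "=="
gaps <- outer(d, d, "-")
print(gaps)
```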

Jim
#
Peter Tillmann wrote:
This looks like a side effect of roundoff error due to the finite precision
of numbers stored by a computer.  There was a large discussion of this issue
on the list about two weeks ago; you can read the posts at: 

  http://old.nabble.com/Rounding-error-in-seq%28...%29-ts25686630.html

They may provide some insight.  The punchline is that:

  0.02 = 2 * (1/100)

And 1/100 is not representable in the form:

  m * 2^n

for any integers m and n.  Therefore the number 0.02 cannot be represented
exactly in binary floating point arithmetic, and neither can most of the
two-decimal values in X.  Each subtraction starts from slightly different
stored values, so two results that both print as 0.02 can differ in their
last bits, as you just observed.  It's one of those realities of computer
science that always catches us non-computer scientists by surprise.
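
One way to sidestep the problem, sketched here under the assumption that
every entry of X is recorded to at most two decimal places, is to convert the
data to integer hundredths before subtracting. Integer arithmetic is exact,
so "==" and table() then behave as expected. The example uses only six of the
posted values, and the variable names are illustrative.

```r
# A few values taken from X in the original post
x <- c(49.40, 49.38, 41.57, 41.55, 37.43, 37.41)

# Convert to integer hundredths; round() removes the representation noise
h <- as.integer(round(x * 100))

# All pairwise absolute differences, keeping each pair once
# (strict lower triangle, so no diagonal and no mirrored pairs)
D <- abs(outer(h, h, "-"))
z <- sort(D[lower.tri(D)])

# Exact comparison is now safe: find differences that occur more than once
counts <- table(z)
repeated <- as.integer(names(counts)[counts > 1])

# Report in the original units
print(repeated / 100)
```

If the data are not limited to a fixed number of decimals, comparing with a
tolerance (for example via all.equal()) is the more general approach.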

Hope this helps!

-Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University