The behavior of match function
On Fri, 2005-10-21 at 11:19 +0800, ronggui wrote:
x<-1:10
y<-x+1e-20
x
[1] 1 2 3 4 5 6 7 8 9 10
y
[1] 1 2 3 4 5 6 7 8 9 10
identical(x,y)
[1] FALSE
match(x,y)
[1] 1 2 3 4 5 6 7 8 9 10
What's the principle the function uses to determine if x matches y? Thank you!
In this case, you are comparing x (an integer) with y (a numeric):
x <- 1:10
y <- x + 1e-20
class(x)
[1] "integer"
class(y)
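[1] "numeric"
That difference in storage type is why identical(x, y) returns FALSE even though the values are numerically equal. Now: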
x == y
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
works element-wise, because the differences between the values (1e-20) are too small to change the stored doubles; they are far less than:
.Machine$double.eps
[1] 2.220446e-16
which is the smallest positive float such that 1 plus that value != 1. See ?.Machine for more information on that. For the same reason:
match(x, y)
[1] 1 2 3 4 5 6 7 8 9 10
x %in% y
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
both work element-wise. However, if you used the following for 'y':
y <- x + 1e-15
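As a rough illustration of why the size of the increment matters, the sketch below adds a few small values to 1 and checks whether the sum is still equal to 1. The variable eps is only a convenience name introduced here, and the results in the comments assume ordinary IEEE 754 double arithmetic:

```r
eps <- .Machine$double.eps

1 + eps == 1      # FALSE: eps is just large enough to change the sum
1 + eps / 2 == 1  # TRUE:  half of eps is rounded away
1 + 1e-20 == 1    # TRUE:  far below eps, so it is lost entirely
1 + 1e-15 == 1    # FALSE: above eps, so it survives
```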
Note the results now:
x == y
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
because you now have differences that are greater than .Machine$double.eps. In general, however, when comparing floats, you will want to use all.equal():
all.equal(x, y)
[1] TRUE
which compares the values within a specified level of tolerance (by default about 1.5e-8). See ?all.equal for more information and, importantly, note the use of isTRUE() as well:
isTRUE(all.equal(x, y))
[1] TRUE
Using isTRUE() in this way will result in a single TRUE or FALSE result depending upon the comparison. If the differences happen to be outside the tolerance level, you get something like the following:
y <- x + 1e-5
all.equal(x, y)
[1] "Mean relative difference: 1.818182e-06"
which does not help if all you want is a single boolean result. Thus the use of isTRUE() helps here:
isTRUE(all.equal(x, y))
[1] FALSE
You should also read R FAQ 7.31 "Why doesn't R think these numbers are equal?".

HTH,
Marc Schwartz
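
Since match() and %in% compare values exactly (after coercing both arguments to a common type), matching within a tolerance needs something extra. One possible approach is a small helper along the following lines. This is only a sketch: match_tol and its tol argument are hypothetical names, not part of base R. Note that it uses an absolute difference, whereas all.equal() reports a relative one; the default value is simply borrowed from the all.equal() default.

```r
# Hypothetical helper (not in base R): for each element of `x`, return the
# position of the first element of `table` within `tol` of it (absolute
# difference), or NA if there is none.
match_tol <- function(x, table, tol = 1.5e-8) {
  vapply(x, function(xi) {
    hits <- which(abs(table - xi) <= tol)
    if (length(hits) > 0L) hits[1L] else NA_integer_
  }, integer(1))
}

x <- 1:10
y <- x + 1e-5

match(x, y)                  # exact comparison: every element is NA
match_tol(x, y, tol = 1e-4)  # tolerant comparison: positions 1 to 10
```

The helper compares every element of x against all of table, so it is only meant for short vectors like the ones in this thread.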