The behavior of match function - R-help

ronggui · 2005-10-21T03:19:48Z

> x <- 1:10
> y <- x + 1e-20
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> y
 [1]  1  2  3  4  5  6  7  8  9 10
> identical(x,y)
[1] FALSE
> match(x,y)
 [1]  1  2  3  4  5  6  7  8  9 10

What principle does the function use to determine whether x matches y?

Thank you!

2005-10-21
------
Department of Sociology
Fudan University

My new mail address is ronggui.huang at gmail.com
Blog: http://sociology.yculblog.com

Marc Schwartz

Thu, Oct 20, 2005 9:42 PM

On Fri, 2005-10-21 at 11:19 +0800, ronggui wrote:

In this case, you are comparing x (an integer) with y (a numeric):

> class(x)
[1] "integer"

> class(y)
[1] "numeric"
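
The type difference is also why identical() returns FALSE even though the printed values look the same. A minimal sketch, assuming x and y were created as an integer vector and that vector plus 1e-20 (consistent with the description in this reply, but an assumption here):

```r
# Assumed reconstruction of the two vectors under discussion.
x <- 1:10          # integer vector
y <- x + 1e-20     # adding a double promotes the result to double

typeof(x)                     # storage type of x
typeof(y)                     # storage type of y

# identical() compares the objects exactly, including their storage type.
identical(x, y)

# Once both sides are double, the values turn out to be exactly the same,
# because the 1e-20 offset is lost to rounding.
identical(as.numeric(x), y)
```

So the FALSE from identical() says nothing about the values; it only reflects integer versus double storage.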


Now:

> x == y
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

works element-wise, because the differences between the values (1e-20)
are less than:

> .Machine$double.eps
[1] 2.220446e-16

which is the smallest positive float such that 1 plus that value != 1.
See ?.Machine for more information on that.
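
The meaning of .Machine$double.eps can be checked directly. This is only an illustrative sketch, assuming standard IEEE double-precision arithmetic; the variable name eps is introduced here for convenience:

```r
eps <- .Machine$double.eps

# Adding eps to 1 produces a different double...
one_plus_eps_differs <- (1 + eps) != 1

# ...but adding half of eps rounds straight back to 1.
half_eps_is_lost <- (1 + eps / 2) == 1

# An offset of 1e-20 is far smaller still, so it vanishes at both ends
# of the 1..10 range used in this thread.
tiny_lost_at_1  <- (1 + 1e-20) == 1
tiny_lost_at_10 <- (10 + 1e-20) == 10

c(one_plus_eps_differs, half_eps_is_lost, tiny_lost_at_1, tiny_lost_at_10)
```

Note that double.eps is defined relative to 1; the spacing between adjacent doubles grows with the magnitude of the value, so the absolute size of an offset that gets lost is larger near 10 than near 1.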

For the same reason:

> match(x, y)
 [1]  1  2  3  4  5  6  7  8  9 10

> x %in% y
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

both work element-wise.
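
To answer the original question more directly: as ?match describes, x and table are coerced to a common type before matching, and the values are then compared for exact equality with no tolerance. A small sketch, again assuming x is 1:10 and y is x + 1e-20:

```r
x <- 1:10
y <- x + 1e-20      # assumed definition; the offset is lost to rounding

# match() returns, for each element of x, the position of its first
# exact match in y (or NA if there is none).
pos <- match(x, y)
pos

# %in% is the logical counterpart: TRUE wherever match() found something.
found <- x %in% y
found
```

Because the integer x is coerced to double and y holds exactly the doubles 1 through 10, every element matches at its own position.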


However, if you used the following for 'y':

Note the results now:

> x == y
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

because you now have differences that are greater than
.Machine$double.eps.
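
The archived copy does not preserve the value that was used for 'y' here. As a purely hypothetical stand-in, an offset of 1e-10 is large enough to survive rounding and shows the same effect (the name y2 is introduced only for this sketch):

```r
x  <- 1:10
y2 <- x + 1e-10     # hypothetical offset, not taken from the original post

# Exact comparison now fails for every element...
eq <- x == y2
eq

# ...and match() finds no exact counterpart, so it returns NA throughout.
pos2 <- match(x, y2)
pos2
```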


In general, however, when comparing floats, you will want to use
all.equal():

> all.equal(x, y)
[1] TRUE

which compares the values within a specified level of tolerance.
See ?all.equal for more information and importantly note the use of
isTRUE() as well:

> isTRUE(all.equal(x, y))
[1] TRUE

Using isTRUE() in this way will result in a single TRUE or FALSE result
depending upon the comparison. If the differences happen to be outside
the tolerance level, you get something like the following:

> all.equal(x, y)
[1] "Mean relative  difference: 1.818182e-06"

which does not help if all you want is a single boolean result. Thus the
use of isTRUE() helps here:

> isTRUE(all.equal(x, y))
[1] FALSE
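
The all.equal() and isTRUE() pattern described above can be sketched as follows. The offsets 1e-10 and 1e-5 and the names y_close and y_far are illustrative assumptions, chosen to fall inside and outside the default tolerance (about 1.5e-8):

```r
x       <- 1:10
y_close <- x + 1e-10   # assumed offset, inside the default tolerance
y_far   <- x + 1e-5    # assumed offset, outside the default tolerance

# all.equal() returns TRUE when the values agree within tolerance...
res_close <- all.equal(x, y_close)

# ...and otherwise a character string describing the difference, not FALSE.
res_far <- all.equal(x, y_far)

# isTRUE() collapses either outcome to a single logical value,
# which is what if() conditions need.
ok_close <- isTRUE(res_close)
ok_far   <- isTRUE(res_far)

# The tolerance can be loosened explicitly when that is appropriate.
ok_far_loose <- isTRUE(all.equal(x, y_far, tolerance = 1e-4))

c(ok_close, ok_far, ok_far_loose)
```

Wrapping the call in isTRUE() matters because a character string inside if() would raise an error rather than being treated as FALSE.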


You should also read R FAQ 7.31 "Why doesn't R think these numbers are
equal?".
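
The kind of surprise that FAQ entry deals with can be reproduced with a square root, where the result of a calculation is not exactly the number one expects. A short sketch:

```r
a <- sqrt(2)

# Mathematically a * a is 2, but the double-precision result is slightly off.
exact_equal <- (a * a) == 2
a * a - 2           # a tiny nonzero difference

# Comparing within a tolerance gives the answer most people want.
near_equal <- isTRUE(all.equal(a * a, 2))

c(exact_equal, near_equal)
```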

HTH,

Marc Schwartz