
Unexpected R Behavior: Adding 4 to Large Numbers/IDs Containing Current Year

3 messages · Christopher T. Moore, Peter Langfelder, David Winsemius

#
Hello,

I have encountered some unexpected behavior in R that seems to occur as a 
result of having the current year embedded in a number:
[1] 4.125569e+16 4.125570e+16 4.125571e+16
[1] 41255689815201104 41255699815201104 41255709815201104
[1] "41255689815201104" "41255699815201104" "41255709815201104"
IDs.character <- c("41255689815201100", "41255699815201100", 
"41255709815201100")
[1] "41255689815201100" "41255699815201100" "41255709815201100"
[1] 41255689815201104 41255699815201104 41255709815201104
#Is this problem occurring because the current year is embedded in the 
number?
[1] 41255689815201104 41255699815201000 41255709815201200
Am I doing something wrong? Any insight on how I can avoid the problem of R 
changing numbers on its own? Are others able to replicate this example? Is 
this some kind of bug? Am I right that this problem is occurring because 
the current year is embedded in the number? I discovered this when trying 
to merge two data sets, one with IDs stored as numbers and one with IDs as 
characters. I have replicated this in Windows XP with R 2.12 and Windows 7 
with R 2.13 (both 32- and 64-bit versions).

Thanks,
Chris
#
You seem to be running into the limits of double precision: your IDs
have 17 significant digits, which is more than a double-precision
floating-point number can hold without rounding error (integers are
only guaranteed to be exact up to 2^53, which is about 16 digits).
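
A minimal sketch of that limit, using only base R. The variable names (limit, id, spacing) are illustrative and not from the original post, and the exact digits shown for the ID may depend on the platform:

```r
# Doubles have a 53-bit significand, so whole numbers are exact only up to 2^53.
limit <- 2^53
print(sprintf("%.0f", limit))   # 2^53 written out in full
print(limit + 1 == limit)       # TRUE: 2^53 + 1 cannot be stored exactly

# One of the IDs from the post, entered as a number.
id <- 41255689815201100

# Gap between neighbouring doubles at this magnitude (illustrative calculation).
spacing <- 2^(floor(log2(id)) - 52)
print(spacing)

# Adding 1 is lost entirely, because it is less than half the gap.
print(id + 1 == id)

# The value actually stored; the original post shows it ending in ...104.
print(sprintf("%.0f", id))
```

The IDs lie between 2^55 and 2^56, where representable doubles are 8 apart, so a value ending in ...100 has to be stored as a nearby multiple of 8.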

Since you are using these numbers as IDs, simply keep them as
character strings throughout your code, and nothing will ever change.
Or shorten the IDs by a few digits and your IDs will be safe again.
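
A rough sketch of the keep-them-as-character approach, assuming the IDs arrive in a CSV file. The column names (id, x, y) and the toy data are made up for illustration; only the three ID strings come from the original post:

```r
# The IDs from the post, kept as character strings from the start.
ids_chr <- c("41255689815201100", "41255699815201100", "41255709815201100")

# When reading a file, ask for the ID column as character so it is never
# converted to a double. The inline text stands in for a real file here.
csv_text <- paste0("id,x\n", paste(ids_chr, 1:3, sep = ",", collapse = "\n"), "\n")
left <- read.csv(text = csv_text, colClasses = c(id = "character"))

# A second (hypothetical) data set that also stores the IDs as character.
right <- data.frame(id = ids_chr, y = c("a", "b", "c"),
                    stringsAsFactors = FALSE)

# Merging on a character key compares the strings exactly.
merged <- merge(left, right, by = "id")
print(merged)
```

With both key columns stored as character, the merge never passes the IDs through floating point, so the digits come out as they went in.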

HTH,

Peter
On Wed, Jun 29, 2011 at 11:29 AM, Christopher T. Moore <moor0554 at umn.edu> wrote:
#
On Jun 29, 2011, at 2:29 PM, Christopher T. Moore wrote:

            
No, that is not the explanation.
41255689815201100  > 2*10^9
[1] TRUE

So you may think you are working with integers, but you are in fact
working with floating-point numbers. See the R-FAQ
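
A short check along those lines, using base R only (the names id and as_int are illustrative):

```r
# A bare numeric literal in R is a double, not an integer.
id <- 41255689815201100
print(is.integer(id))   # FALSE
print(is.double(id))    # TRUE

# R's integer type is 32-bit, so its largest value is far below these IDs.
print(.Machine$integer.max)
print(id > .Machine$integer.max)

# Forcing the value into an integer does not help: it is out of range,
# so R returns NA (with a warning, suppressed here).
as_int <- suppressWarnings(as.integer(id))
print(as_int)
```

So there is no integer type in base R that could hold these IDs exactly; they are either doubles, with the rounding that implies, or character strings.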