
Inverse of FAQ 7.31.

7 messages · Rolf Turner, Chandra Salgado Kent, ONKELINX, Thierry +3 more

#
Why does R think these numbers ***are*** equal?

In a somewhat bizarre set of circumstances I calculated

     x0 <- 0.03580067
     x1 <- 0.03474075
     y0 <- 0.4918823
     y1 <- 0.4474461
     dx <- x1 - x0
     dy <- y1 - y0
     xx <- (x0 + x1)/2
     yy <- (y0 + y1)/2
     chk <- yy*dx - xx*dy + x0*dy - y0*dx

If you think about it ***very*** carefully ( :-) ) you'll see that ``chk'' ought to be zero.

Blow me down, R gets 0.  Exactly.  To as many significant digits/decimal places as I can get it to print out.

But .... I wrote a wee function in C to do the *same* calculation, dyn.load()-ed it, and called it with .C().  And I got -1.248844e-19.

This is of course zero, to all floating point arithmetic intents and purposes.  But if I name the result returned by my call to .C() ``xxx'' and ask

     xxx >= 0

I get FALSE, whereas ``chk >= 0'' returns TRUE (as does ``chk <= 0'', of course).
(And inside my C function, the comparison ``xxx >= 0'' yields ``false'' as well.)
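For what it's worth, the behaviour is reproducible at the prompt (assuming IEEE-754 double arithmetic; a C routine whose compiler keeps intermediates in 80-bit registers can round differently):

```r
# Reproduce the calculation entirely in R: every intermediate is stored
# back into a 64-bit double, so no extended-precision bits survive
# between steps.
x0 <- 0.03580067; x1 <- 0.03474075
y0 <- 0.4918823;  y1 <- 0.4474461
dx <- x1 - x0; dy <- y1 - y0
xx <- (x0 + x1) / 2
yy <- (y0 + y1) / 2

chk <- yy * dx - xx * dy + x0 * dy - y0 * dx
print(chk, digits = 17)
chk >= 0 && chk <= 0   # TRUE: chk comes out exactly zero here
```

Algebraically, chk = (yy - y0)*dx - (xx - x0)*dy = (dy/2)*dx - (dx/2)*dy = 0; in double arithmetic the roundings happen to cancel for these particular values.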

I was vaguely thinking that raw R arithmetic would be equivalent to C arithmetic.  (Isn't R written in C?)

Can someone explain to me how it is that R (magically) gets it exactly right, whereas a call to .C() gives the sort of ``approximately right'' answer that one might usually expect?  I know that R Core is ***good***, but even they can't make C do infinite precision arithmetic. :-)

This is really just idle curiosity --- I realize that this phenomenon is one that I'll simply have to live with.  But if I can get some deeper insight as to why it occurs, well, that would be nice.

     cheers,

         Rolf Turner
#
Dear Chandra,

You're on the wrong track.  You don't need for loops, as you can do this vectorised.

as.numeric(interaction(data$Groups, data$Dates, drop = TRUE))
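Applied to the example data from the original post (a quick check; putting Groups first makes the group vary fastest within each date, which is what gives the requested numbering here):

```r
# Example data from Chandra's original message
Dates  <- c("12/10/2010", "12/10/2010", "12/10/2010",
            "13/10/2010", "13/10/2010", "13/10/2010")
Groups <- c("A", "B", "B", "A", "B", "C")
data   <- data.frame(Dates, Groups)

# drop = TRUE discards combinations that never occur (here C on 12/10)
# before the remaining levels are numbered 1, 2, 3, ...
data$ID <- as.numeric(interaction(data$Groups, data$Dates, drop = TRUE))
data$ID   # 1 2 2 3 4 5
```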

Best regards,

Thierry
#
On Aug 2, 2011, at 08:02, Rolf Turner wrote:
I think the long and the short of it is that R lost a couple of bits of precision that C retained.  This sort of thing happens if R stores things into 64-bit floating point objects while C keeps them in 80-bit CPU registers.

In general, floating point calculations do not obey the laws of math, for example the associative law (i.e., (a+b)-c ?= a+(b-c), especially if b and c are large and nearly equal), so any reordering of expressions by the compiler may give a slightly different result.
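A minimal illustration of the associativity point (the particular constants are incidental; any operands whose roundings change under regrouping will do):

```r
# Regrouping a floating-point sum changes which roundings occur.
lhs <- (0.1 + 0.2) - 0.3   # 0.1 + 0.2 rounds before 0.3 is subtracted
rhs <- 0.1 + (0.2 - 0.3)   # 0.2 - 0.3 is grouped first instead
lhs == rhs                 # FALSE
c(lhs, rhs)                # both tiny, but not the same tiny number
```

Both sides are within a couple of ulps of zero, which is exactly the scale of the -1.248844e-19 discrepancy in the .C() result.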
#
How about this?  Starting from the unique Dates/Groups combinations:

     Dates        Groups
[1,] "12/10/2010" "A"
[2,] "12/10/2010" "B"
[3,] "13/10/2010" "A"
[4,] "13/10/2010" "B"
[5,] "13/10/2010" "C"

Number them 1 through 5:

       Dates Groups id
1 12/10/2010      A  1
2 12/10/2010      B  2
3 13/10/2010      A  3
4 13/10/2010      B  4
5 13/10/2010      C  5

Then merge the ids back onto the full data frame:

       Dates Groups id
1 12/10/2010      A  1
2 12/10/2010      B  2
3 12/10/2010      B  2
4 13/10/2010      A  3
5 13/10/2010      B  4
6 13/10/2010      C  5
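The code that produced the output above was not included in the message; something along these lines reproduces it (a reconstruction, not necessarily David's actual code):

```r
# Example data from the original post
Dates  <- c("12/10/2010", "12/10/2010", "12/10/2010",
            "13/10/2010", "13/10/2010", "13/10/2010")
Groups <- c("A", "B", "B", "A", "B", "C")
data   <- data.frame(Dates, Groups)

# Table of distinct Dates/Groups combinations, numbered 1..n
key    <- unique(data[, c("Dates", "Groups")])
key$id <- seq_len(nrow(key))

# Merge the ids back onto the full data frame
# (merge() sorts by the "by" columns, giving the posted ordering)
merged <- merge(data, key, by = c("Dates", "Groups"))
merged$id   # 1 2 2 3 4 5
```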

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Chandra Salgado Kent
Sent: Tuesday, August 02, 2011 2:12 AM
To: r-help at r-project.org
Subject: [R] Loops to assign a unique ID to a column

Dear R help,

 

I am fairly new in data management and programming in R, and am trying to
write what is probably a simple loop, but am not having any luck. I have a
dataframe with something like the following (but much bigger):

 

Dates<-c("12/10/2010","12/10/2010","12/10/2010","13/10/2010", "13/10/2010",
"13/10/2010")

Groups<-c("A","B","B","A","B","C")

data<-data.frame(Dates, Groups)

 

I would like to create a new column in the dataframe, and give each distinct
date by group a unique identifying number starting with 1, so that the
resulting column would look something like:

 

ID<-c(1,2,2,3,4,5)

 

The loop that I have started to write is something like this (but doesn't
work!):

 

data$ID<-as.number(c()) 

for(i in unique(data$Dates)){

  for(j in unique(data$Groups)){ data$ID[i,j]<-i

  i<-i+1

  }

}

 

Am I on the right track?

 

Any help on this is much appreciated!

 

Chandra



______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Whoa!

1. First and most important, there is very likely no reason you need
to do this. R can handle multiple groupings automatically in fitting
and plotting without creating artificial labels of the sort you appear
to want to create. Please read an "Intro to R" and/or get help to see
how.

2. The "solution" offered below is unnecessarily convoluted. Here is a
simpler and faster one:

z <- within(z, indx <- as.numeric(interaction(Dates, Groups,
                                              drop = TRUE, lex.order = TRUE)))


Explanation:

interaction() produces all possible combinations of the individual groupings; drop = TRUE throws away any unused combinations; lex.order = TRUE lexicographically orders the levels as you indicated.  ?interaction for details.

By default, the result of the above is a factor, which as.numeric() converts to the numeric codes used in factor representations.  ?factor

Finally, within() interprets and makes changes within z.  The changed result is then assigned back to z so that it is not lost.  ?within
Cheers,
Bert
On Tue, Aug 2, 2011 at 8:36 AM, David L Carlson <dcarlson at tamu.edu> wrote:
#
Thanks to Peter Dalgaard and to Baptiste Auguie (off-list) for the
insights they provided.

     cheers,

         Rolf Turner