Subsetting multiple rows of a data frame at once

3 messages · William Dunlap, arun

#
Hi,

carbon.fit = expand.grid(list(x=seq(0, 5, 0.01), y=seq(0, 5, 0.01)))
dim(carbon.fit)
#[1] 251001      2


xtNew <- sprintf("%.2f", xt)
ytNew <- sprintf("%.2f", yt)
carbon.fit[] <- lapply(carbon.fit, function(x) sprintf("%.2f", x))
res <- do.call(rbind, lapply(seq_along(xtNew), function(i) subset(carbon.fit, x == xtNew[i] & y == ytNew[i])))
nrow(res)
#[1] 28
res
#          x    y
#12631  1.05 0.25
#5296   2.85 0.10
#45431  3.40 0.90
#12951  4.25 0.25
#52631  0.25 1.05
#85476  3.05 1.70
#103076 3.70 2.05
#145311 0.20 2.90
#117766 0.30 2.35
#130331 0.70 2.60
#127861 1.05 2.55
#107836 1.20 2.15
#137916 1.40 2.75
#102896 1.90 2.05
#135541 2.70 2.70
#113051 3.25 2.25
#128111 3.55 2.55
#103166 4.60 2.05
#183071 2.05 3.65
#153021 2.15 3.05
#150671 3.70 3.00
#175836 4.85 3.50
#188366 4.90 3.75
#243146 1.60 4.85
#225696 2.45 4.50
#225771 3.20 4.50
#168226 3.90 3.35
#245936 4.45 4.90
A.K.
#
You are running into the problem that two different computational methods that give
the same result when applied to real numbers often give different results when applied
to 64-bit floating point numbers.  (In your case you expect seq(0,5,.01) to contain, e.g.,
the floating point number generated by parsing the string "3.05".)  Hence x==y is not true
when you expect it to be.  Here is where your 18 came from:
   R> table(xt %in% carbon.fit$x, yt %in% carbon.fit$y)
          
           FALSE TRUE
     FALSE     1    6
     TRUE      3   18
Round your numbers to the nearest 10^-10 and you get
  > table(round(xt,10) %in% round(carbon.fit$x,10), round(yt,10) %in% round(carbon.fit$y,10))
        
         TRUE
    TRUE   28
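
The same effect can be reproduced on a much smaller grid. This is only a minimal sketch in base R; the names `grid` and `target` are made up for illustration and do not come from the original data:

```r
# Small illustrative grid, built the same way as the axes of carbon.fit.
grid <- seq(0, 1, by = 0.1)
target <- 0.3   # the double obtained by parsing the literal "0.3"

# seq() computes the fourth element as 0 + 3 * 0.1, which is a different
# double from the parsed 0.3, so an exact lookup misses it.
exact_hit <- target %in% grid

# Rounding both sides to 10 decimal places removes the last-bit difference.
rounded_hit <- round(target, 10) %in% round(grid, 10)

# all.equal() compares with a tolerance and is another common option.
tolerant_hit <- isTRUE(all.equal(grid[4], target))

print(c(exact = exact_hit, rounded = rounded_hit, tolerant = tolerant_hit))
```

Here the exact lookup fails while the rounded and tolerant comparisons succeed, which is the same pattern as the 18 versus 28 matches above.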

By the way, you may prefer using the merge() function rather than the do.call(rbind,lapply(...))
business.  I think the following call to merge will do about what you want (the row names differ -
if they are important it is possible to get them with some minor trickery):
    merge(data.frame(x=xt,y=yt), carbon.fit)
(You still want to round your numbers as before.)
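
A small sketch of the merge() approach with the rounding applied first. The objects `grid_df`, `xt_small` and `yt_small` are hypothetical stand-ins for carbon.fit, xt and yt, chosen only so the example runs on its own:

```r
# Hypothetical small stand-ins for carbon.fit, xt and yt.
grid_df <- expand.grid(x = seq(0, 1, by = 0.1), y = seq(0, 1, by = 0.1))
xt_small <- c(0.3, 0.7)
yt_small <- c(0.6, 0.3)

# Round the key columns on both sides before matching.
grid_df$x <- round(grid_df$x, 10)
grid_df$y <- round(grid_df$y, 10)
wanted <- data.frame(x = round(xt_small, 10), y = round(yt_small, 10))

# merge() joins on the columns the two data frames share (x and y here)
# and, by default, keeps only the rows present in both.
matched <- merge(wanted, grid_df)
print(matched)
```

One row is returned per requested (x, y) pair, and the result gets fresh row names rather than those of the grid.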

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Hi Anika,
merge() is a better solution.

To get the row.names intact, you could do:
carbon.fit<- within(carbon.fit,{x<-round(x,10);y<- round(y,10)}) #Using Bill's solution

dat1<- data.frame(x=round(xt,10),y=round(yt,10))
carbon.fit1<- data.frame(carbon.fit,rNames=row.names(carbon.fit),stringsAsFactors=FALSE) #changed here
res1 <- merge(dat1, carbon.fit1, by = c("x", "y"))
row.names(res1) <- res1[, 3]
res1 <- res1[, -3]
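
The same row-name trick, sketched on a small grid so it can be run without the original xt and yt. The names `grid_df`, `wanted` and `res_small` are illustrative only, and the helper column is dropped by name rather than by position, which is a small variation on the code above:

```r
# Hypothetical small grid and target points, already rounded.
grid_df <- expand.grid(x = round(seq(0, 1, by = 0.1), 10),
                       y = round(seq(0, 1, by = 0.1), 10))
wanted <- data.frame(x = round(c(0.3, 0.7), 10),
                     y = round(c(0.6, 0.3), 10))

# Carry the original row names along as an ordinary column,
# because merge() does not preserve them.
grid_df$rNames <- row.names(grid_df)

res_small <- merge(wanted, grid_df, by = c("x", "y"))

# Restore the saved row names, then drop the helper column by name.
row.names(res_small) <- res_small$rNames
res_small$rNames <- NULL
print(res_small)
```

Dropping the column by name avoids depending on it being the third column of the merged result.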
A.K.



----- Original Message -----
From: William Dunlap <wdunlap at tibco.com>
To: arun <smartpink111 at yahoo.com>; Shaun ? Anika <pro_patto at hotmail.com>
Cc: R help <r-help at r-project.org>
Sent: Thursday, July 4, 2013 8:02 PM
Subject: RE: [R] Subsetting multiple rows of a data frame at once