
comparing two tables

On Oct 25, 2011, at 6:42 AM, Assa Yeroslaviz wrote:

            
rd.txt <- function(txt, header=TRUE, ...) {
      con <- textConnection(txt)
      on.exit(close(con))   # close only the connection opened here
      read.table(con, header=header, ...) }
# Data input
  genetable <- rd.txt("name     chr     start     end     str     accession     Length
  gen1     4     646752     646838     +     MI0005806     86
  gen12     2L     243035     243141     -     MI0005821     106
  gen3     2L     159838     159928     +     MI0005813     90
  gen7     2L     1831685     1831799     -     MI0011290     114
  gen4     2L     2737568     2737661     +     MI0017696     93")
  loctable <- rd.txt("Chr     Start     End     length
  4     136532     138654     2122
  3     139870     141970     2100
  2L     157838     158440     602
  X     160834     162966     2132
  4     204040     208536     4496")

# Helper function
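  inregion <- function(vec, locs) {
         # as.numeric() matters: apply() over a data frame passes each row as character
         s <- as.numeric(vec[["start"]]); e <- as.numeric(vec[["end"]])
         any( apply(locs, 1, function(x) s > x[1] & e <= x[2]) ) }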
# Test the function
  inregion(genetable[2, ], loctable[, c("Start", "End")])
# [1] FALSE

  apply(genetable, 1, function(x) inregion(x, loctable[, c("Start", "End")]) )
#[1] FALSE FALSE FALSE FALSE FALSE

The logical vector can be used to extract rows from genetable, but it
seems pointless to offer code that produces an empty data frame.

(Wouldn't it have been more sensible to offer a test case that had a
combination that satisfied your requirements?)
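
For illustration only, here is a minimal sketch of such a test case. The data are hypothetical: the third gene's coordinates are invented so that it falls inside one region. `inregion2` is a name made up for this example; it applies the same start/end test as above, but to numeric arguments, so it can be called per row index without going through apply().

```r
# Hypothetical data: the third gene is placed inside the second region
genes <- data.frame(name  = c("gen1", "gen12", "gen3"),
                    start = c(646752, 243035, 157900),
                    end   = c(646838, 243141, 157990))
locs  <- data.frame(Start = c(136532, 157838, 204040),
                    End   = c(138654, 158440, 208536))

# Same test as inregion(): start > Start and end <= End for at least one region
inregion2 <- function(start, end, locs) any(start > locs$Start & end <= locs$End)

# Loop over row indices so the coordinate columns stay numeric
hit <- sapply(seq_len(nrow(genes)),
              function(i) inregion2(genes$start[i], genes$end[i], locs))

# Use the logical vector to extract the matching rows
genes[hit, ]
```

With these made-up numbers only the gen3 row is returned.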

I'm guessing that this facility is already implemented in one or
more Bioconductor functions.
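
Short of a Bioconductor solution, a base-R sketch is possible with merge(). It rests on an assumption that the code above does not make: that a gene should only be compared with regions on the same chromosome. The tables here are cut-down, hypothetical versions of the ones in the post (gen1 and gen3 were moved so that something matches), and `pairs` and `inside` are names invented for the example.

```r
# Hypothetical cut-down tables; column names follow the post
genetable <- data.frame(name  = c("gen1", "gen12", "gen3"),
                        chr   = c("4", "2L", "2L"),
                        start = c(137000, 243035, 157900),
                        end   = c(137086, 243141, 157990),
                        stringsAsFactors = FALSE)
loctable <- data.frame(Chr   = c("4", "3", "2L"),
                       Start = c(136532, 139870, 157838),
                       End   = c(138654, 141970, 158440),
                       stringsAsFactors = FALSE)

# Pair every gene with every region on the same chromosome (assumption)
pairs <- merge(genetable, loctable, by.x = "chr", by.y = "Chr")

# Keep the pairs where the gene lies inside the region (same test as inregion())
inside <- pairs[pairs$start > pairs$Start & pairs$end <= pairs$End, ]
inside$name
```

Because merge() builds all same-chromosome pairs before filtering, this may get large for big tables; it is meant as a readable starting point rather than an efficient one.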