Skip to content
Prev 43 / 1559 Next

dbSetDataMappings with DBI.RODBC

Hi,
David Kane <David Kane wrote:
Unfortunately, yes (AFAIK).
I believe RODBC imports data from the DBMS as character
data, regardless of the type there, then it invokes 
.Internal(type.convert(...)) on the R character vectors which does the 
actual data mapping at that point (the idea is similar to read.table).
A workaround would be to post-process the data.frames
produced by fetch().  One could pass to fetch() a list of 
conversion functions to key off the field name (as below) or
field type, or some other meta-data.

The following example overloads the fetch() method to allow specific 
fields to be converted by user-specified functions (in the example
we convert a numeric field to an ordered factor):

res <- dbSendQuery(con, "select * from cars")  ## cars is from data(cars)

# build a list with conversion functions for those fields we're want
# to change ("speed").  The overloaded fetch() method looks for an argument
# called "post" and uses its names to convert those fields on the data.frame.

d <- fetch(res, post = list(speed = function(x) ordered(x, levels = sort(x)))
$speed
[1] "ordered" "factor" 

$dist
[1] "integer"

setMethod("fetch", "ODBCResult",
   def = function(res, n = 0, ...) {
      if(n<0) n <- 0
      ch <- res at Id[2]
      d <- sqlGetResults(ch, max = n)

      ## post process the fields of the output data.frame
      ## (may be able to use meta-data instead of field name)

      post <- list(...)$post
      if(!is.null(post) && is.list(post)){
         dnames <- names(d)
         for(fname in names(post))
            if(fname %in% dnames)
               d[[fname]] <- post[[fname]](d[[fname]])
      }
      d
   },
   valueClass = "data.frame"
)

One problem with this approach is that fetch() may be invoked repeatedly 
to fetch subsets of the result set, so the levels built at each invocation
will only include levels that appear during that fetch.  One could think
of using closures as converters to better manage data across invocations
of fetch(), but it doesn't seem very clean.

Doug Bates, Tim Keitt, Frank Harrell, and others have already brought up 
some of these issues in this list precisely in the context of factors -- 
e.g., you may need to store the levels of a factor outside the table where 
the factor itself resides.  I believe MS SQL server has facilities to 
define new types -- it may be worthwhile looking into that...

  
    
Message-ID: <20020116173112.A25817@jessie.research.bell-labs.com>
In-Reply-To: <15429.53800.798524.275946@gargle.gargle.HOWL>; from a296180@mica.fmr.com on Wed, Jan 16, 2002 at 02:19:04PM -0500