extending the colClasses argument in read.table

3 messages · Gabor Grothendieck, Romain Francois

#
Hello,

We released the int64 package to CRAN a few days ago. The package 
provides S4 classes "int64" and "uint64" that represent signed and 
unsigned 64 bit integer vectors.

One further development of the package is to facilitate reading 64 bit 
integer data from csv and similar files.

I have this function that wraps a call to read.csv to:
- read the "int64" and "uint64" columns as "character"
- convert them afterwards to the appropriate type


read.csv.int64 <- function (file, ...){
     dots <- list( file, ... )
     if( "colClasses" %in% names(dots) ){
         # assumes colClasses is an unnamed, per-column character vector
         colClasses <- dots[["colClasses"]]
         # %in% (not ==) so that NA entries give FALSE rather than NA
         idx.int64 <- colClasses %in% "int64"
         idx.uint64 <- colClasses %in% "uint64"

         # read the 64 bit columns as character first
         colClasses[ idx.int64 | idx.uint64 ] <- "character"
         dots[["colClasses" ]] <- colClasses

         df <- do.call( "read.csv", dots )
         # then convert them to the appropriate type
         if( any( idx.int64 ) ){
             df[ idx.int64 ] <- lapply( df[ idx.int64 ], as.int64 )
         }
         if( any( idx.uint64 ) ){
             df[ idx.uint64 ] <- lapply( df[ idx.uint64 ], as.uint64 )
         }
         df
     } else {
         read.csv( file, ... )
     }
}

I was wondering if it would make sense to extend the colClasses argument 
so that other packages can provide drivers, so that we could let the 
users just use read.csv, read.table, and so on.

Before I start digging into the internals of read.table, I wanted to 
hear opinions about whether this would be a good idea.

Best Regards,

Romain
#
2011/11/21 Romain François <romain at r-enthusiasts.com>:
Try this:
'data.frame':   1 obs. of  1 variable:
 $ A:Formal class 'int64' [package "int64"] with 2 slots
  .. ..@ .Data:List of 1
  .. .. ..$ : int  0 12
  .. ..@ NAMES: NULL

To convince ourselves that it's translating from character to int64:
[1] "character"
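
For readers following along without the int64 package: ?read.table documents that colClasses may name any formal class for which an as method from "character" exists, and setAs() is the usual way to register one. The following is a minimal sketch of that mechanism, not necessarily the code that produced the output above. The class name "bigint" is a hypothetical stand-in, and its conversion simply keeps the digits as text instead of building a real 64 bit integer.

```r
library(methods)

# "bigint" is a hypothetical class name used only for this illustration.
# A virtual class is enough to give setAs() a target to register against.
setClass("bigint")

# The "driver": read.table() reads the column as character and then
# calls methods::as(column, "bigint"), which dispatches to this function.
setAs("character", "bigint", function(from) {
    # keep the digits as text, tagged with an S3 class attribute
    structure(from, class = "bigint")
})

# 2^63 - 1 cannot be represented exactly as a double, so reading this
# column as "numeric" would lose digits.
df <- read.csv(text = "A,B\n9223372036854775807,1\n",
               colClasses = c("bigint", "integer"))

class(df$A)
unclass(df$A)
```

With a registration like this in place, users keep calling plain read.csv or read.table and only name the class in colClasses. A package would presumably register the method for its own class in the same way.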
#
Thanks Gabor,

I will implement this and publish an updated package later. 

Cheers, 

Romain



On 21 Nov 2011, at 16:31, Gabor Grothendieck <ggrothendieck at gmail.com> wrote: