Reading many large files causes R to crash - Possible Bug in R 2.15.1 64-bit Ubuntu

Looks like the call to:

dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)

is what is causing the issue.  If the name argument is not set, or is set to
any value other than NULL, then no hang occurs.

-----Original Message-----
From: David Terk [mailto:david.terk at gmail.com] 
Sent: Monday, July 23, 2012 1:25 AM
To: 'Duncan Murdoch'
Cc: 'r-devel at r-project.org'
Subject: RE: [Rd] Reading many large files causes R to crash - Possible Bug
in R 2.15.1 64-bit Ubuntu

I've isolated the bug.  When the seg fault was produced there was an error
that memory had not been mapped.  Here is the odd part of the bug.  If you
comment out certain code and get a full run, then comment the code which
is causing the problem back in, it will actually run.  So I think it is safe
to assume something is going wrong with memory allocation.  For example,
while testing, I have been able to get to a point where the code will run,
but if I reboot the machine and try again, the code will not run.

The bug itself is happening somewhere in XTS or ZOO.  I will gladly upload
the data files.  It is happening on the 10th data file which is only 225k
lines in size.

Below is the simplified code.  The call to either

dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)
index(dat.i) <- index(to.period(templateTimes, period=per, k=subper))

is what is causing R to hang or crash.  I have been able to replicate this
on Windows 7 64 bit and Ubuntu 64 bit.  Seems easiest to consistently
replicate from R Studio.

The code below consistently replicates the problem when the appropriate
files are used.

parseTickDataFromDir = function(tickerDir, per, subper) {
  tickerAbsFilenames = list.files(tickerDir,full.names=T)
  tickerNames = list.files(tickerDir,full.names=F)
  tickerNames = gsub("_[a-zA-Z0-9].csv","",tickerNames)
  pb <- txtProgressBar(min = 0, max = length(tickerAbsFilenames), style = 3)

  for(i in 1:length(tickerAbsFilenames)) {
    dat.i = parseTickData(tickerAbsFilenames[i])
    dates <- unique(substr(as.character(index(dat.i)), 1,10))
    times <- rep("09:30:00", length(dates))
    openDateTimes <- strptime(paste(dates, times), "%F %H:%M:%S")
    templateTimes <- NULL

    for (j in 1:length(openDateTimes)) {
      if (is.null(templateTimes)) {
        templateTimes <- openDateTimes[j] + 0:23400
      } else {
        templateTimes <- c(templateTimes, openDateTimes[j] + 0:23400)
      }
    }

    templateTimes <- as.xts(templateTimes)
    dat.i <- merge(dat.i, templateTimes, all=T)
    if (is.na(dat.i[1])) {
      dat.i[1] <- -1
    }
    dat.i <- na.locf(dat.i)
    dat.i <- to.period(dat.i, period=per, k=subper, name=NULL)
    index(dat.i) <- index(to.period(templateTimes, period=per, k=subper))
    setTxtProgressBar(pb, i)
  }
  close(pb)
}

parseTickData <- function(inputFile) {
  DAT.list <- scan(file=inputFile, sep=",", skip=1,
                   what=list(Date="", Time="", Close=0, Volume=0), quiet=T)
  index <- as.POSIXct(paste(DAT.list$Date, DAT.list$Time),
                      format="%m/%d/%Y %H:%M:%S")
  DAT.xts <- xts(DAT.list$Close,index)
  DAT.xts <- make.index.unique(DAT.xts)
  return(DAT.xts)
}

DATTick <- parseTickDataFromDir(tickerDirSecond, "seconds",10)

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: Sunday, July 22, 2012 4:48 PM
To: David Terk
Cc: r-devel at r-project.org
Subject: Re: [Rd] Reading many large files causes R to crash - Possible Bug
in R 2.15.1 64-bit Ubuntu
