R crashing during batch file formatting
Hi,
You should probably provide more information (OS, R version). I cannot help you much with the crash itself, but here are some thoughts. I would try the conversion interactively before moving it into a function. If you want different types of NA and your data are numeric, you could probably distinguish them by using -Inf, Inf, NaN and NA, but then you need to be careful when doing the analysis, as these values can be treated differently.
HTH
Petr
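A minimal sketch of the special-value idea, for illustration only. The vector x and the mapping of codes to special values are assumptions made up for this example, not a BHPS convention. Note also that an integer column is coerced to double as soon as NaN or -Inf is assigned to it.

```r
# Hypothetical numeric column with BHPS-style negative missing codes
x <- c(5, -8, 12, -9, -1, 7)

# Assumed mapping (illustrative only):
#   -8 -> NaN, -9 -> -Inf, any other negative code -> NA
recoded <- x
recoded[x == -8] <- NaN
recoded[x == -9] <- -Inf
recoded[x < 0 & !(x %in% c(-8, -9))] <- NA

is.na(recoded)      # TRUE for both NA and NaN, FALSE for -Inf
is.nan(recoded)     # TRUE only for NaN
is.finite(recoded)  # FALSE for NA, NaN and -Inf

# na.rm = TRUE drops NA and NaN but keeps -Inf, so the infinite
# values have to be filtered out explicitly before summarising
mean.with.inf <- mean(recoded, na.rm = TRUE)
mean.finite   <- mean(recoded[is.finite(recoded)])
```

This is where the care is needed: is.na() does not separate NA from NaN, and na.rm = TRUE leaves the infinite values in, so each analysis step has to state which of the special values it excludes.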
On 31 Oct 2006 at 11:43, Jon Minton wrote:
From: "Jon Minton" <jm540 at york.ac.uk>
To: <r-help at stat.math.ethz.ch>
Date sent: Tue, 31 Oct 2006 11:43:22 -0000
Subject: [R] R crashing during batch file formatting
Hi R users:
I have the British Household Panel Survey (BHPS) in .tab format. I
want to feed it through the Amelia package (which will be an
"interesting" job in itself).
But first I need to convert the various types of missing value (from
about -9 to -1) to a more generic "NA" code.
I've written the following function to do this:
BHPS.converter <- function(from="D:/Data/BHPS/UKDA-5151-tab/tab/",
                           to="D:/BHPS/NA/", ext="tab") {
  # Match names ending in ".<ext>"; the dot is escaped so it is literal
  file.pattern <- paste("\\.", ext, "$", sep="")
  from.files <- dir(from, pattern=file.pattern)
  existing.to.files <- dir(to, pattern=file.pattern)
  # setdiff() also works when 'to' is empty; indexing with
  # -match(...) on an empty match would drop every file
  still.to.do <- setdiff(from.files, existing.to.files)
  # seq_along() gives an empty loop when nothing is left (1:0 does not)
  for (i in seq_along(still.to.do)) {
    temp.table <- read.delim(paste(from, still.to.do[i], sep=""))
    print(paste("read:", still.to.do[i]))
    # Recode every negative value as NA
    temp.table[temp.table < 0] <- NA
    write.table(temp.table, file=paste(to, still.to.do[i], sep=""))
    print(paste("written:", still.to.do[i]))
  }
  # No rm() needed: local variables are discarded when the function returns
}
It checks for existing files in the "to" directory (where files that
have already been converted, negative values -> NA, are stored)
because when I tried to do this conversion operation previously it
got part of the way through, then crashed.
The problem is that it crashes *this time* too, without displaying a
prompt to say it's read a single file.
The file it gets stuck on is about 75 MB in size.
I am using a dual-core 3.2 GHz Pentium D processor with 2 GB memory (&
2 GB virtual memory), and (unfortunately) Windows XP.
Questions:
1) Any general tips on how to increase the amount of memory available
to process the file?
2) Can you see a more efficient way of doing what I'm doing?
3) What's the best way of coding for multiple forms of NA? The BHPS
code "-8" (meaning "inapplicable", not routed for this respondent)
should really be distinguished from other forms of nonresponse...
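One possible approach to questions 2 and 3, as a rough sketch rather than a tested solution for the BHPS files: recode one column at a time instead of building a logical matrix for the whole table, and keep the original negative codes in a parallel data frame so the reason for missingness is not lost. The helper name recode.missing and the toy data frame are hypothetical, made up for this example.

```r
# Hypothetical helper: replace negative codes with NA in numeric columns
# and return the original codes separately.
recode.missing <- function(df) {
  codes <- list()
  for (nm in names(df)) {
    col <- df[[nm]]
    if (is.numeric(col)) {
      # Negative, non-missing entries are treated as missing-value codes
      neg <- !is.na(col) & col < 0
      # Keep the original code where one was present, NA elsewhere
      codes[[nm]] <- ifelse(neg, col, NA)
      col[neg] <- NA
      df[[nm]] <- col
    }
  }
  list(data = df, codes = as.data.frame(codes))
}

# Toy data, not real BHPS values
toy <- data.frame(age    = c(34, -9, 51),
                  income = c(-8, 2000, -1),
                  region = c("north", "south", "east"))
out <- recode.missing(toy)
out$data   # negative codes replaced by NA; 'region' is left untouched
out$codes  # original codes (-9, -8, -1), NA where the value was valid
```

Working column by column only ever holds one extra column-sized vector at a time, which may lower peak memory use on a large file, although that has not been measured here.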
Thanks,
Jon
p.s. Apologies if this is slightly too vague/long winded...
Jon Minton
Petr Pikal petr.pikal at precheza.cz