R memory management
2 messages · Yuri Volchik, Patrick Burns
The line:

    data. <- c(data., new.data)

will eat both memory and time voraciously. You should change it by creating 'data.' to be the final size it will be and then subscript into it. If you don't know the final size, then you can grow it a lot a few times instead of growing it a little lots of times.

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
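As an illustration of the two suggestions above, here is a minimal sketch in R. It is not code from the thread: read_chunk() is a hypothetical stand-in for one readLines(con2) call, and the chunk sizes and initial capacity are arbitrary assumptions. The first part allocates the vector once when the final size is known; the second doubles the capacity whenever it runs out, so the vector is copied only a handful of times.

```r
# Hypothetical stand-in for one readLines(con2) call: returns a chunk of lines,
# ending with the "!ENDMSG!" marker on the last chunk.
read_chunk <- function(i, n_chunks) {
  if (i < n_chunks) sprintf("row-%d-%d", i, 1:3) else "!ENDMSG!"
}

n_chunks <- 5

# 1. Final size known: create the vector at full size, then subscript into it.
known <- character(3 * (n_chunks - 1) + 1)
pos <- 0
for (i in seq_len(n_chunks)) {
  chunk <- read_chunk(i, n_chunks)
  known[pos + seq_along(chunk)] <- chunk
  pos <- pos + length(chunk)
}

# 2. Final size unknown: grow a lot a few times by doubling the capacity.
collect_lines <- function(n_chunks, initial_capacity = 4) {
  buf <- character(initial_capacity)
  used <- 0
  for (i in seq_len(n_chunks)) {
    chunk <- read_chunk(i, n_chunks)
    needed <- used + length(chunk)
    if (needed > length(buf)) {
      # `length<-` extends the vector, padding the new slots with NA
      length(buf) <- max(needed, 2 * length(buf))
    }
    buf[used + seq_along(chunk)] <- chunk
    used <- needed
  }
  buf[seq_len(used)]  # drop the unused tail
}

unknown <- collect_lines(n_chunks)
```

Both approaches end up with the same lines; the difference from c(data., new.data) is that the whole vector is no longer reallocated and copied on every read.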
Yuri Volchik wrote:
Hi,
I'm using R to collect data for a number of exchanges through a socket
connection and keep running into memory problems, even though I believe
the task is not that memory consuming. I guess there is a miscommunication
between R and WinXP about freeing up memory.
So this is the code:
for (x in 1:length(exchanges.to.get)) {
  tickers <- sqlQuery(channel, paste("SELECT Symbol FROM symbols_list WHERE Exchange='", exchanges.to.get[x], "';", sep=''))[,1]
  dir.create(paste(Working.dir, exchanges.to.get[x], '/', sep=''))
  for (y in 1:length(tickers)) {
    con2 <- socketConnection(Sys.info()["nodename"], port = ****) # open socket connection to get data
    writeLines(paste(command, ',', tickers[y], ',', interval, ';', sep=''), con2)
    data. <- readLines(con2)
    end.of.data <- sum(c(data.=="!ENDMSG!", data.=="!SYNTAX_ERROR!"))
    while (end.of.data != 1) {
      new.data <- readLines(con2)
      end.of.data <- sum(new.data=="!ENDMSG!")
      data. <- c(data., new.data)
    }
    if (length(data.) > 3)
      write.table(data.[1:(length(data.)-2)], paste(Working.dir, exchanges.to.get[x], '/', sub('\\*', '+', tickers[y]), '_.csv', sep=''), quote=FALSE, col.names=FALSE, row.names=FALSE)
    close(con2)
  }
  rm(tickers)
  gc()
}
With gcinfo(TRUE) turned on I got the following output (some examples):
Garbage collection 16362 = 15411+754+197 (level 0) ...
6.3 Mbytes of cons cells used (22%)
2.2 Mbytes of vectors used (8%)
Garbage collection 16407 = 15454+756+197 (level 0) ...
13.1 Mbytes of cons cells used (46%)
10.4 Mbytes of vectors used (39%)
Garbage collection 16410 = 15456+756+198 (level 2) ...
4.9 Mbytes of cons cells used (21%)
0.9 Mbytes of vectors used (4%)
Garbage collection 16679 = 15634+796+249 (level 0) ...
150.7 Mbytes of cons cells used (95%)
203.9 Mbytes of vectors used (75%)
Garbage collection 16680 = 15634+796+250 (level 2) ...
4.9 Mbytes of cons cells used (4%)
0.9 Mbytes of vectors used (0%)
Garbage collection 16808 = 15754+802+252 (level 0) ...
6.1 Mbytes of cons cells used (7%)
1.8 Mbytes of vectors used (1%)
But the end result is in Task Manager:
RGui.exe Mem Usage 470,472K VM Size 541,988K
Even though R reports
Garbage collection 16808 = 15754+802+252 (level 0) ...
6.1 Mbytes of cons cells used (7%)
1.8 Mbytes of vectors used (1%)
Has anybody encountered this problem, and how do you deal with it? It
seems like a memory leak to me, as the tasks are not memory demanding; the
biggest amount of data in a single file is about 40MB.
Thanks