raster/rgdal problem: Too many open files (Linux)
5 messages · Mauricio Zambrano-Bigiarini, Roger Bivand, Jon Olav Skoien
On 08/05/2013 04:37 PM, Jon Olav Skoien wrote:
Dear list,
We have a problem which appears to be a bug in either rgdal or raster,
although it could also be a bug in base R or in our understanding of how
to deal with connections.
We have a process which is writing a rather large (~10,000-20,000) number of
GeoTIFFs via writeRaster. However, the process has frequently stopped
with an error of the type:
Error in .local(.Object, ...) :
TIFFOpen:/local0/skoiejo/hri/test.tif: Too many open files
The issue seems to be the creation of temp-files in the temp directory
which is given by tempdir(), not by raster:::.tmpdir(). These temp-files
seem to be created by the call
transient <- new("GDALTransientDataset", driver=driver, rows=r@nrows,
                 cols=r@ncols, bands=nbands, type=dataformat, fname=filename,
                 options=options, handle=NULL)
from raster:::.getGDALtransient
The temp-files are deleted after writing the geoTiff, but are not
removed from the list of open files in Linux, which on our system was
limited to 1024 files (ulimit -n) per process. Below is a script which
can replicate the issue (takes a few minutes to reach 1024) and
sessionInfo().
Currently we are trying to solve the issue by increasing the limit of
file connections, but we would prefer a solution where the connections
are properly deleted, either before writeRaster finishes, or a command
which we can include in our script, either R code or a call to system().
The connections are not visible via showConnections(), and
closeAllConnections() does not help.
Thanks,
Jon
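[Editorial note: as context for the 1024-file limit mentioned above, a minimal shell sketch for inspecting and raising the per-process descriptor limit on Linux. The `pgrep -n R` incantation for finding the R process id is an illustration, not from the original report; adjust for your own session.]

```shell
# Show the current soft limit on open files for this shell
# (inherited by any process it starts, including R):
ulimit -n

# Count the descriptors actually held by a running R process
# (uses /proc, so Linux only; pgrep -n picks the newest R process):
# ls /proc/"$(pgrep -n R)"/fd | wc -l

# Raise the soft limit for this session, up to the hard limit:
# ulimit -n 4096
```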
I stumbled across the same problem (with exactly the same configuration reported by Jon with 'sessionInfo()') while trying to change the values of some pixels in more than 6000 maps. Thank you very much Jon for the detailed report, which helped me to find a workaround (so far, just splitting the 6000 maps into smaller groups):
r <- raster(system.file("external/test.grd", package="raster"))
for (ifile in 1:2000) {
  writeRaster(r, "test.tif", format = "GTiff", overwrite = TRUE)
  print(ifile)
}
After trying the previous reproducible code, I don't understand why I got the error when ifile=1019 and not 1024:

....
[1] 1018
[1] 1019
Error in .local(.Object, ...) :
  TIFFOpen:/home/hzambran/test.tif: Too many open files

Thanks again Jon for sharing your findings about this.

All the best,
Mauricio Zambrano-Bigiarini, Ph.D
=================================================
Water Resources Unit
Institute for Environment and Sustainability (IES)
Joint Research Centre (JRC), European Commission
webinfo : http://floods.jrc.ec.europa.eu/
=================================================
DISCLAIMER: "The views expressed are purely those
of the writer and may not in any circumstances be
regarded as stating an official position of the
European Commission"
=================================================
"Sometimes life's going to hit you in the head with
a brick. Don't lose faith" (Steve Jobs)

> After the script stops, I checked the open files from the process, and
> got the following:
>
> lsof -aPn -p 596 | more
> COMMAND PID    USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
> R       596 skoiejo  cwd    DIR   8,17   139264 4213416 /local0/skoiejo/Raster_temp
> R       596 skoiejo  rtd    DIR  253,0     4096       2 /
> R       596 skoiejo  txt    REG  253,0    14192 2500779 /usr/lib64/R/bin/exec/R
> R       596 skoiejo  mem    REG  253,0  2679280 2501177 /usr/lib64/R/lib/libR.so
> ......................
> R       596 skoiejo 1015u   REG  253,0      238 4983213 /tmp/RtmpNyOKCk/toptest.tif (deleted)
> R       596 skoiejo 1016u   REG  253,0      238 4983214 /tmp/RtmpNyOKCk/qxdtest.tif (deleted)
> R       596 skoiejo 1017u   REG  253,0      238 4983215 /tmp/RtmpNyOKCk/zwotest.tif (deleted)
> R       596 skoiejo 1018u   REG  253,0      238 4983216 /tmp/RtmpNyOKCk/cnqtest.tif (deleted)
> R       596 skoiejo 1019u   REG  253,0      238 4983217 /tmp/RtmpNyOKCk/lottest.tif (deleted)
> R       596 skoiejo 1020u   REG  253,0      238 4983218 /tmp/RtmpNyOKCk/fartest.tif (deleted)
> R       596 skoiejo 1021u   REG  253,0      238 4983219 /tmp/RtmpNyOKCk/vsqtest.tif (deleted)
> R       596 skoiejo 1022u   REG  253,0      238 4983220 /tmp/RtmpNyOKCk/czptest.tif (deleted)
>
> Even if tested by someone with a limit higher than 2000, it should still
> be possible to see the long list of open connections, as above.
> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=C                 LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] rgdal_0.8-10  raster_2.1-49 sp_1.0-11
>
> loaded via a namespace (and not attached):
> [1] grid_3.0.1      lattice_0.20-15

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
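[Editorial note: the "(deleted)" entries in the lsof listing above can be reproduced in isolation. Unlinking a file that a process still holds open removes its directory entry, but the descriptor stays counted against the process's limit until the file is actually closed — which is exactly why the temp GeoTIFFs keep consuming descriptors after deletion. A minimal Linux-only sketch, using `tail -f` as a stand-in for the process holding the file:]

```shell
# Create a temp file and have a background process hold it open.
tmp=$(mktemp)
tail -f "$tmp" >/dev/null 2>&1 &
pid=$!
sleep 1                                   # give tail time to open the file

rm "$tmp"                                 # unlink: gone from the directory...
ls -l "/proc/$pid/fd" | grep -c deleted   # ...but still open: prints 1

kill "$pid"                               # only now is the descriptor freed
```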
On Tue, 6 Aug 2013, Mauricio Zambrano-Bigiarini wrote:
> [Jon's original report and Mauricio's reply, quoted in full above; snipped]
There are other files opened by the R process that reduce the number available. The problem is in the GDAL bindings with R; I haven't tried to see whether other applications keeping GDAL loaded face the same issues. GDAL applications typically write once and exit, so this isn't a problem there.

The current GDAL.close() code calls unlink() on a vector of files with the same basename, but unlink() now appears to fail, leaving the files in place. Using file.remove() leads to the same result, and using deleteFile() provokes other problems. This will probably turn out to be something trivial, but will take a great deal of time to debug, as the consequences of changing the dataset structure are possibly extensive.

For the time being, the work-around is the only route; if volunteers can debug this, progress may be possible, but everything else has to continue to work.

Roger
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
On 06-Aug-13 21:35, Roger Bivand wrote:
> [earlier messages in the thread, quoted in full above; snipped]
Roger, thanks for having a look at this. I just checked with an older version, and it seems the problem was introduced more or less at the same time as a valgrind issue was fixed in revision 456. Running the example above worked with R 2.14.0 and rgdal 0.8-5 (sessionInfo below), but failed when upgrading rgdal (and sp). The problem seems to be in the C++ code (I already tried to revert the R code of GDAL.close to 0.8-5, without any difference), which I am unfortunately not able to debug.

As there is no quick fix at the moment, I just thought it would be good to summarize the possible workarounds for other people who encounter this problem:

- Split up the process into smaller problems
- Increase the number of possible file connections (the standard on Linux seems to be 1024, but I have not seen any reason for not increasing this to e.g. 40,000, as currently on our system)
- Do parallel processing (this will work better, as each sub-process will have its own list of file connections)

Best wishes,
Jon

R version 2.14.0 (2011-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rgdal_0.8-5   raster_2.0-41 sp_1.0-5

loaded via a namespace (and not attached):
[1] grid_2.14.0    lattice_0.20-0
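[Editorial note: the "split up" and "parallel processing" workarounds amount to the same fix — give each batch a fresh process, so leaked descriptors never accumulate past the batch size. A minimal driver-script sketch; the `process_batch.R` worker script and the batch size of 500 are hypothetical, not from the thread:]

```shell
# Process 6000 rasters in batches of 500, one Rscript child per batch.
# Each child starts with an empty descriptor table, so the leak stays
# well below the 1024-descriptor limit.
total=6000
batch=500
for start in $(seq 1 "$batch" "$total"); do
  end=$((start + batch - 1))
  echo "batch: $start-$end"
  # Rscript process_batch.R "$start" "$end"   # hypothetical worker script
done
```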
Jon Olav Skøien
Joint Research Centre - European Commission
Institute for Environment and Sustainability (IES)
Land Resource Management Unit
Via Fermi 2749, TP 440, I-21027 Ispra (VA), ITALY
jon.skoien at jrc.ec.europa.eu
Tel: +39 0332 789206

Disclaimer: Views expressed in this email are those of the individual and do not necessarily represent official views of the European Commission.
On Wed, 7 Aug 2013, Jon Olav Skoien wrote:
> [earlier messages in the thread, quoted in full above; snipped]
Thanks. The changes made then were a result of an audit for references to de-referenced pointers and memory leaks. I've looked to see whether it is possible to revert optionally to the former behaviour (admitting references to de-referenced pointers and memory leaks), but I can't see an easy resolution. So for now, the workarounds you propose are those to follow.

Roger