Staged installation fail on some file systems
I have now tested with R version 3.6.0 Patched (2019-05-12 r76489) and I confirm that staged install now works on our problematic file system. Thank you again for fixing this. /Henrik
On Thu, May 9, 2019 at 5:43 AM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
On 5/7/19 6:18 PM, Henrik Bengtsson wrote:
On Tue, May 7, 2019 at 9:05 AM Tomas Kalibera <tomas.kalibera at gmail.com> wrote:
Thanks for the report. According to my reading, this use of "mv" is ok and the renameat2() call which the invocation of "mv" leads to is also ok and allowed by POSIX in this context. It could only fail with EEXIST if the target directory (path/pkg) was not empty. So far I've not been able to reproduce but we could fall back to copy like on Windows.
Thanks for looking into this. The purpose of the pre-existing target directory (path/pkg) is to act as a directory lock in order to lower the risk for parallel installation to take place at the same - is that correct? For the same reason, you can't just do mv path/pkg path/pkg-quickly-now mv -f path/to/pkg path rmdir path/pkg-quickly-now because there's a potential race condition?
pkg (the final installation directory) should be empty when we are doing the move - and I understood from your report you actually reproduced outside R with the target directory being empty. Moving it away is the same as deleting it. Which is what I did now in R-devel, and which is what the current code has been doing already on Windows. It is a race condition, but I don't think it is avoidable and I doubt that "mv/rename" could be relied on doing the move atomically when the target directory exists, when not broken to do the move at all. Still, according to POSIX, mv/renameat should work in this context with the target being an empty directory, it works on my systems and that it does not on some is probably a bug. Likewise, from my reading of the documentation of MoveFileEx, the file rename should also work on Windows with the empty target directory with the flags we use (and it is also implied in CERT advisory FIO10-C), but contradicts what I am seeing on my system. This is why I had added the deletion on Windows before. Pragmatically, there is probably no point into trying to fine-tune this any more, because apart from the move happening, we don't have universal guarantees that the move will be atomic. But of course we have to make sure the installation always works, so thanks again for reporting this.
For efficiency, to avoid copying, could one do "atomic" moves one layer down? Something like: mv path/to/pkg/* path/to/pkg/.* path/pkg rmdir path/pkg because, in this case, we know that path/pkg/ is empty.
Yes, but as it seems that the bug in mv/renameat is rare, maybe this is not worth the added complexity/maintenance cost. And with the current solution - delete the target directory, but still try the move - we still have some hope of avoiding exposure of a partially installed package (a package that already has a DESCRIPTION file, but is not yet completely installed). Best Tomas
/Henrik
Best Tomas On 5/5/19 4:35 AM, Henrik Bengtsson wrote:
I'm observing that the new staged installation in R 3.6.0 can produce: mv: cannot move ?/home/alice/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-codetools/00new/codetools? to ?/home/alice/R/x86_64-pc-linux-gnu-library/3.6/codetools?: File exists ERROR: moving to final location failed on some file systems. # EXAMPLE $ R --vanilla R version 3.6.0 (2019-04-26) -- "Planting of a Tree" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) ...
install.packages("codetools", repos="https://cran.r-project.org")
Installing package into ?/wynton/home/cbi/hb/R/x86_64-pc-linux-gnu-library/3.6? (as ?lib? is unspecified) trying URL 'https://cran.r-project.org/src/contrib/codetools_0.2-16.tar.gz' Content type 'application/x-gzip' length 12996 bytes (12 KB) ================================================== downloaded 12 KB * installing *source* package ?codetools? ... ** package ?codetools? successfully unpacked and MD5 sums checked ** using staged installation ** R ** byte-compile and prepare package for lazy loading ** help *** installing help indices ** building package indices ** testing if installed package can be loaded from temporary location mv: cannot move ?/home/alice/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-codetools/00new/codetools? to ?/home/alice/R/x86_64-pc-linux-gnu-library/3.6/codetools?: File exists ERROR: moving to final location failed * removing ?/home/alice/R/x86_64-pc-linux-gnu-library/3.6/codetools? The downloaded source packages are in ?/scratch/alice/Rtmp6UYDzu/downloaded_packages? Warning message: In install.packages("codetools", repos = "https://cran.r-project.org") : installation of package ?codetools? had non-zero exit status # WORKAROUND Disabling staged installation, for instance by setting environment variable 'R_INSTALL_STAGED=false' avoids this problem. # TROUBLESHOOTING I think it comes down to the following call in src/library/tools/R/install.R: status <- system(paste("mv -f", shQuote(instdir), shQuote(dirname(final_instdir)))) https://github.com/wch/r-source/blob/d253331f578814f919f150ffdf1fe581618079a3/src/library/tools/R/install.R#L1645-L1647 which effectively does: $ mkdir -p path/pkg ## empty final destination placeholder(?) $ mkdir -p path/to/pkg $ mv -f path/to/pkg path However, on one (and only one) of several systems I've tested, that 'mv' produce the error: mv: cannot move ?path/to/pkg? to ?path/pkg?: File exists This is on a BeeGFS parallel file system. I cannot tell if that 'mv -f' should work or not, or if it is even well defined. FWIW, the above 'mv' does indeed work if I switch to another folder that is mounted on a different, NFS, file system, i.e. it is not kernel/OS specific (here CentOS 7.6.1810). If of any use, here's the 'strace' of the above 'mv': $ strace mv -f path/to/pkg path execve("/usr/bin/mv", ["mv", "-f", "path/to/pkg", "path"], [/* 118 vars */]) = 0 brk(NULL) = 0xcf3000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde2ceb1000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/usr/lib64/openmpi/lib/tls/x86_64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/openmpi/lib/tls/x86_64", 0x7ffc6a3fb170) = -1 ENOENT (No such file or directory) open("/usr/lib64/openmpi/lib/tls/libselinux.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/openmpi/lib/tls", 0x7ffc6a3fb170) = -1 ENOENT (No such file or directory) open("/usr/lib64/openmpi/lib/x86_64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/openmpi/lib/x86_64", 0x7ffc6a3fb170) = -1 ENOENT (No such file or directory) open("/usr/lib64/openmpi/lib/libselinux.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/openmpi/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=96960, ...}) = 0 mmap(NULL, 96960, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fde2ce99000 close(3) = 0 open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320i\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=155784, ...}) = 0 mmap(NULL, 2255184, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2ca6a000 mprotect(0x7fde2ca8e000, 2093056, PROT_NONE) = 0 mmap(0x7fde2cc8d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x23000) = 0x7fde2cc8d000 mmap(0x7fde2cc8f000, 6480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fde2cc8f000 close(3) = 0 open("/usr/lib64/openmpi/lib/libacl.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\37\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=37056, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde2ce98000 mmap(NULL, 2130560, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2c861000 mprotect(0x7fde2c868000, 2097152, PROT_NONE) = 0 mmap(0x7fde2ca68000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7000) = 0x7fde2ca68000 close(3) = 0 open("/usr/lib64/openmpi/lib/libattr.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\23\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=19896, ...}) = 0 mmap(NULL, 2113904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2c65c000 mprotect(0x7fde2c660000, 2093056, PROT_NONE) = 0 mmap(0x7fde2c85f000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7fde2c85f000 close(3) = 0 open("/usr/lib64/openmpi/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340$\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=2151672, ...}) = 0 mmap(NULL, 3981792, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2c28f000 mprotect(0x7fde2c451000, 2097152, PROT_NONE) = 0 mmap(0x7fde2c651000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c2000) = 0x7fde2c651000 mmap(0x7fde2c657000, 16864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fde2c657000 close(3) = 0 open("/usr/lib64/openmpi/lib/libpcre.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libpcre.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\25\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=402384, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde2ce97000 mmap(NULL, 2494984, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2c02d000 mprotect(0x7fde2c08d000, 2097152, PROT_NONE) = 0 mmap(0x7fde2c28d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x60000) = 0x7fde2c28d000 close(3) = 0 open("/usr/lib64/openmpi/lib/libdl.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\r\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=19288, ...}) = 0 mmap(NULL, 2109712, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2be29000 mprotect(0x7fde2be2b000, 2097152, PROT_NONE) = 0 mmap(0x7fde2c02b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7fde2c02b000 close(3) = 0 open("/usr/lib64/openmpi/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260l\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=141968, ...}) = 0 mmap(NULL, 2208904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fde2bc0d000 mprotect(0x7fde2bc24000, 2093056, PROT_NONE) = 0 mmap(0x7fde2be23000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7fde2be23000 mmap(0x7fde2be25000, 13448, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fde2be25000 close(3) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde2ce96000 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde2ce94000 arch_prctl(ARCH_SET_FS, 0x7fde2ce94840) = 0 mprotect(0x7fde2c651000, 16384, PROT_READ) = 0 mprotect(0x7fde2be23000, 4096, PROT_READ) = 0 mprotect(0x7fde2c02b000, 4096, PROT_READ) = 0 mprotect(0x7fde2c28d000, 4096, PROT_READ) = 0 mprotect(0x7fde2c85f000, 4096, PROT_READ) = 0 mprotect(0x7fde2ca68000, 4096, PROT_READ) = 0 mprotect(0x7fde2cc8d000, 4096, PROT_READ) = 0 mprotect(0x61d000, 4096, PROT_READ) = 0 mprotect(0x7fde2ceb2000, 4096, PROT_READ) = 0 munmap(0x7fde2ce99000, 96960) = 0 set_tid_address(0x7fde2ce94b10) = 85521 set_robust_list(0x7fde2ce94b20, 24) = 0 rt_sigaction(SIGRTMIN, {0x7fde2bc13790, [], SA_RESTORER|SA_SIGINFO, 0x7fde2bc1c5d0}, NULL, 8) = 0 rt_sigaction(SIGRT_1, {0x7fde2bc13820, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7fde2bc1c5d0}, NULL, 8) = 0 rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 statfs("/sys/fs/selinux", {f_type=SELINUX_MAGIC, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0 statfs("/sys/fs/selinux", {f_type=SELINUX_MAGIC, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_RELATIME}) = 0 stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 brk(NULL) = 0xcf3000 brk(0xd14000) = 0xd14000 access("/etc/selinux/config", F_OK) = 0 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=106075056, ...}) = 0 mmap(NULL, 106075056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fde256e3000 close(3) = 0 geteuid() = 34002 ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0 stat("path", {st_mode=S_IFDIR|0755, st_size=2, ...}) = 0 lstat("path/to/pkg", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 lstat("path/pkg", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 renameat2(AT_FDCWD, "path/to/pkg", AT_FDCWD, "path/pkg", 0) = -1 EEXIST (File exists) open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=2502, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fde2ceb0000 read(3, "# Locale name alias data base.\n#"..., 4096) = 2502 read(3, "", 4096) = 0 close(3) = 0 munmap(0x7fde2ceb0000, 4096) = 0 open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/lib64/charset.alias", O_RDONLY|O_NOFOLLOW) = -1 ENOENT (No such file or directory) write(2, "mv: ", 4mv: ) = 4 write(2, "cannot move \342\200\230path/to/pkg\342\200\231 to"..., 47cannot move ?path/to/pkg? to ?path/pkg?) = 47 open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en_US.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en_US/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale/en/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory) write(2, ": File exists", 13: File exists) = 13 write(2, "\n", 1 ) = 1 lseek(0, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek) close(0) = 0 close(1) = 0 close(2) = 0 exit_group(1) = ? +++ exited with 1 +++ /Henrik
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel