ff package: ff objects don't reload completely on NFS drives from a different machine
Hi,

this could be due to how NFS works. Note that there can be a delay of up to 30 seconds before other hosts on the same file system see updates that were flushed by one machine. You basically cannot treat files on a shared NFS file system as if you were working on a single machine. You have to add some higher-level protection if your data sources are to be shared, and that is not an easy problem if you want it to be bulletproof. You need to use a semaphore/mutex or some other way to communicate when files are updated/flushed/read, etc. I'm still looking for such a mechanism done over a file system that is bulletproof (without having to rely on a central server).

My $.02

/Henrik
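To make the semaphore/mutex idea above concrete, here is a minimal sketch of a lock kept on the shared file system itself, using only base R. The helper names acquire_lock() and release_lock() are hypothetical and are not part of ff or base R. The sketch assumes that creating a directory either succeeds or fails as a whole on the file system in use. It is not bulletproof: it does nothing about stale locks left behind by a crashed process, and it does not by itself remove the NFS visibility delay for the data files.

```r
# Hypothetical helpers (not part of ff or base R): a directory-based lock.
# Assumption: creating a directory either succeeds or fails as a whole,
# so only one process can "win" the lock at a time.

acquire_lock <- function(lock_dir, timeout = 60, poll = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # dir.create() returns TRUE only if this call created the directory
    if (dir.create(lock_dir, showWarnings = FALSE)) return(TRUE)
    # Give up once the timeout has passed
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(poll)
  }
}

release_lock <- function(lock_dir) {
  # unlink() returns 0 on success
  unlink(lock_dir, recursive = TRUE) == 0
}
```

A writer would call acquire_lock("~/m.ff.lock") before updating the file and release_lock("~/m.ff.lock") after flushing it; a reader would take the same lock before opening the file. The lock path here is only an example.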
On Sat, Jan 23, 2010 at 12:02 PM, Hao Cen <hcen at andrew.cmu.edu> wrote:
Hi ff users and Jens,

I am using the ff package and it has been working great. Recently I noticed an unexpected behavior in the ff package: when I save an ff matrix on one machine to an NFS drive and load it on another machine from the same NFS drive, I get quite a lot of zeros in the matrix. The following code reproduces the error:

mat = matrix(1:25, 5)
matFF = ff(mat, dim=dim(mat), dimnames = dimnames(mat),
           dimorder = c(2,1),
           filename = "~/m.ff", overwrite=TRUE)
save(matFF, file = "~/mat.ff.rda")
load(file = "~/mat.ff.rda")
open(matFF)
matFF

If I execute all six lines on one machine, everything works fine. However, when I execute only the last three lines on another machine, I get
matFF
ff (open) integer length=25 (25) dim=c(5,5) dimorder=c(2,1)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0
[3,]    0    0    0    0    0
[4,]    0    0    0    0    0
[5,]    0    0    0    0    0

If the matrix is larger, say mat = matrix(1:20000, 5), I get the following, with dozens of zeros at the end:

ff (open) integer length=20000 (20000) dim=c(5,4000) dimorder=c(2,1)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]   [,3993] [,3994] [,3995] [,3996] [,3997] [,3998] [,3999] [,4000]
[1,]    1    6   11   16   21   26   31   36 :   19961   19966   19971   19976   19981   19986   19991   19996
[2,]    2    7   12   17   22   27   32   37 :   19962   19967   19972   19977   19982   19987   19992   19997
[3,]    3    8   13   18   23   28   33   38 :   19963   19968   19973   19978   19983   19988   19993   19998
[4,]    4    9   14   19   24   29   34   39 :   19964   19969   19974   19979   19984   19989   19994   19999
[5,]    5   10   15   20   25   30   35   40 :       0       0       0       0       0       0       0       0

I tried setting caching = "mmeachflush" in the ff function, but it doesn't help. My computing environment is 64-bit Linux, R 2.10, ff 2.1. If you know what causes the issue or how to solve it, please let me know. I would highly appreciate it.

Jeff
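One possible way to detect the situation described above, where the second machine sees an incomplete copy of the backing file, is to have the writer store a checksum next to the file and have the reader wait until its own view of the file matches. This is only a sketch in base R; write_checksum() and wait_for_file() are hypothetical helper names, not ff functions, and the ".md5" sidecar file is an assumption made for illustration. It compares what ordinary file reads see, which may differ from what a memory-mapped open sees, so it is a sanity check rather than a guarantee.

```r
# Hypothetical helpers (not part of ff): checksum handshake via a sidecar file.

# Writer side: after the data file has been flushed and closed,
# record its MD5 checksum in a sidecar file next to it.
write_checksum <- function(path, sidecar = paste0(path, ".md5")) {
  writeLines(unname(tools::md5sum(path)), sidecar)
  invisible(sidecar)
}

# Reader side: poll until the locally visible file matches the recorded
# checksum, or give up after `timeout` seconds. Returns TRUE or FALSE.
wait_for_file <- function(path, sidecar = paste0(path, ".md5"),
                          timeout = 60, poll = 1) {
  deadline <- Sys.time() + timeout
  repeat {
    if (file.exists(path) && file.exists(sidecar)) {
      expected <- readLines(sidecar, n = 1, warn = FALSE)
      actual <- unname(tools::md5sum(path))
      if (length(expected) == 1 && identical(actual, expected)) return(TRUE)
    }
    if (Sys.time() >= deadline) return(FALSE)
    Sys.sleep(poll)
  }
}
```

In the example above, the first machine would call write_checksum("~/m.ff") after saving, and the second machine would call wait_for_file("~/m.ff") before load() and open().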
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.