Comments requested on "changedFiles" function
Dear Duncan,

This certainly looks useful. Might you consider adding the ability to supply an alternative digest function? Details below.

I often use a homemade "make" type function which starts by looking at modification times, e.g. in a private package:

https://github.com/jefferis/nat.utils/blob/master/R/make.r

For some of my work, I use hash functions. However, because I typically work with many large files, I often use a special digest process, e.g. using the CRC checksum embedded in a gzip file directly, or hashing only the part of a large file that is (almost) certain to change.

Perhaps (code unchecked) along the lines of:

    changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                             file.info = NULL, digest = FALSE, digestfun = NULL,
                             full.names = FALSE, ...)

    if (digest) {
        if (is.null(digestfun)) digestfun = tools::md5sum
        else digestfun = match.fun(digestfun)
        info <- data.frame(info, digest = digestfun(fullnames))
    }
    etc

OR alternatively using only one argument:

    changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                             file.info = NULL, digest = FALSE,
                             full.names = FALSE, ...)

    if (is.logical(digest)) {
        if (digest) digestfun = tools::md5sum
    } else {
        # Assume that digest specifies a function that we want to use
        digestfun = match.fun(digest)
        digest = TRUE
    }
    if (digest)
        info <- data.frame(info, digest = digestfun(fullnames))
    etc

Many thanks,

Greg.
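For illustration only, here is a minimal runnable sketch of the pluggable-digest idea above. The names digest_files and partial_md5 are hypothetical and are not part of changedFiles or of any R package. The sketch assumes that a digest function takes a character vector of file paths and returns one string per file, as tools::md5sum does.

```r
# Hypothetical helper: hash only the first `nbytes` bytes of each file.
# Useful when the start of a large file is (almost) certain to change.
partial_md5 <- function(files, nbytes = 1024L) {
  vapply(files, function(f) {
    con <- file(f, open = "rb")
    on.exit(close(con))
    head_bytes <- readBin(con, what = "raw", n = nbytes)
    # tools::md5sum() works on files, so write the bytes to a temp file first
    tmp <- tempfile()
    on.exit(unlink(tmp), add = TRUE)
    writeBin(head_bytes, tmp)
    unname(tools::md5sum(tmp))
  }, character(1))
}

# Hypothetical wrapper: fall back to tools::md5sum when no function is
# supplied, otherwise resolve the argument with match.fun() so that either
# a function or a function name is accepted.
digest_files <- function(files, digestfun = NULL) {
  digestfun <- if (is.null(digestfun)) tools::md5sum else match.fun(digestfun)
  digestfun(files)
}

# Two files that share their first 11 bytes but differ afterwards
f1 <- tempfile()
f2 <- tempfile()
writeLines(c("same header", "body A"), f1)
writeLines(c("same header", "body B"), f2)

full_hashes    <- digest_files(c(f1, f2))
partial_hashes <- digest_files(c(f1, f2),
                               function(f) partial_md5(f, nbytes = 11L))
```

With the default digest the two files hash differently, while the 11-byte partial digest treats them as identical. That is the trade-off of a cheaper digest: it only detects changes in the region it reads.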
On 4 Sep 2013, at 18:53, Duncan Murdoch wrote:
In a number of places internal to R, we need to know which files have
changed (e.g. after building a vignette). I've just written a general
purpose function "changedFiles" that I'll probably commit to R-devel.
Comments on the design (or bug reports) would be appreciated.
The source for the function and the Rd page for it are inline below.
----- changedFiles.R:
changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                         file.info = NULL,
                         md5sum = FALSE, full.names = FALSE, ...) {
    dosnapshot <- function(args) {
        fullnames <- do.call(list.files, c(full.names = TRUE, args))
        names <- do.call(list.files, c(full.names = full.names, args))
        if (isTRUE(file.info) || (is.character(file.info) &&
                                  length(file.info))) {
            info <- file.info(fullnames)
            rownames(info) <- names
            if (isTRUE(file.info))
                file.info <- c("size", "isdir", "mode", "mtime")
        } else
            info <- data.frame(row.names=names)
        if (md5sum)
            info <- data.frame(info, md5sum = tools::md5sum(fullnames))
        list(info = info, timestamp = timestamp, file.info = file.info,
             md5sum = md5sum, full.names = full.names, args = args)
--
Gregory Jefferis, PhD                   Tel: 01223 267048
Division of Neurobiology
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge, CB2 0QH, UK

http://www2.mrc-lmb.cam.ac.uk/group-leaders/h-to-m/g-jefferis
http://jefferislab.org
http://flybrain.stanford.edu