Date: Wed, 26 Nov 2003 13:52:44 +0100
From: Martin Maechler <maechler@stat.math.ethz.ch>
To: <Kurt.Hornik@wu-wien.ac.at>
Cc: <r-devel@stat.math.ethz.ch>
Subject: Re: [Rd] Question about Unix file paths
>>>>> "Kurt" == Kurt Hornik <Kurt.Hornik@wu-wien.ac.at>
>>>>>     on Wed, 26 Nov 2003 10:05:42 +0100 writes:
Prof Brian Ripley writes:
On Mon, 24 Nov 2003, Duncan Murdoch wrote:
Duncan Murdoch <dmurdoch@pair.com> writes:
Gabor Grothendieck pointed out a bug to me in
list.files(..., full.name=TRUE), that essentially
comes down to the fact that in Windows it's not
always valid to add a path separator (slash or
backslash) between a path specifier and a filename. For
example,
c:foo
is different from
c:\foo
and there are other examples.
I've committed a change to r-patched to fix this in
Windows only. Sounds like it's not an issue elsewhere.
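To make the separator problem concrete, here is a minimal sketch of a join that only inserts a separator when doing so cannot change the meaning of the path. The helper name join_path() is hypothetical and is not the change committed to r-patched; it only assumes that a bare drive specifier such as "c:" and a directory already ending in a slash or backslash must be left alone.

```r
# Hypothetical helper (not the r-patched fix): join a directory and a
# file name without inserting a separator where it would change meaning.
join_path <- function(dir, file) {
  last <- substring(dir, nchar(dir))            # last character of dir
  no_sep <- dir == "" |                         # nothing to separate from
    last %in% c("/", "\\") |                    # separator already present
    grepl("^[A-Za-z]:$", dir)                   # bare drive such as "c:"
  paste0(dir, ifelse(no_sep, "", "/"), file)
}

join_path("c:", "foo")       # stays drive-relative: "c:foo"
join_path("c:/tmp", "foo")   # separator added: "c:/tmp/foo"
```

With a blind paste(dir, file, sep = "/") the first call would yield "c:/foo", which names a different file.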
I think there are some potential issues with doubling
separators and final separators on dirs. On Unix file
systems /part1//part2 and /path/to/dir/ are valid.
However, file systems on Unix may not be Unix file
systems: examples are earlier MacOS systems on MacOS X
and mounted Windows and Novell systems on Linux. I would
not want to assume that all of these combinations worked.
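One way to avoid relying on doubled or trailing separators is to tidy the path before handing it to the file system. The sketch below is only an illustration: tidy_path() is a hypothetical name, it handles forward slashes only, and it deliberately leaves a leading "//" untouched because that prefix can be significant on some systems.

```r
# Hypothetical helper: collapse repeated "/" and drop a trailing "/".
tidy_path <- function(path) {
  # Collapse any run of "/" that follows a non-slash character.
  path <- gsub("([^/])/+", "\\1/", path)
  # Drop one trailing "/", but keep a path that consists of "/" alone.
  sub("(.)/$", "\\1", path)
}

tidy_path("/part1//part2")   # "/part1/part2"
tidy_path("/path/to/dir/")   # "/path/to/dir"
```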
Gabor also suggested an option to use shell globbing
instead of regular expressions to select the files in
the list, e.g.
list.files(dir="/", pattern="a*.dat", glob=T)
This would be easy to do in Windows, but from the little
I know about Unix programming, would not be so easy
there, so I haven't done anything about it.
It would be shell-dependent and OS-dependent as well as a
retrograde step, as those who wanted to use regular
expressions no longer would be able to.
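The two notations really do select different files, which is why silently switching between them would be confusing. A small illustration with made-up file names (the vector below is an assumption for the example, not taken from the thread):

```r
files <- c("a1.dat", "abc.dat", "bigdata.txt", "a.data")

# "a*.dat" read as a regular expression: zero or more "a", then any
# character, then "dat", anywhere in the name.
as_regex <- grepl("a*.dat", files)

# The same text read as a shell glob means: starts with "a", ends in
# ".dat". Written out by hand as a regular expression:
as_glob <- grepl("^a.*\\.dat$", files)

as_regex   # TRUE TRUE TRUE TRUE
as_glob    # TRUE TRUE FALSE FALSE
```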
Kurt> Right. In any case, an explicit glob() function
Kurt> seems preferable to me ...
Good idea!
More than 12 years ago, I had a similar one, and wrote a
"pat2grep()" {pattern to grep regular expression} function
--- for S-plus on Unix --- which I have now renamed to glob2regexp():
-- still not really usable outside unix (or windows with the
'sed' tool in the path), nor perfect, but maybe a good start:
sys <- function(...) system(paste(..., sep = ""), intern = TRUE)

glob2regexp <- function(pattern)
{
    ## Purpose: Change "ls pattern" to "grep regular expression" pattern.
    ## -------------------------------------------------------------------------
    ## Author: Martin Maechler ETH Zurich, ~ 1991
    ## Needs 'sed' on the path; intern = TRUE makes system() return the
    ## sed output as a string instead of only printing it.
    sys("echo '", pattern, "'| sed ",
        "'s/\\./\\\\./g;s/*/.*/g;s/?/./g; s/^/^/;s/$/$/; s/\\.\\*\\$$//'")
}
E.g., glob2regexp("a?bc*.t??") gives
^a.bc.*\.t..$
and one could use it as
list.files(...., pattern = glob2regexp("a*.dat"))
Of course, the function needs to be changed to simply use things like
sub() and gsub() --- another minor exercise for our audience ...
Martin
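As a possible answer to the exercise, here is a sketch of the same translation done with sub() and gsub() only, so it needs neither a shell nor sed. It mirrors the sed script step by step and makes no attempt to handle bracket expressions such as [abc] or escaped characters, which the original does not handle either. Current versions of R also ship utils::glob2rx() for this conversion and Sys.glob() for expanding wildcards directly, so in practice those are likely the better choice.

```r
# Pure-R sketch of glob2regexp(), following the sed script's steps.
glob2regexp <- function(pattern) {
  rx <- gsub("\\.", "\\\\.", pattern)  # literal "." becomes "\."
  rx <- gsub("\\*", ".*", rx)          # "*" becomes ".*"
  rx <- gsub("\\?", ".", rx)           # "?" becomes "."
  rx <- paste0("^", rx, "$")           # anchor at both ends
  sub("\\.\\*\\$$", "", rx)            # a trailing ".*$" is redundant
}

glob2regexp("a?bc*.t??")   # "^a.bc.*\\.t..$" as an R string
list.files(pattern = glob2regexp("a*.dat"))
```

The dots are escaped first on purpose: doing it later would also escape the dots that the "*" and "?" replacements introduce.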