William Dunlap
2011-May-31 19:46 UTC
[R] how to tell if two file paths refer to the same file
Does R have a standard function that takes two file paths
(e.g., "./myDirectory/file" and "myDirectory/file")
and returns TRUE if those paths refer to the same file?
The paths may take different routes ("absolute" or
relative paths or via different symbolic or hard links)
to the same file or may use different naming conventions
(Windows or DOS 8.3 on Windows). If the file paths do not
refer to actual files the answer is not generally well-defined
and I'm not sure what the best approach would be.
S+ has a match.path() function that is like match() but
the equality test is that the strings refer to the same
file. On Unix it checks that the inode and device numbers
are the same; on Windows that _fullpath() returns the same
thing for both paths. E.g.,
> match.path(c("c:/PROGRA~1",
"c:\\temp",
"C:/Program Files/../Program Files"),
c("C:\\Program Files",
"C:\\Temp"))
[1] 1 2 1
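For comparison, the Unix-side check (same device and inode numbers) can be roughly sketched in R by shelling out to `test f1 -ef f2`. This is only an illustration, not how match.path() is implemented: the name same_file_unix is hypothetical, it assumes a Unix-alike whose `test` supports the widely available (but not strictly POSIX) `-ef` operator, and it returns NA when either path does not exist.

```r
# Hypothetical helper, not part of base R or S+.
# Returns TRUE if two existing paths name the same file, as judged by
# `test f1 -ef f2` (same device and inode numbers), FALSE if they do not,
# and NA if either path does not exist. Unix-alikes only.
same_file_unix <- function(f1, f2) {
  stopifnot(.Platform$OS.type == "unix")
  if (!file.exists(f1) || !file.exists(f2)) return(NA)
  # system2() returns the exit status when output is not captured:
  # 0 means `test` succeeded, i.e. both paths refer to the same file.
  status <- system2("test", c(shQuote(f1), "-ef", shQuote(f2)))
  status == 0L
}
```

Because `-ef` follows symbolic links and compares the underlying file, this should treat both hard links and symbolic links to a file as the same file.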
I know about R's normalizePath() but it doesn't map all ways
of referring to a file to the same string so
normalizePath(f1) == normalizePath(f2)
is not a reliable test. E.g., with R 2.13.0 on Windows it
doesn't use a standard capitalization:
> cat(normalizePath(c("c:/Program Files",
"C:/program files",
"c:\\PROGRA~1")), sep="\n")
c:\Program Files
C:\program files
c:\Program Files
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Duncan Murdoch
2011-May-31 20:17 UTC
[R] how to tell if two file paths refer to the same file
On 11-05-31 3:46 PM, William Dunlap wrote:
> Does R have a standard function that takes two file paths
> (e.g., "./myDirectory/file" and "myDirectory/file")
> and returns TRUE if those paths refer to the same file?
I don't think so. I think normalizePath is as close as we get portably.
On Windows, toupper(shortPathName()) might do a better job, but I'm sure
there are exceptions for it too. And there are the really crazy
exceptions, like mapped drives, UNC names, etc., where I think it is
more or less hopeless to do anything without modifying the file.
Duncan Murdoch
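The suggestion above (normalizePath everywhere, plus toupper(shortPathName()) on Windows) might be combined along these lines. This is a best-effort sketch under stated assumptions, not a reliable test: path_key and same_path are hypothetical names, shortPathName() is only available in R for Windows, and the comparison will still miss hard links, mapped drives, UNC names and similar cases.

```r
# Hypothetical helper: build a best-effort comparison key for a path.
path_key <- function(path) {
  # Resolve relative components (and symbolic links on Unix-alikes);
  # mustWork = FALSE avoids a warning or error for nonexistent paths.
  key <- normalizePath(path, winslash = "/", mustWork = FALSE)
  if (.Platform$OS.type == "windows") {
    # Fold 8.3 vs. long names and letter case, as suggested in the thread.
    # utils::shortPathName() exists only in R builds for Windows.
    key <- toupper(utils::shortPathName(key))
  }
  key
}

# Hypothetical helper: TRUE where the two keys agree (vectorized via ==).
same_path <- function(f1, f2) path_key(f1) == path_key(f2)
```

With an existing file, same_path("./myDirectory/file", "myDirectory/file") should be TRUE on either platform, since both spellings normalize to the same absolute path.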