William Dunlap
2011-May-31 19:46 UTC
[R] how to tell if two file paths refer to the same file
Does R have a standard function that takes two file paths (e.g., "./myDirectory/file" and "myDirectory/file") and returns TRUE if those paths refer to the same file? The paths may take different routes (absolute or relative paths, or different symbolic or hard links) to the same file, or may use different naming conventions (long names or DOS 8.3 names on Windows). If the file paths do not refer to actual files the answer is not generally well-defined and I'm not sure what the best approach would be.

S+ has a match.path() function that is like match() but the equality test is that the strings refer to the same file. On Unix it checks that the inode and device numbers are the same; on Windows that _fullpath() returns the same thing for both paths. E.g.,

   > match.path(c("c:/PROGRA~1",
                  "c:\\temp",
                  "C:/Program Files/../Program Files"),
                c("C:\\Program Files",
                  "C:\\Temp"))
   [1] 1 2 1

I know about R's normalizePath() but it doesn't map all ways of referring to a file to the same string, so
   normalizePath(f1) == normalizePath(f2)
is not a reliable test. E.g., with R 2.13.0 on Windows it doesn't use a standard capitalization:

   > cat(normalizePath(c("c:/Program Files",
                         "C:/program files",
                         "c:\\PROGRA~1")), sep="\n")
   c:\Program Files
   C:\program files
   c:\Program Files

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
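For illustration, a minimal sketch of a match()-style comparison built on normalizePath(), with case folded on Windows to work around the capitalization issue shown above. The names canonical_path(), match_path_naive() and same_file_naive() are hypothetical helpers, not part of R or S+, and this is not how S+'s match.path() is implemented: it compares strings rather than inode and device numbers, so it should not be expected to detect hard links.

```r
# Hypothetical helper: reduce each path to a comparable string.
canonical_path <- function(paths) {
  # mustWork = FALSE: paths that do not exist are returned without an error.
  p <- normalizePath(paths, winslash = "/", mustWork = FALSE)
  if (.Platform$OS.type == "windows") {
    # Assumption: the file system is case-insensitive, so fold case.
    p <- tolower(p)
  }
  p
}

# Like match(), but compares the canonical forms of the paths.
match_path_naive <- function(x, table) {
  match(canonical_path(x), canonical_path(table))
}

# Element-wise test of whether two paths have the same canonical form.
same_file_naive <- function(f1, f2) {
  canonical_path(f1) == canonical_path(f2)
}

# Usage: two spellings of the same temporary file.
d <- tempfile("samefile")
dir.create(d)
f <- file.path(d, "file")
file.create(f)
same_file_naive(f, file.path(d, ".", "file"))
```

Folding case unconditionally on Windows is a simplification; it would give wrong answers on a case-sensitive file system mounted there.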
Duncan Murdoch
2011-May-31 20:17 UTC
[R] how to tell if two file paths refer to the same file
On 11-05-31 3:46 PM, William Dunlap wrote:
> Does R have a standard function that takes two file paths
> (e.g., "./myDirectory/file" and "myDirectory/file")
> and returns TRUE if those paths refer to the same file?

I don't think so. I think normalizePath is as close as we get portably. On Windows, toupper(shortPathName()) might do a better job, but I'm sure there are exceptions for it too. And there are the really crazy exceptions, like mapped drives, UNC names, etc., where I think it is more or less hopeless to do anything without modifying the file.

Duncan Murdoch
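A rough sketch of the toupper(shortPathName()) idea, for illustration only. The names path_key() and same_file_key() are hypothetical; shortPathName() is available only in Windows builds of R, so the non-Windows branch falling back to normalizePath() is an assumption added here to keep the example runnable elsewhere. As noted above, mapped drives and UNC names are among the cases this would be expected to miss.

```r
# Hypothetical helper: build a comparison key for each path.
path_key <- function(paths) {
  if (.Platform$OS.type == "windows") {
    # Make the path absolute first, then convert to the 8.3 short form
    # and fold case. shortPathName() exists only on Windows.
    full <- normalizePath(paths, mustWork = FALSE)
    toupper(utils::shortPathName(full))
  } else {
    # Assumption: on other platforms, fall back to normalizePath().
    normalizePath(paths, mustWork = FALSE)
  }
}

# Element-wise test of whether two paths produce the same key.
same_file_key <- function(f1, f2) {
  path_key(f1) == path_key(f2)
}

# Usage: compare two spellings of the same temporary file.
d <- tempfile("shortname")
dir.create(d)
f <- file.path(d, "file")
file.create(f)
same_file_key(f, file.path(d, ".", "file"))
```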