On 27/02/2014 7:10 AM, Alexander Shenkin wrote:
> Hi folks,
>
> I'm interested in finding files by matching both filenames and
> directories via regex. If I have:
>
> dir1_pat1/file1.csv
> dir2_pat1/file2.csv
> dir2_pat1/file3.txt
> dir3_pat2/file4.csv
>
> I would like to find, for example, all csv files in directories that
> have "pat1" in their name:
>
> dir1_pat1/file1.csv
> dir2_pat1/file2.csv
>
> > list.files(path = ".", pattern = ".*pat1/.*\\.csv", recursive = T)
> character(0)
> > list.files(path = ".", pattern = ".*pat1/.*\\.csv", recursive = T, full.names=T)
> character(0)
> > list.files(path = ".", pattern = ".*\\.csv", recursive = T, full.names=T)
> [1] "./dir1_pat1/file1.csv" "./dir2_pat1/file2.csv" "./dir3_pat2/file4.csv"
> > list.files(path = ".", pattern = "pat1", recursive = T, full.names=T)
> character(0)
>
> I think list.files just runs the regex pattern against the file names,
> not the full path. I tried full.names=T, but it still matches against
> the file name only.
>
> Suggestions are greatly appreciated.
Two suggestions:
1. Use Sys.glob() instead of list.files(). It uses shell globbing for
the pattern instead of regular expressions, but it will handle your case:
Sys.glob("*pat1/*.csv")
should give you what you want.
2. Break up your regex into one part that matches the path and one part
that matches the filename. Use list.files() with the filename part, then
subset the result using the path part.
Duncan Murdoch
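
A minimal sketch of suggestion 2, offered as an illustration rather than as the exact code the reply had in mind. It recreates the directory layout from the question under a temporary directory, so the names root, csv_files and hits are made up for this example. It assumes "the path part" means the directory component returned by dirname().

```r
# Recreate the example layout from the question in a temporary directory.
# (root, csv_files and hits are illustrative names, not from the thread.)
root <- tempfile("pat_demo")
for (d in c("dir1_pat1", "dir2_pat1", "dir3_pat2")) {
  dir.create(file.path(root, d), recursive = TRUE)
}
files <- c("dir1_pat1/file1.csv", "dir2_pat1/file2.csv",
           "dir2_pat1/file3.txt", "dir3_pat2/file4.csv")
invisible(file.create(file.path(root, files)))

# Step 1: match on the file name only. list.files() applies `pattern`
# to the base name, so the regex here must not mention directories.
csv_files <- list.files(path = root, pattern = "\\.csv$", recursive = TRUE)

# Step 2: subset the result with a second regex applied to the
# directory part of each relative path.
hits <- csv_files[grepl("pat1", dirname(csv_files))]
print(hits)
```

With this layout, hits should contain only the csv files under dir1_pat1 and dir2_pat1. Because recursive = TRUE returns paths relative to root, dirname() is enough to recover the directory part; with deeper nesting it would return the whole relative directory path, so the second regex may need tightening.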