Hi,
On Mon, Nov 14, 2011 at 7:59 PM, Debs Majumdar <debs_stata at yahoo.com>
wrote:
> Hi,
>
> I am working with the following list of files:
>
> [1] "study_chr1.one.phased.impute2.chunk1"
> [2] "study_chr1.one.phased.impute2.chunk1_info"
> [3] "study_chr1.one.phased.impute2.chunk1_info_by_sample"
> [4] "study_chr1.one.phased.impute2.chunk1_summary"
> [5] "study_chr1.one.phased.impute2.chunk1_warnings"
>
> The folder has many other files. I am trying to use gsub to give me just
> this file: study_chr1.one.phased.impute2.chunk1
>
> With Uwe's help I have tried the following:
>
> fls <- list.files(pattern="^study") # which gives me the list above.
>
> ufls <- unique(gsub("(_.*)_.*", "\\1", fls)) # which outputs
>
> [1] "study_chr1.one.phased.impute2.chunk1"
> [2] "study_chr1.one.phased.impute2.chunk1_info_by"

So you want the file name that starts with study and ends in 1?

I'd use grep() rather than gsub(), since you just want to match from a
list. Or is there more going on than in your example?

You didn't give a reproducible dataset, but here's a fake one,
matching strings that begin with "a" instead of "study" and end
with "1", as in your example:
> testdata <- c("abcd1", "abcd1_info", "nota1", "nota1_info")
> testdata[grepl("^a.*1$", testdata)]
[1] "abcd1"
You might really just need
yourdata[grepl("1$", yourdata)]
to select filenames that end in 1.
If that's all you really need, you've made it far too complicated.
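
For instance, here is a minimal sketch using the five file names from your message. They are typed in by hand since I don't have your folder; in practice they would come from list.files(). The "chunk[0-9]+$" pattern is my assumption about how your other chunk files are named (chunk2, chunk10, and so on), so adjust it if that guess is wrong:

```r
# File names copied from the original post; normally this vector would
# come from list.files(pattern = "^study").
fls <- c("study_chr1.one.phased.impute2.chunk1",
         "study_chr1.one.phased.impute2.chunk1_info",
         "study_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study_chr1.one.phased.impute2.chunk1_summary",
         "study_chr1.one.phased.impute2.chunk1_warnings")

# Keep names that start with "study" and end in "chunk" plus digits.
# ASSUMPTION: other chunks follow the same "chunk<number>" naming.
chunk_only <- fls[grepl("^study.*chunk[0-9]+$", fls)]

print(chunk_only)
# [1] "study_chr1.one.phased.impute2.chunk1"
```

grepl() returns one TRUE/FALSE per element, so it works directly as a subscript and nothing needs to be stripped off the names with gsub().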
Sarah
--
Sarah Goslee
http://www.functionaldiversity.org