gsub help
Hi,
On Mon, Nov 14, 2011 at 7:59 PM, Debs Majumdar <debs_stata at yahoo.com> wrote:
Hi,
I am working with the following list of files:
[1] "study_chr1.one.phased.impute2.chunk1"
[2] "study_chr1.one.phased.impute2.chunk1_info"
[3] "study_chr1.one.phased.impute2.chunk1_info_by_sample"
[4] "study_chr1.one.phased.impute2.chunk1_summary"
[5] "study_chr1.one.phased.impute2.chunk1_warnings"
The folder has many other files. I am trying to use gsub to give me just this file: study_chr1.one.phased.impute2.chunk1
With Uwe's help I have tried the following:
fls <- list.files(pattern="^study") # which gives me the list above.
ufls <- unique(gsub("(_.*)_.*", "\\1", fls)) # which outputs
[1] "study_chr1.one.phased.impute2.chunk1"
[2] "study_chr1.one.phased.impute2.chunk1_info_by"
So you want the file name that starts with study and ends in 1? I'd use grep() rather than gsub(), since you just want to match from a list, or is there more going on than in your example? You didn't give a reproducible dataset, but here's a fake one, matching strings that begin with "a" instead of "study", and ending with "1" as in your example:
testdata <- c("abcd1", "abcd1_info", "nota1", "nota1_info")
testdata[grepl("^a.*1$", testdata)]
[1] "abcd1"
You might really just need
yourdata[grepl("1$", yourdata)]
to select filenames that end in 1.
If that's all you really need, you've made it far too complicated.
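For what it's worth, here is a rough sketch of the same approach applied to the five file names from the question. It assumes the wanted file always starts with "study" and ends in "chunk" followed by digits; that pattern is a guess from the single example shown, so adjust it if the real names differ.

```r
# File names copied from the question (normally these come from list.files())
fls <- c("study_chr1.one.phased.impute2.chunk1",
         "study_chr1.one.phased.impute2.chunk1_info",
         "study_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study_chr1.one.phased.impute2.chunk1_summary",
         "study_chr1.one.phased.impute2.chunk1_warnings")

# Assumed pattern: starts with "study", ends with "chunk" plus one or more digits
wanted <- fls[grepl("^study.*chunk[0-9]+$", fls)]
wanted
# [1] "study_chr1.one.phased.impute2.chunk1"
```

The same regular expression could probably be passed straight to list.files(pattern = ...), which would skip the separate filtering step.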
Sarah
Sarah Goslee http://www.functionaldiversity.org