Message-ID: <CAM_vjukaecZKuYjhikYRC3vRnD8+f6_+qofCn=3gvoWOxrkE-A@mail.gmail.com>
Date: 2011-11-15T01:12:03Z
From: Sarah Goslee
Subject: gsub help
In-Reply-To: <1321318746.55423.YahooMailNeo@web114702.mail.gq1.yahoo.com>

Hi,

On Mon, Nov 14, 2011 at 7:59 PM, Debs Majumdar <debs_stata at yahoo.com> wrote:
> Hi,
>
> I am working with the following list of files:
>
> [1] "study_chr1.one.phased.impute2.chunk1"
> [2] "study_chr1.one.phased.impute2.chunk1_info"
> [3] "study_chr1.one.phased.impute2.chunk1_info_by_sample"
> [4] "study_chr1.one.phased.impute2.chunk1_summary"
> [5] "study_chr1.one.phased.impute2.chunk1_warnings"
>
> The folder has many other files. I am trying to use gsub to give me just this file: study_chr1.one.phased.impute2.chunk1
>
> With Uwe's help I have tried the following:
>
> fls <- list.files(pattern="^study") # which gives me the list above.
>
> ufls <- unique(gsub("(_.*)_.*", "\\1", fls)) # which outputs
>
> [1] "study_chr1.one.phased.impute2.chunk1"
> [2] "study_chr1.one.phased.impute2.chunk1_info_by"

So you want the file name that starts with study and ends in 1?

I'd use grep() or grepl() rather than gsub(), since you just want to
pick matching elements out of a vector. Or is there more going on than
in your example?

You didn't give a reproducible dataset, but here's a fake one. It
matches strings that begin with "a" instead of "study" and end with
"1", as in your example:

> testdata <- c("abcd1", "abcd1_info", "nota1", "nota1_info")
> testdata[grepl("^a.*1$", testdata)]
[1] "abcd1"

You might really just need
yourdata[grepl("1$", yourdata)]
to select filenames that end in 1.

If that's all you really need, you've made it far too complicated.
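
For what it's worth, here is the same idea applied to the five file
names from your message. This is only a sketch: fls is typed in by hand
rather than read with list.files(), and the second pattern assumes that
every file you want ends in "chunk" followed by digits, which I can't
confirm from one example.

```r
# File names copied from the original message (normally from list.files()).
fls <- c("study_chr1.one.phased.impute2.chunk1",
         "study_chr1.one.phased.impute2.chunk1_info",
         "study_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study_chr1.one.phased.impute2.chunk1_summary",
         "study_chr1.one.phased.impute2.chunk1_warnings")

# Simplest version: keep the names that end in "1".
ends_in_1 <- fls[grepl("1$", fls)]

# Assumed generalization: names that start with "study" and end in
# "chunk" plus one or more digits, so chunk2, chunk10, ... also match.
chunk_only <- grep("^study.*chunk[0-9]+$", fls, value = TRUE)

print(ends_in_1)
print(chunk_only)
```

Both calls return only "study_chr1.one.phased.impute2.chunk1" for this
list. grep(..., value = TRUE) and indexing with grepl() are
interchangeable here; use whichever reads better to you.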

Sarah




-- 
Sarah Goslee
http://www.functionaldiversity.org