how do I remove entries in data frame from a vector
Makes sense, thank you!
On Wed, 21 Oct 2020 at 17:46, Rolf Turner <r.turner at auckland.ac.nz> wrote:
On Wed, 21 Oct 2020 16:15:22 -0500 Ana Marija <sokovic.anamarija at gmail.com> wrote:
Hello, I have a data frame with one column:
remove
V1
1 ABAFT_g_4RWG569_BI_SNP_A10_35096
2 ABAFT_g_4RWG569_BI_SNP_B12_35130
3 ABAFT_g_4RWG569_BI_SNP_E09_35088
4 ABAFT_g_4RWG569_BI_SNP_E12_35136
5 ABAFT_g_4RWG569_BI_SNP_F11_35122
6 ABAFT_g_4RWG569_BI_SNP_F12_35138
7 ABAFT_g_4RWG569_BI_SNP_G07_35060
8 ABAFT_g_4RWG569_BI_SNP_G12_35140
I want to remove these 8 entries, listed in the "remove" data frame, from a
vector that looks like this:
head(celFiles)
[1]
"/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A01_34952.CEL"
[2]
"/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A02_34968.CEL"
[3]
"/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A03_34984.CEL"
[4]
"GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A04_35000.CEL"
[5]
"/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A05_35016.CEL"
[6]
"/GOKIND/75327/PhenoGenotypeFiles/RootStudyConsentSet_phs000018.GAIN_GoKinD.v2.p1.c1.DS-T1DCR-IRB/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A06_35032.CEL"
...
I tried doing this:
b = celFiles[!basename(celFiles) %in% as.character(remove$V1)]
but none of the 8 entries in the "remove" data frame were removed.
Please advise,
Ana
I would advise you to *look* at basename(celFiles)!!!
The entries end in ".CEL"; the names in remove$V1 do not. So %in%
finds no matches. Perhaps:
b <- celFiles[!basename(celFiles) %in%
paste0(as.character(remove$V1),".CEL")]
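An alternative, offered only as a sketch: instead of adding ".CEL" to the names in remove$V1, strip the extension from the file names so both sides have the same form. The paths below are shortened, hypothetical stand-ins for the real ones, and the second and third files are invented so that something actually matches. tools::file_path_sans_ext() is part of the tools package that ships with R.

```r
# Hypothetical stand-ins for the poster's objects (paths shortened).
celFiles <- c("/data/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A01_34952.CEL",
              "/data/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A10_35096.CEL",
              "/data/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_B12_35130.CEL")
remove <- data.frame(V1 = c("ABAFT_g_4RWG569_BI_SNP_A10_35096",
                            "ABAFT_g_4RWG569_BI_SNP_B12_35130"),
                     stringsAsFactors = FALSE)

# Strip the directory and the extension so the file names have the
# same form as the entries of remove$V1.
ids <- tools::file_path_sans_ext(basename(celFiles))

# Keep only the files whose bare name is not listed in remove$V1.
b <- celFiles[!ids %in% as.character(remove$V1)]
b
```

With these toy inputs only the A01 file should survive. Which side to adjust is a matter of taste; stripping the extension avoids hard-coding ".CEL", assuming every file has exactly one extension.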
Note that, for the data that you have presented, none of the entries of
celFiles "match up" with "remove" so it is *still* the case that (for
the data shown) none of the entries will be removed. So your example
was bad.
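One way to catch this situation before filtering is to count how many of the names to be removed actually occur among the file names. The following is a minimal sketch with shortened, hypothetical paths; the variable names targets, n_found and not_found are made up for illustration.

```r
# Hypothetical stand-ins mirroring the situation in the post:
# nothing listed in remove$V1 occurs among the file names.
celFiles <- c("/data/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A01_34952.CEL",
              "/data/GenotypeFiles/ABAFT_g_4RWG569_BI_SNP_A02_34968.CEL")
remove <- data.frame(V1 = c("ABAFT_g_4RWG569_BI_SNP_A10_35096",
                            "ABAFT_g_4RWG569_BI_SNP_B12_35130"),
                     stringsAsFactors = FALSE)

# File names we expect to find, with the extension appended.
targets <- paste0(as.character(remove$V1), ".CEL")

# How many of the names to remove are actually present?
n_found <- sum(targets %in% basename(celFiles))

# Which of the names to remove have no corresponding file?
not_found <- setdiff(targets, basename(celFiles))

# The filter itself; it leaves celFiles unchanged when n_found is 0.
b <- celFiles[!basename(celFiles) %in% targets]
```

If n_found is 0, as it is here, the filter is working correctly and there is simply nothing to remove in the data at hand.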
cheers,
Rolf Turner
--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276