Hi all,
I realise that the convention is to provide a working example of my problem
but the data are of a sensitive nature so I'm not able to do that in this
case.
I need to query a database for multiple search terms:
db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"),
                     test1 = c(1, 2, 1.3, 3),
                     test2 = c(56L, 27L, 58L, 2L),
                     test3 = c(1.1, 28, 9, 1.2)),
                .Names = c("ind", "test1", "test2", "test3"),
                class = "data.frame", row.names = c(NA, -4L))
terms_include <- c("1","2","3")
terms_exclude <- c("1.1","1.2","1.3")
So I need to write a loop that searches the entire data frame for each
value in terms_include. I thought of using apply with grepl and subset?
At the same time, if a value from terms_include occurs in the same row as
a value from terms_exclude, then that row must be excluded from the output
data frame.
I'm not sure where to even begin; I've only used subset at a very basic
level. The final database is much larger and there are many more search
terms than are presented here, so I need to be able to loop over the data
frame and return a final df that has my searched values in at least one of
the columns.
Your help and assistance are much appreciated,
Natalie
-----
Natalie Van Zuydam
PhD Student
University of Dundee
nvanzuydam at dundee.ac.uk
--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3874967.html
Sent from the R help mailing list archive at Nabble.com.
Subsetting a data frame with multiple values and exclusions.
3 messages · natalie.vanzuydam, Dennis Murphy
Hi:

Is this what you're after?

f <- function(x) !any(x %in% terms_exclude) && any(x %in% terms_include)
db[apply(db[, -1], 1, f), ]
   ind test1 test2 test3
2 ind2     2    27  28.0
4 ind4     3     2   1.2

HTH,
Dennis
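Since the real data frame is much larger and has many more search terms, the same rule (keep a row if it holds at least one include term and no exclude term) can also be sketched column by column instead of row by row. This is only a sketch under two assumptions that the thread does not state: the terms are meant as exact values rather than text patterns (grepl("1", "1.1") would also match), and they can be compared as numbers via as.numeric(). The helper name row_has() is hypothetical.

```r
# Example data from the original post
db <- data.frame(ind = c("ind1", "ind2", "ind3", "ind4"),
                 test1 = c(1, 2, 1.3, 3),
                 test2 = c(56L, 27L, 58L, 2L),
                 test3 = c(1.1, 28, 9, 1.2),
                 stringsAsFactors = FALSE)
terms_include <- c("1", "2", "3")
terms_exclude <- c("1.1", "1.2", "1.3")

# Hypothetical helper: for each row, TRUE if any column holds one of `terms`.
# Each column is tested with %in%, and the per-column results are OR-ed together.
row_has <- function(df, terms) {
  Reduce(`|`, lapply(df, function(col) col %in% terms))
}

vals <- db[, -1]  # drop the identifier column before searching

# Assumption: terms are compared as exact numeric values
keep <- row_has(vals, as.numeric(terms_include)) &
  !row_has(vals, as.numeric(terms_exclude))

result <- db[keep, ]
print(result)
```

Under exact numeric comparison this keeps only ind2 on the example data: ind1, ind3 and ind4 each hold an excluded value (1.1, 1.3 and 1.2 respectively). Working per column means the number of %in% calls grows with the number of columns rather than the number of rows.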
On Wed, Oct 5, 2011 at 8:53 AM, natalie.vanzuydam <nvanzuydam at gmail.com> wrote:
[...]
Thanks. Such a short and sweet answer that does what it should.
-----
Natalie Van Zuydam
PhD Student
University of Dundee
nvanzuydam at dundee.ac.uk