Hi folks
I have two data frames. I know that the nth (let's say the 7th) row
in the first data frame (sequence) is there in the second
(today.sequence). When I try to check that by doing 'sequence[7,]
%in% today.sequence', I get all FALSE when it should be all TRUE.
I'm certain I'm making some trivial mistake. Any solutions?
The code to recreate the data frames and see for yourself is:
----
sequence <- structure(list(DATE = structure(c(14549, 14549, 14553, 14550,
14557, 14550, 14551, 14550), class = "Date"), DATASET = c(1L,
2L, 1L, 2L, 2L, 3L, 3L, 4L), REP = c(1L, 0L, 2L, 2L, 3L, 0L,
1L, 0L), WRONGS_ABS = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), WRONGS_RATIO = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L), DONE = c(1L, 1L, 0L, 1L, 0L, 1L,
0L, 0L)), .Names = c("DATE", "DATASET", "REP", "WRONGS_ABS",
"WRONGS_RATIO", "DONE"), class = "data.frame", row.names = c(NA,
-8L))
today.sequence <- structure(list(DATE = structure(c(14551, 14550),
class = "Date"),
DATASET = 3:4, REP = c(1L, 0L), WRONGS_ABS = c(0L, 0L),
WRONGS_RATIO = c(0L,
0L), DONE = c(0L, 0L)), .Names = c("DATE", "DATASET", "REP",
"WRONGS_ABS", "WRONGS_RATIO", "DONE"), row.names = 7:8, class = "data.frame")
sequence[7,] # You should see '2009-11-03 3 1 0 0 0'
today.sequence # You can clearly see that sequence[7,] is the first row in today.sequence
sequence[7,] %in% today.sequence # This should show 'TRUE TRUE TRUE TRUE TRUE TRUE'.
# Instead it shows 'FALSE FALSE FALSE FALSE FALSE FALSE'
----
Thanks
Kaushik Krishnan
(kaushik.s.krishnan at gmail.com)
?"%in%" says "x" and "table" must be vectors. You supplied
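data.frames. So %in% is coercing both sides to vectors, as if by
as.character(today.sequence), which turns each whole column into a
single string. No single value from row 7 can equal one of those
strings, hence all FALSE.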
Perhaps you should paste the columns together first:
x <- do.call("paste", c(sequence, sep = "::"))
table <- do.call("paste", c(today.sequence, sep = "::"))
x[7] %in% table
I'm not sure if this is what you want/need, but it does match your example.
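To make the coercion and the paste approach concrete, here is a minimal sketch. It rebuilds the two data frames in compact form (same values as the structure() dump in the original post), prints what as.character() produces for each side, and then compares pasted row keys. The names key_x and key_table are only illustrative, and the "::" separator assumes that string never occurs inside the data.

```r
# Compact rebuild of the thread's data (same values as the structure() dump).
sequence <- data.frame(
  DATE = as.Date(c(14549, 14549, 14553, 14550, 14557, 14550, 14551, 14550),
                 origin = "1970-01-01"),
  DATASET = c(1L, 2L, 1L, 2L, 2L, 3L, 3L, 4L),
  REP = c(1L, 0L, 2L, 2L, 3L, 0L, 1L, 0L),
  WRONGS_ABS = 0L,
  WRONGS_RATIO = 0L,
  DONE = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L)
)
today.sequence <- sequence[7:8, ]

# Each side becomes one string per column. For the one-row frame that is a
# single value; for the two-row frame it is a deparsed whole column.
print(as.character(sequence[7, ]))
print(as.character(today.sequence))

# The original comparison: no column string matches.
print(sequence[7, ] %in% today.sequence)

# Paste each row into a single key, then compare keys.
# Assumption: "::" never appears in the data itself.
key_x     <- do.call("paste", c(sequence, sep = "::"))
key_table <- do.call("paste", c(today.sequence, sep = "::"))
print(key_x[7] %in% key_table)
```

Using names other than x and table also avoids masking base::table for the rest of the session.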
HTH,
--sundar
On Tue, Nov 3, 2009 at 7:53 AM, Kaushik Krishnan
<kaushik.s.krishnan at gmail.com> wrote:
Kaushik,
The documentation doesn't quite tell (me, anyway) how the function behaves
when 'table' is a list (or data.frame). You'll need to dig into match.c
or experiment with match() or %in% to see what it is actually doing.
But it looks like it is matching whole columns of the data.frame rather
than elements within each column:
> sequence %in% sequence
[1] TRUE TRUE TRUE TRUE TRUE TRUE
> sequence %in% rev(sequence)
[1] TRUE TRUE TRUE TRUE TRUE TRUE
> sequence[1,] %in% sequence
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sequence[1,] %in% sequence[1,]
[1] TRUE TRUE TRUE TRUE TRUE TRUE
Maybe you wanted something like
mapply( function(x,y) x%in%y , sequence[7, ], today.sequence )
??
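One caveat with the mapply() idea: it tests each column separately, so it can report all TRUE even when no single row of the table matches. The sketch below shows that case and one possible row-wise check. row_in() is a hypothetical helper written for this example, not a base R function, and it assumes both data frames have the same columns in the same order and no NA values.

```r
# Compact rebuild of the thread's data (same values as the structure() dump).
sequence <- data.frame(
  DATE = as.Date(c(14549, 14549, 14553, 14550, 14557, 14550, 14551, 14550),
                 origin = "1970-01-01"),
  DATASET = c(1L, 2L, 1L, 2L, 2L, 3L, 3L, 4L),
  REP = c(1L, 0L, 2L, 2L, 3L, 0L, 1L, 0L),
  WRONGS_ABS = 0L,
  WRONGS_RATIO = 0L,
  DONE = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L)
)
today.sequence <- sequence[7:8, ]

# Hypothetical helper: TRUE if the one-row data frame `row` equals some
# complete row of `df`. Assumes identical column names and no NAs.
row_in <- function(row, df) {
  stopifnot(nrow(row) == 1L, identical(names(row), names(df)))
  # One logical vector per column (length nrow(df)), combined with AND.
  hits <- Reduce(`&`, Map(function(x, y) y == x, row, df))
  any(hits)
}

print(row_in(sequence[7, ], today.sequence))  # row 7 is in the table
print(row_in(sequence[1, ], today.sequence))  # row 1 is not

# A row that mixes values from both table rows: DATE and REP from row 7,
# DATASET from row 8.
mixed <- sequence[7, ]
mixed$DATASET <- 4L

# Column-wise membership accepts it, row-wise membership does not.
colwise <- mapply(function(x, y) x %in% y, mixed, today.sequence)
print(colwise)
print(row_in(mixed, today.sequence))
```

If you only need to know which rows the two data frames have in common, merge(sequence, today.sequence) is another option, since by default it joins on all shared column names.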
HTH,
Chuck
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901