-----Original Message-----
From: Carlos J. Gil Bellosta [mailto:cgb at datanalytics.com]
Sent: Tuesday, January 13, 2009 2:55 PM
To: Doran, Harold
Cc: r-help at r-project.org
Subject: Re: [R] Comparing elements for equality
Hello,
You could build your output dataframe along the following lines:
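# TRUE if all elements of x are the same value
foo <- function(x) length( unique(x) ) == 1
# one row per id: group size, plus one equality flag per variable
results <- data.frame(
    freq = tapply( dat$id, dat$id, length ),
    var1 = tapply( dat$var1, dat$id, foo ),
    var2 = tapply( dat$var2, dat$id, foo )
)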
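For reference, a minimal sketch of the same idea run end to end on the example data from the original message. It assumes dat is defined exactly as in the question. The helper name all_same, the vars vector, and the use of lapply to cover every non-id column are illustrative additions, not part of the code above.

```r
# Example data, as given in the original question
dat <- data.frame(id   = c(1, 1, 2, 2, 2),
                  var1 = c(10, 10, 20, 20, 25),
                  var2 = c('foo', 'foo', 'foo', 'foobar', 'foo'))

# TRUE when every element of x is the same value (hypothetical helper name)
all_same <- function(x) length(unique(x)) == 1

# Columns to check: everything except the grouping variable
vars <- setdiff(names(dat), "id")

# One logical vector per column, computed within each id
checks <- lapply(dat[vars],
                 function(col) as.vector(tapply(col, dat$id, all_same)))

# Assemble the result: id, group size, then one flag per checked column
results <- data.frame(id   = sort(unique(dat$id)),
                      freq = as.vector(table(dat$id)),
                      checks)
print(results)
```

With this data, id 1 should come out with freq 2 and TRUE in both columns, and id 2 with freq 3 and FALSE in both. Looping over vars avoids writing one tapply line per variable when there are many columns.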
Best regards,
Carlos J. Gil Bellosta
http://www.datanalytics.com
On Tue, 2009-01-13 at 14:17 -0500, Doran, Harold wrote:
Suppose I have a dataframe as follows:
dat <- data.frame(id = c(1,1,2,2,2),
                  var1 = c(10,10,20,20,25),
                  var2 = c('foo', 'foo', 'foo', 'foobar', 'foo'))
Now, if I were to subset by id, such as:
  id var1 var2
1  1   10  foo
2  1   10  foo
I can see that the elements in var1 are exactly the same and the
elements in var2 are exactly the same. However,
  id var1   var2
3  2   20    foo
4  2   20 foobar
5  2   25    foo
shows that the elements are not the same for either variable in this
instance. So, what I am looking to create is a data frame that would
look like this:
id freq  var1  var2
 1    2  TRUE  TRUE
 2    3 FALSE FALSE
Here freq is the number of times the ID is repeated in the dataframe.
A TRUE appears in the cell if all elements in the column are the same
for the ID, and FALSE otherwise. For my problem it does not matter
which values differ.
The way I am thinking about tackling this is to loop through the ID
variable and compare the values in the various columns of the
dataframe. The problem I am encountering is that I don't think
all.equal or identical are the right functions in this case.
So, say I wanted to compare the elements of var1 for id == 1. I
would have
x <- c(10,10)
Of course, the following works
[1] TRUE
As would a similar call to identical. However, what if I have a
vector of values (or if the column consists of names) that I need to
assess for equality when I am trying to automate a process over
thousands of cases? As in the example above, the vector may contain
only two values or it may contain many more. The number of elements in
the vector differs by id.
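One way to frame the check so that the vector length does not matter is sketched below. all.equal and identical each compare two whole objects, whereas the need here is to compare the elements within a single vector. The helper names all_same and all_same_first are hypothetical and used only for illustration.

```r
# TRUE when all elements of x are the same value, for any length or type
# (hypothetical helper name)
all_same <- function(x) length(unique(x)) == 1

all_same(c(10, 10))                   # two numeric values
all_same(c(20, 20, 25))               # three values, one differs
all_same(c('foo', 'foobar', 'foo'))   # character data

# Alternative: compare every element against the first one.
# Note: this returns NA rather than TRUE/FALSE when x is all NA.
all_same_first <- function(x) all(x == x[1])

all_same_first(c(10, 10))
all_same_first(c(20, 20, 25))
```

Either helper can be applied per id, since neither depends on how many elements the group contains.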
Any thoughts?
Harold