
from list to dataframe

2 messages · sms13+@pitt.edu, Stephen D. Weigand

#
I was wondering if someone can help me figure out the following:
I have two patient datasets, ds1 and ds2.  ds1 has fields "patid", "date", 
and "lab1".  ds2 has "patid", "date", and "lab2".  I want to find all the 
patids that have at least 2 dated records for each lab.  I started by 
splitting each dataset by patid, to create ds1.list and ds2.list.  Then I 
did some processing (with sapply) to each list to get the lengths of each 
patient list item.  Then I kind of lost my way and things got messy as I 
tried to extract just the patids of those with lengths >= 2, convert them 
to dataframes (which I didn't have much success with), and then merge the 
two dataframes to get a vector of the desired patids.  Any help would be 
much appreciated.

Thanks,
Steven
#
On May 18, 2005, at 5:39 PM, sms13+ at pitt.edu wrote:

            
Steven,

I might not exactly understand your problem, but for
what it's worth, you could try to identify the patients
in ds1 who appear at least twice and identify the patients
in ds2 who appear at least twice via

ptid1 <- c("A", "A", "B", "C", "D", "D")
keep1 <- names(table(ptid1))[table(ptid1) >= 2]
keep1

or, if the patient IDs are numeric,

ptid1 <- c(1, 1, 2, 3, 4, 4)
keep1 <- as.numeric(names(table(ptid1))[table(ptid1) >= 2])
keep1

then, with keep2 built the same way from the ds2 IDs, subset the
respective data sets via

ds1.keep <- subset(ds1, patid %in% intersect(keep1, keep2))
ds2.keep <- subset(ds2, patid %in% intersect(keep1, keep2))
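
For what it's worth, here is a rough end-to-end sketch of the steps above on made-up data. It assumes the ID column is called patid in both data sets, as in the original question. The toy values and the helper names count1, count2, and both are invented for illustration and are not from the original post.

```r
# Toy data; all values are invented for illustration only.
ds1 <- data.frame(
  patid = c("A", "A", "B", "C", "D", "D"),
  date  = as.Date(c("2005-01-01", "2005-02-01", "2005-01-05",
                    "2005-01-07", "2005-01-09", "2005-03-09")),
  lab1  = c(1.1, 1.3, 2.0, 0.7, 1.9, 2.2),
  stringsAsFactors = FALSE
)
ds2 <- data.frame(
  patid = c("A", "B", "B", "D", "D", "D"),
  date  = as.Date(c("2005-01-01", "2005-01-05", "2005-02-05",
                    "2005-01-09", "2005-02-10", "2005-03-09")),
  lab2  = c(10, 12, 11, 14, 15, 13),
  stringsAsFactors = FALSE
)

# Count records per patient in each data set.
count1 <- table(ds1$patid)
count2 <- table(ds2$patid)

# Patients with at least 2 records in each data set.
keep1 <- names(count1)[count1 >= 2]
keep2 <- names(count2)[count2 >= 2]

# Patients who qualify in both data sets.
both <- intersect(keep1, keep2)

# Keep only the records of those patients.
ds1.keep <- subset(ds1, patid %in% both)
ds2.keep <- subset(ds2, patid %in% both)
```

Here both is a character vector of the qualifying IDs. If patid is numeric in your data, wrap the names() calls in as.numeric() as shown earlier.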

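then use merge() if you want both labs in one data frame.  Note that
intersect(keep1, keep2) is already the vector of patids you asked for.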
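
One possible way to do the merge step is sketched below, with made-up data. It assumes you want to match records on both patid and date and keep dates that appear in only one data set (all = TRUE). That choice is a guess at what is wanted, not something stated in the thread, and the data values are invented.

```r
# Toy subsets standing in for ds1.keep and ds2.keep; values are invented.
ds1.keep <- data.frame(
  patid = c("D", "D"),
  date  = as.Date(c("2005-01-09", "2005-03-09")),
  lab1  = c(1.9, 2.2),
  stringsAsFactors = FALSE
)
ds2.keep <- data.frame(
  patid = c("D", "D", "D"),
  date  = as.Date(c("2005-01-09", "2005-02-10", "2005-03-09")),
  lab2  = c(14, 15, 13),
  stringsAsFactors = FALSE
)

# Match on patient and date; all = TRUE keeps unmatched rows from
# either side and fills the missing lab value with NA.
merged <- merge(ds1.keep, ds2.keep, by = c("patid", "date"), all = TRUE)
merged
```

Merging by patid alone would instead pair every ds1 record with every ds2 record for the same patient, which is usually not what you want when each data set has several dates per patient.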

Good luck!

Stephen