Message-ID: <3CC20FC2-99F0-49C5-A56A-0AC1431AAFA0@gmail.com>
Date: 2012-08-03T07:31:53Z
From: Peter Dalgaard
Subject: all duplicated wanted
In-Reply-To: <E67F2D0E-9C82-4233-BFDA-36D8D891F4DC@gmail.com>

On Aug 3, 2012, at 09:06 , Weijia Wang wrote:

> Hi,
> 
> Has anyone been able to figure out how to print all duplicated observations?
> 
> I have a dataset, with patients ID, and other lab records.
> 
> Some patients have multiple lab records, but 'duplicated' ID will only show me the duplicates, not the original observation.
> 
> How can I print both the original one and the duplicates?

Something like this?

dd[dd$ID %in% unique(dd$ID[duplicated(dd$ID)]), ]
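
For a concrete data frame, the idea might look roughly like the sketch below. The data and the column names ID and lab are made up for illustration, not taken from the original question; the only assumption is that the patient identifier is a column of dd.

```r
# Hypothetical data: column names ID and lab are assumptions for illustration
dd <- data.frame(ID  = c(1, 2, 2, 3, 4, 4, 4),
                 lab = c(5.1, 6.3, 6.0, 4.8, 7.2, 7.5, 7.1))

# IDs that occur more than once (duplicated() flags 2nd and later occurrences)
dup_ids <- unique(dd$ID[duplicated(dd$ID)])

# Keep every row whose ID is among them, first occurrence included
all_dups <- dd[dd$ID %in% dup_ids, ]
all_dups
```

Here patients 2 and 4 have repeated records, so all five of their rows are kept and the single records for patients 1 and 3 are dropped.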

Let's try:

> ID <- sample(1:10, 10, replace=TRUE)
> table(ID)
ID
 1  2  3  4  7 10 
 1  1  3  1  2  2 
> ID[ID %in% unique(ID[duplicated(ID)])]
[1]  7  7 10  3  3  3 10

The unique() call is only there for efficiency; dropping it gives the same result:

> ID[ID %in% ID[duplicated(ID)]]
[1]  7  7 10  3  3  3 10
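
An alternative that avoids %in% altogether is to run duplicated() in both directions and combine the results. This is only a sketch: the ID vector below is a hand-written stand-in with the same counts as the random sample above, not the actual sample.

```r
# Hypothetical vector with the same counts as the table above
ID <- c(7, 1, 7, 10, 3, 2, 3, 3, 4, 10)

# duplicated(ID) flags every occurrence except the first;
# duplicated(ID, fromLast = TRUE) flags every occurrence except the last.
# Their union flags every element whose value appears more than once.
is_dup <- duplicated(ID) | duplicated(ID, fromLast = TRUE)

ID[is_dup]
```

For a data frame the same logical vector can be used as a row index, e.g. dd[is_dup, ] when it is computed from dd$ID.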


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com