Message-ID: <9EA364B2FAFC264A86CED5A4404A19DB4EDE56F0@AMXPRD0310MB366.eurprd03.prod.outlook.com>
Date: 2013-07-05T09:28:44Z
From: Pancho Mulongeni
Subject: Unique in discerning missing values NA

Hi,
I am trying to remove duplicate patient numbers from a clinical record, so I used unique(). The first 40 values are:
menPatients[1:40,1]
 [1] abr1160(C)/001 ABR1363(A)/001 ABR1363(A)/001 ABR1363(A)/001 abr1772(B)/001
 [6] AFR0003/001    AFR0003/001    afr0290(C)/001 afr1861(B)/001 Aga0007/001   
[11] AGA1548(A)/001 AGA1548(A)/001 AGA1548(A)/001 AGU1680(A)/001 AGU1680(A)/001
[16] AIS0492/001    AIS0492/001    AKO4268(C)/001 AKO4268(C)/001 AKT0042(B)/001
[21] AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001
[26] AKT0042(B)/001 alb4423(C)/001 ALF1651(A)/001 alf1722(B)/001 ALF1735(A)/001
[31] ALF1735(A)/001 ALP4321(C)/001 <NA>           <NA>           ALU4262(B)/001
[36] ALV4286(C)/001 ALW2579(C)/001 <NA>           ALW4330(B)/001 AMA0011/001   
3886 Levels: 0750/002 0751/001 0984/002 ABE2560(C)/001 ... zul1737(B)/001

testData<-menPatients[1:40,1]

I then used unique(). Please note the NA values at positions 33, 34 and 38 in testData.
testUnique<-unique(testData)
testUnique
 [1] abr1160(C)/001 ABR1363(A)/001 abr1772(B)/001 AFR0003/001    afr0290(C)/001
 [6] afr1861(B)/001 Aga0007/001    AGA1548(A)/001 AGU1680(A)/001 AIS0492/001   
[11] AKO4268(C)/001 AKT0042(B)/001 alb4423(C)/001 ALF1651(A)/001 alf1722(B)/001
[16] ALF1735(A)/001 ALP4321(C)/001 <NA>           ALU4262(B)/001 ALV4286(C)/001
[21] ALW2579(C)/001 ALW4330(B)/001 AMA0011/001   

The missing value NA, originally at positions 33, 34 and 38 in testData, is still there, now at position 18. Why is this? How can I prevent it?
I tried using incomparables=c(NA), but this did not work.
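
Here is a smaller sketch of the same behaviour, using made-up IDs rather than the real menPatients data (the names ids, u and u_no_na are only for illustration). It also shows one thing that seems to work, dropping the NAs with is.na() before calling unique(), though I am not sure this is the recommended way:

```r
# Hypothetical patient IDs, not taken from menPatients
ids <- factor(c("AAA0001/001", "AAA0001/001", NA, NA, "BBB0002/001", NA))

# unique() collapses the repeated NAs into a single NA but keeps it
u <- unique(ids)
length(u)        # two distinct IDs plus one NA
sum(is.na(u))    # number of NAs left after unique()

# Possible workaround: remove missing values first, then de-duplicate
u_no_na <- unique(ids[!is.na(ids)])
any(is.na(u_no_na))
as.character(u_no_na)
```

On the real data the equivalent call would presumably be unique(testData[!is.na(testData)]).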

Thanks


Pancho Mulongeni
Research Assistant
PharmAccess Foundation
1 Fouché Street
Windhoek West
Windhoek
Namibia
Tel: +264 61 419 000
Fax: +264 61 419 001/2
Mob: +264 81 4456 286