I am trying to delete rows containing missing values from a groupedData object. Several of the columns are character (sexChar, HAPI, rs2304785); the rest are numeric. For some reason I am excluding all rows with missing values. Your suggestions for corrections would be appreciated.
This did not work:
GC2 <- GC[c("logtg" != NA & "ctime" != NA & !is.na("sexChar") & !is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$rs2304795)), ]
nor did:
GC2 <- GC["logtg" != NA & "ctime" != NA & !is.na("sexChar") & !is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$rs2304795), ]
John
John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC
University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
410-605-7119
- NOTE NEW EMAIL ADDRESS:
jsorkin at grecc.umaryland.edu
Delete missing values
2 messages · John Sorkin, Marc Schwartz
On Wed, 2005-12-14 at 21:34 -0500, John Sorkin wrote:
I am trying to delete rows containing missing values from a
groupedData object. Several of the columns are character (sexChar,
HAPI, rs2304785); the rest are numeric. For some reason I am excluding
all rows with missing values. Your suggestions for corrections would
be appreciated.
This did not work:
GC2 <- GC[c("logtg" != NA & "ctime" != NA & !is.na("sexChar") & !is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$rs2304795)), ]
nor did:
GC2 <- GC["logtg" != NA & "ctime" != NA & !is.na("sexChar") & !is.na("HAPI") & "logfirsttg" != NA & "BMI" != NA & !is.na(GC$rs2304795), ]
John
John,

You cannot use:

Values != NA

and get the TRUE/FALSE results of the boolean comparison of Values that are not equal to NA. For example:

a <- sample(c(NA, 1:5), 20, replace = TRUE)
a
[1] 2 3 3 1 3 4 5 3 NA 4 3 2 1 2 2 NA 2 2 NA 1
a != NA
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

or

a == NA
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

NA is undefined, so by definition, any comparisons to NA, as above, will be as well. Simply put:

NA == NA
[1] NA   # Note that this is not TRUE

That is why there is a specific function to be used, which you have in some cases above. That function is is.na().

!is.na(a)
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE
[12]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE

which then can be used as such:

a[!is.na(a)]
[1] 2 3 3 1 3 4 5 3 4 3 2 1 2 2 2 2 1

In the case of a data frame (which a groupedData object contains), you can use complete.cases() to access the rows that do not have missing values. So, if your initial object is called GC, you should be able to use:

GC2 <- GC[complete.cases(GC), ]

An alternative is to use na.omit() as follows:

GC2 <- na.omit(GC)

See ?complete.cases and ?na.omit for more information.

HTH,

Marc Schwartz
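
A follow-up sketch on the original expression, in case it helps. Besides the comparison to NA, the quoted names such as "logtg" are string literals rather than columns of GC, so most of the condition is evaluated once instead of once per row. The example below uses a small made-up data frame as a stand-in for GC (the values and the choice of columns are assumptions, and a plain data frame is used instead of a real groupedData object). It shows the literal-based condition collapsing to a single NA, and then one possible way to drop only the rows that have missing values in the variables of interest, by passing a column subset to complete.cases().

```r
# Hypothetical stand-in for GC: a plain data frame using a few of the
# column names from the question. All values are made up.
GC <- data.frame(
  logtg     = c(1.2, NA, 0.8, 1.5),
  ctime     = c(10, 12, NA, 9),
  sexChar   = c("M", "F", "F", NA),
  BMI       = c(24.1, 30.2, 27.5, 22.0),
  rs2304795 = c("AA", "AG", "GG", "AG"),
  stringsAsFactors = FALSE
)

# The quoted names are string literals, not columns, so this condition
# is a single value (and it is NA), not one TRUE/FALSE per row.
bad <- "logtg" != NA & !is.na("sexChar")
length(bad)   # 1
is.na(bad)    # TRUE

# Test the columns themselves, restricted to the variables of interest.
vars <- c("logtg", "ctime", "sexChar", "BMI", "rs2304795")
keep <- complete.cases(GC[, vars])   # one TRUE/FALSE per row
GC2  <- GC[keep, ]                   # rows complete in all of vars
```

With this toy data only the first row is complete in every listed variable, so GC2 has one row. Restricting complete.cases() to a column subset is useful when the object has other columns whose missing values should not cause a row to be dropped; whether the same subsetting behaves identically on a groupedData object has not been checked here.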