Message-ID: <29F0E57C-6CAA-455E-9B92-A38FF2B8E1E4@comcast.net>
Date: 2011-10-27T12:50:46Z
From: David Winsemius
Subject: Consistant test for NAs in a factor when exclude = NULL?
In-Reply-To: <1319689303769-3943157.post@n4.nabble.com>

On Oct 27, 2011, at 12:21 AM, andrewH wrote:

> Thanks Jeff! I appreciate you sharing your experience.
>
> My data set is survey data, 13,209 records over nine years, collected by
> someone else, converted from SPSS format. It includes missing values,
> identified however SPSS does so, and translated to NAs by the import
> process. It also includes values along the lines of "none of your business"
> or "beats me" that are missing so far as I am concerned. I have assigned NAs
> to these values. Now I am trying to figure out some things about where
> these missing values are -- whether they are disproportionately located in
> any period or group. I have been trying to get counts for subsets, but I
> have not been able to make the subset counts add up to the total counts that
> I get from, e.g. summary.
>
> So I wrote these simplified versions, and even for the simplest examples, I
> could not find a function that correctly identified the NAs that I knew were
> there because I put them there myself. That is why I am looking for help.
> Does this make sense?

You might consider looking at the Hmisc package. I think it provides
facilities for multiple missing attributes imported from SAS datasets.
The help page to consult is sas.get {Hmisc}. I see no indication that
a direct SPSS read facility was contemplated, so it may take some
extra work to get use out of this application of R attributes to store
type-of-missingness information alongside R NAs.
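
As a rough base-R sketch of the idea (not the Hmisc mechanism itself): keep
the reason for missingness in a parallel column before overwriting values
with NA, then count NAs by group. The data and the names 'year', 'answer'
and 'why' are made up for illustration, and a plain column stands in for
the attribute approach because attributes can be dropped on subsetting.

```r
# Hypothetical toy data; not taken from the survey in question.
raw <- data.frame(
  year   = c(2001, 2001, 2002, 2002, 2002, 2003),
  answer = c("yes", "refused", "no", NA, "don't know", "yes"),
  stringsAsFactors = FALSE
)

# Record why each value is missing before overwriting it with NA.
raw$why <- NA_character_
raw$why[is.na(raw$answer)] <- "system missing"
soft <- raw$answer %in% c("refused", "don't know")
raw$why[soft] <- raw$answer[soft]
raw$answer[soft] <- NA

# Count NAs within each year; the group counts should sum to the total.
na.by.year <- tapply(is.na(raw$answer), raw$year, sum)
stopifnot(sum(na.by.year) == sum(is.na(raw$answer)))

# Cross-tabulate the reason for missingness by year.
why.by.year <- table(raw$year, raw$why)

# With exclude = NULL, NA becomes a level, so is.na() on the factor
# tests the integer codes and finds nothing missing. Testing the
# character values recovers the NAs.
f <- factor(raw$answer, exclude = NULL)
n.na.codes  <- sum(is.na(f))
n.na.values <- sum(is.na(as.character(f)))
```

Comparing n.na.codes with n.na.values shows why a factor built with
exclude = NULL can appear to have no NAs even though they are there.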

-- 

David Winsemius, MD
West Hartford, CT