Message-ID: <29F0E57C-6CAA-455E-9B92-A38FF2B8E1E4@comcast.net>
Date: 2011-10-27T12:50:46Z
From: David Winsemius
Subject: Consistent test for NAs in a factor when exclude = NULL?
In-Reply-To: <1319689303769-3943157.post@n4.nabble.com>
On Oct 27, 2011, at 12:21 AM, andrewH wrote:
> Thanks Jeff! I appreciate you sharing your experience.
>
> My data set is survey data, 13,209 records over nine years, collected by
> someone else, converted from SPSS format. It includes missing values,
> identified however SPSS does so, and translated to NAs by the import
> process. It also includes values along the lines of "none of your business"
> or "beats me" that are missing so far as I am concerned. I have assigned NAs
> to these values. Now I am trying to figure out some things about where
> these missing values are -- whether they are disproportionately located in
> any period or group. I have been trying to get counts for subsets, but I
> have not been able to make the subset counts add up to the total counts that
> I get from, e.g. summary.
>
> So I wrote these simplified versions, and even for the simplest examples, I
> could not find a function that correctly identified the NAs that I knew were
> there because I put them there myself. That is why I am looking for help.
> Does this make sense?
You might consider looking at the Hmisc package. I think it provides
facilities for handling multiple types of missing values imported from SAS
datasets. The help page to consult is sas.get {Hmisc}. I see no indication
that a direct SPSS read facility was contemplated, so it may take some
extra work to apply this approach, which uses R attributes to store
type-of-missingness information alongside R's NAs.
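
As a rough illustration of the idea in base R (this is not how Hmisc does
it), here is a minimal sketch that records the reason for each missing
value before overwriting it with NA, then counts the NAs by group. The
data, the variable names, and the "missing.reason" attribute name are all
made up for the example.

```r
# Hypothetical survey answers; all names and values are illustrative only.
raw  <- c("yes", "no", "beats me", NA, "yes", "none of your business")
year <- c(2001, 2001, 2002, 2002, 2003, 2003)

# Record why each value is missing before replacing it with NA.
reason <- rep(NA_character_, length(raw))
reason[is.na(raw)] <- "system missing"
refused <- raw %in% c("beats me", "none of your business")
reason[refused] <- raw[refused]

ans <- raw
ans[refused] <- NA
f <- factor(ans)                      # default exclude: NA is not a level
attr(f, "missing.reason") <- reason   # assumed attribute name, not an Hmisc convention

# is.na() sees these NAs because NA is not one of the levels.
n_missing <- sum(is.na(f))

# Counts by year, with the NAs kept as their own column.
by_year <- table(year, f, useNA = "ifany")

# Where the missing values fall, split by reason and year.
miss <- is.na(f)
reason_by_year <- table(year = year[miss],
                        reason = attr(f, "missing.reason")[miss])

# With exclude = NULL, NA becomes a level and is.na() no longer flags it.
g <- factor(c("a", NA), exclude = NULL)
any(is.na(g))
```

Note that subsetting a factor with [ generally drops custom attributes
like this one, so keeping the reason as a separate column in the data
frame may turn out to be more robust.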
--
David Winsemius, MD
West Hartford, CT