NA's in survey analysis

4 messages · Donatas G., PIKAL Petr

#
Hello,

I am trying to analyze sociological survey data using R. In survey
analysis it is often important to calculate not only the actual factor
counts and percentages (easily done with describe()), but also the
number and total percentage of NA's. It is also often important to
show NA's in graphs alongside the factor levels.

Is there any easy way to make R treat NA's as if those were factors
besides other factors?

Now, describe(data$a) gives me percentages only for the factors. So I
have to redo percentages manually.

barplot() also ignores NA's. So, to include NA's into barplot I need
to do a table more or less manually.

The other way to do it is to convert NA's into factors (doable,
although, unlike in SPSS, I cannot assume that 99 is a good code for
a factor "NA": it has to be the next number in the factor list, so it
might be different for each column in a data frame).
And besides, I have read somewhere in this list that IT IS THE WRONG
WAY TO DO STUFF IN R :)

Is there a right way to do the things that I want, and if not, what are
the possible workarounds, smarter than the ones I listed?

--
Donatas Glodenis
#
Hi

r-help-bounces at r-project.org wrote on 21.12.2010 11:02:07:
It is not necessary to code missing values; you can set NA as one level.

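x <- factor(sample(c(1:3, NA), 20, replace = TRUE), exclude = NULL)
x
 [1] 1    1    3    3    3    2    3    <NA> 3    1    2    <NA> 3    <NA> 2
[16] 2    3    1    <NA> 3
Levels: 1 2 3 <NA>
# y is not defined in the original message; any numeric vector of length 20 is assumed
y <- rnorm(20)
boxplot(split(y, x))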

Besides, you can find this on the factor help page (?factor), as I did.
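
For the counts, percentages and barplot asked about in the original question, something along the following lines may work. This is only a minimal sketch in base R: the vector `answers` and its values are made up for illustration, and `table()` with `useNA = "ifany"` is used here instead of a factor built with `exclude = NULL`.

```r
# Hypothetical survey answers; 'answers' is an illustrative name, not from the thread
answers <- c("yes", "no", NA, "yes", "maybe", NA, "no", "yes")

# Count every distinct value and keep NA as its own category
counts <- table(answers, useNA = "ifany")

# Percentages computed over all responses, missing ones included
pct <- 100 * prop.table(counts)

# The NA category has a missing name, so give it a printable label for the plot
labels <- names(counts)
labels[is.na(labels)] <- "NA"

print(counts)
print(round(pct, 1))

# Bar chart with one bar per category, including the NA bar
barplot(counts, names.arg = labels, ylab = "Number of responses")
```

Because the percentages are taken over all responses, they will not match percentages that are computed on the non-missing answers only.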

Regards
Petr
#
2010/12/21 Petr PIKAL <petr.pikal at precheza.cz>:
Thank you, Petr. This info (re exclude=NULL) would have saved me tons
of time last week :)

I still have not found an equivalent parameter in describe(), but
anyway, I have been helped a lot!
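
One possible workaround for the missing describe() option is a small helper that tabulates each column with NA included. The sketch below uses base R only; `na_summary` and the `survey` data frame are hypothetical names invented for this example and do not come from the thread or from any package.

```r
# Hypothetical helper: counts and percentages per value, with NA as its own row
na_summary <- function(x) {
  counts <- table(x, useNA = "always")   # always report an NA row, even if it is 0
  data.frame(
    value   = ifelse(is.na(names(counts)), "NA", names(counts)),
    n       = as.vector(counts),
    percent = round(100 * as.vector(counts) / length(x), 1)
  )
}

# Made-up survey data; each column has its own set of levels
survey <- data.frame(
  a = factor(c("agree", "disagree", NA, "agree")),
  b = factor(c(NA, NA, "low", "high"))
)

# Apply the helper to every column; no per-column code for NA is needed
result <- lapply(survey, na_summary)
print(result)
```

Since NA is handled by `table()` itself, there is no need to pick a numeric code such as 99 for it, and the same call works for columns with different levels.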