Calculating percentage Missing value for variables using one object

3 messages · Shreyasee, David Winsemius, Gabor Grothendieck

#
It looks to me as though you should be using the table or the xtabs
function. You have apparently already decided not to use NA for
missing values, so those functions will give you counts of the
instances in which variable1 == "":

dft <- data.frame(
    var1 = sample(c("", "this", "that", "and"), 120, replace = TRUE),
    dt   = sample(seq(as.Date("2006-01-01"), as.Date("2007-12-31"), by = "months"),
                  120, replace = TRUE))

mo.tbl <- xtabs(~ var1 + dt, data = dft)   # the == "" entry is the first row

> mo.tbl[1,]
2006-01-01 2006-02-01 2006-03-01 2006-04-01 2006-05-01 2006-06-01 2006-07-01
         2          1          1          2          2          3          1
2006-08-01 2006-09-01 2006-10-01 2006-11-01 2006-12-01 2007-01-01 2007-02-01
         0          1          1          1          2          1          2
2007-03-01 2007-04-01 2007-05-01 2007-06-01 2007-07-01 2007-08-01 2007-09-01
         2          2          2          0          1          3          4
2007-10-01 2007-11-01 2007-12-01
         1          3          2

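# Take the dates from the table's own column names so that x always has
# one value per column (a sequence ending at 2007-03-31 has only 15 values,
# while the table above has 24 columns)
x <- as.Date(colnames(mo.tbl))
plot(mo.tbl[1,] ~ x)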
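
Since the question asks for percentages rather than counts, here is a minimal sketch of one way to convert the xtabs counts, assuming "" marks a missing value as above. The data frame dft2 and the name pct.blank are made up for illustration; prop.table with margin = 2 divides each column by its total.

```r
# Hypothetical toy data: "" stands for a missing value
dft2 <- data.frame(
  var1 = c("", "this", "", "that", "and", "this"),
  dt   = as.Date(c("2006-01-01", "2006-01-01", "2006-02-01",
                   "2006-02-01", "2006-02-01", "2006-02-01"))
)

# Counts of each var1 value per month
tbl <- xtabs(~ var1 + dt, data = dft2)

# Column-wise proportions, then keep only the "" row.
# A logical index is used because tbl["", ] does not match an empty name.
pct.blank <- 100 * prop.table(tbl, margin = 2)[rownames(tbl) == "", ]
print(pct.blank)
```

With this toy data one of two January rows and one of four February rows are blank, so pct.blank should hold 50 and 25.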
#
Read in the data, aggregate it by month and
then turn it into a monthly zoo object and plot
using a custom X axis:

Lines <- 'dos,variable1,variable2
May-06,1,""
May-06,2,""
June-06,"",2
June-06,1,4
July-06,1,4
July-06,1,4
August-06,1,4
August-06,1,4'
DF <- read.table(textConnection(Lines), header = TRUE, sep = ",")

library(zoo)
DF.na <- aggregate(DF[-1], DF["dos"], function(x) mean(is.na(x)))

z <- zoo(as.matrix(DF.na[-1]), as.yearmon(DF.na$dos, "%B-%y"))

i <- 1
plot(z[,i], xaxt = "n", ylab = "Fraction Missing", main = names(DF)[i+1])
axis(1, time(z), format(time(z), "%m/%y"), cex.axis = .7)
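
The aggregate call above relies on mean(is.na(x)) giving the fraction of missing values, because TRUE counts as 1 and FALSE as 0. A small base-R sketch of the same idea applied to every variable at once, without zoo, might look like the following. DF2 and pct.missing are illustrative names, and the data frame is built directly with NA rather than read from text.

```r
# Hypothetical data frame mirroring the example, with NA already in place
DF2 <- data.frame(
  dos       = c("May-06", "May-06", "June-06", "June-06",
                "July-06", "July-06", "August-06", "August-06"),
  variable1 = c(1, 2, NA, 1, 1, 1, 1, 1),
  variable2 = c(NA, NA, 2, 4, 4, 4, 4, 4)
)

# is.na() on a data frame returns a logical matrix;
# colMeans() then gives the fraction of TRUE values per column
pct.missing <- 100 * colMeans(is.na(DF2[-1]))
print(pct.missing)

# Per-month version, same approach as the aggregate() call above
by.month <- aggregate(DF2[-1], DF2["dos"], function(x) 100 * mean(is.na(x)))
print(by.month)
```

Here variable1 is missing in 1 of 8 rows and variable2 in 2 of 8, so pct.missing should be 12.5 and 25.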
On Tue, Mar 24, 2009 at 2:58 AM, Shreyasee <shreyasee.pradhan at gmail.com> wrote: