
Detect and replace omitted data

7 messages · Sarah Goslee, Trevor Davies, Jonny Armstrong +2 more

#
Hi Jonny,

On Tue, Oct 18, 2011 at 1:02 PM, Jonny Armstrong
<jonny5armstrong at gmail.com> wrote:
Thanks for the actual reproducible example.
Here's one option, though it may not be terribly concise:

all.samples <- expand.grid(day=unique(x[,"day"]), unit=1:16)
all.samples <- all.samples[order(all.samples[,"day"], all.samples[,"unit"]),]
x.final <- merge(x, all.samples, all.y=TRUE)
x.final[is.na(x.final[,"value"]), "value"] <- 0

Sarah
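
For readers without the original data, the same idea can be sketched end to end. The data frame `x` below is an assumption: it is rebuilt from the values printed in the xtabs() output later in this thread (days 4 and 6, units drawn from 1:16), and the original poster's object may have differed in type or column order.

```r
# Assumed sample data, reconstructed from the xtabs() output shown later in the thread
x <- data.frame(day   = rep(c(4, 6), each = 8),
                unit  = c(1:8, seq(2, 16, by = 2)),
                value = c(25, 34, 3, 25, 38, 18, 19, 33,
                          22, 42, 37, 4, 12, 31, 17, 28))

# Every day/unit combination that should appear in the result
all.samples <- expand.grid(day = unique(x$day), unit = 1:16)

# Keep all rows of all.samples; combinations absent from x get NA for value
x.final <- merge(x, all.samples, all.y = TRUE)

# Replace the NAs introduced by the merge with zeros
x.final$value[is.na(x.final$value)] <- 0

nrow(x.final)        # one row per day/unit combination
sum(x.final$value)   # the padding adds only zeros, so the total is unchanged
```

Note that this fills in missing units only for days that occur in `x`; a day with no observations at all (day 5 here) would have to be listed explicitly in the expand.grid() call.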

  
    
#
On Oct 18, 2011, at 2:24 PM, Sarah Goslee wrote:

            
I could not figure out what you wanted precisely. If "day" is the row
designator, and you want values by 'unit' and 'day' with zeros for the
missing combinations, then that is exactly what `xtabs` delivers:

 > xtabs(value ~ day+unit, data=x)
    unit
day  1  2  3  4  5  6  7  8 10 12 14 16
   4 25 34  3 25 38 18 19 33  0  0  0  0
   6  0 22  0 42  0 37  0  4 12 31 17 28

You cannot get much more concise than that.
#
Prompted by David's xtabs() suggestion, one way to do what I think the
OP wants is to
 * define day and unit as factors whose levels comprise the full range
of desired values;
 * use xtabs();
 * return the result as a data frame.
Something like

x <- data.frame( day = factor(rep(c(4, 6), each = 8), levels = 4:6),
                 unit = factor(c(1:8, seq(2,16,2)), levels = 1:16),
                 value = floor(rnorm(16,25,10)) )
as.data.frame(with(x, xtabs(value ~ unit + day)))

HTH,
Dennis
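
One practical detail of this approach: the data frame returned from a table names its cell column `Freq` by default and keeps `day` and `unit` as factors. A minimal sketch of tidying that up follows, assuming the caller wants a `value` column and numeric keys. The sample values are an assumption taken from the xtabs() output earlier in the thread rather than from rnorm(); `responseName` is an argument of the as.data.frame() method for tables.

```r
# Assumed sample data, with the full range of levels declared up front
x <- data.frame(day   = factor(rep(c(4, 6), each = 8), levels = 4:6),
                unit  = factor(c(1:8, seq(2, 16, by = 2)), levels = 1:16),
                value = c(25, 34, 3, 25, 38, 18, 19, 33,
                          22, 42, 37, 4, 12, 31, 17, 28))

# responseName sets the name of the cell-value column (the default is "Freq")
x.long <- as.data.frame(xtabs(value ~ unit + day, data = x),
                        responseName = "value")

# Convert the factor keys back to numbers via their labels, not their codes
x.long$unit <- as.numeric(as.character(x.long$unit))
x.long$day  <- as.numeric(as.character(x.long$day))

str(x.long)
```

Going through as.character() matters: calling as.numeric() directly on a factor returns the internal level codes (1, 2, 3 for days 4, 5, 6) instead of the labels.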

On Tue, Oct 18, 2011 at 11:33 AM, David Winsemius
<dwinsemius at comcast.net> wrote:
#
On Oct 18, 2011, at 2:53 PM, Dennis Murphy wrote:

            
Oh, ... sometimes I'm "slow". Dennis' code has its virtues, but
sometimes people want to avoid factors. You could also create a numeric
matrix of zeros to fill in the missing cells and rbind it to the analysis
matrix, just within the data= argument to xtabs:

zeroes <- cbind(day   = seq(min(x[,"day"]),  max(x[,"day"]),  by=1),
                unit  = seq(min(x[,"unit"]), max(x[,"unit"]), by=1),
                value = 0)   # ignore warning: the shorter day column is recycled

xtabs(value~day+unit, data=rbind(x, zeroes) )
    unit
day  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
   4 25 34  3 25 38 18 19 33  0  0  0  0  0  0  0  0
   5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   6  0 22  0 42  0 37  0  4  0 12  0 31  0 17  0 28
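
The cbind() call in this message depends on recycling columns of unequal length, which is where the warning comes from. A variation, not taken from the original post, builds the zero rows with expand.grid() so that every day/unit pair is listed explicitly and no warning is raised. The data frame `x` is an assumption reconstructed from the table above.

```r
# Assumed sample data, reconstructed from the table above
x <- data.frame(day   = rep(c(4, 6), each = 8),
                unit  = c(1:8, seq(2, 16, by = 2)),
                value = c(25, 34, 3, 25, 38, 18, 19, 33,
                          22, 42, 37, 4, 12, 31, 17, 28))

# One zero-valued row for every day/unit pair within the observed ranges
zeroes <- expand.grid(day  = seq(min(x$day),  max(x$day),  by = 1),
                      unit = seq(min(x$unit), max(x$unit), by = 1))
zeroes$value <- 0

# Zeros do not change any cell sum, but they force every cell to exist
tab <- xtabs(value ~ day + unit, data = rbind(x, zeroes))
dim(tab)
```

Because xtabs() sums `value` within each cell, the extra zero rows leave the observed totals untouched while adding the otherwise missing rows and columns.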