Detect and replace omitted data
On Oct 18, 2011, at 2:53 PM, Dennis Murphy wrote:
Prompted by David's xtabs() suggestion, one way to do what I think the
OP wants is to
* define day and unit as factors whose levels comprise the full range
of desired values;
* use xtabs();
* return the result as a data frame.
Something like
x <- data.frame( day = factor(rep(c(4, 6), each = 8), levels = 4:6),
unit = factor(c(1:8, seq(2,16,2)), levels = 1:16),
value = floor(rnorm(16,25,10)) )
as.data.frame(with(x, xtabs(value ~ unit + day)))
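As a minimal sketch of what the call above returns (not part of the original post): as.data.frame() on an xtabs table gives one row per unit/day combination, with the counts in a column named Freq by default and with unit and day as factors. The seed, the responseName = "value" argument, and the conversion back to numbers are assumptions added here for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

x <- data.frame(day   = factor(rep(c(4, 6), each = 8), levels = 4:6),
                unit  = factor(c(1:8, seq(2, 16, 2)), levels = 1:16),
                value = floor(rnorm(16, 25, 10)))

# Cross-tabulate; unused factor levels are kept, so missing cells become 0
tab <- xtabs(value ~ unit + day, data = x)

# Long format, naming the count column "value" instead of the default "Freq"
long <- as.data.frame(tab, responseName = "value")

# unit and day come back as factors; convert the labels back to numbers
long$unit <- as.numeric(as.character(long$unit))
long$day  <- as.numeric(as.character(long$day))

nrow(long)  # 16 units x 3 days
```

Going through as.character() first matters: as.numeric() applied directly to a factor returns the internal level codes, not the labels.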
Oh, ... sometimes I'm "slow". Dennis' code has its virtues, but
sometimes people want to avoid factors. One could also create a numeric
matrix of zeroes that covers the full range of days and units, and
rbind it to the analysis matrix right in the data= argument to xtabs:
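# day, unit and x are the objects from the original post (x is a matrix)
zeroes <- cbind(day = seq(min(day), max(day), by = 1),
                unit = seq(min(unit), max(unit), by = 1),
                value = 0)  # ignore the recycling warning (3 days vs. 16 units)
xtabs(value ~ day + unit, data = rbind(x, zeroes))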
   unit
day  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
  4 25 34  3 25 38 18 19 33  0  0  0  0  0  0  0  0
  5  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  6  0 22  0 42  0 37  0  4  0 12  0 31  0 17  0 28
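A variant of the same zero-padding idea that avoids the recycling warning, offered as a sketch and not as the code from the thread: expand.grid() builds every day/unit combination explicitly, so the zero rows cover the whole table. The seed and the names grid and filled are assumptions introduced for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Data as in the original post
day   <- c(rep(4, 8), rep(6, 8))
unit  <- c(seq(1, 8, 1), seq(2, 16, 2))
value <- floor(rnorm(16, 25, 10))
x <- cbind(day, unit, value)

# Every day/unit combination over the observed ranges, each with value 0
grid   <- expand.grid(day  = seq(min(day), max(day), by = 1),
                      unit = seq(min(unit), max(unit), by = 1))
zeroes <- cbind(as.matrix(grid), value = 0)

# Stack the real rows on the zero rows; xtabs sums value within each cell
filled <- as.data.frame(rbind(x, zeroes))
tab <- xtabs(value ~ day + unit, data = filled)

dim(tab)  # 3 days by 16 units
```

Because the padding rows all carry a value of zero, they add nothing to cells that already have data and only create the cells that were missing.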
David.

> HTH,
> Dennis
>
> On Tue, Oct 18, 2011 at 11:33 AM, David Winsemius
> <dwinsemius at comcast.net> wrote:
>>
>> On Oct 18, 2011, at 2:24 PM, Sarah Goslee wrote:
>>
>>> Hi Jonny,
>>>
>>> On Tue, Oct 18, 2011 at 1:02 PM, Jonny Armstrong
>>> <jonny5armstrong at gmail.com> wrote:
>>>>
>>>> I am analyzing the spatial distribution of fish in a stream. The
>>>> stream is divided into equally sized units, and the number of fish
>>>> in each unit is counted. My problem is that my dataset is missing
>>>> rows where the count in a unit equals zero. I need to create zero
>>>> data for the missing units.
>>>>
>>>> For example:
>>>> day<-(c(rep(4,8),rep(6,8)))
>>>> unit<-c(seq(1,8,1),seq(2,16,2))
>>>> value<-floor(rnorm(16,25,10))
>>>> x<-cbind(day,unit,value)
>>>
>>> Thanks for the actual reproducible example.
>>>
>>>> x
>>>>       day unit value
>>>>  [1,]   4    1    19
>>>>  [2,]   4    2    15
>>>>  [3,]   4    3    16
>>>>  [4,]   4    4    20
>>>>  [5,]   4    5    17
>>>>  [6,]   4    6    15
>>>>  [7,]   4    7    14
>>>>  [8,]   4    8    29
>>>>  [9,]   6    2    18
>>>> [10,]   6    4    22
>>>> [11,]   6    6    27
>>>> [12,]   6    8    16
>>>> [13,]   6   10    45
>>>> [14,]   6   12    36
>>>> [15,]   6   14    34
>>>> [16,]   6   16    13
>>>>
>>>> Let's say the stream has 16 units. For each day, I want to fill in
>>>> rows for any missing units (e.g., units 9-16 for day 4, the
>>>> odd-numbered units on day 6) with values of zero.
>>
>> I could not figure out what you wanted precisely. If "day" is the row
>> designator, and you want values by 'unit' and 'day' with zeros for the
>> missing, then that is exactly what `xtabs` delivers:
>>
>> > xtabs(value ~ day+unit, data=x)
>>    unit
>> day  1  2  3  4  5  6  7  8 10 12 14 16
>>   4 25 34  3 25 38 18 19 33  0  0  0  0
>>   6  0 22  0 42  0 37  0  4 12 31 17 28
>>
>> You cannot get much more concise than that.
>>
>> --
>> david.
>>>
>>> Here's one option, though it may not be terribly concise:
>>>
>>> all.samples <- expand.grid(day=unique(x[,"day"]), unit=1:16)
>>> all.samples <- all.samples[order(all.samples[,"day"],
>>>     all.samples[,"unit"]),]
>>> x.final <- merge(x, all.samples, all.y=TRUE)
>>> x.final[is.na(x.final[,"value"]), "value"] <- 0
>>>
>>> Sarah
>>>
>>>> Does anyone know a relatively concise way to do this?
>>>> Thank you.
>>>
>>> --
>>> Sarah Goslee
>>> http://www.functionaldiversity.org
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> West Hartford, CT

David Winsemius, MD
West Hartford, CT