Message-ID: <CADv2QyGP+bp+BX2Cm3Obo42BtawD3aANusxYOxvLpQR_y8v1xA@mail.gmail.com>
Date: 2011-10-18T18:53:59Z
From: Dennis Murphy
Subject: Detect and replace omitted data
In-Reply-To: <64FA77FE-1877-40AB-836E-1E2281D5D8F0@comcast.net>
Prompted by David's xtabs() suggestion, one way to do what I think the
OP wants is to
* define day and unit as factors whose levels comprise the full range
of desired values;
* use xtabs();
* return the result as a data frame.
Something like
x <- data.frame(day   = factor(rep(c(4, 6), each = 8), levels = 4:6),
                unit  = factor(c(1:8, seq(2, 16, 2)), levels = 1:16),
                value = floor(rnorm(16, 25, 10)))
as.data.frame(with(x, xtabs(value ~ unit + day)))
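
A slightly fuller sketch of the same idea follows, in case it helps. It rests on a few assumptions that are not in the original post: a fixed seed so the run is repeatable, day levels restricted to the two observed days (use levels = 4:6 instead if an all-zero day 5 is wanted), and a result name "filled" that is purely illustrative. Note that xtabs() sums value within each unit/day cell, so a 0 in the result means "no rows for that cell", and that as.data.frame() returns unit and day as factors, which are converted back to integers here.

```r
set.seed(1)  # assumption: fixed seed, only to make the example repeatable

x <- data.frame(
  day   = factor(rep(c(4, 6), each = 8), levels = c(4, 6)),  # observed days only
  unit  = factor(c(1:8, seq(2, 16, 2)), levels = 1:16),      # all 16 stream units
  value = floor(rnorm(16, 25, 10))
)

# Cross-tabulate; unit/day combinations with no rows come back as 0.
# responseName replaces the default column name "Freq".
filled <- as.data.frame(xtabs(value ~ unit + day, data = x),
                        responseName = "value")

# as.data.frame() on a table yields factor columns; turn them back into integers.
filled$unit <- as.integer(as.character(filled$unit))
filled$day  <- as.integer(as.character(filled$day))

# Put the columns and rows back in the layout of the original data.
filled <- filled[order(filled$day, filled$unit), c("day", "unit", "value")]
rownames(filled) <- NULL
filled
```

The result has one row per unit per day (32 rows here), with zeros for units 9-16 on day 4 and for the odd-numbered units on day 6.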
HTH,
Dennis
On Tue, Oct 18, 2011 at 11:33 AM, David Winsemius
<dwinsemius at comcast.net> wrote:
>
> On Oct 18, 2011, at 2:24 PM, Sarah Goslee wrote:
>
>> Hi Jonny,
>>
>> On Tue, Oct 18, 2011 at 1:02 PM, Jonny Armstrong
>> <jonny5armstrong at gmail.com> wrote:
>>>
>>> I am analyzing the spatial distribution of fish in a stream. The stream is
>>> divided into equally sized units, and the number of fish in each unit is
>>> counted. My problem is that my dataset is missing rows where the count in a
>>> unit equals zero. I need to create zero data for the missing units.
>>>
>>> For example:
>>> day<-(c(rep(4,8),rep(6,8)))
>>> unit<-c(seq(1,8,1),seq(2,16,2))
>>> value<-floor(rnorm(16,25,10))
>>> x<-cbind(day,unit,value)
>>
>> Thanks for the actual reproducible example.
>>
>>> x
>>>       day unit value
>>>  [1,]   4    1    19
>>>  [2,]   4    2    15
>>>  [3,]   4    3    16
>>>  [4,]   4    4    20
>>>  [5,]   4    5    17
>>>  [6,]   4    6    15
>>>  [7,]   4    7    14
>>>  [8,]   4    8    29
>>>  [9,]   6    2    18
>>> [10,]   6    4    22
>>> [11,]   6    6    27
>>> [12,]   6    8    16
>>> [13,]   6   10    45
>>> [14,]   6   12    36
>>> [15,]   6   14    34
>>> [16,]   6   16    13
>>>
>>> Let's say the stream has 16 units. For each day, I want to fill in rows for
>>> any missing units (e.g., units 9-16 for day 4, the odd-numbered units on day
>>> 6) with values of zero.
>
> I could not figure out what you wanted precisely. If "day" is the row
> designator, and you want values by 'unit' and 'day' with zeros for the
> missing, then that is exactly what `xtabs` delivers:
>
>> xtabs(value ~ day+unit, data=x)
>    unit
> day  1  2  3  4  5  6  7  8 10 12 14 16
>   4 25 34  3 25 38 18 19 33  0  0  0  0
>   6  0 22  0 42  0 37  0  4 12 31 17 28
>
> You cannot get much more concise than that.
>
> --
> david.
>>
>> Here's one option, though it may not be terribly concise:
>>
>> all.samples <- expand.grid(day=unique(x[,"day"]), unit=1:16)
>> all.samples <- all.samples[order(all.samples[,"day"],
>> all.samples[,"unit"]),]
>> x.final <- merge(x, all.samples, all.y=TRUE)
>> x.final[is.na(x.final[,"value"]), "value"] <- 0
>>
>> Sarah
>>
>>> Does anyone know a relatively concise way to do this?
>>> Thank you.
>>>
>>
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>