subtotal, submean, aggregate
Yes, that must be it. It is probably best to issue

    set.seed(1)

as part of the code when posting examples with random numbers.

Also, here is a variation that uses rle, as Roger did, together with some elements of the solution I posted:

    runno <- with(rle(as.numeric(transect[,2])), rep(seq(along = lengths), lengths))
    aggregate(transect[,1], list(obs = transect[,2], runno), sum)[,-2]

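For readers who want to run the rle variant end to end, here is a minimal, reproducible sketch. It rests on one assumption not stated in the thread: habitats is wrapped in factor() explicitly, because recent R versions (4.0 and later) no longer convert strings to factors in data.frame(), and as.numeric() on plain character labels would give NA.

```r
# Assumption: explicit factor() so that as.numeric() returns integer codes
# (R >= 4.0 keeps strings as character in data.frame() by default).
set.seed(1)  # fix the seed so rpois() gives the same draws on every machine
habitats <- rep(c("meadow", "forest", "meadow", "pasture"), c(10, 5, 12, 6))
transect <- data.frame(observations = rpois(length(habitats), 2),
                       habitats = factor(habitats))

# rle() returns the length of each run; number the runs 1, 2, 3, ...
runs <- rle(as.numeric(transect[, 2]))
runno <- rep(seq_along(runs$lengths), runs$lengths)

# Aggregate on habitat and run number, then drop the run number column
res <- aggregate(transect[, 1],
                 list(obs = transect[, 2], runno = runno), sum)[, -2]
print(res)
```

The result should have one row per run, in the order the habitats are encountered (meadow, forest, meadow, pasture), rather than one row per factor level.
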
On 2/26/06, Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> wrote:
Yes, right. Checking some examples, all come out OK.

Regarding "This does not give the same as your example but I think there are some errors in your example output": the 'errors' observed simply come from the seed in rpois(length(habitats),2). It is unlikely to be the same on your computer and mine...

Cheers,

Patrick

Gabor Grothendieck wrote:
We are just comparing the difference to 0, so it does not matter whether it is positive or negative. All that matters is whether it is 0 or not. In fact, the runno you calculate with the abs is identical to the one I posted without the abs:

    runno <- cumsum(c(TRUE, abs(diff(as.numeric(transect[,2]))) != 0))
    runno2 <- cumsum(c(TRUE, diff(as.numeric(transect[,2])) != 0))
    identical(runno, runno2) # TRUE

On 2/26/06, Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> wrote:
Excellent! I have been struggling with this problem since early afternoon.

Actually, the remaining discrepancy you noticed comes from negative differences in diff(as.numeric(transect[,2])). One can work around it using abs(diff(as.numeric(transect[,2]))). This gives:

    runno <- cumsum(c(TRUE, abs(diff(as.numeric(transect[,2])))!=0))
    aggregate(transect[,1], list(obs = transect[,2], runno = runno), sum)

I did not know about this use of diff, which was the key point... and then cumsum for polishing. Really great and also elegant (concise). I like it!

Thanks a lot!!!

Cheers,

Patrick

Gabor Grothendieck wrote:
Create another variable that gives the run number, and aggregate on both the habitat and the run number, removing the run number after aggregating:

    runno <- cumsum(c(TRUE, diff(as.numeric(transect[,2])) !=0))
    aggregate(transect[,1], list(obs = transect[,2], runno = runno), sum)[,-2]

This does not give the same as your example, but I think there are some errors in your example output.

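To make the diff/cumsum step concrete, here is a small sketch on a toy vector, followed by a wrapper that accepts any summary function. The helper name aggregate_runs and the toy data are hypothetical and are not from the thread; the wrapper only repackages the two lines above.

```r
# Toy grouping vector: the label "a" occurs in two separate runs
g <- factor(c("a", "a", "b", "b", "b", "a"))
x <- c(1, 2, 3, 4, 5, 6)

codes   <- as.integer(g)               # 1 1 2 2 2 1
changed <- c(TRUE, diff(codes) != 0)   # TRUE FALSE TRUE FALSE FALSE TRUE
runno   <- cumsum(changed)             # 1 1 2 2 2 3

# Hypothetical helper (name not from the thread): aggregate x over
# consecutive runs of g, keeping the order in which runs are encountered.
aggregate_runs <- function(x, g, FUN = sum) {
  g <- factor(g)  # ensure integer codes are available
  runno <- cumsum(c(TRUE, diff(as.integer(g)) != 0))
  aggregate(x, list(obs = g, runno = runno), FUN)[, -2]
}

aggregate_runs(x, g, sum)$x    # 3 12 6
aggregate_runs(x, g, mean)$x   # 1.5 4.0 6.0
```

A new run starts wherever the factor code changes, so the sign of the difference is irrelevant; cumsum() then turns the "run starts here" flags into a run number.
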
On 2/26/06, Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr> wrote:
Dear All,

I would like to compute partial sums (or means, or any other function) of the values in intervals along a sequence (spatial transect) where groups are defined.

For instance:

    habitats<-rep(c("meadow","forest","meadow","pasture"),c(10,5,12,6))
    observations<-rpois(length(habitats),2)
    transect<-data.frame(observations=observations,habitats=habitats)

aggregate() is not suitable for my purpose because I want a result that respects the order of the habitats encountered, although they may have the same name (and that does not pool each group on each level of the factor created). For instance, the output of the ideal function mynicefunction() would be something like:

    mynicefunction(transect$observations, by=list(transect$habitats),sum)
    meadow 16
    forest 9
    meadow 21
    pasture 17

and not

    aggregate(transect$observations,by=list(transect$habitats),sum)
      Group.1 x
    1 forest 9
    2 meadow 37
    3 pasture 17

Did anybody hear about such a function already written in R? If not, any idea how to write it simply and elegantly?

Cheers,

Patrick Giraudoux

______________________________________________
R-help at stat.math.ethz.ch mailing list
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html