understanding output of tapply/by cumsum

3 messages · Gerrit Draisma, jim holtman

#
Dear R-users,

I have a dataset with categories and numbers.
I would like to compute and add cumulative numbers
to the dataset.
I do not understand the structure of by(...) or
tapply(...) output enough to handle it.

Here is a small example:
--------------
d<-expand.grid(a=1:5,b=1:3,c=1:2)
d$n = 10 * d$a + d$b +0.1* d$c
Sn<-by(d$n,list(d$a,d$c),cumsum)
str(Sn)
---------
List of 10
  $ : num [1:3] 11.1 23.2 36.3
  $ : num [1:3] 21.1 43.2 66.3
  $ : num [1:3] 31.1 63.2 96.3
  $ : num [1:3]  41.1  83.2 126.3
  $ : num [1:3]  51.1 103.2 156.3
  $ : num [1:3] 11.2 23.4 36.6
  $ : num [1:3] 21.2 43.4 66.6
  $ : num [1:3] 31.2 63.4 96.6
  $ : num [1:3]  41.2  83.4 126.6
  $ : num [1:3]  51.2 103.4 156.6
  - attr(*, "dim")= int [1:2] 5 2
  - attr(*, "dimnames")=List of 2
   ..$ : chr [1:5] "1" "2" "3" "4" ...
   ..$ : chr [1:2] "1" "2"
  - attr(*, "call")= language by.default(data = d$n, INDICES = list(d$a, d$c), FUN = cumsum)
  - attr(*, "class")= chr "by"
---------
# these give a list (or lists) of numeric vectors, not the values themselves
Sn[5,2]
Sn[cbind(d$a,d$c)]
# how to access the individual cumsum values?
# and assign them to d$Sn?
--------------

Thanks,
Gerrit.

---
Gerrit Draisma
Department of Public Health
Erasmus MC, University Medical Center Rotterdam
Room AE-235
P.O. Box 2040 3000 CA  Rotterdam The Netherlands
Phone: +31 10 7043787 Fax: +31 10 7038474
http://mgzlx4.erasmusmc.nl/pwp/?gdraisma
#
Maybe 'ave' is what you were looking for:
   a b c    n   cum
1  1 1 1 11.1  11.1
2  2 1 1 21.1  21.1
3  3 1 1 31.1  31.1
4  4 1 1 41.1  41.1
5  5 1 1 51.1  51.1
6  1 2 1 12.1  23.2
7  2 2 1 22.1  43.2
8  3 2 1 32.1  63.2
9  4 2 1 42.1  83.2
10 5 2 1 52.1 103.2
11 1 3 1 13.1  36.3
12 2 3 1 23.1  66.3
13 3 3 1 33.1  96.3
14 4 3 1 43.1 126.3
15 5 3 1 53.1 156.3
16 1 1 2 11.2  11.2
17 2 1 2 21.2  21.2
18 3 1 2 31.2  31.2
19 4 1 2 41.2  41.2
20 5 1 2 51.2  51.2
21 1 2 2 12.2  23.4
22 2 2 2 22.2  43.4
23 3 2 2 32.2  63.4
24 4 2 2 42.2  83.4
25 5 2 2 52.2 103.4
26 1 3 2 13.2  36.6
27 2 3 2 23.2  66.6
28 3 3 2 33.2  96.6
29 4 3 2 43.2 126.6
30 5 3 2 53.2 156.6
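
The call that produced this table is not shown in the message. A minimal sketch that reproduces it, assuming the data frame d from the original post and taking the column name cum from the output above:

```r
# Rebuild the example data from the original post
d <- expand.grid(a = 1:5, b = 1:3, c = 1:2)
d$n <- 10 * d$a + d$b + 0.1 * d$c

# ave() applies FUN within each (a, c) group and returns a vector
# in the same row order as d, so it can be assigned as a new column.
# The column name 'cum' is taken from the printed output.
d$cum <- ave(d$n, d$a, d$c, FUN = cumsum)

head(d, 10)
```

Unlike by() or tapply(), which return one list element per group, ave() returns a plain numeric vector with one value per row, so no unpacking is needed.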

        
On Tue, Dec 7, 2010 at 6:39 AM, Gerrit Draisma <gdraisma at xs4all.nl> wrote:
#
You can also use 'split' to separate each group:
$`1.1`
   a b c    n  cum
1  1 1 1 11.1 11.1
6  1 2 1 12.1 23.2
11 1 3 1 13.1 36.3

$`2.1`
   a b c    n  cum
2  2 1 1 21.1 21.1
7  2 2 1 22.1 43.2
12 2 3 1 23.1 66.3

$`3.1`
   a b c    n  cum
3  3 1 1 31.1 31.1
8  3 2 1 32.1 63.2
13 3 3 1 33.1 96.3

$`4.1`
   a b c    n   cum
4  4 1 1 41.1  41.1
9  4 2 1 42.1  83.2
14 4 3 1 43.1 126.3

$`5.1`
   a b c    n   cum
5  5 1 1 51.1  51.1
10 5 2 1 52.1 103.2
15 5 3 1 53.1 156.3

$`1.2`
   a b c    n  cum
16 1 1 2 11.2 11.2
21 1 2 2 12.2 23.4
26 1 3 2 13.2 36.6

$`2.2`
   a b c    n  cum
17 2 1 2 21.2 21.2
22 2 2 2 22.2 43.4
27 2 3 2 23.2 66.6

$`3.2`
   a b c    n  cum
18 3 1 2 31.2 31.2
23 3 2 2 32.2 63.4
28 3 3 2 33.2 96.6

$`4.2`
   a b c    n   cum
19 4 1 2 41.2  41.2
24 4 2 2 42.2  83.4
29 4 3 2 43.2 126.6

$`5.2`
   a b c    n   cum
20 5 1 2 51.2  51.2
25 5 2 2 52.2 103.4
30 5 3 2 53.2 156.6
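
The split call itself is also missing from the message. A minimal sketch that gives a list with the same element names, assuming d already carries the cum column computed with ave(); the variable name groups is only illustrative:

```r
# Rebuild the example data and the cumulative column
d <- expand.grid(a = 1:5, b = 1:3, c = 1:2)
d$n <- 10 * d$a + d$b + 0.1 * d$c
d$cum <- ave(d$n, d$a, d$c, FUN = cumsum)

# Split the data frame into one data frame per (a, c) combination.
# Element names have the form "<a>.<c>", e.g. "1.1", "2.1", ...
groups <- split(d, list(d$a, d$c))

names(groups)
groups[["5.2"]]       # the rows with a == 5 and c == 2
groups[["5.2"]]$cum   # the cumulative sums for that group
```

Each element is an ordinary data frame, so the cumulative values of a single group can be read with $cum.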

        