Message-ID: <OF3AB12121.2FD2FC55-ONC1257552.002EBC03-C1257552.002FE373@e-control.at>
Date: 2009-02-03T08:43:04Z
From: Karina Knaus
Subject: Collapsing panel data
In-Reply-To: <mailman.17.1233572404.8272.r-help@r-project.org>
Dear R-helpers,
I've been thinking about this for some time; maybe someone can help. I have
a fairly large dataset with thousands of firms (call them A, B, C, etc.),
such as
[,1] [,2]
[1,] "A" 0.5
[2,] "" 0.2
[3,] "" 0.3
[4,] "B" 0.1
[5,] "" 0.9
[6,] "C" 0.4
Or to put it differently two vectors such as
y <- c("A", "", "", "B", "", "C")
x <- c(0.5, 0.2, 0.3, 0.1, 0.9, 0.4)
The empty entries "" always belong to the firm above. Now I want to collapse
the dataset so that each firm (A, B, C, etc.) has one line only, with its x
values summed.
So what I would like is
yNew <- c("A", "B", "C")
xNew <- c(1, 1, 0.4)
The problem I'm having is that each firm has a different number of entries
for x: some, like C, have just one and others have ten or more, so I have
difficulty imagining how to use a loop in this case.
I'd be grateful for any suggestions.
Karina