Panel data handling (lags, growth rates)

On 8/14/05, Ajay Narottam Shah <ajayshah at mayin.org> wrote:
I don't know of a source; I just study the code. Conceptually,
though, "by" just splits up the rows according to the grouping
argument, giving a list of data frames, and then applies the
function to each element of that list to give the result.

For example, writing:

f <- function(x) colSums(x[,-5])
iris.by <- by(iris, iris$Species, f)

is the same as:

f <- function(x) colSums(x[,-5])
iris.split <- split(iris, iris$Species)
iris.lapply <- lapply(iris.split, f)

except that in the by case the result gets a class of "by".
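
As a quick check of that equivalence, here is a minimal sketch using the
built-in iris data. It compares the two results element by element; the
name "same" is made up for this illustration only.

```r
f <- function(x) colSums(x[, -5])   # column sums of the four numeric columns

iris.by     <- by(iris, iris$Species, f)
iris.lapply <- lapply(split(iris, iris$Species), f)

class(iris.by)       # "by"
class(iris.lapply)   # "list"

# Compare the per-species results one by one (same order in both objects,
# since both follow the levels of iris$Species)
same <- sapply(seq_along(iris.lapply),
               function(i) isTRUE(all.equal(iris.by[[i]], iris.lapply[[i]])))
all(same)
```

If all(same) comes back TRUE, the only practical difference between the two
objects is the class (and hence how they print).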

In either of the above cases the result is a list of three
elements, one per species. Since f returns column sums, each
element is a named numeric vector (not a data frame):

el1 <- iris.by[[1]]; el2 <- iris.by[[2]]; el3 <- iris.by[[3]]

Now, if g <- function(x,y)x+y then

	g(1,2)

is the same as

	do.call("g", list(1,2))

so, going back to the iris example, to rbind el1, el2 and el3 together we do this:

	rbind(el1, el2, el3)

which is the same as 

	do.call("rbind", list(el1, el2, el3))

which is the same as

	do.call("rbind", iris.by)
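
To make the last step concrete, the following sketch builds the result all
three ways and compares them. The names m1, m2 and m3 are made up here for
illustration; everything else comes from the example above.

```r
f <- function(x) colSums(x[, -5])
iris.by <- by(iris, iris$Species, f)
el1 <- iris.by[[1]]; el2 <- iris.by[[2]]; el3 <- iris.by[[3]]

m1 <- rbind(el1, el2, el3)                     # explicit call
m2 <- do.call("rbind", list(el1, el2, el3))    # same call built from a list
m3 <- do.call("rbind", iris.by)                # the "by" object used as the list

dim(m3)   # one row per species, one column per numeric variable

# The numbers agree; unname() drops the row/column labels before comparing
isTRUE(all.equal(unname(m1), unname(m2)))
isTRUE(all.equal(unname(m2), unname(m3)))
```

The three forms agree on the values but not on the row labels: the explicit
call labels the rows el1, el2, el3, the unnamed list gives no row labels,
and the "by" version labels the rows with the species names. That is
usually the reason to prefer the last form.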