[R-es] Cumulative absolute frequency per individual and per year

I'm late to the thread, but I think the result can be reached quickly with the
help of the "reshape2" package. If DT is the data.table that Francisco chose as an
example:
DT
    ID YEAR CANTIDAD
1: 100 2005        1
2: 100 2005        2
3: 100 2007        1
4: 100 2007        1
5: 100 2007        1
6: 120 2006        1
7: 120 2006        5
8: 120 2006        1
9: 120 2007        3
require(reshape2)
temp=dcast.data.table(DT,ID~YEAR,sum,value.var="CANTIDAD")
temp
    ID 2005 2006 2007
1: 100    3    0    3
2: 120    0    7    3
res=melt(temp,id.var="ID")
setkey(res,ID,variable)
res
    ID variable value
1: 100     2005     3
2: 100     2006     0
3: 100     2007     3
4: 120     2005     0
5: 120     2006     7
6: 120     2007     3
res[,cumsum:=cumsum(value),by=ID]
res
    ID variable value cumsum
1: 100     2005     3      3
2: 100     2006     0      3
3: 100     2007     3      6
4: 120     2005     0      0
5: 120     2006     7      7
6: 120     2007     3     10
subset(res,cumsum>0)
    ID variable value cumsum
1: 100     2005     3      3
2: 100     2006     0      3
3: 100     2007     3      6
4: 120     2006     7      7
5: 120     2007     3     10
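
For anyone re-running these steps today, here is a self-contained sketch of the same cast, melt, cumsum sequence. It assumes a recent data.table that provides its own dcast() and melt(), so reshape2 is not loaded; the names wide, YEAR and CUSUM are only illustrative choices, not part of the original code.

```r
library(data.table)  # assumption: a recent data.table with its own dcast()/melt()

# The example data shown above
DT <- data.table(
  ID       = c(100, 100, 100, 100, 100, 120, 120, 120, 120),
  YEAR     = c(2005, 2005, 2007, 2007, 2007, 2006, 2006, 2006, 2007),
  CANTIDAD = c(1, 2, 1, 1, 1, 1, 5, 1, 3)
)

# Wide table: one row per ID, one column per year; absent (ID, YEAR) pairs become 0
wide <- dcast(DT, ID ~ YEAR, fun.aggregate = sum, value.var = "CANTIDAD", fill = 0)

# Back to long format, keeping the year as a number rather than a factor
res <- melt(wide, id.vars = "ID", variable.name = "YEAR",
            value.name = "CANTIDAD", variable.factor = FALSE)
res[, YEAR := as.integer(YEAR)]
setkey(res, ID, YEAR)

# Running total within each ID, then drop the years before the first record
res[, CUSUM := cumsum(CANTIDAD), by = ID]
res <- res[CUSUM > 0]
res
```

Filtering on CUSUM > 0 only removes the leading zero-years of each ID; years with no new records after the first one are kept, which is what the question asks for.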

Regards. Olivier
First of all, many thanks to Carlos, Javier Rubén and Daniel for your help. I have
learned a lot from each of your suggestions.

In particular, thanks to Carlos Ortega for the code with the second loop. A lot of
work went into it, and splitting it into two loops struck me as a very good idea. I
applied it to my data and I get:

   ID YEAR  CUSUM
1 100 2005     3
2 100 2006     3
3 100 2007     9
4 120 2006     7
5 120 2007     3

That is almost what I am looking for. What I am actually after (sorry, Carlos, it
was hard for me to explain the goal briefly in words) is the following:

   ID YEAR  CUSUM
1 100 2005     3
2 100 2006     3 (0 new data points are added: the cumulative total is still 3)
3 100 2007     6 (3 new data points are added: the cumulative total is 6)
4 120 2006     7
5 120 2007     10 (3 new data points are added: the cumulative total is 10)

Carlos, I used your code on the data frame:

#  My data frame: datos2

datos2 <- data.frame(ID = c(rep(100,5),rep(120,4)),
  FECHA = c("02/08/2005", "19/10/2005", "09/02/2007", "25/10/2007","29/10/2007",
  "11/05/2006", "17/08/2006", "15/10/2006", "16/04/2007"),
  CANTIDAD = c(1, 2, 1, 1, 1, 1, 5, 1, 3))

class(datos2$FECHA) # A factor (character in R >= 4.0, where stringsAsFactors defaults to FALSE)

datos2$FECHA <- as.Date(datos2$FECHA,"%d/%m/%Y")
class(datos2$FECHA) # Now it is a Date

# Code

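library(sqldf)

# The query below uses a YEAR column that datos2 does not have yet; derive it from FECHA
datos2$YEAR <- as.numeric(format(datos2$FECHA, "%Y"))

df.tmp <- sqldf("select ID, YEAR, sum(CANTIDAD) as cusum from datos2
                 group by ID, YEAR order by ID, YEAR")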

for(i in 1:nrow(df.tmp)) {
  if(i==1 ) {
        df.tmp$difID[i] <- 0
        df.tmp$difYE[i] <- 0

  }
  else{

     if(df.tmp$ID[i]!=df.tmp$ID[i-1] & (df.tmp$YEAR[i]-df.tmp$YEAR[i-1] <0)) {
                df.tmp$difID[i] <- 0
                df.tmp$difYE[i] <- 0
     } else {
              df.tmp$difID[i] <- df.tmp$ID[i] - df.tmp$ID[i-1]
              df.tmp$difYE[i] <- df.tmp$YEAR[i] - df.tmp$YEAR[i-1]
     }
  }
}
# df.tmp (commented out because it is not the final result)

#------- Second loop to insert rows where years are skipped
# Insert rows when the year gap is 2 or more...
df.new <- 0
for(i in 1:nrow(df.tmp)) {

  # Copy the row as is when the year difference is less than two.
  if(df.tmp$difYE[i] < 2) {

        df.new <- rbind(df.new, c(df.tmp$ID[i] , df.tmp$YEAR[i],df.tmp$cusum[i]))

  } else {
    # If the year difference is two or more, loop over the missing years,
    #         keeping in mind that cusum accumulates...
    cusum.cont <- df.tmp$cusum[i-1]
    for(j in 1:(df.tmp$difYE[i]-1) ) {
        df.new <- rbind(df.new, c(df.tmp$ID[i], df.tmp$YEAR[i-1]+j, df.tmp$cusum[i-1]))
        cusum.cont <- cusum.cont + df.tmp$cusum[i-1]
    }
    # After looping, copy the row I was on
        df.new <- rbind(df.new, c(df.tmp$ID[i], df.tmp$YEAR[i], df.tmp$cusum[i]+cusum.cont))
  }

}
# df.tmp (commented out because it is an intermediate data frame)

# The data frame we are after: df.new
df.new <- df.new[2:nrow(df.new),]
row.names(df.new) <- NULL
df.new <- as.data.frame(df.new)
names(df.new) <- c('ID', 'YEAR','CUSUM')
df.new
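
Tracing the two loops on these data suggests where the 9 and the 3 come from: rows without a gap are copied with their per-year total unchanged, and for the one-year gap in ID 100 the previous total ends up counted twice. As a possible alternative, here is a minimal base R sketch that fills each ID's missing years with 0 and then accumulates. It is not Carlos's method; the names datos, tot, pieces and added are hypothetical, and YEAR is assumed to be already extracted from the date.

```r
# Toy data from the thread, with YEAR already extracted from FECHA (assumption)
datos <- data.frame(
  ID       = c(rep(100, 5), rep(120, 4)),
  YEAR     = c(2005, 2005, 2007, 2007, 2007, 2006, 2006, 2006, 2007),
  CANTIDAD = c(1, 2, 1, 1, 1, 1, 5, 1, 3)
)

# Total per ID and year
tot <- aggregate(CANTIDAD ~ ID + YEAR, data = datos, FUN = sum)

# For each ID, cover every year from its first to its last observed year,
# count 0 for years without records, then accumulate
pieces <- lapply(split(tot, tot$ID), function(d) {
  years <- seq(min(d$YEAR), max(d$YEAR))
  added <- d$CANTIDAD[match(years, d$YEAR)]
  added[is.na(added)] <- 0
  data.frame(ID = d$ID[1], YEAR = years, CUSUM = cumsum(added))
})

res <- do.call(rbind, pieces)
row.names(res) <- NULL
res
```

On the thread's data this gives CUSUM values 3, 3, 6 for ID 100 (2005 to 2007) and 7, 10 for ID 120 (2006 to 2007), which is the target table described in the message.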


_______________________________________________
R-help-es mailing list
R-help-es en r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

--
____________________________________

Olivier G. Nuñez
Email: onunez en unex.es
http://kolmogorov.unex.es/~onunez
Tel : +34 663 03 69 09
Departamento de Matemáticas
Universidad de Extremadura