Performance problems to fill up a dataframe
Try to assign some names to your initial variables:

dat <- data.frame(A=c(60001,60001,60050,60050,60050),
                  B=c(27,129,618,27,1579))

And what you want is simply:

table(dat)
       B
A       27 129 618 1579
  60001  1   1   0    0
  60050  1   0   1    1

Why do you need it as a dataframe anyway?

Hth,
Adrian
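If the result really does have to be a 0/1 data frame like the one built by the loop in the original message, the following is a minimal sketch of two possible ways to get there without looping. It assumes the column names A and B used above. The object names tab, m and m2 are only illustrative. The comparison with 0 is there because table() returns counts, so a pair that occurs more than once in the real data would otherwise show up as 2 or more rather than 1.

```r
dat <- data.frame(A = c(60001, 60001, 60050, 60050, 60050),
                  B = c(27, 129, 618, 27, 1579))

# Option 1: cross-tabulate, reduce counts to presence/absence,
# then convert the 2-d table to a data frame (rows = A, columns = B).
tab <- table(dat)
m <- as.data.frame.matrix((tab > 0) * 1)

# Option 2: keep the original row/column lookup but fill all cells at once
# by indexing the matrix with a two-column (row, column) index matrix.
LR <- sort(unique(dat$A))
LC <- sort(unique(dat$B))
m2 <- matrix(0, nrow = length(LR), ncol = length(LC),
             dimnames = list(LR, LC))
m2[cbind(match(dat$A, LR), match(dat$B, LC))] <- 1
```

Both versions make a single vectorised pass over the data, so they should scale to 20000 rows far better than assigning one cell of a data frame per loop iteration. The actual timings have not been measured here.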
On Monday 24 September 2007, Florian Jansen wrote:
Dear Listmembers,
I'm trying to fill up a dataframe depending on an arbitrary list of
references:
Here is my code, which works:
dat <- data.frame(c(60001,60001,60050,60050,60050),c(27,129,618,27,1579))
LR <- sort(unique(dat[,1]))
LC <- sort(unique(dat[,2]))
m <- as.data.frame(matrix(data=NA, nrow=length(LR), ncol=length(LC),
                          dimnames=list(LR,LC)))
for(i in 1:nrow(dat)){
  m[as.character(dat[i,1]), as.character(dat[i,2])] <- 1
}
m[is.na(m)] <- 0
Now I'm trying to avoid the loop, because it takes ages for a list of
20000 entries, but I have run out of ideas.
Should I inflate my list beforehand, and if so, how? Can I address the dataframe
fields more efficiently?
Thanks for your help.
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
+40 21 3120210 / int.101