data.frame() size
2 messages · Matthew Dowle, Peter Dalgaard
Matthew Dowle <mdowle at concordiafunds.com> writes:
Hi,

In the example below, why is d 10 times bigger than m, according to object.size()? It also takes around 10 times as long to create, which fits with object.size() being truthful. gcinfo(TRUE) also indicates a great deal more garbage collector activity caused by data.frame() than by matrix().

$ R --vanilla ....
nr = 1000000
system.time(m<<-matrix(integer(1), nrow=nr, ncol=2))
[1] 0.22 0.01 0.23 0.00 0.00
system.time(d<<-data.frame(a=integer(nr), b=integer(nr)))
[1] 2.81 0.20 3.01 0.00 0.00 # 10 times longer
dim(m)
[1] 1000000 2
dim(d)
[1] 1000000 2 # same dimensions
storage.mode(m)
[1] "integer"
sapply(d, storage.mode)
        a         b
"integer" "integer" # same storage.mode
object.size(m)/1024^2
[1] 7.629616
object.size(d)/1024^2
[1] 76.29482 # but 10 times bigger
sum(sapply(d, object.size))/1024^2
[1] 7.629501 # or is it? If it's not really 10 times bigger, why 10 times longer above?
Row names!!
r <- as.character(1:1e6)
object.size(r)
[1] 72000056
object.size(r)/1024^2
[1] 68.6646

'nuff said?
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
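
The row-names explanation in the reply can be checked with a small sketch along the following lines. It is not from the thread. It assumes the R installation at hand may store automatic row names differently from the 2005 session, so it sets character row names explicitly on a copy. The names d_chr, size_cols, size_names and size_chr are made up for illustration, and nr is reduced from the thread's 1e6 to keep the run short.

```r
nr <- 100000L  # smaller than the 1e6 used in the thread, to keep this quick

# Two integer columns, as in the original question
d <- data.frame(a = integer(nr), b = integer(nr))

# Hypothetical copy with explicit character row names, mimicking what the
# reply identifies as the source of the extra size
d_chr <- d
rownames(d_chr) <- as.character(seq_len(nr))

# Size of the columns alone, of the row names alone, and of the whole object
size_cols  <- sum(vapply(d, function(x) as.numeric(object.size(x)), numeric(1)))
size_names <- as.numeric(object.size(rownames(d_chr)))
size_chr   <- as.numeric(object.size(d_chr))

# Report in MB, the same unit as in the thread
round(c(columns = size_cols, row_names = size_names, total = size_chr) / 1024^2, 2)
```

If the reply's reasoning holds, the row_names figure should dominate the total, with the two integer columns accounting for only a small share.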