Help with "row.names = as.integer(c(NA, 5))" in file from dput

Mike Prager wrote:
It's mainly a space-saving device. Originally, row.names was a character 
vector, but storing character vectors is quite inefficient, so we now 
allow integer row names and also a very short form in which 1:n is stored 
as just the single value n. To distinguish the short form from an 
ordinary integer vector of names, we use the c(NA, n) form; this is 
unambiguous because row names are not allowed to be missing.
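
To look at the stored form directly, something like the following should 
work. It is only a sketch, assuming a reasonably recent R in which the base 
function .row_names_info() is available; the variable names are just 
illustrative. The sign of the second element depends on the R version 
(recent versions store a negative n for automatic row names), so the code 
looks only at its absolute value.

```r
# Illustrative data frame with default (automatic) row names
d <- data.frame(x = rnorm(5))

# type = 0L returns the row.names attribute as stored internally:
# for automatic row names, a length-2 integer vector starting with NA
stored <- .row_names_info(d, type = 0L)
is.na(stored[1])   # the NA marks the short form
abs(stored[2])     # the number of rows

# The usual accessors expand the short form on demand
expanded <- attr(d, "row.names")   # integer vector 1:5
row.names(d)                       # the same names as character strings

# Explicit integer names that are not 1:n are stored in full
row.names(d) <- 5:1
full <- .row_names_info(d, type = 0L)
```

This is why a file written by dput() can show a two-element row.names 
beginning with NA even though row.names(d) reports one name per row.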

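Consider the following and notice how the character row names take up 
roughly 36 bytes per record ((44384 - 8392) / 1000), whereas the actual 
data are only 8 bytes per record. Integer row names cost about 4 bytes 
per record, and the short form adds nothing.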

 > d<-data.frame(x=rnorm(1000))
 > object.size(d)
[1] 8392
 > row.names(d)<-as.character(1:1000)
 > object.size(d)
[1] 44384
 > row.names(d)<-1000:1
 > object.size(d)
[1] 12384
 > row.names(d)<-NULL
 > object.size(d)
[1] 8392