Best way to preallocate numeric NA array?

8 messages · Douglas Bates, Henrique Dallazuanna, Rob Steele +2 more

#
These are the ways that occur to me.

## This produces a logical vector, which will get converted to a numeric
## vector the first time a number is assigned to it.  That seems
## wasteful.
x <- rep(NA, n)

## This does the conversion ahead of time but it's still creating a
## logical vector first, which seems wasteful.
x <- as.numeric(rep(NA, n))

## This avoids type conversion but still involves two assignments for
## each element in the vector.
x <- numeric(n)
x[] <- NA

## This seems reasonable.
x <- rep(as.numeric(NA), n)

Comments?

Thanks,
Rob
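[Editor's note] A minimal sketch for checking the claims in the comments above, assuming a small n chosen only for illustration. It builds the vector each of the four ways and reports the storage type of the result.

```r
n <- 5L  # illustrative size, not from the original post

x1 <- rep(NA, n)               # plain NA is logical
x2 <- as.numeric(rep(NA, n))   # logical vector first, then converted
x3 <- numeric(n); x3[] <- NA   # zeros first, then overwritten with NA
x4 <- rep(as.numeric(NA), n)   # a single numeric NA, repeated

types <- c(typeof(x1), typeof(x2), typeof(x3), typeof(x4))
print(types)

# Assigning a number to the logical vector converts it to double.
x1_after <- x1
x1_after[1] <- 3.14
print(typeof(x1_after))
```

Only the first form starts out as a logical vector; the other three already hold numeric (double) NAs.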
#
On Thu, Nov 26, 2009 at 10:03 AM, Rob Steele
<freenx.10.robsteele at xoxy.net> wrote:
My intuition would be to go with the third method (allocate a numeric
vector then assign NA to its contents) but I haven't tested the
difference.  In fact, it would be difficult to see differences in, for
example, execution time unless n was very large.
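[Editor's note] One rough way to test the difference is base R's system.time(). This is only a sketch: the value of n and the names in the `methods` list are assumptions made for illustration, and a single timing like this is noisy, so the numbers should be read as indicative at best.

```r
n <- 1e6  # assumed "very large" n; adjust to taste

# The three numeric variants from the original post, wrapped as functions.
methods <- list(
  as_numeric_rep = function(n) as.numeric(rep(NA, n)),
  fill_numeric   = function(n) { x <- numeric(n); x[] <- NA; x },
  rep_as_numeric = function(n) rep(as.numeric(NA), n)
)

# Elapsed wall-clock seconds for one call of each variant.
elapsed <- sapply(methods, function(f) system.time(f(n))[["elapsed"]])
print(elapsed)
```

Repeating each call many times and averaging would give a steadier comparison than one run.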

This brings up a different question which is, why do you want to
consider this?  Are you striving for readability, for speed, for low
memory footprint, for "efficiency" in some other way?  When we were
programming in S on machines with 1 MIPS processors and a couple of
megabytes of memory, such considerations were important.  I'm not sure
they are quite as important now.
#
You can try this also:

rep(NA_integer_, 10)

On Thu, Nov 26, 2009 at 2:03 PM, Rob Steele
<freenx.10.robsteele at xoxy.net> wrote:
#
Or best:

rep(NA_real_, 10)

On Thu, Nov 26, 2009 at 2:31 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote:
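[Editor's note] The two suggestions differ in storage type, which is presumably why NA_real_ is called the better fit for a numeric array. A small sketch (variable names are illustrative only):

```r
xi <- rep(NA_integer_, 10)  # integer NAs
xr <- rep(NA_real_, 10)     # double (numeric) NAs

print(typeof(xi))
print(typeof(xr))

# Storing a non-integer value in the integer vector converts it to double,
# so the conversion the original post wanted to avoid happens after all.
xi_after <- xi
xi_after[1] <- 1.5
print(typeof(xi_after))
```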
#
Douglas Bates wrote:
Thanks--good questions.  For any code, I'd order the requirements like this:

1) Correct
2) Readable
3) Space efficient
4) Time efficient

Compromises are sometimes necessary.  R is such an odd language that it
really helps readability to settle on easily recognizable idioms.
That's true in any language where there's more than one way to do things
but I find it especially true in R.  I agree that the efficiency of this
operation only matters with very large vectors or very many repetitions.
#
Hi

There is one issue which I ran into recently with this type of
behaviour:

mat<-matrix(NA,5,4)
fix(mat)

Put some number in any cell and close fix.

mat
     col1 col2 col3 col4
[1,]   NA   NA   NA   NA
[2,]   NA   NA   NA   NA
[3,]   NA   NA   NA   NA
[4,]   NA   NA   NA   NA
[5,]   NA   NA   NA   NA

No value is put into mat. There is an easy workaround, but it can be
frustrating for somebody trying to find out why the value was not entered.
I know that this is not the preferred way to fill a matrix, but for such a
small matrix with only a few non-NA values it could be used.

Maybe the help page could include some kind of explanation:

"fix cannot convert the type (mode) of its argument, and therefore it is
possible to enter only values which match the type of x."

or something like that

Regards
Petr
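[Editor's note] The message does not say what the workaround is. One plausible reading, offered here as an assumption, is to make the matrix numeric before editing it. The sketch below only shows the storage modes involved; it does not call fix(), which is interactive, so its behaviour is not verified here.

```r
mat_lgl <- matrix(NA, 5, 4)        # plain NA gives a logical matrix
mat_num <- matrix(NA_real_, 5, 4)  # numeric NA gives a double matrix

print(storage.mode(mat_lgl))
print(storage.mode(mat_num))

# An existing logical matrix can also be converted in place;
# dimensions and NA values are kept.
mat_conv <- mat_lgl
storage.mode(mat_conv) <- "double"
print(storage.mode(mat_conv))
```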


 
r-help-bounces at r-project.org wrote on 26.11.2009 17:22:45:
#
I like the rep-based ones because they are more 'functional':
I can pass the output directly into another function.  If speed
is a big concern, I think that
   x <- rep.int(NA_real_, n)
may be the fastest but then it is not immediately apparent how
to do the same for integer or fancier classes.  (NA_real_ is
to the numeric class as NA_integer_ is to the integer class.)
Your last one, rep(as.<class>(NA), n), seems to me to be a
good compromise between readability and speed.
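[Editor's note] A sketch of how the rep.int idea might be extended to other basic types. The helper name na_vec and its `type` argument are hypothetical, not part of base R; the typed NA constants (NA_real_, NA_integer_, NA_character_) and rep.int are base R.

```r
# Hypothetical helper: a length-n vector of NAs of the requested basic type.
na_vec <- function(n, type = c("double", "integer", "character", "logical")) {
  type <- match.arg(type)
  na <- switch(type,
    double    = NA_real_,
    integer   = NA_integer_,
    character = NA_character_,
    logical   = NA
  )
  rep.int(na, n)
}

# Because the result is returned as a value, it can be passed straight
# into another function, which is the 'functional' style mentioned above.
n_missing <- sum(is.na(na_vec(5, "integer")))
print(n_missing)
print(typeof(na_vec(3)))
```

This covers only the basic atomic types; classes such as factors or dates would still need the rep(as.<class>(NA), n) form.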

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com