
bug in seq_along

3 messages · Hervé Pagès, Kasper Daniel Hansen

I am using the IRanges package from Bioconductor and a fairly recent R-2.9.1.

ov = IRanges(1:3, 4:6)
length(ov) # 3
seq(along = ov) # 1 2 3 as wanted
seq_along(ov) # 1!

I had expected the last line to yield 1:3. My guess is that
seq_along somehow doesn't take into account that ov is an S4 object
with a length method.
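
For readers without IRanges installed, the behaviour can be probed with a small stand-in. This is only a sketch: `ToyRanges` is a hypothetical class made up for illustration (it is not the IRanges class), and whether the two index vectors agree depends on the R version being run.

```r
library(methods)

# Hypothetical stand-in for an S4 container; not the real IRanges class.
setClass("ToyRanges", representation(start = "integer", end = "integer"))

# S4 length method: report the number of ranges.
setMethod("length", "ToyRanges", function(x) length(x@start))

ov <- new("ToyRanges", start = 1:3, end = 4:6)

via_length <- seq_len(length(ov))  # built from the dispatched S4 length
via_along  <- seq_along(ov)        # result depends on the R version

# TRUE when seq_along() honours the S4 length method, FALSE otherwise.
agrees <- identical(via_along, via_length)
print(agrees)
```

If `agrees` is FALSE, the running R has the behaviour reported above.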

The last line of the *Details* section of ?seq has a typo. Currently
it is
      'seq.int', 'seq_along' and 'seq.int' are primitive: the latter two
      ignore any argument name.
I would guess it ought to be
      'seq.int', 'seq_along' and 'seq_len' are primitive: the latter two
      ignore any argument name.

Kasper
4 days later
Hi Kasper and R developers,
Kasper Daniel Hansen wrote:
I agree, this is not good. seq_along() has always been broken on S4
objects:

   https://stat.ethz.ch/pipermail/r-devel/2007-July/046337.html

so I prefer not to use it at all, even when I deal with S3 objects:
the day I need to extend my code to handle S4 objects, it's too easy
to forget to replace 'seq_along(x)' with 'seq_len(length(x))'.
So I'd rather use the latter all the time, from the very beginning
(hopefully there is no serious performance penalty for doing this).
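
The 'seq_len(length(x))' idiom can be wrapped in a one-line helper. A minimal sketch follows; the names `seq_along_safe` and `ToyStack` are hypothetical and exist only for this illustration.

```r
library(methods)

# Hypothetical helper name; it is just the seq_len(length(x)) idiom.
seq_along_safe <- function(x) seq_len(length(x))

# Hypothetical toy S4 class with its own length method.
setClass("ToyStack", representation(items = "list"))
setMethod("length", "ToyStack", function(x) length(x@items))

s <- new("ToyStack", items = list("a", "b", "c"))

idx_s4   <- seq_along_safe(s)          # uses the S4 length method
idx_vec  <- seq_along_safe(c(10, 20))  # ordinary vectors behave as usual
idx_none <- seq_along_safe(NULL)       # integer(0), so a for loop runs zero times
```

Because length() is called at the R level here, any S4 length method defined for the class is used.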

Surprisingly, seq_along() deserves its own C implementation (why
wouldn't seq_along <- function(x) seq_len(length(x)) be just good
enough?). It calls length() at the C level, which is an inline
function defined as:

INLINE_FUN R_len_t length(SEXP s)
{
     int i;
     switch (TYPEOF(s)) {
     case NILSXP:
         return 0;
     case LGLSXP:
     case INTSXP:
     case REALSXP:
     case CPLXSXP:
     case STRSXP:
     case CHARSXP:
     case VECSXP:
     case EXPRSXP:
     case RAWSXP:
         return LENGTH(s);
     case LISTSXP:
     case LANGSXP:
     case DOTSXP:
         i = 0;
         while (s != NULL && s != R_NilValue) {
             i++;
             s = CDR(s);
         }
         return i;
     case ENVSXP:
         return Rf_envlength(s);
     default:
         return 1;
     }
}

Hence it will return 1 when 's' is an S4SXP, since that type has no
case of its own and falls through to the default branch.

If, for whatever reason, seq_along() is not able to figure out what
the *real* length of an S4 object is, then wouldn't it be better to
make it return an error? Or at least to put a big warning in its
man page saying: DON'T TRUST ME ON YOUR S4 OBJECTS, I'M BROKEN!

Cheers,
H.

  
    
6 days later
This has now been fixed in R-2.9 and R-devel by Martin Maechler.

Thanks
Kasper
On Jul 13, 2009, at 15:43, Hervé Pagès wrote: