
Subsetting a vector using an index with all missing values

7 messages · Peter Langfelder, Bill Dunlap, Ebert,Timothy Aaron +2 more

#
Hi all,

I stumbled on subsetting behavior that seems counterintuitive and
perhaps is a bug. Here's a simple example:
[1] NA NA NA NA NA NA NA NA NA NA
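
The commands that produced this output are missing from the archived copy. A minimal sketch that reproduces it, assuming `x` is the integer vector `1:10` and the index is three plain `NA`s (both are inferred from the surrounding text, not taken from the original message):

```r
# Assumed inputs: a 10-element vector and an index of three plain NAs.
x <- 1:10
index <- c(NA, NA, NA)

result <- x[index]
print(result)          # ten NAs, one per element of x
print(length(result))  # 10, not length(index)
```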

I would have expected 3 NAs (the length of the index), not 10 (the
length of x). I looked at the documentation for the subsetting operator
`[` but found nothing indicating that an index consisting entirely of
missing values returns a result as long as the entire vector.

I can work around the issue for a general 'index' using a somewhat
clunky but straightforward construct along the lines of
[1] NA NA NA
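
The workaround itself is also missing from the archived copy. One construct in this spirit, offered only as a sketch and not necessarily the one Peter used, is to coerce the index to integer before subsetting, so that an all-NA index is no longer treated as logical. The helper name `subset_by_position` is hypothetical, and it assumes the index is always meant as a set of positions, never as a TRUE/FALSE mask:

```r
# Hypothetical helper: force the index to be positional (integer).
# Not suitable for logical masks, since TRUE/FALSE would become 1/0.
subset_by_position <- function(x, index) {
  x[as.integer(index)]
}

x <- 1:10
subset_by_position(x, c(NA, NA, NA))   # three NAs
subset_by_position(x, c(2, NA, 5))     # 2 NA 5
```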

but I'm wondering if the behaviour above is intended.

Thanks,

Peter
#
This has to do with the mode of the subscript - logical subscripts are
repeated to the length of x and integer/numeric ones are not.  NA is
logical, NA_integer_ is integer, so we get
[1] NA NA NA
[1] NA NA NA NA NA NA NA NA NA NA
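
The two calls behind these outputs are missing from the archived copy. A sketch of the distinction Bill describes, assuming `x <- 1:10` as in the original question:

```r
x <- 1:10

typeof(NA)            # "logical"
typeof(NA_integer_)   # "integer"

# Integer index: one result element per index element.
int_result <- x[rep(NA_integer_, 3)]
length(int_result)    # 3

# Logical index shorter than x: recycled to length(x) before selecting.
lgl_result <- x[rep(NA, 3)]
length(lgl_result)    # 10

# The same recycling with non-missing logicals picks every other element.
x[c(TRUE, FALSE)]     # 1 3 5 7 9
```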

-Bill


On Fri, Jul 1, 2022 at 8:31 PM Peter Langfelder <peter.langfelder at gmail.com>
wrote:

  
  
#
Ah, thanks, that makes sense.

Peter
On Fri, Jul 1, 2022 at 10:01 PM Bill Dunlap <williamwdunlap at gmail.com> wrote:
#
That nicely explains the difference in outcome between 
x[rep(TRUE,3)]
x[rep("TRUE",3)]
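
For comparison, a short sketch of what these two calls return when `x <- 1:10` (an assumption carried over from the original question): the logical index is recycled, while the character index is matched against names.

```r
x <- 1:10

lgl <- x[rep(TRUE, 3)]     # logical index, recycled: all ten elements
chr <- x[rep("TRUE", 3)]   # character index, matched against names(x)

length(lgl)   # 10
length(chr)   # 3; x has no names, so each lookup gives NA
```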


I do not quite get it.
x<-1:10
x[rep(x<2,3)]
[1] 1 NA NA
The length is three

but
x[rep(x>2,3)]
[1]  3  4  5  6  7  8  9 10 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
The length is 24
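
Both lengths follow from the number of TRUE values in the index, because `rep(..., 3)` builds a logical vector of length 30, which is longer than `x`. A small sketch that counts them (the variable names are illustrative):

```r
x <- 1:10

idx_small <- rep(x < 2, 3)   # TRUE at positions 1, 11, 21
idx_large <- rep(x > 2, 3)   # TRUE at 3:10, 13:20, 23:30

length(idx_small)            # 30
which(idx_small)             # 1 11 21
sum(idx_large)               # 24

# Each TRUE yields one result element; positions beyond length(x) yield NA.
length(x[idx_small])         # 3
length(x[idx_large])         # 24
sum(is.na(x[idx_large]))     # 16
```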

Tim
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Peter Langfelder
Sent: Saturday, July 2, 2022 2:19 AM
To: Bill Dunlap <williamwdunlap at gmail.com>
Cc: r-help <r-help at r-project.org>
Subject: Re: [R] Subsetting a vector using an index with all missing values

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Perhaps it should be an error if the length of a logical subscript is
bigger than the dimension it is subscripting.  Currently in that case, x is
extended (with NA or NULL) to the length of the logical subscript.  I doubt
this is desired very often.
c(NA_integer_, NA_integer_)
c(3L, NA)
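
A sketch of the extension Bill describes, using made-up three-element inputs rather than his original calls: when the logical subscript is longer than the object, an atomic vector is padded with NA and a list with NULL.

```r
v <- 1:3
l <- list("a", "b", "c")
long_idx <- c(FALSE, FALSE, TRUE, TRUE)   # one element longer than v and l

v_out <- v[long_idx]   # 3 NA
l_out <- l[long_idx]   # a list holding "c" and NULL

length(v_out)          # 2
is.null(l_out[[2]])    # TRUE
```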

-Bill
On Sat, Jul 2, 2022 at 7:49 AM Ebert,Timothy Aaron <tebert at ufl.edu> wrote:

            

  
  
#
Actually, Bill, I suspect there is a not uncommon use case in which you want a shorter logical vector to be broadcast, or reused as often as needed.

The example below, if it can be read, uses short boolean vectors to get the odd elements of another vector, then the even ones, and then two out of every three elements, because the index gets recycled. I suggest there are many variants like this in use, although a better design might have been to add a flag that allows such expansion and to treat it as an error otherwise.
[1] 11 13 15 17 19
[1] 12 14 16 18 20
[1] 12 13 15 16 18 19
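
The code for this example is missing from the archived copy. A sketch that reproduces the three outputs above, assuming the vector is `11:20` and the indices are the short logical patterns shown (both are inferred from the output, not taken from the original message):

```r
x <- 11:20

odd_pos   <- x[c(TRUE, FALSE)]         # 11 13 15 17 19
even_pos  <- x[c(FALSE, TRUE)]         # 12 14 16 18 20
skip_each <- x[c(FALSE, TRUE, TRUE)]   # 12 13 15 16 18 19
```

In each call the logical index is shorter than `x`, so it is recycled to length 10 before selection.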



-----Original Message-----
From: Bill Dunlap <williamwdunlap at gmail.com>
To: Ebert,Timothy Aaron <tebert at ufl.edu>
Cc: r-help <r-help at r-project.org>
Sent: Sat, Jul 2, 2022 11:24 am
Subject: Re: [R] Subsetting a vector using an index with all missing values

#
Reread Bill's comment. It referred to logical indices being **longer** than
the vectors they subscript. You have it the other way round.

Bert

On Sun, Jul 3, 2022, 12:40 AM Avi Gross via R-help <r-help at r-project.org>
wrote: