Randomly remove condition-selected rows from a matrix
Stavros Macrakis wrote:
On Wed, Dec 31, 2008 at 12:44 PM, Guillaume Chapron <carnivorescience at gmail.com> wrote:
m[-sample(which(m[,1]<8 & m[,2]>12),2),]
Suppose I sample only one row among the ones matching my criteria. Then
consider the case where just one row matches these criteria. Sure,
there is no need to sample, but the instruction would still be executed.
If this row index is 15, my instruction becomes sample(15,1), and this
can give me any row from 1 to 15, which is not correct. I have to add a
condition for the case where only one row matches the criteria.
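One possible workaround, sketched here for illustration only: index into the candidate vector by position instead of passing the vector to sample() directly. The helper name safe_sample and the example matrix are hypothetical, and the sketch assumes a version of R that provides sample.int() in base.

```r
# Hypothetical helper (not part of base R): always sample from the
# elements of x, even when x has length 1.
safe_sample <- function(x, size = length(x), ...) {
  x[sample.int(length(x), size, ...)]
}

# Made-up example matrix in which only row 4 satisfies the condition.
m <- cbind(c(9, 10, 11, 1), c(20, 20, 5, 20))
idx <- which(m[, 1] < 8 & m[, 2] > 12)   # a single index: 4

# sample(idx, 1) would draw from 1:4 here; safe_sample() can only return 4.
# The length check avoids indexing with an empty vector when nothing matches.
m2 <- m
if (length(idx) > 0) {
  m2 <- m[-safe_sample(idx, 1), , drop = FALSE]
}
```

With a single matching row, the helper removes exactly that row, so no separate branch for the one-match case is needed.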
Yes, this is a (documented!) design flaw in 'sample' -- see the man page. For some reason, the designers of R have chosen to document the flaw and leave it up to individual users to work around it rather than fix it definitively. A related case is sample(c(),0), which gives an error rather than giving an empty vector, though in general R deals with empty vectors correctly (e.g. sum(c()) => 0).
interestingly, ?sample says:
"
'sample' takes a sample of the specified size from the elements of
'x' using either with or without replacement.
x: Either a (numeric, complex, character or logical) vector of
more than one element from which to choose, or a positive
integer.
If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
'x >= 1', sampling takes place from '1:x'. _Note_ that this
convenience feature may lead to undesired behaviour when 'x' is of
varying length 'sample(x)'. See the 'resample()' example below.
"
yet the following works, even though x has length 1 and is *not* numeric:
x = "foolme"
is.numeric(x)
sample(x, 1)
sample(x)
x = NA
is.numeric(NA)
sample(x, 1)
sample(x)
is this a bug in the code, or a bug in the documentation?
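For reference, the rule quoted from the Details section can be written out as a small predicate. This is only a paraphrase of the documented wording, not the actual source of sample(), and the name uses_1_to_x is hypothetical.

```r
# Hypothetical predicate paraphrasing the quoted rule: sampling takes place
# from 1:x only if x has length 1, is numeric, and x >= 1.
uses_1_to_x <- function(x) {
  isTRUE(length(x) == 1 && is.numeric(x) && x >= 1)
}

uses_1_to_x(15)        # TRUE
uses_1_to_x("foolme")  # FALSE: not numeric
uses_1_to_x(NA)        # FALSE: a logical NA is not numeric
uses_1_to_x(c(3, 7))   # FALSE: length is not 1
```

Under that reading, the two examples above do not trigger the 1:x shortcut, so the behaviour matches the Details paragraph. It is the description of the argument x ("a vector of more than one element") that does not.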
To my mind, it is bizarre to have an important basic function which works for some argument lengths but not others. The convenience of being able to write sample(5,2) for sample(1:5,2) hardly seems worth inflicting inconsistency on all users -- but perhaps one of the designers of R/S can enlighten us on the design rationale here.
hopefully. vQ