sequential row selection in dataframe - R-help

Pedro Mardones

Mon, Dec 25, 2006 9:07 PM

Dear all;

I'm wondering if there is an 'efficient' approach for selecting
every nth row from a data frame. For example, let's use the GAGurine
data frame from the MASS package:

[1] 314

# select 75% of the dataset (236 rows): first, every 2nd row starting
# from row 1

[1] 157

# so, I still need another 79 rows, one way could be:
test2 <- GAGurine[-seq(1, 314, 2), ]

[1] 157

# and then
final <- rbind(test2, test3)

[1] 236

Does anyone have a better idea to get the same results but without
creating different datasets like test2 and test3?

Thanks
PM
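
A possible single-step version of the selection described above, offered only as a sketch. The commands that produced the printed row counts did not survive in this archive, so the second index set (every 4th row starting at row 2) is an assumption about which 79 extra rows were meant. The names `odd_rows`, `extra_rows` and `idx` are illustrative.

```r
library(MASS)  # provides the GAGurine data frame

n <- nrow(GAGurine)                   # 314 rows
odd_rows   <- seq(1, n, by = 2)       # every 2nd row starting at row 1 (157 rows)
extra_rows <- seq(2, n, by = 4)       # assumed: every other remaining row (79 rows)
idx <- sort(c(odd_rows, extra_rows))  # combine and restore the original row order

final <- GAGurine[idx, ]              # one subset, no intermediate data frames
nrow(final)
```

Because the selection is built as one index vector, no helper data frames are needed. Under this assumption the result is the same as dropping rows 4, 8, 12, and so on.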

Michael Kubovy

Tue, Dec 26, 2006 4:42 AM

On Dec 26, 2006, at 12:07 AM, Pedro Mardones wrote:

A probabilistic approach:

library(MASS) # provides the GAGurine data frame
len <- nrow(GAGurine) # 314 rows
GAGu <- GAGurine[sample(len, round(.75 * len)), ] # 236 randomly chosen rows
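
One thing to watch with the random draw: `sample()` returns the row numbers in random order, so the subset comes back shuffled. A minimal sketch that keeps the original row order and makes the draw repeatable follows. The seed value and the names `keep` and `GAGu_ordered` are arbitrary choices for illustration, not part of the suggestion above.

```r
library(MASS)  # provides the GAGurine data frame

set.seed(1)    # arbitrary seed, only so the same rows are drawn on every run
len <- nrow(GAGurine)

# Draw 75% of the row numbers without replacement, then sort them so the
# subset keeps the row order of the original data frame.
keep <- sort(sample(len, round(0.75 * len)))
GAGu_ordered <- GAGurine[keep, ]
nrow(GAGu_ordered)
```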

A deterministic one:

nr <- 1 # or 2
GAGu2 <- GAGurine[-seq(nr, len, 4), ] # drop every 4th row starting at row nr, leaving 235 rows

Setting nr <- 3 (or 4) instead drops 78 rows rather than 79, which
leaves 236 rows.
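
The effect of the starting row can be checked directly. The sketch below simply repeats the deterministic subset for each start from 1 to 4 and counts the rows that remain. `rows_kept` is an illustrative name.

```r
library(MASS)  # provides the GAGurine data frame

len <- nrow(GAGurine)  # 314 rows

# For each starting row nr, drop rows nr, nr + 4, nr + 8, ... and count the rest.
rows_kept <- sapply(1:4, function(nr) nrow(GAGurine[-seq(nr, len, 4), ]))
names(rows_kept) <- paste0("nr=", 1:4)
rows_kept
```

Starts 1 and 2 remove 79 rows each, while starts 3 and 4 remove only 78, because 314 is not a multiple of 4.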
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
WWW:    http://www.people.virginia.edu/~mk9y/

Greg Snow

Thu, Dec 28, 2006 7:23 PM

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20061228/58ed4f88/attachment.pl