Sampling from a Postgres database

3 messages · christiaan pauw, Bart Joosen, Joe Conway

#
One way would be to select only the unique IDs first, sample from those, and
then select only the matching records:

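# Assumes `channel` is an open RODBC connection
strQuery <- "SELECT ID FROM tblFoo;"
IDs <- sqlQuery(channel, strQuery)
# sqlQuery() returns a data frame, so sample from its first (only) column
sample.IDs <- sample(IDs[[1]], 10)
# Collapse the sampled IDs into one comma-separated list for the IN clause
strQuery <- paste("SELECT * FROM tblFoo WHERE ID IN (", paste(sample.IDs, collapse = ", "), ");")
sampled <- sqlQuery(channel, strQuery)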
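
The query string is the step most likely to go wrong here: paste() is vectorised, so it returns one string per ID unless the IDs are collapsed into a single list first. A minimal sketch of that step on its own, with no database needed. build_in_query is a hypothetical helper name, tblFoo and ID are the names used above, and the IDs are assumed to be numeric (character IDs would need quoting):

```r
# Hypothetical helper: build one SELECT with all sampled IDs in a single IN list.
# Assumes numeric IDs; character IDs would need quoting/escaping.
build_in_query <- function(ids, table = "tblFoo", column = "ID") {
  id_list <- paste(ids, collapse = ", ")
  sprintf("SELECT * FROM %s WHERE %s IN (%s);", table, column, id_list)
}

set.seed(1)                        # only to make the sketch reproducible
all_ids <- 1:1000                  # stand-in for the IDs returned by the first query
sample_ids <- sample(all_ids, 10)  # 10 IDs drawn without replacement
query <- build_in_query(sample_ids)
print(query)
```

The result is a single string that can be passed to sqlQuery() as is.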

Bart
#
On 01/15/2010 01:49 AM, Bart Joosen wrote:
It is better to use the built-in random() function in Postgres, which here
keeps each row with probability 0.005:

# select count(*) from visits;
  count
---------
 4846604
(1 row)

# select count(*) from visits where random() < 0.005;
 count
-------
 24391
(1 row)
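
Note that this filter keeps each row independently, so the sample size varies from run to run around 0.005 × 4846604 ≈ 24233; the 24391 above is one such draw. A rough sketch in R that mimics the filter locally, with runif() standing in for Postgres's random() (an assumption for illustration only, not a claim about how the server generates its numbers):

```r
n_rows <- 4846604   # row count of the visits table, from the output above
p <- 0.005          # sampling fraction used in the WHERE clause

expected <- n_rows * p                 # mean sample size
spread <- sqrt(n_rows * p * (1 - p))   # binomial standard deviation

set.seed(42)                           # reproducibility only
kept <- sum(runif(n_rows) < p)         # one simulated run of the filter

cat("expected:", round(expected), "sd:", round(spread), "simulated:", kept, "\n")
```

If an exact sample size is needed, a common alternative is ORDER BY random() LIMIT n, at the cost of sorting the whole table.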

HTH,

Joe
