Random Relabelling

11 messages · kmatthews, Kevin Matthews, Jeremy Hetzel +4 more

kmatthews

Wed, Apr 20, 2011 7:04 AM #

I have 4000 observations that I need to randomly relabel 1000 times and then
calculate the mean of the 1000 values at each of the 4000 points.  Any ideas
for where to begin? 

Thanks
Kevin 

--
View this message in context: http://r.789695.n4.nabble.com/Random-Relabelling-tp3463100p3463100.html
Sent from the R help mailing list archive at Nabble.com.

John Kane

Wed, Apr 20, 2011 9:08 AM #

Can you explain this a bit more? At the moment I don't see what you are trying to achieve. "calculate the mean of the 1000 values at each of the 4000 points" does not seem to make sense.

Wed, Apr 20, 2011 10:22 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110420/bc72c4cc/attachment.pl>

John Kane

Wed, Apr 20, 2011 10:56 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110420/cdbd32b9/attachment.pl>

Jeremy Hetzel

Wed, Apr 20, 2011 11:25 AM #

Kevin,

The following follows John's suggestion, but without the loop.  It's quick 
for me.

Jeremy


Jeremy T. Hetzel
Boston University



## Generate sample data
n <- 4000
rep <- 1000
rate <- rnorm(n, mean = 15, sd = 2) / 100000 # Mortality rates around 15/100k

## Create an empty matrix with appropriate dimensions (rep rows, n columns)
permutations <- matrix(ncol = n, nrow = rep)

## Use apply() to resample: one permutation of `rate` per row of the empty matrix.
## apply() returns each result as a column, so the output is n x rep.
permutations <- apply(permutations, 1, function(x) {
  sample(rate, size = n, replace = FALSE)
})

## Look at the matrix
dim(permutations)
head(permutations)

## Find the mean at each location (the rows of the n x rep matrix)
means <- apply(permutations, 1, mean)
means
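
A side note on the apply() call above, as a minimal sketch on toy data (the sizes, values and the names n_rep, empty and perms are made up for illustration): when the function passed to apply() returns a vector, each result becomes a column of the output. Starting from a rep x n matrix and applying over rows therefore gives an n x rep matrix, which is why taking means over rows afterwards yields one mean per location.

```r
# Small stand-in sizes so the result is easy to inspect (hypothetical values)
n <- 5
n_rep <- 3
rate <- c(10, 20, 30, 40, 50)

empty <- matrix(nrow = n_rep, ncol = n)   # n_rep x n, filled with NA
perms <- apply(empty, 1, function(x) sample(rate, size = n, replace = FALSE))

dim(perms)        # n x n_rep: one row per location, one column per relabelling
rowMeans(perms)   # mean over the relabellings at each location
colMeans(perms)   # each column is a permutation of `rate`, so each equals mean(rate)
```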

On Wednesday, April 20, 2011 1:56:35 PM UTC-4, John Kane wrote:

There is probably a better way to do this, but a for loop like this should
work. You would just need to change the numbers to yours and then add on the
locations.
=========================================================

scores <- 1:5
mydata <- matrix(data = NA, nrow = 5, ncol = 10)

for (i in 1:10) {
  mydata[, i] <- sample(scores, 5, replace = FALSE)
}

=========================================================
--- On Wed, 4/20/11, Kevin Matthews <kevin-m... at uiowa.edu> wrote:

From: Kevin Matthews <kevin-m... at uiowa.edu>
Subject: Re: [R] Random Relabelling
To: "John Kane" <jrkr... at yahoo.ca>
Cc: r-h... at r-project.org
Received: Wednesday, April 20, 2011, 1:22 PM

I have a map of Iowa with 4000 locations.  At each location, I have a
cancer mortality rate.  I need to test my null hypothesis that the spatial
distribution of the mortality rates is random.  For this test, I need to
establish a spatial reference distribution.

My reference distribution will be created by some random relabelling
algorithm.  The 4000 locations would remain fixed, but the observed
mortality rates would be randomly redistributed.  Then, I want 1000
permutations of the same algorithm.  For each of those 1000 times, I would
record the redistributed mortality rate at each location.  Then, I would
calculate the mean of the 1000 values at each location.  The result would be
a spatial reference distribution with a mean value of the random permutations
at each of the 4000 locations.

Thanks for the response,
Kevin
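
To make the described procedure concrete, here is a rough sketch on simulated rates (the rates, the seed, and the names n_loc, n_perm, perm and loc_means are assumptions for illustration, not the real Iowa data). Because every relabelling is a permutation of the same values, each location's mean over many relabellings settles near the overall mean rate, with a spread of roughly sd(rate) / sqrt(n_perm).

```r
set.seed(42)       # arbitrary seed, only so the sketch is reproducible
n_loc  <- 4000     # number of fixed locations (from the question)
n_perm <- 1000     # number of random relabellings (from the question)
rate   <- rnorm(n_loc, mean = 15, sd = 2) / 100000   # simulated rates, not real data

# One column per relabelling; row i always refers to location i
perm <- replicate(n_perm, sample(rate))
loc_means <- rowMeans(perm)

summary(loc_means)   # per-location means cluster tightly ...
mean(rate)           # ... around the overall mean rate
sd(loc_means)        # compare with sd(rate) / sqrt(n_perm)
sd(rate) / sqrt(n_perm)
```

If the full spread of relabelled values at each location is needed rather than only the mean, the matrix `perm` in this sketch still holds all of them.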

______________________________________________
R-h... at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Wed, Apr 20, 2011 12:11 PM #

Hi:

How about

y <- rnorm(4000)
ymeans <- rowMeans(replicate(1000, y[sample(4000)]))
hist(ymeans)

system.time({y <- rnorm(4000); yy <- rowMeans(replicate(1000,
y[sample(4000)]))})
   user  system elapsed
   0.19    0.03    0.22

HTH,
Dennis
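
A small aside on the indexing used above, sketched with an arbitrary seed (the names a, b and relabelled are hypothetical): for a vector of length 4000, y[sample(4000)] and sample(y) draw the same permutation when started from the same seed, so either spelling should work. Note that sample(y) behaves differently when y is a single number, which is not the case here.

```r
y <- rnorm(4000)

set.seed(123)          # arbitrary seed, for illustration only
a <- y[sample(4000)]   # permute via shuffled indices
set.seed(123)
b <- sample(y)         # permute the vector directly

identical(a, b)        # checks that both spellings give the same relabelling

# replicate() binds the relabellings as columns: locations in rows
relabelled <- replicate(1000, sample(y))
dim(relabelled)
```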

2 days later

John Kane

Sat, Apr 23, 2011 3:42 AM #

I KNEW there was a better way!

Peter Ehlers

Sat, Apr 23, 2011 6:15 AM #

On 2011-04-23 03:42, John Kane wrote:

And you might note that

  means <- rowMeans(permutations)

is about 10-15 times faster (if speed matters).

Peter Ehlers
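
For anyone who wants to check this on their own machine, a quick sketch (the matrix is filled with placeholder random numbers, the seed is arbitrary, and via_apply and via_rowMeans are names made up here): both calls should give the same means, and system.time() shows the difference in speed, which will vary by machine.

```r
set.seed(1)   # arbitrary seed
perms <- matrix(rnorm(4000 * 1000), nrow = 4000, ncol = 1000)  # placeholder data

via_apply    <- apply(perms, 1, mean)
via_rowMeans <- rowMeans(perms)

all.equal(via_apply, via_rowMeans)   # compares up to floating-point tolerance

system.time(apply(perms, 1, mean))
system.time(rowMeans(perms))
```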

[...snipped...]

2 days later

kmatthews

Mon, Apr 25, 2011 7:53 AM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110425/b35fdb70/attachment.pl>

David Winsemius

Mon, Apr 25, 2011 9:21 AM #

On Apr 25, 2011, at 10:53 AM, kmatthews wrote:

The practice varies; some people do appreciate it. Doing so when one is not
subscribed, however, adds many additional mouse-maneuvers to the moderator
workload ... wasted time, in my opinion. I would suggest a private thank-you
message in that instance (or, better still, subscribing).

David.

John Kane

Mon, Apr 25, 2011 4:54 PM #

--- On Mon, 4/25/11, kmatthews <kevin-matthews at uiowa.edu> wrote:

Certainly and thanks
