Skip to content
Prev 248004 / 398503 Next

Pearson correlation with randomization

Hi David,

Thanks a lot for you inputs. I have modified my code accordingly. There
is one more place that I need some help.
This is my code:

========================================================================
======

X<- read.table("X.txt",as.is=T,header=T,row.names=1)
Y<- read.table("Y.txt",as.is=T,header=T,row.names=1)

X.mat<- as.matrix(X)
Y.mat<- as.matrix(Y)

# calculating the true correlation values from my original dataset
True.Corrs<- matrix()
for (k in 1:nrow(SNP.mat)){
True.Corrs[k]<- cor.test(X.mat[k,],Y.mat[k,],alternative
=c("greater"),method= c("pearson"))$p.value
}

# Creating the random distribution of Correlation p-values
X.rand <- list()
Y.rand<- list()

X.rand<-replicate(1000,(X[sample(1:ncol(Y))]),simplify=FALSE)  #
Randomizing the column values for each row
Y.rand<-replicate(1000,Y,simplify=FALSE) # Creating an equivalent list
of the Y matrix (non-randomised), to be able to do a pair-wise cor.test

Corrs.rand<- list()
tmp<- list()
for (i in 1:2){
for (j in 1:3){
# How to store a multiple values per element of list?
tmp[[j]] <- cor.test(t(X.rand[[i]][j,]),t(Y.rand[[i]][j,]),alternative
=c("greater"),method= c("pearson"))$p.value
Corrs.rand[[i]] <- rbind(Corrs.rand[[j]],tmp[[j]])
}
}

========================================================================

At this step:

for (i in 1:length(X.rand)){
for (j in 1:nrow(X.rand[[1]]){
# How to store a multiple values per element of list?
tmp[[j]] <- cor.test(t(X.rand[[i]][j,]),t(Y.rand[[i]][j,]),alternative
=c("greater"),method= c("pearson"))$p.value
Corrs.rand[[i]] <- rbind(Corrs.rand[[j]],tmp[[j]])
}
}

I am not sure how I can store multiple values per element. For eg. I
want a list of length 1000 (which is the number of random permutations I
have generated for my dataset) and in each element of the list I need to
store 12 p.values where 12 corresponds to the number of rows I have in
my randomized dataset. Eg.

[[1]]
0.23
0.05
0.78
0.78
0.87
0.11
0.003
0.9
0.76
0.11
0.23
0.56
[[2]]
0.08
0.67
0.45
0.23
0.35
0.85
0.99
0.78
0.66
0.45
0.06
0.1
[[3]]
So on... 
 
I maybe going about this in a complicated way and there may be other
ways of storing the p.values for each of my randomized dataset. So if
anybody has ideas please oblige me.
======================================================
X dataset:(sample)
#Probes	X10851	X12144	X12155	X11882	X10860	X12762	X12239	X12154
1	1	1	0	0	1	0	2	0
2	0	0	0	0	0	0	0	0
3	2	2	2	2	1	2	1	2
4	0	0	0	0	0	0	0	0
5	2	2	2	2	2	2	2	2
6	0	1	0	0	1	1	1	1
7	2	2	NaN	2	2	2	2	2
8	2	2	2	2	2	2	2	2
9	0	1	0	1	1	NaN	1	2
10	2	2	2	2	2	2	2	2
11	2	0	0	0	0	0	0	0
12	0	1	0	1	1	0	1	1


Y dataset:(sample)

Probes	X10851	X12144	X12155	X11882	X10860	X12762	X12239	X12154
1	793.0830793	788.1813828	867.8504057	729.8321265
816.8519963	805.2113707	774.5990003	854.6384306
2	12.8695023	4.312894024	10.69769375	5.872212512
13.79299806	9.394132659	6.297552848	9.307943304
3	699.7791876	826.997429	795.6409729	770.9376141
806.1241089	782.3970486	817.107482	859.7154906
4	892.8217221	869.0481458	806.3386667	812.0431017
873.5565439	794.4752191	813.9587056	814.8681274
5	892.8217221	869.0481458	806.3386667	812.0431017
873.5565439	794.4752191	813.9587056	814.8681274
6	839.7350251	943.4455677	950.7575323	859.0208018
894.246041	853.524053	941.4841508	913.0246205
7	653.1272418	751.5217836	750.1757745	737.382114
757.8486157	758.2407075	724.2185775	770.8669409
8	12.8695023	4.312894024	10.69769375	5.872212512
13.79299806	9.394132659	6.297552848	9.307943304
9	839.7350251	943.4455677	950.7575323	859.0208018
894.246041	853.524053	941.4841508	913.0246205
10	653.1272418	751.5217836	750.1757745	737.382114
757.8486157	758.2407075	724.2185775	770.8669409
11	653.1272418	751.5217836	750.1757745	737.382114
757.8486157	758.2407075	724.2185775	770.8669409
12	839.7350251	943.4455677	950.7575323	859.0208018
894.246041	853.524053	941.4841508	913.0246205

Thanks again

Manisha




-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Tuesday, January 18, 2011 11:56 PM
To: Brahmachary, Manisha
Cc: R-help at r-project.org
Subject: Re: [R] Pearson correlation with randomization
On Jan 18, 2011, at 11:23 PM, Brahmachary, Manisha wrote:

            
X <- X[sample(1:nrow(Y)), ]
You may want to look at the replicate function.
You won't have a very stable estimate of the 95th order statistics  
with "maybe" 100 replications.