Index out SNP position
9 messages · JiangZhengyu, Sarah Goslee, David L Carlson +1 more
Assuming I understand what you want, which I'm not certain of, here's one way; there are more (probably some more elegant). I'm not sure how you'd put them in a vector, since there are different numbers of values for each row of A, so instead I've made a list. unlist(SNP) will turn it into a vector. It's also not consistent which column of A has the higher and lower values.

SNP <- lapply(seq_len(nrow(A)), function(x) B[B >= min(A[x,]) & B <= max(A[x,])])
SNP
[[1]]
[1] 36003918 35838399 35838589

[[2]]
[1] 35838589

[[3]]
numeric(0)

[[4]]
[1] 36003918

[[5]]
numeric(0)
On Thu, Jan 3, 2013 at 4:54 PM, JiangZhengyu <zhyjiang2006 at hotmail.com> wrote:
Dear R experts,
I have 2 matrices, A & B. I am trying to index B against A: (1) find the B rows that fall between col 1 and col 2 of A and put them into a new vector SNP. I wrote the code below, but I cannot think of the right way to do it. Could anyone help me with the code?
Thanks,
Jiang
----
A <- matrix(c(35838396,35838674,36003908,36004090,36150188,36151202,35838584,35838674,36003908,36003992), ncol = 2)
B <- matrix(c(36003918,35838399,35838589,36262559),ncol = 1)
nr=nrow(A)
rn=nrow(B)
for (i in 1:nr)
{
for (j in 1:rn){if (B[i,1]<=A[j,1] && B[i,1]>=A[j,2]){SNP[i]=B[i,1]}}
}
-- Sarah Goslee http://www.functionaldiversity.org
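For readers wondering why the loop in the original post does not run as written: SNP is never created before it is assigned to, i runs over the rows of A but is used to index B (so it runs off the end of B), and the test B[i,1]<=A[j,1] && B[i,1]>=A[j,2] can only succeed for rows of A whose first column is the larger one. A minimal corrected sketch follows, assuming the intent is "keep every B value that lies within at least one row of A, whichever column holds the larger value". The inclusive bounds and the names `hit`, `lo`, and `hi` are assumptions, not part of the post.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

SNP <- numeric(0)                    # create the result before filling it
for (i in seq_len(nrow(B))) {        # i indexes positions in B
  hit <- FALSE
  for (j in seq_len(nrow(A))) {      # j indexes ranges in A
    lo <- min(A[j, ])                # columns of A are not consistently ordered
    hi <- max(A[j, ])
    if (B[i, 1] >= lo && B[i, 1] <= hi) {
      hit <- TRUE
      break                          # one matching range is enough
    }
  }
  if (hit) SNP <- c(SNP, B[i, 1])    # append, so SNP has no gaps
}
SNP
```

Growing SNP with c() is fine for a handful of values; for thousands of positions the vectorised replies in this thread are the better fit.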
Any element of B that falls into any pair of elements of A?
unique(unlist(SNP))
[1] 36003918 35838399 35838589
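Put together as one self-contained script, the list-then-deduplicate approach could look like the sketch below. It shows why unique() is needed: ranges in A overlap, so unlist() alone repeats some positions. The names `per_range` and `all_hits` are illustrative only.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

# One list element per row of A: the B values inside that row's range (inclusive).
per_range <- lapply(seq_len(nrow(A)), function(x) {
  B[B >= min(A[x, ]) & B <= max(A[x, ])]
})

all_hits <- unlist(per_range)  # positions repeat when ranges overlap
SNP <- unique(all_hits)        # each position once, in order of first appearance
SNP
```

Here all_hits has five entries, because 36003918 and 35838589 each fall in two ranges, while SNP keeps three.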
On Thu, Jan 3, 2013 at 5:16 PM, JiangZhengyu <zhyjiang2006 at hotmail.com> wrote:
Hi Sarah,
Outstanding! Thanks! I did not notice that the number of values for each row of A is different. Actually, I have over 10,000 rows in B and over 5,000 ranges in A. What if I only need to take out all the B rows that fall into the ranges of A, with the repeated results removed?
Best,
Jiang
Date: Thu, 3 Jan 2013 17:04:59 -0500
Subject: Re: [R] Index out SNP position
From: sarah.goslee at gmail.com
To: zhyjiang2006 at hotmail.com
CC: r-help at r-project.org
[snip]
-- Sarah Goslee http://www.functionaldiversity.org
Something like this?
indx <- sapply(1:nrow(B), function(i) any(B[i]>A[,1] & B[i]<A[,2]))
SNP <- B[indx]
SNP
[1] 36003918 35838399 35838589

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of JiangZhengyu
Sent: Thursday, January 03, 2013 3:55 PM
To: r-help at r-project.org
Subject: [R] Index out SNP position
[snip]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
I missed the fact that the columns are not consistently smaller/larger:

A <- t(apply(A, 1, function(x) c(min(x), max(x))))
A
         [,1]     [,2]
[1,] 35838396 36151202
[2,] 35838584 35838674
[3,] 35838674 36003908
[4,] 36003908 36004090
[5,] 36003992 36150188

indx <- sapply(1:nrow(B), function(i) any(B[i]>A[,1] & B[i]<A[,2]))
SNP <- B[indx]
SNP
[1] 36003918 35838399 35838589

-------
David
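A variant of the same idea, sketched under a few assumptions: pmin() and pmax() give the lower and upper bound of each range without overwriting A, and vapply() is used in place of sapply() so the result is guaranteed to be a logical vector. The strict inequalities match the code above; the list-based reply earlier in the thread used inclusive bounds (>= and <=), which differs only when a position sits exactly on a range boundary. The names `lo` and `hi` are illustrative.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

lo <- pmin(A[, 1], A[, 2])  # lower bound of each range, A left untouched
hi <- pmax(A[, 1], A[, 2])  # upper bound of each range

# TRUE for each B position that lies strictly inside at least one range
indx <- vapply(B[, 1], function(b) any(b > lo & b < hi), logical(1))
SNP <- B[indx, 1]
SNP
```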
-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu]
Sent: Thursday, January 03, 2013 4:23 PM
To: 'JiangZhengyu'; 'r-help at r-project.org'
Subject: RE: [R] Index out SNP position
[snip]
On Jan 3, 2013, at 1:54 PM, JiangZhengyu wrote:
[snip]
sapply(B, function(x) apply(A, 1, function(two) x %in% two[1]:two[2]))
      [,1]  [,2]  [,3]  [,4]
[1,]  TRUE  TRUE  TRUE FALSE
[2,] FALSE FALSE  TRUE FALSE
[3,] FALSE FALSE FALSE FALSE
[4,]  TRUE FALSE FALSE FALSE
[5,] FALSE FALSE FALSE FALSE

So the first and third B-locations are in the range of two of the rows, the second B-location is in one range, and the fourth is in none of them. There is also a Bioconductor package called `IRanges` that will undoubtedly be more efficient. (This works because the problem is of necessity dealing with integers.)
David Winsemius Alameda, CA, USA
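The %in% test above builds the full integer sequence two[1]:two[2] for every range, which can get large for long genes. As a base-R sketch of the same membership matrix built from comparisons only (not the IRanges route, which needs the Bioconductor package installed), outer() can compare every bound with every position at once. The names `lo`, `hi`, and `inside` are assumptions for illustration.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

lo <- pmin(A[, 1], A[, 2])
hi <- pmax(A[, 1], A[, 2])

# inside[i, j] is TRUE when B position j lies within range i
# (bounds inclusive, as with %in% on a start:end sequence)
inside <- outer(lo, B[, 1], "<=") & outer(hi, B[, 1], ">=")

n_ranges <- colSums(inside)  # how many ranges contain each B position
SNP <- B[n_ranges > 0, 1]    # positions that fall in at least one range
SNP
```

With 10,000 positions and 5,000 ranges this allocates logical matrices of 50 million cells, so it trades memory for speed; processing B in chunks would be one way around that.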
So given A as originally posted:

cbind(A, apply(A, 1, diff))
         [,1]     [,2]    [,3]
[1,] 35838396 36151202  312806
[2,] 35838674 35838584     -90
[3,] 36003908 35838674 -165234
[4,] 36004090 36003908    -182
[5,] 36150188 36003992 -146196

Row 1 is start/end and rows 2 through 5 are end/start, so you only want to exclude nucleotides that fall between start/end in row 1, ignoring rows 2 through 5 which are end/start? Given your sample matrix A, which rows do you want to include/exclude?

David C

From: JiangZhengyu [mailto:zhyjiang2006 at hotmail.com]
Sent: Thursday, January 03, 2013 6:36 PM
To: dcarlson at tamu.edu; r-help at r-project.org
Cc: sarah.goslee at gmail.com
Subject: RE: [R] Index out SNP position

Hi David,
Thanks for your reply!
But what if I cannot change the positions of each row pair in A? Sorry I did not make it very clear.
The two columns in A represent start-and-end or end-and-start positions of a gene. The one column in B is the single nucleotide position. I am trying to index out all the single nucleotides that fall between the start and end region of a gene.
Jiang
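If the rows of A have to stay exactly as they are, one option is to compute the bounds on the side and never reassign A. The sketch below assumes that a nucleotide counts as "in" a gene when it lies between the two recorded positions, bounds included, regardless of which column is start and which is end. It also records which gene row each hit belongs to. The names `lo`, `hi`, `hits`, `result`, `snp`, and `gene_row` are hypothetical.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

lo <- pmin(A[, 1], A[, 2])  # smaller coordinate of each gene; A is not modified
hi <- pmax(A[, 1], A[, 2])  # larger coordinate of each gene

# Rows of the comparison matrix are B positions, columns are gene rows of A.
hits <- which(outer(B[, 1], lo, ">=") & outer(B[, 1], hi, "<="), arr.ind = TRUE)

# One line per (nucleotide, gene) pair, so a position can appear more than once.
result <- data.frame(snp = B[hits[, "row"], 1], gene_row = hits[, "col"])
result

SNP <- unique(result$snp)   # each nucleotide once
```

Keeping gene_row makes it possible to look up the original start/end orientation in A afterwards, for example with A[result$gene_row, ].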
From: dcarlson at tamu.edu
To: dcarlson at tamu.edu; zhyjiang2006 at hotmail.com; r-help at r-project.org
CC: sarah.goslee at gmail.com
Subject: RE: [R] Index out SNP position
Date: Thu, 3 Jan 2013 16:35:30 -0600
[snip]