Index out SNP position

Assuming I understand what you want, which I'm not certain of, here's
one way; there are more (probably some more elegant).

I'm not sure how you'd put them in a vector, since there are different
numbers of values for each row of A, so instead I've made a list.
unlist(SNP) will turn it into a vector.

It's also not consistent which column of A has the higher and lower values.

SNP <- lapply(seq_len(nrow(A)), function(x)B[B >= min(A[x,]) & B <= max(A[x,])])
SNP
[[1]]
[1] 36003918 35838399 35838589

[[2]]
[1] 35838589

[[3]]
numeric(0)

[[4]]
[1] 36003918

[[5]]
numeric(0)
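
If it matters which row of A each position matched, one option is to flatten the list into a small data frame rather than calling unlist() directly. This is only a sketch: the column names `range_row` and `position` are invented for illustration, and A and B are copied from the original post.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

# One list element per row of A, as in the reply above
SNP <- lapply(seq_len(nrow(A)),
              function(x) B[B >= min(A[x, ]) & B <= max(A[x, ])])

# lengths(SNP) is the number of matches per row of A, so rep() repeats
# each row index once per matching position
hits <- data.frame(range_row = rep(seq_along(SNP), lengths(SNP)),
                   position  = unlist(SNP))
hits
```

Rows of A with no match simply contribute no rows to `hits`.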

Dear R experts,

I have two matrices, A and B. I am trying to index B against A: find the rows of B that fall between columns 1 and 2 of A, and put them into a new vector SNP. I wrote the code below, but I cannot think of the right way to do it. Could anyone help me with the code?

Thanks,
Jiang

A <- matrix(c(35838396,35838674,36003908,36004090,36150188,36151202,35838584,35838674,36003908,36003992), ncol = 2)
B <- matrix(c(36003918,35838399,35838589,36262559), ncol = 1)
nr = nrow(A)
rn = nrow(B)
for (i in 1:nr)
{
for (j in 1:rn){if (B[i,1]<=A[j,1] && B[i,1]>=A[j,2]){SNP[i]=B[i,1]}}
}
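
For comparison, here is a hedged sketch of how the posted loop could be made to run. It reflects one reading of the problem rather than anything stated in the thread. It assumes three things went wrong: the loop indices are swapped (i runs over rows of A but is used to index B), SNP is never created before it is assigned to, and the comparison only suits rows stored as end/start.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

SNP <- numeric(0)                  # start with an empty result vector
for (i in seq_len(nrow(B))) {      # outer loop: one position of B at a time
  pos <- B[i, 1]
  for (j in seq_len(nrow(A))) {    # inner loop: every range in A
    lo <- min(A[j, ])              # works for start/end and end/start rows
    hi <- max(A[j, ])
    if (pos >= lo && pos <= hi) {
      SNP <- c(SNP, pos)
      break                        # record each position only once
    }
  }
}
SNP
```

Growing SNP with c() is fine for four positions; the vectorised replies in this thread are a better fit for larger inputs.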

--
Sarah Goslee
http://www.functionaldiversity.org
Any element of B that falls into any pair of elements of A?
unique(unlist(SNP))
[1] 36003918 35838399 35838589
Hi Sarah,

Outstanding! Thanks!

I had not noticed that each row of A yields a different number of matches.

Actually, I have over 10,000 positions in B and over 5,000 ranges in A. What if I only
need to pull out all the B rows that fall into any of the ranges of A, with the
repeated results removed?

Best,
Jiang
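
For inputs of the size mentioned here, one possible alternative is to merge overlapping ranges first and then look each position up with findInterval(). Nobody in the thread proposed this. It is a base R sketch that assumes A has at least one row and that range endpoints count as inside; the names `block_lo` and `block_hi` are illustrative.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

lo <- pmin(A[, 1], A[, 2])   # smaller end of each range
hi <- pmax(A[, 1], A[, 2])   # larger end of each range

# Merge overlapping ranges into disjoint blocks
ord <- order(lo)
lo <- lo[ord]
hi <- cummax(hi[ord])                          # furthest end seen so far
starts <- c(TRUE, lo[-1] > hi[-length(hi)])    # TRUE where a new block begins
block_lo <- lo[starts]
block_hi <- hi[c(which(starts)[-1] - 1L, length(hi))]

# findInterval() returns, for each position, the last block that starts at
# or before it (0 if none); it is a hit if it also lies at or before that
# block's end
b <- as.vector(B)
k <- findInterval(b, block_lo)
hit <- k > 0
hit[hit] <- b[hit] <= block_hi[k[hit]]
SNP <- b[hit]
SNP
```

Each position appears at most once in the result, so no unique() step is needed unless B itself contains duplicates.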

Date: Thu, 3 Jan 2013 17:04:59 -0500
Subject: Re: [R] Index out SNP position
From: sarah.goslee at gmail.com
To: zhyjiang2006 at hotmail.com
CC: r-help at r-project.org

--
Sarah Goslee
http://www.functionaldiversity.org
Something like this?
indx <- sapply(1:nrow(B), function(i) any(B[i]>A[,1] & B[i]<A[,2]))
SNP <- B[indx]
SNP
[1] 36003918 35838399 35838589

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of JiangZhengyu
Sent: Thursday, January 03, 2013 3:55 PM
To: r-help at r-project.org
Subject: [R] Index out SNP position

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
I missed the fact that the columns are not consistently smaller/larger:
A <- t(apply(A, 1, function(x) c(min(x), max(x))))
A
         [,1]     [,2]
[1,] 35838396 36151202
[2,] 35838584 35838674
[3,] 35838674 36003908
[4,] 36003908 36004090
[5,] 36003992 36150188
indx <- sapply(1:nrow(B), function(i) any(B[i]>A[,1] & B[i]<A[,2]))
SNP <- B[indx]
SNP
[1] 36003918 35838399 35838589
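
Note that this reply compares with > and <, while the lapply() reply earlier in the thread uses >= and <=. The two agree on the sample B, but they would differ for a position sitting exactly on a range boundary. The sketch below illustrates that with a made-up position (`edge` is not in B) and a hypothetical helper `in_any_range`.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
lo <- pmin(A[, 1], A[, 2])
hi <- pmax(A[, 1], A[, 2])

# Hypothetical helper: does position b fall inside any [lo, hi] range?
in_any_range <- function(b, lo, hi, inclusive = TRUE) {
  if (inclusive) any(b >= lo & b <= hi) else any(b > lo & b < hi)
}

edge <- 35838396   # made-up position equal to the smallest boundary in A
with_ends    <- in_any_range(edge, lo, hi, inclusive = TRUE)   # TRUE
without_ends <- in_any_range(edge, lo, hi, inclusive = FALSE)  # FALSE
c(with_ends, without_ends)
```

Which convention is right depends on whether the first and last base of a gene should count as inside it, which the thread does not settle.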

-------
David
-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu]
Sent: Thursday, January 03, 2013 4:23 PM
To: 'JiangZhengyu'; 'r-help at r-project.org'
Subject: RE: [R] Index out SNP position

sapply(B, function(x)  apply(A, 1, function(two) x %in% two[1]:two[2]))
      [,1]  [,2]  [,3]  [,4]
[1,]  TRUE  TRUE  TRUE FALSE
[2,] FALSE FALSE  TRUE FALSE
[3,] FALSE FALSE FALSE FALSE
[4,]  TRUE FALSE FALSE FALSE
[5,] FALSE FALSE FALSE FALSE

So the first and third B locations each fall within two of the ranges (rows of A), the second falls within one, and the fourth falls within none of them. There is also a Bioconductor package called `IRanges` that will undoubtedly be more efficient. (The `%in%` approach above works because the problem is of necessity dealing with integers.)
David Winsemius
Alameda, CA, USA
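
As a rough illustration of the `IRanges` suggestion, the sketch below builds one range per row of A and a width-1 range per position in B, then asks which positions overlap any range. It assumes the Bioconductor package IRanges is installed (the block does nothing useful otherwise) and that closed intervals are wanted. The object names `genes` and `snps` are illustrative.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)

if (requireNamespace("IRanges", quietly = TRUE)) {
  # IRanges needs start <= end, so order each pair first
  genes <- IRanges::IRanges(start = as.integer(pmin(A[, 1], A[, 2])),
                            end   = as.integer(pmax(A[, 1], A[, 2])))
  snps  <- IRanges::IRanges(start = as.integer(B[, 1]), width = 1L)

  n_ranges <- IRanges::countOverlaps(snps, genes)  # ranges containing each position
  SNP <- B[IRanges::overlapsAny(snps, genes), 1]   # positions inside at least one range
  print(n_ranges)
  print(SNP)
} else {
  message("IRanges is not installed; it is distributed through Bioconductor")
}
```

The per-position counts correspond to the column sums of the logical matrix shown above.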
So given A
cbind(A, apply(A, 1, diff))
         [,1]     [,2]    [,3]
[1,] 35838396 36151202  312806
[2,] 35838674 35838584     -90
[3,] 36003908 35838674 -165234
[4,] 36004090 36003908    -182
[5,] 36150188 36003992 -146196

Row 1 is start/end and rows 2 through 5 are end/start
so you only want to exclude nucleotides that fall
between start/end in row 1, ignoring rows 2 through 5
which are end/start? Given your sample matrix A, which
rows do you want to include/exclude?

David C

From: JiangZhengyu [mailto:zhyjiang2006 at hotmail.com] 
Sent: Thursday, January 03, 2013 6:36 PM
To: dcarlson at tamu.edu; r-help at r-project.org
Cc: sarah.goslee at gmail.com
Subject: RE: [R] Index out SNP position

Hi David,

Thanks for your reply!

But what if I cannot change the positions within each row pair of A? Sorry I
did not make it very clear.

The two columns in A represent start-and-end or end-and-start positions of a
gene. The one column in B is the single nucleotide position. I am trying
to index out all the single nucleotides that fall between the start and end
region of a gene.

Jiang
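
One way to respect that constraint, sketched here under the assumption that only the comparison needs ordered bounds, is to compute the lower and upper end of each gene with pmin() and pmax() and leave A itself untouched. The names `forward` and `genes_hit` are illustrative, and endpoints are treated as inside the gene.

```r
A <- matrix(c(35838396, 35838674, 36003908, 36004090, 36150188,
              36151202, 35838584, 35838674, 36003908, 36003992), ncol = 2)
B <- matrix(c(36003918, 35838399, 35838589, 36262559), ncol = 1)
A_before <- A                  # copy, used only to confirm A is not modified

lo <- pmin(A[, 1], A[, 2])     # lower bound of each gene, whatever the order
hi <- pmax(A[, 1], A[, 2])     # upper bound of each gene
forward <- A[, 1] <= A[, 2]    # TRUE for start/end rows, FALSE for end/start

# For each position in B, the rows of A (genes) whose span contains it
genes_hit <- lapply(B[, 1], function(b) which(b >= lo & b <= hi))
SNP <- B[lengths(genes_hit) > 0, 1]

SNP
identical(A, A_before)         # A keeps its original column order
```

Because `forward` is kept alongside, the orientation of each gene is still available after the match.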
From: dcarlson at tamu.edu
To: dcarlson at tamu.edu; zhyjiang2006 at hotmail.com; r-help at r-project.org
CC: sarah.goslee at gmail.com
Subject: RE: [R] Index out SNP position
Date: Thu, 3 Jan 2013 16:35:30 -0600
