Identifying sequences
6 messages · David Winsemius, ONKELINX, Thierry, Jonathan Daily +2 more
On Jun 1, 2011, at 6:27 AM, christiaan pauw wrote:
Hello everybody,

Consider the following vector:

a=1:10
b=20:30
c=40:50
x=c(a,b,c)

I need a function that can tell me that there are three sets of continuous sequences, and that the first is 1:10, the second 20:30 and the third 40:50. In other words: a, b and c.
You probably want something like

> which(diff(x) >1)
[1] 10 21

Or perhaps what the rle function provides.

?diff
?rle
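One way the diff and rle suggestions could be combined is sketched below. It assumes x is sorted and that a run means consecutive values exactly 1 apart; the names run_id, runs and pieces are only illustrative and do not come from the thread.

```r
x <- c(1:10, 20:30, 40:50)

# Label each element with the run it belongs to: the counter goes up
# every time the step from the previous value is not exactly 1.
run_id <- cumsum(c(TRUE, diff(x) != 1))

# rle() on the labels gives the length of each run
runs <- rle(run_id)
runs$lengths            # 10 11 11

# split() recovers the runs themselves; range() gives first and last value
pieces <- split(x, run_id)
sapply(pieces, range)
```

The cumsum() labels are often the handiest intermediate result, since they can be passed straight to split(), tapply() or aggregate().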
David Winsemius, MD West Hartford, CT
Something like this?

a=1:10
b=20:30
c=40:50
x=c(a,b,c)
borders <- which(diff(x) != 1)
seqs <- data.frame(start = c(1, borders + 1), end = c(borders, length(x)))

Best regards,
Thierry
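Note that start and end in this data frame are positions in x, not the values themselves. A small sketch of looking the values up, where the column names from and to are just illustrative:

```r
x <- c(1:10, 20:30, 40:50)
borders <- which(diff(x) != 1)
seqs <- data.frame(start = c(1, borders + 1), end = c(borders, length(x)))

# Translate positions into the values found at those positions
seqs$from <- x[seqs$start]
seqs$to <- x[seqs$end]
seqs
```

For this example the positions are 1-10, 11-21 and 22-32, and the corresponding values are 1-10, 20-30 and 40-50.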
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
I am assuming that you are looking for continuity along integers, so this will not work if you expect non-integer values. You can get the indices where the breaks occur in your example using

which(diff(x) > 1)
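If non-integer values are expected, one possible workaround is to compare the differences against a known spacing with a small tolerance. This is only a sketch: the example data, the step of 0.5 and the tolerance are all assumptions made for illustration.

```r
# Hypothetical data: values 0.5 apart, with two gaps
x <- c(seq(0, 2, by = 0.5), seq(5, 6, by = 0.5), seq(10, 11, by = 0.5))

step <- 0.5    # assumed spacing inside a run
tol <- 1e-8    # assumed tolerance for floating-point error

# A break is any difference that is not (approximately) equal to step
breaks <- which(abs(diff(x) - step) > tol)
breaks         # 5 8
```

Comparing with a tolerance rather than with != avoids false breaks caused by floating-point rounding.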
===============================================
Jon Daily
Technician
===============================================
#!/usr/bin/env outside
# It's great, trust me.
Thanks to David, Thierry and Jonathan for your help.
I have been able to put this function together:
a=1:10
b=20:30
c=40:50
x=c(a,b,c)
seq.matrix <- function(x){
  # positions after which the next value is not the current value + 1
  breaks <- which(diff(x) != 1)
  # first and last value of each run; x[1] rather than a hard-coded 1,
  # so that vectors that do not start at 1 are handled as well
  lower <- x[c(1, breaks + 1)]
  upper <- x[c(breaks, length(x))]
  m <- data.frame(lower = lower, upper = upper,
                  row.names = paste("group", seq_along(lower), sep = ""))
  m$length <- m$upper - m$lower + 1
  m
}
s.m=seq.matrix(x)
s.m
       lower upper length
group1     1    10     10
group2    20    30     11
group3    40    50     11
One can then make a test to see if a certain value (say 9) falls
within one of the groups and use that to find the group name or lower
or upper border
# <= rather than <, so that the border values themselves (e.g. 1 or 10) are matched
s.m.test=function(s.m,i){which(s.m[,1] <= i & i <= s.m[,2])}
s.m.test(s.m,i=9)
[1] 1
e.g.
row.names(s.m)[s.m.test(s.m,i=9)]
[1] "group1"
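For looking up many values at once, a vectorised variant based on findInterval() might be convenient. This is a sketch rather than part of the original function: find.group is a hypothetical helper, and the table of bounds is rebuilt by hand so that the example runs on its own.

```r
# Bounds as in the table above (rebuilt by hand here)
s.m <- data.frame(lower = c(1, 20, 40), upper = c(10, 30, 50),
                  row.names = c("group1", "group2", "group3"))

# Hypothetical helper: group name for each value, NA if it falls in a gap.
# Assumes the rows of s.m are sorted by lower and do not overlap.
find.group <- function(s.m, v) {
  idx <- findInterval(v, s.m$lower)   # last group whose lower bound is <= v
  idx[idx == 0] <- NA                 # values below the first group
  inside <- !is.na(idx) & v <= s.m$upper[idx]
  ifelse(inside, row.names(s.m)[idx], NA_character_)
}

find.group(s.m, c(9, 10, 15, 45, 60))
# "group1" "group1" NA "group3" NA
```

Border values such as 10 are treated as belonging to their group, and values in a gap or outside all groups give NA.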
Cheers
Christiaan
On Wed, 1 Jun 2011 17:12:29 +0200, cjpauw at gmail.com wrote:
Thanks to David, Thierry and Jonathan for your help. I have been able to put this function together [...]
As I understand it, you are looking for large derivatives, i.e. approximate discontinuities against an otherwise smooth signal. This seems like a natural application for wavelets. Try the Haar wavelet, using the wavelets package:
library(wavelets)
f=wt.filter(c(-1,1),modwt=TRUE)
z<-modwt(X=as.numeric(x),filter=f,n.levels=1)
z@W
$W1
[,1]
[1,] 49
[2,] -1
[3,] -1
[4,] -1
[5,] -1
[6,] -1
[7,] -1
[8,] -1
[9,] -1
[10,] -1
[11,] -10
[12,] -1
[13,] -1
[14,] -1
[15,] -1
[16,] -1
[17,] -1
[18,] -1
[19,] -1
[20,] -1
[21,] -1
[22,] -10
[23,] -1
[24,] -1
[25,] -1
[26,] -1
[27,] -1
[28,] -1
[29,] -1
[30,] -1
[31,] -1
[32,] -1
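Judging only from the coefficients printed above, the result looks like a first difference with wrap-around at the start (previous value minus current value). The base R sketch below reproduces those numbers without the wavelets package. It is an observation about this particular output, not a statement about how modwt scales or orients its filters in general.

```r
x <- c(1:10, 20:30, 40:50)
n <- length(x)

# Previous value minus current value; the first element wraps around
# and uses the last value of x as its "previous" value.
w <- c(x[n], x[-n]) - x

head(w, 3)       # 49 -1 -1
which(w < -1)    # 11 22: positions where a new run starts
```

The large negative entries at positions 11 and 22 mark the first element of the second and third runs, which is the same information as which(diff(x) > 1) + 1.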