Hi,
On 09/01/2016 12:00 AM, Dario Strbenac wrote:
Good day,
According to the documentation, I wouldn't think that substr or
strsplit would work on a BStringSet, but substr does.
  A BStringSet instance of length 5
    width seq
[1]    61 D00626:168:C9CWMANXX:1:1105:1816:1998 1:N:0:TCCGGAGA+ATAGAGGC
[2]    61 D00626:168:C9CWMANXX:1:1105:2113:1989 1:N:0:TCCGGAGA+ATAGAGGC
[3]    61 D00626:168:C9CWMANXX:1:1105:2703:1986 1:N:0:TCCGGAGA+ATAGAGGC
[4]    61 D00626:168:C9CWMANXX:1:1105:3255:1979 1:N:0:TCCGGAGA+ATAGAGGC
[5]    61 D00626:168:C9CWMANXX:1:1105:4525:1995 1:N:0:TCCGGAGA+ATAGAGGC
[1] "D00626:168:C9CWMANXX:1:1105:1816:1998"
[2] "D00626:168:C9CWMANXX:1:1105:2113:1989"
[3] "D00626:168:C9CWMANXX:1:1105:2703:1986"
[4] "D00626:168:C9CWMANXX:1:1105:3255:1979"
[5] "D00626:168:C9CWMANXX:1:1105:4525:1995"
Error in strsplit(IDs, " ") : non-character argument
I think that both of these functions shouldn't work or both should
work, to be consistent.
Why? Because they both have "str" in their name?
It sounds like you expect every string manipulation function defined
in base R to work on a BStringSet object. Well, that's not the case,
and I don't think that's ever going to happen. Some of them work and
some of them don't. We can add more if needed (e.g. strsplit), but
there are things like the grep family that BStringSet objects will
probably never support.
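
For base R functions that have no BStringSet method, one possible workaround is to coerce to an ordinary character vector with as.character() first. A minimal sketch, assuming Biostrings is installed; the names `ids`, `ids_chr`, `fields`, `read_names` and `has_index` are made up for illustration, and the two IDs are copied from the output above:

```r
library(Biostrings)

# Illustrative data: two read IDs in the same layout as the ones above.
ids <- BStringSet(c(
    "D00626:168:C9CWMANXX:1:1105:1816:1998 1:N:0:TCCGGAGA+ATAGAGGC",
    "D00626:168:C9CWMANXX:1:1105:2113:1989 1:N:0:TCCGGAGA+ATAGAGGC"
))

# as.character() returns a plain character vector, so base R string
# functions can be applied to the result.
ids_chr <- as.character(ids)

# Split each ID at the space; fixed = TRUE treats " " as a literal string.
fields <- strsplit(ids_chr, " ", fixed = TRUE)

# Keep the part before the space, one element per input ID.
read_names <- vapply(fields, `[`, character(1), 1L)

# The grep family also works on the coerced vector.
has_index <- grepl("TCCGGAGA", ids_chr, fixed = TRUE)
```

The trade-off is that the result is a character vector (or a list of them) rather than an XStringSet, and the data is copied during coercion. That is usually acceptable for short read IDs but may not be for long sequences.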
If you need to strsplit() an XStringSet object, you can use this:
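  strsplitXStringSet <- function(x, split)
  {
      ## Find every occurrence of 'split' in each sequence of 'x'.
      m <- vmatchPattern(split, x)
      ## The pieces to keep are the gaps between the matches, bounded
      ## by position 1 and the width of each sequence.
      at <- gaps(IRangesList(start=start(m),
                             end=end(m)), start=1L, end=width(x))
      ## Extract those ranges from each sequence.
      extractAt(x, at)
  }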
It's going to behave like strsplit(x, split, fixed=TRUE) except when
there is a match at the beginning or end of one of the sequences (in
which case strsplit() has questionable behavior). Also, unlike
strsplit(), strsplitXStringSet() doesn't support an empty split
pattern.
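
To illustrate the strsplit() edge case mentioned above, here is a small base R sketch (no Biostrings needed). The variable names `leading` and `trailing` are only for illustration:

```r
# Match at the beginning of the string: an empty string is returned
# as the first piece.
leading <- strsplit(" a b", " ", fixed = TRUE)[[1]]

# Match at the end of the string: no empty piece is returned, so the
# result is the same as if the trailing match had been removed.
trailing <- strsplit("a b ", " ", fixed = TRUE)[[1]]

print(leading)   # "" "a" "b"
print(trailing)  # "a" "b"
```

This asymmetry between leading and trailing matches is described in the Note section of ?strsplit.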