
strsplit question

unlist(strsplit(Block[1:5], "-.+$"))

If you are going to want the other pieces later, the most efficient
way depends on the assumptions you can make about your data.  If there
are always two elements from the split:

matrix(unlist(strsplit(Block[1:5], "-")), ncol = 2, byrow = TRUE)
## or
do.call("rbind", strsplit(Block[1:5], "-"))
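To make the difference between these calls concrete, here is a small sketch. The contents of Block are not shown in the thread, so the vector below is a hypothetical stand-in that assumes exactly one "-" per string.

```r
# Hypothetical data: the original post does not show what Block contains.
Block <- c("A1-x", "B2-y", "C3-z", "D4-w", "E5-v")

# Keep only the text before the first "-". The pattern "-.+$" matches the
# dash and everything after it, so the whole tail is treated as the separator.
first <- unlist(strsplit(Block[1:5], "-.+$"))
print(first)

# Keep both pieces, one row per input string.
# Both forms assume every string splits into exactly two elements.
pieces  <- matrix(unlist(strsplit(Block[1:5], "-")), ncol = 2, byrow = TRUE)
pieces2 <- do.call("rbind", strsplit(Block[1:5], "-"))
print(pieces)
```

With this layout the first column holds the part before the dash and the second column the part after it, so the other pieces stay available without splitting again.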

The first option, dropping everything after the "-", is marginally more
efficient, followed by the matrix technique.  A series of clunkier
options (in my view) would be:

unlist(strsplit(Block[1:5], "-"))[seq(from = 1, to = 2 * length(Block[1:5]), by = 2)]

or, very flexible in terms of extracting the first element (regardless
of how many pieces there are), but computationally less efficient:

sapply(strsplit(Block[1:5], "-"), `[[`, 1)

but this is only slightly less so: testing on a simple character
vector of length 10^8, it still completed in less than 1 second on a
1.66 GHz dual core with R devel r57214 on Windows x64.
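If you want to check the relative cost on your own machine, something along these lines should do. It is only a rough sketch: the input vectors are made up, the size is far smaller than the one mentioned above, and the timings will vary by machine and R version. vapply() is not from the original reply; it is included as a stricter alternative to sapply().

```r
# Hypothetical ragged data: a varying number of "-" separated pieces.
ragged <- c("A1-x", "B2-y-extra", "C3", "D4-w-1-2")
parts  <- strsplit(ragged, "-")
first_s <- sapply(parts, `[[`, 1)
# vapply() does the same but checks that each result is a single string.
first_v <- vapply(parts, `[[`, character(1), 1)

# Hypothetical benchmark input with exactly one "-" per string.
Block <- rep(c("A1-x", "B2-y", "C3-z"), times = 1e4)

t_regex  <- system.time(r1 <- unlist(strsplit(Block, "-.+$")))
t_matrix <- system.time(
  r2 <- matrix(unlist(strsplit(Block, "-")), ncol = 2, byrow = TRUE)[, 1]
)
t_sapply <- system.time(r3 <- sapply(strsplit(Block, "-"), `[[`, 1))

# Elapsed seconds for each approach.
print(c(regex  = t_regex[["elapsed"]],
        matrix = t_matrix[["elapsed"]],
        sapply = t_sapply[["elapsed"]]))
```

All three approaches should return the same first elements on two-piece data; only the sapply()/vapply() form carries over unchanged to the ragged case.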

Cheers,

Josh
On Tue, Oct 11, 2011 at 10:20 PM, Erin Hodgess <erinm.hodgess at gmail.com> wrote: