Split String in regex while Keeping Delimiter

On Wed, 12 Apr 2023 08:29:50 +0000
Emily Bakker <emilybakker at outlook.com> wrote:

            
It sounds like you need a positive look-behind, not a look-ahead: split
on spaces only if they _follow_ one to three '+' or '-' characters.
Unfortunately, repetition quantifiers like {n,m} or + are not directly
supported inside look-behind expressions in PCRE (nor in Perl itself).
As a workaround, you can use \K: anything matched to the left of \K
must be present but is dropped from the reported match, so it acts
like a zero-width positive look-behind:

x <- c(
 'leucocyten + gramnegatieve staven +++ grampositieve staven ++',
 'leucocyten - grampositieve coccen +'
)
strsplit(x, '[+-]{1,3}+\\K ', perl = TRUE)
# [[1]]
# [1] "leucocyten +"             "gramnegatieve staven +++"
# [3] "grampositieve staven ++"
# 
# [[2]]
# [1] "leucocyten -"           "grampositieve coccen +"
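
For comparison, here is a minimal sketch of a fixed-length look-behind
that I would expect to give the same result on this data. It relies on
the assumption that a space following one to three signs is always
directly preceded by a single '+' or '-', so no quantifier is needed
inside the look-behind. The names res_K and res_lb are only for
illustration:

```r
x <- c(
  'leucocyten + gramnegatieve staven +++ grampositieve staven ++',
  'leucocyten - grampositieve coccen +'
)

# \K variant from above: the signs must precede the space, but only
# the space itself is consumed as the separator.
res_K <- strsplit(x, '[+-]{1,3}+\\K ', perl = TRUE)

# Fixed-length look-behind: the space must directly follow one
# '+' or '-' character.
res_lb <- strsplit(x, '(?<=[+-]) ', perl = TRUE)

# Compare the two results; they should agree for these inputs.
identical(res_K, res_lb)
```

If your real data can contain a sign followed by a space in places
where you do not want to split, neither pattern will tell those cases
apart, so check a sample of the output first.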