
(PR#8777) strsplit does [not] return correct value when splitting ""

Now using R 2.3.0.

I have a string that can be "".  I want to find the maximum screen width
of all the lines in the string, so I run the commands

  > x <- c("hello", "bob is\ngreat", "foo", "", "bar")
  > substrings <- strsplit(x, "\n")
  > sapply(substrings, FUN=function(x) max(nchar(x, type="width")))
which returns
[1]    5    6    3 -Inf    3

This happens because of how strsplit handles an empty string. For a string that is not ""
  > strsplit("Hello\nBob", "\n")

it returns
[[1]]
[1] "Hello" "Bob"


For a string that is ""
  > strsplit("", "\n")

it returns
[[1]]
character(0)


I would expect
[[1]]
[1] ""

because "" is a character vector of length 1 containing a string of length
0, not a character vector of length 0.
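
Until this is resolved, one possible workaround is to substitute "" for any zero-length result before taking the maximum. This is only a sketch: the helper name split_keep_empty is hypothetical and not part of R.

```r
# Hypothetical helper (not part of R): behaves like strsplit(), except that
# an empty input string yields "" instead of character(0).
split_keep_empty <- function(x, split) {
  pieces <- strsplit(x, split)
  # Find the elements whose split result has length zero
  is_empty <- sapply(pieces, length) == 0
  # Replace each of them with a single empty string
  pieces[is_empty] <- list("")
  pieces
}

x <- c("hello", "bob is\ngreat", "foo", "", "bar")
widths <- sapply(split_keep_empty(x, "\n"),
                 function(s) max(nchar(s, type = "width")))
print(widths)
```

With this substitution the fourth element reports a width of 0 rather than -Inf.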

For any other string, if the split string is not matched in argument x,
strsplit returns the original string x.
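
The contrast can be seen by comparing the lengths of the two results. This is a small illustration; the variable names no_match and empty are mine.

```r
# No "\n" in the input: the original string comes back as a single piece
no_match <- strsplit("foo", "\n")

# Empty input: a zero-length character vector comes back instead
empty <- strsplit("", "\n")

print(length(no_match[[1]]))  # 1
print(length(empty[[1]]))     # 0
```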

The man page states in the value section that strsplit returns:
      A list of length 'length(x)' the 'i'-th element of which contains
      the vector of splits of 'x[i]'.

It mentions no change in behavior when x[i] is "".
Prof Brian Ripley wrote: