
how to separate char and num within a variable

on 02/05/2009 05:20 PM Bill Hyman wrote:
See ?strsplit

Vec <- "chr1:000889594-000889638"
Vec
[1] "chr1:000889594-000889638"

# Use a regular expression, defining the 'split' character
# as either ":" or "-", where the vertical bar means 'or':
strsplit(Vec, ":|-")
[[1]]
[1] "chr1"      "000889594" "000889638"


Note that the split characters are not retained in the result.
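
The same split can also be written with a character class, and the numeric pieces can be converted afterwards. This is only a minimal sketch: the names 'parts', 'chrom', 'start' and 'end' are illustrative assumptions and are not from the original question.

```r
Vec <- "chr1:000889594-000889638"

# "[:-]" is a character class matching either ":" or "-";
# here it splits at the same places as the alternation ":|-"
parts <- strsplit(Vec, "[:-]")[[1]]

# Hypothetical variable names, chosen for illustration only
chrom <- parts[1]               # the character part
start <- as.numeric(parts[2])   # conversion drops the leading zeros
end   <- as.numeric(parts[3])
```

Keep the pieces as character strings if the leading zeros matter, since as.numeric() discards them.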

Let's presume that you have a column in a data frame of the original
data and wish to split it into 3 columns:

DF <- data.frame(Col = rep(Vec, 10))
DF
                        Col
1  chr1:000889594-000889638
2  chr1:000889594-000889638
3  chr1:000889594-000889638
4  chr1:000889594-000889638
5  chr1:000889594-000889638
6  chr1:000889594-000889638
7  chr1:000889594-000889638
8  chr1:000889594-000889638
9  chr1:000889594-000889638
10 chr1:000889594-000889638

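Note that in R versions before 4.0.0, 'Col' will be a factor by default,
and strsplit() expects a character vector. Thus we do the coercion with
as.character() and use do.call() to create a character matrix, via
rbind(), from the result:

do.call(rbind, strsplit(as.character(DF$Col), ":|-"))
      [,1]   [,2]        [,3]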
 [1,] "chr1" "000889594" "000889638"
 [2,] "chr1" "000889594" "000889638"
 [3,] "chr1" "000889594" "000889638"
 [4,] "chr1" "000889594" "000889638"
 [5,] "chr1" "000889594" "000889638"
 [6,] "chr1" "000889594" "000889638"
 [7,] "chr1" "000889594" "000889638"
 [8,] "chr1" "000889594" "000889638"
 [9,] "chr1" "000889594" "000889638"
[10,] "chr1" "000889594" "000889638"
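
To go from the character matrix to three actual columns, one possibility is to build a new data frame from the matrix columns. This is a sketch under stated assumptions: the names 'Mat', 'DF2', 'Chr', 'Start' and 'End' are hypothetical, and it presumes that every row splits into exactly three pieces (otherwise rbind() would recycle the shorter results with a warning).

```r
Vec <- "chr1:000889594-000889638"
DF <- data.frame(Col = rep(Vec, 10))

# Split each row and bind the pieces into a 10 x 3 character matrix
Mat <- do.call(rbind, strsplit(as.character(DF$Col), ":|-"))

# Column names are hypothetical, chosen for illustration only
DF2 <- data.frame(Chr   = Mat[, 1],
                  Start = as.numeric(Mat[, 2]),
                  End   = as.numeric(Mat[, 3]),
                  stringsAsFactors = FALSE)

# Inspect the column types of the result
str(DF2)
```

Setting stringsAsFactors = FALSE explicitly keeps 'Chr' as a character column regardless of the R version in use.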


See ?regex, ?do.call and ?rbind for more information.

HTH,

Marc Schwartz