extract fixed width fields from a string
On Sun, Jan 22, 2012 at 03:34:12PM -0500, Sam Steingold wrote:
* Petr Savicky <fnivpxl at pf.pnf.pm> [2012-01-20 21:59:51 +0100]:
Try the following.
x <- tolower("ThusThisLongWordWithLettersAndDigitsFrom0to9isAnIntegerBase36")
x <- strsplit(x, "")[[1]]
digits <- 0:35
names(digits) <- c(0:9, letters)
y <- digits[x]
# solution using gmp package
library(gmp)
b <- as.bigz(36)
sum(y * b^(length(y):1 - 1))
[1] "70455190722800243410669999246294410591724807773749367607882253153084991978813070206061584038994
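As a sanity check of the weighted-digit sum on an input small enough for ordinary arithmetic, the sketch below compares it with base R's strtoi(), which accepts bases from 2 to 36. The input "Zz9" is an arbitrary illustration and does not come from the thread.

```r
# Lookup table for base 36: "0".."9" -> 0..9, "a".."z" -> 10..35
digits <- 0:35
names(digits) <- c(0:9, letters)

# Hypothetical short input, chosen so the value fits easily in a double
x <- strsplit(tolower("Zz9"), "")[[1]]
y <- digits[x]

# Positional weights 36^2, 36^1, 36^0, most significant digit first
manual <- sum(y * 36^(length(y):1 - 1))

# The same conversion done by base R
builtin <- strtoi("zz9", base = 36L)

manual == builtin
```

For long strings such as the one above, strtoi() is not an option because the value does not fit in an R integer, which is why the gmp version is needed there.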
Thanks, here is what I wrote:
## convert a string to an integer in the given base
digits <- 0:63
names(digits) <- c(0:9, letters, toupper(letters), "-", "_")  # 64 names, one per digit
string2int <- function(str, base = 10) {
  d <- digits[strsplit(str, "")[[1]]]
  sum(d * base^(length(d):1 - 1))
}
It appears to work.
However, I want to be able to apply it to all elements of a vector.
I can use lapply:
unlist(lapply(c("100","12","213"),string2int))
[1] 100 12 213
but not directly:
string2int(c("100","12","213"))
[1] 100
Hi.
Here, you get the result only for the first string due
to "[[1]]" applied to strsplit(str,"").
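A small illustration of this point, using the same three strings: strsplit() returns a list with one element per input string, so "[[1]]" keeps only the characters of the first one.

```r
parts <- strsplit(c("100", "12", "213"), "")

length(parts)   # one list element per input string
parts[[1]]      # characters of the first string only
parts[[2]]      # the second string is there, but string2int() never looks at it
```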
As suggested by Michael, a matrix can be used if
the input is a character vector whose components
all have the same number of characters (nchar).
strings2int <- function(str, base = 10) {
  m <- length(str)
  n <- unique(nchar(str))
  stopifnot(length(n) == 1) # check that all nchar() are equal
  ch <- strsplit(str, "")
  ch <- unlist(ch)
  # one row per string, one column per digit position
  d <- matrix(digits[ch], nrow = m, ncol = n, byrow = TRUE)
  c(d %*% base^(n:1 - 1))
}
strings2int(c("100","012","213","453"))
[1] 100 12 213 453
strings2int(c("100","12","213","453"))
Error: length(n) == 1 is not TRUE
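If the strings may differ in length, one possible alternative is to drop the matrix and convert each string separately with vapply(). This is only a sketch: the name strings2int_any is made up for illustration, and the digit table is restricted to 0-9 and a-z.

```r
# Lookup table: "0".."9" -> 0..9, "a".."z" -> 10..35
digits <- 0:35
names(digits) <- c(0:9, letters)

# Hypothetical helper, not part of the code posted in this thread
strings2int_any <- function(str, base = 10) {
  vapply(strsplit(str, ""), function(ch) {
    d <- digits[ch]
    # weights base^(k-1), ..., base^0 for a string of k digits
    sum(d * base^(rev(seq_along(d)) - 1))
  }, numeric(1))
}

strings2int_any(c("100", "12", "213", "453"))
strings2int_any(c("ff", "7", "100"), base = 16)
```

The matrix version above does a single matrix product, so it is likely to be faster on long vectors of equal-width strings; the vapply() version trades that for accepting any mix of lengths.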
Petr.