
Best way to test for numeric digits?

At 19:35 on 18/10/2023, Leonard Mada wrote:
Hello,

You are right, sorry for the blunder :(.
In the code below I have replaced stringr::str_replace_all with the
stringi function stri_replace_all_regex, and the improvement is
significant.


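split_chem_elements <- function(x, rm.digits = TRUE) {
   # zero-width boundaries between element symbols and digit runs
   regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
   if(rm.digits) {
     # argument order is (str, pattern, replacement):
     # mark each boundary with "#", then split on "#" or on any digit
     stringi::stri_replace_all_regex(x, regex, "#") |>
       strsplit("#|[0-9]") |>
       lapply(\(x) x[nchar(x) > 0L])   # drop the empty strings left by the split
   } else {
     # keep the digits: split directly at the boundaries
     strsplit(x, regex, perl = TRUE)
   }
}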

# system.time(
#   split_chem_elements(mol10000)
# )
#  user  system elapsed
#  0.06    0.00    0.09
# system.time(
#   split.symbol.character(mol10000)
# )
#  user  system elapsed
#  0.25    0.00    0.28
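
To make the two steps of the rm.digits = TRUE branch easier to follow, here is a minimal sketch that runs them one at a time. It is only an illustration, not the timed code: it uses base R's gsub(..., perl = TRUE) in place of stringi so that it runs without extra packages, and the three formulas are made-up samples, not the mol10000 vector from the timings.

```r
# Same regex as in split_chem_elements().
regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"

# Hypothetical sample input, chosen only for illustration.
mol <- c("C6H12O6", "NaCl", "H2SO4")

# Step 1: insert a "#" marker at every zero-width boundary the regex matches.
# Base R gsub() with perl = TRUE stands in for stri_replace_all_regex() here.
marked <- gsub(regex, "#", mol, perl = TRUE)
print(marked)
# "C#6#H#12#O#6" "Na#Cl" "H#2#S#O#4"

# Step 2: split on the marker or on any digit, then drop the empty strings.
elements <- lapply(strsplit(marked, "#|[0-9]"), function(p) p[nzchar(p)])
print(elements)
# C H O / Na Cl / H S O
```

Printing the intermediate `marked` vector is a quick way to check where the regex places the boundaries before the digits are removed.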



Hope this helps,

Rui Barradas