Spark DataFrame: replace NULL cell by NA
"... if("factor" %in% class(x)) x <- as.character(x) ## since ifelse wont work with factors"
Nonsense!
x <- factor(c("a","", "b"))
x
[1] a   b
Levels:  a b
levels(x)
[1] "" "a" "b"
x <- factor(ifelse(x==(""),NA,x))
x
[1] 2    <NA> 3
Levels: 2 3

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
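The session above shows that ifelse() does run on a factor, but it returns the underlying integer codes, so the original labels are lost. As a minimal sketch in base R only (the names y and z are introduced here purely for illustration), two ways to turn the empty level into NA while keeping the labels:

```r
x <- factor(c("a", "", "b"))

# Option 1: relabel the empty level as NA; the level is dropped
# and the elements that carried it become NA.
y <- x
levels(y)[levels(y) == ""] <- NA

# Option 2: go through character, blank out the empty strings,
# then rebuild the factor.
z <- as.character(x)
z[z == ""] <- NA
z <- factor(z)

print(y)
print(levels(y))
```

Both variants should leave a factor whose only levels are "a" and "b", with NA in the second position.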
On Sun, Dec 9, 2018 at 1:07 PM Karim Mezhoud <kmezhoud at gmail.com> wrote:
Dear All,
## function to replace empty cell by NA
empty_as_na <- function(x){
  if("factor" %in% class(x)) x <- as.character(x) ## since ifelse wont work with factors
  ifelse(as.character(x)!="", x, NA)
}
## connect to spark local
library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")
# load an example data frame that has empty cells (needs cgdsr package)
clinicalData <- cgdsr::getClinicalData(cgds, "gbm_tcga_pub_all")
## copy to spark
clinicalData_tbl <- dplyr::copy_to(sc, clinicalData, overwrite = TRUE)
# This works
clinicalData %>% mutate_all(funs(empty_as_na))
# This does not work
clinicalData_tbl %>% mutate_all(funs(empty_as_na))
Thanks,
Karim
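
A likely reason for the difference: sparklyr translates dplyr expressions into Spark SQL, and a user-defined R function such as empty_as_na has no SQL translation, whereas a plain ifelse() call does. The following is only a sketch under that assumption. It needs a local Spark installation, and the data frame df and the table name "df_demo" are made up for illustration; they stand in for clinicalData.

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Hypothetical stand-in for clinicalData: character columns with empty cells.
df <- data.frame(id    = c("a", "", "c"),
                 grade = c("", "II", "IV"),
                 stringsAsFactors = FALSE)
df_tbl <- copy_to(sc, df, "df_demo", overwrite = TRUE)

# Write the condition inline so that it can be translated to SQL
# instead of calling an R function that Spark does not know about.
df_na <- df_tbl %>% mutate_all(~ ifelse(. == "", NA, .))

# Bring the result back to R to inspect it.
collect(df_na)

spark_disconnect(sc)
```

If this works as expected, show_query(df_na) should display a CASE WHEN expression per column, which is one way to check what was actually sent to Spark.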