Print All Warnings that Occur in All Parallel Nodes

How could I check that a CSV can be opened before applying the function,
and create an empty data.frame for those CSVs?
Use tryCatch().  E.g., instead of
    result <- read_csv2(file)

use

    result <- tryCatch(read_csv2(file),
                       error = function(e) makeEmptyDataFrame(conditionMessage(e)))

where makeEmptyDataFrame(msg=NULL) is a function (which you write) that
returns a data.frame with no rows but with the proper column names and
types.  I show it with a msg (message) argument, as you might want to
attach the error message to the result as an attribute so you can see what
went wrong.
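
As a minimal sketch of the idea above, the helper could look like the following. The column names are taken from the code in the original post; the attribute name "error_message" and the wrapper safeRead() are hypothetical, and base R's read.csv2() stands in for readr's read_csv2() only so that the example runs without extra packages.

```r
# Column names copied from the original post; every column is read as character.
dispoin_cols <- c("SCADA", "TAG", "ID_del_AEG", "Descripcion", "Time_ON",
                  "Time_OFF", "Delta_Time", "Comentario", "Es_Alarma",
                  "Es_Ultima", "Comentarios")

# Zero-row data.frame with the expected columns. The error message, if any,
# is stored in a (hypothetically named) "error_message" attribute.
makeEmptyDataFrame <- function(msg = NULL) {
  cols <- setNames(
    replicate(length(dispoin_cols), character(0), simplify = FALSE),
    dispoin_cols
  )
  df <- as.data.frame(cols, stringsAsFactors = FALSE)
  attr(df, "error_message") <- msg
  df
}

# Hypothetical wrapper: try to read the file, fall back to the empty frame.
# `reader` defaults to base read.csv2 (an assumption made for this sketch).
safeRead <- function(file, reader = utils::read.csv2) {
  tryCatch(reader(file),
           error = function(e) makeEmptyDataFrame(conditionMessage(e)))
}

# A path that does not exist triggers the fallback.
result <- safeRead(file.path(tempdir(), "no_such_file.csv"))
nrow(result)                    # 0
attr(result, "error_message")   # the message from the failed read
```

Because the fallback has the same column names and types as a successful read, it can be combined with the other data frames without special handling.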

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Sep 14, 2017 at 12:48 AM, TELLERIA RUIZ DE AGUIRRE, JUAN <

Dear R Users,

I have developed the following code to import a series of zipped CSV
files in parallel.

My problems are that:

A) Some ZIP files (which contain the CSVs) are corrupted and cannot be
opened.
B) After executing parRapply I can only inspect the last.warning variable
to find out which CSVs failed on each node, so I cannot see all warnings,
only one at a time.

So:

* To show a list of all warnings from all nodes, I was thinking of
wrapping the call in warnings() as follows:

    warnings(DISPOIN_CSV_List <- parRapply(c1, DISPOIN_DIR_REL, parRaplly_Function))

    Would this work?

* Also, how could I check that a CSV can be opened before applying the
function, and create an empty data.frame for those CSVs?
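
On the first question: warnings() reports what was recorded in the session where it is called, so wrapping the parRapply call in it would probably not show what happened on the workers. One possible approach, sketched below, is to collect the warnings inside the worker function and return them together with the result. The wrapper name captureConditions() and the toy function noisy() are hypothetical. The demonstration uses plain lapply() so it runs anywhere; the wrapped function should work the same way when passed to parLapply(), which may also be a better fit than parRapply() here, since parRapply() is documented as always returning a vector.

```r
# Hypothetical wrapper: returns a function that calls FUN(x) and records every
# warning (and any error) instead of letting them vanish on the worker.
captureConditions <- function(FUN) {
  force(FUN)
  function(x) {
    warns <- character(0)
    err <- NULL
    value <- withCallingHandlers(
      tryCatch(FUN(x),
               error = function(e) {
                 err <<- conditionMessage(e)  # remember why this item failed
                 NULL
               }),
      warning = function(w) {
        warns <<- c(warns, conditionMessage(w))  # keep every warning
        invokeRestart("muffleWarning")           # then carry on with FUN
      }
    )
    list(value = value, warnings = warns, error = err)
  }
}

# Toy stand-in for the CSV reader: warns on even input, fails on 3.
noisy <- function(x) {
  if (x %% 2 == 0) warning("even input: ", x)
  if (x == 3) stop("cannot process ", x)
  x * 10
}

# Sequential demonstration. With a cluster `cl` this would be, for example:
#   res <- parLapply(cl, 1:4, captureConditions(noisy))
res <- lapply(1:4, captureConditions(noisy))

# All warnings, item by item, and the indices of the items that failed.
all_warnings <- lapply(res, `[[`, "warnings")
failed <- which(!vapply(res, function(r) is.null(r$error), logical(1)))
```

After the call, all_warnings holds one character vector per input, and failed lists the inputs whose read raised an error, so nothing depends on last.warning.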

Thank you,
Juan

CODE
################################################################################
## DISPOIN Data Import Into MariaDB
################################################################################

## -----------------------------------------------------------------------------
## Packages
## -----------------------------------------------------------------------------

# update.packages("RODBC")
# update.packages("tidyverse")

## -----------------------------------------------------------------------------
## Libraries
## -----------------------------------------------------------------------------

suppressMessages(require(RODBC))
suppressMessages(require(tidyverse))
suppressMessages(require(parallel))

## -----------------------------------------------------------------------------
## CMD: Command for DISPOIN's Directory Acquisition
## -----------------------------------------------------------------------------

# shell(cmd = 'pushd "\\srvdiscsv\data" && dir *AL*.zip /b /s > D:\DISPOIN_Data_Directories.csv && popd')

## -----------------------------------------------------------------------------
## RODBC
## -----------------------------------------------------------------------------

## A) MariaDB Connection String

con <- odbcConnect("MariaDB_Tornado24")

invisible(sqlQuery(con, "USE dispoin;"))

# B) Import R Data Directories from MariaDB

DISPOIN_DIR_REL <- as_tibble(sqlFetch(con, "dispoin.t_DISPOIN_DIR_REL"))

odbcClose(con)

# C) Import Zipped CSV data into List of Dataframes, which later on are
#    compiled as a single dataframe by means of rbind

  # C.1) parRapply Function Initialization:

  parRaplly_Function <- function (DISPOIN_CSV_Row)
  {
    return(read_csv2(
      file = DISPOIN_CSV_Row,
      col_names = c(
        "SCADA",
        "TAG",
        "ID_del_AEG",
        "Descripcion",
        "Time_ON",
        "Time_OFF",
        "Delta_Time",
        "Comentario",
        "Es_Alarma",
        "Es_Ultima",
        "Comentarios"),
      col_types = cols(
        "SCADA" = "c",
        "TAG" = "c",
        "ID_del_AEG" = "c",
        "Descripcion" = "c",
        "Time_ON" = "c",
        "Time_OFF" = "c",
        "Delta_Time" = "c",
        "Comentario" = "c",
        "Es_Alarma" = "c",
        "Es_Ultima" = "c",
        "Comentarios" = "c"),
      locale = default_locale(),
      na = c("", " "),
      quoted_na = TRUE,
      quote = "\"",
      comment = "",
      trim_ws = TRUE,
      skip = 0,
      n_max = Inf,
      guess_max = 1000,
      progress = FALSE))
  }

  # C.2) parallel Package: Environment Settings

  no_cores <- detectCores()

  c1 <- makeCluster(no_cores)

  invisible(clusterEvalQ(c1, library(readr)))

  setDefaultCluster(c1)

  # C.3) parRapply Function Application:

  DISPOIN_CSV_List <- parRapply(c1, DISPOIN_DIR_REL, parRaplly_Function)

  suppressWarnings(stopCluster(c1))

# D) List's Tibbles Compilation into a single Tibble:

  DISPOIN_CSV <- do.call(rbind, DISPOIN_CSV_List)

# E) Write Compiled Table into CSV:

  write_csv(
    DISPOIN_CSV,
    path = file.path("D:/MySQL/R", "DISPOIN_CSV.csv"),
    na = "\\N",
    append = FALSE,
    col_names = TRUE)

# F) Data Cleaning: Environment Variable Removal

  rm(list=ls())

