
Reply: Re: Creating a data frame from scratch (SOLVED)

Hi Dan,
Hi All,

many thanks for your help.

Please find enclosed my little function for your use:

-- cut --

#-------------------------------------------------------------------------------
# Module        : t_count_na.R
# Author        : Georg Maubach
# Date          : 2016-05-24
# Update        : 2016-05-25
# Description   : Count NA's
# Source System : R 3.2.2 (64 Bit)
# Target System : R 3.2.2 (64 Bit)
# License       : CC-BY-SA-NC
#--------1---------2---------3---------4---------5---------6---------7---------8

test <- FALSE

t_count_na <- function(dataset,
                       variables = "all") {
  # Counts the number of NAs within a given set of variables.
  #
  # Args:
  #   dataset  : Object with dimnames, e.g. data frame, data table.
  #   variables: Character vector with variable names.
  #
  # Operation:
  #   Adds the variable "na_count" to the given dataset containing the
  #   count of NA's within the given variables.
  #
  # Returns:
  #   Original dataset with variable "na_count" added.
  #
  # Error handling:
  #   None.
  #
  # Credits:
  #   http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in-data-frame
  #   http://r.789695.n4.nabble.com/Creating-variables-on-the-fly-td4720034.html
 
  version <- "2016-05-25"
 
  if (identical(variables, "all")) {
    variable_list <- names(dataset)
  }  else {
    variable_list <- variables
  } 
  # drop = FALSE keeps a data frame even if only one variable is selected
  dataset[["na_count"]] <- apply(dataset[, variable_list, drop = FALSE],
                                 1,
                                 function(x) sum(is.na(x)))
 
  return(dataset)
 
}

#-------------------------------------------------------------------------------

test_t_count_na <- function(do_test = FALSE) {
  # Runs a small demonstration of t_count_na() when do_test is TRUE.

  if (!isTRUE(do_test)) {
    return(invisible(NULL))
  }

  cat("\n", "\n", "Test function t_count_na()", "\n", "\n")

  # Example dataset
  gene <- c("ENSG00000208234", "ENSG00000199674", "ENSG00000221622",
            "ENSG00000207604", "ENSG00000207431", "ENSG00000221312",
            "ENSG00134940305", "ENSG00394039490", "ENSG09943004048")
  hsap <- c(0, 0, 0, 0, 0, 0, 1, 1, 1)
  mmul <- c(NA, 2, 3, NA, 2, 1, NA, 2, NA)
  mmus <- c(NA, 2, NA, NA, NA, 2, NA, 3, 1)
  rnor <- c(NA, 2, NA, 1, NA, 3, NA, NA, 2)
  cfam <- c(NA, 2, NA, 2, 1, 2, 2, NA, NA)
  ds_example <- data.frame(gene, hsap, mmul, mmus, rnor, cfam)
  ds_example$gene <- as.character(ds_example$gene)

  cat("\n", "\n", "Example dataset before function call", "\n", "\n")
  print(ds_example)

  cat("\n", "\n", "Function call", "\n", "\n")
  ds_example <- t_count_na(dataset = ds_example,
                           variables = c("mmul", "mmus"))

  cat("\n", "\n", "Example dataset after function call", "\n", "\n")
  print(ds_example)
}

# The flag "test" set at the top of the file decides whether the test runs.
# The test function has its own name so that it does not overwrite the flag.
test_t_count_na(do_test = test)

# EOF .

-- cut --
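
For comparison, the same row-wise count can be written without apply(), using rowSums() on the logical matrix returned by is.na(). This is only a sketch under assumptions: the helper name count_na_rowsums and the small data frame ds are made up for illustration, and the input is assumed to be a plain data frame (a data.table handles the column selection differently).

```r
# Hypothetical helper, not part of the original post.
count_na_rowsums <- function(dataset, variables = names(dataset)) {
  # drop = FALSE keeps a data frame even when a single column is selected
  selected <- dataset[, variables, drop = FALSE]
  # is.na() on a data frame gives a logical matrix; rowSums() counts TRUE per row
  dataset[["na_count"]] <- unname(rowSums(is.na(selected)))
  dataset
}

# Made-up example data
ds <- data.frame(
  id   = c("a", "b", "c"),
  mmul = c(NA, 2, 3),
  mmus = c(NA, 2, NA),
  stringsAsFactors = FALSE
)

ds <- count_na_rowsums(ds, variables = c("mmul", "mmus"))
print(ds)
```

rowSums() avoids calling an R function once per row, which may matter on large datasets.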

Kind regards

Georg Maubach




From:     "Nordlund, Dan (DSHS/RDA)" <NordlDJ at dshs.wa.gov>
To:       "r-help at r-project.org" <r-help at r-project.org>
Date:     24.05.2016 21:41
Subject:  Re: [R] Creating a data frame from scratch
Sent by:  "R-help" <r-help-bounces at r-project.org>




I would probably write the function something like this:


t_count_na <- function(dataset,
                       variables = "all") {
  if (identical(variables, "all")) {
    variable_list <- names(dataset)
  }  else {
    variable_list <- variables
  } 
  # drop = FALSE keeps a data frame even if only one variable is selected
  apply(dataset[, variable_list, drop = FALSE], 1, function(x) sum(is.na(x)))
}
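
Since this version returns the counts instead of modifying the dataset, the caller decides where to store them. A minimal usage sketch, assuming a small made-up data frame (ds and its values are illustrative, not taken from the thread). The drop = FALSE argument is an addition here so that selecting a single column still works:

```r
t_count_na <- function(dataset, variables = "all") {
  variable_list <- if (identical(variables, "all")) names(dataset) else variables
  # drop = FALSE (an addition in this sketch) keeps a data frame for one column
  apply(dataset[, variable_list, drop = FALSE], 1, function(x) sum(is.na(x)))
}

# Made-up example data
ds <- data.frame(hsap = c(0, 1, 1),
                 mmul = c(NA, 2, NA),
                 mmus = c(NA, 3, 1))

# Store the counts in a new column, then keep only rows without NA
# in the selected variables
ds$na_count <- t_count_na(ds, variables = c("mmul", "mmus"))
ds_complete <- ds[ds$na_count == 0, ]
print(ds_complete)
```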


Hope this is helpful,

Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.