Need help to understand the code - R-SIG-teaching

Fri, Jun 14, 2019 11:18 AM

I am reading the book R in Action, but I am confused by the following code:

mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)]
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}

My question is: when an if control statement is used inside the function,
why is there no { } after (na.omit)? Why does the code still work?

Chenguang Du
Ph.D Candidate
Educational Research and Evaluation
School of Education
Virginia Tech


Christopher David Desjardins

Fri, Jun 14, 2019 11:37 AM

Hi Chenguang Du,

This is really a better question for R-help, as R-Sig-Teaching is about
teaching statistics with R. But ...

This function:

mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)]
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}

is equivalent to:

mystats <- function(x, na.omit=FALSE){
  if (na.omit){
    x <- x[!is.na(x)]
  }
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}

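So the authors of that function wanted the if() statement to apply only
to the single expression that immediately follows it (i.e., this code
x <- x[!is.na(x)]) and not to the rest of the function. Without braces,
if() controls exactly one expression; braces are only needed when it
should control more than one.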
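
A minimal sketch of that rule, using made-up variables (flag, a, b, c1 and
c2 are illustrative names, not from the book). Without braces only the
first assignment depends on the condition, so with flag set to FALSE the
values printed should be a = 0, b = 1, c1 = 0, c2 = 0:

```r
# Without braces, if() controls only the single expression that follows it.
flag <- FALSE
a <- 0
b <- 0

if (flag)
  a <- 1   # skipped, because flag is FALSE
b <- 1     # always runs; it is not part of the if()

# With braces, every expression inside the block is conditional.
c1 <- 0
c2 <- 0
if (flag) {
  c1 <- 1
  c2 <- 1
}

print(c(a = a, b = b, c1 = c1, c2 = c2))
```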

On Fri, Jun 14, 2019 at 1:19 PM Chenguang Du <dcheng6 at vt.edu> wrote:

_______________________________________________
R-sig-teaching at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching

Randall Pruim

Fri, Jun 14, 2019 12:37 PM

Perhaps the original question was only the syntax question about how if() statements work. That has been answered. (But I'll add a note that omitting { } in this situation is a good way to introduce bugs over time, so I generally avoid doing that.)
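
To illustrate the kind of bug meant here, a small sketch under assumed names (drop_na_buggy() and drop_na_braced() are hypothetical functions written for this example, not from the book or any package). It imagines a later edit that adds a second line under the if() and relies on indentation alone:

```r
# Hypothetical later edit: a second line is added "inside" the if(),
# relying on indentation rather than braces.
drop_na_buggy <- function(x, na.omit = FALSE) {
  dropped <- FALSE
  if (na.omit)
    x <- x[!is.na(x)]
    dropped <- TRUE   # indented, but NOT controlled by the if()
  list(x = x, dropped = dropped)
}

# With braces, both lines are conditional, as intended.
drop_na_braced <- function(x, na.omit = FALSE) {
  dropped <- FALSE
  if (na.omit) {
    x <- x[!is.na(x)]
    dropped <- TRUE
  }
  list(x = x, dropped = dropped)
}

vals <- c(1, NA, 3)
buggy  <- drop_na_buggy(vals)    # na.omit left at FALSE
braced <- drop_na_braced(vals)

print(buggy$dropped)    # TRUE, even though nothing was dropped
print(braced$dropped)   # FALSE
```

R ignores indentation, so the buggy version sets dropped to TRUE on every call. Writing the braces from the start makes this kind of edit safe.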

But in case the question is motivated by the desire to compute summary statistics for a data set, the df_stats() function provides a simple way to do this, and the fBasics package includes functions for skew and kurtosis.  Combining, you can do this:


library(fBasics)   # defines skewness and kurtosis
library(ggformula) # or library(mosaic)

df_stats( ~ Sepal.Length, data = iris, mean, sd, skewness, kurtosis, n = length())
##   mean_Sepal.Length sd_Sepal.Length skewness_Sepal.Length kurtosis_Sepal.Length   n
## 1          5.843333       0.8280661             0.3086407            -0.6058125 150

df_stats( Sepal.Length ~ Species, data = iris, mean, sd, skewness, kurtosis, n = length())
##      Species mean_Sepal.Length sd_Sepal.Length skewness_Sepal.Length kurtosis_Sepal.Length  n
## 1     setosa             5.006       0.3524897            0.11297784            -0.4508724 50
## 2 versicolor             5.936       0.5161711            0.09913926            -0.6939138 50
## 3  virginica             6.588       0.6358796            0.11102862            -0.2032597 50

df_stats(~Sepal.Length, data = iris)
##   min  Q1 median  Q3 max     mean        sd   n missing
## 1 4.3 5.1    5.8 6.4 7.9 5.843333 0.8280661 150       0

There are options to control how things are named if the defaults are too long for your liking. The results are returned in a data frame, so they are suitable for downstream things like plotting with ggformula.
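
If extra packages are not available, a rough base-R sketch along the same lines is to apply the mystats() function from the original question to each species. This is only an illustration and is not how df_stats() works internally; the names by_species and result are assumptions made for this example.

```r
# mystats() as in the original question, with braces added around the if() body.
mystats <- function(x, na.omit = FALSE) {
  if (na.omit) {
    x <- x[!is.na(x)]
  }
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x - m)^3 / s^3) / n
  kurt <- sum((x - m)^4 / s^4) / n - 3
  c(n = n, mean = m, stdev = s, skew = skew, kurtosis = kurt)
}

# Split Sepal.Length by Species and summarise each group.
# sapply() returns a matrix with one column per species.
by_species <- sapply(split(iris$Sepal.Length, iris$Species),
                     mystats, na.omit = TRUE)

# Transpose so that each row is a species, then convert to a data frame.
result <- as.data.frame(t(by_species))
print(result)
```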

--rjp
