I am reading the book R in Action, but I am confused by the following code:
mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)]
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
My question is: when an if control statement is used inside the function, why
is the condition (na.omit) not followed by { }? Why does it still work?
Chenguang Du
Ph.D Candidate
Educational Research and Evaluation
School of Education
Virginia Tech
Hi Chenguang Du,
This is really a better question for R-help, as R-Sig-Teaching is about
teaching statistics with R. But ...
This function:
mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)]
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
is equivalent to:
mystats <- function(x, na.omit=FALSE){
  if (na.omit){
    x <- x[!is.na(x)]
  }
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
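So, the authors of that function wanted the if() statement to apply only
to the single statement that immediately follows it (i.e., x <- x[!is.na(x)])
and not to the rest of the function. Without braces, if() controls exactly
one expression; braces are only needed to group several expressions into one.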
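To make the scope of a brace-less if() concrete, here is a minimal sketch. The function demo_if() is a hypothetical helper written for illustration, not something from the book or this thread; it simply records which statements were executed.

```r
# Hypothetical helper (not from the thread): records which statements ran.
demo_if <- function(flag) {
  ran <- character(0)
  if (flag)
    ran <- c(ran, "controlled by if")  # the one statement governed by if()
  ran <- c(ran, "always runs")         # not governed by if(), runs every time
  ran
}

demo_if(TRUE)   # both statements run
demo_if(FALSE)  # only the second statement runs
```

The indentation is only for the reader; R decides what belongs to the if() from the syntax, not from the layout.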
(Replying to Chenguang Du's message of Fri, Jun 14, 2019, 1:19 PM.)
Perhaps the original question was only the syntax question about how if() statements work. That has been answered. (But I'll add a note that omitting { } in this situation is a good way to introduce bugs over time, so I generally avoid doing that.)
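As an illustration of the kind of bug meant here, consider the following sketch. Both functions are hypothetical and exist only for this example: clean_buggy() shows what can happen when a second line is later added under a brace-less if(), and clean_fixed() shows the same logic with braces.

```r
# Hypothetical example: a later edit adds a line under the if(),
# indented as though both lines belonged to it.
clean_buggy <- function(x, na.omit = FALSE) {
  dropped <- 0
  if (na.omit)
    dropped <- sum(is.na(x))
    x <- x[!is.na(x)]        # looks conditional, but runs every time
  list(x = x, dropped = dropped)
}

# Same logic with braces: both statements are governed by the if().
clean_fixed <- function(x, na.omit = FALSE) {
  dropped <- 0
  if (na.omit) {
    dropped <- sum(is.na(x))
    x <- x[!is.na(x)]
  }
  list(x = x, dropped = dropped)
}

v <- c(1, NA, 3)
clean_buggy(v)$x  # the NA is removed even though na.omit = FALSE
clean_fixed(v)$x  # the NA is kept, as requested
```

Writing the braces from the start means a later addition lands inside the block instead of silently falling outside it.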
But in case the question is motivated by the desire to compute summary statistics for a data set, the df_stats() function (available after loading ggformula or mosaic) provides a simple way to do this, and the fBasics package includes functions for skewness and kurtosis. Combining them, you can do this:
library(fBasics)    # defines skewness and kurtosis
library(ggformula)  # or library(mosaic)
df_stats( ~ Sepal.Length, data = iris, mean, sd, skewness, kurtosis, n = length())
##   mean_Sepal.Length sd_Sepal.Length skewness_Sepal.Length kurtosis_Sepal.Length   n
## 1          5.843333       0.8280661             0.3086407            -0.6058125 150
df_stats( Sepal.Length ~ Species, data = iris, mean, sd, skewness, kurtosis, n = length())
##      Species mean_Sepal.Length sd_Sepal.Length skewness_Sepal.Length kurtosis_Sepal.Length  n
## 1     setosa             5.006       0.3524897            0.11297784            -0.4508724 50
## 2 versicolor             5.936       0.5161711            0.09913926            -0.6939138 50
## 3  virginica             6.588       0.6358796            0.11102862            -0.2032597 50
df_stats(~Sepal.Length, data = iris)
##   min  Q1 median  Q3 max     mean        sd   n missing
## 1 4.3 5.1    5.8 6.4 7.9 5.843333 0.8280661 150       0
There are options to control how things are named if the default names are too long for your liking. The results are returned in a data frame, so they are suitable for downstream things like plotting with ggformula.
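For comparison, a per-group summary in a data frame can also be sketched in base R alone, reusing the mystats() function from the start of this thread (written here with braces). The names by_species and by_species_df are illustrative only, and this is just one possible approach, not a replacement for df_stats().

```r
# mystats() as posted earlier in the thread, with braces around the if() body.
mystats <- function(x, na.omit = FALSE) {
  if (na.omit) {
    x <- x[!is.na(x)]
  }
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x - m)^3 / s^3) / n
  kurt <- sum((x - m)^4 / s^4) / n - 3
  c(n = n, mean = m, stdev = s, skew = skew, kurtosis = kurt)
}

# Split the column by group, apply mystats() to each piece,
# and transpose so that each species becomes one row.
by_species <- t(sapply(split(iris$Sepal.Length, iris$Species), mystats))

# Turn the matrix into a data frame with the group label as a column.
by_species_df <- data.frame(Species = rownames(by_species),
                            by_species,
                            row.names = NULL)
by_species_df
```

Because the result is an ordinary data frame with one row per species, it can be passed on to plotting or further summarising in the same way.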
--rjp
(Replying to Christopher David Desjardins's message of Jun 14, 2019, 2:37 PM.)
_______________________________________________
R-sig-teaching at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching