Confused about using data.table package
On Feb 19, 2017, at 11:37 AM, C W <tmrsg11 at gmail.com> wrote:
Hi R,
I am a little confused by the data.table package.
library(data.table)
df <- data.frame(w=rnorm(20, -10, 1), x=rnorm(20, 0, 1), y=rnorm(20, 10, 1), z=rnorm(20, 20, 1))
df <- data.table(df)
setDT(df) is preferred; it converts df to a data.table by reference, without making a copy.
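A minimal sketch of the difference, using a small made-up data frame (the names df and dt_copy and the toy values are only for illustration):

```r
library(data.table)

df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# data.table() builds a new data.table from the data.frame.
dt_copy <- data.table(df)

# setDT() converts the existing object in place, so no assignment is needed.
setDT(df)

is.data.table(dt_copy)
is.data.table(df)
```

Both calls should report a data.table; the difference is that setDT() avoids the copy, which matters mostly for large tables.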
#drop column w
df_1 <- df[, w := NULL]  # I thought you are supposed to do: df_1 <- df[, -w]
Nope. The "[.data.table" function is very different from the "[.data.frame" function. An expression in the `j` position of "[.data.table" is evaluated in the environment of the data.table object, so unquoted column names are treated as variables and what comes back is the result of whatever function is applied to them. Here that function is just a unary minus, so df[, -w] returns the values of w with the sign flipped. Actually "nope" on two accounts: you cannot use a unary minus with column names in "[.data.frame" either. That would have needed to be
df[ , !colnames(df) %in% "w"] # logical indexing
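To make this concrete, here is a small sketch under made-up data (the object names dt, neg_w, df_no_w and dt_no_w are hypothetical, chosen only for this example). It shows what -w evaluates to in `j`, and one way to get a table without w in each class:

```r
library(data.table)

df <- data.frame(w = c(-10, -11, -9), x = c(0.5, -0.2, 1.1))
dt <- as.data.table(df)   # a copy, so df stays a plain data.frame

# In a data.table, j is evaluated with the columns as variables,
# so -w is simply the numeric vector w with its sign flipped.
neg_w <- dt[, -w]
is.numeric(neg_w)

# data.frame: drop w by logical indexing on the column names.
df_no_w <- df[, !colnames(df) %in% "w", drop = FALSE]

# data.table: select every column except "w"; dt itself is unchanged.
dt_no_w <- dt[, !"w"]

names(df_no_w)
names(dt_no_w)
names(dt)
```

Unlike w := NULL, the !"w" form returns a new table and leaves the original alone.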
df_2 <- df[x<y] # aren't you supposed to do df_2 <- df[x<y]?
I don't see a difference.
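For reference, a short sketch of how row filtering compares between the two classes. This assumes the question was about the data.frame idiom, which the message does not actually say; the toy data and object names are made up:

```r
library(data.table)

df <- data.frame(x = c(1, 5, 3), y = c(2, 4, 6))
dt <- as.data.table(df)

# data.table: a single unnamed argument is the row filter i,
# and column names can be used directly.
rows_dt  <- dt[x < y]
rows_dt2 <- dt[x < y, ]   # same thing with the trailing comma

# data.frame: columns must be qualified and the comma is required.
rows_df <- df[df$x < df$y, ]

nrow(rows_dt)
nrow(rows_df)
```

All three forms should select the same rows.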
df_3 <- df[, a := x-y] # created new column a using x minus y, why are we using colon equals?
You need to do more study of the extensive documentation. The behavior of the ":=" function is discussed in detail there.
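As a starting point, a minimal sketch of what ":=" does, with made-up data and hypothetical object names. The key point is that ":=" modifies the data.table by reference instead of returning a modified copy:

```r
library(data.table)

dt <- data.table(w = c(7, 8, 9), x = c(1, 2, 3), y = c(10, 20, 30))

# := adds column a to dt itself; dt_3 is just another name for dt.
dt_3 <- dt[, a := x - y]
names(dt)

# := NULL removes a column, again from dt itself.
dt[, w := NULL]
names(dt)

# Use copy() first if the original must stay untouched.
dt_b <- copy(dt)[, b := x + y]
"b" %in% names(dt)
"b" %in% names(dt_b)
```

So in the original post, df_1 and df_3 are not separate tables; each assignment changed df as well.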
I am a bit confused by this syntax.
It's non-standard for R but many people find the efficiencies of the package worth the extra effort to learn what is essentially a different evaluation strategy.
Thanks!
[[alternative HTML version deleted]]
R-help is a plain text mailing list.
David

David Winsemius
Alameda, CA, USA