
Removing variables from data frame with a wild card

Hello Avi,

While something like d$something <- ... may look as if you are modifying the data directly, it does not actually do so. Most R objects behave as immutable values, that is, an object does not change after it has been created. This guarantees that if you hold another binding to the same object, that object won't change behind your back.
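To make the point about bindings concrete, here is a minimal sketch. The data frame and its column names are made up for illustration; d2 is a second binding to the same object, and it does not see the change made through d.

```r
# Hypothetical example data; the column names are assumptions.
d <- data.frame(x = 1:3, y = 4:6)
d2 <- d          # second binding to the same object

d$y <- NULL      # looks like an in-place change of d ...

names(d)         # "x"
names(d2)        # "x" "y"  ... but d2 still has both columns
```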

There is one data structure that is in fact mutable: environments. For example, compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter does work (E$a is 3), because the same environment is modified, while in the first case a modified copy of the list is made inside local() and the outer L is left unchanged (L$a is NULL).

Under the hood this is a bit of syntactic sugar: if R sees something like

f(a) <- ...

it will look for a function named `f<-` and evaluate

a <- `f<-`(a, value = ...)

(this also happens, for example, when you do names(x) <- ...)
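As a small sketch of that rewriting, one can define a replacement function by hand. The name `second<-` is hypothetical and only serves the illustration; the point is that both forms below give the same result.

```r
# Hypothetical replacement function: sets the second element of x.
# The last argument of a replacement function must be called 'value'.
`second<-` <- function(x, value) {
  x[2] <- value
  x              # the modified copy is returned ...
}

v <- c(1, 2, 3)
second(v) <- 10  # ... and R rebinds v to it

w <- c(1, 2, 3)
w <- `second<-`(w, value = 10)  # the explicit form of the same call

identical(v, w)  # TRUE
```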

So in our case this is in fact equivalent to creating a copy with the columns removed and rebinding the symbol in the current environment to the result.
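Written out explicitly in base R, that could look like the following sketch. The data frame, the tmp_ prefix and the pattern are assumptions chosen for illustration, not taken from the original question.

```r
# Hypothetical data; columns starting with "tmp_" are to be dropped.
d <- data.frame(id = 1:3,
                tmp_a = c(0.1, 0.2, 0.3),
                tmp_b = c("u", "v", "w"))
before <- d                          # another binding to the original object

keep <- !grepl("^tmp_", names(d))    # TRUE for columns to keep
d <- d[, keep, drop = FALSE]         # build a reduced copy and rebind d

names(d)                             # "id"
names(before)                        # "id" "tmp_a" "tmp_b"
```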

The data.table package breaks with this convention and uses C-based routines that modify the data in place, without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.
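A minimal sketch of the difference, assuming the data.table package is installed. The table contents and the tmp_ pattern are again made up. Note that a second binding now does see the change, because no copy is made.

```r
library(data.table)

# Hypothetical data; columns starting with "tmp_" are to be dropped.
d <- data.table(id = 1:3,
                tmp_a = c(0.1, 0.2, 0.3),
                tmp_b = c("u", "v", "w"))
alias <- d                           # second binding, no copy is made

cols_to_remove <- grep("^tmp_", names(d), value = TRUE)
d[, (cols_to_remove) := NULL]        # removes the columns by reference

names(d)                             # "id"
names(alias)                         # "id"  (the same object was changed)
```

If an independent copy is wanted, data.table provides copy(d) for that purpose.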

Regards,
Valentin

14.01.2023 18:28:33 avi.e.gross at gmail.com: