Excluding all the columns from a data frame if the standard deviation of that column is zero (0).
On Tue, Oct 16, 2012 at 9:08 AM, siddu479 <onlyfordigitalstuff at gmail.com> wrote:
Hi All,
I have a data frame with nearly 10K columns of data, most of which
have a standard deviation (across all rows) of zero.
I want to exclude all of those columns from the data frame and proceed to
further processing.
I tried the following:
    data <- read.csv("data.CSV", header=T)
    for(i in 2:ncol(data))
    if(sd(data[,i])==0){
    df[,i] <-NULL
    }
The data columns are in 2:ncol, but I am getting the error "Error in
df[, i] <- NULL : object of type 'closure' is not subsettable".
Can anyone suggest the right method to accomplish this?
A perfect example of why "df" is a bad variable name. Here you are getting the function (= closure, more or less) df, the density function of the F distribution, instead of the uninitialized variable "df". Since a function can't be subsetted, you get the error.

In fact, I think you really just want this one-liner:

    !(apply(data, 2, sd) == 0)

which gives a logical vector, one entry per column, that can be used to subset.

In the same vein as the df problem, "data" is also a bad variable name (it's also a pre-defined function used for loading, surprise surprise!, data), but R is smart enough to keep them straight in this simple example. In your real script, however, I'd strongly suggest you change it.

Cheers,
Michael
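
For completeness, here is a minimal sketch of how that logical vector might be used to drop the constant columns. The toy data frame and the names dat, sds, keep and dat_clean are made up for illustration and are not from the original post. It assumes the first column is an identifier to be kept and that every other column is numeric with no missing values.

```r
# Hypothetical toy data standing in for the poster's data.CSV
dat <- data.frame(
  id = c("a", "b", "c"),
  x  = c(1, 1, 1),   # constant column, sd is 0
  y  = c(1, 2, 3),   # varying column
  z  = c(5, 5, 5)    # constant column, sd is 0
)

# Standard deviation of every column except the first one
sds <- vapply(dat[-1], sd, numeric(1))

# Always keep column 1; keep the others only where sd is non-zero
keep <- c(TRUE, sds != 0)
dat_clean <- dat[, keep, drop = FALSE]

print(names(dat_clean))
```

Computing the mask first and subsetting once also sidesteps a second problem with the original loop: deleting columns inside a for loop over 2:ncol(data) shifts the positions of the remaining columns while the loop index keeps counting against the old width. If a column contains NA values, sd() returns NA unless na.rm = TRUE is passed, so the mask would need extra handling in that case.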