data.frame: How to get the classes of all components and how to remove their factor structure?

4 messages · Marius Hofert, PIKAL Petr

#
Dear expeRts,

I have two questions concerning data frames:
(1) How can I apply the class function to each component of a data.frame? As you can see below, using apply() to call class on each column does not give the right result, and calling class on each component separately seems bulky.
(2) After transforming the data frame a bit, the classes of certain components change to factor. How can I remove the factor structure?

Cheers,

Marius

x <- c(2004:2010, 2002:2011, 2000:2011)
df <- data.frame(x=x, group=c(rep("low",7), rep("middle",10), rep("high",12)), 
                 y=x+100*runif(length(x))) 

## Question (1): why do the following lines not give the same "class"?
apply(df, 2, class)
class(df$x)
class(df$group)
class(df$y)

df. <- as.data.frame(xtabs(y ~ x + group, data=df))

class(df.$x)
class(df.$group)
class(df.$Freq)

## Question (2): how can I remove the factor structure from x?
df.$x <- as.numeric(as.character(df.$x)) # seems bulky; note that as.numeric(df.$x) is not correct
class(df.$x)
#
Hi

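From the help page ?apply:

Arguments
X    an array, including a matrix.

A data frame is not an array, so apply() first coerces it to a matrix (here a character matrix), and class() then reports "character" for every column. Use sapply() instead: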
sapply(df, class)
        x     group         y 
"integer"  "factor" "numeric"
Actually, apply() is correct in the sense that it behaves as documented.
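
To make the difference concrete, here is a minimal sketch using the data from the original post. The only change is an explicit factor() call around group, which is my assumption so that the result does not depend on the stringsAsFactors default of the R version in use.

```r
x <- c(2004:2010, 2002:2011, 2000:2011)
df <- data.frame(x = x,
                 # factor() is explicit here (an assumption, see above)
                 group = factor(c(rep("low", 7), rep("middle", 10), rep("high", 12))),
                 y = x + 100 * runif(length(x)))

## apply() works on arrays, so the data frame is coerced with as.matrix();
## a non-numeric column turns the whole matrix into character
m <- as.matrix(df)
typeof(m)

## hence every column is reported as "character"
cls_apply <- apply(df, 2, class)

## a data frame is a list of columns, so sapply() sees each column as it is
cls_sapply <- sapply(df, class)
cls_sapply
```

sapply() (or lapply(), if a column may have more than one class) is the natural tool here because it iterates over the list components without any coercion.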

?factor

Warning
The interpretation of a factor depends on both the codes and the "levels" 
attribute. Be careful only to compare factors with the same set of levels 
(in the same order). In particular, as.numeric applied to a factor is 
meaningless, and may happen by implicit coercion. To transform a factor f 
to approximately its original numeric values, as.numeric(levels(f))[f] is 
recommended and slightly more efficient than as.numeric(as.character(f)). 
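
A small sketch of what the warning means in practice; the factor f below is a made-up example, not data from this thread.

```r
## hypothetical factor whose labels are numbers
f <- factor(c("2004", "2000", "2011", "2004"))

## as.numeric() returns the internal integer codes, not the labels;
## the levels are sorted as "2000", "2004", "2011"
codes <- as.numeric(f)              # 2 1 3 2

## recommended in ?factor: convert the levels once, then index by the factor
vals1 <- as.numeric(levels(f))[f]   # 2004 2000 2011 2004

## equivalent result, but converts every element individually
vals2 <- as.numeric(as.character(f))
```

The first form converts only the distinct levels, which is why ?factor calls it slightly more efficient.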


Regards
Petr

        
http://www.R-project.org/posting-guide.html
#
If you do it often you can define a small helper:

unfactor <- function(x) as.numeric(as.character(x))
df.$x <- unfactor(df.$x)

or you can use
df. <- as.data.frame(xtabs(y ~ x + group, data=df),
                     stringsAsFactors=FALSE)
df.$x <- as.numeric(df.$x)

But it seems to me that it is not much less bulky.
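
For comparison, a short sketch of the second variant that checks the column classes before and after the conversion. The name df2 is only used here for illustration, and the data are rebuilt from the original post so the snippet runs on its own.

```r
x <- c(2004:2010, 2002:2011, 2000:2011)
df <- data.frame(x = x,
                 group = c(rep("low", 7), rep("middle", 10), rep("high", 12)),
                 y = x + 100 * runif(length(x)))

## with stringsAsFactors = FALSE the classifying columns come back as
## character vectors instead of factors
df2 <- as.data.frame(xtabs(y ~ x + group, data = df),
                     stringsAsFactors = FALSE)
cls_before <- sapply(df2, class)

## a character column can be converted directly, no as.character() needed
df2$x <- as.numeric(df2$x)
cls_after <- sapply(df2, class)
cls_after
```

Either way one explicit conversion of x remains, which matches the remark above that this route is not much shorter.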

Regards
Petr
#
Dear Petr,

thanks for your posts, they perfectly answered my questions.

Cheers,

Marius
On 2011-06-28, at 09:49 , Petr PIKAL wrote: