Dataframe manipulation

5 messages · Antje, John Kane, David Winsemius +1 more

#
Hello,

Can anybody help me with this problem?
I have a data frame whose columns contain numbers, but they were read in as
factors with "scan". Now I would like to convert these (multiple) columns to
a numeric format.


# this example creates a similar situation

testdata <- as.factor(c("1.1",NA,"2.3","5.5"))
testdata2 <- as.factor(c("1.7","4.3","8.5",10.0))

df <- data.frame(testdata, testdata2)

What do I have to do to get the same data frame, but with numeric values?

Antje
#
try this (also look at R-FAQ 7.10):

sapply(df, function (x) as.numeric(levels(x))[as.integer(x)])
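
As a small illustration of why the levels are indexed instead of calling as.numeric() on the factor directly, here is a minimal base-R sketch. The variable names (f, wrong, right, also_right) are made up for this example and are not from the thread.

```r
# A factor stores integer codes that index into its character levels.
f <- as.factor(c("1.1", NA, "2.3", "5.5"))

levels(f)       # the distinct values, stored as strings
as.integer(f)   # the internal codes, not the original numbers

# Converting the factor directly returns the codes, which is not what we want.
wrong <- as.numeric(f)

# Converting the levels and then indexing by code recovers the values.
right <- as.numeric(levels(f))[as.integer(f)]

# An equivalent form that converts every element instead of every level.
also_right <- as.numeric(as.character(f))

print(wrong)
print(right)
```

Here wrong holds the codes (1, NA, 2, 3), while right and also_right hold 1.1, NA, 2.3, 5.5.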


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Antje" <niederlein-rstat at yahoo.de>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, December 04, 2007 11:46 AM
Subject: [R] Dataframe manipulation
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
#
See R-FAQ # 7-11 for the solution.


Have a look at
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/98227.html
for a discussion of this type of problem and ways to
get around the issue.
#
"Dimitris Rizopoulos" <dimitris.rizopoulos at med.kuleuven.be> wrote in
news:002001c8367f$65a8d8d0$0540210a at www.domain:
That looks rather dangerous. By the time I saw your suggestion, I had 
already added an extra variable with:

df$testdata1<-as.numeric(levels(df$testdata))[as.integer(df$testdata)]
 
When I tried your suggestion I got no error, but there was also no 
effect. When I tried:

df2<-sapply(df, function (x) as.numeric(levels(x))[as.integer(x)])

I discovered that the numeric variable, testdata1, had been entirely 
converted to NA's and str(df2) did not look data.frame-like.
[1] FALSE
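
The two effects described above can be reproduced with the sketch below, which assumes the example data from the original post plus the extra numeric column. It suggests that sapply() simplifies its result to a matrix, and that a non-factor column has no levels, so indexing an empty vector yields only NA. The object name df2 follows the message above; everything else is base R.

```r
testdata  <- as.factor(c("1.1", NA, "2.3", "5.5"))
testdata2 <- as.factor(c("1.7", "4.3", "8.5", "10"))
df <- data.frame(testdata, testdata2)

# Extra column that is already numeric, as in the message above.
df$testdata1 <- as.numeric(levels(df$testdata))[as.integer(df$testdata)]

# sapply() returns a new object; df itself is not modified unless reassigned.
df2 <- sapply(df, function(x) as.numeric(levels(x))[as.integer(x)])

is.data.frame(df2)              # sapply() simplified the result
is.matrix(df2)
levels(df$testdata1)            # a numeric vector has no levels
all(is.na(df2[, "testdata1"]))  # so indexing an empty vector gives NA
```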
#
My original reply was intended for the original version of 'df', in 
which both columns were factors. In your example you have added a 
numeric column, so it is not exactly the case I replied to. For your 
example you can use the following:

testdata <- as.factor(c("1.1",NA,"2.3","5.5"))
testdata2 <- as.factor(c("1.7","4.3","8.5",10.0))
df <- data.frame(testdata, testdata2)

# add a column that is already numeric
df$testdata1 <- as.numeric(levels(df$testdata))[as.integer(df$testdata)]

# convert only the factor columns, leaving the others untouched
fcts <- sapply(df, is.factor)
df[fcts] <- lapply(df[fcts], function (x) as.numeric(levels(x))[as.integer(x)])
df
str(df)
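
If this conversion is needed repeatedly, the same idea could be wrapped in a small function. The sketch below is only one possible way to do it: the helper name factors_to_numeric and the column names a, b, c are hypothetical and do not come from the thread.

```r
# Hypothetical helper (the name is an assumption, not from the thread):
# convert every factor column of a data frame to numeric, keep the rest.
factors_to_numeric <- function(d) {
  stopifnot(is.data.frame(d))
  is_fct <- vapply(d, is.factor, logical(1))
  if (any(is_fct)) {
    d[is_fct] <- lapply(d[is_fct],
                        function(x) as.numeric(levels(x))[as.integer(x)])
  }
  d
}

# Example data: two factor columns and one column that is already numeric.
df <- data.frame(
  a = as.factor(c("1.1", NA, "2.3", "5.5")),
  b = as.factor(c("1.7", "4.3", "8.5", "10")),
  c = c(1.1, NA, 2.3, 5.5)
)

out <- factors_to_numeric(df)
str(out)
```

Because the result is assigned back into the selected columns with lapply(), the object stays a data frame, and columns that are not factors pass through unchanged.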


Best,
Dimitris


----- Original Message ----- 
From: "David Winsemius" <dwinsemius at comcast.net>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, December 04, 2007 4:47 PM
Subject: Re: [R] Dataframe manipulation