How to change variables in datasets automatically

2 messages · Muhammad Subianto, Gabor Grothendieck

#
Dear R-helpers,
Suppose I have a dataset,
 data(iris)
 a <- data.frame(Sepal.Length=c(1:4), Sepal.Width=c(2:5),
Petal.Length=c(3:6), Petal.Width=c(4:7), Species=rep("rosa",4))
 b <- iris[1:10,]
 newtest.iris <- rbind(a,b)
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           1.0         2.0          3.0         4.0    rosa
2           2.0         3.0          4.0         5.0    rosa
3           3.0         4.0          5.0         6.0    rosa
4           4.0         5.0          6.0         7.0    rosa
11          5.1         3.5          1.4         0.2  setosa
21          4.9         3.0          1.4         0.2  setosa
31          4.7         3.2          1.3         0.2  setosa
41          4.6         3.1          1.5         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
6           5.4         3.9          1.7         0.4  setosa
7           4.6         3.4          1.4         0.3  setosa
8           5.0         3.4          1.5         0.2  setosa
9           4.4         2.9          1.4         0.2  setosa
10          4.9         3.1          1.5         0.1  setosa
 
I want to rename each column (variable) as follows: Sepal.Length=SL,
Sepal.Width=SW, Petal.Length=PL, Petal.Width=PW, and Species=Class.
Then I want to recode each value in the Species variable so that rosa=0
and setosa=1. The result should look something like this:
    SL  SW  PL  PW Class
1  1.0 2.0 3.0 4.0     0
2  2.0 3.0 4.0 5.0     0
3  3.0 4.0 5.0 6.0     0
4  4.0 5.0 6.0 7.0     0
5  5.1 3.5 1.4 0.2     1
6  4.9 3.0 1.4 0.2     1
7  4.7 3.2 1.3 0.2     1
8  4.6 3.1 1.5 0.2     1
9  5.0 3.6 1.4 0.2     1
10 5.4 3.9 1.7 0.4     1
11 4.6 3.4 1.4 0.3     1
12 5.0 3.4 1.5 0.2     1
13 4.4 2.9 1.4 0.2     1
14 4.9 3.1 1.5 0.1     1
I can produce the result above by hand, like this:
+                        SW = newtest.iris$Sepal.Width,
+                        PL = newtest.iris$Petal.Length,
+                        PW = newtest.iris$Petal.Width,
+                        Class)

However, my real datasets have many more variables that I must rename.
Is there a way to make these changes automatically, and which library
contains a function to do that?
I would be very happy if anyone could help me.
Thank you very much in advance.

Kind regards,
Muhammad Subianto
#
On 4/29/05, Muhammad Subianto <subianto at gmail.com> wrote:
Someone else has already indicated how to do this, but since you say you
have a large number of columns you might want an automated way as
well.  For example, the following removes lower case letters and dots
from the names and then changes Species to Class.  Note that there is
a dot after a-z inside the brackets.

# remove lower case letters and dots from column names and 
# change name of col5 to Class
data(iris)
names(iris) <- gsub("[a-z.]", "", names(iris)) 
names(iris)[5] <- "Class"
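
The code above only renames the columns; the question also asked to
recode rosa=0 and setosa=1. A minimal sketch of one way to do that,
using a named lookup vector, is given here. The name `codes` is
introduced for illustration only, and `datasets::iris` is used so the
example does not depend on whether `iris` was already renamed in the
workspace:

```r
# Rebuild the example data from the question.
a <- data.frame(Sepal.Length = 1:4, Sepal.Width = 2:5,
                Petal.Length = 3:6, Petal.Width = 4:7,
                Species = rep("rosa", 4))
b <- datasets::iris[1:10, ]
newtest.iris <- rbind(a, b)

# Shorten the column names as in the reply above.
names(newtest.iris) <- gsub("[a-z.]", "", names(newtest.iris))
names(newtest.iris)[5] <- "Class"

# Hypothetical lookup vector: old label -> new numeric code.
codes <- c(rosa = 0, setosa = 1)

# as.character() makes this work whether Class is a factor or character.
newtest.iris$Class <- unname(codes[as.character(newtest.iris$Class)])

# Reset the row names to 1..n, as in the desired output.
rownames(newtest.iris) <- NULL
```

Any label missing from `codes` would come out as NA, which makes
unexpected values easy to spot.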

Another possibility might be to use abbreviate.  This does
not give the exact result you are looking for, but it's close
and it's very easy:

data(iris)
names(iris) <- abbreviate(names(iris))
names(iris)[5] <- "Class"
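
With many columns, either shortening rule could map two different names
to the same short name. The following is only a sketch of how one might
guard against that; the names `d` and `short` are made up for this
example, and it renames Species by name rather than by position:

```r
# Work on a copy so the built-in dataset stays untouched.
d <- datasets::iris

# Shorten the names; abbreviate(names(d)) could be used here instead.
short <- gsub("[a-z.]", "", names(d))

# Rename by matching the old name, so column order does not matter.
short[names(d) == "Species"] <- "Class"

# make.unique() appends .1, .2, ... to any repeated names.
if (anyDuplicated(short)) short <- make.unique(short)

names(d) <- short
```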