From: aramucia at hotmail.com
To: ivan.calandra at uni-hamburg.de
Subject: RE: [R] "Re: Change class factor to numeric"
Date: Tue, 18 May 2010 16:36:55 +0000
Hello Ivan,
Thank you very much. As you read in my last email, I suspected that a single typing error was making R treat the column as a factor. I looked at the original data frame again, found the mistake, and now it works. Sorry: I had reviewed the data frame so many times that I did not think of it earlier.
Thanks again, and kind regards from Spain ;)
Arantzazu Blanco Bernardeau
Dpto de Química Agrícola, Geología y Edafología
Universidad de Murcia-Campus de Espinardo
----------------------------------------
Date: Tue, 18 May 2010 18:28:45 +0200
From: ivan.calandra at uni-hamburg.de
To: aramucia at hotmail.com
Subject: Re: [R] "Re: Change class factor to numeric"
As you may have noticed, I'm writing off-list so that you can send me your
data (as csv).
But please be quick, I'll be leaving soon. Otherwise it will have to wait
until tomorrow!
In any case, I'm no expert, so I'm not sure I'll be able to help you.
And I don't think NAs should be a problem. It might be solved with some
arguments to the read.table() call.
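A minimal sketch of the kind of read.table() arguments meant here. The sample text, the semicolon separator and the "n.d." missing-value code are all made up for illustration and are not taken from the real file. stringsAsFactors = TRUE is set explicitly to mimic the default of older R versions.

```r
# Hypothetical sample standing in for the real csv file
csv_text <- "idmuestra;arcilla;limo
A1;20.1;36.7
A2;NA;48.9
A3;n.d.;7.6
A4;15.3;9.2"

# Plain import: "NA" is recognised as missing by default, but the stray
# "n.d." makes the whole arcilla column non-numeric, so it becomes a factor
raw <- read.table(text = csv_text, header = TRUE, sep = ";",
                  stringsAsFactors = TRUE)
str(raw)

# Declaring every missing-value code lets the column come in as numeric
clean <- read.table(text = csv_text, header = TRUE, sep = ";",
                    na.strings = c("NA", "n.d."),
                    stringsAsFactors = TRUE)
str(clean)
```

If the file still used decimal commas, the dec = "," argument would be the one to look at.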
Can you also send me the line of code you used to import your csv?
Ivan
On 5/18/2010 18:25, Arantzazu Blanco Bernardeau wrote:
Hi again!
Could it be that NA was entered in the variable for unavailable values and, NA being a character string, R takes the whole column as a factor?
It is the only idea I have, because this is the first time I have had this problem.
Thanks a lot, I am really learning! :)
Arantzazu Blanco Bernardeau
Dpto de Química Agrícola, Geología y Edafología
Universidad de Murcia-Campus de Espinardo
----------------------------------------
Date: Tue, 18 May 2010 18:17:02 +0200
From: ivan.calandra at uni-hamburg.de
To: r-help at r-project.org
Subject: Re: [R] "Re: Change class factor to numeric"
Hi again,
If you used the function read.table() to read from a csv file into a
data.frame, it is weird that numeric data are converted into factors.
I would check in the original data that you don't have a typo somewhere.
I don't know all the possibilities, but a special character can
definitely make R interpret this variable differently.
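One possible way to hunt for such a typo is sketched below. The vector is a made-up stand-in for caperf$arcilla, and the mistyped "15,3" is only an assumed example of what the typo might look like.

```r
# Hypothetical stand-in for caperf$arcilla with one mistyped value
arcilla <- factor(c("20.1", "15,3", "0.71", NA, "33.2"))

# Convert via character; entries that fail to parse become NA
as_text <- as.character(arcilla)
as_num  <- suppressWarnings(as.numeric(as_text))

# Entries that were not missing originally but failed to parse are the typos
bad <- which(!is.na(as_text) & is.na(as_num))
bad            # positions of the offending entries
as_text[bad]   # the offending strings themselves

# After correcting them, convert and assign the result back; calling
# as.numeric() without assigning leaves the original variable a factor
as_text[bad] <- sub(",", ".", as_text[bad], fixed = TRUE)
arcilla <- as.numeric(as_text)
str(arcilla)
```

Fixing the value in the csv file itself and importing again should have the same effect.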
For Drenaje, that is normal: integer codes are read as integers. In that case you can just convert it with:
caperf$Drenaje <- factor(caperf$Drenaje)
HTH
Ivan
On 5/18/2010 17:59, Arantzazu Blanco Bernardeau wrote:
Hello
So, here is the str() output of the data frame, which comes from a csv file.
I could use Gr_2 instead of arcilla, because it holds the same values... but curiously, it is a factor as well.
'data.frame': 556 obs. of 38 variables:
$ Hoja : int 818 818 818 818 818 818 818 818 818 818 ...
$ idmuestra : Factor w/ 555 levels "1015-I","1015-II",..: 26 27 28 29 31 32 33 34 30 35 ...
$ Año : int 1994 1994 1994 1994 1994 1994 1994 1994 1994 1994 ...
$ x : int 655500 633050 632200 635000 637150 643700 655300 648000 653400 646200 ...
$ y : int 4285800 4283050 4298150 4290000 4294800 4288850 4282700 4290350 4298450 4296650 ...
$ CO_gkg1 : num 3.7 6.5 6.3 2.6 12.1 6.9 3.5 10.8 10.3 3.3 ...
$ NTgkg_1 : num 0.53 1.01 0.66 0.42 1.3 0.82 0.43 1.31 0.85 0.51 ...
$ C_Nratio : num 6.98 6.46 9.55 6.18 9.33 ...
$ C03Ca : num 53.6 38 1.2 1.1 21.1 ...
$ pHw : num 8 8.4 8.3 8.4 8.45 8.2 8.4 8.4 8.1 8.4 ...
$ pHClK : num 7.5 7.4 7.2 7.4 8.3 7.5 7.8 7.4 7.4 7.5 ...
$ CCC : num 9.7 14.4 10.5 7.2 12.8 ...
$ CEdSm : num 0.62 0.72 0.38 0.36 19.35 ...
$ pF1_3atm : num 25.4 21.3 9.1 12.8 24.1 ...
$ pF15atm : num 15.3 10.6 5.8 6.1 8.45 11 3.7 18.8 10 14.3 ...
$ Gr_2 : Factor w/ 391 levels "0","0.71","0.9",..: 200 31 158 36 142 60 377 263 140 151 ...
$ Gr2_20 : num 19.1 39.1 2.9 5.2 NA 12.5 5.7 29.5 17.3 10.5 ...
$ Gr20_50 : num 17.6 9.8 4.7 4 NA 10.4 4.2 13.3 12.5 7.8 ...
$ Gr50_100 : num 14.1 12.2 7.6 9 NA 9.1 6.1 8.5 9.1 18.9 ...
$ Gr100_250 : num 13.4 17.1 28.7 46.2 NA 20.8 28.7 13.6 16.8 32.9 ...
$ Gr250_500 : num 7.1 5.7 28.8 15.2 NA 24.7 23.4 3.8 11.4 6.9 ...
$ Gr500_1000 : num 4 1.8 5 3.9 NA 4.2 14.3 0.9 8.9 2.1 ...
$ Gr1000_2000 : num 1.4 2 1.9 4 NA 4.1 8.9 0.4 4.8 0.9 ...
$ arcilla : Factor w/ 391 levels "0","0.71","0.9",..: 201 32 158 37 NA 61 377 263 141 151 ...
$ limo : num 36.7 48.9 7.6 9.2 0 22.9 9.9 42.8 29.8 18.3 ...
$ arena : num 40 38.8 72 78.3 0 62.9 81.4 27.2 51 61.7 ...
$ SUMA : Factor w/ 15 levels "0","100","100.2",..: 2 2 2 2 1 2 2 2 2 2 ...
$ codusosuelo : logi NA NA NA NA NA NA ...
$ pendiente : Factor w/ 9 levels "0","1","2","3",..: 3 2 3 3 2 4 3 2 2 3 ...
$ profutil : logi NA NA NA NA NA NA ...
$ profutil.1 : int 50 110 120 40 80 52 30 120 37 80 ...
$ pedregosidad: int 0 0 0 2 0 4 4 0 3 3 ...
$ Drenaje : int 4 4 4 4 1 4 5 3 4 4 ...
$ codsuelo : num 1.1 2.1 3.1 2.1 12.3 2.5 2.5 4.1 2.5 2.1 ...
$ textura : logi NA NA NA NA NA NA ...
$ m.original : Factor w/ 7 levels "Calizas . dolomías y areniscas",..: 4 7 7 2 2 7 7 7 7 7 ...
$ GRUPOPPAL : Factor w/ 13 levels "Arenosoles","Calcisoles",..: 11 2 3 2 12 2 2 4 2 2 ...
$ SUELO : Factor w/ 40 levels "Arenosol calcárico",..: 30 3 8 3 36 7 7 10 7 3 ...
On the other hand, the variable Drenaje (drainage), which should be a factor, appears as integer.
Thanks a lot!
Arantzazu Blanco Bernardeau
Dpto de Química Agrícola, Geología y Edafología
Universidad de Murcia-Campus de Espinardo
----------------------------------------
Date: Tue, 18 May 2010 17:49:54 +0200
From: ivan.calandra at uni-hamburg.de
To:
Subject: Re: [R] "Re: Change class factor to numeric"
Hi,
I think that providing the output of str() on your data array (or whatever
you have) would help, because for now we don't have much idea of what you
really have.
Moreover, some sample data is always welcome (using the function dput(),
for example).
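A short sketch of both suggestions, using a made-up data frame (caperf_demo and its values are hypothetical, not the real data):

```r
# Hypothetical small data frame standing in for the poster's data
caperf_demo <- data.frame(
  arcilla = factor(c("20.1", "0.71", "33.2")),
  limo    = c(36.7, 48.9, 7.6)
)

# Structure summary: shows the class of each column at a glance
str(caperf_demo)

# dput() prints R code that recreates the object,
# so the output can be pasted straight into an email
dput(head(caperf_demo, 2))

# Round trip: capture the dput() text and rebuild the object from it
rebuilt <- eval(parse(text = capture.output(dput(caperf_demo))))
all.equal(rebuilt, caperf_demo)
```

Anyone on the list can then paste the dput() output into their own session and work with the same object, column classes included.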
Ivan
On 5/18/2010 17:36, Arantzazu Blanco Bernardeau wrote:
Sorry, I mistakenly sent my question without a subject, so I am resending it. Please excuse me.
Hello
I have a data array of soil variables (caperf) in which the variable arcilla (clay) is a factor, as I can see from str(caperf). I need to fit a regression model, so I need arcilla as a numeric variable. For that I have entered
as.numeric(as.character(arcilla))
and even after entering
as.numeric(levels(arcilla))[arcilla]
the variable remains a factor, and the linear model is not valid (for my purposes).
The decimal commas have already been converted to decimal points, so I have no idea what else to do.
Thanks a lot
Arantzazu Blanco Bernardeau
Dpto de Química Agrícola, Geología y Edafología
Universidad de Murcia-Campus de Espinardo