Windows 2000, R 1.3.0

I have been given a data set (ASCII tab delimited) in which many variables that are supposed to be numeric actually contain characters. I would be happy to have those character entries disappear and become NA.

When I use read.table and then attempt to convert these using as.numeric(), of course it doesn't work, and I get the ASCII? representation? maybe? When I use scan() and specify the data type (what=...), I get an error message saying that I have character data in the numeric variable.

Any ideas are appreciated.

Thank you,
Henry

*****************************
Martin Henry H. Stevens
HStevens at muohio.edu
tel: (513) 529 - 4206
FAX: (513) 529 - 4243
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
converting among modes
5 messages · Martin Henry H. Stevens, Brian Ripley, Thomas Lumley +1 more
On Fri, 27 Jul 2001, Martin Henry H. Stevens wrote:
Windows 2000, R 1.3.0 I have been given a data set (ASCII tab delimited) in which many variables that are supposed to be numeric actually contain characters. I would be happy to have those character entries disappear and become NA. When I use read.table and then attempt to convert these using as.numeric(), of course it doesn't work, and I get the ASCII? representation? maybe? [...]
Yes, `of course'. You need as.numeric(as.character()) since those columns are factors. (See the FAQ item 7.13.)
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
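The difference between the two conversions discussed above can be seen in a minimal sketch. The vector `vect.dat` and its values are hypothetical, and the factor is built directly rather than read from a file, because since R 4.0.0 read.table() no longer creates factors by default.

```r
# Hypothetical column, stored as a factor the way read.table() returned
# non-numeric columns by default in older versions of R.
vect.dat <- factor(c("10.5", "3", "n/a", "42", "3"))

# as.numeric() on a factor returns the internal integer codes, i.e. the
# position of each entry in levels(vect.dat), not the values that are printed.
codes <- as.numeric(vect.dat)

# Converting to character first parses the labels themselves. Entries that
# are not numbers become NA; suppressWarnings() hides the coercion warning.
values <- suppressWarnings(as.numeric(as.character(vect.dat)))

print(levels(vect.dat))
print(codes)
print(values)
```

The codes depend only on the ordering of the levels, which is why they look unrelated to the data.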
On Fri, 27 Jul 2001, Martin Henry H. Stevens wrote:
Windows 2000, R 1.3.0 I have been given a data set (ASCII tab delimited) in which many variables that are supposed to be numeric actually contain characters. I would be happy to have those character entries disappear and become NA. When I use read.table and then attempt to convert these using as.numeric(), of course it doesn't work, and I get the ASCII? representation? maybe? When I use scan() and specify the data type (what=...), I get an error message saying that I have character data in the numeric variable.
If there are only a few different character strings causing problems you can use the na.strings option of read.table().

Your data are actually read as *factors*, not as characters, and as.numeric() gives the factor codes. Converting factors to numeric is a FAQ (7.13); use as.numeric(as.character(the.factor))

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
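The na.strings suggestion above might look like the following sketch. The data, the column names, and the stray strings "n/a" and "missing" are all assumptions made up for illustration; the real file would be passed as the first argument instead of `text =`.

```r
# Hypothetical tab-delimited data held in a string so the example is
# self-contained; "n/a" and "missing" stand in for the stray entries.
txt <- "id\theight\tmass\n1\t10.5\t3\n2\tn/a\t42\n3\t7\tmissing\n"

# Every string listed in na.strings is read as NA, so the affected
# columns can be converted to numeric directly on input.
dat <- read.table(text = txt, header = TRUE, sep = "\t",
                  na.strings = c("NA", "n/a", "missing"))

str(dat)
```

This only helps when the offending strings can be listed in advance; any string not listed still turns its whole column into a non-numeric one.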
Thank you very much for the response. The way to get rid of those pesky characters is to first convert to character (from the default factor), THEN convert to numeric:

  as.numeric( as.character( vect.dat ) )

NOT simply

  as.numeric(vect.dat)

I had tried using as.is= in read.table, but I used it incorrectly (as.is=TRUE instead of as.is= <vector of col indices>).

----- Original Message -----
From: "Thomas Lumley" <tlumley at u.washington.edu>
To: "Martin Henry H. Stevens" <hstevens at muohio.edu>
Cc: <r-help at stat.math.ethz.ch>
Sent: Friday, July 27, 2001 1:47 PM
Subject: Re: [R] converting among modes
On Fri, 27 Jul 2001, Martin Henry H. Stevens wrote:

Windows 2000, R 1.3.0 I have been given a data set (ASCII tab delimited) in which many variables that are supposed to be numeric actually contain characters. I would be happy to have those character entries disappear and become NA. When I use read.table and then attempt to convert these using as.numeric(), of course it doesn't work, and I get the ASCII? representation? maybe? When I use scan() and specify the data type (what=...), I get an error message saying that I have character data in the numeric variable.

If there are only a few different character strings causing problems you can use the na.strings option of read.table().

Your data are actually read as *factors*, not as characters, and as.numeric() gives the factor codes. Converting factors to numeric is a FAQ (7.13); use as.numeric(as.character(the.factor))

	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
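The as.is= usage mentioned in the follow-up above can be sketched as below. The data and column names are hypothetical, and stringsAsFactors = TRUE is set explicitly to reproduce the factor default of older R versions (since R 4.0.0 the default is FALSE). In current R a single logical such as as.is = TRUE is recycled across all columns; whether R 1.3.0 behaved the same way is not something the thread settles.

```r
# Hypothetical tab-delimited data with one stray entry in column 2.
txt <- "id\theight\n1\t10.5\n2\tn/a\n3\t7\n"

# With factor conversion switched on, the height column comes back as a factor.
as_factor <- read.table(text = txt, header = TRUE, sep = "\t",
                        stringsAsFactors = TRUE)

# as.is = 2 names column 2 by index and keeps it as character instead.
as_char <- read.table(text = txt, header = TRUE, sep = "\t",
                      stringsAsFactors = TRUE, as.is = 2)

# A character column converts directly; non-numbers become NA.
height <- suppressWarnings(as.numeric(as_char$height))

print(class(as_factor$height))
print(class(as_char$height))
print(height)
```

Keeping the column as character removes the need for the as.character() step, but the final as.numeric() call is still required.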
2 days later
Martin Henry H. Stevens <hstevens at muohio.edu> writes:
Windows 2000, R 1.3.0 I have been given a data set (ASCII tab delimited) in which many variables that are supposed to be numeric actually contain characters. I would be happy to have those character entries disappear and become NA.
Check out the na.strings parameter to read.table().

Mark

--
Mark Myatt