Message-ID: <7F0A4DA6-9EE2-4C35-9E19-431B1FD2191A@plessthan.com>
Date: 2009-01-20T17:07:47Z
From: Dennis Fisher
Subject: Unable to read.csv because of special character in file
Colleagues,
I am trying to read a file that contains the µ (mu) character.
readLines is successful and shows the following:
"\xb5g/mL\"
read.csv yields the following:
> Error in type.convert(data[[i]], as.is = as.is[i], dec = dec,
> na.strings = character(0)) :
> invalid multibyte string at '<b5>g/mL'
Using a text editor, I replaced all occurrences of µ (mu), at which
point read.csv worked properly.
sessionInfo()
> R version 2.8.0 (2008-10-20)
> i386-apple-darwin8.11.1
>
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
Although my work-around was successful, I am wondering whether there
is some means to accomplish this without editing the source document:
1. Is it possible to tell R to read the character in its natural
form?
2. If not, I could run readLines and then apply gsub (my attempt did
not work; any ideas on how to formulate the regular expression would be
appreciated), then write the result to a tempfile and read it in again
(or use a textConnection).
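
For concreteness, here is a rough sketch of the two routes above. It rests on assumptions I have not confirmed: that the file is Latin-1 encoded (byte 0xb5 is the micro sign in that encoding) and that the R version in use has a read.csv that accepts a fileEncoding argument. The sample file and the column names (conc, units) are made up for illustration.

```r
# Assumption: the file is Latin-1 (ISO-8859-1), where byte 0xb5 is the micro sign.
# Build a small, hypothetical sample file so the sketch is self-contained.
path <- tempfile(fileext = ".csv")
con <- file(path, "wb")
writeBin(c(charToRaw("conc,units\n1.5,"), as.raw(0xb5), charToRaw("g/mL\n")), con)
close(con)

# Route 1: declare the file's encoding so R re-encodes it while reading.
d1 <- read.csv(path, fileEncoding = "latin1", stringsAsFactors = FALSE)

# Route 2a: read the raw lines, convert them to UTF-8, then parse the
# converted text through a textConnection instead of a temporary file.
raw_lines <- readLines(path)
utf8_lines <- iconv(raw_lines, from = "latin1", to = "UTF-8")
d2 <- read.csv(textConnection(utf8_lines), stringsAsFactors = FALSE)

# Route 2b: replace the offending byte with plain ASCII "u".
# fixed = TRUE and useBytes = TRUE make gsub match the single byte literally
# rather than trying to interpret the line as a multibyte string.
ascii_lines <- gsub("\xb5", "u", raw_lines, fixed = TRUE, useBytes = TRUE)
d3 <- read.csv(textConnection(ascii_lines), stringsAsFactors = FALSE)
```

Route 1 leaves the source document untouched and keeps the µ; route 2b loses the character but gives pure ASCII, which may be easier to handle downstream.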
Dennis
Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com