Wide character in print? - R-help

Mon, Feb 4, 2013 8:39 AM

Hello:


	  Googling for "Wide character in print" led me to a discussion that 
pushed me to review the "read.table" help page.  Careful study there 
suggested I try setting "fileEncoding" to something;  it also suggested I 
look at the "Encoding" section in the help file for "file".  That section 
suggested that anything I got to work on my computer might not be 
portable.
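
For reference, a minimal sketch of what setting "fileEncoding" looks like in base R. The file, its column names and its contents are made up for illustration, and the example assumes the session's native encoding can represent the characters involved (true for Windows-1252 and UTF-8 locales, not for a plain C locale), which is exactly the portability concern raised above.

```r
# Write a small CSV as raw UTF-8 bytes, independent of the session locale.
# The contents are hypothetical; \u escapes keep this script pure ASCII.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(charToRaw(enc2utf8("city,note\nSan Jos\u00e9,\u00a9 2010\n")), con)
close(con)

# Declare the file's encoding explicitly instead of relying on the
# locale default; R re-encodes the input to the native encoding.
dat <- read.csv(tmp, fileEncoding = "UTF-8", stringsAsFactors = FALSE)
unlink(tmp)

print(dat)
```

Declaring the encoding of the file is portable in the sense that the file's bytes are the same everywhere; what can still differ between machines is whether the native encoding is able to hold the result.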


	  Suggestions?
	  Thanks,
	  Spencer
	

###########################


       I get "Wide character in print" from trying 
read.xls("22_data.xls") in the gdata package, with "22_data.xls" 
downloaded from "Varieties_Country_A-E.xls" at 
"http://www.reinhartandrogoff.com/data/browse-by-topic/topics/7/":

Wide character in print at 
C:/Users/sgraves/pgms/R/R-2.15.2/library/gdata/perl/xls2csv.pl line 270.

R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] gdata_2.12.0

loaded via a namespace (and not attached):
[1] gtools_2.7.0


       I get the same message from xls2sep("22_data.xls").


       It's only a comment, so I suppose I could ignore it.  However, 
it's generated by a function I'm adding to the Ecdat package, and I'd 
rather find a way to avoid it.  (I suppose I could dump it to sink, but 
that's pretty extreme and could mask other problems.)


       Thanks,
       Spencer

Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com

Marc Schwartz

Mon, Feb 4, 2013 9:33 AM

On Feb 4, 2013, at 10:39 AM, Spencer Graves <spencer.graves at structuremonitoring.com> wrote:

Spencer,

The message is coming from Perl, not from R, and from what I understand it is typically encountered when there are UTF-8/Unicode characters in the source; "wide character" apparently refers to multi-byte encodings.
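
To make the multi-byte point concrete, here is a small sketch in R. It does not reproduce the Perl warning itself; it only shows that a single non-ASCII character, such as the copyright sign, occupies more than one byte once it is encoded as UTF-8. The variable names are illustrative only.

```r
# Two non-ASCII characters, written with \u escapes so the script stays ASCII.
copyright <- enc2utf8("\u00a9")   # copyright sign
e_acute   <- enc2utf8("\u00e9")   # e with acute accent

# Each is a single character ...
n_chars <- nchar(copyright, type = "chars")

# ... but takes two bytes in UTF-8.
n_bytes <- nchar(copyright, type = "bytes")
bytes_e <- charToRaw(e_acute)

print(c(chars = n_chars, bytes = n_bytes))
print(bytes_e)
```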

Having downloaded the Excel file you indicate above, my first reaction is that it is not really structured in a way that facilitates automated parsing to a CSV file (the intermediate step before using read.table()), which is then read into R as a data frame. The worksheets are not purely rows and columns of data, which is the typical application for read.xls().

There are lengthy header lines in the worksheets, some of which include copyright symbols, which is likely why you are getting the error from Perl. There are also embedded objects in the worksheets, which appear to be image crops of tables from a paper. I honestly don't know if read.xls() is set up to handle that stuff and you may need to contact the maintainers.

Given the above, I am not sure what I would recommend if your goal is to parse the raw data contained in the Excel worksheets and include them in a package. You may need to copy and paste the data ranges to the OS clipboard and read them into R from there, or consider using a different R package that has more flexibility in defining the specific Excel worksheet cell ranges that you want to extract.
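
As one possible starting point, and only as a sketch: if a worksheet can be converted to CSV at all, the free-text header lines can be skipped in base R by locating the first line that looks like the real column header. The file contents, the column names and the "^Year," pattern below are hypothetical stand-ins, not taken from the actual workbook. As far as I know, read.xls() also documents a "pattern" argument intended for a similar purpose.

```r
# Hypothetical stand-in for the CSV produced from one worksheet:
# two free-text header lines, then the real table.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Varieties of crises (free-text title line)",
  "Source notes and copyright line",
  "Year,Country,Value",
  "1990,A,1.5",
  "1991,A,2.0"
), tmp)

# Find the first line that looks like the real column header,
# then skip everything before it when reading.
lines <- readLines(tmp)
first <- grep("^Year,", lines)[1]
dat <- read.csv(tmp, skip = first - 1, stringsAsFactors = FALSE)
unlink(tmp)

print(dat)
```

This only handles leading header text; embedded objects or several tables on one sheet would still need a tool that can address specific cell ranges.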

Others may have different ideas for you.

Regards,

Marc Schwartz