Problem using read.xls - Everything converted to factors
On Fri, Jun 3, 2011 at 10:24 AM, Sebastian Lerch <lerch at lavabit.com> wrote:
Hello,
I would like to use the read.xls function from the gdata package to read data
from Microsoft Excel files, but I ran into a problem. For example, I used
the following code:
testfile<-read.xls("/home/.../wsjecon0603.xls", #file path
           header=F,
           dec=",",
           na.strings="n.a.",
           skip=5,
           sheet=2,
           col.names=c("Name", "Firm","GDP1","GDP2","GDP3","GDP4","CPI5",
 "CPI11","UNEMP5","UNEMP11","PROF03","PROF04","STARTS03","STARTS04"),
           nrows=54,
#colClasses=c(character,character,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric,numeric)
)
print(testfile)
Although the xls file contains numeric values in all the columns except the
ones which I named "Name" and "Firm", every column in the data frame has
class "factor". I tried to use the colClasses option as above, and also
with quotation marks around each word, but this does not work and I always
receive the following error:
Error in is(object, Class) :
  trying to get slot "className" of an object of a basic class
("list") with no slots
Calls: read.xls -> read.csv -> read.table -> <Anonymous> -> is
After some hours of research I figured out how I can manually change the
classes of the columns:
testfile$GDP2<-as.numeric(levels(testfile$GDP2))[testfile$GDP2]
testfile$Name<-as.character(levels(testfile$Name))[testfile$Name] #and so on
This works, but is a lot of work since I have to import many different data
sets. So I was wondering if there is another way to let the classes be
recognized correctly.
Additionally, I would like to know if there is any way to import data from
different sheets with the same layout at once into one data frame.
I use Ubuntu 11.04 with RKWard, if this is of any importance.
Assuming you are using the gdata package, read.xls has a ... argument which it passes on to read.table, so see ?read.table. In particular, as.is = TRUE prevents conversion to factors. Note that any column containing even one non-numeric entry will not be treated as numeric. You can rbind the results from different sheets if they have the same layout.
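A minimal sketch of these points, not a tested read.xls session: since the call trace above shows read.xls handing its work to read.csv, the example uses read.csv(text = ...) on made-up rows so that it runs without an Excel file. The helper read_sheet, the variables sheet1 and sheet2, and all values in them are hypothetical and only stand in for worksheets.

```r
# Stand-ins for two worksheets with the same layout (made-up values).
sheet1 <- "Smith,Acme,2.5,n.a.\nJones,Initech,3.1,4.0\n"
sheet2 <- "Brown,Globex,1.9,2.2\n"

# Hypothetical helper: reads one "sheet"; extra arguments go to read.csv,
# in the same way read.xls forwards its ... argument.
read_sheet <- function(txt, ...) {
  read.csv(text = txt, header = FALSE,
           col.names = c("Name", "Firm", "GDP1", "GDP2"), ...)
}

# as.is = TRUE keeps text columns as character instead of factor.
# na.strings has to match the missing-value marker exactly; otherwise
# "n.a." counts as a non-numeric entry and GDP2 is not read as numeric.
good <- read_sheet(sheet1, as.is = TRUE, na.strings = "n.a.")
bad  <- read_sheet(sheet1, as.is = TRUE)  # "n.a." not declared as NA

print(sapply(good, class))
print(sapply(bad, class))

# Sheets with the same layout can be stacked into one data frame with rbind.
combined <- do.call(rbind, lapply(list(sheet1, sheet2), read_sheet,
                                  as.is = TRUE, na.strings = "n.a."))
print(combined)
```

With gdata, the same pattern would presumably be to call read.xls once per sheet number inside lapply, passing as.is = TRUE and na.strings = "n.a." each time, and then rbind the resulting list; that variant is not run here.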
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com