R Error, very odd....

5 messages · Katie2009, Dieter Menne, Don MacQueen

#
I'm trying to analyse some Excel data in R.  The problem is that when I import
the data with the first column as absolute values, everything works fine and
I can analyse it as normal.  When I leave the first column unchanged, so that
the negative numbers are imported as well, I get:


Error in storage.mode(y) <- "double" : 
  invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
  using type="numeric" with a factor response will be ignored

What can I do to get around this? I would like to analyse the data in R that
does include the negative numbers.....

Thanks!!
#
Katie2009 wrote:
You did not tell us how you got the data from Excel, so I have to guess.
Try rearranging your Excel rows so that the first few lines (3? Check the
documentation of whichever import function you are using) contain
non-missing data.

Dieter
#
Hi Dieter,

The method I'm using is copying the data in Excel, then reading it in R.
I've been doing a bit more fiddling, and have found that the class of the
column I'm having trouble with is 'factor', whilst the rest are numeric.

If I convert just that column with as.numeric, this appears to solve the
problem.  Has this changed the data at all, though?

Thanks.
#
Katie2009 wrote:
read.delim internally uses the same function as read.table, so you might
consult that documentation:

The number of data columns is determined by looking at the first five lines
of input (or the whole file if it has less than five lines), or from the
length of col.names if it is specified and is longer. This could conceivably
be wrong if fill or blank.lines.skip are true, so specify col.names if
necessary.

So if the first lines are unusual, many things can happen. However, reading
from the clipboard is not a very stable way to get Excel data; I would
suggest using library(RODBC) instead, or one of the other Excel readers.
With RODBC, use named ranges, not worksheet names, for a stable import.
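
To illustrate how a single odd entry changes the type read.table picks for a whole column, here is a minimal sketch. The data in txt are made up (they are not Katie's), and the decimal comma in "-2,5" is only one assumed kind of stray entry. stringsAsFactors = TRUE is set explicitly because it was the default at the time of this thread; R 4.0.0 and later default to FALSE and would give a character column instead.

```r
# Hypothetical tab-delimited data standing in for cells copied from Excel.
# The third value of x uses a decimal comma, so it does not parse as a number.
txt <- "x\ty\n-1.5\t10\n0\t20\n-2,5\t30\n3\t40\n"

# With stringsAsFactors = TRUE, the one bad entry turns all of x into a factor.
d <- read.delim(text = txt, stringsAsFactors = TRUE)
col_classes <- sapply(d, class)
print(col_classes)

# Declaring the expected types makes the import fail immediately
# instead of silently producing a factor.
strict <- tryCatch(
  read.delim(text = txt, colClasses = c("numeric", "numeric")),
  error = function(e) e
)
print(conditionMessage(strict))
```

Specifying colClasses is a design choice: it trades a silent type change for an early, visible error that points at the offending value.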

Dieter
#
At 12:37 AM -0700 5/11/09, Katie2009 wrote:
Well, you can look at the data to find out. Either just print it, or 
perhaps use simple exploration techniques, such as
   unique( column)
   table(column)
   summary(column)
and compare the results before and after converting to numeric.

In general, changing a factor to numeric can change the data relative 
to what you might think the data actually is.  Here is an example:
[1] -1  0  1
[1] -1 0  1
Levels: -1 0 1
[1] 1 2 3

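tmp3 is clearly not the same as tmp1: as.numeric() applied to a factor
returns the internal integer codes, not the values shown as labels.
Converting via the labels recovers the original values: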
[1] -1  0  1
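
The commands behind the output above are not shown, so the following is only a sketch of calls that would produce output of that shape. The names tmp1 and tmp3 come from the text; tmp2, tmp4 and the exact calls are assumptions.

```r
tmp1 <- c(-1, 0, 1)        # the numeric values we think we have
tmp2 <- factor(tmp1)       # assumed: the same values stored as a factor
tmp3 <- as.numeric(tmp2)   # integer codes of the levels, not the values

# Going through the level labels recovers the original numbers.
tmp4 <- as.numeric(as.character(tmp2))

print(tmp1)
print(tmp2)
print(tmp3)
print(tmp4)
```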


Evidently, R is creating a factor from a column that you think is 
numeric. Your job is to find out why. I would guess that you have, 
somewhere in the column, an entry that isn't a number. Perhaps a 
typographical error in the spreadsheet column. The effect of 
inputting as absolute values vs with negative numbers should give a 
clue to that.

Try
    tmp <-  as.numeric(format( column ))
    any(is.na(tmp))
and if that is TRUE, then
    column[is.na(tmp)]
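
The same check can be wrapped in a small helper that also reports which rows failed. This is a sketch only: check_numeric_column is a hypothetical name, and the example column with a decimal-comma entry is made-up data, not the data from the original question.

```r
# Convert a factor (or character) column to numeric via its labels and
# report the entries that could not be parsed as numbers.
check_numeric_column <- function(column) {
  as_text <- as.character(column)
  values <- suppressWarnings(as.numeric(as_text))
  bad <- is.na(values) & !is.na(as_text)   # genuinely missing cells are not "bad"
  list(values = values, bad_rows = which(bad), bad_entries = as_text[bad])
}

# Hypothetical column in which one entry uses a decimal comma.
column <- factor(c("-1.5", "0", "-2,5", "3"))
res <- check_numeric_column(column)
print(res$bad_rows)
print(res$bad_entries)
```

Once the offending entries are fixed in the spreadsheet, the column should import as numeric and no conversion is needed.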

-Don