inconsistency with cor() - "x must be numeric"

6 messages · Justin Fincher, Erik Iverson, Joshua Wiley +1 more

#
Please provide a reproducible example!

E.g., use ?dput to dump a minimal data.frame that
exhibits this issue on the newest version of R.
Justin Fincher wrote:
#
Hi Fincher,

cor() only works on numeric arguments now (as of R 2.11 or 2.10 if
memory serves).  So, I would update your function to ensure that you
are only passing numeric data to cor() and the error should go away
(it will probably be easier on you if you can update your version of R
to the latest and greatest...quite a bit has changed since 2.8.1).  If
you post a reproducible example of your function, I'm sure we can help
update it.
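
As a rough sketch of "ensure that you are only passing numeric data to cor()", a small wrapper can check its inputs first. The name checked_cor and the sample vectors are hypothetical, not taken from the original function.

```r
# Hypothetical helper: refuse non-numeric input before calling cor(),
# so the problem is reported at the point where the data goes wrong.
checked_cor <- function(x, y, ...) {
  if (!is.numeric(x)) stop("x is of class '", class(x)[1], "', expected numeric")
  if (!is.numeric(y)) stop("y is of class '", class(y)[1], "', expected numeric")
  cor(x, y, ...)
}

# Invented example data
ok  <- checked_cor(c(1, 2, 3, 4, 5), c(2, 4, 6, 8, 10))
bad <- tryCatch(checked_cor(c("1", "2", "3"), c(1, 2, 3)),
                error = function(e) conditionMessage(e))
```

Here ok holds a correlation, while bad holds the error message instead of a silently converted result.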

Cheers,

Josh
On Mon, Dec 13, 2010 at 1:56 PM, Justin Fincher <fincher at cs.fsu.edu> wrote:

#
Hi,

I can certainly understand not wanting to be long winded, and no
damage done.  Here's a link to the R news file:
http://cran.stat.ucla.edu/src/base/NEWS
If you search in your browser for "cor() and cov()" you should find
what happened.

At any rate, I could not fully check your code because of the error:
object 'accessibility_data' not found.  My guess would be that you
created a matrix (perhaps inadvertently), and at least one of the
columns had some character data in it, which would push *all* the data
to character class (even though a particular column may hold numeric
data, it is now stored as character).  Previously, I think cor() did
not check this and would silently convert using as.numeric().
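
To make the coercion concrete, here is a minimal sketch with invented stand-in data (the labels and values are assumptions, since the original 'accessibility_data' is not available). Mixing a label and a number inside c() yields character, and rbind() of such rows gives a character matrix.

```r
# Invented stand-in rows: each c() mixes a character label with a number,
# so every element is coerced to character.
acc_averages <- rbind(c("chr1", 0.42),
                      c("chr2", 0.37),
                      c("chr3", 0.51))

str(acc_averages)                 # shows a chr matrix
is.numeric(acc_averages[, 2])     # FALSE

# In current versions of R, cor() refuses the character column.
msg <- tryCatch(cor(acc_averages[, 2], c(1.2, 0.9, 1.5)),
                error = function(e) conditionMessage(e))
```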

I would look at:

str(acc_averages)

and I bet you will find that it is not numeric.  If this is the case,
one fix would be:

correlation = cor(as.numeric(acc_averages[,2]),
gene_densities$avg_density[1:23])

Probably a better fix would be to create acc_averages as a
data.frame rather than with c(); that way it can store different types
of data without moving everything up the hierarchy of classes.  To see
what I mean, look at ?rbind, second paragraph under the heading
"Value".
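
A minimal sketch of the data.frame approach, again with invented column names and values (chrom, avg and density are assumptions, not the original objects):

```r
# Each column of a data.frame keeps its own type, so the labels stay
# character and the averages stay numeric.
acc_averages <- data.frame(
  chrom = c("chr1", "chr2", "chr3"),
  avg   = c(0.42, 0.37, 0.51),
  stringsAsFactors = FALSE
)
density <- c(1.2, 0.9, 1.5)   # hypothetical stand-in for gene_densities$avg_density

str(acc_averages)             # chrom is chr, avg is num
correlation <- cor(acc_averages$avg, density)
```

No as.numeric() call is needed here because the numeric column was never coerced in the first place.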

Cheers,

Josh
On Mon, Dec 13, 2010 at 2:23 PM, Justin Fincher <fincher at cs.fsu.edu> wrote:

#
On Dec 13, 2010, at 23:23 , Justin Fincher wrote:

Well, let me put it this way: Once you realize what you are doing, you will appreciate that R is not letting you do that anymore...
This and the similar line 3 lines earlier are the culprit. The c() construct creates a character vector because its first argument is character. Hence, acc_averages is a character matrix. Now, are you _sure_ you know what happens if you correlate something with the character vector acc_averages[,2]? It may have given you the right thing for Pearson correlations, but it certainly did not for rank correlations before 2.11.0, which led to a "non-bug report" and the subsequent check for numeric data. What happened then was that ranks were based on the _alphabetical_ ordering of the data!
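
A small sketch of why alphabetical ranks are dangerous, using invented numbers: the same values rank differently once they are stored as character strings.

```r
values  <- c(9, 10, 100, 2)        # invented example data
as_text <- as.character(values)    # what a character matrix column would hold

num_ranks  <- rank(values)         # ranks by numeric size
text_ranks <- rank(as_text)        # ranks by string ordering, e.g. "10" sorts before "9"

# A rank-based correlation on the numeric values is well defined;
# on the strings it would have been computed from text_ranks instead.
other <- c(1, 2, 3, 0)
rho <- cor(values, other, method = "spearman")
```

Comparing num_ranks with text_ranks shows that the two orderings disagree, which is what silently distorted rank correlations in old versions.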

I'm fairly confident that you'd really want to do the whole thing with a suitable aggregate() call, but for now, how about just keeping the labels and the values in two separate vectors?
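
A rough sketch of both suggestions, assuming a hypothetical layout for the raw data (a data frame with columns chrom and score; the real structure of accessibility_data is not shown in the thread):

```r
# Hypothetical raw data: several accessibility scores per chromosome.
accessibility <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2", "chr3", "chr3"),
  score = c(0.40, 0.44, 0.35, 0.39, 0.50, 0.52)
)

# One aggregate() call computes the per-chromosome means.
acc_means <- aggregate(score ~ chrom, data = accessibility, FUN = mean)

# Labels and values kept in two separate vectors, each with its own type.
acc_labels <- as.character(acc_means$chrom)
acc_values <- acc_means$score
```

acc_values can then be passed straight to cor(), and acc_labels is only used for lookup or plotting.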