inconsistency with cor() - "x must be numeric"
Hi,

I can certainly understand not wanting to be long-winded, and no harm done. Here is a link to the R NEWS file: http://cran.stat.ucla.edu/src/base/NEWS. If you search it in your browser for "cor() and cov()", you should find what changed.

At any rate, I could not fully check your code because it fails with: object 'accessibility_data' not found. My guess is that you created a matrix (if inadvertently), and at least one of the columns had some character data in it, which pushes *all* the data to character class (even though a particular column may hold numeric data, it is now stored as character). Previously, I think cor() did not check this and would silently convert using as.numeric().

I would look at:

    str(acc_averages)

and I bet you will find that it is not numeric. If this is the case, one fix would be:

    correlation = cor(as.numeric(acc_averages[,2]), gene_densities$avg_density[1:23])

Probably a better fix would be to initialize acc_averages as a data.frame rather than with c(); that way it can store different types of data without moving everything up the hierarchy of classes. To see what I mean, look at ?rbind under the heading "Value", second paragraph.

Cheers,

Josh
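A minimal sketch of the coercion described above, for anyone who wants to reproduce it. The labels and means here are made-up stand-ins (the real accessibility_data was not posted), so treat this as an illustration of the mechanism rather than the poster's actual data.

```r
# Hypothetical stand-in values; not taken from the original data set.
labels <- c("chr1", "chr2", "chr3")
means  <- c(0.42, 0.37, 0.55)

# c(label, value) mixes character and numeric, so each row is coerced to
# character, and rbind() builds a character matrix.
acc_matrix <- c()
for (i in seq_along(labels)) {
  acc_matrix <- rbind(acc_matrix, c(labels[i], means[i]))
}
str(acc_matrix)              # shows a chr matrix
is.numeric(acc_matrix[, 2])  # FALSE

# Fix 1: convert the column back explicitly before calling cor().
values <- as.numeric(acc_matrix[, 2])

# Fix 2: keep labels and values in a data frame, one type per column.
acc_df <- data.frame(chrom = labels, avg = means, stringsAsFactors = FALSE)
is.numeric(acc_df$avg)       # TRUE
```

Either `values` or `acc_df$avg` can then be passed to cor() without triggering the "x must be numeric" error.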
On Mon, Dec 13, 2010 at 2:23 PM, Justin Fincher <fincher at cs.fsu.edu> wrote:
I apologize for the lack of example. I was trying not to be too
long-winded. Below is the first portion of my function that is causing the
error. (I'm including both calls to cor(), though it quits after the first
throws an error.) I do not believe he has redefined cor(), as he is a novice
user and we tried this after starting a fresh session. And I will look into
upgrading. I realize it is a little out of date, since it is the version in
the repository for my distribution and not the latest-and-greatest from R.
I just didn't realize a change like that would be made that would
(seemingly to me) reduce functionality. Thank you again for your help.
- Fincher
   # As they don't change, hard code gene density values
   gene_densities =
     data.frame(chrom=c("chr1","chr2","chr3","chr4","chr5","chr6","chr7",
                        "chr8","chr9","chr10","chr11","chr12","chr13",
                        "chr14","chr15","chr16","chr17","chr18","chr19",
                        "chr20","chr21","chr22","chrX","chrY"),
                avg_density=c(10.19,6.457,6.71,4.917,6.083,7.491,7.453,
                              5.939,7.27,7.132,11.38,9.429,3.757,
                              7.607,8.455,11.81,17.84,4.649,26.52,
                              11.19,6.51,11.28,7.535,2.931))
   acc_averages = c()
   # subset out relevant data
   accessibility_data = subset(accessibility_data,
                               accessibility_data$V9==";color=000000")
   # calculate mean accessibility value for each chromosome
   for(i in seq(1,22)){
      sub = paste("chr",i,sep="")
      temp = subset(accessibility_data,accessibility_data$V1==sub)
      acc_averages = rbind(acc_averages,c(sub,as.double(mean(temp$V6))))
   }
   temp = subset(accessibility_data,accessibility_data$V1=="chrX")
   acc_averages = rbind(acc_averages,c("chrX",as.double(mean(temp$V6))))
   # Output the correlation without including chromosome Y
   correlation = cor(acc_averages[,2],gene_densities$avg_density[1:23])
   cat("Correlation w/o chrY:",correlation,'\n')
   temp = subset(accessibility_data,accessibility_data$V1=="chrY")
   acc_averages = rbind(acc_averages,c("chrY",mean(temp$V6)))
   # Output overall correlation
   correlation = cor(acc_averages[,2],gene_densities$avg_density)
   cat("Correlation w/chrY:",correlation,'\n')
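For comparison, here is a hedged sketch of how the loop above might be restructured so the averages stay numeric, following the data.frame suggestion in the reply. The small accessibility_data below is a hypothetical stand-in with only the V1 and V6 columns, and only three chromosomes are used; it is not the poster's function.

```r
# Hypothetical stand-in for accessibility_data (the real file was not posted):
# V1 = chromosome label, V6 = accessibility value.
accessibility_data <- data.frame(
  V1 = c("chr1", "chr1", "chr2", "chr2", "chr3", "chr3"),
  V6 = c(0.40, 0.44, 0.30, 0.36, 0.50, 0.60),
  stringsAsFactors = FALSE
)
gene_density <- c(10.19, 6.457, 6.71)  # first three densities from the post

chroms <- paste("chr", 1:3, sep = "")

# One numeric mean per chromosome; sapply() returns a numeric vector here.
avg <- sapply(chroms, function(ch) {
  mean(accessibility_data$V6[accessibility_data$V1 == ch])
})

# Labels and values go in separate columns, each with its own type.
acc_averages <- data.frame(chrom = chroms, avg = avg, stringsAsFactors = FALSE)

correlation <- cor(acc_averages$avg, gene_density)
cat("Correlation:", correlation, "\n")
```

Because the label never shares a vector with the mean, nothing is coerced to character and cor() receives numeric input on both old and new versions of R.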
On Mon, Dec 13, 2010 at 17:06, Joshua Wiley <jwiley.psych at gmail.com> wrote:
Hi Fincher,

cor() only works on numeric arguments now (as of R 2.11 or 2.10, if memory serves). So, I would update your function to ensure that you are only passing numeric data to cor(), and the error should go away (it will probably be easier on you if you can update your version of R to the latest and greatest...quite a bit has changed since 2.8.1). If you post a reproducible example of your function, I'm sure we can help update it.

Cheers,

Josh

On Mon, Dec 13, 2010 at 1:56 PM, Justin Fincher <fincher at cs.fsu.edu> wrote:
Howdy,
   I have written a small function to generate a simple plot, and my
colleague is getting an error when attempting to run it. Essentially, I loop
through categories in a data frame and take the average value for each
category. The categories are in $V1; I subset first, then take the mean and
concatenate it to the previous values using rbind(c("label",mean(data$V6))).
The result is a two-column matrix with labels in column one and values in
column two. Within the function I calculate the correlation of column two
and another set of values that are part of the function. On my computer
(Linux box running R 2.8.1) the function runs correctly. On my colleague's
computer (Windows box running R 2.12) the function throws an error at the
cor() function call saying that "x must be numeric." We are running on the
exact same data set and source'ing the same function definition. Any help
would be appreciated.
- Fincher
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/