An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-ecology/attachments/20130211/d29fcd4e/attachment.pl>
vegdist Error en double(N * (N - 1)/2) : tamaño del vector especificado es muy grande
3 messages · Carolina Bello, Philippi, Tom, Jari Oksanen
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-ecology/attachments/20130211/9341bf72/attachment.pl>
Dear Carolina Bello,

You asked this same question on the general R mailing list, and Brian Ripley answered you on Saturday. The essential things he told you were that you cannot do this with 32 GB of RAM, and that you should rethink your problem. All we can do here is repeat his message, and Tom Philippi has already done so.

With N = 138037 you need 71 GB to store the result, and 32 GB of RAM is too little. I don't know how much further you would get with vegan:::vegdist in R 3.0.0, but at least the error message will change to a Spanish version of "Error: cannot allocate vector of size 71.0 Gb".

You really should rethink your problem. You need to use methods that can handle data sets this large, or you need to thin your data. Are your data modelled? At least I find it difficult to believe that you really have observations on 89 species in 138037 grid cells in rugged terrain like the Andes.

Cheers, Jari Oksanen
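The 71 GB figure can be checked with a few lines of arithmetic. The sketch below assumes only what the thread states: the result is a vector of N * (N - 1)/2 doubles (the call shown in the error message) and each double takes 8 bytes. The variable names are illustrative. It also compares the length against 2^31 - 1, the maximum vector length in R versions before 3.0.0, which is a plausible reason the original error complains about the vector size rather than about allocation.

```r
# Number of sites (rows) reported in the thread.
n_sites <- 138037

# The result holds one value per pair of sites: N * (N - 1) / 2 doubles.
n_pairs <- n_sites * (n_sites - 1) / 2

# Each double takes 8 bytes; R reports sizes in units of 1024^3 bytes ("Gb").
size_gb <- n_pairs * 8 / 1024^3

# Maximum vector length before long vectors arrived in R 3.0.0.
max_len_old <- .Machine$integer.max

cat(sprintf("pairs: %.0f\n", n_pairs))
cat(sprintf("size:  %.1f Gb\n", size_gb))
cat("longer than 2^31 - 1 elements:", n_pairs > max_len_old, "\n")
```

The storage grows with the square of the number of sites, so halving the number of rows cuts the requirement to roughly a quarter.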
On 12/02/2013, at 00:15 AM, Carolina Bello wrote:
Hi,

I have some problems with the vegdist function. I want to do a hierarchical clustering of 138037 pixels of 1 km^2 each from a study area in the Colombian Andes. I have distribution models for 89 species, so I have a matrix with the pixels in the rows and the species in the columns, filled with the absence (0) or presence (1) of each species in each pixel. I think the bigger problem is that the agglomeration method in the hierarchical clustering needs the whole matrix, so I can't divide it.

To do this I want to calculate a distance matrix with Jaccard. I have binary data. The problem is that I have a matrix of 138037 rows (sites) and 89 columns (species). My script is:

    rm(list=ls(all=T))
    gc()  ## to clear everything left hidden in memory
    memory.limit(size = 100000)  # it gives 1 Tera from HDD in case RAM memory runs out
    DF=as.data.frame(MODELOS)
    DF=na.omit(DF)
    DISTAN=vegdist(DF[,2:ncol(DF)],"jaccard")

Almost immediately it produces the error:

    Error en double(N * (N - 1)/2) : tamaño del vector especificado es muy grande

(In English: "vector size specified is too large".)

I think this is a memory error, but I don't know why, since I have a PC with 32 GB of RAM and 1 TB of HDD.

I also tried to compute a distance matrix with the dist function from the proxy package:

    library(proxy)
    vector=dist(DF, method = "Jaccard")

It starts to run, but when it reaches 10 GB of RAM a window announces that R has encountered an error and will close, so it closes and starts a new session.

I really don't know what is going on, and even less how to solve it. Can anybody help me?

Thanks,
Carolina Bello
IAVH-COLOMBIA
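One possible way to act on the advice to thin the data, given that the matrix is binary, is to collapse pixels with identical species composition before computing distances. The sketch below is only an illustration under stated assumptions: the matrix `pa` is simulated stand-in data (not the real MODELOS matrix), and base R's dist(method = "binary"), which is the Jaccard distance for presence/absence data, stands in for vegdist so that the example needs no extra packages. Whether this reduces 138037 rows enough depends on how many distinct compositions the real data contain, which the thread does not say.

```r
set.seed(1)

# Hypothetical stand-in data: 2000 pixels x 6 species, presence (1) / absence (0).
pa <- matrix(rbinom(2000 * 6, 1, 0.3), ncol = 6)

# One text key per pixel, so identical species compositions share a key.
key <- apply(pa, 1, paste, collapse = "")
first <- !duplicated(key)

# Keep one row per distinct composition, and remember which one each pixel maps to.
uniq <- pa[first, , drop = FALSE]
group <- match(key, key[first])

# Number of pixels behind each distinct composition.
n_members <- tabulate(group, nbins = nrow(uniq))

# Jaccard distances among distinct compositions only.
d <- dist(uniq, method = "binary")

# 'members' tells hclust how many observations each row of 'd' represents.
hc <- hclust(d, method = "average", members = n_members)

# Cut the tree (k = 4 is arbitrary here) and map clusters back to all pixels.
pixel_cluster <- cutree(hc, k = 4)[group]

cat("pixels:", nrow(pa), " distinct compositions:", nrow(uniq), "\n")
```

The distance object now scales with the number of distinct compositions rather than the number of pixels. Passing the counts through `members` is intended to keep average linkage weighted by pixel numbers; other linkage methods may behave differently and should be checked against the hclust documentation.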
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland jari.oksanen at oulu.fi, Ph. +358 400 408593, http://cc.oulu.fi/~jarioksa