It depends on what you want to do with that data in R. If you want to work with the whole dataset at once, just storing it in R will require more than 2.6 GB of memory (assuming all the data are numeric and stored as doubles):
7e6 * 50 * 8 / 1024^2
[1] 2670.288

That's not impossible, but you'll need to be on a computer with quite a bit more memory than that, running an OS that supports it. If that's not feasible, you need to re-think what you want to do with that data in R (e.g., read in and process a small chunk at a time, or read in a random sample).

Andy
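[Not part of the original exchange: below is a minimal sketch of the chunk-at-a-time approach mentioned above, reading from an open connection with read.table(). The file name "pums.csv", the comma separator, and the chunk size are illustrative assumptions, not details from the post.]

## Read and process a large delimited file in chunks, keeping only summaries.
infile     <- "pums.csv"   # assumed file name
chunk_size <- 100000       # assumed chunk size; tune to available memory

con <- file(infile, open = "r")

## Read the header line once so each chunk can reuse the column names.
col_names <- scan(con, what = character(), nlines = 1, sep = ",", quiet = TRUE)

repeat {
  chunk <- tryCatch(
    read.table(con, nrows = chunk_size, sep = ",", header = FALSE,
               col.names = col_names, stringsAsFactors = FALSE),
    error = function(e) NULL   # read.table errors once no lines remain
  )
  if (is.null(chunk) || nrow(chunk) == 0) break

  ## Process 'chunk' here and retain only small results
  ## (running counts, sums, or a filtered subset of rows),
  ## rather than accumulating all 7 million rows in memory.

  if (nrow(chunk) < chunk_size) break   # last, partial chunk
}
close(con)

Specifying colClasses in read.table() would further reduce memory use and speed up parsing; alternatively, the data could be loaded into a database and queried from R a piece at a time.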
From: Thomas W Volscho

Dear List,

I have some projects where I use enormous datasets. For instance, the 5% PUMS microdata from the Census Bureau. After deleting cases I may have a dataset with 7 million+ rows and 50+ columns. Will R handle a datafile of this size? If so, how?

Thank you in advance,
Tom Volscho

************************************
Thomas W. Volscho
Graduate Student
Dept. of Sociology U-2068
University of Connecticut
Storrs, CT 06269
Phone: (860) 486-3882
http://vm.uconn.edu/~twv00001