Error in vector("integer", length) : vector size cannot be NA
On Wed, Dec 08, 2010 at 08:12:34PM -0800, mathijsdevaan wrote:
Hello, I have uploaded a csv file that looks like this:
gc
  alpha_id beta_id
1   142053       1
2     9454       1
3   295618       2
4    42691       2
5   389224       3
6     9455       3
The alpha_id contains 310660 unique values and the beta_id contains 17431
unique values. The number of rows adds up to more than 1.3 million. Now I
want to convert this list of observations into a matrix with alpha_id in the
first row and beta_id in the first column (or vice versa) and a count in the
cells. One option would be M = as.matrix( table(gc) ). However, I
keep getting this error message:
Error in vector("integer", length) : vector size cannot be NA
In addition: Warning messages:
1: In pd * (as.integer(cat) - 1L) : NAs produced by integer overflow
2: In pd * nl : NAs produced by integer overflow
There is no missing data in my file, so I don't know what's wrong. Can you
please help me? Thanks!
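The table has 310660*17431 = 5415114460 entries. Stored as 4-byte integers, that is 310660*17431*4 bytes, or about 20.17 GB, which probably does not fit into RAM. The entry count also exceeds the largest value an R integer can hold (2147483647), which is consistent with the integer overflow warnings in the error output. The function table() produces a full matrix, not a sparse one, even if most cells are empty.

Petr Savicky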
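
The size estimate can be checked directly in R, and a sparse matrix is one possible way to get the counts without allocating every empty cell. What follows is only a minimal sketch and was not proposed in the thread: it assumes the Matrix package (a recommended package that ships with R) is acceptable, and it uses the six rows printed in the question as a toy stand-in for the real data frame gc.

```r
library(Matrix)  # assumption: the Matrix package is available

# Cell count of the dense table, computed in double precision so the
# product itself does not overflow.
n_alpha <- 310660
n_beta  <- 17431
cells   <- n_alpha * n_beta
cells > .Machine$integer.max  # TRUE: too many cells for a 32-bit integer
cells * 4 / 2^30              # about 20.17, the size in GB with 4-byte integers

# Toy stand-in for the poster's data frame (first six rows of the post).
gc <- data.frame(
  alpha_id = c(142053, 9454, 295618, 42691, 389224, 9455),
  beta_id  = c(1, 1, 2, 2, 3, 3)
)

# Map the ids to consecutive row and column indices.
a <- factor(gc$alpha_id)
b <- factor(gc$beta_id)

# sparseMatrix() adds up x over duplicated (i, j) pairs, so x = 1 yields
# a count per (alpha_id, beta_id) combination. Only non-empty cells are stored.
M <- sparseMatrix(
  i = as.integer(a), j = as.integer(b), x = 1,
  dims = c(nlevels(a), nlevels(b)),
  dimnames = list(levels(a), levels(b))
)
```

With the full data, M would have 310660 rows and 17431 columns but store at most one entry per distinct (alpha_id, beta_id) pair, so no more than the roughly 1.3 million rows of the input. Converting it back with as.matrix(M) would run into the same size limit as table().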