mathijsdevaan
2010-Dec-09 04:12 UTC
[R] Error in vector("integer", length) : vector size cannot be NA
Hello,

I have uploaded a csv file that looks like this:

> gc
  alpha_id beta_id
1   142053       1
2     9454       1
3   295618       2
4    42691       2
5   389224       3
6     9455       3

The alpha_id contains 310660 unique values and the beta_id contains 17431 unique values. The number of rows adds up to more than 1.3 million. Now I want to convert this list of observations into a matrix with alpha_id in the first row and beta_id in the first column (or vice versa) and a count in the cells. So this would be an option: M = as.matrix( table(gc) ). However, I keep getting this error message:

Error in vector("integer", length) : vector size cannot be NA
In addition: Warning messages:
1: In pd * (as.integer(cat) - 1L) : NAs produced by integer overflow
2: In pd * nl : NAs produced by integer overflow

There is no missing data in my file, so I don't know what's wrong. Can you please help me? Thanks!

Mathijs

--
View this message in context: http://r.789695.n4.nabble.com/Error-in-vector-integer-length-vector-size-cannot-be-NA-tp3079566p3079566.html
Sent from the R help mailing list archive at Nabble.com.
Petr Savicky
2010-Dec-09 08:24 UTC
[R] Error in vector("integer", length) : vector size cannot be NA
On Wed, Dec 08, 2010 at 08:12:34PM -0800, mathijsdevaan wrote:
>
> Hello,
>
> I have uploaded a csv file that looks like this:
>
> > gc
>   alpha_id beta_id
> 1   142053       1
> 2     9454       1
> 3   295618       2
> 4    42691       2
> 5   389224       3
> 6     9455       3
>
> The alpha_id contains 310660 unique values and the beta_id contains 17431
> unique values. The number of rows adds up to more than 1.3 million. Now I
> want to convert this list of observations into a matrix with alpha_id in the
> first row and beta_id in the first column (or vice versa) and a count in the
> cells. So this would be an option M = as.matrix( table(gc) ). However, I
> keep getting this error message:
>
> Error in vector("integer", length) : vector size cannot be NA
> In addition: Warning messages:
> 1: In pd * (as.integer(cat) - 1L) : NAs produced by integer overflow
> 2: In pd * nl : NAs produced by integer overflow
>
> There is no missing data in my file, so I don't know what's wrong. Can you
> please help me? Thanks!

The number of entries in the table is 310660*17431. Using integer type, this is 310660*17431*4 bytes, which is 20.17 GB. This probably does not fit into RAM. Function table() produces a full matrix, not a sparse one, even if there are empty cells.

Petr Savicky.
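The size estimate above can be checked directly. It is also consistent with the warnings in the original post: 310660*17431 is roughly 5.4 billion cells, more than the largest value an R integer can hold (.Machine$integer.max), so a cell count computed in integer arithmetic overflows to NA. The following is a minimal sketch and not part of the original reply. It assumes the Matrix package is available, rebuilds the six example rows by hand under the poster's name gc, and shows one possible way to hold the counts in a sparse matrix instead of a dense table.

```r
library(Matrix)  # assumption: Matrix is installed (it ships with standard R)

# Dimensions reported in the original post
n_alpha <- 310660
n_beta  <- 17431

# Doubles are used here because the product does not fit in a 32-bit integer
n_cells <- n_alpha * n_beta
n_cells > .Machine$integer.max   # TRUE: the dense table has too many cells

# Memory needed for a dense integer table at 4 bytes per cell, in GiB
gib <- n_cells * 4 / 2^30
round(gib, 2)                    # about 20.17

# The six example rows from the post, typed in by hand for illustration
gc <- data.frame(
  alpha_id = c(142053, 9454, 295618, 42691, 389224, 9455),
  beta_id  = c(1, 1, 2, 2, 3, 3)
)

# Map each id to a row or column index through factor levels
a <- factor(gc$alpha_id)
b <- factor(gc$beta_id)

# sparseMatrix() sums x over repeated (i, j) pairs, so x = 1 yields counts;
# only the non-empty cells are stored
M <- sparseMatrix(
  i = as.integer(a),
  j = as.integer(b),
  x = 1,
  dims = c(nlevels(a), nlevels(b)),
  dimnames = list(levels(a), levels(b))
)
M
```

With about 1.3 million rows, a sparse matrix stores at most that many non-zero cells, however large the full 310660 by 17431 grid is. xtabs() with sparse = TRUE is another route to the same kind of object.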
jim holtman
2010-Dec-09 10:49 UTC
[R] Error in vector("integer", length) : vector size cannot be NA
Try using 'sqldf' to get your result:

    library(sqldf)
    sqldf("select alpha_id, beta_id, count(*) from gc group by alpha_id, beta_id")

You might also try 'data.table'.
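For the 'data.table' suggestion, a grouped count equivalent to the SQL query might look like the sketch below. This is an illustration under assumptions rather than code from the thread: it assumes the data.table package is installed, types in the six example rows by hand, and adds one repeated row (9455, 3), which is not in the original data, so that a count above 1 shows up.

```r
library(data.table)  # assumption: data.table is installed

# Example rows from the post, plus one repeated pair added for illustration
gc <- data.table(
  alpha_id = c(142053, 9454, 295618, 42691, 389224, 9455, 9455),
  beta_id  = c(1, 1, 2, 2, 3, 3, 3)
)

# .N is the number of rows in each (alpha_id, beta_id) group
counts <- gc[, .N, by = .(alpha_id, beta_id)]
counts
```

The result is a long table with one row per observed pair, so its size is bounded by the number of input rows and not by the number of possible alpha_id/beta_id combinations.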
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?