> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of S. Few
> Sent: Thursday, September 10, 2009 1:46 PM
> To: r-help at r-project.org
> Subject: [R] R 2.9.2 memory max - object vector size
>
> Me:
>
> Win XP
> 4 gig ram
> R 2.9.2
>
> library(foreign) # to read/write SPSS files
> library(doBy) # for summaryBy
> library(RODBC)
> setwd("C:\\Documents and Settings\\............00909BR")
> gc()
> memory.limit(size=4000)
>
> ## PROBLEM:
>
> I have memory limit problems. R and otherwise. My dataframes for
> merging or subsetting are about 300k to 900k records.
> I've had errors such as vector size too large. gc() was done.....reset
> workspace, etc.
>
> This fails:
>
> y$pickseq<-with(y,ave(as.numeric(as.Date(timestamp)),id,FUN=seq))
If any values in id are singletons, then the call to
seq(timestamp[id=="singleton"]) returns a vector whose length is
timestamp[id=="singleton"] (not the length of that, but the value of that).
as.numeric(as.Date("2009-09-10")) is 14497, so you might have a lot of
14497-long vectors being created (and thrown away, unused except for their
initial value). Using seq_along instead of seq would take care of that
potential problem. E.g.,
> d1 <- data.frame(x=c(2,3,5e9,4,5), id=c("A","B","B","B","A"))
> d2 <- data.frame(x=c(2,3,5e9,4,5), id=c("A","B","C","B","A"))
> # d1$id has no singletons; d2$id does, where d2$x is huge
> with(d1, ave(x, id, FUN=seq))
[1] 1 1 2 3 2
> with(d2, ave(x, id, FUN=seq))
Error in 1L:from : result would be too long a vector
> with(d2, ave(x, id, FUN=seq_along))
[1] 1 1 1 2 2
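Applied to your original call, the fix is a one-word change: FUN=seq becomes
FUN=seq_along. A minimal sketch on a made-up y (only the id and timestamp
column names come from your post; the values are invented):

```r
# Toy stand-in for the real data frame y; the id and timestamp column
# names are from the original post, the values are made up.
y <- data.frame(id = c("A", "B", "B", "A"),
                timestamp = c("2009-09-10", "2009-09-11",
                              "2009-09-12", "2009-09-13"))

# seq_along uses only the length of each group, never the (possibly huge)
# values in it, so no oversized vectors are allocated.
y$pickseq <- with(y, ave(as.numeric(as.Date(timestamp)), id, FUN = seq_along))
y$pickseq   # 1 1 2 2 : row numbers within each id, in original row order
```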
If your intent is to create a vector of within-group sequence numbers,
then there are more efficient ways to do it. E.g., with the following
functions
withinGroupSeq <- function(x) {
    x <- as.factor(x)
    retval <- integer(length(x))
    retval[order(as.integer(x))] <- Sequence(table(x))
    retval
}
# Sequence is like base::sequence but should use less memory
# by avoiding the list that sequence's lapply call makes.
Sequence <- function(nvec) {
    seq_len(sum(nvec)) - rep(cumsum(c(0L, nvec[-length(nvec)])), nvec)
}
you can get the same result as ave(FUN=seq_along) in less time and,
I suspect, less memory:
> withinGroupSeq(d1$id)
[1] 1 1 2 3 2
> withinGroupSeq(d2$id)
[1] 1 1 1 2 2
Base R may have a function for that already.
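For what it's worth, here is a quick self-contained sanity check (on invented
data) that Sequence reproduces base::sequence and that withinGroupSeq agrees
with the ave(FUN=seq_along) result; the definitions are repeated so the
snippet runs on its own:

```r
# Definitions repeated from above so this snippet is self-contained.
Sequence <- function(nvec) {
    seq_len(sum(nvec)) - rep(cumsum(c(0L, nvec[-length(nvec)])), nvec)
}
withinGroupSeq <- function(x) {
    x <- as.factor(x)
    retval <- integer(length(x))
    retval[order(as.integer(x))] <- Sequence(table(x))
    retval
}

# Sequence agrees with base::sequence: both give 1 2 1 2 3 for c(2, 3)
stopifnot(identical(Sequence(c(2L, 3L)), sequence(c(2L, 3L))))

# withinGroupSeq agrees with ave(FUN = seq_along) on random invented ids
set.seed(1)
id <- sample(letters, 1e4, replace = TRUE)
stopifnot(identical(withinGroupSeq(id),
                    ave(seq_along(id), id, FUN = seq_along)))
```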
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
>
> Any clues?
>
> Is this 2.9.2?
>
> Skipping forward, should I download version R 2.8 or less?
>
> Thanks!
> Steve
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>