Dear List:

I am running into a memory issue that I haven't noticed before. I am running a simulation with the code below. I have increased R's memory limit to 712MB and have a total of 1GB on my machine.

What appears to be happening is this: I run a simulation in which I create 1,000 datasets with a sample size of 100, then run each dataset through gls() and obtain some estimates. This works fine. But when I check how much memory is being used in Windows, I see that it does not go down once the analysis is complete, so I must quit R before I can perform another analysis.

For example, before starting the first simulation the Windows task manager tells me I am using 200MB of memory. After running the first simulation it may go up to 500MB. I then try to run another simulation with a larger sample size, but I quickly run out of memory, because usage starts at 500MB and increases from there until the simulation halts.

So it appears that R does not release memory after intensive analyses but accumulates it. Is this correct? If so, could it be due to inefficient code? Or is this an issue specific to Windows? I didn't see this in the FAQ section on memory or in my searches on the web, and I'm not sure how to work more efficiently here.

Thanks
Harold
R 2.0
Windows XP

# Housekeeping
library(MASS)
library(nlme)
mu <- c(100, 150, 200, 250)
Sigma <- matrix(c(400,  80,  80,  80,
                   80, 400,  80,  80,
                   80,  80, 400,  80,
                   80,  80,  80, 400), 4, 4)
mu2 <- c(0, 0, 0)
Sigma2 <- diag(16, 3)
sample.size <- 100
N <- 1000                      # number of datasets

# Take a draw from the VL distribution
vl.error <- mvrnorm(n = N, mu2, Sigma2)

# Step 1: Create data
Data <- lapply(seq(N), function(x)
    as.data.frame(cbind(1:10, mvrnorm(n = sample.size, mu, Sigma))))

# Step 2: Add vertical linking error
for (i in seq(along = Data)) {
    Data[[i]]$V6 <- Data[[i]]$V2
    Data[[i]]$V7 <- Data[[i]]$V3 + vl.error[i, 1]
    Data[[i]]$V8 <- Data[[i]]$V4 + vl.error[i, 2]
    Data[[i]]$V9 <- Data[[i]]$V5 + vl.error[i, 3]
}

# Step 3: Restructure for longitudinal analysis
long <- lapply(Data, function(x)
    reshape(x, idvar = "Data[[i]]$V1",
            varying = list(names(x)[2:5], names(x)[6:9]),
            v.names = c("score.1", "score.2"), direction = "long"))

# Step 4: Run GLS
glsrun1 <- lapply(long, function(x)
    gls(score.1 ~ I(time - 1), data = x,
        correlation = corAR1(form = ~ 1 | V1), method = "ML"))

glsrun2 <- lapply(long, function(x)
    gls(score.2 ~ I(time - 1), data = x,
        correlation = corAR1(form = ~ 1 | V1), method = "ML"))

# Step 5: Extract intercepts and slopes
int1 <- lapply(glsrun1, function(x) x$coefficient[1])
slo1 <- lapply(glsrun1, function(x) x$coefficient[2])
int2 <- lapply(glsrun2, function(x) x$coefficient[1])
slo2 <- lapply(glsrun2, function(x) x$coefficient[2])

# Step 6: Compute SD of intercepts and slopes
int.sd1 <- sapply(glsrun1, function(x) x$coefficient[1])
slo.sd1 <- sapply(glsrun1, function(x) x$coefficient[2])
int.sd2 <- sapply(glsrun2, function(x) x$coefficient[1])
slo.sd2 <- sapply(glsrun2, function(x) x$coefficient[2])

cat("Original Standard Errors", "\n",
    "Intercept", "\t", sd(int.sd1), "\n",
    "Slope", "\t", "\t", sd(slo.sd1), "\n")
cat("Modified Standard Errors", "\n",
    "Intercept", "\t", sd(int.sd2), "\n",
    "Slope", "\t", "\t", sd(slo.sd2), "\n")
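As a rough way of seeing where that memory goes from inside R, one could run something like the following after the simulation above has finished (a sketch based on the object names in the code above, not part of the original post):

sizes <- c(Data    = object.size(Data),
           long    = object.size(long),
           glsrun1 = object.size(glsrun1),
           glsrun2 = object.size(glsrun2))
round(sizes / 1024^2, 1)   # approximate size of each stored list, in MB
gc()                       # R's own accounting, versus what the task manager shows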
On Sat, 8 Jan 2005 16:38:31 -0500, "Doran, Harold" <HDoran at air.org> wrote:

> Dear List:
>
> I am running into a memory issue that I haven't noticed before. I am
> running a simulation with the code below. I have increased R's memory
> limit to 712MB and have a total of 1GB on my machine.
>
> What appears to be happening is this: I run a simulation in which I
> create 1,000 datasets with a sample size of 100, then run each dataset
> through gls() and obtain some estimates. This works fine. But when I
> check how much memory is being used in Windows, I see that it does not
> go down once the analysis is complete, so I must quit R before I can
> perform another analysis.

If you ask Windows how much memory is being used, you'll likely get an incorrect answer. R may not release memory back to the OS, but it may still be available for re-use within R. Call gc() to see how much memory R thinks is in use.

> For example, before starting the first simulation the Windows task
> manager tells me I am using 200MB of memory. After running the first
> simulation it may go up to 500MB. I then try to run another simulation
> with a larger sample size, but I quickly run out of memory, because
> usage starts at 500MB and increases from there until the simulation
> halts.

The difficulty you're running into may be memory fragmentation. When you run with a larger sample size, R will try to allocate larger chunks than it did originally. If the "holes" created when the original simulation is deleted are too small, R will need to ask Windows for new memory to store things in.

You could try deleting everything in your workspace before running the second simulation; this should reduce the fragmentation. Or you could run the big simulation first, so that the smaller one fits in the holes it leaves behind.

Duncan Murdoch
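A minimal sketch along those lines, to run between the two simulations; only gc() is mentioned in the reply above, the other calls are added here as an illustration (memory.size() and memory.limit() are Windows-only):

gc()               # how much memory R itself thinks is in use after simulation 1
rm(list = ls())    # delete everything in the workspace before the next run
gc()               # collect the garbage; the freed space is available for re-use within R
memory.size()      # Windows-only: MB currently allocated to R
memory.limit()     # Windows-only: the current memory limit in MB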
One hint: R rarely releases memory to the OS, especially under Windows, so do not expect the usage reported by Windows to go down.

One possibility is that you are storing lots of results and not removing them. You don't need to store all the gls fits, just the parts you need. You can use gc(), memory.profile() and object.size() to see where memory is being used.

On Sat, 8 Jan 2005, Doran, Harold wrote:

> So it appears that R does not release memory after intensive analyses
> but accumulates it. Is this correct? If so, could it be due to
> inefficient code? Or is this an issue specific to Windows?
>
> [rest of the original message and code snipped]
--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
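To illustrate the "keep just the parts you need" suggestion, here is one possible rewrite of Steps 4-6 of the original code. It is a sketch rather than a tested drop-in: the gls() calls are copied from the original post, only the coefficients are kept, and the names coefs1/coefs2 are new.

# Steps 4-6 combined: fit each gls, keep only its two coefficients, and let
# the full fit object be discarded as soon as coef() has been taken from it.
coefs1 <- t(sapply(long, function(x)
    coef(gls(score.1 ~ I(time - 1), data = x,
             correlation = corAR1(form = ~ 1 | V1), method = "ML"))))
coefs2 <- t(sapply(long, function(x)
    coef(gls(score.2 ~ I(time - 1), data = x,
             correlation = corAR1(form = ~ 1 | V1), method = "ML"))))

# Each result is an N x 2 matrix: column 1 holds intercepts, column 2 slopes.
cat("Original Standard Errors", "\n",
    "Intercept", "\t", sd(coefs1[, 1]), "\n",
    "Slope", "\t", "\t", sd(coefs1[, 2]), "\n")
cat("Modified Standard Errors", "\n",
    "Intercept", "\t", sd(coefs2[, 1]), "\n",
    "Slope", "\t", "\t", sd(coefs2[, 2]), "\n")

# The wide datasets are no longer needed once 'long' exists, so they can go too.
rm(Data)
gc()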