thr3ads.net - R help - [R] R runtime performance and memory usage [Nov 2015]

Sasikumar Kandhasamy

2015-Nov-16 20:25 UTC

[R] R runtime performance and memory usage

Hi All,

I have a couple of questions about R run-time performance. I have the R-3.2.2
package compiled for MIPS64 and am running it on my Linux machine with a
MIPS64 processor (core speed 1.5GHz), and I am observing the following behavior:

1. Applying a "linear regression model" (lm) to 1MB of data (1
column of 250K records) takes ~6 seconds to complete. Any idea whether this is
expected behavior? If not, can you please share any suggestions or options
to improve it?

2. Also, the R process's runtime virtual memory increases by 40MB after
applying the linear model to the 1MB of data. Is this also expected behavior? If it
is, can you please share some insight into the memory usage?

Thanks in advance.

Regards
Sasi

Bert Gunter

2015-Nov-16 20:44 UTC

Do your own homework.
Search Google for "memory usage in R", etc.
You should have no trouble finding what you need there.

Cheers,
Bert



Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll



William Dunlap

2015-Nov-16 22:04 UTC

You cannot do a linear regression with one column of data - there must
be at least one response column and one predictor.  By default, lm
throws in a constant term which gives you a second predictor.  If your
predictor is categorical, you get a new column for all but the first
unique value in it.
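
A minimal sketch of how the design matrix grows, using a small made-up data frame (the names d, y, x and g are hypothetical, and the default treatment contrasts are assumed):

```r
# Hypothetical toy data; y, x and g are illustrative names only.
set.seed(1)
d <- data.frame(
  y = rnorm(6),
  x = rnorm(6),
  g = factor(c("a", "b", "c", "a", "b", "c"))
)

# Numeric predictor: lm() would use an intercept column plus x.
X_num <- model.matrix(y ~ x, data = d)
print(colnames(X_num))

# Categorical predictor: an intercept plus one indicator column for
# every level except the first (default treatment contrasts).
X_cat <- model.matrix(y ~ g, data = d)
print(colnames(X_cat))
```

With a three-level factor the design matrix therefore has three columns: the constant term and two indicators.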

lm() deals only with double precision data, at 8 bytes/number.  Thus
250k numbers occupy 2 million bytes.  Your three columns (in the
non-categorical-predictor case) take up 6 million bytes.
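
The arithmetic can be checked directly. This is only a sketch: n is taken from the 250K records mentioned in the question, and the three columns are assumed to be the response, one numeric predictor and the constant term.

```r
n <- 250000L                  # number of records, from the question
bytes_per_double <- 8         # size of one double precision number

one_column <- n * bytes_per_double
print(one_column)             # bytes of raw data in one column

three_columns <- 3 * one_column
print(three_columns)          # response + predictor + constant term

# object.size() reports slightly more than the raw data because
# every R vector also carries a small header.
y <- numeric(n)
print(object.size(y))
```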

lm()'s output contains several columns the size of the response
variable: residuals, effects, and fitted.values.  It also contains the
QR decomposition of the design matrix (the size of all the predictor
columns together).

There are also some temporary variables generated in the course of the
computation.

So your observed 40 MB memory usage seems reasonable.

Use the object.size() function to see how big objects are and str() to
look at their structure.
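
A rough sketch of that inspection, on simulated data of the same length (the data, the seed and the names d, fit and sizes are assumptions for illustration, not the original poster's data):

```r
# Simulated stand-in for the 250K-record data set.
set.seed(42)
n <- 250000L
d <- data.frame(x = rnorm(n))
d$y <- 1 + 2 * d$x + rnorm(n)

fit <- lm(y ~ x, data = d)

# Total size of the fitted model object.
print(object.size(fit), units = "MB")

# Size of each component, largest first: residuals, effects,
# fitted.values, the QR decomposition and the model frame dominate.
sizes <- sapply(fit, object.size)
print(sort(sizes, decreasing = TRUE))

# Top-level structure of the lm object.
str(fit, max.level = 1)
```

The exact figures depend on the R version and platform, and this counts only the returned object, not the temporaries created while fitting.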

My laptop with a 2.5 GHz Intel i7 processor takes a quarter second to
fit a simple linear model with one numeric predictor and a constant
term.  6 seconds sounds slow.  Is that CPU or elapsed time (use
system.time() to see)?
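
A minimal timing sketch, again on simulated data of the same size (x, y, fit and timing are hypothetical names, and the result will differ from machine to machine):

```r
set.seed(42)
n <- 250000L
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# system.time() reports user CPU, system CPU and elapsed (wall-clock)
# time in seconds for the expression it evaluates.
timing <- system.time(fit <- lm(y ~ x))
print(timing)

# CPU time spent by the R process itself versus wall-clock time.
cpu_seconds <- timing[["user.self"]] + timing[["sys.self"]]
elapsed_seconds <- timing[["elapsed"]]
print(c(cpu = cpu_seconds, elapsed = elapsed_seconds))
```

If elapsed time is much larger than CPU time, the process is waiting on something other than computation, such as swapping or other load on the machine.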

Bill Dunlap
TIBCO Software
wdunlap tibco.com


Sasikumar Kandhasamy

2015-Nov-16 23:25 UTC

Thanks a lot Bill & Bert.

Hi Bill,

Sorry, I was wrong about the number of records. Actually, I am using
two-dimensional data of 250K records each. And regarding CPU usage, it was the
elapsed time. In fact, I have pinned one core to run R.

Thanks & Regards
Sasi

