thr3ads.net - R help - [R] Speeding up code [Nov 2013]

Amie Hunter

2013-Nov-23 21:39 UTC

[R] Speeding up code

Hello R experts, 

I'm new to R and would like to know the best way to speed up my code.
I've read that you can vectorize code, but I'm unsure how to apply that
to mine.


df <- data.frame(31790,31790)

for (i in 1:31790)
{
  for (j in i:31790)
  {
    ken<-cor(cldm[i,3:17],cldm[j,3:17], method="kendall", use="pairwise")
    dis2<-deg.dist(cldm[i,2],cldm[i,1],cldm[j,2],cldm[j,1])

    df[i,j]<-ifelse(dis2<=500,ken,NA)
  }
}
df

Thanks!

Jeff Newmiller

2013-Nov-24 07:43 UTC

What is cldm?

We (and therefore you, to verify that we can) should be able to copy the example
from the email and paste it into a newly-started instance of R. Not having some
example data similar to yours to work with puts us at a major disadvantage. It
would also be helpful to know what you are trying to accomplish (description).
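
A reproducible stand-in for the data might look like the sketch below. It is only a guess at the structure: it assumes cldm is a numeric matrix with latitude in column 1, longitude in column 2 (the order in which the original post passes them to deg.dist), and fifteen measurement columns in 3:17. The size n, the coordinate ranges, and the random values are invented for illustration.

```r
# Hypothetical stand-in for cldm: 50 sites instead of 31790.
set.seed(1)
n <- 50
cldm <- cbind(
  lat = runif(n, 40, 45),           # column 1: latitude in degrees (assumed)
  lon = runif(n, -100, -90),        # column 2: longitude in degrees (assumed)
  matrix(rnorm(n * 15), nrow = n)   # columns 3:17: fifteen numeric variables
)

str(cldm)   # inspect what was actually created
dim(cldm)   # rows and columns
```

Posting something of this shape along with the loop lets readers paste the whole example into a fresh R session.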

You might want to use the str function to understand what each object you are
creating really is. I don't know what you want the "df" object to be, but a
data frame of two values in default-named columns is unusual. You may be
confusing matrices with data frames?

(Note that there is a function called df in the core libraries, so you might
want to avoid using that name to avoid confusion.)
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

MacQueen, Don

2013-Nov-25 18:45 UTC

Ditto to everything Jeff Newmiller said, but I'll take it a little further.

I'm guessing that with
   df <- data.frame(31790,31790)
you thought you were creating something with 31790 rows and 31790 columns.
You weren't. You were creating a data frame with one row and two columns:

> data.frame(31790,31790)
  X31790 X31790.1
1  31790    31790

Given that in your loop you assign values to df[i,j],
and having started with just one row and two columns, it follows
that every time you assign to df[i,j] you are increasing
the size of your data frame, and that will slow things down.
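
The growth can be seen directly on a small scale. The sketch below uses a made-up name, res_df, and made-up values; it only shows that assigning past the current last column or last row makes the data frame larger.

```r
res_df <- data.frame(31790, 31790)
dim(res_df)            # one row, two columns

res_df[1, 3] <- 0.5    # assigning past the last column adds a column
res_df[2, 3] <- 0.1    # assigning past the last row adds a row
dim(res_df)            # two rows, three columns
```

Each such assignment makes R rebuild the object, which is why doing it inside a double loop is slow.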

Initialize with a matrix (I'll call it 'res' instead of 'df'):

  res <- matrix(NA, 31790,31790)

Then inside your loop, you can use

   if (dis2<=500) res[i,j] <- ken

No need to deal with 'else', since the matrix is initialized
with NA.
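
It may be worth checking how large such a matrix is before allocating it. The arithmetic below assumes the results are stored as doubles (8 bytes per cell). The second half is a small-scale illustration: a matrix filled with plain NA is logical, while NA_real_ gives a double matrix from the start. The names small and small_num are invented for the example.

```r
n <- 31790
n^2                # number of cells in an n-by-n matrix
n^2 * 8 / 2^30     # size in GiB when stored as doubles: about 7.5

# Storage type of the initial matrix, shown on a 4-by-4 example:
small <- matrix(NA, 4, 4)
storage.mode(small)        # "logical"

small_num <- matrix(NA_real_, 4, 4)
storage.mode(small_num)    # "double"
```

If that is more memory than the machine has, storing only the pairs that pass the distance test may be a better fit than a full matrix.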

The ifelse() function was a less than ideal choice,
since it is designed for vector arguments, and your value, dis2,
appears to always have length = 1. You could have used
  df[i,j] <- if (dis2 <= 500) ken else NA
but as I mentioned above, if you initialize to NA there's no need
to handle the 'else' case inside the loop.
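
A small comparison of the two forms, using made-up values for dis2 and ken, might look like this. Both give the same answer for a single value; ifelse() earns its keep only when the test is a vector.

```r
dis2 <- 320    # made-up distance
ken  <- 0.42   # made-up correlation

a <- ifelse(dis2 <= 500, ken, NA)    # vectorised function applied to one value
b <- if (dis2 <= 500) ken else NA    # plain scalar conditional
identical(a, b)

# Where ifelse() is the natural tool: a whole vector of distances
d <- c(120, 800, 499)
ifelse(d <= 500, "near", "far")
```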

It may be possible to vectorize your loop, but I kind of doubt it,
considering that you're using the cor() followed by the deg.dist()
function at every iteration.

However, you could calculate the dis2 value first, and then calculate
ken only when dis2 is <= 500. You're calculating ken even when it's not
needed. Avoiding that should speed things up.
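
Put together, the reordered loop could be sketched as follows. This is an illustration under assumptions, not the poster's actual setup: dist_km() is a hypothetical haversine stand-in for deg.dist() (assumed to return great-circle distance in km), and cldm is a small random matrix with latitude in column 1, longitude in column 2, and data in columns 3:17.

```r
# Hypothetical stand-in for deg.dist(): haversine distance in km.
dist_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Made-up data: 30 sites instead of 31790.
set.seed(1)
n <- 30
cldm <- cbind(runif(n, 40, 50), runif(n, -100, -80),
              matrix(rnorm(n * 15), nrow = n))

res <- matrix(NA_real_, n, n)   # preallocated, already filled with NA
for (i in seq_len(n)) {
  for (j in i:n) {
    # Cheap test first ...
    dis2 <- dist_km(cldm[i, 2], cldm[i, 1], cldm[j, 2], cldm[j, 1])
    if (dis2 <= 500) {
      # ... so cor() runs only for pairs that are close enough
      res[i, j] <- cor(cldm[i, 3:17], cldm[j, 3:17],
                       method = "kendall", use = "pairwise.complete.obs")
    }
  }
}
```

How much time this saves depends on how many pairs fall within 500 km; if most do, the gain will be small.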

I don't know what deg.dist() is doing, but if it is calculating distances
between points, there are functions for doing that on whole bunches
of points at once. Perhaps your data could be rearranged to work
with one of those.
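
One way to compute all the distances in a single step, using only base R, is sketched below. It again relies on the hypothetical dist_km() haversine function as a stand-in for deg.dist(), and on invented lon and lat vectors. Packages such as geosphere (its distm function) provide all-pairs distances as well.

```r
# Hypothetical stand-in for deg.dist(): haversine distance in km, vectorised.
dist_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Made-up coordinates for 30 sites.
set.seed(1)
n <- 30
lat <- runif(n, 40, 50)
lon <- runif(n, -100, -80)

# All pairwise distances at once: outer() passes whole index vectors.
idx <- seq_len(n)
d <- outer(idx, idx, function(i, j) dist_km(lon[i], lat[i], lon[j], lat[j]))

# Index pairs (i <= j) within 500 km; only these need a correlation.
near <- which(d <= 500 & upper.tri(d, diag = TRUE), arr.ind = TRUE)
head(near)
```

For 31790 points the full distance matrix is as large as the result matrix itself, so in practice this would probably have to be done in blocks of rows.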

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

