thr3ads.net - R help - [R] slowness when I use a list comprehension [Jun 2024]

If this information is useful, please help other people find it:
Share via:

Laurent Rhelp

2024-Jun-16 15:27 UTC

[R] slowness when I use a list comprehension

Dear RHelp-list,

 ?? I try to use the package comprehenr to replace a for loop by a list 
comprehension.

 ?I wrote the code but I certainly miss something because it is very 
slower compared to the for loops. May you please explain to me why the 
list comprehension is slower in my case.

Here is my example. I do the calculation of the square difference 
between the values of two vectors vec1 and vec2, the ratio sampling 
between vec1 and vec2 is equal to ratio_sampling. I have to use only the 
500th value of the first serie before doing the difference with the 
value of the second serie (vec2).

Thank you

Best regards

Laurent

library(tictoc)
library(comprehenr)

ratio_sampling <- 500
## size of the first serie
N1 <- 70000
## size of the second serie
N2 <- 100
## mock data
set.seed(123)
vec1 <- rnorm(N1)
vec2 <- runif(N2)


## 1. with the "for" loops

## the square differences will be stored in a vector
S_diff2 <- numeric((N1-(N2-1)*ratio_sampling))
tic()
for( j in 1:length(S_diff2)){
 ? sum_squares <- 0
 ? for( i in 1:length(vec2)){
 ??? sum_squares = sum_squares + ((vec1[(i-1)*ratio_sampling+j] - 
vec2[i])**2)
 ? }
 ? S_diff2[j] <- sum_squares
}
toc()
## 0.22 sec elapsed
which.max(S_diff2)
## 7857

## 2. with the lists comprehension
tic()
S_diff2 <- to_vec(for( j in 1:length(S_diff2)) sum(to_vec(for( i in 
1:length(vec2)) ((vec1[(i-1)*ratio_sampling+j] - vec2[i])**2))))
toc()
## 25.09 sec elapsed
which.max(S_diff2)
## 7857

@vi@e@gross m@iii@g oii gm@ii@com

2024-Jun-16 16:33 UTC

head link

[R] slowness when I use a list comprehension

Laurent,

Thank you for introducing me to a package I did not know existed as I use
features like list comprehension in python all the time and could see using it
in R now that I know it is available.

As to why you see your example as slow, I see you used a fairly complex and
nested expression and wonder if it was a better way to go. As you are dealing
with an interpreter doing delayed evaluation, I can imagine reasons it can be
slow. But note the package comprehenr may not be designed to be more efficient
than loops or of the more built-in functional methods that can be faster. The
package is there perhaps more as a compatibility helper that allows you to write
closer to the python style and perhaps re-shapes what you wrote into a set of
instructions in more native R.

Just for comparison, in python, things like comprehensions for list or
dictionaries or tuples often are syntactic sugar and the interpreter may simply
rewrite them more like the first program you typed and evaluates that. The
comprehensions are more designed for users who can think another way and write
things more compactly as one-liners. Depending on implementations, they may be
faster or slower on some examples.

I am not saying there is nothing else that is slowing it down for you. I am
suggesting that using the feature as currently implemented may not be an
advantage except in your thought process. It may be it could be improved, such
as by replacing more functionality out of R and into compiled languages as has
been done for many packages.

Avi

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Laurent Rhelp
Sent: Sunday, June 16, 2024 11:28 AM
To: r-help at r-project.org
Subject: [R] slowness when I use a list comprehension

Dear RHelp-list,

    I try to use the package comprehenr to replace a for loop by a list 
comprehension.

  I wrote the code but I certainly miss something because it is very 
slower compared to the for loops. May you please explain to me why the 
list comprehension is slower in my case.

Here is my example. I do the calculation of the square difference 
between the values of two vectors vec1 and vec2, the ratio sampling 
between vec1 and vec2 is equal to ratio_sampling. I have to use only the 
500th value of the first serie before doing the difference with the 
value of the second serie (vec2).

Thank you

Best regards

Laurent

library(tictoc)
library(comprehenr)

ratio_sampling <- 500
## size of the first serie
N1 <- 70000
## size of the second serie
N2 <- 100
## mock data
set.seed(123)
vec1 <- rnorm(N1)
vec2 <- runif(N2)


## 1. with the "for" loops

## the square differences will be stored in a vector
S_diff2 <- numeric((N1-(N2-1)*ratio_sampling))
tic()
for( j in 1:length(S_diff2)){
   sum_squares <- 0
   for( i in 1:length(vec2)){
     sum_squares = sum_squares + ((vec1[(i-1)*ratio_sampling+j] - 
vec2[i])**2)
   }
   S_diff2[j] <- sum_squares
}
toc()
## 0.22 sec elapsed
which.max(S_diff2)
## 7857

## 2. with the lists comprehension
tic()
S_diff2 <- to_vec(for( j in 1:length(S_diff2)) sum(to_vec(for( i in 
1:length(vec2)) ((vec1[(i-1)*ratio_sampling+j] - vec2[i])**2))))
toc()
## 25.09 sec elapsed
which.max(S_diff2)
## 7857

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Gabor Grothendieck

2024-Jun-16 18:54 UTC

head link

[R] slowness when I use a list comprehension

This can be vectorized.  Try

ix <- seq_along(vec2)
S_diff2 <- sapply(seq_len(N1-(N2-1)*ratio_sampling), \(j)
sum((vec1[(ix-1)*ratio_sampling+j] - vec2[ix])**2))

On Sun, Jun 16, 2024 at 11:27?AM Laurent Rhelp <laurentRHelp at free.fr>
wrote:>
> Dear RHelp-list,
>
>     I try to use the package comprehenr to replace a for loop by a list
> comprehension.
>
>   I wrote the code but I certainly miss something because it is very
> slower compared to the for loops. May you please explain to me why the
> list comprehension is slower in my case.
>
> Here is my example. I do the calculation of the square difference
> between the values of two vectors vec1 and vec2, the ratio sampling
> between vec1 and vec2 is equal to ratio_sampling. I have to use only the
> 500th value of the first serie before doing the difference with the
> value of the second serie (vec2).
>
> Thank you
>
> Best regards
>
> Laurent
>
> library(tictoc)
> library(comprehenr)
>
> ratio_sampling <- 500
> ## size of the first serie
> N1 <- 70000
> ## size of the second serie
> N2 <- 100
> ## mock data
> set.seed(123)
> vec1 <- rnorm(N1)
> vec2 <- runif(N2)
>
>
> ## 1. with the "for" loops
>
> ## the square differences will be stored in a vector
> S_diff2 <- numeric((N1-(N2-1)*ratio_sampling))
> tic()
> for( j in 1:length(S_diff2)){
>    sum_squares <- 0
>    for( i in 1:length(vec2)){
>      sum_squares = sum_squares + ((vec1[(i-1)*ratio_sampling+j] -
> vec2[i])**2)
>    }
>    S_diff2[j] <- sum_squares
> }
> toc()
> ## 0.22 sec elapsed
> which.max(S_diff2)
> ## 7857
>
> ## 2. with the lists comprehension
> tic()
> S_diff2 <- to_vec(for( j in 1:length(S_diff2)) sum(to_vec(for( i in
> 1:length(vec2)) ((vec1[(i-1)*ratio_sampling+j] - vec2[i])**2))))
> toc()
> ## 25.09 sec elapsed
> which.max(S_diff2)
> ## 7857
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Reasonably Related Threads

Search for more reasonably related threads

R help - Jun 2024 - slowness when I use a list comprehension

[R] slowness when I use a list comprehension

[R] slowness when I use a list comprehension

[R] slowness when I use a list comprehension

Reasonably Related Threads