thr3ads.net - R help - [R] in continuation with the earlier R puzzle [Jul 2010]


Raghu

2010-Jul-12 14:09 UTC

[R] in continuation with the earlier R puzzle

When I just run a for loop it works. But if I have to run a for loop
every time for large vectors, I might as well use C or any other language.
Isn't the reason R is powerful that it can handle large vectors without each
element being manipulated individually? Please let me know where I am wrong.

for(i in 1:length(news1o)){
  if(news1o[i]>s2o[i])
    s[i] <- 1
  else
    s[i] <- -1
}

-- 
'Raghu'


Alain Guillet

2010-Jul-12 16:12 UTC

[R] in continuation with the earlier R puzzle

I don't know what is wrong with your code but I believe you should use 
ifelse instead of a for loop:

s <- ifelse(news1o > s2o, 1 , -1 )
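
As a quick check that the two forms agree, here is a minimal sketch on made-up data. The vector values are illustrative assumptions, since the original post does not show its data, and `s_loop` / `s_vec` are names chosen only for this example.

```r
# Illustrative data (assumed; the original post does not show its vectors)
news1o <- c(0.2, 0.9, 0.5, 0.7)
s2o    <- c(0.4, 0.1, 0.5, 0.3)

# Loop version, with the result preallocated to its final length
s_loop <- numeric(length(news1o))
for (i in seq_along(news1o)) {
  if (news1o[i] > s2o[i]) {
    s_loop[i] <- 1
  } else {
    s_loop[i] <- -1
  }
}

# Vectorised version: one call handles every element
s_vec <- ifelse(news1o > s2o, 1, -1)

print(s_loop)
print(s_vec)
print(identical(s_loop, s_vec))
```

Note that a tie (the third element here) falls into the else branch in both versions, so it maps to -1.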


Alain

On 12-Jul-10 16:09, Raghu wrote:
> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }
>
-- 
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
Bureau c.316
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve
Belgium

tel: +32 10 47 30 50

Seeliger.Curt at epamail.epa.gov

2010-Jul-12 16:14 UTC

[R] in continuation with the earlier R puzzle

> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
> 
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }
You might give ifelse() a shot here. 
s <- ifelse(news1o > s2o, 1, -1)

Learning to think in vectors is important in R, just like thinking in sets 
is important for SQL, or thinking in rows and steps is important in SAS.

cur
-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.curt@epa.gov
541/754-4638

David Winsemius

2010-Jul-12 16:24 UTC

[R] in continuation with the earlier R puzzle

On Jul 12, 2010, at 10:09 AM, Raghu wrote:
> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }
Perhaps:

s <- 2*( news1o > s2o[1:length(news1o)] ) - 1

...which I think will throw errors under pretty much the same  
conditions that would cause errors in that loop.
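
One case where the two forms seem to part ways is when s2o is shorter than news1o. The sketch below uses made-up vectors of unequal length (an assumption for illustration only): indexing past the end of s2o gives NA, so the vectorised expression returns NA for that position, while the loop stops with an error because if() needs TRUE or FALSE.

```r
# Illustrative vectors of unequal length (an assumption for this sketch)
news1o <- c(1, 2, 3)
s2o    <- c(0, 5)

# Indexing past the end of s2o yields NA rather than an error
s_vec <- 2 * (news1o > s2o[1:length(news1o)]) - 1
print(s_vec)

# The loop fails at the first NA comparison; tryCatch records that it did
s_loop <- numeric(length(news1o))
loop_result <- tryCatch({
  for (i in 1:length(news1o)) {
    if (news1o[i] > s2o[i]) s_loop[i] <- 1 else s_loop[i] <- -1
  }
  "completed"
}, error = function(e) "error")
print(loop_result)
```

So it may be worth checking for NA in the vectorised result (for example with anyNA() or is.na()) when the input lengths are not guaranteed to match.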

-- 
David Winsemius, MD
West Hartford, CT

(Ted Harding)

2010-Jul-12 16:36 UTC

[R] in continuation with the earlier R puzzle

On 12-Jul-10 14:09:30, Raghu wrote:
> When I just run a for loop it works. But if I am going to
> run a for loop every time for large vectors I might as well
> use C or any other language.
> The reason R is powerful is because it can handle large vectors
> without each element being manipulated? Please let me know where
> I am wrong.
> 
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }
> 
> -- 
> 'Raghu'
Many operations over the whole length of vectors can be done
in "vectorised" form, in which an entire vector is changed
in one operation based on the values of the separate elements
of other vectors, which are also all taken into account in a single
operation. What happens "behind the scenes" is that the
element-by-element operations are performed by a function
in a precompiled (usually from C) library. Hence R already
does what you are suggesting as a "might as well" alternative!

Below is an example, using long vectors. The first case is a
copy of your R loop above (with some additional initialisation
of the vectors). The second achieves the same result in the
"vectorised" form.

  news1o <- runif(1000000)
  s2o    <- runif(1000000)
  s      <- numeric(length(news1o))

  proc.time()
  #    user  system elapsed 
  #   1.728   0.680 450.257 
  for(i in 1:length(news1o)){  ### Using a loop
    if(news1o[i]>s2o[i])
    s[i]<-   1
    else
    s[i]<- (-1)
  }
  proc.time()
  #    user  system elapsed
  #  11.184   0.756 460.340 
  s2 <- 2*(news1o > s2o) - 1   ### Vectorised
  proc.time()
  #    user  system elapsed 
  #  11.348   0.852 460.663

  sum(s2 != s)
  # [1] 0                      ### Results identical

Result: The loop took (11.184 -  1.728) = 9.456 seconds,
  Vectorised, it took (11.348 - 11.184) = 0.164 seconds.

Loop/Vector = (11.184 - 1.728)/(11.348 - 11.184) = 57.65854

i.e. nearly 60 times as long.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jul-10                                       Time: 17:36:07
------------------------------ XFMail ------------------------------

Joshua Wiley

2010-Jul-12 18:40 UTC

[R] in continuation with the earlier R puzzle

I wanted to point out one thing that Ted said, about initializing the
vectors ('s' in your example).  This can make a dramatic speed
difference if you are using a for loop (the difference is negligible
with vectorized computations).

Also, a lot of benchmarks have been flying around, each from a
different system and using random numbers without identical seeds.  So
to provide an overall comparison of all the methods I saw here plus
demonstrate the speed difference for initializing a vector (if you
know its desired length in advance), I ran these benchmarks.

Notes:
I did not want to interfere with your objects so I used different
names. The equivalencies are: news1o = x; s2o = y; s = z.
system.time() automatically calculates the difference in
proc.time() between start and finish.
> ##R version info
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-mingw32
#snipped
>
> ##Some Sample Data
> set.seed(10)
> x <- rnorm(10^6)
> set.seed(15)
> y <- rnorm(10^6)
>
> ##Benchmark 1
> z.1 <- NULL
> system.time(for(i in 1:length(x)) {
+   if(x[i] > y[i]) {
+     z.1[i] <- 1
+   } else {
+     z.1[i] <- -1}
+ }
+             )
   user  system elapsed
1303.83  174.24 1483.74
>
> ##Benchmark 2
> #initialize 'z' at length
> z.2 <- vector("numeric", length = 10^6)
> system.time(for(i in 1:length(x)) {
+   if(x[i] > y[i]) {
+     z.2[i] <- 1
+   } else {
+     z.2[i] <- -1}
+ }
+             )
   user  system elapsed
   3.77    0.00    3.77
>
> ##Benchmark 3
>
> z.3 <- NULL
> system.time(z.3 <- ifelse(x > y, 1, -1))
   user  system elapsed
   0.38    0.00    0.38
>
> ##Benchmark 4
>
> z.4 <- vector("numeric", length = 10^6)
> system.time(z.4 <- ifelse(x > y, 1, -1))
   user  system elapsed
   0.31    0.00    0.31
>
> ##Benchmark 5
>
> system.time(z.5 <- 2*(x > y) - 1)
   user  system elapsed
   0.01    0.00    0.01
>
> ##Benchmark 6
>
> system.time(z.6 <- numeric(length(x))-1)
   user  system elapsed
      0       0       0
> system.time(z.6[x > y] <- 1)
   user  system elapsed
   0.03    0.00    0.03
>
> ##Show that all results are identical
>
> identical(z.1, z.2)
[1] TRUE
> identical(z.1, z.3)
[1] TRUE
> identical(z.1, z.4)
[1] TRUE
> identical(z.1, z.5)
[1] TRUE
> identical(z.1, z.6)
[1] TRUE


I have not replicated these on other systems, but tentatively, it
appears that loops are significantly slower than ifelse(), which in
turn is slower than options 5 and 6.  However, when using the same
test data and the same system, I did not find an appreciable
speed difference between options 5 and 6.
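
For anyone who wants to repeat the comparison on their own machine, the following is a rough sketch of a small harness rather than a rigorous benchmark. The vector length `n` is deliberately smaller than in the runs above so it finishes quickly, and the names `variants`, `results`, and `all_same` are made up for this example. The slow non-preallocated loop is left out.

```r
# Smaller n than the runs above so the sketch finishes quickly (assumed value)
n <- 1e5
set.seed(10); x <- rnorm(n)
set.seed(15); y <- rnorm(n)

# Each entry is one way of computing the same +1 / -1 vector
variants <- list(
  loop_prealloc = function() {
    z <- numeric(n)
    for (i in seq_len(n)) z[i] <- if (x[i] > y[i]) 1 else -1
    z
  },
  ifelse = function() ifelse(x > y, 1, -1),
  arithmetic = function() 2 * (x > y) - 1,
  index_assign = function() {
    z <- numeric(n) - 1
    z[x > y] <- 1
    z
  }
)

# Time each variant once and keep its result for comparison
results <- list()
for (name in names(variants)) {
  elapsed <- system.time(results[[name]] <- variants[[name]]())[["elapsed"]]
  cat(sprintf("%-14s %6.3f s\n", name, elapsed))
}

# Check that every variant produced exactly the same vector
all_same <- all(vapply(results, identical, logical(1), results[[1]]))
print(all_same)
```

Timings from a single run are noisy, so repeating each variant several times would give a fairer picture.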

Cheers,

Josh

On Mon, Jul 12, 2010 at 7:09 AM, Raghu <r.raghuraman at gmail.com> wrote:
> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
>
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Petr PIKAL

2010-Jul-13 11:01 UTC

[R] Re: in continuation with the earlier R puzzle

Hi

r-help-bounces at r-project.org wrote on 12.07.2010 16:09:30:
> When I just run a for loop it works. But if I am going to run a for loop
> every time for large vectors I might as well use C or any other language.
> The reason R is powerful is because it can handle large vectors without each
> element being manipulated? Please let me know where I am wrong.
> 
> for(i in 1:length(news1o)){
> + if(news1o[i]>s2o[i])
> + s[i]<-1
> + else
> + s[i]<--1
> + }
Think in R, not in C. Why use loops when you can use the whole object
directly? It is like drinking beer from snifters. It is possible, but using
pints is preferable and more convenient.

news1o>s2o

gives you a logical vector of the same length,

and you can use it directly for further selection or computation. FALSE is
treated as 0 and TRUE as 1 when the vector is used as a numeric vector,
so

x<-runif(10)
y<-runif(10)

c(-1,1)[(x>y)+1]

selects -1 when FALSE and 1 when TRUE.

Or you can use it in a mathematical operation directly:

(x>y)*2-1
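
To see the coercion at work, here is a small sketch that runs both expressions on the same data and compares them. The seed value and the names `gt`, `by_index`, and `by_arith` are arbitrary choices for this example.

```r
set.seed(1)   # arbitrary seed, only so the run is reproducible
x <- runif(10)
y <- runif(10)

gt <- x > y                      # logical vector, same length as x
print(gt)
print(as.integer(gt))            # FALSE becomes 0, TRUE becomes 1

by_index <- c(-1, 1)[gt + 1]     # index 1 picks -1, index 2 picks 1
by_arith <- gt * 2 - 1           # 0 maps to -1, 1 maps to 1

print(identical(by_index, by_arith))
print(sum(gt))                   # number of positions where x > y
```

Both routes produce a plain numeric vector, so either can be assigned straight to s.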

Regards
Petr
