Hi,
This may also help:
someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8),
                       stage = factor(rep(".", 8), levels = c(".", "J")))
f2 <- function(x) {
  # TRUE where tag_id occurs only once in x and lgth is below 300
  needsChanging <- with(x, is.na(match(tag_id, tag_id[duplicated(tag_id)])) & lgth < 300)
  x$stage[needsChanging] <- "J"
  x
}
f2(someTags)
#  tag_id lgth stage
#1      1   50     J
#2      2  100     .
#3      2  150     .
#4      3  200     J
#5      4  250     J
#6      5  300     .
#7      6  350     .
#8      6  400     .
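As a small check of the idea behind f2(), the sketch below compares the match()-based test with two equivalent spellings on the same tag_id vector. The names repeated, viaMatch, viaIn and viaBoth are illustrative only and are not part of the code above.

```r
tag_id <- c(1, 2, 2, 3, 4, 5, 6, 6)

# Values that occur more than once in tag_id
repeated <- tag_id[duplicated(tag_id)]   # 2 6

# Three ways to flag entries whose value occurs exactly once
viaMatch <- is.na(match(tag_id, repeated))
viaIn    <- !(tag_id %in% repeated)
viaBoth  <- !(duplicated(tag_id) | duplicated(tag_id, fromLast = TRUE))

# All three should agree element by element
stopifnot(identical(viaMatch, viaIn), identical(viaMatch, viaBoth))
viaMatch
# [1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
```

Which spelling to use is largely a matter of taste; %in% is itself defined in terms of match(), so the first two are the same computation.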
A.K.
----- Original Message -----
From: William Dunlap <wdunlap at tibco.com>
To: Guillaume2883 <guillaume.bal.pro at gmail.com>; "r-help at
r-project.org" <r-help at r-project.org>
Cc:
Sent: Friday, August 10, 2012 8:02 PM
Subject: Re: [R] vectorization condition counting
Your sum(tag_id==tag_id[i])==1, meaning tag_id[i] is the only entry with its
value, may be vectorized by the sneaky idiom
  !(duplicated(tag_id, fromLast=FALSE) | duplicated(tag_id, fromLast=TRUE))
Hence f0() (with your code in a loop) and f1() are equivalent:
f0 <- function (tags) {
    for (i in seq_len(nrow(tags))) {
        if (sum(tags$tag_id == tags$tag_id[i]) == 1 & tags$lgth[i] < 300) {
            tags$stage[i] <- "J"
        }
    }
    tags
}
f1 <- function (tags) {
    needsChanging <- with(tags, !(duplicated(tag_id, fromLast = FALSE) |
        duplicated(tag_id, fromLast = TRUE)) & lgth < 300)
    tags$stage[needsChanging] <- "J"
    tags
}
E.g.,
> someTags <- data.frame(tag_id = c(1, 2, 2, 3, 4, 5, 6, 6), lgth = 50*(1:8),
+     stage = factor(rep(".", 8), levels = c(".", "J")))
> all.equal(f0(someTags), f1(someTags))
[1] TRUE
> f1(someTags)
  tag_id lgth stage
1      1   50     J
2      2  100     .
3      2  150     .
4      3  200     J
5      4  250     J
6      5  300     .
7      6  350     .
8      6  400     .
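To see why the two duplicated() calls together pick out the entries that occur exactly once, here is a minimal sketch that evaluates each piece on the example tag_id and cross-checks the result against a direct per-element count, as in the original loop. The names fwd, bwd, isUnique and counts are introduced here for illustration only.

```r
tag_id <- c(1, 2, 2, 3, 4, 5, 6, 6)

# TRUE for the second and later copies of a value
fwd <- duplicated(tag_id, fromLast = FALSE)
# TRUE for every copy of a value except the last one
bwd <- duplicated(tag_id, fromLast = TRUE)

# A value that occurs only once is flagged by neither scan
isUnique <- !(fwd | bwd)

# Direct count of how often each element's value occurs (quadratic, like the loop)
counts <- vapply(tag_id, function(v) sum(tag_id == v), integer(1))

stopifnot(identical(isUnique, counts == 1))
isUnique
# [1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
```

The direct count compares every element against the whole vector, so its cost grows with the square of the number of rows, which is the work the vectorized form avoids.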
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Guillaume2883
> Sent: Friday, August 10, 2012 3:47 PM
> To: r-help at r-project.org
> Subject: [R] vectorization condition counting
>
> Hi all,
>
> I am working on a really big dataset and I would like to vectorize a
> condition in an if statement inside a loop to improve speed.
>
> The original loop with the condition is currently written as follows:
>
> if(sum(as.integer(tags$tag_id==tags$tag_id[i]))==1&tags$lgth[i]<300){
>
>      tags$stage[i]<-"J"
>
>    }
>
> Do you have any ideas? I was unable to do it correctly.
> Thanking you in advance for your help
>
> Guillaume
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/vectorization-condition-counting-tp4639992.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.