Hi All,

> act_2
         Date    Dtime Hour Min Second               Rep
51 2006-02-22 14:52:18   14  52     18  useractivity_act
52 2006-02-22 14:52:18   14  52     18                 4
55 2006-02-22 14:52:49   14  52     49                 4
57 2006-02-22 14:52:51   14  52     51                 4
58 2006-02-22 14:52:52   14  52     52                 3
60 2006-02-22 14:54:42   14  54     42 useractivity_idle

I want to change act_2 to

         Date    Dtime Hour Min Second               Rep
51 2006-02-22 14:52:18   14  52     18  useractivity_act
52 2006-02-22 14:52:18   14  52     18                 4
58 2006-02-22 14:52:52   14  52     52                 3
60 2006-02-22 14:54:42   14  54     42 useractivity_idle

In other words, when a value is repeated over several consecutive rows, I want to keep only the first of those rows. I wrote this function:

rm_r<-function(act_2){
 dm<-dim(act_2)[1]-1
 for(i in 2:dm){
  if(act_2$Rep[i+1]==act_2$Rep[i]){
   act_2<-act_2[-(i+1),]
  }else{
   act_2<-act_2
  }
 }
 return(act_2)
}

When a row is removed on the first iteration, i should stay at 2, but it becomes 3 on the second iteration. If I add i<-i-1, then i goes to 1, which does not seem reasonable. How should I modify it?

Tammy
> in other word, I want to keep 1st if there are many repeated value,
> I made the program as:

Not sure what you mean here; can you describe this more fully?

It seems that you might be able to avoid using loops if all you want to do is select only the rows where column x is less than a threshold value, e.g.

a<-a[a$columnx<1000,]

Hope this helps,
Simon.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> when it moved one row on 1st loop, i should still start 2 but it
> become 3 at 2nd loop, if I add i<-i-1, then i go to 1
> seems not reasonbale. How should I modify it`?

Please don't repeatedly post the same question - it is irritating, and you're not likely to get a favourable response.

Try explaining your problem more clearly. What is the condition that you want to use to keep rows? (In your example, each of the rows is different, yet you kept some and discarded others.)

If you just want to discard some rows from a data frame, you don't need a loop, e.g.

dfr <- data.frame(x=1:10, y=runif(10))
dfr[c(1,3,5,5),]

Regards,
Richie.

Mathematical Sciences Unit
HSL
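To make the "no loop needed" point concrete, here is a small sketch of the two usual ways to drop rows from a data frame by subscripting: negative row indices and a logical condition. The data frame and the names kept and odd are invented for illustration only.

```r
# A deterministic toy data frame (made up for this illustration)
dfr <- data.frame(x = 1:10, y = (1:10)^2)

# Negative indices discard the listed rows: here rows 2 and 4
kept <- dfr[-c(2, 4), ]

# A logical vector keeps the rows where the condition is TRUE
odd <- dfr[dfr$x %% 2 == 1, ]

nrow(kept)
odd$x
```

Positive indices, as in dfr[c(1,3,5,5),], select rows (and may repeat them); negative indices and logical vectors are the forms that discard.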
I assume the problem is to only keep those rows whose
Rep value is not equal to the Rep value in the prior row.
In that case just compare c("", Rep[-nr]) to Rep and
use the resulting vector, ix, to select out rows.
Rep <- as.character(act_2$Rep) could
be simplified to Rep <- act_2$Rep if Rep
is already known to be "character".
Lines <- "Date    Dtime Hour Min Second               Rep
51 2006-02-22 14:52:18   14  52     18  useractivity_act
52 2006-02-22 14:52:18   14  52     18                 4
55 2006-02-22 14:52:49   14  52     49                 4
57 2006-02-22 14:52:51   14  52     51                 4
58 2006-02-22 14:52:52   14  52     52                 3
60 2006-02-22 14:54:42   14  54     42 useractivity_idle"
act_2 <- read.table(textConnection(Lines), header = TRUE, as.is = TRUE)
nr <- nrow(act_2)
Rep <- as.character(act_2$Rep)
ix <- Rep != c("", Rep[-nr])
act_2[ix, ]
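On why the loop in the question misbehaves: in R, for(i in 2:dm) draws its values from a sequence that is fixed before the first iteration, so assigning i <- i - 1 inside the body does not affect the next iteration, and dm is not updated as the data frame shrinks. If a loop is really wanted, one option is a while loop that advances only when nothing was deleted. The following is a sketch under assumptions, not a recommended replacement for the vectorized approach: rm_r_while is a made-up name, Rep is assumed to contain no NA values, and the comparison starts at row 1 rather than row 2 as in the original.

```r
Lines <- "Date    Dtime Hour Min Second               Rep
51 2006-02-22 14:52:18   14  52     18  useractivity_act
52 2006-02-22 14:52:18   14  52     18                 4
55 2006-02-22 14:52:49   14  52     49                 4
57 2006-02-22 14:52:51   14  52     51                 4
58 2006-02-22 14:52:52   14  52     52                 3
60 2006-02-22 14:54:42   14  54     42 useractivity_idle"
act_2 <- read.table(text = Lines, header = TRUE, as.is = TRUE)

# Hypothetical helper: drop rows whose Rep equals the Rep of the row above
rm_r_while <- function(df) {
  i <- 1
  while (i < nrow(df)) {          # nrow(df) is re-evaluated on every pass
    if (df$Rep[i + 1] == df$Rep[i]) {
      df <- df[-(i + 1), ]        # delete the repeat; stay on the same i
    } else {
      i <- i + 1                  # advance only when nothing was deleted
    }
  }
  df
}

res <- rm_r_while(act_2)
rownames(res)
```

Deleting rows one at a time copies the data frame on each deletion, so for large inputs the logical-subscript version above is likely to be much faster.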
On Thu, Mar 12, 2009 at 6:25 AM, Tammy Ma <metal_licaling at live.com> wrote:
> in other word, I want to keep 1st if there are many repeated value, I made
> the program as:
I think I answered a very similar question from you yesterday, but perhaps the mail went astray. The subject line is not informative.

It may make it easier to think about if you use a function like

   isFirstInRun <- function(x) c(TRUE, x[-1]!=x[-length(x)])

Given a vector x (without NA's in it), it tells you whether a given element of x is the first in a run of identical values. E.g.,

   x <- c(1,2,2,1,1,3)
   isFirstInRun(x)
   [1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE

You don't have to understand why this works or why it works quickly or have this idiom in your working set yet. You do need to know how to use logical values as subscripts to extract elements of interest from vectors or rows of interest from data frames. E.g.,

   act_2[ with(act_2, isFirstInRun(Rep)), ]

should return rows 51, 52, 58, and 60 of your example. If you want to return only the first of each Hour/Min combination, you could use either

   isFirstInRun(interaction(Hour,Min))
or
   isFirstInRun(Hour)|isFirstInRun(Min)

as the row subscript to act_2 to pull out rows 51 and 60.

If this were to become a standard function it could be modified to handle NA's, 0-long arguments, and multiple arguments. (If it accepted multiple arguments then rle() ought to be modified in the same way, as they are closely related.)
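The relationship to rle() can be sketched as follows: the run lengths that rle() returns determine the position at which each run starts, and those positions are the TRUE entries of isFirstInRun(). Here firstOfRuns is a made-up helper name, and x is assumed to be non-empty and free of NA values.

```r
isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])

# Hypothetical helper: start index of each run, derived from rle()
firstOfRuns <- function(x) {
  r <- rle(x)
  # the first run starts at 1; each later run starts after the earlier ones end
  cumsum(c(1, head(r$lengths, -1)))
}

x <- c(1, 2, 2, 1, 1, 3)
which(isFirstInRun(x))   # positions 1, 2, 4, 6
firstOfRuns(x)           # the same positions, computed via rle()
```

Either result can be used as a row subscript; the logical form is usually the more direct one for subsetting a data frame.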
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com