thr3ads.net - R help - [R] ?to calculate sth for groups defined between points in one variable (string), / value separating/ spliting variable into groups by i.e. between start, NA, NA, stop1, start2, NA, stop2 [Jun 2010]


Eugeniusz Kaluza

2010-Jun-24 11:18 UTC

[R] ?to calculate sth for groups defined between points in one variable (string), / value separating/ spliting variable into groups by i.e. between start, NA, NA, stop1, start2, NA, stop2

Dear useRs,

Thanks for any advice.

# I cannot find examples of how to mark groups
# based on signal occurrences in an additional variable (cf. variable c2),
# or how to run different calculations for groups defined by (split at)
# the marker values in c2.
 
 
# First example of simple data
# example   1      2    3  4     5  6  7  8  9  10 11       12 13 14 15 16 17
c0<-rbind( 1,      2 , 3, 4,      5, 6, 7, 8, 9,10,11,      12,13,14,15,16,17)
c0
c1<-rbind(10,     20 ,30,40,     50,10,60,20,30,40,50,      30,10, 0,NA,20,10.3444)
c1
c2<-rbind(NA,"Start1",NA,NA,"Stop1",NA,NA,NA,NA,NA,NA,"Start2",NA,NA,NA,NA,"Stop2")
c2
C.df<-data.frame(cbind(c0,c1,c2))
colnames(C.df)<-c("c0","c1","c2")
C.df

# Helper columns that only illustrate the needed result (the next lines are
# not needed for the calculation itself; they just show the intended grouping)
 c3<-rbind(NA,"Start1","Start1","Start1","Start1","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2")
 c4<-rbind(NA, "Stop1", "Stop1", "Stop1",
"Stop1", "Stop2", "Stop2", "Stop2",
"Stop2", "Stop2", "Stop2", "Stop2",
"Stop2", "Stop2", "Stop2", "Stop2",
"Stop2")
 C.df<-data.frame(cbind(c0,c1,c2,c3,c4))
 colnames(C.df)<-c("c0","c1","c2","c3","c4")
 C.df$c5<-paste(C.df$c3,C.df$c4,sep="-")
 C.df

# NEEDED RESULTS
# for Start1-Stop1: mean(c(20,30,40,50))
# for Start2-Stop2: mean(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T)
#mean:
         c1     c3    c4           c5
         35  Start1 Stop1 Start1-Stop1
   25.48585  Start2 Stop2 Start2-Stop2

#sum
# for Start1-Stop1: sum(20,30,40,50)
# for Start2-Stop2: sum(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T)
#sum:
         c1     c3    c4           c5
        140  Start1 Stop1 Start1-Stop1
   280.3444  Start2 Stop2 Start2-Stop2

# for Start1-Stop1: max(20,30,40,50)
# for Start2-Stop2: max(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T)
#max:
         c1     c3    c4           c5
        50  Start1 Stop1 Start1-Stop1
        60  Start2 Stop2 Start2-Stop2

# place of max (in Start1-Stop1: 4th element in group Start1-Stop1)
# place of max (in Start2-Stop2: 2nd element in group Start2-Stop2)

        c0     c3    c4           c5
         4  Start1 Stop1 Start1-Stop1
         2  Start2 Stop2 Start2-Stop2


Thanks for any suggestion,
Kaluza


Joris Meys

2010-Jun-24 13:14 UTC


On Thu, Jun 24, 2010 at 1:18 PM, Eugeniusz Kaluza
<Eugeniusz.Kaluza at polsl.pl> wrote:
> # preparation of form for explaining further needed result (next 3 lines
> # are not needed indeed, they are only to explain how to obtain final result
> c3<-rbind(NA,"Start1","Start1","Start1","Start1","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2")
> c4<-rbind(NA, "Stop1", "Stop1", "Stop1",
> "Stop1", "Stop2", "Stop2", "Stop2",
> "Stop2", "Stop2", "Stop2", "Stop2",
> "Stop2", "Stop2", "Stop2", "Stop2",
> "Stop2")
> C.df<-data.frame(cbind(c0,c1,c2,c3,c4))
> colnames(C.df)<-c("c0","c1","c2","c3","c4")
> C.df$c5<-paste(C.df$c3,C.df$c4,sep="-")
> C.df

Now this is something I don't get. The group "Start2-Stop2" starts way
before Start2, in fact right after Stop1. Are you sure that's what you want?

I took the liberty of showing how to get the data between start and
stop for every entry, and how to apply functions to it. If you don't
get the code, look at
?lapply
?apply
?grep

I also adjusted your example: cbind() on mixed columns builds a character
matrix, so wrapping it in data.frame() turned all your variables into
factors. Never do this unless you're really sure you have to, and I can't
think of a case where that would be beneficial...
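
The coercion described above can be checked directly. The following is a minimal sketch, assuming the same values as in the example but built as plain vectors; the names m, bad and good are illustrative only and not part of the original post.

```r
# Same values as in the example, built as plain vectors.
c0 <- 1:17
c1 <- c(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444)
c2 <- c(NA, "Start1", NA, NA, "Stop1", NA, NA, NA, NA, NA, NA,
        "Start2", NA, NA, NA, NA, "Stop2")

# cbind() has to fit everything into one matrix; because c2 is character,
# the numbers in c0 and c1 are converted to strings as well.
m <- cbind(c0, c1, c2)
typeof(m)

# Built from the character matrix, c1 is a factor or a character column
# (depending on the R version), but never numeric.
bad <- data.frame(m)

# Built from the separate vectors, each column keeps its own type.
good <- data.frame(c0, c1, c2)

is.numeric(bad$c1)
is.numeric(good$c1)
```

With the numeric type lost, calls such as mean(bad$c1) no longer work as intended, which is why the data frame is rebuilt without cbind().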

...
C.df<-data.frame(c0,c1,c2)
C.df

# find positions
Start <- grep("Start",C.df$c2)
Stop <- grep("Stop",C.df$c2)

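# create indices: one vector of row numbers per Start/Stop pair
# (apply() returns a list here because the intervals differ in length;
# with intervals of equal length it would return a matrix instead)
idx <- apply(cbind(Start,Stop),1,function(i) i[1]:i[2])
names(idx) <- paste("Start",seq_along(Start),"-Stop",seq_along(Start),sep="")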

# Apply the function summary and get a list back named by the interval.
out <- lapply(idx,function(i) summary(C.df[i,1:2]))
out
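
To get the specific statistics asked for (mean, sum, max and position of the max) rather than summary(), the same index list can be reused. This is only a sketch under a few assumptions: the data frame is rebuilt from plain vectors, Map() is swapped in for apply() so that a list comes back even when all intervals have the same length, and the names stats and pos.max are made up for illustration.

```r
C.df <- data.frame(
  c0 = 1:17,
  c1 = c(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444),
  c2 = c(NA, "Start1", NA, NA, "Stop1", NA, NA, NA, NA, NA, NA,
         "Start2", NA, NA, NA, NA, "Stop2")
)

# rows holding the markers
Start <- grep("Start", C.df$c2)
Stop  <- grep("Stop",  C.df$c2)

# one vector of row numbers per Start/Stop pair; Map() always returns a list
idx <- Map(function(a, b) a:b, Start, Stop)
names(idx) <- paste0("Start", seq_along(Start), "-Stop", seq_along(Stop))

# one row of statistics per interval, NA values ignored
stats <- do.call(rbind, lapply(names(idx), function(nm) {
  x <- C.df$c1[idx[[nm]]]
  data.frame(group   = nm,
             mean    = mean(x, na.rm = TRUE),
             sum     = sum(x, na.rm = TRUE),
             max     = max(x, na.rm = TRUE),
             pos.max = which.max(x))   # position within the group
}))
stats
```

Here each interval runs strictly from its Start marker to its Stop marker, so Start2-Stop2 covers only rows 12 to 17.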

If you really need to start Start2 right after Stop1, you can use a
similar approach.
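
One way that similar approach could look is sketched below. It assumes that the first group begins at its own Start marker and that every later group begins on the row right after the previous Stop; the names from, res and out2 are illustrative and not from the original code.

```r
C.df <- data.frame(
  c0 = 1:17,
  c1 = c(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444),
  c2 = c(NA, "Start1", NA, NA, "Stop1", NA, NA, NA, NA, NA, NA,
         "Start2", NA, NA, NA, NA, "Stop2")
)

Start <- grep("Start", C.df$c2)
Stop  <- grep("Stop",  C.df$c2)

# first group starts at its Start marker, later groups start right after
# the previous Stop
from <- c(Start[1], head(Stop, -1) + 1)

idx <- Map(function(a, b) a:b, from, Stop)
names(idx) <- paste0("Start", seq_along(Stop), "-Stop", seq_along(Stop))

# statistics per group, NA values ignored; sapply() gives one column per group
res <- sapply(idx, function(i) {
  x <- C.df$c1[i]
  c(mean    = mean(x, na.rm = TRUE),
    sum     = sum(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    pos.max = which.max(x))
})
out2 <- t(res)   # one row per group
out2
```

With this grouping Start2-Stop2 covers rows 6 to 17, giving the sum 280.3444, the max 60 and the max in 2nd position. Note that mean(20, 30, 40, 50) evaluates to 20 in R because only the first argument is treated as data; mean(c(20, 30, 40, 50)) gives 35.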

Cheers
Joris


-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
Joris.Meys at Ugent.be
