thr3ads.net - R help - [R] counting run lengths [Oct 2008]


Mario Lavezzi

2008-Oct-27 09:38 UTC

[R] counting run lengths

Hello,
I have the following problem.

I am running simulations on possible states of a set of agents 
(1=employed, 0=unemployed).

I store these simulated time series in a matrix like the following,
where rows indicate time periods and columns indicate agents (4
agents and 8 periods in this case):

Atr=[
1    1    1    1
1    1    0    1
1    1    0    1
1    1    0    1
0    1    0    1
0    1    0    1
0    1    0    1
0    1    0    1]

At this point, I need to update a vector ("unSpells") which contains the
lengths of unemployment spells, and is initialized with ones.
In practice, in the case shown I need to store the value "4" at
position 1 of unSpells and "7" at position 3 of unSpells (that is, I
care only about those agents who are zeros in the last row).

I am doing this in the following way (tt+1 indicates the time period 
reached by the simulation, n the number of agents):

    unSpells = matrix(1,nrow=1,ncol=n)   
    ppp=apply(Atr[1:(tt+1),],2,rle)
    for(i in (1:n)[Atr[tt+1,]==0]){
        unSpells[i]=tail(ppp[[i]]$lengths,1)
    }
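
For reference, a self-contained version of the snippet above, run on the example matrix, might look like the following. The values of n and tt are assumptions chosen to match the 4 agents and 8 periods shown, and the matrix is built with rbind() because the Atr=[...] notation is not R syntax.

```r
# Example matrix from the post: 8 periods (rows) x 4 agents (columns).
Atr <- rbind(c(1, 1, 1, 1),
             c(1, 1, 0, 1),
             c(1, 1, 0, 1),
             c(1, 1, 0, 1),
             c(0, 1, 0, 1),
             c(0, 1, 0, 1),
             c(0, 1, 0, 1),
             c(0, 1, 0, 1))

n  <- ncol(Atr)      # assumed: number of agents
tt <- nrow(Atr) - 1  # assumed: tt + 1 is the last simulated period

# Spell lengths start at 1 and are overwritten only for agents who are
# unemployed (0) in the last period.
unSpells <- matrix(1, nrow = 1, ncol = n)
ppp <- apply(Atr[1:(tt + 1), ], 2, rle)   # one rle object per agent
for (i in (1:n)[Atr[tt + 1, ] == 0]) {
    unSpells[i] <- tail(ppp[[i]]$lengths, 1)
}
unSpells
```

On this matrix unSpells ends up holding 4, 1, 7, 1, which matches the values described above.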

It works, but the for (i in ...) loop slows down the simulation a lot.

Any suggestion on how to avoid this loop? (or in general, to speed up 
this part of the simulation)

Thanks!!
Mario

-- 
Andrea Mario Lavezzi
Dipartimento di Studi su Politica, Diritto e Società
Università di Palermo
Piazza Bologni 8
90134 Palermo, Italy
tel. ++39 091 6625600
fax ++39 091 6112023
skype: lavezzimario
email: lavezzi (at) unipa.it
web: http://www.unipa.it/~lavezzi

Dimitris Rizopoulos

2008-Oct-27 09:52 UTC


[R] counting run lengths

It's not totally clear to me what exactly you need in this case, but
have a look at the following:

Atr <- cbind(rep(1:0, each = 4), 1, c(1, rep(0, 7)), 1)
unSpells <- colSums(Atr == 0)
unSpells[unSpells == 0] <- 1
unSpells
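
One caveat that may be worth checking: colSums(Atr == 0) counts every unemployed period in a column, so it equals the length of the current spell only when an agent has had a single uninterrupted spell, as in the example. The sketch below uses a hypothetical matrix Atr2 (not from the thread) in which agent 1 is unemployed, re-employed, and then unemployed again.

```r
# Hypothetical matrix: agent 1 has two separate unemployment spells,
# agent 2 is always employed.
Atr2 <- cbind(c(1, 0, 0, 1, 1, 0, 0, 0),
              1)

# Total number of unemployed periods per agent.
total_zeros <- colSums(Atr2 == 0)

# Length of agent 1's final run, i.e. the current spell.
last_run <- tail(rle(Atr2[, 1])$lengths, 1)

total_zeros[1]   # 5
last_run         # 3
```

If agents can move in and out of employment during the simulation, the two quantities differ, so the zero count would need to be restricted to the final run.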


I hope it helps.

Best,
Dimitris


-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

Domenico Vistocco

2008-Oct-27 10:09 UTC


[R] counting run lengths

Try this:
unSpells[tail(Atr,1)==0] <- 
apply(Atr,2,function(x)sum(x==0))[tail(Atr,1)==0]

Or (if you don't have to preserve the value in the unSpells vector):
unSpells <- apply(Atr,2,function(x)sum(x==0))
But in this case you get 0 instead of 1 in the second and fourth positions.

Ciao,
domenico


Martin Morgan

2008-Oct-27 14:28 UTC


[R] counting run lengths

Hi Mario --

This function

f <- function(m) {
    ## next 2 lines due to Bill Dunlap
    ## http://tolstoy.newcastle.edu.au/R/e4/devel/08/04/1206.html
    csum <- cumsum(!m)
    crun <- csum - cummax(m * csum)
    matrix(ifelse(crun > 0, (crun-1) %% nrow(m) + 1, 0),
           nrow=nrow(m))
}

returns a matrix with elements indicating the number of successive
0's so far in the column:

> f(Atr)
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]    0    0    1    0
[3,]    0    0    2    0
[4,]    0    0    3    0
[5,]    1    0    4    0
[6,]    2    0    5    0
[7,]    3    0    6    0
[8,]    4    0    7    0

from which the last row is easily extracted:

> f(Atr)[nrow(Atr),]
[1] 4 0 7 0
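
If only the last row is needed, a shorter route may be to find the row of the most recent 1 in each column and subtract it from the number of rows. The sketch below is one way to do that; the function name last_zero_run is made up for illustration and is not from the thread. It works column by column, so a run cannot carry over from the end of one column into the start of the next, which seems to be possible with f() because cumsum() runs over the whole matrix in column-major order.

```r
# Sketch: length of the trailing run of zeros in each column.
last_zero_run <- function(m) {
    # m * row(m) holds the row index wherever m is 1 and 0 elsewhere, so the
    # column maximum is the row of the most recent 1 (0 if the column has none).
    last_one <- apply(m * row(m), 2, max)
    nrow(m) - last_one
}

Atr <- cbind(rep(1:0, each = 4), 1, c(1, rep(0, 7)), 1)
runs <- last_zero_run(Atr)   # 4 0 7 0 for the example matrix
unSpells <- pmax(runs, 1)    # keep the initial value 1 for employed agents
unSpells
```

The pmax() step reproduces the original convention that unSpells is initialized with ones, giving 4 1 7 1 here. There is still an apply() over columns, but no run-length encoding of the full history.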

Martin

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
