thr3ads.net - R help - [R] Numbering entries for each subject [Sep 2011]

Toni Pitcher

2011-Sep-22 03:02 UTC

[R] Numbering entries for each subject

Hi R Users

I am hoping someone might be able to give some pointers on alternative code to
the for loop described below.

I have a dataset that is ordered by subject ID and date. What I would like to
do is create a new variable that numbers the entries for each person (e.g.
1, 2, 3, ...).

As an example, if we have subjects A, B and C, all with multiple entries (I have
excluded the date variable for simplicity), the for loop below achieves the
desired result. However, my dataset is big (1 million+ observations) and the for
loop is slow. Is there a more efficient way of getting to the desired result?

Many thanks in advance

Toni 


A <- data.frame(ID = c('A','A','A','A','B','B','B',
                       'C','C','C','C','C'))

  ID
1   A
2   A
3   A
4   A
5   B
6   B
7   B
8   C
9   C
10  C
11  C
12  C


A$Session_ID <- 0
previous_ID <- ''
current_index <- 1
for (i in seq_len(nrow(A))) {
  # restart the count whenever a new subject begins
  if (A$ID[i] != previous_ID) {
    current_index <- 1
  }
  A$Session_ID[i] <- current_index
  previous_ID <- A$ID[i]
  current_index <- current_index + 1
}

 

   ID Session_ID
1   A          1
2   A          2
3   A          3
4   A          4
5   B          1
6   B          2
7   B          3
8   C          1
9   C          2
10  C          3
11  C          4
12  C          5

jim holtman

2011-Sep-22 03:11 UTC


try this:

> x <- read.table('clipboard')
> x
   V1 V2
1   1  A
2   2  A
3   3  A
4   4  A
5   5  B
6   6  B
7   7  B
8   8  C
9   9  C
10 10  C
11 11  C
12 12  C
> x$ID <- ave(x$V1, x$V2, FUN = function(a) seq(length(a)))
> x
   V1 V2 ID
1   1  A  1
2   2  A  2
3   3  A  3
4   4  A  4
5   5  B  1
6   6  B  2
7   7  B  3
8   8  C  1
9   9  C  2
10 10  C  3
11 11  C  4
12 12  C  5
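
The same idea can be applied directly to the data frame A from the original post, without going through the clipboard. This is only a minimal sketch: it assumes A is built as in the question, uses seq_along() in place of seq(length(a)) as a matter of style, and names the new column Session_ID to match the question.

```r
# Rebuild the example data from the question.
A <- data.frame(ID = c('A','A','A','A','B','B','B','C','C','C','C','C'))

# ave() splits its first argument by the grouping variable, applies FUN to
# each piece, and puts the results back in the original row order.
# The first argument only supplies a vector of the right length here;
# seq_along() ignores its values and returns 1..n for each group.
A$Session_ID <- ave(seq_len(nrow(A)), A$ID, FUN = seq_along)

print(A)
```

Because ave() restores the original row order, this numbering does not depend on the rows for a subject being adjacent, although the order within a subject still follows the order of the rows.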


-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

Jeff Newmiller

2011-Sep-22 03:18 UTC


A$Session_id <- ave(rep(1, length(A$ID)), A$ID, FUN = cumsum)
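
Why this works: rep(1, n) is a vector of ones, and the running sum of ones within each ID is 1, 2, 3, and so on. Below is a sketch on the example data from the question. The rle()/sequence() variant is not from the thread; it is an alternative that assumes each subject's rows are contiguous, as they are when the data are sorted by ID. The column name Session_ID_rle is made up for illustration.

```r
A <- data.frame(ID = c('A','A','A','A','B','B','B','C','C','C','C','C'))

# Running sum of ones within each ID gives 1, 2, 3, ... per subject.
A$Session_ID <- ave(rep(1, nrow(A)), A$ID, FUN = cumsum)

# Alternative (not from the thread): take the run lengths of consecutive
# equal IDs and expand each run of length n into 1..n. Only valid when each
# subject's rows are contiguous, e.g. after sorting by ID.
runs <- rle(as.character(A$ID))
A$Session_ID_rle <- sequence(runs$lengths)

print(A)
```

Both versions avoid the row-by-row assignment of the loop. The rle() version also skips the split-and-reassemble step that ave() performs, so it may be quicker on large sorted data, though that has not been timed here.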


