thr3ads.net - R help - [R] Expand a contingency table based on the value in one column [Jun 2009]

If this information is useful, please help other people find it:
Share via:

Mark Na

2009-Jun-11 17:13 UTC

[R] Expand a contingency table based on the value in one column

Hi R-helpers,

I have the following (dummy) dataframe:
> test  DATE LOCATION  KIND CLASS COUNT
1    1        1   CAR     A     2
2    1        1 TRUCK     D     3
3    1        1   BUS     E     4
4    1        2   CAR     E     2
5    1        2 TRUCK     A     7
6    1        2   BUS     F     1

That I would like to turn into this:
> test2   DATE LOCATION  KIND CLASS
1     1        1   CAR     A
2     1        1   CAR     A
3     1        1 TRUCK     D
4     1        1 TRUCK     D
5     1        1 TRUCK     D
6     1        1   BUS     E
7     1        1   BUS     E
8     1        1   BUS     E
9     1        1   BUS     E
10    1        2   CAR     E
11    1        2   CAR     E
12    1        2 TRUCK     A
13    1        2 TRUCK     A
14    1        2 TRUCK     A
15    1        2 TRUCK     A
16    1        2 TRUCK     A
17    1        2 TRUCK     A
18    1        2 TRUCK     A
19    1        2   BUS     F

So, basically it's a case of expanding (adding rows to) the first dataframe
by the value in the COUNT column.

I have solved this problem with the following code:

test2<-with(test, data.frame(DATE=rep(DATE,COUNT),
LOCATION=rep(LOCATION,COUNT), KIND=rep(KIND,COUNT), CLASS=rep(CLASS,COUNT)))

but I'm unsatisfied with that solution because it's verbose and I think
there must a more elegant way. If I had more variables than 4 (which I do in
my real data) it would be a nuisance to repeat each column within the rep
function.

I would prefer to do this with Base R or package(reshape) than relying on
another package.

Any ideas? Thanks!

Mark Na

	[[alternative HTML version deleted]]

David Winsemius

2009-Jun-11 17:27 UTC

head link

[R] Expand a contingency table based on the value in one column

On Jun 11, 2009, at 1:13 PM, Mark Na wrote:
> Hi R-helpers,
>
> I have the following (dummy) dataframe:
>
>> test
>  DATE LOCATION  KIND CLASS COUNT
> 1    1        1   CAR     A     2
> 2    1        1 TRUCK     D     3
> 3    1        1   BUS     E     4
> 4    1        2   CAR     E     2
> 5    1        2 TRUCK     A     7
> 6    1        2   BUS     F     1
>
> That I would like to turn into this:
>
>> test2
>   DATE LOCATION  KIND CLASS
> 1     1        1   CAR     A
> 2     1        1   CAR     A
> 3     1        1 TRUCK     D
> 4     1        1 TRUCK     D
> 5     1        1 TRUCK     D
> 6     1        1   BUS     E
> 7     1        1   BUS     E
> 8     1        1   BUS     E
> 9     1        1   BUS     E
> 10    1        2   CAR     E
> 11    1        2   CAR     E
> 12    1        2 TRUCK     A
> 13    1        2 TRUCK     A
> 14    1        2 TRUCK     A
> 15    1        2 TRUCK     A
> 16    1        2 TRUCK     A
> 17    1        2 TRUCK     A
> 18    1        2 TRUCK     A
> 19    1        2   BUS     F
>
  test2 <- test[rep(1:nrow(test), test$COUNT),]
  rownames(test2) <- 1:nrow(test2)
  test2

> So, basically it's a case of expanding (adding rows to) the first  
> dataframe
> by the value in the COUNT column.
>
> I have solved this problem with the following code:
>
> test2<-with(test, data.frame(DATE=rep(DATE,COUNT),
> LOCATION=rep(LOCATION,COUNT), KIND=rep(KIND,COUNT),  
> CLASS=rep(CLASS,COUNT)))
>
> but I'm unsatisfied with that solution because it's verbose and I  
> think
> there must a more elegant way. If I had more variables than 4 (which  
> I do in
> my real data) it would be a nuisance to repeat each column within  
> the rep
> function.
>
> I would prefer to do this with Base R or package(reshape) than  
> relying on
> another package.
>
> Any ideas? Thanks!
>
> Mark Na
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Marc Schwartz

2009-Jun-11 17:33 UTC

head link

[R] Expand a contingency table based on the value in one column

On Jun 11, 2009, at 12:13 PM, Mark Na wrote:
> Hi R-helpers,
>
> I have the following (dummy) dataframe:
>
>> test
>  DATE LOCATION  KIND CLASS COUNT
> 1    1        1   CAR     A     2
> 2    1        1 TRUCK     D     3
> 3    1        1   BUS     E     4
> 4    1        2   CAR     E     2
> 5    1        2 TRUCK     A     7
> 6    1        2   BUS     F     1
>
> That I would like to turn into this:
>
>> test2
>   DATE LOCATION  KIND CLASS
> 1     1        1   CAR     A
> 2     1        1   CAR     A
> 3     1        1 TRUCK     D
> 4     1        1 TRUCK     D
> 5     1        1 TRUCK     D
> 6     1        1   BUS     E
> 7     1        1   BUS     E
> 8     1        1   BUS     E
> 9     1        1   BUS     E
> 10    1        2   CAR     E
> 11    1        2   CAR     E
> 12    1        2 TRUCK     A
> 13    1        2 TRUCK     A
> 14    1        2 TRUCK     A
> 15    1        2 TRUCK     A
> 16    1        2 TRUCK     A
> 17    1        2 TRUCK     A
> 18    1        2 TRUCK     A
> 19    1        2   BUS     F
>
> So, basically it's a case of expanding (adding rows to) the first  
> dataframe
> by the value in the COUNT column.
>
> I have solved this problem with the following code:
>
> test2<-with(test, data.frame(DATE=rep(DATE,COUNT),
> LOCATION=rep(LOCATION,COUNT), KIND=rep(KIND,COUNT),  
> CLASS=rep(CLASS,COUNT)))
>
> but I'm unsatisfied with that solution because it's verbose and I  
> think
> there must a more elegant way. If I had more variables than 4 (which  
> I do in
> my real data) it would be a nuisance to repeat each column within  
> the rep
> function.
>
> I would prefer to do this with Base R or package(reshape) than  
> relying on
> another package.
>
> Any ideas? Thanks!
>
> Mark Na

Mark,

A quick and dirty solution:

 > test[rep(1:nrow(test), test$COUNT), -ncol(test)]
     DATE LOCATION  KIND CLASS
1      1        1   CAR     A
1.1    1        1   CAR     A
2      1        1 TRUCK     D
2.1    1        1 TRUCK     D
2.2    1        1 TRUCK     D
3      1        1   BUS     E
3.1    1        1   BUS     E
3.2    1        1   BUS     E
3.3    1        1   BUS     E
4      1        2   CAR     E
4.1    1        2   CAR     E
5      1        2 TRUCK     A
5.1    1        2 TRUCK     A
5.2    1        2 TRUCK     A
5.3    1        2 TRUCK     A
5.4    1        2 TRUCK     A
5.5    1        2 TRUCK     A
5.6    1        2 TRUCK     A
6      1        2   BUS     F


For a more general solution to taking a tabulated data frame and  
converting it back to the raw data see my expand.dft() function:

   https://stat.ethz.ch/pipermail/r-help/2009-January/185561.html

For example:

 > expand.dft(test, freq = "COUNT")
    DATE LOCATION  KIND CLASS
1     1        1   CAR     A
2     1        1   CAR     A
3     1        1 TRUCK     D
4     1        1 TRUCK     D
5     1        1 TRUCK     D
6     1        1   BUS     E
7     1        1   BUS     E
8     1        1   BUS     E
9     1        1   BUS     E
10    1        2   CAR     E
11    1        2   CAR     E
12    1        2 TRUCK     A
13    1        2 TRUCK     A
14    1        2 TRUCK     A
15    1        2 TRUCK     A
16    1        2 TRUCK     A
17    1        2 TRUCK     A
18    1        2 TRUCK     A
19    1        2   BUS     F



HTH,

Marc Schwartz

Apparently Analagous Threads

Search for more reasonably related threads

R help - Jun 2009 - Expand a contingency table based on the value in one column

[R] Expand a contingency table based on the value in one column

[R] Expand a contingency table based on the value in one column

[R] Expand a contingency table based on the value in one column

Apparently Analagous Threads