thr3ads.net - R help - [R] How to manipulate tables [Dec 2009]

If this information is useful, please help other people find it:
Share via:

James Rome

2009-Dec-26 19:02 UTC

[R] How to manipulate tables

I am sorry to be bothering the list so much.

I made a table of counts of flight arrivals by hour:
cnts=tapply(Arrival4,list(Hour),table). There are up to 15 arrivals in a
bin.> cnts$`0`

 1  2  3  4  5  6  7  8  9 10 13
 1  2  5  9  2  7  5  4  2  4  1

$`1`

1 2 3 4
3 2 2 1

$`2`

1 3
2 2
.  . .

My first problem is how to get this table filled in with the 0 slots.
E.g., I want
$`0`

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 1  2  5  9  2  7  5  4  2  4   0   0   1   0   0

for all 24 hours. The elements of the tables are lists, but I do not
seem to be able to extract the names of each list, which I think would
allow this manipulation.

My second problem is that I want to compute probability distributions. I
have
lambda
          0           1           2           4           5          
6           7
0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 0.160323887
0.401619433
          8           9          10          11          12         
13          14
0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 0.495546559
0.435627530
         15          16          17          18          19         
20          21
0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 0.585425101
0.519028340
         22          23
0.556275304 0.503643725

and I need to calculate lambda**cnts for each bin, and each hour. I am
also unsure of how to do this.

Thanks in advance kind people on this list
Jim Rome

David Winsemius

2009-Dec-26 19:28 UTC

head link

[R] How to manipulate tables

On Dec 26, 2009, at 2:02 PM, James Rome wrote:
> I am sorry to be bothering the list so much.
>
> I made a table of counts of flight arrivals by hour:
No, you made a list of tables, which is different.
> cnts=tapply(Arrival4,list(Hour),table). There are up to 15 arrivals  
> in a
> bin.
Why not work with something more regular, like (this is all guesswork  
and untested in absence of test data objects):

table(Arrival4, factor(Hour, levels=1:15)  )    ?
# or use whatever your max(arrivals) might be.

 > xx <- factor(sample(1:10, 4))
 > xx
[1] 9 4 7 1
Levels: 1 4 7 9

 > table(factor(xx, levels=1:10) )

  1  2  3  4  5  6  7  8  9 10
  1  0  0  1  0  0  1  0  1  0



cnts>
> $`0`
>
> 1  2  3  4  5  6  7  8  9 10 13
> 1  2  5  9  2  7  5  4  2  4  1
>
> $`1`
>
> 1 2 3 4
> 3 2 2 1
>
> $`2`
>
> 1 3
> 2 2
> .  . .
>
> My first problem is how to get this table filled in with the 0 slots.
> E.g., I want
> $`0`
>
> 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
> 1  2  5  9  2  7  5  4  2  4   0   0   1   0   0
>
> for all 24 hours. The elements of the tables are lists, but I do not
> seem to be able to extract the names of each list, which I think would
> allow this manipulation.
>
> My second problem is that I want to compute probability  
> distributions. I
> have
> lambda
>          0           1           2           4           5
> 6           7
> 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409  
> 0.160323887
> 0.401619433
>          8           9          10          11          12
> 13          14
> 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895  
> 0.495546559
> 0.435627530
>         15          16          17          18          19
> 20          21
> 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802  
> 0.585425101
> 0.519028340
>         22          23
> 0.556275304 0.503643725
>
> and I need to calculate lambda**cnts for each bin, and each hour. I am
> also unsure of how to do this.
Can you give a justification for this procedure? (I'm not saying it is  
nonsense, only that it does not make make sense to me. So perhaps  
there is some mathematical property about which I need education.)

-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

William Dunlap

2009-Dec-26 20:00 UTC

head link

[R] How to manipulate tables

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of James Rome
> Sent: Saturday, December 26, 2009 11:03 AM
> To: r-help at r-project.org
> Subject: [R] How to manipulate tables
> 
> I am sorry to be bothering the list so much.
> 
> I made a table of counts of flight arrivals by hour:
> cnts=tapply(Arrival4,list(Hour),table). There are up to 15 
> arrivals in a
> bin.
> > cnts
> $`0`
> 
>  1  2  3  4  5  6  7  8  9 10 13
>  1  2  5  9  2  7  5  4  2  4  1
> 
> $`1`
> 
> 1 2 3 4
> 3 2 2 1
> 
> $`2`
> 
> 1 3
> 2 2
> .  . .
> 
> My first problem is how to get this table filled in with the 0 slots.
> E.g., I want
> $`0`
> 
>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>  1  2  5  9  2  7  5  4  2  4   0   0   1   0   0
> 
> for all 24 hours. The elements of the tables are lists, but I do not
> seem to be able to extract the names of each list, which I think would
> allow this manipulation.
Let's make a fake dataset to demonstrate things more easily.
  set.seed(1)
  data <- data.frame(Arrival4=rpois(50, 4)+1,
                     Hour=sample(0:4, replace=TRUE, size=50))
Your tapply(,FUN=table) approach with this data gives
  > with(data, tapply(Arrival4,list(Hour),table))
  $`0`

  3 4 6 8 
  2 2 1 1 

  $`1`

  3 4 5 6 7 8 
  3 2 2 3 1 3 
  ... 2 more one dimensional tables, `3` and `4` ...

I prefer making a two dimensional table so the rows and
columns have common meanings
  > tab <- with(data, table(Hour, Arrival4))
  > tab
      Arrival4
  Hour 1 2 3 4 5 6 7 8 11
     0 0 0 2 2 0 1 0 1  0
     1 0 0 3 2 2 3 1 3  0
     2 0 2 0 2 3 3 0 0  0
     3 0 0 0 0 2 4 4 0  1
     4 1 0 2 3 1 2 0 0  0
  > tab[2,] # 2nd row (Hour==1)
   1  2  3  4  5  6  7  8 11 
   0  0  3  2  2  3  1  3  0
  > tab["1",] # would give the same as tab[2,]

But this is missing columns for Arrival4 in c(10,12,13,14,15).
If you make Arrival4 a factor with levels 1:15 the table will
include entries for each level.  In the following we don't actually
change Arrival4 itself, but create a factor using its data
when we call table().
  > tab <- with(data, table(Hour,
Arrival4=factor(Arrival4,levels=1:15)))
  > tab
      Arrival4
  Hour 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
     0 0 0 2 2 0 1 0 1 0  0  0  0  0  0  0
     1 0 0 3 2 2 3 1 3 0  0  0  0  0  0  0
     2 0 2 0 2 3 3 0 0 0  0  0  0  0  0  0
     3 0 0 0 0 2 4 4 0 0  0  1  0  0  0  0
     4 1 0 2 3 1 2 0 0 0  0  0  0  0  0  0
  > tab[2,] # 2nd row (Hour==1)
   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 
   0  0  3  2  2  3  1  3  0  0  0  0  0  0  0  

If you have no data for certain hours, you may have
to use the same technique for Hour.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 
 > My second problem is that I want to compute probability 
> distributions. I
> have
> lambda
>           0           1           2           4           5          
> 6           7
> 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 
> 0.160323887
> 0.401619433
>           8           9          10          11          12         
> 13          14
> 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 
> 0.495546559
> 0.435627530
>          15          16          17          18          19         
> 20          21
> 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 
> 0.585425101
> 0.519028340
>          22          23
> 0.556275304 0.503643725
> 
> and I need to calculate lambda**cnts for each bin, and each hour. I am
> also unsure of how to do this.
> 
> Thanks in advance kind people on this list
> Jim Rome
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

David Winsemius

2009-Dec-27 04:54 UTC

head link

[R] How to manipulate tables

On Dec 26, 2009, at 4:56 PM, James Rome wrote:
> Thanks David.
>
> I wanted to calculate the Poisson distribution from my histograms to  
> see
> how closely they match it. So the formula is something like
> pprob=((lambda**cnts)/factorial(cnts))*exp(lambda)
(Please don't reply privately.)

You have two vectors of different lengths, the lambdas and the counts,  
and you need to create the 2-way combinations in a form that can be  
supplied to a function.
> And I was hoping to do it for the whole table at once.
Something like this?

 > lambdas <-scan()   # scan can read a sequence of numbers separated  
by white space.

1: 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409  
0.160323887
7: 0.401619433
8:
Read 7 items

 > ldf <- expand.grid(lambdas, 1:15)
 > str(ldf)
'data.frame':	105 obs. of  2 variables:
$ Var1: num  0.19919 0.01377 0.00648 0.01781 0.09312 ...
$ Var2: int  1 1 1 1 1 1 1 2 2 2 ...
- attr(*, "out.attrs")=List of 2
  ..$ dim     : int  7 15
  ..$ dimnames:List of 2
  .. ..$ Var1: chr  "Var1=0.199190283" "Var1=0.013765182"  
"Var1=0.006477733" "Var1=0.017813765" ...
  .. ..$ Var2: chr  "Var2= 1" "Var2= 2" "Var2= 3"
"Var2= 4" ...

 > ldf$countprob <- (ldf$Var1**ldf$Var2)/ factorial(ldf$Var2)*exp(ldf 
$Var1)
 > ldf
           Var1 Var2     countprob
1   0.199190283    1 2.430946e-01
2   0.013765182    1 1.395597e-02
3   0.006477733    1 6.519830e-03
4   0.017813765    1 1.813394e-02
snipped remaining 100 lines...

-- 
David
> Jim Rome
>
> On 12/26/09 2:28 PM, David Winsemius wrote:
>>
>> On Dec 26, 2009, at 2:02 PM, James Rome wrote:
>>
>>> I am sorry to be bothering the list so much.
>>>
>>> I made a table of counts of flight arrivals by hour:
>>
>> No, you made a list of tables, which is different.
>>
>>> cnts=tapply(Arrival4,list(Hour),table). There are up to 15  
>>> arrivals in a
>>> bin.
>>
>> Why not work with something more regular, like (this is all guesswork
>> and untested in absence of test data objects):
>>
>> table(Arrival4, factor(Hour, levels=1:15)  )    ?
>> # or use whatever your max(arrivals) might be.
>>
>>> xx <- factor(sample(1:10, 4))
>>> xx
>> [1] 9 4 7 1
>> Levels: 1 4 7 9
>>
>>> table(factor(xx, levels=1:10) )
>>
>> 1  2  3  4  5  6  7  8  9 10
>> 1  0  0  1  0  0  1  0  1  0
>>
>>
>>
>> cnts
>>>
>>> $`0`
>>>
>>> 1  2  3  4  5  6  7  8  9 10 13
>>> 1  2  5  9  2  7  5  4  2  4  1
>>>
>>> $`1`
>>>
>>> 1 2 3 4
>>> 3 2 2 1
>>>
>>> $`2`
>>>
>>> 1 3
>>> 2 2
>>> .  . .
>>>
>>> My first problem is how to get this table filled in with the 0  
>>> slots.
>>> E.g., I want
>>> $`0`
>>>
>>> 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>>> 1  2  5  9  2  7  5  4  2  4   0   0   1   0   0
>>>
>>> for all 24 hours. The elements of the tables are lists, but I do
not
>>> seem to be able to extract the names of each list, which I think  
>>> would
>>> allow this manipulation.
>>>
>>> My second problem is that I want to compute probability  
>>> distributions. I
>>> have
>>> lambda
>>>         0           1           2           4           5
>>> 6           7
>>> 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409  
>>> 0.160323887
>>> 0.401619433
>>>         8           9          10          11          12
>>> 13          14
>>> 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895  
>>> 0.495546559
>>> 0.435627530
>>>        15          16          17          18          19
>>> 20          21
>>> 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802  
>>> 0.585425101
>>> 0.519028340
>>>        22          23
>>> 0.556275304 0.503643725
>>>
>>> and I need to calculate lambda**cnts for each bin, and each hour.  
>>> I am
>>> also unsure of how to do this.
>>
>> Can you give a justification for this procedure? (I'm not saying it
>> is
>> nonsense, only that it does not make make sense to me. So perhaps
>> there is some mathematical property about which I need education.)
>>
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Reasonably Related Threads

Search for more possibly parallel threads

R help - Dec 2009 - How to manipulate tables

[R] How to manipulate tables

[R] How to manipulate tables

[R] How to manipulate tables

[R] How to manipulate tables

Reasonably Related Threads