I am sorry to be bothering the list so much. I made a table of counts of flight arrivals by hour: cnts=tapply(Arrival4,list(Hour),table). There are up to 15 arrivals in a bin.> cnts$`0` 1 2 3 4 5 6 7 8 9 10 13 1 2 5 9 2 7 5 4 2 4 1 $`1` 1 2 3 4 3 2 2 1 $`2` 1 3 2 2 . . . My first problem is how to get this table filled in with the 0 slots. E.g., I want $`0` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 1 2 5 9 2 7 5 4 2 4 0 0 1 0 0 for all 24 hours. The elements of the tables are lists, but I do not seem to be able to extract the names of each list, which I think would allow this manipulation. My second problem is that I want to compute probability distributions. I have lambda 0 1 2 4 5 6 7 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 0.160323887 0.401619433 8 9 10 11 12 13 14 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 0.495546559 0.435627530 15 16 17 18 19 20 21 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 0.585425101 0.519028340 22 23 0.556275304 0.503643725 and I need to calculate lambda**cnts for each bin, and each hour. I am also unsure of how to do this. Thanks in advance kind people on this list Jim Rome
On Dec 26, 2009, at 2:02 PM, James Rome wrote:> I am sorry to be bothering the list so much. > > I made a table of counts of flight arrivals by hour:No, you made a list of tables, which is different.> cnts=tapply(Arrival4,list(Hour),table). There are up to 15 arrivals > in a > bin.Why not work with something more regular, like (this is all guesswork and untested in absence of test data objects): table(Arrival4, factor(Hour, levels=1:15) ) ? # or use whatever your max(arrivals) might be. > xx <- factor(sample(1:10, 4)) > xx [1] 9 4 7 1 Levels: 1 4 7 9 > table(factor(xx, levels=1:10) ) 1 2 3 4 5 6 7 8 9 10 1 0 0 1 0 0 1 0 1 0 cnts> > $`0` > > 1 2 3 4 5 6 7 8 9 10 13 > 1 2 5 9 2 7 5 4 2 4 1 > > $`1` > > 1 2 3 4 > 3 2 2 1 > > $`2` > > 1 3 > 2 2 > . . . > > My first problem is how to get this table filled in with the 0 slots. > E.g., I want > $`0` > > 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 > 1 2 5 9 2 7 5 4 2 4 0 0 1 0 0 > > for all 24 hours. The elements of the tables are lists, but I do not > seem to be able to extract the names of each list, which I think would > allow this manipulation. > > My second problem is that I want to compute probability > distributions. I > have > lambda > 0 1 2 4 5 > 6 7 > 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 > 0.160323887 > 0.401619433 > 8 9 10 11 12 > 13 14 > 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 > 0.495546559 > 0.435627530 > 15 16 17 18 19 > 20 21 > 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 > 0.585425101 > 0.519028340 > 22 23 > 0.556275304 0.503643725 > > and I need to calculate lambda**cnts for each bin, and each hour. I am > also unsure of how to do this.Can you give a justification for this procedure? (I'm not saying it is nonsense, only that it does not make make sense to me. So perhaps there is some mathematical property about which I need education.) -- David Winsemius, MD Heritage Laboratories West Hartford, CT
> -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of James Rome > Sent: Saturday, December 26, 2009 11:03 AM > To: r-help at r-project.org > Subject: [R] How to manipulate tables > > I am sorry to be bothering the list so much. > > I made a table of counts of flight arrivals by hour: > cnts=tapply(Arrival4,list(Hour),table). There are up to 15 > arrivals in a > bin. > > cnts > $`0` > > 1 2 3 4 5 6 7 8 9 10 13 > 1 2 5 9 2 7 5 4 2 4 1 > > $`1` > > 1 2 3 4 > 3 2 2 1 > > $`2` > > 1 3 > 2 2 > . . . > > My first problem is how to get this table filled in with the 0 slots. > E.g., I want > $`0` > > 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 > 1 2 5 9 2 7 5 4 2 4 0 0 1 0 0 > > for all 24 hours. The elements of the tables are lists, but I do not > seem to be able to extract the names of each list, which I think would > allow this manipulation.Let's make a fake dataset to demonstrate things more easily. set.seed(1) data <- data.frame(Arrival4=rpois(50, 4)+1, Hour=sample(0:4, replace=TRUE, size=50)) Your tapply(,FUN=table) approach with this data gives > with(data, tapply(Arrival4,list(Hour),table)) $`0` 3 4 6 8 2 2 1 1 $`1` 3 4 5 6 7 8 3 2 2 3 1 3 ... 2 more one dimensional tables, `3` and `4` ... I prefer making a two dimensional table so the rows and columns have common meanings > tab <- with(data, table(Hour, Arrival4)) > tab Arrival4 Hour 1 2 3 4 5 6 7 8 11 0 0 0 2 2 0 1 0 1 0 1 0 0 3 2 2 3 1 3 0 2 0 2 0 2 3 3 0 0 0 3 0 0 0 0 2 4 4 0 1 4 1 0 2 3 1 2 0 0 0 > tab[2,] # 2nd row (Hour==1) 1 2 3 4 5 6 7 8 11 0 0 3 2 2 3 1 3 0 > tab["1",] # would give the same as tab[2,] But this is missing columns for Arrival4 in c(10,12,13,14,15). If you make Arrival4 a factor with levels 1:15 the table will include entries for each level. In the following we don't actually change Arrival4 itself, but create a factor using its data when we call table(). > tab <- with(data, table(Hour, Arrival4=factor(Arrival4,levels=1:15))) > tab Arrival4 Hour 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 0 0 2 2 0 1 0 1 0 0 0 0 0 0 0 1 0 0 3 2 2 3 1 3 0 0 0 0 0 0 0 2 0 2 0 2 3 3 0 0 0 0 0 0 0 0 0 3 0 0 0 0 2 4 4 0 0 0 1 0 0 0 0 4 1 0 2 3 1 2 0 0 0 0 0 0 0 0 0 > tab[2,] # 2nd row (Hour==1) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 0 3 2 2 3 1 3 0 0 0 0 0 0 0 If you have no data for certain hours, you may have to use the same technique for Hour. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com> My second problem is that I want to compute probability > distributions. I > have > lambda > 0 1 2 4 5 > 6 7 > 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 > 0.160323887 > 0.401619433 > 8 9 10 11 12 > 13 14 > 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 > 0.495546559 > 0.435627530 > 15 16 17 18 19 > 20 21 > 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 > 0.585425101 > 0.519028340 > 22 23 > 0.556275304 0.503643725 > > and I need to calculate lambda**cnts for each bin, and each hour. I am > also unsure of how to do this. > > Thanks in advance kind people on this list > Jim Rome > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
On Dec 26, 2009, at 4:56 PM, James Rome wrote:> Thanks David. > > I wanted to calculate the Poisson distribution from my histograms to > see > how closely they match it. So the formula is something like > pprob=((lambda**cnts)/factorial(cnts))*exp(lambda)(Please don't reply privately.) You have two vectors of different lengths, the lambdas and the counts, and you need to create the 2-way combinations in a form that can be supplied to a function.> And I was hoping to do it for the whole table at once.Something like this? > lambdas <-scan() # scan can read a sequence of numbers separated by white space. 1: 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 0.160323887 7: 0.401619433 8: Read 7 items > ldf <- expand.grid(lambdas, 1:15) > str(ldf) 'data.frame': 105 obs. of 2 variables: $ Var1: num 0.19919 0.01377 0.00648 0.01781 0.09312 ... $ Var2: int 1 1 1 1 1 1 1 2 2 2 ... - attr(*, "out.attrs")=List of 2 ..$ dim : int 7 15 ..$ dimnames:List of 2 .. ..$ Var1: chr "Var1=0.199190283" "Var1=0.013765182" "Var1=0.006477733" "Var1=0.017813765" ... .. ..$ Var2: chr "Var2= 1" "Var2= 2" "Var2= 3" "Var2= 4" ... > ldf$countprob <- (ldf$Var1**ldf$Var2)/ factorial(ldf$Var2)*exp(ldf $Var1) > ldf Var1 Var2 countprob 1 0.199190283 1 2.430946e-01 2 0.013765182 1 1.395597e-02 3 0.006477733 1 6.519830e-03 4 0.017813765 1 1.813394e-02 snipped remaining 100 lines... -- David> Jim Rome > > On 12/26/09 2:28 PM, David Winsemius wrote: >> >> On Dec 26, 2009, at 2:02 PM, James Rome wrote: >> >>> I am sorry to be bothering the list so much. >>> >>> I made a table of counts of flight arrivals by hour: >> >> No, you made a list of tables, which is different. >> >>> cnts=tapply(Arrival4,list(Hour),table). There are up to 15 >>> arrivals in a >>> bin. >> >> Why not work with something more regular, like (this is all guesswork >> and untested in absence of test data objects): >> >> table(Arrival4, factor(Hour, levels=1:15) ) ? >> # or use whatever your max(arrivals) might be. >> >>> xx <- factor(sample(1:10, 4)) >>> xx >> [1] 9 4 7 1 >> Levels: 1 4 7 9 >> >>> table(factor(xx, levels=1:10) ) >> >> 1 2 3 4 5 6 7 8 9 10 >> 1 0 0 1 0 0 1 0 1 0 >> >> >> >> cnts >>> >>> $`0` >>> >>> 1 2 3 4 5 6 7 8 9 10 13 >>> 1 2 5 9 2 7 5 4 2 4 1 >>> >>> $`1` >>> >>> 1 2 3 4 >>> 3 2 2 1 >>> >>> $`2` >>> >>> 1 3 >>> 2 2 >>> . . . >>> >>> My first problem is how to get this table filled in with the 0 >>> slots. >>> E.g., I want >>> $`0` >>> >>> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 >>> 1 2 5 9 2 7 5 4 2 4 0 0 1 0 0 >>> >>> for all 24 hours. The elements of the tables are lists, but I do not >>> seem to be able to extract the names of each list, which I think >>> would >>> allow this manipulation. >>> >>> My second problem is that I want to compute probability >>> distributions. I >>> have >>> lambda >>> 0 1 2 4 5 >>> 6 7 >>> 0.199190283 0.013765182 0.006477733 0.017813765 0.093117409 >>> 0.160323887 >>> 0.401619433 >>> 8 9 10 11 12 >>> 13 14 >>> 0.191093117 0.177327935 0.318218623 0.404858300 0.463157895 >>> 0.495546559 >>> 0.435627530 >>> 15 16 17 18 19 >>> 20 21 >>> 0.418623482 0.307692308 0.405668016 0.484210526 0.580566802 >>> 0.585425101 >>> 0.519028340 >>> 22 23 >>> 0.556275304 0.503643725 >>> >>> and I need to calculate lambda**cnts for each bin, and each hour. >>> I am >>> also unsure of how to do this. >> >> Can you give a justification for this procedure? (I'm not saying it >> is >> nonsense, only that it does not make make sense to me. So perhaps >> there is some mathematical property about which I need education.) >>David Winsemius, MD Heritage Laboratories West Hartford, CT