Jonathan Greenberg
2009-Aug-30 07:10 UTC
[R] Stochastic (transition) matrices: how to determine distributions and variance?
(apologies for the cross-posting, and for this being a more general stats question rather than a specific-to-R one. I assure you I will be doing the actual analysis in R :) I am trying to determine the distribution and variance for a classic stochastic (transition) matrix problem such that: let x(t) be an initial state vector consisting of counts of classes A, B and C: x(t) = [A(t),B(t),C(t)] T is the stochastic (transition) matrix for these classes consisting of the transition probabilities between each combination of A,B and C: pAA pBA pCA T= pAB pBB pCB pAC pBC pCC By doing matrix multiplication of Tx(t) we can determine the *mean* counts of these classes at t+1 such that: x mean (t+1) = Tx(t) = [A mean (t+1),B mean (t+1),C mean (t+1)] What I want to know is what is a) what is the *distribution* of A(t+1),B(t+1) and C(t+1), and what is the variance around these mean values? Since pXY are stochastic probabilities, it seems that the distribution and variance should be calculable. Thanks! --j -- Jonathan A. Greenberg, PhD Postdoctoral Scholar Center for Spatial Technologies and Remote Sensing (CSTARS) University of California, Davis One Shields Avenue The Barn, Room 250N Davis, CA 95616 Cell: 415-794-5043 AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307
Chris Stubben
2009-Aug-31 17:33 UTC
[R] Stochastic (transition) matrices: how to determine distributions and variance?
Jonathan Greenberg-2 wrote:> > I am trying to determine the distribution and variance for a classic > stochastic (transition) matrix >I have a few suggestions that I've tried with matrix population models (so probabilities are survival rates plus dead fates). # here's a test example A<-matrix(c(.1,.3,.6,.2,.4,.4, 0,0,1), nrow=3) n<-c(100, 300, 200) A %*% n [,1] [1,] 70 [2,] 150 [3,] 380 # columns should add to 1, correct? colSums(A) # two possible solutions # option 1. sample A from a multinomial distribution apply(A, 2, function(x) rmultinom(1, size = 1000, prob=x) /1000) [,1] [,2] [,3] [1,] 0.090 0.195 0 [2,] 0.322 0.401 0 [3,] 0.588 0.404 1 n1<-matrix(numeric(1000*3), ncol=3) for(i in 1:1000) { A2<-apply(A, 2, function(x) rmultinom(1, size = 1000, prob=x) /1000) n1[i,] <-A2 %*% n } apply(n1, 2, quantile, c(0.025,0.975)) [,1] [,2] [,3] 2.5% 62.3 140.5 370.4975 97.5% 78.3 159.6 389.8000 # option 2. Since I usually don't know anything about the distribution of values in A, I would resample transitions using a bootstrap # Get number of each transition a2<-data.frame( round ( as.table(t(A)*n))) Var1 Var2 Freq 1 A A 10 2 B A 60 3 C A 0 4 A B 30 5 B B 120 6 C B 0 7 A C 60 8 B C 120 9 C C 200 ## expand into 600 rows (so A->A repeats 10 times, B->A repeats 60 times, etc) r <- rep(row.names(a2), a2$Freq) y <- a2[r, 1:2 ] rownames(y)<-1:nrow(y) ## this gets the original matrix prop.table( table(y[,2], y[,1]), 2) # so resample rows in y z <- sample(nrow(y), replace = TRUE) y1<-y[z,] prop.table( table(y1[,2], y1[,1]), 2) %*% n ## and repeat 1000 times n2<-matrix(numeric(1000*3), ncol=3) for(i in 1:1000) { z <- sample(nrow(y), replace = TRUE) y1<-y[z,] n2[i,] <-prop.table( table(y1[,2], y1[,1]), 2) %*% n } # the CIs are much wider here apply(n2, 2, quantile, c(0.025,0.975)) [,1] [,2] [,3] 2.5% 55.89959 132.6868 361.2091 97.5% 85.11097 169.3798 399.7032 Chris Stubben -- View this message in context: http://www.nabble.com/Stochastic-%28transition%29-matrices%3A-how-to-determine-distributions-and-variance--tp25209297p25227303.html Sent from the R help mailing list archive at Nabble.com.