Justin Balthrop
2015-Nov-30 15:19 UTC
[R] General copula model with heterogeneous marginals
I am looking to model the sum of a number of random variables with arbitrary gamma distributions and an empirical dependence structure that I obtain from data. Basically, I observe all of the individual pieces but want to model their sum. This differs from many copula questions, which observe a single outcome of a multivariate process and seek to fit a plausible marginal and covariance structure.

It has been years since I coded in R, but this is what I have thus far:

    library(copula)
    library(scatterplot3d)
    library(psych)

    set.seed(1)

    # Gaussian copula, 7 dimensions, unstructured dispersion:
    # 7 * 6 / 2 = 21 pairwise parameters (placeholder values)
    myCop <- normalCopula(param = c(.1, .1, .1, .1, .1,
                                    .2, .2, .2, .2, .2, .2, .2,
                                    .4, .4, .4, .4, .4,
                                    .5, .5, .5, .5),
                          dim = 7, dispstr = "un")

    # Attach a gamma marginal to each of the 7 dimensions
    myMvd <- mvdc(copula = myCop, margins = rep("gamma", 7),
                  paramMargins = list(list(shape = 3, scale = 4),
                                      list(shape = 2, scale = 5),
                                      list(shape = 2, scale = 5),
                                      list(shape = 2, scale = 5),
                                      list(shape = 2, scale = 5),
                                      list(shape = 3, scale = 5),
                                      list(shape = 3, scale = 5)))

    simulation <- rMvdc(20000, myMvd)
    colnames(simulation) <- c("P1", "P2", "P3", "P4", "P5", "P6", "P7")

    # Sum across the 7 simulated variables for each draw
    total <- rowSums(simulation)

As you can see, I have hard-coded 7 gamma distributions with a placeholder covariance matrix as input. The problem is that I am looking to generalize this to the order of ~150 different marginals with potentially differing distributions and parameters.

Ultimately I will have the following input:

- a matrix of 150 marginal distributions with family and parameters
- a 150x150 covariance matrix

And what I need to produce is the following: an empirical CDF/PDF of the sum of realizations from 5-10 of the underlying marginal distributions.

To be more clear, assume each marginal distribution is a person's response to a treatment, and I need to calculate the cumulative treatment effect for a sub-group of the population of 150. So, I will have a vector of 0s and 1s to identify which members of the population are grouped together for a trial. Then I will have a separate vector for the next group. Each group vector will have length 150, with between 5 and 10 1s and the rest 0s. I need a different empirical CDF for each vector.

Any help?
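One possible way to generalize the setup described above is sketched here in base R, under stated assumptions. It does not use rMvdc; instead it samples a Gaussian copula by hand (correlated normals, then pnorm, then each marginal's quantile function). The names marginals, Sigma, groups, and simulate_copula are hypothetical, the example uses 4 variables and 2 groups rather than 150 variables with 5-10 members per group, and the exchangeable Sigma is only a placeholder. It also assumes that converting the covariance matrix with cov2cor() gives an acceptable copula correlation; note that this is the correlation of the latent normals, not the Pearson correlation of the transformed marginals.

```r
set.seed(1)

# Hypothetical marginal specification: one entry per variable.
# 'family' is the suffix of R's quantile function (qgamma, qlnorm, ...).
marginals <- list(
  list(family = "gamma", params = list(shape = 3, scale = 4)),
  list(family = "gamma", params = list(shape = 2, scale = 5)),
  list(family = "lnorm", params = list(meanlog = 1, sdlog = 0.5)),
  list(family = "gamma", params = list(shape = 3, scale = 5))
)
d <- length(marginals)

# Placeholder covariance matrix (variance 4, correlation 0.3 everywhere).
Sigma <- matrix(0.3 * 4, d, d)
diag(Sigma) <- 4

# Draw n joint realizations from a Gaussian copula with the given marginals.
simulate_copula <- function(n, marginals, Sigma) {
  R <- cov2cor(Sigma)                  # copula needs a correlation matrix
  d <- ncol(R)
  # Rows of Z are N(0, R): iid normals times the upper Cholesky factor of R
  Z <- matrix(rnorm(n * d), n, d) %*% chol(R)
  U <- pnorm(Z)                        # uniform marginals, Gaussian dependence
  X <- sapply(seq_len(d), function(j) {
    qfun <- match.fun(paste0("q", marginals[[j]]$family))
    do.call(qfun, c(list(U[, j]), marginals[[j]]$params))
  })
  colnames(X) <- paste0("P", seq_len(d))
  X
}

sim <- simulate_copula(20000, marginals, Sigma)

# Hypothetical group indicators: one row per group, one column per variable.
groups <- rbind(c(1, 1, 0, 0),
                c(0, 1, 1, 1))

# Column g of 'totals' holds the simulated sum for group g.
totals <- sim %*% t(groups)

# One empirical CDF per group.
cdfs <- lapply(seq_len(ncol(totals)), function(g) ecdf(totals[, g]))
```

Each element of cdfs is a function, so cdfs[[1]](30) estimates the probability that the first group's total is at most 30, and quantile(totals[, 1], c(0.05, 0.5, 0.95)) or density(totals[, 1]) gives quantiles or a smoothed PDF. Simulating all variables once and reusing the draws for every group vector keeps the groups comparable; whether that or per-group simulation is preferable depends on the application.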