Luis E. Vargas
2014-Jan-24 00:56 UTC
[R] Comparing two HMMs to time series data using a Monte-Carlo method
Hi, we have some sequence data (animal vocalizations) and are wondering if someone could either confirm that our analysis strategy is sound or shoot it down (and hopefully make a recommendation for something better). We're using the *hmm.discnp R* package. For each individual, we have sequence data from two years, each treated as a single long sequence. For each individual, we want to test whether a new model fitted to year two data is significantly better at predicting year two sequences than is the year one model. We're thinking of a Monte-Carlo method as follows. 1. Fit an HMM to year one data (treated as one long sequence). Let’s call that HMM1. 2. Fit an HMM to year two data (treated as one long sequence). Let's call that HMM2. 3. Using HMM2, generate 99 (or whatever) shorter sequences of equal length. 4. For each generated sequence, find its log likelihood using HMM1. 5. For each generated sequence, find its log likelihood using HMM2. At this point we have 99 generated "year two" sequences that each have two log likelihoods, one from HMM1 (null model) and the second from HMM2 (alternative model). 6. For each generated sequence, find the log likelihood ratio using the LLs from HMM1 and HMM2. 7. Calculate LLR test stat = (log likelihood of the observed year one sequence under HMM1 - log likelihood of the observed year two sequence under HMM2)*2 8. Count how many LLRs are greater than test stat to determine if HMM2 is significantly better than HMM1 at predicting year two sequences. So our questions for the community are: A. Do you see any glaring errors in logic or methods? B. Does fitting HMM2 to sequences generated using HMM2 problematically bias our test to favor HMM2 over HMM1? We could use half of our year two data to create HMM2 and the other half to create a third HMM (HMMgenerate) used solely to generate the 99 sequences, but we'd like to use the entire data set from both years if possible. Thanks in advance for your replies! [[alternative HTML version deleted]]