Andrew Robinson
2006-Jul-31 12:36 UTC
[R] Three questions about a model for possibly periodic data with varying amplitude
Hi dear R community, I have up to 12 measures of a protein for each of 6 patients, taken every two or three days. The pattern of the protein looks periodic, but the height of the peaks is highly variable. It's something like this: patient <- data.frame( day = c(1, 3, 5, 8, 10, 12, 15, 17, 19, 22, 24, 26), protein = c(5, 3, 10, 7, 2, 8, 25, 12, 7, 20, 10, 5) ) plot(patient$day, patient$protein, type="b") My goal is two-fold: firstly, I need to test for periodicity, and secondly, I need to try to predict the temporal location of future peaks. Of course, the peaks might be occurring on unmeasured days. I have been looking at this model: wave.form <- deriv3( ~ sin(2*pi*((day-offset)/period + 0.25)) * amplitude + mean, c("period", "offset", "amplitude", "mean"), function(day, period, offset, amplitude, mean){}) curve(wave.form(x, period=7, offset=2, mean=5, amplitude=4), from=1, to=30) So, for my purposes, the mean and the amplitude seem to be irrelevant; I want to estimate the location and the offset. To get going I've been using the following approach: require(nlme) wave.1 <- gnls(protein ~ wave.form(day, period, offset, amplitude, mean), start=list(period=7, offset=0, amplitude=10, mean=0), weights=varPower(), data=patient) ... which seems to fit the wave form pretty nicely. Question 1) I'm wondering, however, if anyone can suggest any improvements to my model or fitting procedure, or general approach. Generalizing to a non-linear mixed effects model is proving difficult. For example, ################################################################# set.seed(1234) patients <- expand.grid(patient.no = 1:6, day = patient$day) patient.params <- data.frame(patient.no = 1:6, period = c(5,6,7,8,7,6), offset = c(1,2,3,1,2,3), amplitude = c(10,6,10,6,10,6), mean = c(22,14,22,14,22,14)) patients <- merge(patients, patient.params) patients$protein <- with(patients, wave.form(day, period, offset, amplitude, mean)) + rnorm(n=dim(patients)[1], mean=5, sd=2) patients <- groupedData(protein ~ day | patient.no, data=patients) protein.nlme <- nlme(protein ~ wave.form(day, period, offset, amplitude, mean), fixed = period + offset + mean + amplitude ~ 1, random = period + offset ~ 1 | patient.no, start = c(period=2*pi, offset=0, mean=10, amplitude=5), data = patients) random.effects(protein.nlme) ################################################################# I get the following values for the BLUPS: period offset 2 0 -5.738510e-09 4 0 -6.426104e-08 6 0 6.847430e-09 1 0 6.275570e-07 5 0 -1.486590e-06 3 0 9.221848e-07 It seems clear to me that these BLUPS are quite unsuitable. Question 2) Can anyone suggest how I might improve my use of nlme? Other than using more data :) Question 3) Finally, I'm also wondering what would be a suitable test for periodicity for these data. I'd like to test the null hypothesis that the data are not periodic. All suggestions, discussion, welcome! Best wishes Andrew -- Andrew Robinson Department of Mathematics and Statistics Tel: +61-3-8344-9763 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599 Email: a.robinson at ms.unimelb.edu.au http://www.ms.unimelb.edu.au