I'm using nls to fit periodic gene-expression data to sine waves. I need to set the upper and lower boundaries, because I do not want any negative phase and amplitude solutions. This means that I have to use the "port" algorithm. The problem is, that depending on what start value I choose for phase, the fit works for some cases, but not for others. In the example below, the fit works using phase=pi, but not using phase=0. But there are many examples which fit just fine using 0. Is there a comparable alternative to nls that is not so extremely influenced by the start values? # Data for example fit lowervals <- list(phase=0, amp=0) uppervals <- list(phase=2*pi, amp=2) afreq <- 1 / (24 / 2 / pi) gene_expression <- c(1.551383, 1.671742, 1.549499, 1.694480, 1.632436, 1.471568, 1.623381, 1.579361, 1.809394, 1.753223, 1.685918, 1.754968, 1.963069, 1.820690, 1.985159, 2.205064, 2.160308, 2.120189, 2.194758, 2.165993, 2.189981, 2.098671, 2.122207, 2.012621, 1.963610, 1.884184, 1.955160, 1.801175, 1.829686, 1.773260, 1.588768, 1.563774, 1.559192) tpoints <- c(0,0,0,2,2,2,4,4,4,6,6,6,8,8,8,12,12,12,14,14,14,16,16,16,18,18,18,20,20,20,24,24,24) shift=mean(gene_expression) # y-axis (expression) shift # Perfect fit startvals <- list(phase=pi, amp=0.5) sine_nls <- nls(gene_expression ~ sin(tpoints * afreq + phase) * amp + shift, start=startvals, algorithm="port", lower=lowervals, upper=uppervals) # Convergence failure startvals <- list(phase=0, amp=0.5) sine_nls <- nls(gene_expression ~ sin(tpoints * afreq + phase) * amp + shift, start=startvals, algorithm="port", lower=lowervals, upper=uppervals)
Niklaus Fankhauser <niklaus.fankhauser <at> cell.biol.ethz.ch> writes:> I'm using nls to fit periodic gene-expression data to sine waves. I need > to set the upper and lower boundaries, because I do not want any > negative phase and amplitude solutions. This means that I have to use > the "port" algorithm. The problem is, that depending on what start value > I choose for phase, the fit works for some cases, but not for others. > In the example below, the fit works using phase=pi, but not using > phase=0. But there are many examples which fit just fine using 0. > > Is there a comparable alternative to nls that is not so extremely > influenced by the start values? >Use package `nls2' to first search on a grid, and then apply `nls' again to identify the globally best point: # Data for example fit afreq <- 1 / (24 / 2 / pi) tpoints <- c(0,0,0,2,2,2,4,4,4,6,6,6,8,8,8,12,12,12, 14,14,14,16,16,16,18,18,18,20,20,20,24,24,24) gene_expression <- c(1.551383, 1.671742, 1.549499, 1.694480, 1.632436, 1.471568, 1.623381, 1.579361, 1.809394, 1.753223, 1.685918, 1.754968, 1.963069, 1.820690, 1.985159, 2.205064, 2.160308, 2.120189, 2.194758, 2.165993, 2.189981, 2.098671, 2.122207, 2.012621, 1.963610, 1.884184, 1.955160, 1.801175, 1.829686, 1.773260, 1.588768, 1.563774, 1.559192) shift=mean(gene_expression) # y-axis (expression) shift # Grid search library("nls2") frml <- gene_expression ~ sin(tpoints * afreq + phase) * amp + shift startdf <- data.frame(phase=c(0, 2*pi), amp = c(0, 2)) nls2(frml, algorithm = "grid-search", start = startdf, control = list(maxiter=200)) # Perfect fit startvals <- list(phase = 4.4880, amp = 0.2857) sine_nls <- nls(frml, start=startvals) # phase amp # 4.3964 0.2931 # residual sum-of-squares: 0.1378 Maybe this can be done in one step. Hans Werner