thr3ads.net - R help - [R] NLS sensitivity to start= values or poles in data range [May 2012]

If this information is useful, please help other people find it:
Share via:

Christopher Battles

2012-May-11 20:33 UTC

[R] NLS sensitivity to start= values or poles in data range

Greetings R-help!  I'm fairly new to R and am trying to expand my knowledge
beyond using R for simple summary statistics and basic tests.  To that end I am
attempting to write an interactive R-script that will perform a general rational
function fit to a given dataset based on the example given at
http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd64.htm.

The problem that I seem to have is that the fitted function is nearly identical
to the NIST results when I use their initial values (these are representative
values that are close to actual data, but not taken from the dataset) for
generating the starting point for the nls() routine.  However if I try to use
actual data points extracted from the dataset for the preliminary fit, the
fitted function can often fail to approximate the data.

One culprit that I suspect may be that a pole (real root of the denominator) of
the initial fit is laying in the range of the data, since the NIST starting
values produce a preliminary fit that does not have any poles over the data
range.  Unfortunately, I'm not familiar enough with the nls() routines to
pass judgement.

The attached code reads in the data from the web, performs the preliminary fit
using lm(), passes that to nls() and plots the data, starting values,
preliminary fit and final fit. The code block in the middle defining indsubset
and depsubset contain three different options for the starting values. I thank
you all in advance for any help or insights.

Simplified code follows...  Actual code for the interactive version is available
at http://www.schrodingersghost.com/RationalFit-0.2.R


  # Start of Code	
  # Read and plot the data
  
  InputData <-
read.table("http://itl.nist.gov/div898/strd/nls/data/LINKS/DATA/Hahn1.dat",
				  header=0,
				  row.names=NULL, 
				  col.names=c("CTE","K"),
				  skip=60)
  indvar <- 2
  depvar <- 1
  
  indvarname <- names(InputData)[indvar]
  depvarname <- names(InputData)[depvar]
  
  attach(InputData)
  matplot(InputData[indvar],
          InputData[depvar], 
          type='p', pch=4, 
          xlab = indvarname, 
          ylab = depvarname, 
          main = bquote(paste(.(depvarname), " vs ", .(indvarname)))
          )

  # Specify values for determining nls start
  
  ########################################
  # Actual Data equally spaced in data set
  
  #indsubset <- c(14.13, 72.77, 163.48, 273.13, 419.51, 549.53, 850.98)
  #depsubset <- c(0.08, 7.267, 13.974, 16.337, 17.767, 18.61, 21.085)

  # Actual Data closest values to NIST starting values
  
  #indsubset <- c(14.13,31.30,40.03,50.24,120.25,202.14,845.97)
  #depsubset <- c(0.08,1.089,2.241,3.697,12.005,15.190,20.830)
  
  # NIST Starting Values
  
  indsubset <- c(10,30,40,50,120,200,800)
  depsubset <- c(0,2,3,5,12,15,20)
  #
  #########################################

  points(indsubset,depsubset, type="p", pch=19, col="red")
  
  # Generate starting values for nls per NIST process
  
  initmodel <- (depsubset ~   I(indsubset^1) 
					+ I(indsubset^2) 
					+ I(indsubset^3) 
					+ I(-1 * indsubset^1 * depsubset) 
					+ I(-1 * indsubset^2 * depsubset) 
					+ I(-1 * indsubset^3 * depsubset))

  InputDatalm <- lm(initmodel)
  summary(InputDatalm)
  fitted.values(InputDatalm)
  nlsstart <- coef(InputDatalm)  
  names(nlsstart) <- c("A0", "A1", "A2",
"A3", "B1", "B2", "B3")
  nlsstart <- as.list(nlsstart)
  attach(nlsstart)
  curve((A0*x^0+A1*x^1+A2*x^2+A3*x^3)/(1+B1*x^1+B2*x^2+B3*x^3), 
		from = min(K), 
		to = max(K), 
		add=TRUE, 
		col="blue", 
		lty=5, lwd=2 )
  
  # Check the poles of the starting function
  
  poles <- polyroot(c(1,B1,B2,B3))
  cat((poles), '\n')
  
  # Run the nls using above values as starting points
  
  fullmodel <- {CTE ~ (A0 * K^0 + A1 * K^1 + A2 * K^2 + A3 * K^3)/
			    (1 + B1 * K^1 + B2 * K^2 + B3 * K^3)}
  rationalfit <- nls(fullmodel,start=nlsstart)
  summary(rationalfit)
  rationalcoef <- as.list(coef(rationalfit))
  attach(rationalcoef)
  curve((A0*x^0+A1*x^1+A2*x^2+A3*x^3)/(1+B1*x^1+B2*x^2+B3*x^3), 
		from = min(K), 
		to = max(K), 
		add=TRUE, lwd=2)
  legend(600,5,c("Starting Values","Initial Value
Fit","Full Model Fit","Data"),
		lty=c(0,5,1,0),
		pch=c(19,NA,NA,4),
		col=c("red","blue","black","black"),
		plot=TRUE)

  # Check the poles of the final function
  
  poles <- polyroot(c(1,B1,B2,B3))
  cat((poles), '\n')
  
  # Cleanup
  detach(InputData)
  detach(rationalcoef)
  # End of Code




Thank you,

Christopher Battles

Possibly Parallel Threads

Search for more apparently analagous threads

R help - May 2012 - NLS sensitivity to start= values or poles in data range

[R] NLS sensitivity to start= values or poles in data range

Possibly Parallel Threads

Wisdom of the Ancients