Scott.Wilkinson at csiro.au
2007-Aug-08 04:42 UTC
[R] Define t-distribution using data, and ghyp package
Dear R users,

I am fitting a t-distribution to some data, then sampling randomly from the fitted distribution. I am using the brand new ghyp package, which seems designed to do this.

Firstly, is this approach appropriate, or are there alternatives I should also consider? I have a small dataset. The hyperbolic distribution is also used when heavy tails are expected, but I understand the t-distribution is designed for understanding variance when data are limited.

I have three questions specific to ghyp. As this package is new, perhaps my limited knowledge is compounded by some teething issues:

1. How should I set argument nu in function fit.tuv()? It appears to be a starting value. The preliminary draft help file says "nu is defined as -2*lambda", but lambda is not defined.

   Experimenting with Data1 below:
   - nu=2.5 gives no error messages and hist() looks vaguely reasonable, although printing Dist1 shows Converge=FALSE.
   - nu=2 gives "Error in FUN(newX[, i], ...) : If lambda < 0: chi must be > 0 and psi must be >= 0! lambda -1; chi = 0; psi = 0".
   - nu=3 gives "Warning message: NaNs produced in: sqrt(diag(hess))", and again printing Dist1 shows Converge=FALSE, but the resulting hist() is also scrambled.

   Looking at Data2, nu=2.5 again gives no error messages, and printing Dist2 shows Converge=TRUE. Interestingly, Dist2 lists nu=18.2927194, but if I set nu=18 in fit.tuv() I get "Warning message: NaNs produced in: sqrt(diag(hess))" and Converge=FALSE (!).

2. What is the significance/consequence of Converge=FALSE? How can I achieve Converge=TRUE, apart from collecting more data?

3. In fit.tuv() I have experimented with setting argument symmetric=TRUE, but why is the distribution always reported as "Asymmetric Student-t"?
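As a quick sanity check on question 1, here is the arithmetic if the help file's relation nu = -2*lambda is taken at face value. This is only my reading of the draft help file, not a documented definition of lambda:

```latex
\nu = -2\lambda \quad\Longleftrightarrow\quad \lambda = -\frac{\nu}{2}
\\[4pt]
\nu = 2 \;\Rightarrow\; \lambda = -1, \qquad
\nu = 2.5 \;\Rightarrow\; \lambda = -1.25, \qquad
\nu = 3 \;\Rightarrow\; \lambda = -1.5
```

So nu=2 reproduces the "lambda -1" reported in the error message. The "chi = 0" in the same message suggests chi is also tied to nu, but the help file does not say how, so that part is a guess on my side.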
Code:
************************
library(ghyp)

# Dataset 1: fit with starting value nu = 2.5
Data1 <- c(37.07000, 46.94609, 38.19270, 41.98090,
           36.45126, 45.25217, 39.07771, 39.35987)
Dist1 <- fit.tuv(Data1, nu = 2.5,
                 opt.pars = c(nu = TRUE, mu = TRUE, sigma = TRUE),
                 symmetric = TRUE)
hist(Dist1, data = Data1, gaussian = TRUE, log.hist = FALSE,
     ylim = c(0, 0.5), ghyp.col = 1, ghyp.lwd = 1, ghyp.lty = "solid",
     col = 1, nclass = 30)

# Dataset 2: same settings, different data
Data2 <- c(0.07, 2.68, 2.00, 0.27, 0.39, 1.17,
           1.34, 4.34, 3.36, 3.61, 1.28)
Dist2 <- fit.tuv(Data2, nu = 2.5,
                 opt.pars = c(nu = TRUE, mu = TRUE, sigma = TRUE),
                 symmetric = TRUE)
hist(Dist2, data = Data2, gaussian = TRUE, log.hist = FALSE,
     ylim = c(0, 1), ghyp.col = 1, ghyp.lwd = 1, ghyp.lty = "solid",
     col = 1, nclass = 30)
************************

Thanks in advance,
Scott.