John Hillier
2016-Mar-09 17:52 UTC
[R] truncpareto() - doesn't like my data and odd error message
Dear All,

I am attempting to describe a distribution of height data. It appears roughly linear on a log-log plot, so Pareto seems sensible. However, the data are only reliable in a limited range (e.g. 2000 to 4800 m), so I would like to fit a Pareto distribution to the reliable (i.e. truncated) section of the data.

I found truncpareto() and ran one of its example uses successfully, specifically the third one at http://www.inside-r.org/packages/cran/vgam/docs/paretoff (see also the p.s.).

When I try to run my data, I get the output below. Inputs are shown with chevrons.

> pdataH <- data.frame(H_to_fit$Height)
> summary(pdataH)
 H_to_fit.Height
 Min.   :2000
 1st Qu.:2281
 Median :2666
 Mean   :2825
 3rd Qu.:3212
 Max.   :4794
> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
Error in eval(expr, envir, enclos) :
  the value of argument 'lower' is too high (requires '0 < lower < min(y)')

This is odd, as the usage format is truncpareto(lower, upper), and changing 2000 to 1900 or 2100 makes no difference. Neither do smaller or larger variations. From the summary I think that my lowest input is 2000, which I am taking as min(y). I have also played with the upper limit. pdataH has 2117 observations in it.

Is this a data format thing, i.e. of pdataH? (I tried a few things, but to no avail.)
Is truncpareto sensitive to not converging?
Am I using completely the wrong command?

Thank you in advance for any assistance you can give.

John

p.s. Example that I did get to run:
# Upper truncated Pareto distribution
lower <- 2; upper <- 8; kay <- exp(2)
pdata3 <- data.frame(y = rtruncpareto(n = 100, lower = lower,
                                      upper = upper, shape = kay))
fit3 <- vglm(y ~ 1, truncpareto(lower, upper), data = pdata3, trace = TRUE)
coef(fit3, matrix = TRUE)
c(fit3@misc$lower, fit3@misc$upper)

and output

> # Upper truncated Pareto distribution
> lower <- 2; upper <- 8; kay <- exp(2)
> pdata3 <- data.frame(y = rtruncpareto(n = 100, lower = lower,
+                                       upper = upper, shape = kay))
> fit3 <- vglm(y ~ 1, truncpareto(lower, upper), data = pdata3, trace = TRUE)
VGLM linear loop 1 : loglikelihood = 12.127363
VGLM linear loop 2 : loglikelihood = 12.130407
VGLM linear loop 3 : loglikelihood = 12.130407
> coef(fit3, matrix = TRUE)
            loge(shape)
(Intercept)    1.955295
> c(fit3@misc$lower, fit3@misc$upper)
[1] 2 8

-------------------------
Dr John Hillier
Senior Lecturer - Physical Geography
Loughborough University
01509 223727
peter dalgaard
2016-Mar-09 19:58 UTC
[R] truncpareto() - doesn't like my data and odd error message
> On 09 Mar 2016, at 18:52 , John Hillier <J.Hillier at lboro.ac.uk> wrote:
>
> [...]
>
>> pdataH <- data.frame(H_to_fit$Height)
>> summary(pdataH)
> H_to_fit.Height
> Min.   :2000
> 1st Qu.:2281
> Median :2666
> Mean   :2825
> 3rd Qu.:3212
> Max.   :4794
>> fit3 <- vglm(y ~ 1, truncpareto(2000, 4794), data = pdataH, trace = TRUE)
> Error in eval(expr, envir, enclos) :
>   the value of argument 'lower' is too high (requires '0 < lower < min(y)')
>
> [...]
>
> Is this a data format thing? i.e. of pdataH (a tried a few things, but to no avail)
>

Umm, it doesn't seem to have a column called "y"?

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
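The point about the missing "y" column can be checked with base R alone. The sketch below is only an illustration: the H_to_fit values are made up (the real data are not shown in the thread), and pdata_unnamed and pdata_named are hypothetical names. It shows that data.frame(H_to_fit$Height) names its column after the expression, so a formula such as y ~ 1 will not find a "y" in that data frame.

```r
# Hypothetical stand-in for the poster's data (the real H_to_fit is not shown).
H_to_fit <- data.frame(Height = c(2000.4, 2281, 2666, 3212, 4794.2))

# Built as in the original post: the column takes its name from the expression.
pdata_unnamed <- data.frame(H_to_fit$Height)
names(pdata_unnamed)                   # "H_to_fit.Height"

# Built with an explicit name, so that y ~ 1 can find the response.
pdata_named <- data.frame(y = H_to_fit$Height)
names(pdata_named)                     # "y"

# Is the response variable of the formula a column of each data frame?
response <- all.vars(y ~ 1)[1]
response %in% names(pdata_unnamed)     # FALSE
response %in% names(pdata_named)       # TRUE
```

When a formula variable is not a column of data, R's model-frame machinery falls back to the environment of the formula. A leftover y in the workspace might therefore be picked up silently, which is one possible (unconfirmed) reason for seeing an error about 'lower' rather than "object 'y' not found".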
John Hillier
2016-Mar-10 08:22 UTC
[R] truncpareto() - doesn't like my data and odd error message
Thank you Peter,

I believe this might be the way the error message is hard coded (i.e. it's always y to describe the input). Anyway, I changed the first line to

> pdataH <- data.frame(y = H_to_fit$Height)

This makes the input 'y' instead of 'H_to_fit.Height', but makes no difference to the outcome/error message.

John

-------------------------
Dr John Hillier
Senior Lecturer - Physical Geography
Loughborough University
01509 223727
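Since renaming the column did not change the error, a further check is to compare the bounds against the exact data range instead of the summary() printout, which rounds values for display. The sketch below is not a diagnosis of this particular case. The y values are invented for illustration, the lower condition is taken from the error message quoted in the thread, and the matching condition on upper (upper > max(y)) is an assumption that is not stated in the thread.

```r
# Hypothetical heights; the real data are not shown in the thread.
y <- c(1999.6, 2281, 2666, 3212, 4794.3)

# Bounds as passed to truncpareto() in the original post.
lower <- 2000
upper <- 4794

# Exact extremes, printed with more digits than summary() displays by default.
exact_range <- range(y)
print(exact_range, digits = 10)

# Condition quoted in the error message: 0 < lower < min(y).
lower_ok <- lower > 0 && lower < exact_range[1]

# Assumed analogous condition for the upper bound: upper > max(y).
upper_ok <- upper > exact_range[2]

c(lower_ok = lower_ok, upper_ok = upper_ok)
```

With these made-up values both checks come out FALSE even though the rounded extremes would read as 2000 and 4794. Running the same comparison on the actual pdataH$y, and confirming that no other object called y exists in the workspace, would show whether the bounds really bracket the data being fitted.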