Abby Spurdle
2021-Mar-06 07:09 UTC
[R] quantile from quantile table calculation without original data
I'm sorry.
I misread your example this morning.
(I didn't read the code after the line that calls plot.)

After looking at this problem again, interpolation doesn't apply
(size 0.1 lies below the smallest tabulated size, 0.1907), and
extrapolation would be a last resort.
If you can assume your data come from a particular type of
distribution, such as a lognormal distribution, then a better approach
would be to find the most likely parameters.

i.e.
This falls within the broader scope of maximum likelihood.
(Except that you're dealing with a table of quantile-probability
pairs, rather than raw observational data.)

I suspect that there's a relatively easy way of finding the parameters.

I'll think about it...
But someone else may come back with an answer first...


On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle <spurdle.a at gmail.com> wrote:
>
> I note three problems with your data:
> (1) The name "percent" is misleading; perhaps you want "probability"?
> (2) There are straight (or near-straight) regions, each of which is
> equally (or near-equally) spaced, which is not what I would expect in
> problems involving "quantiles".
> (3) Your plot (approximating the distribution function) is
> back-to-front (as per what is customary).
>
>
> On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr <petr.pikal at precheza.cz> wrote:
> >
> > Dear all
> >
> > I have a table of quantiles, probably from a lognormal distribution
> >
> > dput(temp)
> > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> > 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> > 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> > ), row.names = c(NA, -9L), class = "data.frame")
> >
> > and I need to calculate the quantile for size 0.1
> >
> > plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> > ss <- approxfun(temp$size, temp$percent)
> > points((0:100)/50, ss((0:100)/50))
> > abline(v=.1)
> >
> > If I had the original data it would be quite easy with the ecdf/quantile
> > functions, but without it I am lost as to what function I could use for
> > such a task.
> >
> > Please give me some hint where to look.
> >
> > Best regards
> >
> > Petr
> >
> > Information about processing and protection of business partner's personal data are available on website: https://www.precheza.cz/en/personal-data-protection-principles/
> > This email and any documents attached to it may be confidential and are subject to the legally binding disclaimer: https://www.precheza.cz/en/01-disclaimer/
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
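The parameter search suggested in the message above can be sketched with quantile matching rather than full maximum likelihood. This is only an illustration under two assumptions that are not confirmed in the thread: that the table really is lognormal, and that "percent" is an upper-tail probability P(X > size), since size decreases as percent increases. Under those assumptions log(size) = meanlog + sdlog * z, with z the matching upper-tail normal quantile, so a straight-line fit gives rough parameter estimates. The names z, fit, meanlog, sdlog and p_upper are introduced here for illustration.

```r
# Table from the original question.
temp <- data.frame(
  size    = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781, 0.3047, 0.2681, 0.1907),
  percent = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)
)

# Linear interpolation cannot answer the question: 0.1 is outside the
# tabulated sizes, and approxfun() returns NA there by default (rule = 1).
ss <- approxfun(temp$size, temp$percent)
ss(0.1)

# Assumption: percent is P(X > size), so use upper-tail normal quantiles.
temp$z <- qnorm(temp$percent, lower.tail = FALSE)

# For a lognormal, log(size) is linear in z: intercept = meanlog, slope = sdlog.
fit     <- lm(log(size) ~ z, data = temp)
meanlog <- unname(coef(fit)[1])
sdlog   <- unname(coef(fit)[2])

# Estimated upper-tail probability at size 0.1 under the fitted lognormal.
p_upper <- plnorm(0.1, meanlog, sdlog, lower.tail = FALSE)
p_upper
```

Least squares on nine tabulated points weights every quantile equally, which is a design choice rather than a likelihood-based one; the residuals of fit give a quick indication of how well the lognormal assumption holds.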
Abby Spurdle
2021-Mar-06 09:02 UTC
[R] quantile from quantile table calculation without original data
I came up with a solution.
But not necessarily the best solution.

I used a spline to approximate the quantile function.
Then I used that to generate a large sample.
(I don't see any need for the sample to be random, as such.)
Then I computed the sample mean and sd, on a log scale.
Finally, I plugged everything into the plnorm function:

    # evenly spaced probabilities across the tabulated range
    p <- seq (0.01, 0.99, length.out = 1e6)
    # spline approximation of the quantile function (probability -> size)
    Fht <- splinefun (temp$percent, temp$size)
    # pseudo-sample on the log scale
    x <- log (Fht (p) )
    # upper-tail probability at size 0.1 (lower.tail = FALSE)
    psolution <- plnorm (0.1, mean (x), sd (x), lower.tail = FALSE)
    psolution

The value of the solution is very close to one.
Which is not a surprise.

Here's a plot of everything:

    u <- seq (0.000001, 1.65, length.out = 200)
    v <- plnorm (u, mean (x), sd (x), lower.tail = FALSE)
    plot (u, v, type="l", ylim = c (0, 1) )
    points (temp$size, temp$percent, pch=16)
    points (0.1, psolution, pch=16, col="blue")
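One way to sanity-check the spline-based fit described above is to push the tabulated probabilities back through the fitted lognormal and compare the result with the tabulated sizes. The sketch below repeats the fit on a smaller grid (1e4 points, chosen here only to keep it quick) and introduces the names fitted_size, check and max_rel_err for illustration. Because p only covers 0.01 to 0.99, the tails are truncated and sd (x) may understate the spread, so some mismatch at the extreme rows would not be surprising.

```r
# Table from the original question.
temp <- data.frame(
  size    = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781, 0.3047, 0.2681, 0.1907),
  percent = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)
)

# Same construction as in the message, on a smaller grid.
p   <- seq(0.01, 0.99, length.out = 1e4)
Fht <- splinefun(temp$percent, temp$size)
x   <- log(Fht(p))

# Sizes implied by the fitted lognormal at the tabulated upper-tail probabilities.
fitted_size <- qlnorm(temp$percent, mean(x), sd(x), lower.tail = FALSE)

# Side-by-side comparison of tabulated and fitted sizes.
check <- data.frame(
  percent = temp$percent,
  size    = temp$size,
  fitted  = round(fitted_size, 4)
)
print(check)

# Largest relative discrepancy between table and fit.
max_rel_err <- max(abs(fitted_size - temp$size) / temp$size)
max_rel_err
```

A large max_rel_err would suggest either that the lognormal assumption is doubtful or that the truncated pseudo-sample is distorting the parameter estimates.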