I have some data measured with a coarsely-quantized clock. Let's say the real data are

    q <- sort(rexp(100, .5))

The quantized form is floor(q), so a simple quantile plot of one against the other can be calculated using:

    plot(q, type="l"); points(floor(q), col="red")

which of course shows the characteristic stair-step. I would like to smooth the quantized form back into an approximation of the underlying data.

The simplest approach I can think of adds a uniform random variable of the size of the quantization:

    plot(q, type="l"); points(floor(q), col="red")
    points(floor(q) + runif(100, 0, 1), col="blue")

This gives pretty good results for uniform distributions, but less good for others (like exponential). Is there a better interpolation/smoothing function for cases like this, either Monte Carlo as above or deterministic?

Thanks,

-s
Another approach:

    ?jitter
    plot(jitter(q, factor=1), type="l")

factor = 1 by default, but it can be increased until the spaces are filled in to your satisfaction:

    plot(q, type="l"); points(jitter(floor(q), factor=2), col="red")
    plot(q, type="l"); points(jitter(floor(q), factor=3), col="red")

I suppose that knowing you "rounded down" might make adding a positive runif the better option. I could not tell from the documentation what sort of noise jitter adds to the values, but checking the code I see that it is also uniform.

-- 
David Winsemius

On Nov 20, 2008, at 10:43 AM, Stavros Macrakis wrote:

> I have some data measured with a coarsely-quantized clock. Let's say
> the real data are
>
> q<- sort(rexp(100,.5))
>
> The quantized form is floor(q), so a simple quantile plot of one
> against the other can be calculated using:
>
> plot(q,type="l"); points(floor(q),col="red")
>
> which of course shows the characteristic stair-step. I would like to
> smooth the quantized form back into an approximation of the underlying
> data.
>
> The simplest approach I can think of adds a uniform random variable of
> the size of the quantization:
>
> plot(q,type="l"); points(floor(q),col="red");
> points(floor(q)+runif(100,0,1),col="blue")
>
> This gives pretty good results for uniform distributions, but less
> good for others (like exponential). Is there a better
> interpolation/smoothing function for cases like this, either Monte
> Carlo as above or deterministic?
>
> Thanks,
>
> -s
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
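On the jitter suggestion: a minimal sketch of how its noise relates to the floor(q) + runif(...) approach. As far as I can tell from the function's source, with amount left unset jitter adds noise that is uniform on [-a, a], where a = factor/5 times the smallest gap between distinct values, so for integer-valued data and factor = 1 the noise stays within 0.2 of each value. Centring each value in its bin and passing amount = 0.5 should instead spread the points over the whole interval [k, k+1]. The names fq, j_default and j_bin are only illustrative.

```r
set.seed(1)
q  <- sort(rexp(100, 0.5))
fq <- floor(q)                            # quantized data

# Default jitter: symmetric uniform noise around each integer value
j_default <- jitter(fq)

# Shift to the bin centre and use half the bin width as the amount,
# so the noise covers [k, k + 1], like floor(q) + runif(100, 0, 1)
j_bin <- jitter(fq + 0.5, amount = 0.5)

range(j_default - fq)                     # symmetric around 0
range(j_bin - fq)                         # inside [0, 1]
```

Either way the noise is uniform within the bin, so this does not address the non-uniformity that shows up for exponential data.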
The logspline package has tools for estimating a density function for interval-censored data (the old methods). You could use those to estimate the density of your data, then compare that density to the theoretical density.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Stavros Macrakis
> Sent: Thursday, November 20, 2008 8:43 AM
> To: r-help at r-project.org
> Subject: [R] Dequantizing
I'm not convinced that adding a random amount to the floor values to 'make up' the underlying data is very meaningful. If you know what the underlying distribution is, then you are best off using this distribution to generate plots and extra pretend data.

If you know the distribution is exponential, then you can estimate the rate by treating the true values as interval-censored data, lying somewhere between floor and floor+1.

    library(survival)
    q <- sort(rexp(100, .5))
    qq <- floor(q)
    upper <- qq + 1            # upper end of each interval, taken before the zero adjustment
    qq[qq == 0] <- 0.00001     # survreg doesn't like values that are exactly zero
    ss <- Surv(qq, upper, type="interval2")
    model <- survreg(ss ~ 1, dist="exponential")
    summary(model)
    rate <- 1/exp(model$coefficients["(Intercept)"]); rate   # hopefully something close to 0.5

If you don't know the underlying distribution either, then things get trickier. Try a histogram/kernel density plot/boxplot to see what it looks like.

Regards,
Richie.

Mathematical Sciences Unit
HSL
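As a cross-check on the interval-censored fit, here is a minimal base-R sketch, assuming the data really are exponential and quantized with floor. In that case floor(T) is geometric with success probability 1 - exp(-rate), so the interval-censored maximum-likelihood estimate has the closed form log(1 + 1/mean(floor(T))). The sketch maximizes the interval-censored log-likelihood numerically and compares it with that closed form. The names k, loglik, fit and rate_closed are illustrative only.

```r
set.seed(42)
q <- rexp(100, 0.5)
k <- floor(q)                             # observed, quantized values

# Interval-censored log-likelihood: each true value lies in [k, k + 1).
# Upper-tail probabilities keep the difference well away from zero.
loglik <- function(rate) {
  sum(log(pexp(k, rate, lower.tail = FALSE) -
          pexp(k + 1, rate, lower.tail = FALSE)))
}
fit <- optimize(loglik, interval = c(1e-6, 10), maximum = TRUE, tol = 1e-8)

# Closed form via the geometric distribution of floor(T)
rate_closed <- log(1 + 1 / mean(k))

c(numeric = fit$maximum, closed_form = rate_closed)
```

The two numbers should agree to several decimal places. Both estimate the true rate of 0.5 only up to sampling error.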
I'm rather doubtful that you can improve on the uniform jittering strategy you originally considered. Doing so would require intimate knowledge of the non-uniformity of the density within the spacings of your quantized version. But if you really _knew_ the parent distribution, then something like the following might be what you had in mind:

    # Toy dequantization example
    rate <- 1
    x <- sort(rexp(100, rate))            # true data
    y <- floor(x)                         # quantized data
    xu <- y + runif(100)                  # uniform jitter of the quantized values, for comparison
    ty <- table(y)                        # number of observations in each bin
    p <- c(0, cumsum(ty/length(y)))       # empirical cumulative proportions at the bin edges
    pup <- p[-1]
    plo <- p[-length(p)]
    # within each bin, draw probabilities between its edges and map them through qexp
    fun <- function(ty, plo, pup) qexp(runif(ty, plo, pup), rate)
    z <- unlist(mapply(fun, ty = ty, plo = plo, pup = pup))

url: www.econ.uiuc.edu/~roger    Roger Koenker
email rkoenker at uiuc.edu      Department of Economics
vox: 217-333-4558               University of Illinois
fax: 217-244-6678               Champaign, IL 61820
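A variant of the toy example, sketched under the same assumption that the parent distribution and its rate are known: instead of the empirical bin proportions, it uses the theoretical CDF values pexp(k) and pexp(k + 1) as the limits for runif, so each simulated value is drawn from the exponential distribution conditioned on its own bin [k, k + 1). This is a swapped-in detail rather than the code posted above, and the names u and z are illustrative.

```r
set.seed(7)
rate <- 0.5
x <- rexp(1000, rate)                     # true data (unknown in practice)
y <- floor(x)                             # quantized data

# Draw a probability between the CDF values at the bin edges,
# then map it back through the quantile function
u <- runif(length(y), pexp(y, rate), pexp(y + 1, rate))
z <- qexp(u, rate)                        # dequantized values, one per observation

# The fractional parts are no longer uniform: their mean falls below 0.5
c(conditional = mean(z - y), uniform = mean(runif(length(y))))

plot(sort(x), type = "l")
points(sort(z), col = "blue")
```

Because runif accepts vector limits, no loop over bins is needed, and every z stays inside the bin it came from.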