Hi,

Does anybody encounter the same problem when overlaying a histogram and a density curve: the density line seems to shift to the right a little bit? If you have the same problem, what should we do to correct it?

par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
hist(datobs,prob=TRUE,
     main="Volume of a catchment from four stations",
     col="yellowgreen", cex.axis=1, xlab="rainfall",
     ylab="Relative frequency", ylim=c(0,.003), xlim=c(0,1200))

lines(density(dd), lwd=3,col="red")

#legend("topright",c("observed","generated"),
#       lty=c(0,1),fill=c("blue",""),bty="n")

legend("topright", legend = c("observed","generated"),
       col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1),
       lwd=c(0,3),bty="n", pt.cex=2)
box()

Thank you.
Roslina Zakaria <zroslina <at> yahoo.com> writes:

> Hi,
>
> Does anybody encounter the same problem when we overlap histogram and density
> that the density line seem to shift to the right a little bit?
>
> par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
> hist(datobs,prob=TRUE, main ="Volume of a catchment from four
>      stations",col="yellowgreen", cex.axis=1,
>      xlab="rainfall",ylab="Relative frequency", ylim= c(0,.003), xlim=c(0,1200))
> lines(density(dd), lwd=3,col="red")
> legend("topright", legend = c("observed","generated"),
>        col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1),
>        lwd=c(0,3),bty="n", pt.cex=2)
> box()

Are dd and datobs the same? There is nothing obviously (to me) wrong here. Density estimation by definition smears out sharp peaks, which can lead to differences between the histogram and the density estimate. Hard to say any more without a reproducible example.

z <- rnorm(5000)
hist(z,prob=TRUE,col="gray",breaks=100)
lines(density(z),col="red")

looks fine to me.
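To make the point about smearing a little more concrete, here is a rough sketch on simulated data. The gamma-distributed z below is only a stand-in for skewed, strictly positive rainfall totals (the poster's datobs and dd are not available), and the choice of adjust = 0.25 is arbitrary. It overlays the default kernel estimate and a narrower-bandwidth one on the same histogram, and measures how much of the default estimate falls below zero, where no data can occur.

```r
# Illustrative only: simulated skewed, positive data (an assumption,
# not the data from the original question).
set.seed(1)
z <- rgamma(2000, shape = 2, scale = 150)

d_default <- density(z)                  # default bandwidth and cut
d_narrow  <- density(z, adjust = 0.25)   # a quarter of the default bandwidth

hist(z, prob = TRUE, col = "gray", breaks = 40)
lines(d_default, col = "red", lwd = 2)
lines(d_narrow, col = "blue", lwd = 2, lty = 2)

# Approximate probability mass the default estimate places below zero
# (Riemann sum on the equally spaced grid returned by density()).
grid_step <- diff(d_default$x[1:2])
mass_below_zero <- sum(d_default$y[d_default$x < 0]) * grid_step
mass_below_zero
```

If the two curves differ visibly near the peak, or if mass_below_zero is not negligible, the apparent mismatch with the histogram probably comes from the smoothing rather than from a plotting error. Arguments such as bw, adjust, from and to of density() are the usual knobs to experiment with.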
On 11-Nov-10 18:39:34, Roslina Zakaria wrote:

> Hi,
> Does anybody encounter the same problem when we overlap histogram
> and density that the density line seem to shift to the right a
> little bit?
>
> If you do have the same problem, what should we do to correct that?
> Thank you.
>
> par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
> hist(datobs,prob=TRUE,
>      main ="Volume of a catchment from four stations",
>      col="yellowgreen", cex.axis=1, xlab="rainfall",
>      ylab="Relative frequency", ylim= c(0,.003), xlim=c(0,1200))
>
> lines(density(dd), lwd=3,col="red")
>
>#legend("topright",c("observed","generated"),
>#       lty=c(0,1),fill=c("blue",""),bty="n")
>
> legend("topright", legend = c("observed","generated"),
>        col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1),
>        lwd=c(0,3),bty="n", pt.cex=2)
> box()
>
> Thank you.

In theory that is not a problem. The density() function will estimate a density whose integral over each of the intervals in the histogram is equal to the probability of that interval, and the proportion of the data expected in that interval will also be its probability.

In practice, the extent to which you observe what you describe (or a displacement to the left) will depend on how your data are distributed within the intervals, and on the precision with which density() happens to estimate the true density.

The following 3 cases of the same data, sampled from a log-Normal distribution, illustrate the different impressions one might get, depending on the details of the histogram. Note that there is no overall effect of "displacement to the right" in any histogram, while the extent to which one observes it varies according to the histogram. Without knowledge of your data it is not possible to comment further on the extent to which you have observed it yourself!
set.seed(54321)
N <- 1000
X <- exp(rnorm(N,sd=0.4))
dd <- density(X)

## A coarse histogram
H <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.5*(0:8))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A finer histogram
H <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A still finer histogram
H <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 11-Nov-10 Time: 20:02:24
------------------------------ XFMail ------------------------------
[OOPS!! I accidentally reproduced my second example as my third example. Now corrected. See below.]

On 11-Nov-10 20:02:29, Ted Harding wrote:
set.seed(54321)
N <- 1000
X <- exp(rnorm(N,sd=0.4))
dd <- density(X)

## A coarse histogram
H <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.5*(0:8))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A finer histogram
H <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A still finer histogram
H <- hist(X,prob=TRUE,
## OOPS!! xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.20*(0:20))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 11-Nov-10 Time: 20:12:27
------------------------------ XFMail ------------------------------
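The statement that the density integrates to the interval probabilities can be checked numerically, at least roughly. The sketch below reuses the simulated log-Normal sample and compares, bin by bin, the proportion of data in each histogram interval with the integral of the density() estimate over the same interval. The helper bin_mass() and the trapezoid rule are my own additions for illustration, and the upper break is derived from max(X) so the breaks always span the data. For a kernel estimate the agreement is approximate, not exact.

```r
set.seed(54321)
N <- 1000
X <- exp(rnorm(N, sd = 0.4))
dd <- density(X)

# Linear interpolation of the estimated density; zero outside its grid
f <- approxfun(dd$x, dd$y, yleft = 0, yright = 0)

# Hypothetical helper: trapezoid-rule integral of f over [a, b]
bin_mass <- function(a, b, n = 2001) {
  x <- seq(a, b, length.out = n)
  y <- f(x)
  sum((y[-1] + y[-n]) / 2) * (b - a) / (n - 1)
}

# Coarse bins of width 0.5, extended far enough to cover max(X)
breaks <- seq(0, ceiling(max(X) * 2) / 2, by = 0.5)
H <- hist(X, breaks = breaks, plot = FALSE)

lower <- head(H$breaks, -1)
upper <- tail(H$breaks, -1)
bin_prob_hist <- H$counts / sum(H$counts)   # proportion of data per bin
bin_prob_dens <- mapply(bin_mass, lower, upper)

round(cbind(lower, upper, hist = bin_prob_hist, density = bin_prob_dens), 3)
```

Bins where the two columns differ most are the ones where the curve will look displaced relative to the bars. Repeating this with finer breaks shows how the impression changes with the histogram and not with the density estimate.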