Chang Jia-Ming
2008-Dec-05 13:59 UTC
[R] How to calculate the distance between two density functions
Dear all,

I wrote the following code to estimate the density functions of two data sets:

    den_str     <- density(str_data$Similarity)
    den_non_str <- density(nonstr_data$Similarity)

I would now like to measure the difference between den_str and den_non_str, that is, the difference between the region under the den_str curve and the region under the den_non_str curve.

How can I do this?

Thank you for your help.

Jia-Ming
David Winsemius
2008-Dec-05 17:13 UTC
[R] How to calculate the distance between two density functions
A similar question was posed and answered:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/119793.html

Two aspects need to be addressed: specifying the same domain, and getting the x-values to "line up" prior to the subtraction (or whatever function is desired).

What are you going to do when the two functions cross?

    # two normal densities with different means
    x <- seq(-2, 2, by = .1)
    d1 <- dnorm(x)
    d2 <- dnorm(x, mean = 2)
    plot(x, d1)
    lines(x, d2)

or:

    # two normal densities with different standard deviations
    x <- seq(-4, 4, by = .1)
    d4 <- dnorm(x)
    d5 <- dnorm(x, sd = 5)
    plot(x, d4)
    lines(x, d5)

-- 
David Winsemius

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
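The two points above (same domain, aligned x-values) can be sketched in base R as follows. This is only an illustration under stated assumptions: the samples are simulated stand-ins for str_data$Similarity and nonstr_data$Similarity, and the padding of three bandwidths and the grid size of 512 are arbitrary choices. Taking the absolute difference is one possible answer to the crossing question, since positive and negative regions then no longer cancel.

```r
# Simulated stand-ins for str_data$Similarity and nonstr_data$Similarity (assumption)
set.seed(1)
str_sim    <- rnorm(200, mean = 0.6, sd = 0.10)
nonstr_sim <- rnorm(300, mean = 0.4, sd = 0.15)

# Common domain covering both samples, padded by 3 bandwidths (arbitrary choice)
bw_max <- max(bw.nrd0(str_sim), bw.nrd0(nonstr_sim))
lo <- min(str_sim, nonstr_sim) - 3 * bw_max
hi <- max(str_sim, nonstr_sim) + 3 * bw_max

# Identical from/to/n puts both estimates on the same x-values
den_str     <- density(str_sim,    from = lo, to = hi, n = 512)
den_non_str <- density(nonstr_sim, from = lo, to = hi, n = 512)
stopifnot(isTRUE(all.equal(den_str$x, den_non_str$x)))

# Riemann-sum approximations on the shared grid
dx      <- diff(den_str$x[1:2])
l1_dist <- sum(abs(den_str$y - den_non_str$y)) * dx   # area between the curves
overlap <- sum(pmin(den_str$y, den_non_str$y)) * dx   # area shared by both curves

l1_dist
overlap
```

With this construction l1_dist is close to 0 for nearly identical densities and approaches 2 for densities with no overlap, while overlap moves in the opposite direction between 1 and 0.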
Lucke, Joseph F
2008-Dec-05 20:21 UTC
[R] How to calculate the distance between two density functions
In general, comparing two continuous densities is difficult because they can differ on a set of measure 0 (e.g., at a single point) and yet have the same distribution function.
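Since the distribution function is unaffected by such measure-zero differences, one option is to compare distribution functions instead of densities. The sketch below does this with the largest vertical gap between two empirical distribution functions, which is the two-sample Kolmogorov-Smirnov statistic. Nobody in this thread proposed that measure; it is shown only as an illustration, and the simulated samples x and y are assumptions.

```r
# Simulated samples standing in for the two data sets (assumption)
set.seed(2)
x <- rnorm(200)
y <- rnorm(300, mean = 0.5)

Fx <- ecdf(x)
Fy <- ecdf(y)

# Both ECDFs are step functions that only change at data points,
# so the largest gap is attained at one of the pooled observations
grid   <- sort(c(x, y))
ks_gap <- max(abs(Fx(grid) - Fy(grid)))

# Cross-check against the statistic reported by ks.test()
ks_ref <- unname(ks.test(x, y)$statistic)

ks_gap
ks_ref
```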
Rainer M Krug
2008-Dec-06 11:38 UTC
[R] How to calculate the distance between two density functions
One way of calculating the difference between two density functions (or, more generally, histograms) is the Earth Mover's Distance (see e.g. http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/RUBNER/emd.htm or http://en.wikipedia.org/wiki/Earth_Mover's_Distance ).

Dirk Eddelbuettel and I are finalizing an implementation of it, which will be available as soon as some licensing issues are sorted out, hopefully rather soon. If you don't want to wait for the release, please drop either Dirk or me an email and we can mail you the package. As I said, the implementation is working (I am using it in a research project at the moment); it is just that the license is currently for nonprofit research only.

Rainer

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Faculty of Science
Natural Sciences Building
Private Bag X1
University of Stellenbosch
Matieland 7602
South Africa
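For one-dimensional samples such as the Similarity values, the earth mover's distance can be sketched in base R without any package, because in one dimension it equals the area between the two empirical distribution functions. The helper emd_1d below is hypothetical and is not the implementation mentioned above; the simulated data are assumptions as well.

```r
# Hypothetical helper: 1-D earth mover's (Wasserstein-1) distance
# between two samples, computed as the integral of |F_x - F_y|
emd_1d <- function(x, y) {
  Fx <- ecdf(x)
  Fy <- ecdf(y)
  grid   <- sort(c(x, y))
  widths <- diff(grid)
  # Each ECDF is constant on [grid[i], grid[i + 1]), so evaluate at the left ends
  left <- grid[-length(grid)]
  sum(abs(Fx(left) - Fy(left)) * widths)
}

# Simulated samples (assumption)
set.seed(3)
a <- rnorm(500)
b <- rnorm(500, mean = 1)

emd_shift <- emd_1d(a, a + 2)   # a pure shift by 2 moves every point by 2
emd_ab    <- emd_1d(a, b)

emd_shift
emd_ab
```

Unlike a difference of density values, this measure grows with how far the mass has to be moved, so two non-overlapping distributions that are far apart score higher than two that are merely adjacent.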