Dear All,

I hope this is not too off topic.
I am given a set of scatterplots (nothing too fancy; think of a normal x-y 2D plot).
I am not dealing with two time series (indeed I have no info about time).
If I call A=(A1,A2,...) and B=(B1,B2,...) the two variables (two vectors of numbers in most cases, but sometimes they can be categorical variables), I can plot one against the other, and I essentially need to determine whether

A=f(B, noise) or B=g(A, noise)

where the noise is the effect of other, possibly unknown, variables, measurement errors, etc., and f and g are two functions.

Without the noise, if I want to test whether A=f(B) [B causes A], then I need at least to ensure that f(B1)!=f(B2) implies B1!=B2 (different effects must have different causes), whereas it is not ruled out that f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect).

However, in the presence of noise, these properties will hold only approximately, so... any idea about a statistical test, rather than eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)?
Any suggestion is welcome.

Lorenzo
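The noise-free condition described above says that equal values of the cause must give equal values of the effect, while different causes may still share an effect. A minimal Python sketch of that check follows, assuming exact (noise-free) data; the helper name `is_function_of` and the toy vectors are hypothetical illustrations, not something from the original post.

```python
from collections import defaultdict


def is_function_of(effect, cause):
    """Return True if each distinct cause value maps to exactly one effect value."""
    seen = defaultdict(set)
    for c, e in zip(cause, effect):
        seen[c].add(e)
    return all(len(values) == 1 for values in seen.values())


# Toy data (assumed): A = f(B) with a non-injective f, and no noise.
B = [-2, -1, 0, 1, 2]
A = [b * b for b in B]

print("A = f(B) possible:", is_function_of(A, B))
print("B = g(A) possible:", is_function_of(B, A))
```

With noisy data the check would fail in both directions, which is why the question asks for a statistical test instead.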

On 22/04/2013 10:48 AM, Lorenzo Isella wrote:
> Dear All,
> I hope this is not too off topic.
> I am given a set of scatterplots (nothing too fancy; think about a
> normal x-y 2D plot).
> I do not deal with two time series (indeed I have no info about time).
> If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two
> vectors of numbers most of the case, but sometimes they can be
> categorical variables), I can plot one against the other and I
> essentially need to determine whether
>
> A=f(B, noise) or B=g(A, noise)
>
> where the noise is the effect of other possibly unknown variables,
> measurement errors etc.... and f and g are two functions.
>
> Without the noise, if I want to test if A=f(B) [B causes A], then I
> need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different
> effects must have a different cause), whereas it is not ruled out that
> f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect).
>
> However, in presence of the noise, these properties will hold only
> approximately so....any idea about how a statistical test, rather than
> eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)?
> Any suggestion is welcome.

In general there can't be such a test. Think about the case of simple linear regression. If I randomly draw X from a normal distribution, then randomly draw Y_i = a + b X_i + e_i, where the e_i are drawn from an independent normal distribution, I end up with (X,Y) having a bivariate normal distribution. In your notation, X would cause Y, but there is *nothing* here to distinguish this from draws directly from the bivariate normal distribution, or draws of Y first, followed by X from its conditional distribution (which is also a linear regression model).

With some extra information inference might be possible, but not in the generality you ask for.

Duncan Murdoch
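
The linear-regression argument above can be illustrated by simulating both generating mechanisms and comparing the resulting samples. This is only a sketch using the Python standard library; the parameter values (a, b, the noise standard deviation s, the sample size n) and the seed are assumptions chosen for illustration, with X taken as standard normal.

```python
import random
import statistics as st


def summary(xs, ys):
    """Sample means, variances and covariance of paired data."""
    mx, my = st.fmean(xs), st.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return mx, my, st.variance(xs), st.variance(ys), cov


rng = random.Random(1)                # fixed seed (assumed) for repeatability
n, a, b, s = 50_000, 1.0, 2.0, 1.0    # assumed illustrative parameters

# Mechanism 1: draw X first, then Y = a + b*X + e ("X causes Y").
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
y1 = [a + b * x + rng.gauss(0.0, s) for x in x1]

# Mechanism 2: draw Y first from its marginal N(a, b^2 + s^2), then X from
# its conditional distribution given Y, which is again a linear regression.
var_y = b * b + s * s
slope = b / var_y                      # regression slope of X on Y
sd_x_given_y = (s * s / var_y) ** 0.5  # residual sd of X given Y
y2 = [rng.gauss(a, var_y ** 0.5) for _ in range(n)]
x2 = [slope * (y - a) + rng.gauss(0.0, sd_x_given_y) for y in y2]

m1, m2 = summary(x1, y1), summary(x2, y2)
for name, u, v in zip(("mean X", "mean Y", "var X", "var Y", "cov"), m1, m2):
    print(f"{name:7s} X-first: {u:7.3f}   Y-first: {v:7.3f}")
```

Both samples come from the same bivariate normal distribution, so their summaries should agree up to sampling error, and the scatterplots alone carry no information about which variable was drawn first.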

On Mon, Apr 22, 2013 at 3:48 PM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> Dear All,
> I hope this is not too off topic.
> I am given a set of scatterplots (nothing too fancy; think about a
> normal x-y 2D plot).
> I do not deal with two time series (indeed I have no info about time).
> If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two
> vectors of numbers most of the case, but sometimes they can be
> categorical variables), I can plot one against the other and I
> essentially need to determine whether
>
> A=f(B, noise) or B=g(A, noise)

What's the mathematical difference between these two cases? It seems only a matter of interpretation.

>
> where the noise is the effect of other possibly unknown variables,
> measurement errors etc.... and f and g are two functions.
>
> Without the noise, if I want to test if A=f(B) [B causes A], then I
> need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different
> effects must have a different cause), whereas it is not ruled out that
> f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect).
>
> However, in presence of the noise, these properties will hold only
> approximately

Do they even hold approximately?

> so....any idea about how a statistical test, rather than
> eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)?
> Any suggestion is welcome.

http://xkcd.com/552/

>
> Lorenzo
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

On 4/22/2013 9:48 AM, Lorenzo Isella wrote:
> Dear All,
> I hope this is not too off topic.
> I am given a set of scatterplots (nothing too fancy; think about a
> normal x-y 2D plot).
> I do not deal with two time series (indeed I have no info about time).
> If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two
> vectors of numbers most of the case, but sometimes they can be
> categorical variables), I can plot one against the other and I
> essentially need to determine whether
>
> A=f(B, noise) or B=g(A, noise)
>
> where the noise is the effect of other possibly unknown variables,
> measurement errors etc.... and f and g are two functions.
>
> Without the noise, if I want to test if A=f(B) [B causes A], then I
> need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different
> effects must have a different cause), whereas it is not ruled out that
> f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect).
>
> However, in presence of the noise, these properties will hold only
> approximately so....any idea about how a statistical test, rather than
> eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)?
> Any suggestion is welcome.

It strikes me that this is not a particularly productive approach to causality, particularly in an observational setting. You would need to design an experiment in which you applied a known manipulation to an explanatory variable and studied the change in a response variable, and then repeated it with the roles reversed. I don't think R, or indeed any statistical package, can help you here.

Rob

> Lorenzo

-- 
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
Kirksville, MO 63501 USA