> No
> I've got a BDD-based version of Andersen's around already (uncommitted).
> It's slower in time (by a factor of 8 or so), but very slightly (1-2
> megabytes of memory) more memory efficient on some cases. Nowadays,
> the bitmap version is actually more memory efficient than the BDD
> version, even if you turn off BDD caching, etc.
>
> The thing you have to understand is that these context-sensitive BDD
> algorithms work okay on Java, but fall down pretty badly on C/C++,
> because it is a much bigger problem. Whaley's is just brute force,
> and Zhu's is pretty darn complex.

I was interested to see whether Whaley's algorithm really works,
particularly given its simplicity and the claims made in the papers.

> Both are amenable to the same style of offline variable optimizations
> that Hardekopf, et al. use (Ben and a friend have been helping out with
> the implementation of Andersen's in LLVM), and that may actually make
> them usable for real-world programs.

Are you referring to both the PLDI and the SAS papers?

> > 4. Is anyone willing to mentor this project?
> I would.

Thanks ;)

> I think you will be surprised how badly these algorithms work on even
> moderately sized C++ programs :(

I'm slightly confused here as to whether you mean in terms of the
precision of the analysis, or the compute time taken by running the
algorithm.

Richard
On Mon, Mar 3, 2008 at 5:55 AM, Richard Warburton
<richard.warburton at gmail.com> wrote:
> > No
> > I've got a BDD-based version of Andersen's around already (uncommitted).
> > It's slower in time (by a factor of 8 or so), but very slightly (1-2
> > megabytes of memory) more memory efficient on some cases. Nowadays,
> > the bitmap version is actually more memory efficient than the BDD
> > version, even if you turn off BDD caching, etc.
> >
> > The thing you have to understand is that these context-sensitive BDD
> > algorithms work okay on Java, but fall down pretty badly on C/C++,
> > because it is a much bigger problem. Whaley's is just brute force,
> > and Zhu's is pretty darn complex.
>
> I was interested to see whether Whaley's algorithm really works,
> particularly given its simplicity and the claims made in the papers.

Certainly it works, and it is not hard to implement. All you need to do
is add a context number to each constraint; you can then clone the
constraints at every call and increment the context number fairly
easily. Storing constraint graphs in BDDs is also trivial (Ben's source
has a few solvers that do this, IIRC).

> > Both are amenable to the same style of offline variable optimizations
> > that Hardekopf, et al. use (Ben and a friend have been helping out with
> > the implementation of Andersen's in LLVM), and that may actually make
> > them usable for real-world programs.
>
> Are you referring to both the PLDI and the SAS papers?

Yes, as well as some other optimizations that have not been presented.

> > > 4. Is anyone willing to mentor this project?
> > I would.
>
> Thanks ;)
>
> > I think you will be surprised how badly these algorithms work on even
> > moderately sized C++ programs :(
>
> I'm slightly confused here as to whether you mean in terms of the
> precision of the analysis, or the compute time taken by running the
> algorithm.

Compute time. None of these algorithms are useful at all in the real
world if it takes 3 years to compute the answer. Their precision is, of
course, great, assuming you use a sane heap model.

--Dan
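[Editorial note: to make the "context number" idea in the message above concrete, here is a minimal C++ sketch of what cloning a callee's constraints per call might look like. The names (Constraint, cloneForCall, and so on) and the representation are illustrative assumptions only, not Ben's solver or any LLVM code; a real implementation would also rename the callee's variables per context, or encode the (variable, context) pair directly in the BDD domains.]

```cpp
// Sketch only: every Andersen constraint carries a context number, and the
// constraints of a callee are cloned with a fresh context each time a call
// edge is expanded.
#include <cstdint>
#include <vector>

enum class ConstraintKind { AddressOf, Copy, Load, Store };

struct Constraint {
  ConstraintKind Kind;
  uint32_t Dest;    // constraint variable id
  uint32_t Src;     // constraint variable id
  uint32_t Context; // context number; 0 = context-insensitive base copy
};

// Clone a callee's constraints into a fresh context. Variable renaming per
// context is omitted here; only the context numbering itself is shown.
static std::vector<Constraint>
cloneForCall(const std::vector<Constraint> &CalleeConstraints,
             uint32_t &NextContext) {
  std::vector<Constraint> Cloned;
  Cloned.reserve(CalleeConstraints.size());
  const uint32_t Ctx = NextContext++; // one new context per call edge
  for (const Constraint &C : CalleeConstraints)
    Cloned.push_back({C.Kind, C.Dest, C.Src, Ctx});
  return Cloned;
}

int main() {
  std::vector<Constraint> Callee = {
      {ConstraintKind::AddressOf, /*Dest=*/1, /*Src=*/2, /*Context=*/0},
      {ConstraintKind::Copy,      /*Dest=*/3, /*Src=*/1, /*Context=*/0}};
  uint32_t NextContext = 1;
  // Two distinct call sites get two distinct context numbers.
  auto CallA = cloneForCall(Callee, NextContext);
  auto CallB = cloneForCall(Callee, NextContext);
  return static_cast<int>(CallA.front().Context + CallB.front().Context);
}
```

In a BDD-based solver the context field would typically become another BDD domain, which is what keeps the very large number of clones representable at all.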
On Mon, Mar 3, 2008 at 6:55 PM, Richard Warburton
<richard.warburton at gmail.com> wrote:
> > No
> > I've got a BDD-based version of Andersen's around already (uncommitted).
> > It's slower in time (by a factor of 8 or so), but very slightly (1-2
> > megabytes of memory) more memory efficient on some cases. Nowadays,
> > the bitmap version is actually more memory efficient than the BDD
> > version, even if you turn off BDD caching, etc.
> >
> > The thing you have to understand is that these context-sensitive BDD
> > algorithms work okay on Java, but fall down pretty badly on C/C++,
> > because it is a much bigger problem. Whaley's is just brute force,
> > and Zhu's is pretty darn complex.
>
> I was interested to see whether Whaley's algorithm really works,
> particularly given its simplicity and the claims made in the papers.

I have implemented a tool based on the Phoenix compiler framework at
MSR, which can dump relations as BDDs from C/C++ code and feed them
into John Whaley's bddbddb for further analysis. As Dan said, the
performance is currently not good enough for C/C++ code (due to more
complex relations, BDD variable orders, etc.). I am still trying
several optimizations. John also told me that it took quite a lot of
effort to get the Java version to scale.

I am interested to see an implementation based on LLVM.

Xi
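[Editorial note: as a rough illustration of the "dump relations" step described above, and not the actual Phoenix-based tool (whose statement representation and file format are only guessed at here), this C++ sketch walks a toy list of pointer statements and writes one tuple file per base relation of Andersen's analysis, the sort of input a Datalog/BDD solver such as bddbddb could consume after conversion to BDDs.]

```cpp
// Sketch only: emit (dst, src) tuples for the four Andersen constraint kinds.
#include <fstream>
#include <string>
#include <vector>

struct Stmt {
  enum Kind { AddressOf, Assign, Load, Store } K;
  std::string Dst, Src; // p = &x, p = q, p = *q, *p = q
};

static void dumpRelations(const std::vector<Stmt> &Stmts) {
  // File names are placeholders, not bddbddb's required layout.
  std::ofstream AddrOf("addressOf.tuples"), Assign("assign.tuples"),
      Load("load.tuples"), Store("store.tuples");
  for (const Stmt &S : Stmts) {
    switch (S.K) {
    case Stmt::AddressOf: AddrOf << S.Dst << " " << S.Src << "\n"; break;
    case Stmt::Assign:    Assign << S.Dst << " " << S.Src << "\n"; break;
    case Stmt::Load:      Load   << S.Dst << " " << S.Src << "\n"; break;
    case Stmt::Store:     Store  << S.Dst << " " << S.Src << "\n"; break;
    }
  }
}

int main() {
  // p = &x; q = p; r = *q; *r = p;
  dumpRelations({{Stmt::AddressOf, "p", "x"},
                 {Stmt::Assign, "q", "p"},
                 {Stmt::Load, "r", "q"},
                 {Stmt::Store, "r", "p"}});
  return 0;
}
```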
On Wed, Mar 5, 2008 at 3:44 AM, Xi Wang <xi.wang at gmail.com> wrote:
> On Mon, Mar 3, 2008 at 6:55 PM, Richard Warburton
> <richard.warburton at gmail.com> wrote:
> > > No
> > > I've got a BDD-based version of Andersen's around already (uncommitted).
> > > It's slower in time (by a factor of 8 or so), but very slightly (1-2
> > > megabytes of memory) more memory efficient on some cases. Nowadays,
> > > the bitmap version is actually more memory efficient than the BDD
> > > version, even if you turn off BDD caching, etc.
> > >
> > > The thing you have to understand is that these context-sensitive BDD
> > > algorithms work okay on Java, but fall down pretty badly on C/C++,
> > > because it is a much bigger problem. Whaley's is just brute force,
> > > and Zhu's is pretty darn complex.
> >
> > I was interested to see whether Whaley's algorithm really works,
> > particularly given its simplicity and the claims made in the papers.
>
> I have implemented a tool based on the Phoenix compiler framework at
> MSR, which can dump relations as BDDs from C/C++ code and feed them
> into John Whaley's bddbddb for further analysis. As Dan said, the
> performance is currently not good enough for C/C++ code (due to more
> complex relations, BDD variable orders, etc.). I am still trying
> several optimizations. John also told me that it took quite a lot of
> effort to get the Java version to scale.

My guess is that if you try doing variable-substitution-style
optimizations, you will get very, very serious speedups. Usually, out
of 80-100k constraint variables, you can unify 60k of them. In my
testing, it reduces the number of constraint variables by 80+% in
non-context-sensitive analysis.

bddbddb is already going to be rather good at solving the query
relations, so to get faster, you have to start reducing the size of
the problem itself.
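[Editorial note: for readers unfamiliar with what "unifying constraint variables" means here, the sketch below shows the general shape of an offline substitution pass: find variables that are provably pointer-equivalent and collapse them with a union-find before the solver (or bddbddb) ever sees them. It only handles the trivial single-copy case and is not the HVN/HRU machinery from Hardekopf and Lin's papers; it is an assumed, simplified illustration of the idea.]

```cpp
// Sketch only: merge a variable whose sole definition is a copy from another
// variable, then count how many representatives survive for the solver.
#include <cstdio>
#include <numeric>
#include <vector>

struct UnionFind {
  std::vector<int> Parent;
  explicit UnionFind(int N) : Parent(N) {
    std::iota(Parent.begin(), Parent.end(), 0);
  }
  int find(int X) {
    while (Parent[X] != X)
      X = Parent[X] = Parent[Parent[X]]; // path halving
    return X;
  }
  void unite(int A, int B) { Parent[find(A)] = find(B); }
};

struct Copy { int Dst, Src; }; // Dst = Src

int main() {
  const int NumVars = 6;
  // Copy chain 1 = 0, 2 = 1, 3 = 2: vars 0..3 are pointer-equivalent,
  // assuming (as we do here) that these copies are their only definitions.
  std::vector<Copy> Copies = {{1, 0}, {2, 1}, {3, 2}};
  std::vector<int> DefCount(NumVars, 0);
  for (const Copy &C : Copies)
    ++DefCount[C.Dst];

  UnionFind UF(NumVars);
  for (const Copy &C : Copies)
    if (DefCount[C.Dst] == 1) // sole definition is a copy: merge
      UF.unite(C.Dst, C.Src);

  int Survivors = 0;
  for (int V = 0; V < NumVars; ++V)
    if (UF.find(V) == V)
      ++Survivors;
  std::printf("%d of %d constraint variables remain\n", Survivors, NumVars);
  return 0;
}
```

The real offline optimizations detect much richer equivalences (value numbering over the constraint graph, location equivalence, and so on), but the payoff Dan describes comes from the same basic move: shrinking the variable set before the expensive solving starts.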