On 9/4/07, Dan Gohman <djg at cray.com> wrote:> On Fri, Aug 31, 2007 at 08:10:33PM -0400, Daniel Berlin wrote: > > Suggestions, criticisms, etc, are welcome. > > I haven't studied the implementation, but I have a few comments on > the interface, and some style comments, below. > > > Index: include/llvm/ADT/SparseBitmap.h > > ==================================================================> > --- include/llvm/ADT/SparseBitmap.h (revision 0) > > +++ include/llvm/ADT/SparseBitmap.h (revision 0) > > @@ -0,0 +1,571 @@ > > +//===- llvm/ADT/SparseBitmap.h - Efficient Sparse Bitmap ----*- C++ -*-===// > > Add '-'s to make the first line line exactly 80 cols.Will do> > > +/// SparseBitmap is an implementation of a bitvector that is sparse by only > > +/// storing the elements that have non-zero bits set. In order to make this > > I'd like to suggest the name SparseBitVector for this class, to > correspond with the name of the existing BitVector class, which has > a similar interface, but isn't sparse.Sure, i have no problem with that.> > > + SparseBitmapElement(unsigned Idx) { > > This misses an explicit keyword. >Fixed!> > + if (Bitmap.Elements.empty()) > > + { > > + AtEnd = true; > > + return; > > + } > > Here and a few other places miss reformatting.Fixed. Anyone have a better .emacs for the indentation than the one in tools?> > > + template <int ElementSize> > > + class SparseBitmap { > > Do you expect clients will often want custom ElementSize values? Otherwise, > it seems like this template parameter should be given a default value, or > even just removed from the API.So, actually, at least in GCC, we discovered that we could speed up some dataflow problems a lot with varying the element size to 256, and on the other size, interference graphs could be reduced in space and time a lot by varying the element size down to 64 I'm happy to give it a default value of 128.> > > + bool AtEnd; > > + > > + SparseBitmap<ElementSize> &Bitmap; > > + > > + // Current element inside of bitmap > > + ElementListConstIter Iter; > > + > > + // Current bit number inside of our bitmap > > + unsigned BitNumber; > > + > > + // Current word number inside of our element > > + unsigned WordNumber; > > + > > + // Current bits from the element. > > + typename SparseBitmapElement<ElementSize>::BitWord Bits; > > Can these SparseBitmapIterator members, and a few more that follow, > be made private?They were intended to be, sorry.> > > + SparseBitmapIterator(SparseBitmap<ElementSize> &RHS, > > + bool end = false):Bitmap(RHS) { > > This appears to miss an explicit keyword.Fixed.
On Tue, Sep 04, 2007 at 10:35:10AM -0400, Daniel Berlin wrote:> On 9/4/07, Dan Gohman <djg at cray.com> wrote: > > On Fri, Aug 31, 2007 at 08:10:33PM -0400, Daniel Berlin wrote: > > > + template <int ElementSize> > > > + class SparseBitmap { > > > > Do you expect clients will often want custom ElementSize values? Otherwise, > > it seems like this template parameter should be given a default value, or > > even just removed from the API. > > So, actually, at least in GCC, we discovered that we could speed up > some dataflow problems a lot with varying the element size to 256, and > on the other size, interference graphs could be reduced in space and > time a lot by varying the element size down to 64 > I'm happy to give it a default value of 128.Interesting. Out of curiosity, do you happen to know roughly the sizes and sparsities where using 256-bit elements was faster? Looking at the implementation a little more, I see ElementList is a list of pointers to elements: + typedef std::list<SparseBitmapElement<ElementSize> *> ElementList; The extra pointer indirection is unfortunate; It'd be nice to have just a list of elements. I guess std::list's push_back always makes an element copy, which might not be ideal, though that could be weighed against the extra operator new call that the current code does. Another option would be to use <llvm/ADT/ilist>, which looks like it can do insert/push_back without making copies. Either of these approaches would also fix what looks like a leak in operator&=, where elements are erased from the list without being deleted. Dan -- Dan Gohman, Cray Inc.
On 9/4/07, Dan Gohman <djg at cray.com> wrote:> On Tue, Sep 04, 2007 at 10:35:10AM -0400, Daniel Berlin wrote: > > On 9/4/07, Dan Gohman <djg at cray.com> wrote: > > > On Fri, Aug 31, 2007 at 08:10:33PM -0400, Daniel Berlin wrote: > > > > + template <int ElementSize> > > > > + class SparseBitmap { > > > > > > Do you expect clients will often want custom ElementSize values? Otherwise, > > > it seems like this template parameter should be given a default value, or > > > even just removed from the API. > > > > So, actually, at least in GCC, we discovered that we could speed up > > some dataflow problems a lot with varying the element size to 256, and > > on the other size, interference graphs could be reduced in space and > > time a lot by varying the element size down to 64 > > I'm happy to give it a default value of 128. > > Interesting. Out of curiosity, do you happen to know roughly the sizes > and sparsities where using 256-bit elements was faster?They were pretty sparse, but did enough random bit testing that the grouping of bits helped. At some point later, I introduced a sparse bitmap that is more memory intensive but much faster at random bit testing, and it was a win here as well, but not worth the memory cost :)> > Looking at the implementation a little more, I see ElementList is a list > of pointers to elements: > > + typedef std::list<SparseBitmapElement<ElementSize> *> ElementList; > > The extra pointer indirection is unfortunate; It'd be nice to have just > a list of elements. I guess std::list's push_back always makes an > element copy, which might not be ideal, though that could be weighed > against the extra operator new call that the current code does. Another > option would be to use <llvm/ADT/ilist>, which looks like it can do > insert/push_back without making copies. Either of these approaches would > also fix what looks like a leak in operator&=, where elements are erased > from the list without being deleted.I seriously considered not using indirection, but it made a lot of the functions more annoying, and it didn't seem worth the benefits, given that if the compiler goes and makes element copies behind your back, it becomes really really really bad :) In GCC, you can use a custom allocator with the bitmaps that will reuse freed elements and keep a free list so you could blow them all away fast (by linking them directly into the free list). This is helpful for per-iteration pools of bitmaps, which is useful to points-to analysis. I was too lazy to make an operator new that emulated this functionality. Also, someone on IRC asked about the weird function naming conventions. I know most of llvm uses mixedCase, but i copied the style of BitVector, which has a weird mix. I'm happy to change it :) In any case, updated patch attached. -------------- next part -------------- A non-text attachment was scrubbed... Name: sparsebitmap.diff Type: text/x-diff Size: 18889 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070904/0f15bcef/attachment.diff>