On 9/4/07, Dan Gohman <djg at cray.com> wrote:> On Tue, Sep 04, 2007 at 10:35:10AM -0400, Daniel Berlin wrote: > > On 9/4/07, Dan Gohman <djg at cray.com> wrote: > > > On Fri, Aug 31, 2007 at 08:10:33PM -0400, Daniel Berlin wrote: > > > > + template <int ElementSize> > > > > + class SparseBitmap { > > > > > > Do you expect clients will often want custom ElementSize values? Otherwise, > > > it seems like this template parameter should be given a default value, or > > > even just removed from the API. > > > > So, actually, at least in GCC, we discovered that we could speed up > > some dataflow problems a lot with varying the element size to 256, and > > on the other size, interference graphs could be reduced in space and > > time a lot by varying the element size down to 64 > > I'm happy to give it a default value of 128. > > Interesting. Out of curiosity, do you happen to know roughly the sizes > and sparsities where using 256-bit elements was faster?They were pretty sparse, but did enough random bit testing that the grouping of bits helped. At some point later, I introduced a sparse bitmap that is more memory intensive but much faster at random bit testing, and it was a win here as well, but not worth the memory cost :)> > Looking at the implementation a little more, I see ElementList is a list > of pointers to elements: > > + typedef std::list<SparseBitmapElement<ElementSize> *> ElementList; > > The extra pointer indirection is unfortunate; It'd be nice to have just > a list of elements. I guess std::list's push_back always makes an > element copy, which might not be ideal, though that could be weighed > against the extra operator new call that the current code does. Another > option would be to use <llvm/ADT/ilist>, which looks like it can do > insert/push_back without making copies. Either of these approaches would > also fix what looks like a leak in operator&=, where elements are erased > from the list without being deleted.I seriously considered not using indirection, but it made a lot of the functions more annoying, and it didn't seem worth the benefits, given that if the compiler goes and makes element copies behind your back, it becomes really really really bad :) In GCC, you can use a custom allocator with the bitmaps that will reuse freed elements and keep a free list so you could blow them all away fast (by linking them directly into the free list). This is helpful for per-iteration pools of bitmaps, which is useful to points-to analysis. I was too lazy to make an operator new that emulated this functionality. Also, someone on IRC asked about the weird function naming conventions. I know most of llvm uses mixedCase, but i copied the style of BitVector, which has a weird mix. I'm happy to change it :) In any case, updated patch attached. -------------- next part -------------- A non-text attachment was scrubbed... Name: sparsebitmap.diff Type: text/x-diff Size: 18889 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070904/0f15bcef/attachment.diff>
On Sep 4, 2007, at 4:36 PM, Daniel Berlin wrote:> Also, someone on IRC asked about the weird function naming > conventions. I know most of llvm uses mixedCase, but i copied the > style of BitVector, which has a weird mix. I'm happy to change it :)Yes please. By default, it is a good idea to follow llvm coding conventions instead of copying violation of coding style from existing code. Otherwise it pollutes code and, in future, gives someone one additional reason to not follow preferred coding style :) While you're here please fix BitVector also. Thanks! - Devang -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070904/efe1520f/attachment.html>
On Sep 4, 2007, at 4:36 PM, Daniel Berlin wrote: [snip] Don't forget to update ProgrammersManual.html "Picking the Right Data Structure for a Task" section. :) - Devang
On 9/4/07, Devang Patel <dpatel at apple.com> wrote:> > > On Sep 4, 2007, at 4:36 PM, Daniel Berlin wrote: > > > Also, someone on IRC asked about the weird function naming > > conventions. I know most of llvm uses mixedCase, but i copied the > > style of BitVector, which has a weird mix. I'm happy to change it :) > Yes please.Fixed.> By default, it is a good idea to follow llvm coding conventions > instead of copying violation of coding style from existing code. Otherwise > it pollutes code and, in future, gives someone one additional reason to not > follow preferred coding style :) While you're here please fix BitVector > also.I'll do that in a followup patch, since BitVector has a bunch of users.
On 9/4/07, Devang Patel <dpatel at apple.com> wrote:> > On Sep 4, 2007, at 4:36 PM, Daniel Berlin wrote: > > [snip] > > Don't forget to update ProgrammersManual.html "Picking the Right > Data Structure for a Task" section. :)It doesn't talk about bitvector at all. I'm not sure whether i should add it to set like containers, or add a section about bitsets that talks about bitvector and sparsebitvector. In either case, I think this should be a followup patch, but if you guys really want it done as part of this one, I will.> > - > Devang > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
On Tue, 4 Sep 2007, Daniel Berlin wrote:>> insert/push_back without making copies. Either of these approaches would >> also fix what looks like a leak in operator&=, where elements are erased >> from the list without being deleted. > > I seriously considered not using indirection, but it made a lot of the > functions more annoying, and it didn't seem worth the benefits, given > that if the compiler goes and makes element copies behind your back, > it becomes really really really bad :)I agree that avoiding the indirection would be good, but, in practice, getting the data structure in and refining it later also is fine. Please valgrind it when it has a client to make sure it isn't leaking memory. Some minor comments: +/// SparseBitVector is an implementation of a bitvector that is sparse by only +/// storing the elements that have non-zero bits set. In order to make this +/// fast for the most common cases, SparseBitVector is implemented as a linked list Please wrap lines to fit in 80 columns. + template <int ElementSize = 128> + struct SparseBitVectorElement { + public: I'd suggest outdenting this. Indenting due to the namespace doesn't add any value. Also, I'd suggest making ElementSize 'unsigned' instead of int. + unsigned ElementIndex; // Index of Element in terms of where first bit + // starts Please use: /// ElementIndex - Index of Element in terms of where first bit starts. unsigned ElementIndex; Please end sentences in comments with periods :) + // This implementation is used for strict weak ordering guarantees. The + // only thing it does is compare the ElementIndex's + bool operator<(const SparseBitVectorElement &RHS) const { + return ElementIndex < RHS.ElementIndex; + } + bool operator<(unsigned Idx) const { + return ElementIndex < Idx; + } Are you sure that this is ok? For two Element's with equal indexes but different contents, this will return "equal" based on their element count. Generally !(x < y) && !(y < x) implies x == y. + typedef SparseBitVectorIterator iterator; It looks like the iterator doesn't allow modification of the iterator, as such you could also typedef "const_iterator". + while (ElementIter != RHS.Elements.end()) + Elements.push_back(new SparseBitVectorElement<ElementSize>(*(*ElementIter))); line too long. + if (BecameZero) { + ElementListIter IterTmp = Iter1; + delete *IterTmp; A few tabs are in here before delete, please convert to spaces. + iterator begin() { + return iterator(*this); + } + + iterator end() { + return iterator(*this, ~0); + } Can these both be const? -Chris -- http://nondot.org/sabre/ http://llvm.org/
On 9/7/07, Chris Lattner <sabre at nondot.org> wrote:> On Tue, 4 Sep 2007, Daniel Berlin wrote: > >> insert/push_back without making copies. Either of these approaches would > >> also fix what looks like a leak in operator&=, where elements are erased > >> from the list without being deleted. > > > > I seriously considered not using indirection, but it made a lot of the > > functions more annoying, and it didn't seem worth the benefits, given > > that if the compiler goes and makes element copies behind your back, > > it becomes really really really bad :) > > I agree that avoiding the indirection would be good, but, in practice, > getting the data structure in and refining it later also is fine. > > Please valgrind it when it has a client to make sure it isn't leaking > memory. > > Some minor comments: > > +/// SparseBitVector is an implementation of a bitvector that is sparse by only > +/// storing the elements that have non-zero bits set. In order to make this > +/// fast for the most common cases, SparseBitVector is implemented as a linked list > > Please wrap lines to fit in 80 columns.Done> > + template <int ElementSize = 128> > + struct SparseBitVectorElement { > + public: > > I'd suggest outdenting this. Indenting due to the namespace doesn't add > any value. Also, I'd suggest making ElementSize 'unsigned' instead of > int.Done> > + unsigned ElementIndex; // Index of Element in terms of where first bit > + // starts > > Please use: > /// ElementIndex - Index of Element in terms of where first bit starts. > unsigned ElementIndex; > > Please end sentences in comments with periods :)Fixed> > + // This implementation is used for strict weak ordering guarantees. The > + // only thing it does is compare the ElementIndex's > + bool operator<(const SparseBitVectorElement &RHS) const { > + return ElementIndex < RHS.ElementIndex; > + } > + bool operator<(unsigned Idx) const { > + return ElementIndex < Idx; > + } > > Are you sure that this is ok? For two Element's with equal indexes but > different contents, this will return "equal" based on their element count. > Generally !(x < y) && !(y < x) implies x == y.I removed both operator's since they weren't being used anymore. It was when i was doing work on testing whether we could remove the indirection/other containers.> > + typedef SparseBitVectorIterator iterator; > > It looks like the iterator doesn't allow modification of the iterator, as > such you could also typedef "const_iterator".Done> > + while (ElementIter != RHS.Elements.end()) > + Elements.push_back(new SparseBitVectorElement<ElementSize>(*(*ElementIter))); > > line too long.Split.> > + if (BecameZero) { > + ElementListIter IterTmp = Iter1; > + delete *IterTmp; > > A few tabs are in here before delete, please convert to spaces.M-x whitespace-cleanup done :)> > > + iterator begin() { > + return iterator(*this); > + } > + > + iterator end() { > + return iterator(*this, ~0); > + } > > Can these both be const?Done Attached updated patch.> > -Chris > > -- > http://nondot.org/sabre/ > http://llvm.org/ > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-------------- next part -------------- A non-text attachment was scrubbed... Name: sparsebitmap.diff Type: text/x-patch Size: 17857 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070907/f158cd31/attachment.bin>