Ehud, can you elaborate on which classes you're trying to change. I know some of the classes already use methods like getSubclassDataFromInstruction() to hide bits from the subclasses. They could probably shift the data too. ~Craig On Fri, Dec 27, 2019 at 9:35 AM Bruno Ricci via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi, > > On 26/12/2019 20:53, Ehud Katz via llvm-dev wrote: > > I've tested it on MSVC, gcc, clang and icc. > > > > The solution in clang (in Decl.h) is not ideal (as you have said > yourself). The solution I offer, is using a union of fields of class > BitField (this is a new class that implements a bitfield of a specific type > requested). With this, the definition, of the required bitfields in the > subclass data, remains in the hands of the actual class needing them. And > these are all restricted hierarchically by the superclasses of that class. > > I spent some time packing various bits of data into bit-fields in clang. > My experience > is that the solution in clang actually works just fine. The Stmt hierarchy > has a huge > number of bit-field classes (I count more than 40! [1]), but because the > offsets are > relative adding a bit means only adding one to an enum value a few lines > below > (a static_assert then checks that the union is not too large). > > [1] > https://github.com/llvm/llvm-project/blob/f20fc65887e2085332d111ee0b05736794423c72/clang/include/clang/AST/Stmt.h#L948 > > Bruno Ricci > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191227/2c2706fb/attachment.html>
The solution in Clang is still very complicated and error prone. A lot of synchronization work (between different classes - at least in the same hierarchy) needs to be done manually. I'll summarize the capabilities of the 3 approaches in the following (kinda) table, using the following columns *[ Current LLVM | Clang | New RFC ]* : - *[_|X|X] static assert *- that the declared accumulated bitfields do not exceed the underlying subclass data size (note that int the New implementation it is automatically added on declaration) - *[_|_|X] automatic static assert *- is adding the static assert needs to be manually or is it done automatically with the declaration of the new bitfield. - *[_|_|X] runtime assert* - that a new value set, fits into the the bitfield (without truncation). - *[_|_|X] typed* - as opposed to using a representative type (like `int`) and then cast to the actual required type (like `bool` or `enum`). Typed (ordinary) bitfields cannot be implemented correctly in MSVC, as the types of all the bitfields must be of the same type. Using typed bitfields also saves us the need to synchronize the use of `unsigned/signed int` with the actual type needed. - *[X|_|X] declare in actual class* - as opposed to one of the base classes. - *[_|_|X] declare (a bitfield) in a single line* - as opposed to the need to declare helpers or somekind, like `enum` (manually). - *[_|_|X] clean bitfields* - without exposing a bit manipulation `enum`. - *[_|_|X] automatic inheritance of unused bits* - no need to get offset from super (manually). - *[_|_|X] automatic calculation of unused bits* - changing a single bitfield doesn't require any other change, but the actual bitfield itself (as opposed to changing also the sum of the bit count used by the class, in an `enum` - for exmple). - *[_|_|X] implicit reference to superclass* - as opposed to the need to use the base class' info explicitly. - *[_|_|X] no need to know anything about any of the base classes*. I think the table speaks for itself. Craig, regarding the `getSubclassDataFromInstruction()`, it still does not turn the tides of the table, above, into the current implementation. On Fri, Dec 27, 2019 at 8:57 PM Craig Topper <craig.topper at gmail.com> wrote:> Ehud, can you elaborate on which classes you're trying to change. I know > some of the classes already use methods > like getSubclassDataFromInstruction() to hide bits from the subclasses. > They could probably shift the data too. > > ~Craig > > > On Fri, Dec 27, 2019 at 9:35 AM Bruno Ricci via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hi, >> >> On 26/12/2019 20:53, Ehud Katz via llvm-dev wrote: >> > I've tested it on MSVC, gcc, clang and icc. >> > >> > The solution in clang (in Decl.h) is not ideal (as you have said >> yourself). The solution I offer, is using a union of fields of class >> BitField (this is a new class that implements a bitfield of a specific type >> requested). With this, the definition, of the required bitfields in the >> subclass data, remains in the hands of the actual class needing them. And >> these are all restricted hierarchically by the superclasses of that class. >> >> I spent some time packing various bits of data into bit-fields in clang. >> My experience >> is that the solution in clang actually works just fine. The Stmt >> hierarchy has a huge >> number of bit-field classes (I count more than 40! [1]), but because the >> offsets are >> relative adding a bit means only adding one to an enum value a few lines >> below >> (a static_assert then checks that the union is not too large). >> >> [1] >> https://github.com/llvm/llvm-project/blob/f20fc65887e2085332d111ee0b05736794423c72/clang/include/clang/AST/Stmt.h#L948 >> >> Bruno Ricci >> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191230/4bb98238/attachment.html>
Hi, Do you have some code we can look at (even if it is in a nasty unpolished state, just mark it WIP and put it on Phab) ? It is hard to evaluate an alternative without the code. That said I think that the table is a little bit one-sided. I have added some inline comments. On 30/12/2019 11:53, Ehud Katz wrote:> The solution in Clang is still very complicated and error prone. A lot of synchronization work (between different classes - at least in the same hierarchy) needs to be done manually. > > I'll summarize the capabilities of the 3 approaches in the following (kinda) table, using the following columns *[ Current LLVM | Clang | New RFC ]* : > > * *[_|X|X] static assert *- that the declared accumulated bitfields do not exceed the underlying subclass data size (note that int the New implementation it is automatically added on declaration) > o *[_|_|X] automatic static assert *- is adding the static assert needs to be manually or is it done automatically with the declaration of the new bitfield.This is not actually true. There is only a single static_assert for the size of the union in Stmt [1] and the same could be done for the union in DeclBase.> * *[_|_|X] runtime assert* - that a new value set, fits into the the bitfield (without truncation).This is true, I agree here that this is useful.> * *[_|_|X] typed* - as opposed to using a representative type (like `int`) and then cast to the actual required type (like `bool` or `enum`). Typed (ordinary) bitfields cannot be implemented correctly in MSVC, as the types of all the bitfields must be of the same type. Using typed bitfields also saves us the need to synchronize the use of `unsigned/signed int` with the actual type needed. > * *[X|_|X] declare in actual class* - as opposed to one of the base classes.True, but I am curious to see how you can avoid doing this in the base. The whole point of doing this bit-field thing is to save space by reusing the padding after the base subobject. Maybe with an aligned char array reinterpret_cast'ed by the derived classes as mentioned before? The trick is that we don't want to repeat the members from the base classes. Also we have to avoid UB.> * *[_|_|X] declare (a bitfield) in a single line* - as opposed to the need to declare helpers or somekind, like `enum` (manually)> * *[_|_|X] clean bitfields* - without exposing a bit manipulation `enum`.The enum is only used by the other bit-field classes. This is invisible to the class itself. No helpers are needed: just write SomeClassBits.SomeBit to refer to SomeBit in the bit-field of SomeClass.> * *[_|_|X] automatic inheritance of unused bits* - no need to get offset from super (manually). > * *[_|_|X] automatic calculation of unused bits* - changing a single bitfield doesn't require any other change, but the actual bitfield itself (as opposed to changing also the sum of the bit count used by the class, in an `enum` - for exmple). > * *[_|_|X] implicit reference to superclass* - as opposed to the need to use the base class' info explicitly. > * *[_|_|X] no need to know anything about any of the base classes*.I agree that forgetting to update the offset in the enum can be a source of error (but the enum is just a few lines below so it is hard to miss). Some things to keep in mind: - Some bit-fields are aligned to a byte boundary for (benchmarked) faster access [2] - Some bit-field classes have a hole which is used by derived classes [3] - Some bit-fields do not have a fixed size, just something "large enough". (this is not ideal and it would be better to have a well-defined limit which could then be used to trigger an error instead of just overflowing, that's on my TODO list...) [4]> > I think the table speaks for itself. > > Craig, regarding the `getSubclassDataFromInstruction()`, it still does not turn the tides of the table, above, into the current implementation. >Cheers! [1] https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L1088 [2] https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L471 [3] https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L849 [4] https://github.com/llvm/llvm-project/blob/master/clang/include/clang/AST/Stmt.h#L804 Bruno Ricci