Matthias Braun via llvm-dev
2018-Aug-29 00:35 UTC
[llvm-dev] [cfe-dev] Identifying objects within BumpPtrAllocator.
This is a great idea!

I personally also wouldn't mind going further in debug builds and actually create and store sequential IDs with the objects and take the small memory hit for improved debuggability. The `PersistentId` field in SelectionDAG works that way and has helped make the output more readable IMO.

- Matthias

> On Aug 28, 2018, at 5:22 PM, George Karpenkov via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Patch available at https://reviews.llvm.org/D51393
>
> I would really love to see this in the static analyzer, but I think all other dumping facilities could greatly benefit as well.
>
>> On Aug 28, 2018, at 5:14 PM, Artem Dergachev via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>
>> In various debug dumps (e.g., Clang's -ast-dump), various objects (e.g., Stmts and Decls in that -ast-dump) are identified by pointers. It's very reliable in the sense that no two objects would ever have the same pointer at the same time, but it's unpleasant that pointers change across runs. Having deterministic identifiers instead of pointers would aid debugging: imagine a conditional break by object identifier that has not yet been constructed, or simply trying to align two debug dumps of different kinds from different runs together. Additionally, pointers are hard to read and memorize; it's hard to notice the difference between 0x7f80a28325e0 and 0x7f80a28325a0, especially when they're a few screens apart.
>>
>> Hence the idea: why don't we print the offset into the allocator's memory slab instead of a pointer? We use BumpPtrAllocator all over the place, which boils down to a set of slabs on which all objects are placed in the order in which they are allocated. It is easy for the allocator to identify if a pointer belongs to that allocator, and if so, determine which slab it belongs to and at what offset the object is in that slab. Therefore it is possible to identify the object by its (slab index, offset) pair. E.g., "TypedefDecl 0:528" (you already memorized it) instead of "TypedefDecl 0x7f80a28325e0". This could be applied to all sorts of objects that live in BumpPtrAllocators.
>>
>> In order to compute such an identifier, we only need access to the object and to the allocator. No additional memory is used to store such an identifier. Such an identifier would also be persistent across runs as long as the same objects are allocated in the same order, which is, I suspect, often the case.
>>
>> One of the downsides of this identifier is that it's not going to be the same on different machines, because the same data structure may require different amounts of memory on different hosts. So it wouldn't necessarily help understanding a dump that the user sent you. But it still seems to be better than pointers.
>>
>> Should we go ahead and try to implement it?
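For readers following the thread, the (slab index, offset) scheme from the quoted proposal can be sketched roughly as below. This is only an illustration under stated assumptions: `ToyBumpAllocator`, `identify`, `describe`, and the fixed 4096-byte `SlabSize` are hypothetical names and values made up for this example. It is not the code from D51393 and not the real `llvm::BumpPtrAllocator`, and it ignores alignment and oversized allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical slab size for this sketch only.
constexpr std::size_t SlabSize = 4096;

// Toy bump allocator (NOT llvm::BumpPtrAllocator): objects are placed in
// slabs in allocation order, so (slab index, offset) repeats across runs
// as long as the same sizes are requested in the same order.
class ToyBumpAllocator {
public:
  // Hand out Size bytes from the current slab; start a new slab when full.
  void *allocate(std::size_t Size) {
    assert(Size <= SlabSize && "toy allocator has no oversized slabs");
    if (Slabs.empty() || Used + Size > SlabSize) {
      Slabs.emplace_back(new unsigned char[SlabSize]);
      Used = 0;
    }
    void *Ptr = Slabs.back().get() + Used;
    Used += Size;
    return Ptr;
  }

  // If Ptr lies inside one of our slabs, report which one and where.
  bool identify(const void *Ptr, std::size_t &Slab, std::size_t &Offset) const {
    auto Addr = reinterpret_cast<std::uintptr_t>(Ptr);
    for (std::size_t I = 0; I < Slabs.size(); ++I) {
      auto Begin = reinterpret_cast<std::uintptr_t>(Slabs[I].get());
      if (Addr >= Begin && Addr < Begin + SlabSize) {
        Slab = I;
        Offset = static_cast<std::size_t>(Addr - Begin);
        return true;
      }
    }
    return false; // Not allocated by this allocator.
  }

private:
  std::vector<std::unique_ptr<unsigned char[]>> Slabs;
  std::size_t Used = 0; // Bytes consumed in the current (last) slab.
};

// Format a dump label such as "TypedefDecl 0:528".
std::string describe(const ToyBumpAllocator &A, const std::string &Kind,
                     const void *Ptr) {
  std::size_t Slab = 0, Offset = 0;
  if (!A.identify(Ptr, Slab, Offset))
    return Kind + " <not in allocator>";
  return Kind + " " + std::to_string(Slab) + ":" + std::to_string(Offset);
}
```

In this toy version, an object allocated right after a 528-byte object in a fresh allocator would be labelled `0:528`, and no per-object storage is needed to compute that label. The linear scan over slabs is the simplest choice for a sketch; a real allocator could look the slab up differently.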
George Karpenkov via llvm-dev
2018-Aug-29 00:36 UTC
[llvm-dev] [cfe-dev] Identifying objects within BumpPtrAllocator.
> On Aug 28, 2018, at 5:35 PM, Matthias Braun <mbraun at apple.com> wrote:
>
> This is a great idea!
>
> I personally also wouldn't mind going further in debug builds and actually create and store sequential IDs with the objects and take the small memory hit for improved debuggability.

We’ve debated that for a while.
The approach of using the allocators has the advantage that it’s readily applicable to all printers without any overhead. The downside is that the numbers are less readable.

> The `PersistentId` field in SelectionDAG works that way and has helped make the output more readable IMO.
>
> - Matthias
George Karpenkov via llvm-dev
2018-Aug-29 00:39 UTC
[llvm-dev] [cfe-dev] Identifying objects within BumpPtrAllocator.
Forgot to mention: I personally very rarely use debug builds, and to me having a reproducible printer only in debug builds would be a huge downside.
Matthias Braun via llvm-dev
2018-Aug-29 00:40 UTC
[llvm-dev] [cfe-dev] Identifying objects within BumpPtrAllocator.
> On Aug 28, 2018, at 5:36 PM, George Karpenkov <ekarpenkov at apple.com> wrote:
>
>> On Aug 28, 2018, at 5:35 PM, Matthias Braun <mbraun at apple.com> wrote:
>>
>> This is a great idea!
>>
>> I personally also wouldn't mind going further in debug builds and actually create and store sequential IDs with the objects and take the small memory hit for improved debuggability.
>
> We’ve debated that for a while.
> The approach with using the allocators has an advantage that it’s readily applicable to all printers without any overhead. The downside is that numbers are less readable.

Sure, and the nice effect of your approach is that it works with release builds as well.

Anyway, it's not either/or here. Printing BumpPtr distances instead of pointers should work really well in pretty much all cases and we should do that now! I'm just saying that storing sequential IDs with the objects in debug mode would be a possible addition on top of that for the future...

- Matthias
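As a rough illustration of the "sequential IDs stored with the objects in debug mode" idea discussed above, here is a minimal sketch. `ToyNode`, `ToyGraph`, `createNode`, and `label` are hypothetical names invented for this example; the field name `PersistentId` is borrowed from the thread only as a label, and nothing here is meant to mirror how SelectionDAG actually implements it. The sketch assumes "debug mode" means `NDEBUG` is not defined.

```cpp
#include <cstdint>
#include <deque>
#include <sstream>
#include <string>

// Hypothetical node type: the ID field exists only when assertions are
// enabled, so builds with NDEBUG defined pay no per-object memory cost.
struct ToyNode {
#ifndef NDEBUG
  std::uint32_t PersistentId = 0;
#endif
};

// Hypothetical owner that creates nodes and numbers them in creation order.
class ToyGraph {
public:
  ToyNode &createNode() {
    Nodes.emplace_back(); // std::deque keeps earlier references valid.
#ifndef NDEBUG
    Nodes.back().PersistentId = NextId++;
#endif
    return Nodes.back();
  }

  // Dump label: "t<N>" when IDs are available, the raw address otherwise.
  static std::string label(const ToyNode &N) {
    std::ostringstream OS;
#ifndef NDEBUG
    OS << "t" << N.PersistentId;
#else
    OS << static_cast<const void *>(&N);
#endif
    return OS.str();
  }

private:
  std::deque<ToyNode> Nodes;
#ifndef NDEBUG
  std::uint32_t NextId = 0; // Per-graph counter, reset for every new graph.
#endif
};
```

The trade-off matches the one raised in the thread: the labels are short and stable, but in this sketch they disappear when `NDEBUG` is defined, whereas allocator-based offsets need no stored field and work in every build.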