John McCall via llvm-dev
2018-Apr-30 19:05 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote:
>> The LLVM address space design has pushed well beyond the sensible boundaries
>> of less-is-more and really needs some concerted effort to actually define the expected
>> properties of different address spaces instead of a dozen different engineers applying
>> a "don't do this optimization if the pointer is in a non-zero address space" rule to the
>> optimizer with a shotgun.
>>
>> In fact, if we'd already done that, we wouldn't need any sort of address-space hack
>> to support this request. We'd just need a very simple audit of the places that check
>> the "are dereferences of the zero address undefined behavior" bit to make sure that
>> they honor it even in address space 0. But instead that audit will be confused by a
>> thousand places that just bail out for non-zero address spaces without further
>> explanation.
>
> I agree. The pattern of bailing out if AddrSpace != 0 is unfortunate.
>
> We also need to cap the amount of extra semantics that can be put on address
> spaces. For instance, we should probably never support trapping semantics on
> loads/stores, even via address spaces.

I would say instead that address spaces are not the right way to support trapping
semantics on loads/stores.

John.
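As an illustration of the distinction drawn in this message (this is not code from the thread or from LLVM), the following minimal C++ sketch contrasts the "bail out if AddrSpace != 0" pattern with a query on an explicit property. `AddrSpaceInfo`, `mayDeleteNullCheckAdHoc`, and `mayDeleteNullCheckByProperty` are hypothetical names invented for this sketch, and the address space number 3 is an arbitrary stand-in for any non-zero space.

```cpp
#include <cassert>

// Hypothetical, simplified model of an address space. These names are
// illustrative only and are not LLVM APIs.
struct AddrSpaceInfo {
  unsigned id;                // numeric address space; 0 is the default
  bool nullDerefIsUndefined;  // may the optimizer treat a null dereference as UB?
};

// The ad-hoc pattern: optimize only in address space 0, without naming
// the property the transformation actually depends on.
bool mayDeleteNullCheckAdHoc(const AddrSpaceInfo &as) {
  return as.id == 0;
}

// The property-based alternative: ask the one question that matters,
// regardless of the address space number.
bool mayDeleteNullCheckByProperty(const AddrSpaceInfo &as) {
  return as.nullDerefIsUndefined;
}

void demoNullCheckPolicy() {
  const AddrSpaceInfo defaultAS{0, true};
  // Address space 0 where null checks must be kept, as requested by a flag
  // such as -fno-delete-null-pointer-checks.
  const AddrSpaceInfo as0KeepChecks{0, false};
  const AddrSpaceInfo otherAS{3, true};  // hypothetical non-zero space

  // Both rules agree on the ordinary default case.
  assert(mayDeleteNullCheckAdHoc(defaultAS));
  assert(mayDeleteNullCheckByProperty(defaultAS));

  // The ad-hoc rule ignores the request in address space 0 and is
  // needlessly pessimistic elsewhere.
  assert(mayDeleteNullCheckAdHoc(as0KeepChecks));
  assert(!mayDeleteNullCheckAdHoc(otherAS));

  // The property-based rule handles both cases.
  assert(!mayDeleteNullCheckByProperty(as0KeepChecks));
  assert(mayDeleteNullCheckByProperty(otherAS));
}
```

Under this model, the audit described in the message reduces to finding every caller of the property query, which is hard to do when the dependence is hidden behind comparisons against address space 0.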
David Zarzycki via llvm-dev
2018-Apr-30 20:26 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
> On Apr 30, 2018, at 15:05, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>
>> We also need to cap the amount of extra semantics that can be put on address
>> spaces. For instance, we should probably never support trapping semantics on
>> loads/stores, even via address spaces.
>
> I would say instead that address spaces are not the right way to support trapping
> semantics on loads/stores.

Hi John,

I might be misunderstanding the thread here, but are there architectures other than Intel that support alternative address spaces? I’m asking because x86_64 dropped support for having the code, data, stack, and “extra” segments be different from each other; and the only two remaining segment registers, “FS” and “GS”, are only used in practice to alias the current address space.

In fact, *user-space* instructions were later added to read/write the FS/GS segment bases, thus embracing the fact that these segment registers are used in practice to alias the current address space.[1]

I don’t think LLVM needs to model FS/GS as anything other than aliases into the existing address space.

Dave

[1] – Note, these new user-space instructions require permission from the kernel to execute, and popular kernels haven’t enabled them. Last I knew, the Linux folks seemed receptive to the idea of enabling these instructions, but the conversation keeps stalling on implementation details.
Richard Smith via llvm-dev
2018-Apr-30 20:35 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
On 30 April 2018 at 13:26, David Zarzycki via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I might be misunderstanding the thread here, but are there architectures
> other than Intel that support alternative address spaces?

Yes, they're also used by GPU targets IIUC.
John McCall via llvm-dev
2018-Apr-30 20:37 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
> On Apr 30, 2018, at 4:26 PM, David Zarzycki <dave at znu.io> wrote:
>
> I might be misunderstanding the thread here, but are there architectures other than Intel that support alternative address spaces?

Yes. They're commonplace in GPUs and also used in some distributed system architectures. Also, any x32-like ABI can support a short/long pointer distinction.

John.
Joerg Sonnenberger via llvm-dev
2018-Apr-30 22:25 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
On Mon, Apr 30, 2018 at 04:26:47PM -0400, David Zarzycki via llvm-dev wrote:
> I might be misunderstanding the thread here, but are there architectures
> other than Intel that support alternative address spaces?

Yes, efficient support for separate kernel and user VA used to be quite common and effectively means two separate address spaces.

Joerg
David Chisnall via llvm-dev
2018-May-01 09:14 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
On 30 Apr 2018, at 21:26, David Zarzycki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I might be misunderstanding the thread here, but are there architectures other than Intel that support alternative address spaces? I’m asking because x86_64 dropped support for having the code, data, stack, and “extra” segments be different from each other; and the only two remaining segment registers, “FS” and “GS”, are only used in practice to alias the current address space. In fact, *user-space* instructions were later added to read/write the FS/GS segment bases, thus embracing the fact that these segment registers are used in practice to alias the current address space.[1]

I’m not 100% sure if you’re asking whether processors support different address spaces, or whether targets use LLVM’s notion of an address space, so I’ll try to answer both.

To the first interpretation of your question:

Others have pointed out that GPUs have different memory regions (shared mutable, shared immutable, local, and so on). Any processor with an MMU supports some notion of address spaces, the simplest of which involves multiple completely distinct address spaces. This is somewhat complicated by shared memory. In the C abstract machine, there is no difference between pointers to shared and unshared memory, which is unfortunate as the safety of storing pointers in such regions can vary. In OpenCL, the host can map regions into which it is safe to store pointers that are valid on both the host and device, which a more sane language than C would regard as a separate address space.

In terms of out-of-tree architectures, a large number of embedded processors have different regions for (for example) stack, code ROM, data ROM, and heap. Some have different overlapping shared regions. The architecture that I’ve worked on for the last 6 years, CHERI, provides a flexible notion of address spaces allowing a model like segmentation at the coarse granularity for sandboxing legacy code (with 64-bit integers as pointers), or fine-grained memory safety by representing every pointer as a 128-bit hardware-enforced type that encodes bounds and permissions.

To the second interpretation of your question:

GPUs use different address spaces for their different memory types, as do out-of-tree embedded targets. Azul and the (apparently now dead) CLR back end using LLVM used AS1 to indicate that a pointer was to GC’d memory. We use AS200 to indicate a 128-bit fat pointer and AS0 to indicate a 64-bit pointer (which is implicitly relative to a default 128-bit pointer in a special register).

It’s worth noting that LLVM’s notion of an address space is a property of the pointer, whereas embedded C regards it as a property of the underlying memory. This means that it is always syntactically valid to cast between address spaces in LLVM IR, though the result may be a non-dereferenceable pointer. This is somewhat problematic for optimisers, because this information is not well expressed. For us, for example, casting from AS0 to AS200 always results in a pointer that is valid if the original is valid, but casting from AS200 to AS0 may produce a null pointer. We’ve had to do a lot of cleanup on optimisers to prevent them from generating broken code as a result.

The current model of an AS conflates two notions: a different region of memory (potentially with different properties) and a different kind of pointer (potentially with different properties). It would be nice to decouple these and provide a mechanism similar to function attributes that would allow properties on pointers to be expressed, in an orthogonal manner to address spaces. This would require moving some information (such as pointer size) into the attributes, but would probably be a long-term cleaner approach.

This would probably be easier after the typeless pointer work is completed, so that pointers are all of type PTR but with attributes indicating their other properties. The AMD GPU, for example, could benefit from having an attribute indicating that -1, rather than 0, is the ‘invalid pointer’ value for some kinds. Other useful information would include aliasing scopes (which could be updated on inlining), values that are guaranteed not to be dereferenced, whether out-of-bounds values are representable, and so on.

David
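To make the idea of per-pointer properties concrete, here is a small C++ sketch under stated assumptions; it is not taken from LLVM or from the CHERI or AMDGPU back ends. `PointerProps`, `propsFor`, `classifyCast`, and `isInvalidPointer` are hypothetical names. The AS0 and AS200 rows follow the description in the message above, while `kSentinelAS` (1000) is a made-up address space number standing in for "some kind of pointer whose invalid value is -1". The sketch models only the low 64 bits of a pointer value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical property record attached to a kind of pointer.
struct PointerProps {
  unsigned sizeInBits;         // width of the pointer representation
  std::uint64_t invalidValue;  // bit pattern that means "invalid pointer"
};

const unsigned kIntegerPtrAS = 0;   // 64-bit pointer, as described above
const unsigned kFatPtrAS = 200;     // 128-bit fat pointer, as described above
const unsigned kSentinelAS = 1000;  // made-up number: invalid pointer is -1

// Look up the assumed properties for an address space.
PointerProps propsFor(unsigned addrSpace) {
  switch (addrSpace) {
  case kFatPtrAS:
    return {128, 0};
  case kSentinelAS:
    return {64, ~std::uint64_t{0}};
  default:
    return {64, 0};
  }
}

enum class CastResult { ValidIfSourceValid, MayBeNull, Unknown };

// Encode what an optimiser would need to know about an address space cast.
CastResult classifyCast(unsigned from, unsigned to) {
  if (from == to)
    return CastResult::ValidIfSourceValid;
  if (from == kIntegerPtrAS && to == kFatPtrAS)
    return CastResult::ValidIfSourceValid;  // widening keeps validity
  if (from == kFatPtrAS && to == kIntegerPtrAS)
    return CastResult::MayBeNull;           // narrowing may fail
  return CastResult::Unknown;               // nothing is known: be conservative
}

// Compare against the per-address-space invalid value rather than assuming 0.
bool isInvalidPointer(unsigned addrSpace, std::uint64_t bits) {
  return bits == propsFor(addrSpace).invalidValue;
}
```

With a table like this, a transformation that wants to carry a "known non-null" fact across a cast would consult `classifyCast` and give up only when the answer is `MayBeNull` or `Unknown`, instead of special-casing address space numbers.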