thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang [Apr 2018]


John McCall via llvm-dev

2018-Apr-30 19:05 UTC

[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang

> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote:
>> The LLVM address space design has pushed well beyond the sensible boundaries
>> of less-is-more and really needs some concerted effort to actually define the
>> expected properties of different address spaces instead of a dozen different
>> engineers applying a "don't do this optimization if the pointer is in a
>> non-zero address space" rule to the optimizer with a shotgun.
>>
>> In fact, if we'd already done that, we wouldn't need any sort of
>> address-space hack to support this request.  We'd just need a very simple
>> audit of the places that check the "are dereferences of the zero address
>> undefined behavior" bit to make sure that they honor it even in address
>> space 0.  But instead that audit will be confused by a thousand places that
>> just bail out for non-zero address spaces without further explanation.
>
> I agree.  The pattern of bailing out if AddrSpace != 0 is unfortunate.
>
> We also need to cap the amount of extra semantics that can be put on address
> spaces.  For instance, we should probably never support trapping semantics on
> loads/stores, even via address spaces.

I would say instead that address spaces are not the right way to support
trapping semantics on loads/stores.

John.
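
To make the contrast in the message above concrete, here is a minimal C++ sketch of the two styles of check: the ad hoc "bail out unless the address space is 0" test, and a query against an explicit per-address-space property. Everything in it (`AddrSpaceProps`, `TargetModel`, `canDropNullCheck`, `canDropNullCheckAdHoc`) is hypothetical and invented for illustration; none of it is an LLVM API, and it is not meant to describe how LLVM actually implements this.

```cpp
#include <cassert>
#include <map>

// Hypothetical per-address-space properties (illustration only, not LLVM).
struct AddrSpaceProps {
  // True if the optimizer may assume a dereference of the zero address is
  // undefined behavior in this address space.
  bool nullDerefIsUndefined;
};

// Hypothetical table of address-space properties for one compilation.
class TargetModel {
public:
  void set(unsigned addrSpace, AddrSpaceProps props) {
    props_[addrSpace] = props;
  }

  // Address spaces with no recorded properties get the conservative answer.
  bool nullDerefIsUndefined(unsigned addrSpace) const {
    auto it = props_.find(addrSpace);
    return it != props_.end() && it->second.nullDerefIsUndefined;
  }

private:
  std::map<unsigned, AddrSpaceProps> props_;
};

// The ad hoc pattern: optimize only in address space 0, with no stated
// reason, and with no way to turn the assumption off for address space 0.
inline bool canDropNullCheckAdHoc(unsigned addrSpace) {
  return addrSpace == 0;
}

// The property-based pattern: ask the one question that matters, and honor
// the answer in every address space, including 0.
inline bool canDropNullCheck(const TargetModel &target, unsigned addrSpace) {
  return target.nullDerefIsUndefined(addrSpace);
}
```

Under this sketch, a mode like -fno-delete-null-pointer-checks would amount to clearing the bit for address space 0 (for example `target.set(0, {false})`), after which `canDropNullCheck(target, 0)` is false while the ad hoc check still answers true.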

David Zarzycki via llvm-dev

2018-Apr-30 20:26 UTC

> On Apr 30, 2018, at 15:05, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>> On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote:
>>> The LLVM address space design has pushed well beyond the sensible
>>> boundaries of less-is-more and really needs some concerted effort to
>>> actually define the expected properties of different address spaces instead
>>> of a dozen different engineers applying a "don't do this optimization if
>>> the pointer is in a non-zero address space" rule to the optimizer with a
>>> shotgun.
>>>
>>> In fact, if we'd already done that, we wouldn't need any sort of
>>> address-space hack to support this request.  We'd just need a very simple
>>> audit of the places that check the "are dereferences of the zero address
>>> undefined behavior" bit to make sure that they honor it even in address
>>> space 0.  But instead that audit will be confused by a thousand places that
>>> just bail out for non-zero address spaces without further explanation.
>>
>> I agree.  The pattern of bailing out if AddrSpace != 0 is unfortunate.
>>
>> We also need to cap the amount of extra semantics that can be put on
>> address spaces.  For instance, we should probably never support trapping
>> semantics on loads/stores, even via address spaces.
>
> I would say instead that address spaces are not the right way to support
> trapping semantics on loads/stores.

Hi John,

I might be misunderstanding the thread here, but are there architectures other
than Intel that support alternative address spaces? I’m asking because x86_64
dropped support for having the code, data, stack, and “extra” segments be
different from each other; and the only two remaining segment registers, “FS”
and “GS”, are only used in practice to alias the current address space. In fact,
*user-space* instructions were later added to read/write the FS/GS segment
bases, thus embracing the fact that these segment registers are used in practice
to alias the current address space.[1]

I don’t think LLVM needs to model FS/GS as anything other than aliases into the
existing address space.

Dave

[1] – Note, these new user-space instructions require permission from the kernel
to execute, and popular kernels haven’t enabled them. Last I knew, the Linux
folks seem receptive to the idea of enabling these instructions, but the
conversation keeps stalling on implementation details.

Richard Smith via llvm-dev

2018-Apr-30 20:35 UTC

On 30 April 2018 at 13:26, David Zarzycki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> On Apr 30, 2018, at 15:05, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>> On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote:
>>>> The LLVM address space design has pushed well beyond the sensible
>>>> boundaries of less-is-more and really needs some concerted effort to
>>>> actually define the expected properties of different address spaces
>>>> instead of a dozen different engineers applying a "don't do this
>>>> optimization if the pointer is in a non-zero address space" rule to the
>>>> optimizer with a shotgun.
>>>>
>>>> In fact, if we'd already done that, we wouldn't need any sort of
>>>> address-space hack to support this request.  We'd just need a very simple
>>>> audit of the places that check the "are dereferences of the zero address
>>>> undefined behavior" bit to make sure that they honor it even in address
>>>> space 0.  But instead that audit will be confused by a thousand places
>>>> that just bail out for non-zero address spaces without further
>>>> explanation.
>>>
>>> I agree.  The pattern of bailing out if AddrSpace != 0 is unfortunate.
>>>
>>> We also need to cap the amount of extra semantics that can be put on
>>> address spaces.  For instance, we should probably never support trapping
>>> semantics on loads/stores, even via address spaces.
>>
>> I would say instead that address spaces are not the right way to support
>> trapping semantics on loads/stores.
>
> Hi John,
>
> I might be misunderstanding the thread here, but are there architectures
> other than Intel that support alternative address spaces?

Yes, they're also used by GPU targets IIUC.

> I’m asking because x86_64 dropped support for having the code, data,
> stack, and “extra” segments be different from each other; and the only two
> remaining segment registers, “FS” and “GS”, are only used in practice to
> alias the current address space. In fact, *user-space* instructions were
> later added to read/write the FS/GS segment bases, thus embracing the fact
> that these segment registers are used in practice to alias the current
> address space.[1]
>
> I don’t think LLVM needs to model FS/GS as anything other than aliases
> into the existing address space.
>
> Dave
>
> [1] – Note, these new user-space instructions require permission from the
> kernel to execute, and popular kernels haven’t enabled them. Last I knew,
> the Linux folks seem receptive to the idea of enabling these instructions,
> but the conversation keeps stalling on implementation details.

John McCall via llvm-dev

2018-Apr-30 20:37 UTC

> On Apr 30, 2018, at 4:26 PM, David Zarzycki <dave at znu.io> wrote:
>> On Apr 30, 2018, at 15:05, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>>> On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote:
>>>> The LLVM address space design has pushed well beyond the sensible
>>>> boundaries of less-is-more and really needs some concerted effort to
>>>> actually define the expected properties of different address spaces
>>>> instead of a dozen different engineers applying a "don't do this
>>>> optimization if the pointer is in a non-zero address space" rule to the
>>>> optimizer with a shotgun.
>>>>
>>>> In fact, if we'd already done that, we wouldn't need any sort of
>>>> address-space hack to support this request.  We'd just need a very simple
>>>> audit of the places that check the "are dereferences of the zero address
>>>> undefined behavior" bit to make sure that they honor it even in address
>>>> space 0.  But instead that audit will be confused by a thousand places
>>>> that just bail out for non-zero address spaces without further
>>>> explanation.
>>>
>>> I agree.  The pattern of bailing out if AddrSpace != 0 is unfortunate.
>>>
>>> We also need to cap the amount of extra semantics that can be put on
>>> address spaces.  For instance, we should probably never support trapping
>>> semantics on loads/stores, even via address spaces.
>>
>> I would say instead that address spaces are not the right way to support
>> trapping semantics on loads/stores.
>
> Hi John,
>
> I might be misunderstanding the thread here, but are there architectures
> other than Intel that support alternative address spaces?

Yes.  They're commonplace in GPUs and also used in some distributed system
architectures.  Also, any x32-like ABI can support a short/long pointer
distinction.

John.

> I’m asking because x86_64 dropped support for having the code, data, stack,
> and “extra” segments be different from each other; and the only two remaining
> segment registers, “FS” and “GS”, are only used in practice to alias the
> current address space. In fact, *user-space* instructions were later added to
> read/write the FS/GS segment bases, thus embracing the fact that these
> segment registers are used in practice to alias the current address space.[1]
>
> I don’t think LLVM needs to model FS/GS as anything other than aliases into
> the existing address space.
>
> Dave
>
> [1] – Note, these new user-space instructions require permission from the
> kernel to execute, and popular kernels haven’t enabled them. Last I knew, the
> Linux folks seem receptive to the idea of enabling these instructions, but
> the conversation keeps stalling on implementation details.

Joerg Sonnenberger via llvm-dev

2018-Apr-30 22:25 UTC

On Mon, Apr 30, 2018 at 04:26:47PM -0400, David Zarzycki via llvm-dev wrote:
> I might be misunderstanding the thread here, but are there architectures
> other than Intel that support alternative address spaces?

Yes, efficient support for separate kernel and user VA used to be quite
common and effectively means two separate address spaces.

Joerg

David Chisnall via llvm-dev

2018-May-01 09:14 UTC

On 30 Apr 2018, at 21:26, David Zarzycki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I might be misunderstanding the thread here, but are there architectures
> other than Intel that support alternative address spaces? I’m asking because
> x86_64 dropped support for having the code, data, stack, and “extra” segments
> be different from each other; and the only two remaining segment registers,
> “FS” and “GS”, are only used in practice to alias the current address space.
> In fact, *user-space* instructions were later added to read/write the FS/GS
> segment bases, thus embracing the fact that these segment registers are used
> in practice to alias the current address space.[1]

I’m not 100% sure if you’re asking whether processors support different address
spaces, or whether you’re asking whether targets use LLVM’s notion of an address
space, so I’ll try to answer both.

To the first interpretation of your question:

Others have pointed out that GPUs have different memory regions (shared mutable,
shared immutable, local, and so on).  Any processor with an MMU supports some
notion of address spaces, the simplest of which involves multiple completely
distinct address spaces.  This is somewhat complicated by shared memory.  In the
C abstract machine, there is no difference between pointers to shared and
unshared memory, which is unfortunate as the safety of storing pointers in such
regions can vary.  In OpenCL, the host can map regions into which it is safe to
store pointers that are valid on both the host and device, which a more sane
language than C would regard as a separate address space.

In terms of out-of-tree architectures, a large number of embedded processors
have different regions for, for example, stack, code ROM, data ROM, and heap.
Some have different overlapping shared regions.  The architecture that I’ve
worked on for the last 6 years, CHERI, provides a flexible notion of address
spaces allowing a model like segmentation at the coarse granularity for
sandboxing legacy code (with 64-bit integers as pointers), or fine-grained
memory safety by representing every pointer as a 128-bit hardware-enforced type
that encodes bounds and permissions.

To the second interpretation of your question:

GPUs use different address spaces for their different memory types, as do
out-of-tree embedded targets.  Azul and the (apparently now dead) CLR back end
using LLVM used AS1 to indicate that a pointer was to GC’d memory.  We use AS200
to indicate a 128-bit fat pointer and AS0 to indicate a 64-bit pointer (which is
implicitly relative to a default 128-bit pointer in a special register).

It’s worth noting that LLVM’s notion of an address space is a property of the
pointer, whereas embedded C regards it as a property of the underlying memory.
This means that it is always syntactically valid to cast between address spaces
in LLVM IR, though the result may be a non-dereferenceable pointer.  This is
somewhat problematic for optimisers, because this information is not well
expressed.  For us, for example, casting from AS0 to AS200 always results in a
pointer that is valid if the original is valid, but casting from AS200 to AS0
may yield null.  We’ve had to do a lot of cleanup on optimisers to prevent them
from generating broken code as a result.

The current model of an AS conflates two notions: a different region of memory
(potentially with different properties) and a different kind of pointer
(potentially with different properties).  It would be nice to decouple these and
provide a mechanism similar to function attributes that would allow properties
on pointers to be expressed, in an orthogonal manner to address spaces.  This
would require moving some information (such as pointer size) into the
attributes, but would probably be a long-term cleaner approach.

This would probably be easier after the typeless pointer work is completed, so
that pointers are all of type PTR but with attributes indicating their other
properties.  The AMD GPU, for example, could benefit from having an attribute
indicating that -1, rather than 0, is the ‘invalid pointer’ value for some
kinds.  Other useful information would include aliasing scopes (which could be
updated on inlining), values that are guaranteed not to be dereferenced, whether
out-of-bounds values are representable, and so on.

David
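
As a rough illustration of the "attributes on pointers, orthogonal to address spaces" idea in the message above, the following C++ sketch models a pointer kind as a small record carrying its width and its "invalid pointer" bit pattern, so that a validity check no longer hard-codes 0. All names here (`PointerAttrs`, `widthMask`, `isInvalidPointer`, the two example kinds) and the chosen widths are hypothetical, picked only for the example; they do not describe any real target or LLVM interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical attributes attached to a kind of pointer, independent of
// which region of memory it points into (illustration only).
struct PointerAttrs {
  unsigned sizeInBits;        // pointer width, moved out of the address space
  std::uint64_t invalidValue; // bit pattern meaning "points at nothing"
};

// Two made-up pointer kinds: one where 0 is the invalid value, and a
// narrower one where all-ones (i.e. -1) is the invalid value.
constexpr PointerAttrs kZeroIsInvalid{64, 0};
constexpr PointerAttrs kAllOnesIsInvalid{32, 0xFFFFFFFFu};

// Mask selecting the low `bits` bits; avoids shifting by the full width.
constexpr std::uint64_t widthMask(unsigned bits) {
  return bits >= 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << bits) - 1);
}

// True if `raw`, truncated to the pointer's width, is that kind's invalid
// pointer value.
constexpr bool isInvalidPointer(std::uint64_t raw, PointerAttrs attrs) {
  return (raw & widthMask(attrs.sizeInBits)) ==
         (attrs.invalidValue & widthMask(attrs.sizeInBits));
}
```

With these definitions, `isInvalidPointer(0, kZeroIsInvalid)` holds, while `isInvalidPointer(0, kAllOnesIsInvalid)` does not: for the second kind, address 0 is an ordinary pointer value and only the all-ones pattern is treated as invalid.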
