John McCall via llvm-dev
2018-Apr-30 18:14 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
> On Apr 28, 2018, at 4:12 PM, Sanjoy Das via cfe-dev <cfe-dev at lists.llvm.org> wrote: > On Thu, Apr 19, 2018 at 3:56 PM, Manoj Gupta via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> My understanding is we only want to disable the optimizations regarding >> undefined behavior >> related to null pointer deference optimizations. And address space checks >> will end up >> disabling more optimizations than needed. > > [Repeating what others have already mentioned on this thread] > > I this it is better to avoid framing this as "disable optimizations that exploit > UB" and instead frame this as "define some behavior that was undefined before". > >> I did look at some of the optimizations/transforms and there are some that >> we definitely want to keep. >> >> Just a quick example from grepping: >> lib/Transforms/Scalar/LoopIdiomRecognize.cpp >> ........... >> // Don't create memset_pattern16s with address spaces. >> StorePtr->getType()->getPointerAddressSpace() == 0 && >> (PatternValue = getMemSetPatternValue(StoredVal, DL))) { >> // It looks like we can use PatternValue! >> return LegalStoreKind::MemsetPattern; >> } >> >> Even worse, Sanitizers do NOT work with address spaces which is a big deal >> breaker IMO. > > IMO fixing these seems less engineering overhead in the long term than > introducing > yet another knob to the IR. > > More importantly, we should _not_ be doing these optimizations without auditing > them individually. For instance with -fno-delete-null-pointer checks, it isn't > okay to change a memset loop to a call to llvm.memset unless we've ensured llvm > DTRT for llvm.memset and null checks (i.e. the null check in "llvm.memset(ptr, > ...); if (!ptr) {}" does not get optimized away). > >> Since address spaces and null pointers are really orthogonal issues, I would >> prefer to >> not conflate them. > > I'm not sure I agree with this. Address spaces are a mechanism to provide > additional semantics to pointers. In this case the additional property we > want is "address 0 may be dereferenceable", and using address spaces seems > appropriate. > >> In addition, It is already not easy to convince Linux Kernel maintainers to >> accept clang specific patches. > > Maybe I'm missing something, but I thought in this address space scheme we'd > still provide a -fno-delete-null-ptr-checks flag -- it is clang that would mark > pointers with address space n in this mode. > >> From Linux kernel maintainers POV, Clang built kernels are already "getting >> lucky". So I am not too >> worried about missing a few cases. >> I'll be glad to fix the missing cases whenever reported. > > It seems to me that adding back missing (but correct) optimizations when > reported is better than removing existing (but incorrect) optimizations when > reported. If I were a kernel developer (which I am not) I'd rather have a > kernel that boots slower than a security vulnerability. > >> I also have some other concerns with address spaces e.g. how to pick a safe >> address space not used by >> any target e.g. many targets (in and out-of-tree) actively use non-zero >> address spaces. >> User code can also specify any address space via >> __attribute__((address_space(N))) so mixing object >> files will be tricky in such cases. > > I think for that we can start a thread on cfe-dev and llvm-dev about reserving > address spaces. While at it, we should reserve a range of address spaces for > LLVM middle-end use so that we can do more things like > -fno-delete-null-pointer-checks without worrying about address space clashes.The LLVM address space design has pushed well beyond the sensible boundaries of less-is-more and really needs some concerted effort to actually define the expected properties of different address spaces instead of a dozen different engineers applying a "don't do this optimization if the pointer is in a non-zero address space" rule to the optimizer with a shotgun. In fact, if we'd already done that, we wouldn't need any sort of address-space hack to support this request. We'd just need a very simple audit of the places that check the "are dereferences of the zero address undefined behavior" bit to make sure that they honor it even in address space 0. But instead that audit will be confused by a thousand places that just bail out for non-zero address spaces without further explanation. Such a design would hopefully include reserving ranges of address spaces for different purposes. Clang is already perfectly capable of remapping address space numbers from the user-facing address_space attribute when generating IR. John.
Sanjoy Das via llvm-dev
2018-Apr-30 18:58 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote:> The LLVM address space design has pushed well beyond the sensible boundaries > of less-is-more and really needs some concerted effort to actually define the expected > properties of different address spaces instead of a dozen different engineers applying > a "don't do this optimization if the pointer is in a non-zero address space" rule to the > optimizer with a shotgun. > > In fact, if we'd already done that, we wouldn't need any sort of address-space hack > to support this request. We'd just need a very simple audit of the places that check > the "are dereferences of the zero address undefined behavior" bit to make sure that > they honor it even in address space 0. But instead that audit will be confused by a > thousand places that just bail out for non-zero address spaces without further > explanation.I agree. The pattern of bailing out if AddrSpace != 0 is unfortunate. We also need to cap the amount of extra semantics that can be put on address spaces. For instance, we should probably never support trapping semantics on loads/stores, even via address spaces. -- Sanjoy
John McCall via llvm-dev
2018-Apr-30 19:05 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
> On Apr 30, 2018, at 2:58 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > On Mon, Apr 30, 2018 at 11:14 AM, John McCall <rjmccall at apple.com> wrote: >> The LLVM address space design has pushed well beyond the sensible boundaries >> of less-is-more and really needs some concerted effort to actually define the expected >> properties of different address spaces instead of a dozen different engineers applying >> a "don't do this optimization if the pointer is in a non-zero address space" rule to the >> optimizer with a shotgun. >> >> In fact, if we'd already done that, we wouldn't need any sort of address-space hack >> to support this request. We'd just need a very simple audit of the places that check >> the "are dereferences of the zero address undefined behavior" bit to make sure that >> they honor it even in address space 0. But instead that audit will be confused by a >> thousand places that just bail out for non-zero address spaces without further >> explanation. > > I agree. The pattern of bailing out if AddrSpace != 0 is unfortunate. > > We also need to cap the amount of extra semantics that can be put on address > spaces. For instance, we should probably never support trapping semantics on > loads/stores, even via address spaces.I would say instead that address spaces are not the right way to support trapping semantics on loads/stores. John.
Richard Smith via llvm-dev
2018-Apr-30 20:24 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
On 30 April 2018 at 11:14, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > On Apr 28, 2018, at 4:12 PM, Sanjoy Das via cfe-dev < > cfe-dev at lists.llvm.org> wrote: > > On Thu, Apr 19, 2018 at 3:56 PM, Manoj Gupta via llvm-dev > > <llvm-dev at lists.llvm.org> wrote: > >> My understanding is we only want to disable the optimizations regarding > >> undefined behavior > >> related to null pointer deference optimizations. And address space > checks > >> will end up > >> disabling more optimizations than needed. > > > > [Repeating what others have already mentioned on this thread] > > > > I this it is better to avoid framing this as "disable optimizations that > exploit > > UB" and instead frame this as "define some behavior that was undefined > before". > > > >> I did look at some of the optimizations/transforms and there are some > that > >> we definitely want to keep. > >> > >> Just a quick example from grepping: > >> lib/Transforms/Scalar/LoopIdiomRecognize.cpp > >> ........... > >> // Don't create memset_pattern16s with address spaces. > >> StorePtr->getType()->getPointerAddressSpace() == 0 && > >> (PatternValue = getMemSetPatternValue(StoredVal, DL))) { > >> // It looks like we can use PatternValue! > >> return LegalStoreKind::MemsetPattern; > >> } > >> > >> Even worse, Sanitizers do NOT work with address spaces which is a big > deal > >> breaker IMO. > > > > IMO fixing these seems less engineering overhead in the long term than > > introducing > > yet another knob to the IR. > > > > More importantly, we should _not_ be doing these optimizations without > auditing > > them individually. For instance with -fno-delete-null-pointer checks, > it isn't > > okay to change a memset loop to a call to llvm.memset unless we've > ensured llvm > > DTRT for llvm.memset and null checks (i.e. the null check in > "llvm.memset(ptr, > > ...); if (!ptr) {}" does not get optimized away). > > > >> Since address spaces and null pointers are really orthogonal issues, I > would > >> prefer to > >> not conflate them. > > > > I'm not sure I agree with this. Address spaces are a mechanism to > provide > > additional semantics to pointers. In this case the additional property > we > > want is "address 0 may be dereferenceable", and using address spaces > seems > > appropriate. > > > >> In addition, It is already not easy to convince Linux Kernel > maintainers to > >> accept clang specific patches. > > > > Maybe I'm missing something, but I thought in this address space scheme > we'd > > still provide a -fno-delete-null-ptr-checks flag -- it is clang that > would mark > > pointers with address space n in this mode. > > > >> From Linux kernel maintainers POV, Clang built kernels are already > "getting > >> lucky". So I am not too > >> worried about missing a few cases. > >> I'll be glad to fix the missing cases whenever reported. > > > > It seems to me that adding back missing (but correct) optimizations when > > reported is better than removing existing (but incorrect) optimizations > when > > reported. If I were a kernel developer (which I am not) I'd rather have > a > > kernel that boots slower than a security vulnerability. > > > >> I also have some other concerns with address spaces e.g. how to pick a > safe > >> address space not used by > >> any target e.g. many targets (in and out-of-tree) actively use non-zero > >> address spaces. > >> User code can also specify any address space via > >> __attribute__((address_space(N))) so mixing object > >> files will be tricky in such cases. > > > > I think for that we can start a thread on cfe-dev and llvm-dev about > reserving > > address spaces. While at it, we should reserve a range of address > spaces for > > LLVM middle-end use so that we can do more things like > > -fno-delete-null-pointer-checks without worrying about address space > clashes. > > The LLVM address space design has pushed well beyond the sensible > boundaries > of less-is-more and really needs some concerted effort to actually define > the expected > properties of different address spaces instead of a dozen different > engineers applying > a "don't do this optimization if the pointer is in a non-zero address > space" rule to the > optimizer with a shotgun. > > In fact, if we'd already done that, we wouldn't need any sort of > address-space hack > to support this request. We'd just need a very simple audit of the places > that check > the "are dereferences of the zero address undefined behavior" bit to make > sure that > they honor it even in address space 0. But instead that audit will be > confused by a > thousand places that just bail out for non-zero address spaces without > further > explanation. > > Such a design would hopefully include reserving ranges of address spaces > for different > purposes. Clang is already perfectly capable of remapping address space > numbers > from the user-facing address_space attribute when generating IR.>From the peanut gallery: has any thought been given to moving away fromhardcoded address space numbers entirely? Abstractly it seems like an extra layer of indirection here could help (eg, we could allow IR to define address spaces that can be part of pointer types, and those could nominate which target-specific pointer representation and interpretation they use -- perhaps by name instead of by number -- as well as other properties such as whether zero is a valid address). -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180430/82d13813/attachment.html>
John McCall via llvm-dev
2018-Apr-30 20:39 UTC
[llvm-dev] [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
> On Apr 30, 2018, at 4:24 PM, Richard Smith <richard at metafoo.co.uk> wrote: > On 30 April 2018 at 11:14, John McCall via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > On Apr 28, 2018, at 4:12 PM, Sanjoy Das via cfe-dev <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote: > > On Thu, Apr 19, 2018 at 3:56 PM, Manoj Gupta via llvm-dev > > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > >> My understanding is we only want to disable the optimizations regarding > >> undefined behavior > >> related to null pointer deference optimizations. And address space checks > >> will end up > >> disabling more optimizations than needed. > > > > [Repeating what others have already mentioned on this thread] > > > > I this it is better to avoid framing this as "disable optimizations that exploit > > UB" and instead frame this as "define some behavior that was undefined before". > > > >> I did look at some of the optimizations/transforms and there are some that > >> we definitely want to keep. > >> > >> Just a quick example from grepping: > >> lib/Transforms/Scalar/LoopIdiomRecognize.cpp > >> ........... > >> // Don't create memset_pattern16s with address spaces. > >> StorePtr->getType()->getPointerAddressSpace() == 0 && > >> (PatternValue = getMemSetPatternValue(StoredVal, DL))) { > >> // It looks like we can use PatternValue! > >> return LegalStoreKind::MemsetPattern; > >> } > >> > >> Even worse, Sanitizers do NOT work with address spaces which is a big deal > >> breaker IMO. > > > > IMO fixing these seems less engineering overhead in the long term than > > introducing > > yet another knob to the IR. > > > > More importantly, we should _not_ be doing these optimizations without auditing > > them individually. For instance with -fno-delete-null-pointer checks, it isn't > > okay to change a memset loop to a call to llvm.memset unless we've ensured llvm > > DTRT for llvm.memset and null checks (i.e. the null check in "llvm.memset(ptr, > > ...); if (!ptr) {}" does not get optimized away). > > > >> Since address spaces and null pointers are really orthogonal issues, I would > >> prefer to > >> not conflate them. > > > > I'm not sure I agree with this. Address spaces are a mechanism to provide > > additional semantics to pointers. In this case the additional property we > > want is "address 0 may be dereferenceable", and using address spaces seems > > appropriate. > > > >> In addition, It is already not easy to convince Linux Kernel maintainers to > >> accept clang specific patches. > > > > Maybe I'm missing something, but I thought in this address space scheme we'd > > still provide a -fno-delete-null-ptr-checks flag -- it is clang that would mark > > pointers with address space n in this mode. > > > >> From Linux kernel maintainers POV, Clang built kernels are already "getting > >> lucky". So I am not too > >> worried about missing a few cases. > >> I'll be glad to fix the missing cases whenever reported. > > > > It seems to me that adding back missing (but correct) optimizations when > > reported is better than removing existing (but incorrect) optimizations when > > reported. If I were a kernel developer (which I am not) I'd rather have a > > kernel that boots slower than a security vulnerability. > > > >> I also have some other concerns with address spaces e.g. how to pick a safe > >> address space not used by > >> any target e.g. many targets (in and out-of-tree) actively use non-zero > >> address spaces. > >> User code can also specify any address space via > >> __attribute__((address_space(N))) so mixing object > >> files will be tricky in such cases. > > > > I think for that we can start a thread on cfe-dev and llvm-dev about reserving > > address spaces. While at it, we should reserve a range of address spaces for > > LLVM middle-end use so that we can do more things like > > -fno-delete-null-pointer-checks without worrying about address space clashes. > > The LLVM address space design has pushed well beyond the sensible boundaries > of less-is-more and really needs some concerted effort to actually define the expected > properties of different address spaces instead of a dozen different engineers applying > a "don't do this optimization if the pointer is in a non-zero address space" rule to the > optimizer with a shotgun. > > In fact, if we'd already done that, we wouldn't need any sort of address-space hack > to support this request. We'd just need a very simple audit of the places that check > the "are dereferences of the zero address undefined behavior" bit to make sure that > they honor it even in address space 0. But instead that audit will be confused by a > thousand places that just bail out for non-zero address spaces without further > explanation. > > Such a design would hopefully include reserving ranges of address spaces for different > purposes. Clang is already perfectly capable of remapping address space numbers > from the user-facing address_space attribute when generating IR. > > From the peanut gallery: has any thought been given to moving away from hardcoded address space numbers entirely? Abstractly it seems like an extra layer of indirection here could help (eg, we could allow IR to define address spaces that can be part of pointer types, and those could nominate which target-specific pointer representation and interpretation they use -- perhaps by name instead of by number -- as well as other properties such as whether zero is a valid address).I think that would be great design, yeah. Even if we stuck with identifying them by numbers, I would hope that we would allow them to be described at the top-level this way, and that backward compatibility would be handled by just filling in maximally conservative information. John. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180430/d2c1f130/attachment.html>
Reasonably Related Threads
- [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
- [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
- [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
- [cfe-dev] RFC: Implementing -fno-delete-null-pointer-checks in clang
- [LLVMdev] n-bit bytes for clang/llvm