thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM [Jun 2021]


Chris Lattner via llvm-dev

2021-Jun-06 04:26 UTC

[llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM

On Jun 4, 2021, at 11:25 AM, John McCall via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> On 4 Jun 2021, at 11:24, George Mitenkov wrote:
>> Hi all,
>>
>> Together with Nuno Lopes and Juneyoung Lee we propose to add a new byte
>> type to LLVM to fix miscompilations due to load type punning. Please see
>> the proposal below. It would be great to hear the
>> feedback/comments/suggestions!
>>
>>
>> Motivation
>> ==========
>>
>> char and unsigned char are considered to be universal holders in C. They
>> can access raw memory and are used to implement memcpy. i8 is the LLVM’s
>> counterpart but it does not have such semantics, which is also not
>> desirable as it would disable many optimizations.
>
> I don’t believe this is correct. LLVM does not have an innate
> concept of typed memory. The type of a global or local allocation
> is just a roundabout way of giving it a size and default alignment,
> and similarly the type of a load or store just determines the width
> and default alignment of the access. There are no restrictions on
> what types can be used to load or store from certain objects.
>
> C-style type aliasing restrictions are imposed using tbaa
> metadata, which are unrelated to the IR type of the access.

I completely agree with John.  “i8” in LLVM doesn’t carry any implications about
aliasing (in fact, LLVM pointers are going towards being typeless).  Any such
thing occurs at the accesses, and is part of TBAA.

I’m opposed to adding a byte type to LLVM, as such semantics-carrying types are
entirely unprecedented, and would add tremendous complexity to the entire
system.

-Chris
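
For context on the Motivation quoted above, a byte-wise copy written in C looks roughly like the sketch below. The name copy_bytes is hypothetical and this is not a libc implementation; Clang would typically lower each unsigned char access to an i8 load or store, which is the access pattern under discussion when the copied bytes belong to a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies n bytes from src to dst one byte at a time.
 * The regions are assumed not to overlap. */
static void *copy_bytes(void *dst, const void *src, size_t n) {
  assert(dst != NULL && src != NULL);
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; ++i)
    d[i] = s[i]; /* one-byte load followed by a one-byte store */
  return dst;
}
```

Copying the object representation of an int * with copy_bytes(&dst, &src, sizeof src) leaves dst pointing at the same int as src, even though every individual access had a character type.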


James Courtier-Dutton via llvm-dev

2021-Jun-06 07:54 UTC


Hi,

I would also oppose adding a byte type, but mainly because the bug
report mentioned (https://bugs.llvm.org/show_bug.cgi?id=37469) is not
a bug at all.
The example in the bug report is just badly written C code.
Specifically:

#include <stdint.h>
#include <stdio.h>

/* store_10_to_p() is defined in the bug report and is not reproduced here. */
int main() {
  int A[4], B[4];
  printf("%p %p\n", (void *)A, (void *)&B[4]);
  if ((uintptr_t)A == (uintptr_t)&B[4]) {
    store_10_to_p(A, &B[4]);
    printf("%d\n", A[0]);
  }
  return 0;
}

"int B[4];" allows indices 0 to 3 only, and referring to index 4 in
&B[4] is undef, so in my view it is correctly optimised out, which is
why it disappears at -O3.

Kind Regards

James



Hal Finkel via llvm-dev

2021-Jun-06 15:52 UTC


On 6/6/21 00:26, Chris Lattner via cfe-dev wrote:
> I completely agree with John.  “i8” in LLVM doesn’t carry any
> implications about aliasing (in fact, LLVM pointers are going towards
> being typeless).  Any such thing occurs at the accesses, and is part
> of TBAA.
>
> I’m opposed to adding a byte type to LLVM, as such semantics-carrying
> types are entirely unprecedented, and would add tremendous complexity
> to the entire system.
>
> -Chris

I'll take this opportunity to point out that, at least historically, the
reason a desire to optimize around ptrtoint keeps resurfacing is that:

  1. Common optimizations introduce ptrtoint casts into code that did not
     otherwise have them (SROA, for example; see convertValue in SROA.cpp).

  2. They're generated by some of the ABI code for argument passing (see
     clang/lib/CodeGen/TargetInfo.cpp).

  3. They're present in certain performance-sensitive code idioms (see,
     for example, ADT/PointerIntPair.h).

It seems to me that, if there's design work to do in this area, one
should consider addressing these now-long-standing issues where we
introduce ptrtoint by replacing this mechanism with some other one.

  -Hal
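
As an illustration of the third item, the following is a minimal C sketch of the pointer-tagging idiom. It is not LLVM's PointerIntPair; the names pack_ptr_tag, unpack_ptr, unpack_tag and TAG_MASK are hypothetical, and the code assumes int is at least 4-byte aligned and that the pointer-to-integer conversion preserves the low address bits, as it does on common targets. Clang would typically lower the two casts to ptrtoint and inttoptr.

```c
#include <assert.h>
#include <stdint.h>

/* Two low bits are assumed free because int is at least 4-byte aligned. */
_Static_assert(_Alignof(int) >= 4, "sketch assumes 4-byte aligned int");
#define TAG_MASK ((uintptr_t)0x3)

/* Pack a small tag into the unused low bits of an aligned pointer. */
static uintptr_t pack_ptr_tag(int *p, unsigned tag) {
  uintptr_t bits = (uintptr_t)p;  /* pointer-to-integer cast */
  assert((bits & TAG_MASK) == 0); /* low bits must be clear */
  assert(tag <= TAG_MASK);        /* tag must fit in two bits */
  return bits | tag;
}

/* Recover the original pointer by masking the tag bits off. */
static int *unpack_ptr(uintptr_t packed) {
  return (int *)(packed & ~TAG_MASK); /* integer-to-pointer cast */
}

/* Recover the tag stored in the low bits. */
static unsigned unpack_tag(uintptr_t packed) {
  return (unsigned)(packed & TAG_MASK);
}
```

Every access through the packed value goes via an integer, so the pointer's provenance is visible to the optimizer only through these casts.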

Philip Reames via llvm-dev

2021-Jun-09 19:06 UTC


On 6/5/21 9:26 PM, Chris Lattner via llvm-dev wrote:
> I completely agree with John.  “i8” in LLVM doesn’t carry any
> implications about aliasing (in fact, LLVM pointers are going towards
> being typeless).  Any such thing occurs at the accesses, and is part
> of TBAA.
>
> I’m opposed to adding a byte type to LLVM, as such semantics-carrying
> types are entirely unprecedented, and would add tremendous complexity
> to the entire system.

I agree with both John and Chris here.

I've read through the discussion in this thread, and have yet to be
convinced there is a problem, much less that this is a good solution.
I'm open to being convinced of those two things, but the writeup in this
thread doesn't do it.  There are snippets of examples downthread which
might be convincing, but there are objections raised around language
semantics which I find very hard to follow.  The fragmentation of the
thread really doesn't help.

I would suggest the OP take some of the motivating examples, write up a
web-page with the examples and their interpretation, then revisit the
topic.  In particular, I strongly suggest anticipating incorrect
interpretations/objections and explicitly addressing them.

I'll also note that the use of the term "capture" w.r.t. a *load*
downthread makes absolutely no sense to me.  Stores capture, not loads.

Philip




