Nuno Lopes via llvm-dev
2021-Jun-06 18:31 UTC
[llvm-dev] [RFC] Introducing a byte type to LLVM
Hi James,

Your comment is incorrect because:

  int x[n];
  &x[n]; // legal expression in C

This is necessary to allow idioms where you iterate over an array using
pointer arithmetic, like:

  for (int *p = x; p != x + n; ++p) { ... }

So, AFAICT, the bug report you mention is perfectly valid.

Plus, LLVM IR's pointer arithmetic may go out-of-bounds (getelementptr
without inbounds). It's important not to mix C and LLVM IR semantics.
They have different goals.

Nuno

-----Original Message-----
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of James Courtier-Dutton via llvm-dev
Sent: 06 June 2021 10:02
To: Chris Lattner <clattner at nondot.org>
Cc: llvm-dev <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org Developers <cfe-dev at lists.llvm.org>; Ralf Jung <jung at mpi-sws.org>
Subject: Re: [llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM

Also, the comment below is wrong. At this point, arr3 is equivalent to
arr2, which is q.

  // Now arr3 is equivalent to arr1, which is p.
  int *r;
  memcpy(&r, (unsigned char *)arr3, sizeof(r));
  // Now r is p.
  *p = 1;
  *r = 10;

On Sun, 6 Jun 2021 at 08:54, James Courtier-Dutton <james.dutton at gmail.com> wrote:
>
> Hi,
>
> I would also oppose adding a byte type, but mainly because the bug
> report mentioned (https://bugs.llvm.org/show_bug.cgi?id=37469) is not
> a bug at all.
> The example in the bug report is just badly written C code.
> Specifically:
>
> int main() {
>   int A[4], B[4];
>   printf("%p %p\n", A, &B[4]);
>   if ((uintptr_t)A == (uintptr_t)&B[4]) {
>     store_10_to_p(A, &B[4]);
>     printf("%d\n", A[0]);
>   }
>   return 0;
> }
>
> "int B[4];" allows values between 0 and 3 only, and referring to 4 in
> &B[4] is undef, so in my view, it is correctly optimised out which is
> why it disappears in -O3.
>
> Kind Regards
>
> James
>
>
> On Sun, 6 Jun 2021 at 05:26, Chris Lattner via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
> >
> > On Jun 4, 2021, at 11:25 AM, John McCall via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> >
> > On 4 Jun 2021, at 11:24, George Mitenkov wrote:
> >
> > Hi all,
> >
> > Together with Nuno Lopes and Juneyoung Lee we propose to add a new
> > byte type to LLVM to fix miscompilations due to load type punning.
> > Please see the proposal below. It would be great to hear the
> > feedback/comments/suggestions!
> >
> >
> > Motivation
> > ==========
> >
> > char and unsigned char are considered to be universal holders in C.
> > They can access raw memory and are used to implement memcpy. i8 is
> > the LLVM’s counterpart but it does not have such semantics, which is
> > also not desirable as it would disable many optimizations.
> >
> > I don’t believe this is correct. LLVM does not have an innate
> > concept of typed memory. The type of a global or local allocation is
> > just a roundabout way of giving it a size and default alignment, and
> > similarly the type of a load or store just determines the width and
> > default alignment of the access. There are no restrictions on what
> > types can be used to load or store from certain objects.
> >
> > C-style type aliasing restrictions are imposed using tbaa metadata,
> > which are unrelated to the IR type of the access.
> >
> > I completely agree with John. “i8” in LLVM doesn’t carry any
> > implications about aliasing (in fact, LLVM pointers are going towards
> > being typeless). Any such thing occurs at the accesses, and are part
> > of TBAA.
> >
> > I’m opposed to adding a byte type to LLVM, as such semantic carrying
> > types are entirely unprecedented, and would add tremendous complexity
> > to the entire system.
> >
> > -Chris
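
For readers who want to try the C side of this argument, here is a minimal illustrative sketch of the two idioms discussed in the thread: iterating up to the one-past-the-end pointer `x + n`, and copying a pointer's bytes through an `unsigned char` buffer with `memcpy`. The function names `sum_ints` and `roundtrip_pointer` are hypothetical and chosen only for this example. It shows C-level behaviour only and makes no claim about how LLVM IR should model these operations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sum n ints by walking a pointer from x up to the one-past-the-end
   address x + n. Forming x + n and comparing against it is allowed
   in C; the loop never dereferences it. (Hypothetical helper.) */
static int sum_ints(const int *x, size_t n) {
    assert(x != NULL || n == 0);
    int total = 0;
    for (const int *p = x; p != x + n; ++p) {
        total += *p;
    }
    return total;
}

/* Copy the bytes of a pointer into an unsigned char buffer and back,
   similar in spirit to the memcpy snippet quoted in the thread.
   (Hypothetical helper.) */
static int *roundtrip_pointer(int *p) {
    unsigned char bytes[sizeof p];
    int *r;
    memcpy(bytes, &p, sizeof p);  /* pointer -> raw bytes */
    memcpy(&r, bytes, sizeof r);  /* raw bytes -> pointer */
    return r;
}
```

With `int a[4] = {1, 2, 3, 4};`, calling `sum_ints(a, 4)` uses `a + 4` (the same address as `&a[4]`) only as the loop bound and returns 10, and a store through `roundtrip_pointer(&a[0])` writes to `a[0]`.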