thr3ads.net - similar to: "[LLVMdev] Missed optimization on array initialization"

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Missed optimization on array initialization"

[LLVMdev] Missed optimization on array initialization

2012 Feb 25

On Feb 25, 2012, at 3:17 AM, Carlo Alberto Ferraris wrote:
> Prompted by a SO post (http://stackoverflow.com/questions/9441882/compiler-instruction-reordering-optimizations-in-c-and-what-inhibits-them/9442363) I checked and found that LLVM yields the same (seemingly) suboptimal code as MSVC.
> Consider the following, simplified, C snippet:
> extern void bar(int*);
>
> void

[LLVMdev] Missed optimization on array initialization

2012 Feb 25

On Feb 25, 2012, at 10:32 AM, Chris Lattner <clattner at apple.com> wrote:
>
> On Feb 25, 2012, at 3:17 AM, Carlo Alberto Ferraris wrote:
>
>> Prompted by a SO post (http://stackoverflow.com/questions/9441882/compiler-instruction-reordering-optimizations-in-c-and-what-inhibits-them/9442363) I checked and found that LLVM yields the same (seemingly) suboptimal code as MSVC.

[LLVMdev] where is F7 opcode for TEST instruction on X86?

2014 Apr 22

Hi, at the moment the TEST instruction is defined with the 0xf7 opcode, as demonstrated below.

$ echo "0xf7 0xc0 0x00 0x00 0x00 0x22"|./Release+Asserts/bin/llvm-mc -disassemble -arch=x86
.section __TEXT,__text,regular,pure_instructions
testl $570425344, %eax ## imm = 0x22000000

However, I cannot find anywhere this F7 opcode is defined in

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 16

The vectorizer will now emit = load <8 x i32>, align #TargetAlignmentOfScalari32 where before it would emit = load <8 x i32> (which has the semantics of “= load <8 x i32>, align 0”, meaning the address is aligned to the target ABI alignment; see http://llvm.org/docs/LangRef.html#load-instruction). When the backend generates code for the former it will emit an unaligned move:

[LLVMdev] opt -O2 leads to incorrect operation (possibly a bug in the DSE)

2014 Oct 17

Hi all,

Consider the following example:

define void @fn(i8* %buf) #0 {
entry:
  %arrayidx = getelementptr i8* %buf, i64 18
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arrayidx, i8* %buf, i64 18, i32 1, i1 false)
  %arrayidx1 = getelementptr i8* %buf, i64 18
  store i8 1, i8* %arrayidx1, align 1
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %buf, i8* %arrayidx, i64 18, i32 1, i1 false)

CHECK-LABLE or CHECK?

2017 Mar 31

Hi All, I came across a FileCheck failure that I don't understand. The example code is below:

void test1() {
  ... code ...
  // CHECK-LABEL: @test1
  // CHECK: void @llvm.memcpy.p0i8.p0i8.i32 - (1)
}
void dummy() {
  // make (1) match
  ... code ...
  // CHECK-LABEL: @dummy
}
void test2() {
  ... code ...
  // CHECK-LABEL: @test2
  //

[LLVMdev] malloc / free & memcpy optimisations.

2013 May 21

The front end I'm building for an existing interpreted language is unfortunately producing output similar to this far too often:

define void @foo(i8* nocapture %dest, i8* nocapture %src, i32 %len) nounwind {
  %1 = tail call noalias i8* @malloc(i32 %len) nounwind
  tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %1, i8* %src, i32 %len, i32 1, i1 false)
  tail call void

lifetime_start/end

2020 Sep 30

Hello, What do the intrinsics "@llvm.lifetime.start/@llvm.lifetime.end" really do? As far as I know, they define the live ranges of variables. In the following code section, they seem redundant. However, when I remove them, the behavior of the code becomes non-deterministic. The live ranges of the variables defined by them are never used in the code. Thanks, --------------- %37 = bitcast

Which pass should be propagating memory copies

2017 May 16

Consider the following IR example:

define void @simple([4 x double] *%ptr, i64 %idx) {
  %stack = alloca [4 x double]
  %ptri8 = bitcast [4 x double] *%ptr to i8*
  %stacki8 = bitcast [4 x double] *%stack to i8*
  call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%stacki8, i8 *%ptri8, i32 32, i32 0, i1 0)
  %dataptr = getelementptr inbounds [4 x double], [4 x double] *%ptr, i32 0, i64 %idx

[LLVMdev] How to get llvm bitcode executed

2012 May 22

Hi All, I have a program that uses C++ STL a lot. To have the source code for STL functions, I undefined "_GLIBCXX_EXTERN_TEMPLATE" in c++config.h. In spite of this, after compilation (via clang) and linking (via llvm-ld), the resulting bitcode contains a few declared functions (with no definitions). My question is: In the scenario where some function definitions are missing in a llvm

new @llvm.memcpy and @llvm.memset API in trunk - how to use alignment?

2018 Mar 22

The new @llvm.memcpy API does not have an alignment parameter. Instead the docs say to use the align <n> attribute. How is this supposed to work with different alignments? For example, I have one memcpy with align 4, align 4, and another with align 1, align 1.

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly align 4, i8* nocapture

is this a bug in an optimization pass?

2018 Nov 23

The frontend code is a pretty simple for loop that counts from i = 0; i != 10; i += 1. It gets optimized into an endless loop.

export fn entry() void {
    var array: [10]Bar = undefined;
    var x = for (array) |elem, i| {
        if (i == 1) break elem;
    } else bar2();
}

Here's the generated IR:

; ModuleID = 'test'
source_filename = "test"
target datalayout =

Change memcpy/memmove/memset to have dest and source alignment attributes

2018 Jan 19

> On Jan 18, 2018, at 7:45 AM, Daniel Neilson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
> This change has been reviewed, and appears to be ready to land (review available here if anyone still wants to chime in: https://reviews.llvm.org/D41675). The process that we’re going to use for landing this will take a few

[LLVMdev] llvm.memset.p0i8.* intrinsics

2012 Jul 26

Hi, are the llvm.memset.p0i8.i32 and llvm.memset.p0i8.i64 intrinsics meant to be used for 32-bit and 64-bit architectures, respectively, depending on the module's target datalayout? Or can I use either of them? If so, why are there two functions? Sorry if it's obvious, but the documentation wasn't that clear. Thanks and ciao, Mario

[LLVMdev] malloc / free & memcpy optimisations.

2013 May 21

> could you allocate the memory on the stack instead (alloca instruction)?

This is mainly for string or binary blob handling; using the stack isn't a great idea for size reasons. I'm experimenting with simple code examples now, and I picked a simple one for this email, but I'm certain things will get much more complicated once I implement more features of the language. On Tue,

[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information

2012 Aug 22

Hello, Currently LLVM expects front-ends to lower struct assignments into either individual scalar loads and stores, or calls to @llvm.memcpy. For structs with lots of fields, it can take a lot of scalar loads and stores, so @llvm.memcpy is used instead. Unfortunately, using @llvm.memcpy does not permit full TBAA information to be preserved. Also, it unnecessarily copies any padding bytes between

Change memcpy/memmove/memset to have dest and source alignment attributes

2018 Jan 02

Good day all, I’ve spent a few days resurrecting the circa-2015 work on removing the explicit alignment argument (4th arg) from the @llvm.memcpy/memmove/memset intrinsics in favour of using the alignment attribute on the pointer args of calls to the intrinsic. This work was first proposed back in August 2015 by Lang Hames: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html (item

Change memcpy/memmove/memset to have dest and source alignment attributes

2018 Jan 19

On Jan 18, 2018, at 10:48 PM, Chris Lattner <clattner at nondot.org> wrote:

On Jan 18, 2018, at 7:45 AM, Daniel Neilson via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Hi all, This change has been reviewed, and appears to be ready to land (review available here if anyone still wants to chime in:

AliasAnalysis does not look though a memcpy

2018 Nov 29

Hi, I'm trying to get AA results for two pointers, but it seems that AA cannot look through a memcpy. For example:

define dso_local spir_func void @fun() {
entry:
  ; Store an address of `var'
  %var = alloca i32, align 4
  store i32 42, i32* %var, align 4
  %var.addr = alloca i32*, align 8
  store i32* %var, i32** %var.addr, align 8
  ; Memcpy

Dead store elimination in the backend for -ftrivial-auto-var-init

2019 Aug 07

There are two problems: 1. Padding after the union and the call to q(): without LTO we can't remove that store. 2. The shortcut I have, which ignores all instructions q(). This assumes that the memset to acpar.match and acpar.matchinfo is also useful, which is not true. I should be able to improve this case. On Thu, Aug 1, 2019 at 11:29 PM Vitaly Buka <vitalybuka at google.com> wrote: > On a first
