similar to: [LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

Displaying 20 results from an estimated 8000 matches similar to: "[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics"

2013 Jan 29
0
[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics
I can't think of a better way to do this, so I think it's ok. I also submitted a complementary patch on llvm-commits clarifying volatile semantics.
-Andy

On Jan 28, 2013, at 8:54 AM, Arnaud A. de Grandmaison <arnaud.allarddegrandmaison at parrot.com> wrote:
> Hi All,
>
> In the language reference manual, the access behavior of the memcpy,
> memmove and memset
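The question in this thread is what the single volatile flag on the llvm.memcpy / llvm.memmove / llvm.memset intrinsics promises about each operand. As a minimal C illustration of the asymmetric case the proposal is about (the device address below is made up, in the spirit of the 416 used in the testcases further down this page):

#include <stdint.h>

struct R {
  uint16_t a;
  uint16_t b;
};

/* Arbitrary memory-mapped input register, purely for illustration. */
volatile const struct R * const in_reg = (volatile const struct R *) 0x200;

/* The source access is volatile, the destination is not. If the front end
   lowers this aggregate copy to a memcpy-style intrinsic, today's single
   volatile bit cannot record which side actually required the volatile
   treatment -- the gap the proposed (Src|Dest)Volatile split is meant to
   close. */
void snapshot(struct R *out) {
  *out = *in_reg;
}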
2013 Jan 18
2
[LLVMdev] Weird volatile propagation ?
Hi All,

Using clang+llvm at head, I noticed a weird behaviour with the following reduced testcase:

$ cat test.c
#include <stdint.h>

struct R {
  uint16_t a;
  uint16_t b;
};

volatile struct R * const addr = (volatile struct R *) 416;

void test(uint16_t a) {
  struct R r = { a, 1 };
  *addr = r;
}

$ clang -O2 -o - -emit-llvm -S -c test.c
; ModuleID = 'test.c'
target
2013 Jan 20
0
[LLVMdev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)
As a result of my investigations, the thread is also added to cfe-dev.

The context: while porting my company's code from LLVM/Clang 3.1 to 3.2, I stumbled on a code size and performance regression. The testcase is:

$ cat test.c
#include <stdint.h>

struct R {
  uint16_t a;
  uint16_t b;
};

volatile struct R * const addr = (volatile struct R *) 416;

void test(uint16_t a) {
2013 Jan 20
2
[LLVMdev] [cfe-dev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)
I doubt you needed to add cfe-dev here. Sorry I hadn't seen this; it seems like an easy and simple deficiency in the IR intrinsic for memcpy. See below.

On Sun, Jan 20, 2013 at 1:42 PM, Arnaud de Grandmaison <arnaud.allarddegrandmaison at parrot.com> wrote:
> define void @test(i16 zeroext %a) nounwind uwtable {
>   %r.sroa.0 = alloca i16, align 2
>   %r.sroa.1 = alloca i16,
2013 Feb 03
0
[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics
Same patches as before, but 0002-memcpy has been updated to put the (is|set)SrcVolatile methods where they logically belong: MemTransferInst. This makes the (is|set)Volatile methods look a bit ugly in order to keep compatibility with existing behaviour, but they will hopefully disappear once all users have moved to the new interface --- in the next series of patches. I plan to give a try to phabricator
2013 Jan 31
0
[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics
Thanks Andy and Chandler,

After specifying the volatile access behaviour, the second step was to autoupgrade the memmove/memcpy intrinsics, and implement (is|set)Volatile in terms of (is|set)(Src|Dest)Volatile, with no functional change.

0001-Specify-the-access-behaviour-of-the-memcpy-memmove-a.patch is the one you already reviewed, unaltered.
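A toy model in C of the flag split described above (illustrative only, not the LLVM C++ interface from the patches), showing how the legacy single-flag accessors can be expressed in terms of per-operand flags while keeping the old conservative meaning:

/* Toy model only; the real methods live on MemTransferInst in the patches. */
struct mem_transfer {
  unsigned src_volatile  : 1;   /* the source access is volatile       */
  unsigned dest_volatile : 1;   /* the destination access is volatile  */
};

int is_src_volatile(const struct mem_transfer *m)  { return m->src_volatile; }
int is_dest_volatile(const struct mem_transfer *m) { return m->dest_volatile; }

/* Legacy query: "volatile" keeps meaning "some access is volatile". */
int is_volatile(const struct mem_transfer *m) {
  return m->src_volatile || m->dest_volatile;
}

/* Legacy setter: with only one bit of input it conservatively marks both
   sides here; the actual patch may well handle this differently. */
void set_volatile(struct mem_transfer *m, int v) {
  m->src_volatile  = (v != 0);
  m->dest_volatile = (v != 0);
}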
2013 Jan 21
0
[LLVMdev] [cfe-dev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)
On 01/20/2013 10:56 PM, Chandler Carruth wrote:
> I doubt you needed to add cfe-dev here. Sorry I hadn't seen this, this
> seems like an easy and simple deficiency in the IR intrinsic for
> memcpy. See below.
>
> On Sun, Jan 20, 2013 at 1:42 PM, Arnaud de Grandmaison
> <arnaud.allarddegrandmaison at parrot.com>
2013 Dec 16
2
[LLVMdev] Question about Pre-RA-schedule in LLVM3.3
At 2013-12-15 22:43:34, "Caldarale, Charles R" <Chuck.Caldarale at unisys.com> wrote:
>> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
>> On Behalf Of Haishan
>> Subject: [LLVMdev] Question about Pre-RA-schedule in LLVM3.3
>
>> My clang version is 3.3 and debug build.
>
>> //test.c
>> int a[6] = {1, 2, 3, 4, 5,
2013 Dec 21
0
[LLVMdev] Question about Pre-RA-schedule in LLVM3.3
The flag -enable-aa-sched-mi should do what you want in the MachineScheduler pass. If you want to do it in the selection DAG, there is a subtarget hook that might do it: TargetSubtargetInfo::useAA(). LLVM won't generate the schedule you want anyway for Intel Core processors, but the alias analysis can be useful in general.
-Andy

On Dec 16, 2013, at 6:03 AM, Haishan <hndxvon at
2013 Dec 15
0
[LLVMdev] Question about Pre-RA-schedule in LLVM3.3
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Haishan
> Subject: [LLVMdev] Question about Pre-RA-schedule in LLVM3.3

> My clang version is 3.3 and debug build.

> //test.c
> int a[6] = {1, 2, 3, 4, 5, 6};
> int main() {
>   a[0] = a[5];
>   a[1] = a[4];
>   a[2] = a[5];
> }
> //end test.c

> Then test.dump is
2013 Dec 15
3
[LLVMdev] Question about Pre-RA-schedule in LLVM3.3
Hi,

I compile a case (test.c) to get an object file (test.o) using clang as follows:
"clang -target arm -integrated-as -c test.c -o test.o"
My clang version is 3.3 and a debug build.

//test.c
int a[6] = {1, 2, 3, 4, 5, 6};
int main() {
  a[0] = a[5];
  a[1] = a[4];
  a[2] = a[5];
}
//end test.c

Then test.dump is generated using the objdump tool.

//test.dump
ldr r1, [r0, #20]
2008 Sep 22
0
[LLVMdev] Overzealous PromoteCastOfAllocation
On Sep 13, 2008, at 1:07 PM, Matthijs Kooijman wrote:
> Hi Dan,
>
>> Changing PromoteCastOfAllocation to not replace aggregate allocas with
>> non-aggregate allocas if they have GEP users sounds reasonable to me.
> This sounds reasonable indeed, but still a bit arbitrary. Haven't
> figured out anything better yet, though.
>
>> Finding the
2008 Sep 13
3
[LLVMdev] Overzealous PromoteCastOfAllocation
Hi Dan,

> Changing PromoteCastOfAllocation to not replace aggregate allocas with
> non-aggregate allocas if they have GEP users sounds reasonable to me.
This sounds reasonable indeed, but still a bit arbitrary. Haven't figured out anything better yet, though.

> Finding the maximum alignment is sometimes still useful though, so
> it would be nice to update the alignment field of
2016 Jul 04
2
Optimization issues (Alias Analysis?)
Hey,

I am currently working on a VM based on LLVM and I would like to use its optimizer, but somehow it can't detect something very simple (I guess). This is the LLVM IR:

target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-linux-gnu"

%struct.regs = type { i32, i32, i32 }

define void @Test(%struct.regs* noalias
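The IR is cut off above, but for reference, a C-level signature that clang lowers to roughly that shape would use restrict-qualified parameters, which become noalias arguments in the IR. The body and parameter names below are guesses for illustration, not the original post's code:

struct regs {
  int a;
  int b;
  int c;
};

/* 'restrict' on the parameters becomes 'noalias' in an IR signature like
   the one shown above; with that guarantee the optimizer may assume the
   two structures never overlap. */
void Test(struct regs *restrict dst, const struct regs *restrict src) {
  dst->a = src->a;
  dst->b = src->b;
  dst->c = src->c;
}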
2020 Jul 02
3
Redundant ptrtoint/inttoptr instructions
Hi all,

We noticed a lot of unnecessary ptrtoint instructions that stand in the way of some of our optimizations; the code pattern looks like this:

bb1:
  %int1 = ptrtoint %struct.s* %ptr1 to i64
bb2:
  %int2 = ptrtoint %struct.s* %ptr2 to i64
bb3:
  %phi.node = phi i64 [ %int1, %bb1 ], [ %int2, %bb2 ]
  %ptr = inttoptr i64 %phi.node to %struct.s*

In short, the pattern above arises due to:
1.
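The message is cut off before its list of causes, but one source-level shape that produces exactly this phi-over-integers pattern (a guess for illustration, not the poster's code) is merging two pointers through uintptr_t:

#include <stdint.h>

struct s { int x; };

struct s *select_ptr(int cond, struct s *p1, struct s *p2) {
  uintptr_t addr;                    /* pointers flow through an integer */
  if (cond)
    addr = (uintptr_t)p1;            /* ptrtoint */
  else
    addr = (uintptr_t)p2;            /* ptrtoint */
  /* After SSA construction, 'addr' becomes a phi of the two integer
     values, and the cast back is the inttoptr feeding the use. */
  return (struct s *)addr;           /* inttoptr */
}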
2017 Mar 09
3
[RFC] bitfield access shrinking
On 03/09/2017 12:28 PM, Krzysztof Parzyszek via llvm-dev wrote:
> We could add intrinsics to extract/insert a bitfield, which would
> simplify a lot of that bitwise logic.

But then you need to teach a bunch of places how to simplify them, fold using bitwise logic and other things that reduce demanded bits into them, etc. This seems like a difficult tradeoff.
-Hal
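For readers landing on this thread from the search page: "shrinking" here means narrowing the memory access for a bitfield member when only a few bits are demanded. A hypothetical C example, not taken from the RFC itself:

#include <stdint.h>

struct flags {
  uint32_t a : 8;
  uint32_t b : 8;
  uint32_t c : 16;
};

/* A store to 'b' is naively lowered as: load the whole 32-bit unit, mask
   out 8 bits, insert the new value, store 32 bits back. Since only one
   byte of the unit actually changes, the access can often be shrunk to a
   single byte-sized load/store -- the bitwise logic the proposed
   extract/insert intrinsics would have to cooperate with. */
void set_b(struct flags *f, uint8_t v) {
  f->b = v;
}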
2017 Jun 20
3
LoopVectorize fails to vectorize loops with induction variables with PtrToInt/IntToPtr conversions
On 06/20/2017 03:26 AM, Hal Finkel wrote:
> Hi, Adrien,

Hello Hal! Thanks for your answer!

> Thanks for reporting this. I recommend that you file a bug report at
> https://bugs.llvm.org/

Will do!

> Whenever I see reports of missed optimization opportunities in the face
> of ptrtoint/inttoptr, my first question is: why are these instructions
> present in the first place? At
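As a hypothetical answer to that "why are they present" question (not Adrien's actual testcase), an induction variable carried as an integer address yields an integer-to-pointer conversion on every iteration, the kind of pattern the subject line says LoopVectorize gives up on:

#include <stdint.h>

/* Hypothetical loop whose induction variable lives as an integer address,
   so every access goes through an integer-to-pointer conversion. */
void scale(float *buf, unsigned long n) {
  uintptr_t p   = (uintptr_t)buf;                /* ptrtoint */
  uintptr_t end = p + n * sizeof(float);
  for (; p != end; p += sizeof(float))
    *(float *)p *= 2.0f;                         /* inttoptr on each trip */
}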
2013 Mar 11
2
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
Dear all,

Attached notunrolled.ll is a module containing a reduction kernel. What I'm trying to do is unroll it in such a way that the partial reduction over the unrolled iterations is performed in a register and only stored to memory once. Currently, LLVM's unroller together with all the standard optimizations produces code which stores the value to memory after every unrolled iteration, which is
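Since notunrolled.ll is only attached and not reproduced here, the following C sketch shows the intended shape under that assumption: keep the partial sums of the unrolled iterations in locals and write memory once after the loop.

/* Assumed shape of the kernel; the real one is in the attached IR. */
void reduce(const float *in, float *out, int n) {
  float acc0 = 0.0f, acc1 = 0.0f;      /* partial sums stay in registers */
  int i = 0;
  for (; i + 1 < n; i += 2) {          /* unrolled by two                */
    acc0 += in[i];
    acc1 += in[i + 1];
  }
  for (; i < n; ++i)                   /* remainder iterations           */
    acc0 += in[i];
  *out = acc0 + acc1;                  /* memory is written exactly once */
}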
2020 Jan 17
3
Help with SROA throwing away no-alias information
I'm having an issue where SROA will throw away no-alias information on some loads after inlining, because the loads are derived from a store to an alloca which can be removed after inlining. The pointers that were originally stored into the alloca do *not* have any aliasing information - the only context that allowed me to assert aliasing was that the inlined function guaranteed it to be so.
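A rough, hypothetical reconstruction of the situation described (names and shapes invented): the only no-alias guarantee comes from the callee's restrict parameters, and the pointers reach it through a local aggregate, i.e. the alloca that SROA removes after inlining.

/* Hypothetical reconstruction, not the poster's code. */
static void axpy(float *restrict dst, const float *restrict src, int n) {
  for (int i = 0; i < n; ++i)
    dst[i] += src[i];          /* the no-alias facts come from 'restrict' */
}

struct pair {
  float *d;
  const float *s;
};

void run(struct pair p, int n) {
  struct pair local = p;       /* the alloca that SROA removes once axpy
                                  is inlined                              */
  axpy(local.d, local.s, n);   /* loads of local.d / local.s are the ones
                                  whose aliasing information is at stake  */
}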
2015 Mar 15
5
[LLVMdev] Alias analysis issue with structs on PPC
On Sun, Mar 15, 2015 at 4:34 PM Olivier Sallenave <ol.sall at gmail.com> wrote:
> Hi Daniel,
>
> Thanks for your feedback. I would prefer not to write a new AA. Can't we
> directly implement that traversal in BasicAA?

Can I ask why? Outside of the "well, it's another pass", I mean? BasicAA is stateless, so you can't cache, and you really don't