Displaying 20 results from an estimated 44 matches for "srcloc".
2014 Mar 07
4
[LLVMdev] RFC - Adding an optimization report facility?
...imization report facility?
> On Thu, Mar 6, 2014 at 7:08 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> My suggestion is that we start attaching loop-id metadata to loops in
> the frontend, and then also start attaching 'srcloc' metadata, just
> like we do for inline asm statements. This way we can pass back the
> information we need to the frontend for it to identify the loop
> without too much trouble. There may be a better long-term design,
> but this seems, at least, like an easy thing to do in the sho...
2020 Jan 07
2
Inline assembly in intel syntax mishandling i constraint
...asm volatile("movl %0, %%eax" : : "i"(&foo));
asm volatile("movl %0, %%ebx" : : "i"(&bar));
}
This produces
define void @_start() #0 {
call void asm sideeffect "movl $0, %eax", "i,~{dirflag},~{fpsr},~{flags}"(i32* @foo) #1, !srcloc !3
call void asm sideeffect "movl $0, %ebx", "i,~{dirflag},~{fpsr},~{flags}"(i32* @bar) #1, !srcloc !4
ret void
}
When assembled, I get the expected output
80480a3: b8 b0 90 04 08 mov eax,0x80490b0
80480a8: bb b4 90 04 08 mov ebx,0x80490b4
After modi...
2020 Jan 08
2
Inline assembly in intel syntax mishandling i constraint
...%%eax" : : "i"(&foo));
> asm volatile("movl %0, %%ebx" : : "i"(&bar));
> }
>
> This produces
> define void @_start() #0 {
> call void asm sideeffect "movl $0, %eax", "i,~{dirflag},~{fpsr},~{flags}"(i32* @foo) #1, !srcloc !3
> call void asm sideeffect "movl $0, %ebx", "i,~{dirflag},~{fpsr},~{flags}"(i32* @bar) #1, !srcloc !4
> ret void
> }
>
> When assembled, I get the expected output
> 80480a3: b8 b0 90 04 08 mov eax,0x80490b0
> 80480a8: bb b4...
2014 Mar 07
3
[LLVMdev] RFC - Adding an optimization report facility?
...don't have enough location information to tell the user
> *which* loop got unrolled. The key is to give them actionable
> information, not just the output of -debug :-)
My suggestion is that we start attaching loop-id metadata to loops in the frontend, and then also start attaching 'srcloc' metadata, just like we do for inline asm statements. This way we can pass back the information we need to the frontend for it to identify the loop without too much trouble. There may be a better long-term design, but this seems, at least, like an easy thing to do in the short term.
-Hal
>...
2013 Jun 07
2
[LLVMdev] add Inline assembly in LLVM IR
...128:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i32 @main() nounwind uwtable {
entry:
%retval = alloca i32, align 4
store i32 0, i32* %retval
call void asm sideeffect ".long 0x12345678",
"~{dirflag},~{fpsr},~{flags}"() nounwind, !srcloc !0
ret i32 0
}
!0 = metadata !{i32 20}
--
And I want to know which LLVM API I should use to generate " call void asm
sideeffect ".long 0x12345678", "~{dirflag},~{fpsr},~{flags}"() nounwind,
!srcloc !0" ?
Thanks!!
2017 Jan 27
2
llvm return value propagation & asm
...ue to the caller.
; Function Attrs: naked noinline optnone
define i32 @callcatch(i32, i32) #3 !dbg !10103 {
BasicBlock8472:
call void asm "\0D\0Apushl %ebp\0D\0Amovl 8(%esp),%eax\0D\0Amovl
12(%esp), %ebp\0D\0Acalll *%eax\0D\0Apopl %ebp\0D\0Aretl\0D\0A", ""(),
!dbg !10104, !srcloc !10106 // this returns in eax
ret i32 0, !dbg !10104
}
; Function Attrs: naked noinline optnone
define void @jumptocont(i32, i32, i32) #3 !dbg !10107 {
BasicBlock8473:
call void asm "\0D\0A movl 12(%esp), %ebp\0D\0A movl 4(%esp),
%eax\0D\0A movl 8(%esp), %esp\0D\0A jmpl *%e...
2019 Dec 09
4
IR inline assembly: the x86 Intel "offset" operator
Hi all,
I'm trying to land (a rebased version of) http://llvm.org/D37461 - which
should make it possible to handle x86 Intel assembly like
mov eax, offset Foo::ptr + 1
(Currently, omitting the +1 works... but offset doesn't work in compound
expressions.)
I'm having trouble figuring out what inline assembly I can emit into the
LLVM IR that will work properly. So far, the closest
2012 Aug 12
1
[LLVMdev] Load and store debug information
Hi all,
I am currently looking for more accurate debug information.
In particular I am interested in relating load and store IR
instructions to the corresponding array subscripts.
Current debug information associates each instruction with the source
line it comes from.
I would like to extend this, for loads and stores only, to make the
metadata point to the
source location where the array
2013 Jun 07
0
[LLVMdev] add Inline assembly in LLVM IR
...triple = "x86_64-unknown-linux-gnu"
>
> define i32 @main() nounwind uwtable {
> entry:
> %retval = alloca i32, align 4
> store i32 0, i32* %retval
> call void asm sideeffect ".long 0x12345678", "~{dirflag},~{fpsr},~{flags}"()
> nounwind, !srcloc !0
> ret i32 0
> }
>
> !0 = metadata !{i32 20}
>
> --
> And I want to know which LLVM API I should use to generate " call void asm
> sideeffect ".long 0x12345678", "~{dirflag},~{fpsr},~{flags}"() nounwind, !srcloc
> !0" ?
>
> Thanks...
2013 Jun 26
1
[LLVMdev] Inline asm call argument mismatch?
Hello,
In the following code snippet:
%tmp49 = call i64 asm "movq %gs:${1:P},$0", "=r,im,,~{fpsr},~{flags}"(i64*
@kernel_stack) #6, !dbg !6625, !srcloc !5841
I would expect the inline asm call to receive two arguments, because
${1:P} corresponds to a %P1 that appends $1 to %%gs:.
Can someone explain why there is only one argument in this call?
Moreover, is there any documentation on the constraints that LLVM/clang
generates?...
2015 Nov 02
8
[RFC] A new intrinsic, `llvm.blackbox`, to explicitly prevent constprop, die, etc optimizations
...2 = bitcast i32* %dummy.i to i8*
call void @llvm.lifetime.start(i64 4, i8* %2) #1
; Here, the value operand was the original argument to
`test::black_box::<i32>`
store i32 2, i32* %dummy.i, align 4
call void asm "", "r,~{dirflag},~{fpsr},~{flags}"(i32* %dummy.i) #1,
!srcloc !0
%3 = load i32, i32* %dummy.i, align 4
call void @llvm.lifetime.end(i64 4, i8* %2) #1
````
This could be better.
# Solution
Add a new intrinsic, called `llvm.blackbox`, which accepts a value of any
type and returns a value of the same type. As with many other intrinsics,
this intrinsic sha...
2019 Apr 30
4
RFC: Extending optimization reporting
...llowing not very helpful information.
for (...)
// Loop was unswitched.
// Loop could not be vectorized because...
// Loop was vectorized.
A
if (lic)
B
C
Instead I'd like to have a way to produce something like this:
for (...)
// Loop was unswitched for condition (srcloc)
// Unswitched loop version #1
// Unswitched for IF condition (srcloc)
// Loop was not vectorized:
// Unswitched loop version #2
// Loop was vectorized
The primary thing missing, I think, is a way for the vectorizer to give some indication of which version of the loop it...
2014 Nov 26
2
[LLVMdev] new set of superoptimizer results
I strongly suspect PointerUnion and PointerIntPair-style classes are the
source of these... but perhaps I'm wrong.
On Nov 26, 2014 9:29 AM, "Michael Zolotukhin" <mzolotukhin at apple.com>
wrote:
> John,
>
> That’s a great post and really interesting data, thank you!
>
> On Nov 25, 2014, at 9:58 AM, Reid Kleckner <rnk at google.com> wrote:
2016 Jul 21
2
InlineAsm and allocation to wrong register for indirect access
...of an inline asm with indirect
memory references being allocated invalid registers (i.e. registers that
cannot be used for loads).
For example, the inline asm constraint is correct:
call void asm sideeffect "MOV $$r0, $0\0AMOV $$r0, $1\0A",
"*m,*m,~{r0}"(i16* @a, i16* %b) #1, !srcloc !1
but then $0 and $1 are allocated to registers that cannot be used as a
memory base pointer.
I am having trouble finding where this decision is made. Is InlineAsm
going through the normal register allocation process or does it have its
own specialized algorithm?
Any pointers to how registers a...
2015 Jul 29
2
[LLVMdev] optimizer clobber EFLAGS
...irective:
---
entry:
tail call void @foo() #2
%0 = load i32* @a, align 4, !tbaa !1
%sub = add nsw i32 %0, -1
store i32 %sub, i32* @a, align 4, !tbaa !1
%tobool = icmp eq i32 %sub, 0
tail call void asm sideeffect "",
"~{cc},~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !5
tail call void @foo() #2
br i1 %tobool, label %if.end, label %return
if.end: ; preds = %entry
tail call void @foo() #2
br label %return
return: ; preds = %entry, %if.end
%retval.0 = phi i32 [ 0,...
2019 Dec 11
2
IR inline assembly: the x86 Intel "offset" operator
...On Tue, Dec 10, 2019 at 7:23 PM Reid Kleckner <rnk at google.com> wrote:
> I think perhaps we want to make this LLVM IR asm string:
> call void asm sideeffect inteldialect "mov eax, $0",
> "i,~{eax},~{dirflag},~{fpsr},~{flags}"(i32* @"?Bar@@3HA") #1, !srcloc !3
>
> The key thing is the 'i' constraint. That ends up producing the wrong
> assembly right now, but maybe with your rebased patch, now it will do the
> right thing.
>
> If you change up the dialect and switch to an ELF triple, this inline asm
> will produce the expec...
2016 Oct 11
2
Landing Pad bug?
...; preds = %6
unreachable
; <label>:8: ; preds = %2
%9 = tail call { i64, i8* } asm sideeffect "rep stosb", "={cx},={di},{ax},0,1,~{memory},~{dirflag},~{fpsr},~{flags}"(i32 0, i64 16, i8* nonnull %3) #53, !srcloc !1070
invoke void @_ZN8CryptoPP19UnalignedDeallocateEPv(i8* nonnull %3)
to label %13 unwind label %10
; <label>:10: ; preds = %8, %6
%11 = landingpad { i8*, i32 }
catch i8* null
%12 = extractvalue { i8*, i32 } %11, 0
tail call v...
2015 Mar 01
2
[LLVMdev] RFC: PerfGuide for frontend authors
...llvm.lifetime.start(i64 400000, i8* %2) ; this happens too late
>>>>> call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %1, i64 400000, i32 4, i1 false)
>>>>> call void asm "", "r,~{dirflag},~{fpsr},~{flags}"([100000 x i32]* %arg) #2, !noalias !0, !srcloc !3
>>>>> call void @llvm.lifetime.end(i64 400000, i8* %2) #2, !alias.scope !4, !noalias !0
>>>>> call void @llvm.lifetime.end(i64 400000, i8* %2)
>>>>> call void @llvm.lifetime.end(i64 400000, i8* %1)
>>>>> ret void
>>>>>...
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...a, i32* %a.addr, align 4
%0 = load i32* %a.addr, align 4
%1 = call i32 asm sideeffect "$( \0A\09.reg .pred \09%p1; \0A\09.reg
.pred \09%p2; \0A\09setp.ne.u32 \09%p1, $1, 0; \0A\09vote.any.pred \09%p2,
%p1; \0A\09selp.s32 \09$0, 1, 0, %p2; \0A\09$)", "=r,r"(i32 %0) nounwind,
!srcloc !0
store i32 %1, i32* %result, align 4
%2 = load i32* %result, align 4
ret i32 %2
}
!0 = metadata !{i32 127, i32 132, i32 166, i32 200, i32 242, i32 285, i32
327}
> llc -march=nvptx64 test.ll -o test.ptx
> cat test.ptx
//
// Generated by LLVM NVPTX Back-End
//
.version 3.0
.target sm...
2019 May 08
2
RFC: Extending optimization reporting
...llowing not very helpful information.
for (...)
// Loop was unswitched.
// Loop could not be vectorized because...
// Loop was vectorized.
A
if (lic)
B
C
Instead I'd like to have a way to produce something like this:
for (...)
// Loop was unswitched for condition (srcloc)
// Unswitched loop version #1
// Unswitched for IF condition (srcloc)
// Loop was not vectorized:
// Unswitched loop version #2
// Loop was vectorized
The primary thing missing, I think, is a way for the vectorizer to give some indication of which version of the loop it...