thr3ads.net - search: "dso_local"

Displaying 20 results from an estimated 153 matches for "dso_local".

Inlining + CSE + restrict pointers == funtimes

2020 Jan 22

...this down to the following failure (see https://godbolt.org/z/-mdjPV): int called(int* __restrict__ a, int* b, int* c) { return *a + *b + *c; } int foo(int * x, int * y) { return *x + *y + called(x, x, y); } int bar(int * x, int * y) { return called(x, x, y) + *x + *y; } Which becomes: define dso_local i32 @called(i32* noalias nocapture readonly %0, i32* nocapture readonly %1, i32* nocapture readonly %2) local_unnamed_addr #0 !dbg !7 { %4 = load i32, i32* %0, align 4, !dbg !19, !tbaa !20 %5 = load i32, i32* %1, align 4, !dbg !24, !tbaa !20 %6 = add nsw i32 %5, %4, !dbg !25 %7 = load i32, i32* %2...

Question about basic-aa's assumptions

2020 Jul 06

...0x565ee028 pc : 0x565ee028 c : 0x565ee024 *pc: 0x565ee024 Basically, I would have liked it if basic-aa said ppc and pc may alias to start with for this kind of usage. This is what the globals look like. The second one, "ppc", has "pc" on the right-hand side. @pc = common dso_local global [4 x i8*] zeroinitializer, align 4, !dbg !0 @ppc = dso_local global i8** getelementptr inbounds ([4 x i8*], [4 x i8*]* @pc, i32 0, i32 0), align 4, !dbg !6 Best regards, Ram

[IndVars] Rewriting exit value of SCEV add expressions

2019 Mar 25

...gressed here, because before rL346397 we would rewrite the exit value of SCEV add expression even if there is hard use inside the loop. Here is the reproducer: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @c = external dso_local local_unnamed_addr global i32, align 4 @a = external dso_local local_unnamed_addr global i32*, align 8 @b = external dso_local local_unnamed_addr global i32, align 4 @d = external dso_local local_unnamed_addr global i32, align 4 define i32 @foo(){ entry: %0 = load i32*, i32** @a, align 8 %.pre...

__restirct ignored when including headers like <cmath>

2020 Jun 28

...es __restirct when I include some standard headers. For example, this code: void vec_add(int* __restrict a, int* __restrict b, int n) { #pragma unroll 4 for(int i=0; i<n; ++i) { a[i] += b[i]; } } results in: ; Function Attrs: nofree norecurse nounwind define dso_local void @_Z7vec_addPiS_i(i32* noalias nocapture %a, i32* noalias nocapture readonly %b, i32 %n) local_unnamed_addr #0 { entry: . . ... (note the noaliass before function arguments). But this code: #include <cmath> void vec_add(int* __restrict a, int* __restrict b,...

Question about Constant expressions

2020 May 31

...dds it as a separate instruction, but it is rolled into one call instruction in the .bc output. I am just curious as to why it is done this way. #include <stdint.h> #include <stdio.h> int test21(void); int test21(void) { printf("Hello1\n"); return 0; } define dso_local i32 @test21() #0 { %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i64 0, i64 0)) ret i32 0 }

Usage of the jumptable attribute

2019 Aug 31

...target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @.str = private unnamed_addr constant [12 x i8] c"Hello World\00", align 1 ; Function Attrs: jumptable noinline nounwind optnone uwtable define dso_local void @foo() unnamed_addr #0 { %1 = call i32 @puts(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @.str, i32 0, i32 0)) ret void } declare dso_local i32 @puts(i8*) #1 ; Function Attrs: jumptable noinline nounwind optnone uwtable define dso_local i32 @main() un...

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

2020 Jan 11

...on I say that is that the inputs are vectors and the output is also a vector - we just perform the operation on extracted elements rather than on the input vectors themselves. In the PR you linked, there is an example that shows the difference (simplified to <2 x double> for brevity): define dso_local <2 x double> @test(i64 %a, i64 %b) { entry: %conv = uitofp i64 %a to double %conv1 = uitofp i64 %b to double %vecinit = insertelement <2 x double> undef, double %conv, i32 0 %vecinit2 = insertelement <2 x double> %vecinit, double %conv1, i32 1 ret <2 x double> %vec...

Emitting a local alias

2020 May 07

...want to be able to place _ZTVSt13bad_exception in .rodata instead of .data.rel.ro which requires that the object does not need a relocation. According to `Constant::needsRelocation()`, a reloc isn't needed if the offset I'm taking (.long symbol_A - symbol_B) contains symbols that are both dso_local. I can guarantee that `symbol_A` will be, but `symbol_B` needs to be a global with default visibility. My solution around this is to emit a local alias, but the issue is that with optimizations (-O3), the latter snippet seems to "resolve" the aliases and instead replaces them with my alia...

Question on TBAA and optimization

2019 Nov 28

..."Aptr" and "Bptr" will not alias each other in both the functions. TBAA is able to figure that out in the case of "foo2". Hence it could propagate the value "10" directly for return in "foo2". --IR snippet --- ; Function Attrs: nounwind uwtable define dso_local i32 @foo1(%struct.A* %Aptr, %struct.B* %Bptr) local_unnamed_addr #0 { entry: %b1 = getelementptr inbounds %struct.A, %struct.A* %Aptr, i64 0, i32 1, i32 0 store i32 10, i32* %b1, align 4, !tbaa !2 %b11 = getelementptr inbounds %struct.B, %struct.B* %Bptr, i64 0, i32 0 store i32 11, i32* %b1...

Refining which symbols are preemptable with lto

2017 Jun 14

As a follow up to https://reviews.llvm.org/D20217 I would like to use lto/thinlto to refine when a symbol is marked as local/preemptable. I'm not very familiar with lto though so would appreciate some guidance about how best to go about this. Regards Sean Fertile

Sancov guard semantics for usage between comdats

2020 May 14

...nitize-coverage=trace-pc-guard /tmp/test.cpp -c -O1` generates sancov guards (__sancov_gen_.X) that are used outside of their comdat group due to inlining: ``` @__sancov_gen_.1 = private global [3 x i32] zeroinitializer, section "__sancov_guards", comdat($_ZN3Foo10inline_fooEv) define dso_local i32 @_ZN3Foo10public_fooEv(%struct.Foo* %0) local_unnamed_addr #0 comdat align 2 { call void @__sanitizer_cov_trace_pc_guard(i32* getelementptr inbounds ([3 x i32], [3 x i32]* @__sancov_gen_, i64 0, i64 0)) call void asm sideeffect "", ""() #4 ; This is from inlining Fo...

Why doesn't this `and` get eliminated

2020 Jun 12

define dso_local i32 @f(i32 %0) { %2 = and i32 %0, 7 %3 = icmp eq i32 %2, 7 %4 = zext i1 %3 to i32 ret i32 %4 } I thought instcombine would remove it. It doesn't, and nothing else does either. The LLVM version is 10.0.0. /Riyaz
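
For reference, a C function along the following lines would plausibly lower to the IR in this snippet. The C source is an assumption, since the post shows only IR, and `f_without_mask` is a hypothetical name added for comparison. The sketch also hints at why the `and` may not be removable: the mask discards the upper bits before the comparison.

```c
/* Hypothetical C source for @f (an assumption; the post shows only IR):
   returns 1 when the low three bits of x are all set. */
int f(int x) {
    return (x & 7) == 7;
}

/* What the function would compute if the mask were simply dropped. */
int f_without_mask(int x) {
    return x == 7;
}
```

`f(15)` is 1 while `f_without_mask(15)` is 0, so the two versions differ for any input that has bits set above bit 2.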

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

2020 Jan 11

...output is also a vector - we just >> perform the operation on extracted elements rather than on the input >> vectors themselves. >> >> In the PR you linked, there is an example that shows the difference >> (simplified to <2 x double> for brevity): >> define dso_local <2 x double> @test(i64 %a, i64 %b) { >> entry: >> %conv = uitofp i64 %a to double >> %conv1 = uitofp i64 %b to double >> %vecinit = insertelement <2 x double> undef, double %conv, i32 0 >> %vecinit2 = insertelement <2 x double> %vecinit, doubl...

Need help in understanding llvm optimization

2018 Aug 11

Hi, I have the following C code: int main() { double x,y; x = 1e16; y = (x + 1) - x; printf("y:%e\n", y); return 0; } The LLVM bitcode for this function looks like this: ; Function Attrs: nounwind uwtable define dso_local i32 @main() local_unnamed_addr #0 { entry: %call = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), double 0.000000e+00) ret i32 0 } I am not able to understand how the addition and subtraction are performed in this code. There is no fadd o...
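
The folded constant in this snippet can be reproduced by hand. A minimal sketch, assuming IEEE 754 binary64 `double`; `residual` is a hypothetical helper name, not something from the thread. Near 1e16 adjacent doubles are 2 apart, so `x + 1` rounds back to `x` and the difference is exactly zero, which is consistent with the `double 0.000000e+00` argument the optimizer computed at compile time.

```c
/* Hypothetical helper: computes (x + 1) - x in double precision.
   The volatile intermediates force each step to be stored as a double
   and keep the compiler from folding the expression away. */
double residual(double x) {
    volatile double sum = x + 1.0;   /* rounds to the nearest double */
    volatile double diff = sum - x;
    return diff;
}
```

Under these assumptions `residual(1e16)` is 0.0, whereas `residual(1e15)` is 1.0 because 1e15 is still below 2^53, where every integer is representable.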

Information Loss of Array Type in Function Interface in IR Generated by Clang

2019 Jun 30

..., Thanks for your prompt reply! Sure, I can implement an AST visitor to go through the AST to get the information, but I just wonder whether there is any other way to let Clang do so. What I am considering is how to make the generated IR look like the following, which some tools achieve: define dso_local i32 @_Z1fPii([51 x i32]* %A, i32 %x) local_unnamed_addr #0 !dbg !7 { entry: ... } Best regards, Tingyuan LIANG, MPhil Student, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology
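
At the source level, the array bound stays in the type only when the parameter is a pointer to an array; an array parameter such as `int A[51]` is adjusted to `int *` by the language. A minimal sketch of the difference follows. `f_ptr_to_array` and `bound_of` are hypothetical names, and whether a given Clang version emits `[51 x i32]*` for such a parameter is an assumption to check against its actual output.

```c
#include <stddef.h>

/* Array parameter: adjusted to int *, so the bound 51 is not part of the type. */
int f(int A[51], int x) {
    return A[x];
}

/* Pointer-to-array parameter: the bound 51 stays in the parameter type. */
int f_ptr_to_array(int (*A)[51], int x) {
    return (*A)[x];
}

/* The bound can be recovered from the pointer-to-array type with sizeof;
   sizeof does not evaluate or dereference A here. */
size_t bound_of(int (*A)[51]) {
    return sizeof *A / sizeof (*A)[0];
}
```

A caller passes `arr` to `f` and `&arr` to `f_ptr_to_array`; the second form changes the function's interface, so it only helps when the source can be modified.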

ThinLTO Bug ?

2018 Jul 28

...b.o b.ll ~/llvm/build-debug/bin/llvm-lto2 run -o c a.o b.o -r a.o,c,px -r b.o,a,px -r b.o,b,px -r a.o,gv -r b.o,gv [~/thinltobug]$ cat a.ll target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.TA = type {} @gv = dso_local global %struct.TA * zeroinitializer define i32 @c() !dbg !6 { bitcast %struct.TA ** @gv to i8* unreachable } !llvm.module.flags = !{!0, !1} !llvm.dbg.cu = !{!2} !0 = !{i32 2, !"Debug Info Version", i32 3} !1 = !{i32 1, !"ThinLTO", i32 0} !2 = distinct !DICompile...

ThinLTO Bug ?

2018 Jul 30

...> > > > > [~/thinltobug]$ cat a.ll > > > > target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" > > > > target triple = "x86_64-unknown-linux-gnu" > > > > %struct.TA = type {} > > > > > > > > @gv = dso_local global %struct.TA * zeroinitializer > > > > define i32 @c() !dbg !6 { > > > > bitcast %struct.TA ** @gv to i8* > > > > unreachable > > > > } > > > > > > > > > > > > !llvm.module.flags = !{!0, !1} > > >...

LLVM Alias Analysis (Load and store from same address is not showed up in same set)

2018 Apr 15

..."e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @fp = internal global i32 ()* null, align 8 @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00", align 1 ; Function Attrs: noinline nounwind optnone uwtable define dso_local void @ind_call(i32 ()* %compr) #0 { entry: %compr.addr = alloca i32 ()*, align 8 store i32 ()* %compr, i32 ()** %compr.addr, align 8 %0 = load i32 ()*, i32 ()** %compr.addr, align 8 store i32 ()* %0, i32 ()** @fp, align 8 %1 = load i32 ()*, i32 ()** @fp, align 8 %call = call i32...

Information Loss of Array Type in Function Interface in IR Generated by Clang

2019 Jun 30

...Clang command I used is : clang -O1 -emit-llvm -S -g tmp.cc -o tmp.bc Thanks in advance for your time and suggestion! ^_^ C source code: int f ( int A[51], int x) { return A[x]; } =========================== generated IR: ; Function Attrs: norecurse nounwind readonly uwtable define dso_local i32 @_Z1fPii(i32* nocapture readonly %A, i32 %x) local_unnamed_addr #0 !dbg !7 { entry: call void @llvm.dbg.value(metadata i32* %A, metadata !13, metadata !DIExpression()), !dbg !15 call void @llvm.dbg.value(metadata i32 %x, metadata !14, metadata !DIExpression()), !dbg !16 %idxprom = sext i3...

TypePromoteFloat loses intermediate rounding operations

2019 Dec 10

For the following C code __fp16 x, y, z, w; void foo() { x = y + z; x = x + w; } clang produces IR that extends each operand to float and then truncates to half before assigning to x. Like this define dso_local void @foo() #0 !dbg !18 { %1 = load half, half* @y, align 2, !dbg !21 %2 = fpext half %1 to float, !dbg !21 %3 = load half, half* @z, align 2, !dbg !22 %4 = fpext half %3 to float, !dbg !22 %5 = fadd float %2, %4, !dbg !23 %6 = fptrunc float %5 to half, !dbg !21 store half %6, half* @x, align 2, !d...