search for: p270

Displaying 20 results from an estimated 20 matches for "p270".

2020 May 27
4
default behavior or
...side effects by defaults. but when compile a test case i.e. cat a.c float foo(float a, float b) { return a+b; } $clang a.c -O2 -S -emit-llvm emit ir like: $cat a.ll --------------------------------------- ; ModuleID = 'a.c' source_filename = "a.c" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define dso_local float @foo(float %a, float %b) local_unnamed_addr #0 { entry: %add = fadd float %a, %b ret float %add }...
2020 May 31
2
LLC crash while handling DEBUG info
...--- void foo() { } ----------- Let's say, above function is compiled to generate LLVM IR with -g flag using the command line `clang++ -g -O0 -S -emit-llvm foo.cpp`, we get below IR ----------- ; ModuleID = 'foo.cpp' source_filename = "foo.cpp" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: noinline nounwind optnone uwtable define dso_local void @_Z3foov() #0 !dbg !7 { ret void, !dbg !10 } attributes #0 = { noinline nounwind optnone uwtable "...
2020 May 27
2
By default clang does not emit trap insn
...>> return a+b; >> } >> >> $clang a.c -O2 -S -emit-llvm >> emit ir like: >> $cat a.ll >> --------------------------------------- >> ; ModuleID = 'a.c' >> source_filename = "a.c" >> target datalayout = >> "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" >> target triple = "x86_64-unknown-linux-gnu" >> >> ; Function Attrs: norecurse nounwind readnone uwtable >> define dso_local float @foo(float %a, float %b) local_unnamed_addr #0 { >> entry: >...
2020 Jul 22
2
Unlikely branches can have expensive contents hoisted
...again, So I'm looking at llvm.expect specifically for branch hints. In the following example LLVM will hoist the pow/cos calls into the entry block even though I've used the llvm.expect intrinsic to make it clear that one of the calls is unlikely to occur. target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-pc-windows-msvc-coff" define dllexport double @foo(i32 %val) { entry: %0 = icmp slt i32 %val, 42 %1 = call i1 @llvm.expect.i1(i1 %0, i1 false) %2 = sitofp i32 %val to double br i1 %1, label...
2020 May 31
2
LLC crash while handling DEBUG info
...nerate LLVM IR with -g flag > > using the command line `clang++ -g -O0 -S -emit-llvm foo.cpp`, we get > > below IR > > > > ----------- > > ; ModuleID = 'foo.cpp' > > source_filename = "foo.cpp" > > target datalayout = > > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > > target triple = "x86_64-unknown-linux-gnu" > > > > ; Function Attrs: noinline nounwind optnone uwtable > > define dso_local void @_Z3foov() #0 !dbg !7 { > > ret void, !dbg !10 > > }...
2020 May 31
2
LLC crash while handling DEBUG info
...+ -g -O0 -S -emit-llvm foo.cpp`, we get > >> > below IR > >> > > >> > ----------- > >> > ; ModuleID = 'foo.cpp' > >> > source_filename = "foo.cpp" > >> > target datalayout = > >> > > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > >> > target triple = "x86_64-unknown-linux-gnu" > >> > > >> > ; Function Attrs: noinline nounwind optnone uwtable > >> > define dso_local void @_Z3foov() #0 !dbg !7 { > &...
2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
...h AVX or AVX2 enabled. The following IR example is vectorized to 4 wide with LLVM 11 and trunk whereas in LLVM 10 it (correctly as per what we want) vectorized it 8 wide matching the ymm registers. ; ModuleID = '../test.ll' source_filename = "main" target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-pc-windows-msvc-coff" %"Burst.Compiler.IL.Tests.VectorsMaths/FloatPointer.0" = type { float*, i32, [4 x i8] } ; Function Attrs: nofree define dllexport void @func(float* noalias nocapture...
2020 Nov 19
1
JIT compiling CUDA source code
...t compilation to my JIT as usual, but... what to do with the > Module from the device compilation? If I just add it to the JIT, I get an > error message like this: > > Added modules have incompatible data layouts: > e-i64:64-i128:128-v16:16-v32:32-n16:32:64 (module) vs > e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128 (jit) > > Any suggestions as to what to do with the Module containing CUDA kernel > code, so that the host Module can invoke it? > > Geoff > > On Tue, Nov 17, 2020 at 6:39 PM Geoff Levner <glevner at gmail.com> w...
2020 Jun 13
2
target-features attribute prevents inlining?
...stfn', with the latter calling the former. One would expect the optimizer inlining the call to the '_Z2fnP10TestStructi', but it doesn't. (The command line I used is 'opt -O3 test.ll -o test2.bc') source_filename = "a.cpp" > target datalayout = > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > target triple = "x86_64-unknown-linux-gnu" > > %struct.TestStruct = type { i8*, i32 } > > define dso_local i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 %1) > #0 { > %3 = getelementptr inboun...
2020 Jun 01
2
LLC crash while handling DEBUG info
...> below IR > >> >> > > >> >> > ----------- > >> >> > ; ModuleID = 'foo.cpp' > >> >> > source_filename = "foo.cpp" > >> >> > target datalayout = > >> >> > > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > >> >> > target triple = "x86_64-unknown-linux-gnu" > >> >> > > >> >> > ; Function Attrs: noinline nounwind optnone uwtable > >> >> > define dso_local...
2020 Nov 17
2
JIT compiling CUDA source code
We have an application that allows the user to compile and execute C++ code on the fly, using Orc JIT v2, via the LLJIT class. And we would like to extend it to allow the user to provide CUDA source code as well, for GPU programming. But I am having a hard time figuring out how to do it. To JIT compile C++ code, we do basically as follows: 1. call Driver::BuildCompilation(), which returns a
2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
...to 4 wide with LLVM 11 and >> trunk whereas in LLVM 10 it (correctly as per what we want) vectorized it 8 >> wide matching the ymm registers. >> >> ; ModuleID = '../test.ll' >> source_filename = "main" >> target datalayout = >> "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" >> target triple = "x86_64-pc-windows-msvc-coff" >> >> %"Burst.Compiler.IL.Tests.VectorsMaths/FloatPointer.0" = type { float*, >> i32, [4 x i8] } >> >> ; Function Attrs: nofree...
2020 Nov 19
0
JIT compiling CUDA source code
...I add the Module from the host compilation to my JIT as usual, but... what to do with the Module from the device compilation? If I just add it to the JIT, I get an error message like this: Added modules have incompatible data layouts: e-i64:64-i128:128-v16:16-v32:32-n16:32:64 (module) vs e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128 (jit) Any suggestions as to what to do with the Module containing CUDA kernel code, so that the host Module can invoke it? Geoff On Tue, Nov 17, 2020 at 6:39 PM Geoff Levner <glevner at gmail.com> wrote: > We have an applicati...
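The "incompatible data layouts" error above comes down to the two layout strings describing different targets. As a rough illustration of how such a string is structured, here is a minimal C sketch that scans a data layout for a pointer entry of the form p<address space>:<size>:<ABI alignment>, which is how entries such as p270:32:32 are laid out in the strings quoted in these threads. The helper name find_ptr_spec and the struct are hypothetical and not part of any LLVM API; the sketch only handles entries with an explicit address-space number and ignores the optional trailing fields a pointer entry may carry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Size and ABI alignment, in bits, of pointers in one address space. */
struct ptr_spec {
    unsigned addr_space;
    unsigned size_bits;
    unsigned abi_align_bits;
};

/*
 * Hypothetical helper (illustration only, not an LLVM function):
 * walk the '-'-separated fields of a data layout string and look for a
 * "p<AS>:<size>:<abi>" entry whose address space equals `addr_space`.
 * Returns true and fills `out` when such an entry is found.
 */
bool find_ptr_spec(const char *layout, unsigned addr_space, struct ptr_spec *out)
{
    assert(layout != NULL && out != NULL);

    const char *field = layout;
    while (*field != '\0') {
        unsigned as = 0, size = 0, abi = 0;

        /* Fields that do not start with 'p' followed by a number fail to match. */
        if (sscanf(field, "p%u:%u:%u", &as, &size, &abi) == 3 && as == addr_space) {
            out->addr_space = as;
            out->size_bits = size;
            out->abi_align_bits = abi;
            return true;
        }

        /* Advance to the next field, if there is one. */
        const char *dash = strchr(field, '-');
        if (dash == NULL) {
            break;
        }
        field = dash + 1;
    }
    return false;
}
```

Called on the JIT's layout from the error message, find_ptr_spec(layout, 270, &spec) would report 32-bit pointers with 32-bit alignment, while the same call on the device module's layout (e-i64:64-i128:128-v16:16-v32:32-n16:32:64) finds no such entry, which is one visible difference between the two strings.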
2020 Jun 13
2
target-features attribute prevents inlining?
...mer. One would expect the > optimizer inlining the call to the '_Z2fnP10TestStructi', but it doesn't. > (The command line I used is 'opt -O3 test.ll -o test2.bc') > > > >> source_filename = "a.cpp" > >> target datalayout = > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > >> target triple = "x86_64-unknown-linux-gnu" > >> > >> %struct.TestStruct = type { i8*, i32 } > >> > >> define dso_local i32 @_Z2fnP10TestStructi(%struct.TestStruct* %0, i32 &...
2020 Jul 16
4
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
...s in LLVM 10 it (correctly as per what we want) vectorized it 8 >>>> wide matching the ymm registers. >>>> >>>> ; ModuleID = '../test.ll' >>>> source_filename = "main" >>>> target datalayout = >>>> "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" >>>> target triple = "x86_64-pc-windows-msvc-coff" >>>> >>>> %"Burst.Compiler.IL.Tests.VectorsMaths/FloatPointer.0" = type { float*, >>>> i32, [4 x i8] } >>>...
2020 Jun 13
2
target-features attribute prevents inlining?
...> optimizer inlining the call to the '_Z2fnP10TestStructi', but it doesn't. > (The command line I used is 'opt -O3 test.ll -o test2.bc') > >> > > >> >> source_filename = "a.cpp" > >> >> target datalayout = > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > >> >> target triple = "x86_64-unknown-linux-gnu" > >> >> > >> >> %struct.TestStruct = type { i8*, i32 } > >> >> > >> >> define dso_local i32 @_Z2f...
2020 May 23
2
Loop Unroll
...clang -O0 -Xclang -disable-O0-optnone -emit-llvm for.c -S -o forO0.ll $ opt -O0 -S --loop-unroll --unroll-count=4 -view-cfg forO0.ll -o for-opt00-unroll4.ll And this is the LLVM IR code that I get: ; ModuleID = 'forO0.ll' source_filename = "for.c" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: noinline nounwind uwtable define dso_local i32 @add(i32 %a, i32 %b) #0 { entry: %a.addr = alloca i32, align 4 %b.addr = alloca i32, align 4 store i32 %a, i...
2020 May 26
3
Loop Unroll
...-llvm for.c -S -o forO0.ll > $ opt -O0 -S --loop-unroll --unroll-count=4 -view-cfg forO0.ll -o > for-opt00-unroll4.ll > > And this is the LLVM IR code that I get: > > ; ModuleID = 'forO0.ll' > source_filename = "for.c" > target datalayout = > "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" > target triple = "x86_64-unknown-linux-gnu" > > ; Function Attrs: noinline nounwind uwtable > define dso_local i32 @add(i32 %a, i32 %b) #0 { > entry: > %a.addr = alloca i32, align 4 > %b.addr = all...
2020 May 22
4
Loop Unroll
Hi, I'm interested in finding a loop-unrolling pass in the LLVM compiler. I tried opt --loop-unroll --unroll-count=4, but it doesn't work well. Which pass can I use, and how? I would also like to know whether there is any way to mark the loops that I want unrolled. Thank you.
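On the question of marking individual loops: one option in clang is a loop pragma placed directly before the loop. The sketch below is a minimal C example assuming clang as the compiler; the function name sum_array is made up for illustration, and the unroll count of 4 simply mirrors the --unroll-count=4 value from the question. The pragma is a request to the optimizer rather than a guarantee, and other compilers may ignore it with a warning.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example function: sums `n` integers.
 * The pragma asks clang to unroll only this loop, by a factor of 4.
 */
int sum_array(const int *values, size_t n)
{
    assert(values != NULL || n == 0);

    int total = 0;
#pragma clang loop unroll_count(4)
    for (size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

Because the hint is attached to the loop in the source, it travels with the code and does not depend on passing a global unroll count to opt, which would apply to every loop the pass visits.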
2006 Jun 09
3
GXP-2000 MultiPurpose Keys
Is it possible to program the multi-purpose keys on a GXP-2000 remotely via a TFTP configuration file? If so, what are the parameters to put in the configuration file? Thanks, Daniel