thr3ads.net - llvm dev - [llvm-dev] target triple in 3.8 [Feb 2016]

Frank Winter via llvm-dev

2016-Feb-19 20:08 UTC

[llvm-dev] target triple in 3.8

I am having trouble making the SIMD vector length visible to the passes.
My application works at roughly the same level as 'opt'.
In version 3.6 I did the following:

functionPassManager->add(
    new llvm::TargetLibraryInfo(llvm::Triple(Mod->getTargetTriple())));
functionPassManager->add(new llvm::DataLayoutPass());

and then the -basicaa and -loop-vectorize passes were able to vectorize
the input IR for AVX.

Now, with 3.8, that no longer compiles. Instead I just set the
datalayout on the Module (I took that from the Kaleidoscope example):

Mod->setDataLayout( targetMachine->createDataLayout() );

So I don't add anything to the pass manager anymore, right? In
particular, I no longer set the target triple anywhere?

However, the SIMD width doesn't come through. The debug output of the
loop vectorizer says:

LV: Checking a loop in "main" from module
LV: Loop hints: force=? width=0 unroll=0
LV: Found a loop: L3
LV: Found an induction variable.
LV: We can vectorize this loop!
LV: Found trip count: 8
LV: The Smallest and Widest types: 32 / 32 bits.
LV: The Widest register is: 32 bits.
LV: Found an estimated cost of 0 for VF 1 For instruction:   %6 = phi i64 [ %14, %L3 ], [ 0, %L5 ]
LV: Found an estimated cost of 1 for VF 1 For instruction:   %7 = add nsw i64 %19, %6
LV: Found an estimated cost of 0 for VF 1 For instruction:   %8 = getelementptr float, float* %arg1, i64 %7
LV: Found an estimated cost of 1 for VF 1 For instruction:   %9 = load float, float* %8
LV: Found an estimated cost of 0 for VF 1 For instruction:   %10 = getelementptr float, float* %arg2, i64 %7
LV: Found an estimated cost of 1 for VF 1 For instruction:   %11 = load float, float* %10
LV: Found an estimated cost of 1 for VF 1 For instruction:   %12 = fadd float %11, %9
LV: Found an estimated cost of 0 for VF 1 For instruction:   %13 = getelementptr float, float* %arg0, i64 %7
LV: Found an estimated cost of 1 for VF 1 For instruction:   store float %12, float* %13
LV: Found an estimated cost of 1 for VF 1 For instruction:   %14 = add nsw i64 %6, 1
LV: Found an estimated cost of 1 for VF 1 For instruction:   %15 = icmp sge i64 %14, 8
LV: Found an estimated cost of 1 for VF 1 For instruction:   br i1 %15, label %L4, label %L3
LV: Scalar loop costs: 8.
LV: Selecting VF: 1.
LV: Vectorization is possible but not beneficial.
LV: Interleaving is not beneficial.


The problematic line is:

LV: The Widest register is: 32 bits.

Before, with 3.6 on the same hardware, it showed 256 bits (which is
correct).
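
To see why that one line decides the outcome, here is a minimal sketch of the arithmetic involved. It assumes (this is my reading of the vectorizer, not something stated in this thread) that the largest vectorization factor considered is roughly the widest register width divided by the widest element type. The helper name maxVectorizationFactor is hypothetical and is not an LLVM function.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, NOT part of LLVM: upper bound on the vectorization
// factor (VF) for a register width and an element width, both in bits.
unsigned maxVectorizationFactor(unsigned widestRegisterBits,
                                unsigned widestTypeBits) {
    assert(widestTypeBits != 0 && "element width must be nonzero");
    // Integer division gives the number of lanes; clamp to at least one
    // lane, which corresponds to plain scalar code.
    return std::max(1u, widestRegisterBits / widestTypeBits);
}
```

With the 32-bit floats in the log, a 32-bit "widest register" leaves room for only one lane, so VF 1 is the only candidate and the loop stays scalar. With 256 bits there is room for 8 lanes, which would match the trip count of 8.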

Something is amiss here. I know there were some changes to the target
triple handling, but I didn't follow them too closely. Does anyone know
how this is done now?

Thanks,
Frank

Mehdi Amini via llvm-dev

2016-Feb-19 20:14 UTC


Do you have the TTI in your pass manager?

Something like:

    // Add the TTI (required to inform the vectorizer about register size for
    // instance)
    PM.add(createTargetTransformInfoWrapperPass(TM->getTargetIRAnalysis()));

Also, are you populating the pass manager using the PassManagerBuilder? You
still need the TLI:

    // Populate the PassManager
    PassManagerBuilder PMB;
    PMB.LibraryInfo = new TargetLibraryInfoImpl(TM->getTargetTriple());
....


Or, without the PassManagerBuilder, something like:

    PM.add(new TargetLibraryInfoWrapperPass(
        TargetLibraryInfoImpl(TM->getTargetTriple())));
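
Why the TTI matters for the register width can be illustrated with a toy model. The sketch below is not LLVM code: ToyTargetInfo, ToyAvxTargetInfo and ToyPassManager are hypothetical stand-ins. It assumes that, when no target-specific information has been added, a pass asking for the register width gets a conservative default of 32 bits, which would be consistent with the "Widest register is: 32 bits" line in the log.

```cpp
#include <memory>
#include <utility>

// Toy stand-in for a target-information analysis; NOT LLVM's class.
struct ToyTargetInfo {
    virtual ~ToyTargetInfo() = default;
    // Conservative answer used when nothing about the target is known.
    virtual unsigned registerBitWidth() const { return 32; }
};

// Toy description of an AVX-capable x86-64 target.
struct ToyAvxTargetInfo : ToyTargetInfo {
    unsigned registerBitWidth() const override { return 256; }
};

// Toy pass manager: answers with the registered target info if there is
// one, and with the conservative default otherwise.
class ToyPassManager {
    std::unique_ptr<ToyTargetInfo> targetInfo_;

public:
    void addTargetInfo(std::unique_ptr<ToyTargetInfo> info) {
        targetInfo_ = std::move(info);
    }

    unsigned widestRegisterBits() const {
        if (targetInfo_)
            return targetInfo_->registerBitWidth();
        return ToyTargetInfo().registerBitWidth();
    }
};
```

In this model, setting the datalayout or the library info does not change the answer; only adding the target-specific info does. That is the role the createTargetTransformInfoWrapperPass line plays in the suggestion above.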


-- 
Mehdi


Frank Winter via llvm-dev

2016-Feb-19 20:24 UTC


I added your suggestion and am now using this:

llvm::legacy::FunctionPassManager *functionPassManager =
    new llvm::legacy::FunctionPassManager(Mod);

llvm::PassRegistry &registry = *llvm::PassRegistry::getPassRegistry();
initializeScalarOpts(registry);

functionPassManager->add(new llvm::TargetLibraryInfoWrapperPass(
    llvm::TargetLibraryInfoImpl(targetMachine->getTargetTriple())));


Still, I get:

LV: The Widest register is: 32 bits.


So, unfortunately, no change.

If I dump the Module, it starts with:

; ModuleID = 'module'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"


Does this datalayout look right for x86-64 with AVX?


Frank



