Displaying 20 results from an estimated 21 matches for "looprerolling".
2013 Jul 14
6
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...like to hear what others in the community think about this and give other people the opportunity to perform their own performance measurements.
— Performance Gains —
SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
MultiSource/Benchmarks/Olden/power/power -18.55%
MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
SingleSource/Benchmarks/Misc/flops-6 -11.02%
SingleSource/Benchmarks/Misc/flops-5 -10.03%
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
External/Nurbs/nurbs -7.98%
SingleSource/Benchmarks/Misc/pi -7.29%
External/SPEC/CINT2000/252_eon/2...
2013 Jul 15
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...ould be straight-forward. It would also be really useful to see what the code size and compile time impact is.
-Chris
>
> — Performance Gains —
> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
> MultiSource/Benchmarks/Olden/power/power -18.55%
> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
> SingleSource/Benchmarks/Misc/flops-6 -11.02%
> SingleSource/Benchmarks/Misc/flops-5 -10.03%
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
> External/Nurbs/nurbs -7.98%
> SingleSource/Benchmarks/Misc/pi -7.29%
> Ex...
2013 Jul 28
2
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...us Current σ
SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.67% 1.4064 0.6516 0.0007
External/Nurbs/nurbs -19.47% 2.5389 2.0445 0.0029
MultiSource/Benchmarks/Olden/power/power -18.49% 1.2572 1.0248 0.0004
SingleSource/Benchmarks/Misc/flops-4 -15.93% 0.7767 0.6530 0.0348
MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.72% 2.3925 2.0404 0.0013
SingleSource/Benchmarks/Misc/flops-6 -11.05% 1.1427 1.0164 0.0009
SingleSource/Benchmarks/Misc/flops-5 -10.43% 1.2771 1.1439 0.0015
MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.10% 2.3468 2.1568 0.0195
SingleSource/Bench...
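The percentage column in the rows above is consistent with a plain relative change between the two timing columns. A minimal sketch, assuming the truncated header reads "Previous Current σ" and that the two columns are execution times; the function name `percent_change` is made up for illustration:

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from previous to current, in percent (negative = faster)."""
    return (current - previous) / previous * 100.0


# (previous, current) pairs copied from the rows above.
rows = {
    "matmul_f64_4x4": (1.4064, 0.6516),
    "nurbs": (2.5389, 2.0445),
    "power": (1.2572, 1.0248),
}

for name, (prev, cur) in rows.items():
    print(f"{name}: {percent_change(prev, cur):.2f}%")
```

Rounded to two decimals, the results should agree with the first column of the table.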
2013 Jul 15
3
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...also be really useful to see what the code size and compile time impact is.
>
> -Chris
>
>>
>> — Performance Gains —
>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
>> MultiSource/Benchmarks/Olden/power/power -18.55%
>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
>> SingleSource/Benchmarks/Misc/flops-6 -11.02%
>> SingleSource/Benchmarks/Misc/flops-5 -10.03%
>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
>> External/Nurbs/nurbs -7.98%
>> SingleSource/Benchmarks/Mi...
2013 Jul 23
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...at the code size and compile time impact is.
>>
>> -Chris
>>
>>>
>>> — Performance Gains —
>>> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
>>> MultiSource/Benchmarks/Olden/power/power -18.55%
>>> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
>>> SingleSource/Benchmarks/Misc/flops-6 -11.02%
>>> SingleSource/Benchmarks/Misc/flops-5 -10.03%
>>> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
>>> External/Nurbs/nurbs -7.98%
>>> Single...
2013 Jul 14
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...> community think about this and give other people the opportunity to perform
> their own performance measurements.
>
> — Performance Gains —
> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.68%
> MultiSource/Benchmarks/Olden/power/power -18.55%
> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.71%
> SingleSource/Benchmarks/Misc/flops-6 -11.02%
> SingleSource/Benchmarks/Misc/flops-5 -10.03%
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.37%
> External/Nurbs/nurbs -7.98%
> SingleSource/Benchmarks/Misc/pi -7.29%
>...
2013 Jul 28
0
[LLVMdev] Enabling the SLP-vectorizer by default for -O3
...nt σ
> SingleSource/Benchmarks/Misc/matmul_f64_4x4 -53.67% 1.4064 0.6516 0.0007
> External/Nurbs/nurbs -19.47% 2.5389 2.0445 0.0029
> MultiSource/Benchmarks/Olden/power/power -18.49% 1.2572 1.0248 0.0004
> SingleSource/Benchmarks/Misc/flops-4 -15.93% 0.7767 0.6530 0.0348
> MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -14.72% 2.3925 2.0404 0.0013
> SingleSource/Benchmarks/Misc/flops-6 -11.05% 1.1427 1.0164 0.0009
> SingleSource/Benchmarks/Misc/flops-5 -10.43% 1.2771 1.1439 0.0015
> MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -8.10% 2.3468 2.1568 0.0195
> SingleSource/B...
2013 Sep 25
0
[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation
...g)
MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk -69.11%
MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc -44.39%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/3mm/3mm -12.74%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemm/gemm -11.21%
MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -11.14%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/syr2k/syr2k -11.11%
MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -10.87%
MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -10.87%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm -10...
2015 Jan 30
0
[LLVMdev] About user of bitcast/GEP instruction
----- Original Message -----
> From: "guoqing zhang" <gqzhang81 at gmail.com>
> To: llvmdev at cs.uiuc.edu
> Sent: Friday, January 30, 2015 4:29:16 AM
> Subject: [LLVMdev] About user of bitcast/GEP instruction
>
> Hi,
>
>
> In PromoteMemoryToRegister.cpp, it seems to rely on the fact that the
> only users of bitcast/GEP instruction are lifetime
2015 Jan 30
3
[LLVMdev] About user of bitcast/GEP instruction
Hi,
PromoteMemoryToRegister.cpp seems to rely on the fact that the only
users of a bitcast/GEP instruction are lifetime intrinsics
(llvm.lifetime.start/end). I searched the llvm/test folder, and this
seems to be true.
However, reading the LLVM IR manual, I don't see any restriction stated on
the possible users of a bitcast/GEP instruction. So my question is: who
imposes the restriction?
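One way to picture the situation in the question is that the restriction is not a property of the IR itself but a filter applied before promotion: an alloca is only handed to the pass if every bitcast/GEP of it feeds nothing but lifetime markers. The following is a toy model under that assumption, not LLVM's API; the dictionary layout and the names `only_used_by_lifetime_markers` and `is_promotable` are hypothetical, and real checks (volatile accesses, types, stores of the pointer itself) are left out.

```python
LIFETIME_MARKERS = {"llvm.lifetime.start", "llvm.lifetime.end"}


def only_used_by_lifetime_markers(users):
    """True if every user is a call to a lifetime intrinsic."""
    return all(
        u["op"] == "call" and u.get("callee") in LIFETIME_MARKERS for u in users
    )


def is_promotable(alloca_users):
    """Toy promotability filter: loads/stores are fine, casts/GEPs only
    when they feed lifetime markers, anything else blocks promotion."""
    for user in alloca_users:
        if user["op"] in ("load", "store"):
            continue
        if user["op"] in ("bitcast", "gep"):
            if not only_used_by_lifetime_markers(user.get("users", [])):
                return False
            continue
        return False
    return True


# A bitcast that only feeds llvm.lifetime.start passes the filter.
ok = [
    {"op": "load"},
    {"op": "bitcast", "users": [{"op": "call", "callee": "llvm.lifetime.start"}]},
]
# A bitcast that is also passed to some other call does not.
bad = [{"op": "bitcast", "users": [{"op": "call", "callee": "memcpy"}]}]

print(is_promotable(ok), is_promotable(bad))
```

Under this reading, general IR can still use bitcast/GEP results freely; such allocas would simply be skipped rather than promoted.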
2015 Feb 26
5
[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Hi all,
I've started looking at the GlobalMerge pass, enabled by default on
ARM and AArch64. I think we should reconsider that, at least for
AArch64.
As is, the pass just merges all globals together, in groups of 4KB
(AArch64, 128B on ARM).
At the time it was enabled, the general thinking was "it's almost
free, it doesn't affect performance much, we might as well use it".
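To make "merges all globals together, in groups of 4KB" concrete, here is a simplified sketch of size-bounded grouping. It is a greedy, order-preserving model that ignores alignment and padding; it is not taken from the pass itself, and `group_globals` is a made-up name.

```python
def group_globals(globals_with_sizes, max_bytes):
    """Pack (name, size) pairs into consecutive groups of at most max_bytes."""
    groups, current, used = [], [], 0
    for name, size in globals_with_sizes:
        # Start a new group when the next global would overflow the limit.
        if current and used + size > max_bytes:
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        groups.append(current)
    return groups


# With the 128-byte limit mentioned for ARM, two 64-byte globals fill one group.
print(group_globals([("a", 64), ("b", 64), ("c", 8)], 128))
```

The same call with `max_bytes=4096` would model the AArch64 limit quoted above.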
2015 Jan 30
1
[LLVMdev] About user of bitcast/GEP instruction
Hi,
If the special handling in the mem2reg pass is to look for lifetime
intrinsics, shouldn't it cast to <IntrinsicInst> and then use
getIntrinsicID to check for lifetime_start and lifetime_end?
The thing that I don't understand is the following piece of code, which
finds all the users, casts them to <Instruction>, and then calls eraseFromParent().
How can this guarantee that it only
2018 Feb 06
2
[RFC] Make LoopVectorize Aware of SLP Operations
Hello,
We would like to propose making LoopVectorize aware of SLP operations,
to improve the generated code for loops operating on struct fields or
doing complex math.
At the moment, LoopVectorize uses interleaving to vectorize loops that
operate on values loaded/stored from consecutive addresses: vector
loads/stores are generated to combine consecutive loads/stores and then
shufflevector
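As a rough illustration of the interleaving idea described here (one wide access over consecutive addresses, followed by shuffles that separate the fields), the sketch below models the shuffle step with strided list slices. This is plain Python standing in for vector operations, not LLVM code; `deinterleave` and `interleave` are hypothetical names.

```python
def deinterleave(wide, factor):
    """Split one wide 'vector' into `factor` vectors, one per struct field.

    Taking elements i, i + factor, i + 2 * factor, ... plays the role of a
    strided shuffle mask applied to the wide load.
    """
    return [wide[i::factor] for i in range(factor)]


def interleave(lanes):
    """Inverse step before a wide store: zip per-field vectors back together."""
    return [x for group in zip(*lanes) for x in group]


# Memory layout of three {x, y} structs: x0 y0 x1 y1 x2 y2
wide_load = [1, 10, 2, 20, 3, 30]
xs, ys = deinterleave(wide_load, 2)
print(xs, ys)
print(interleave([xs, ys]) == wide_load)
```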
2018 Feb 08
0
[RFC] Make LoopVectorize Aware of SLP Operations
Hi Florian!
This proposal sounds pretty exciting! Integrating SLP-aware loop vectorization (or the other way around) and SLP into the VPlan framework is definitely aligned with the long term vision and we would prefer this approach to the LoopReroll and InstCombine alternatives that you mentioned. We prefer a generic implementation that can handle complicated cases to something ad-hoc for some
2016 Jul 21
2
RFC: Strong GC References in LLVM
On Thu, Jul 21, 2016 at 9:39 AM, Daniel Berlin <dberlin at dberlin.org> wrote:
>
>
> On Thu, Jul 21, 2016 at 9:26 AM, Andrew Trick <atrick at apple.com> wrote:
>
>>
>> On Jul 21, 2016, at 7:45 AM, Philip Reames <listmail at philipreames.com>
>> wrote:
>>
>> Joining in very late, but the tangent here has been interesting (if
>>
2016 Jul 21
3
RFC: Strong GC References in LLVM
> On Jul 21, 2016, at 7:45 AM, Philip Reames <listmail at philipreames.com> wrote:
>
> Joining in very late, but the tangent here has been interesting (if rather OT for the original thread).
>
> I agree with Danny that we might want to take a close look at how we model things like maythrow calls, no return, and other implicit control flow. I'm not convinced that moving
2016 Jul 21
4
RFC: Strong GC References in LLVM
Okay, so it sounds like it might actually be better to be even more low
level, call it "ExtendedBBInfo" or something, and rename what it provides
to be more clearly structural:
A. Inst * FirstIsGuaranteedToTransferExecutionToSuccessor(BB) (naming
bikeshed open on this one :P)
B. Inst * LastIsGuaranteedToTransferExecutionToSuccessor(BB)
C. Inst *FirstMayThrow(BB)
D. Inst
2013 Jul 28
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
...862 -1.5981735159817
Applications/ClamAV/clamscan 0.094 0.0925 -1.5957446808510
Benchmarks/McCat/17-bintr/bintr 0.0666 0.0658 -1.2012012012012
Benchmarks/MiBench/automotive-susan/automot 0.0312 0.0309 -0.9615384615384
Benchmarks/TSVC/LoopRerolling-dbl/LoopRerol 2.7783 2.7524 -0.9322247417485
Benchmarks/SciMark2-C/scimark2 22.2684 22.0824 -0.8352643207414
Benchmarks/mediabench/g721/g721encode/encod 0.0403 0.04 -0.7444168734491
Benchmarks/ASC_Sequoia/AMGmk/AMGmk 5.0381 5.0033 -0...
2013 Jul 18
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Andy and I briefly discussed this the other day, but we have not yet had a
chance to list a detailed pass order
for the pre- and post-IPO scalar optimizations.
This is the wish list we have in mind:
pre-IPO: based on the ordering he proposed, get rid of the inlining (or
just inline tiny funcs), get rid of
all loop xforms...
post-IPO: get rid of inlining, or maybe we still need it, only
2018 Apr 26
0
Compare test-suite benchmarks performance compiled without TBAA, with default TBAA and with new TBAA struct path
...70111|2.093725046| -0.16|17747770107| 0|2.089922083| 0.02|17747770106| 0|
|MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt.test | 40|1.426366572|15128429282|1.421173178| 0.37|15128429276| 0|1.421455642| 0.35|15128429276| 0|
|MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl.test | 40|2.479358186|12728810214|2.476810043| 0.1|12728810217| 0|2.479956135| -0.02|12728810217| 0|
|MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt.test | 40|1.749389689| 9799621839|1.751205696| -0.1| 9799621843| 0|1.746...