similar to: [LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?

2013 Jan 01
0
[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?
Hi Dmitry, > In our compiler we use a modified version of LLVM Polly, which is very sensitive to proper code generation. Among the number of limitations, the loop region (enclosed by phi node on induction variable and branch) is required to be free of additional memory-dependent branches. In other words, there must be no conditional "br" instructions below phi
2012 Jul 31
3
[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM
Hi Duncan, A DragonEgg/GCC-related question: do you know where these strange FRAME tokens originate from (e.g. %struct.FRAME.matmul)? Compiling simple Fortran code with DragonEgg: > cat matmul.f90 subroutine matmul(nx, ny, nz) implicit none integer :: nx, ny, nz real, dimension(nx, ny) :: A real, dimension(ny, nz) :: B real, dimension(nx, nz) :: C integer :: i, j, k real,
2012 Jul 31
0
[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM
According to the comment in tree-nested.c, these frames should only be introduced in case of debug or OpenMP lowering: /* A subroutine of convert_nonlocal_reference_op. Create a local variable in the nested function with DECL_VALUE_EXPR set to reference the true variable in the parent function. This is used both for debug info and in OpenMP lowering. */ However, in this code example we
2013 Oct 23
0
[LLVMdev] First attempt at recognizing pointer reduction
On Oct 23, 2013, at 3:10 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 23 October 2013 16:05, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > In the examples you gave there are no reduction variables in the loop vectorizer’s sense. But, they all have memory accesses that are strided. > > This is what I don't get. As far as I understood, a
2013 Oct 21
0
[LLVMdev] First attempt at recognizing pointer reduction
Renato, can you post a hand-created vectorized IR of how a reduction would work on your example? I don’t think that recognizing this as a reduction is going to get you far. A reduction is beneficial if the value reduced is only truly needed outside of a loop. This is not the case here (we are storing/loading from the pointer). Your example is something like WRITEPTR = phi i8* [ outsideval,
2012 May 04
3
[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful
Is there any chance of replacing/extending the GEP instruction? As noted in the GEP FAQ, GEPs don't support variable-length arrays; when the front ends have to support VLAs, they linearize the subscript expressions, throwing away information. The FAQ suggests that folks interested in writing an analysis that understands array indices (I'm thinking of dependence analysis) should be
2013 Oct 23
2
[LLVMdev] First attempt at recognizing pointer reduction
On 23 October 2013 16:05, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > In the examples you gave there are no reduction variables in the loop vectorizer’s sense. But, they all have memory accesses that are strided. > This is what I don't get. As far as I understood, a reduction variable is the one that aggregates the computation done by the loop, and is used
2012 May 04
0
[LLVMdev] Extending GetElementPointer, or Premature Linearization Considered Harmful
Hi Preston, On Fri, May 4, 2012 at 9:12 AM, Preston Briggs <preston.briggs at gmail.com> wrote: > > which produces > > %arrayidx24 = getelementptr inbounds [100 x [100 x i64]]* %A, i64 > %arrayidx21.sum, i64 %add1411, i64 %add > store i64 0, i64* %arrayidx24, align 8 > {{{(5 + ((3 + %n) * %n)),+,(2 * %n * %n)}<%for.cond1.preheader>,+,(4 *
2013 Oct 21
5
[LLVMdev] First attempt at recognizing pointer reduction
Hi Nadav, Arnold, I managed to find some time to work on the pointer reduction, and I got a patch that can make "canVectorize()" pass. Basically what I do is to teach AddReductionVar() about pointers, saying they don't really have an exit instruction, and that (maybe) the final store is a good candidate (is it?). This makes it recognize the writes and reads, but then
2013 Jan 02
2
[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?
On 01/01/2013 02:45 PM, Duncan Sands wrote: > Hi Dmitry, > >> In our compiler we use a modified version of LLVM Polly, which is very sensitive to proper code generation. Among the number of limitations, the loop region (enclosed by phi node on induction variable and branch) is required to be free of additional memory-dependent
2013 Jan 02
0
[LLVMdev] [DragonEgg] [Polly] Should we expect DragonEgg to produce identical LLVM IR for identical GIMPLE?
Hi Duncan & Tobi, Thanks a lot for your interest, and for pointing out differences in GIMPLE I missed. Attached is a simplified test case. Is it good? Tobi, regarding runtime alias analysis: in KernelGen we already do it along with runtime value substitution. For example: <------------------ __kernelgen_main_loop_17: compile started ---------------------> Integer args substituted:
2017 Dec 13
2
Reducing code size of Position Independent Executables (PIE) by shrinking the size of dynamic relocations section
On Mon, Dec 11, 2017 at 6:14 PM, Roland McGrath <roland at hack.frob.com> wrote: > > On Mon, Dec 11, 2017 at 3:50 PM Rahul Chaudhry via gnu-gabi <gnu-gabi at sourceware.org> wrote: >> >> A simple combination of delta-encoding and run-length encoding is one of the >> first schemes we experimented with (32-bit entries with 24-bit 'delta' and an >>
2019 Jun 27
2
[PATCH v2] drm/bochs: fix framebuffer setup.
The driver doesn't consider framebuffer pitch and offset, leading to a wrong display in case offset != 0 or pitch != width * bpp. Fix it. Signed-off-by: Gerd Hoffmann <kraxel at redhat.com> --- drivers/gpu/drm/bochs/bochs.h | 2 +- drivers/gpu/drm/bochs/bochs_hw.c | 14 ++++++++++---- drivers/gpu/drm/bochs/bochs_kms.c | 3 ++- 3 files changed, 13 insertions(+), 6 deletions(-)
2016 Jun 18
2
[Proposal][RFC] Strided Memory Access Vectorization
>Vectorizer's output should be as clean as vector code can be so that analyses and optimizers downstream can do a great job optimizing. Guess I should clarify this philosophical position of mine. In terms of vector code optimization that complicates the output of vectorizer: If vectorizer is the best place to perform the optimization, it should do so. This includes the cases like
2016 Jun 30
1
[Proposal][RFC] Strided Memory Access Vectorization
As a strong advocate of logical vector representation, I'm counting on the community liking Michael's RFC and on it proceeding sooner rather than later. I plan to pitch in (e.g., perf experiments). >Probably can depend on the support provided by below RFC by Michael: > "Allow loop vectorizer to choose vector widths that generate illegal types" >In that case Loop Vectorizer will
2016 Jun 30
0
[Proposal][RFC] Strided Memory Access Vectorization
One common concern is raised for cases where the Loop Vectorizer generates bigger types than the target supports: based on VF, we currently check the cost and generate the expected set of instruction[s] for the bigger type. This has two challenges: for bigger types the cost is not always correct, and code generation may not generate efficient instruction[s]. Probably can depend on the support provided by below RFC by
2016 Jun 15
3
[Proposal][RFC] Strided Memory Access Vectorization
Sorry for the spam. Copy-paste didn't capture the Subject properly. Resending with the correct Subject so that the thread is captured properly. -----Original Message----- From: Saito, Hideki Sent: Wednesday, June 15, 2016 1:39 PM To: 'llvm-dev at lists.llvm.org' <llvm-dev at lists.llvm.org> Subject: RE: [llvm-dev] [Proposal][RFC] Strided Memory Access Ashutosh, First,
2013 Oct 24
1
[LLVMdev] First attempt at recognizing pointer reduction
On 23 October 2013 23:05, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > A reduction is something like: > > for (i= …) { > r+= a[i]; > } > return r; > Ok, so "reduction" is just a reduction in the map-reduce sense, and nothing else. You don’t need to transform them in the legality phase. Believe me ;). Look > at how we handle stride one
2010 Feb 26
2
[PATCH] renouveau/nv10: remove duplicate vertex buffer registers
NV10TCL defines the vertex buffer registers both as arrays and as individual named registers. This causes duplicate register definitions and the individual registers are not used either by the DDX or by the Mesa driver. Francisco Jerez said to remove them all. Signed-off-by: Luca Barbieri <luca at luca-barbieri.com> --- renouveau.xml | 49
2012 Oct 04
1
[PATCH] gallium/nouveau: use pre-calculated stride for resource_get_handle
Fixes FDO#55294. --- src/gallium/drivers/nv30/nv30_miptree.c | 3 +-- src/gallium/drivers/nv50/nv50_miptree.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/nv30/nv30_miptree.c b/src/gallium/drivers/nv30/nv30_miptree.c index 5a9a63b..9700fa8 100644 --- a/src/gallium/drivers/nv30/nv30_miptree.c +++ b/src/gallium/drivers/nv30/nv30_miptree.c @@ -56,8