Displaying 20 results from an estimated 11000 matches similar to: "[LLVMdev] [PATCH] Add a Scalarize pass"
2013 Nov 15
2
[LLVMdev] [PATCH] Add a Scalarize pass
Nadav Rotem <nrotem at apple.com> writes:
> On Nov 14, 2013, at 2:32 PM, Richard Sandiford
> <rsandifo at linux.vnet.ibm.com> wrote:
>> Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
>>> Are you worried that adding it to PMB will increase compile time?
>>> The pass exits very early for any target that doesn't opt-in to doing
2013 Nov 15
0
[LLVMdev] [PATCH] Add a Scalarize pass
Hi Richard,
The discussion on llvmpipe is irrelevant. llvmpipe has its own pass manager and optimization pipeline; it is not a C compiler.
Nadav
On Nov 15, 2013, at 3:26 AM, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:
> Nadav Rotem <nrotem at apple.com> writes:
>> On Nov 14, 2013, at 2:32 PM, Richard Sandiford
>> <rsandifo at linux.vnet.ibm.com>
2013 Nov 14
0
[LLVMdev] [PATCH] Add a Scalarize pass
On Nov 14, 2013, at 2:32 PM, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:
> Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
>> Are you worried that adding it to PMB will increase compile time?
>> The pass exits very early for any target that doesn't opt-in to doing
>> scalarisation at the IR level, without even looking at the function.
2013 Nov 14
2
[LLVMdev] [PATCH] Add a Scalarize pass
Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
> Are you worried that adding it to PMB will increase compile time?
> The pass exits very early for any target that doesn't opt-in to doing
> scalarisation at the IR level, without even looking at the function.
As an alternative, adding Scalarizer and InstCombine passes to
SystemZPassConfig::addIRPasses() would probably
2013 Nov 13
0
[LLVMdev] [PATCH] Add a Scalarize pass
Nadav Rotem <nrotem at apple.com> writes:
> Hi Richard,
>
> Thanks for working on this. We should probably move this discussion to
> llvm-dev because it is not strictly related to the patch review
> anymore.
OK, I removed phabricator and llvm-commits.
> The code below is not representative of general C/C++
> code. Usually only domain-specific languages (such as OpenCL)
2013 Nov 14
2
[LLVMdev] [PATCH] Add a Scalarize pass
Hi Richard,
Thanks for working on this. Comments below.
> I don't understand the basis for the last statement though. Do you mean
> that you think most cases produce better code if scalarised at the SD stage
> rather than at the IR level? Could you give an example?
You presented an example that shows that scalarizing vectors allows further optimizations. But I don’t think that
2013 Nov 14
0
[LLVMdev] [PATCH] Add a Scalarize pass
Nadav Rotem <nrotem at apple.com> writes:
>> I don't understand the basis for the last statement though. Do you mean
>> that you think most cases produce better code if scalarised at the SD stage
>> rather than at the IR level? Could you give an example?
>
> You presented an example that shows that scalarizing vectors allows
> further optimizations. But I
2013 Oct 25
3
[LLVMdev] Is there pass to break down <4 x float> to scalars
Hi, Richard,
I think we are solving the same problem. I am working on a shader language
too. I am not satisfied with the current binaries because vector operations
are kept in llvm opt.
The GLSL shader language has an operation called "swizzle". It can select
sub-components of a vector. If a shader only takes components "xy" of a
vec4, it's certainly wasteful to generate 4
2013 Oct 25
0
[LLVMdev] Is there pass to break down <4 x float> to scalars
Liu Xin <navy.xliu at gmail.com> writes:
> I think we are solving the same problem. I am working on a shader language
> too. I am not satisfied with the current binaries because vector operations
> are kept in llvm opt.
>
> The GLSL shader language has an operation called "swizzle". It can select
> sub-components of a vector. If a shader only takes components "xy"
2013 Oct 25
3
[LLVMdev] Is there pass to break down <4 x float> to scalars
Hi, LLVM community,
I wrote some code by hand in LLVM IR. For simplicity, I wrote it using
<4 x float>. Now I have found that some of the element stores are useless.
For example, if I store {0.0, 1.0, 2.0, 3.0} to a <4 x float> %a, maybe
only %a.xy is live in my program. Our target doesn't feature SIMD
instructions, which means we have to lower vectors to many scalar
instructions. I found
2013 Oct 25
3
[LLVMdev] Is there pass to break down <4 x float> to scalars
On 25 October 2013 11:06, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:
> I wanted the same thing for SystemZ, which doesn't have vectors,
> in order to improve the llvmpipe code.
>
Hi Richard,
This is a nice patch. I was wondering how hard it'd be to do that, and it
seems that you're catching lots of corner cases.
My interest is also due to converting odd
2013 Oct 25
0
[LLVMdev] Is there pass to break down <4 x float> to scalars
Liu Xin <navy.xliu at gmail.com> writes:
> Hi, LLVM community,
>
> I wrote some code by hand in LLVM IR. For simplicity, I wrote it using
> <4 x float>. Now I have found that some of the element stores are useless.
>
> For example, if I store {0.0, 1.0, 2.0, 3.0} to a <4 x float> %a, maybe
> only %a.xy is live in my program. Our target doesn't feature SIMD
>
2013 Oct 25
0
[LLVMdev] Is there pass to break down <4 x float> to scalars
Renato Golin <renato.golin at linaro.org> writes:
> On 25 October 2013 11:06, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:
>> It would also need some TargetTransformInfo hooks to decide which
>> vectors should be decomposed.
>
> If I got it right, this may not be necessary, or it may even be harmful.
>
> Say you decide that <4 x i32> vectors
2013 Oct 30
2
[LLVMdev] Is there pass to break down <4 x float> to scalars
Hi, Richard,
Your decompose-vector patch works perfectly on my side. Unfortunately, I
still get stupid code because llvm '-dse' fails followed by
'decompose-vector'.
I read the DSE code and it is definitely capable of eliminating unused
memory stores if its AA works. I don't think basic AA works for me. I
found my program has complex memory accesses, such as bi-dimensional
2016 Feb 09
2
Vectorization with fast-math on irregular ISA sub-sets
----- Original Message -----
> From: "James Molloy" <James.Molloy at arm.com>
> To: "Renato Golin" <renato.golin at linaro.org>
> Cc: "Nadav Rotem" <nrotem at apple.com>, "Arnold Schwaighofer" <aschwaighofer at apple.com>, "Hal Finkel"
> <hfinkel at anl.gov>, "LLVM Dev" <llvm-dev at
2013 Oct 25
2
[LLVMdev] Is there pass to break down <4 x float> to scalars
Hi,
Great to see someone working on this. This will benefit the performance
portability goal of pocl's OpenCL kernel compiler. It has been one of
the low-hanging fruits in improving its implicit WG vectorization
applicability.
The use case there is that sometimes it makes sense to devectorize
the explicitly used vector datatype code of OpenCL kernels in order to make
better opportunities
2016 Feb 09
2
Vectorization with fast-math on irregular ISA sub-sets
----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "James Molloy" <James.Molloy at arm.com>, "Nadav Rotem" <nrotem at apple.com>, "Arnold Schwaighofer"
> <aschwaighofer at apple.com>, "LLVM Dev" <llvm-dev at
2015 Aug 19
3
Code owner for the scalarizer
We should find a code owner for the scalarizer
(lib/Transforms/Scalar/Scalarizer.cpp).
I nominate Richard Sandiford, who added it in r195471.
Let me know what you think.
Thanks,
Hans
2015 Aug 19
2
Code owner for the scalarizer
I think the people that have been in there the most are Hal and Chandler of
late with a few others here and there.
-eric
On Wed, Aug 19, 2015 at 10:30 AM Chris Lattner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> > On Aug 19, 2015, at 9:50 AM, Hans Wennborg <hans at chromium.org> wrote:
> >
> > We should find a code owner for the scalarizer
> >
2013 Oct 25
0
[LLVMdev] Is there pass to break down <4 x float> to scalars
Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> writes:
> E.g., the last time I checked, the inner loop vectorizer (which pocl exploits)
> just refused to vectorize loops with vector instructions. It might not
> be so drastic with the SLP or the BB vectorizer, but in general, it might
> make sense to let the vectorizer make the decisions on how to map the
> parallel