Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] Register based vector insert/extract"
2007 Apr 23
2
[LLVMdev] Register based vector insert/extract
On Apr 23, 2007, at 12:31 PM, Chris Lattner wrote:
> On Mon, 23 Apr 2007, Christopher Lamb wrote:
>> How can one let the back end know how to insert and extract
>> elements of
>> a vector through sub-register copies? I'm at a loss how to do this...
>
> You probably want to custom lower the insertelement/extractelement
> operations for the cases you support.
2007 Apr 23
0
[LLVMdev] Register based vector insert/extract
On Mon, 23 Apr 2007, Christopher Lamb wrote:
> How can one let the back end know how to insert and extract elements of
> a vector through sub-register copies? I'm at a loss how to do this...
You probably want to custom lower the insertelement/extractelement
operations for the cases you support. Take a look at
X86TargetLowering::LowerEXTRACT_VECTOR_ELT for some examples of how to do
2007 Apr 23
0
[LLVMdev] Register based vector insert/extract
On Apr 23, 2007, at 1:17 PM, Christopher Lamb wrote:
>
> On Apr 23, 2007, at 12:31 PM, Chris Lattner wrote:
>
>> On Mon, 23 Apr 2007, Christopher Lamb wrote:
>>> How can one let the back end know how to insert and extract
>>> elements of
>>> a vector through sub-register copies? I'm at a loss how to do
>>> this...
>>
>> You
2007 Apr 23
2
[LLVMdev] Register based vector insert/extract
On Apr 23, 2007, at 1:43 PM, Christopher Lamb wrote:
> On Apr 23, 2007, at 1:17 PM, Christopher Lamb wrote:
>
>> On Apr 23, 2007, at 12:31 PM, Chris Lattner wrote:
>>
>>> On Mon, 23 Apr 2007, Christopher Lamb wrote:
>>>> How can one let the back end know how to insert and extract
>>>> elements of
>>>> a vector through sub-register
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The following IR implements the following nested loop:
for (int i = start ; i < end ; ++i )
for (int p = 0 ; p < 4 ; ++p )
a[i*4+p] = b[i*4+p] + c[i*4+p];
define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float*
noalias %arg4, float* noalias %arg5, float* noalias %arg6) {
entrypoint:
br i1 %arg2, label %L0, label %L1
L0:
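As a point of comparison for the IR excerpt, the nested loop quoted in this message can be written as a small standalone C function. This is only a sketch: the names `add_blocks` and `add_flat` and the use of `long` indices are assumptions made for illustration and do not come from the original post. The second function is the flattened form, which touches the same elements in the same order, so a vectorizer can in principle treat the access pattern as contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference for the nested loop in the post:
   for each i in [start, end), add four consecutive floats of b and c. */
void add_blocks(float *restrict a, const float *restrict b,
                const float *restrict c, long start, long end) {
    assert(a != NULL && b != NULL && c != NULL);
    for (long i = start; i < end; ++i)
        for (int p = 0; p < 4; ++p)
            a[i * 4 + p] = b[i * 4 + p] + c[i * 4 + p];
}

/* Hypothetical flattened form: the index i*4+p simply walks
   k = start*4 .. end*4-1 in order, so one loop covers the same elements. */
void add_flat(float *restrict a, const float *restrict b,
              const float *restrict c, long start, long end) {
    assert(a != NULL && b != NULL && c != NULL);
    for (long k = start * 4; k < end * 4; ++k)
        a[k] = b[k] + c[k];
}
```

Both functions write identical results for the same inputs. Whether a given LLVM version emits shuffle-free vector code for either form depends on the pass pipeline and is not something this sketch demonstrates.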
2013 Nov 06
0
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The loop vectorizer relies on cleanup passes to be run after it:
from Transforms/IPO/PassManagerBuilder.cpp:
// Add the various vectorization passes and relevant cleanup passes for
// them since we are no longer in the middle of the main scalar pipeline.
MPM.add(createLoopVectorizePass(DisableUnrollLoops));
MPM.add(createInstructionCombiningPass());
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The instcombine pass cleans up a lot.
Any idea why there are still shufflevector, insertelement, *and* bitcast
(!!) etc. instructions left? The original loop is so clean, a textbook
example I'd say. There is no need to shuffle anything. At least I don't
see it.
Frank
vector.ph: ; preds = %L5
%broadcast.splatinsert1 = insertelement <4 x
2007 Apr 24
2
[LLVMdev] Register based vector insert/extract
On Apr 23, 2007, at 8:22 PM, Evan Cheng wrote:
>
> On Apr 23, 2007, at 4:07 PM, Christopher Lamb wrote:
>
>> Thanks for the detailed response.
>>
>> On Apr 23, 2007, at 4:22 PM, Chris Lattner wrote:
>>
>>> Right. Evan is currently focusing on getting the late stages of
>>> the code
>>> generator (e.g. livevars) to be able to understand
2007 Apr 24
0
[LLVMdev] Register based vector insert/extract
On Apr 23, 2007, at 4:07 PM, Christopher Lamb wrote:
> Thanks for the detailed response.
>
> On Apr 23, 2007, at 4:22 PM, Chris Lattner wrote:
>
>> Right. Evan is currently focusing on getting the late stages of
>> the code
>> generator (e.g. livevars) to be able to understand arbitrary machine
>> instrs in the face of physreg subregs. This lays the
2007 Apr 23
2
[LLVMdev] Register based vector insert/extract
Thanks for the detailed response.
On Apr 23, 2007, at 4:22 PM, Chris Lattner wrote:
> Right. Evan is currently focusing on getting the late stages of
> the code
> generator (e.g. livevars) to be able to understand arbitrary machine
> instrs in the face of physreg subregs. This lays the groundwork for
> handling vreg subregs, but won't solve it directly.
Is the work Evan
2008 Apr 27
3
[LLVMdev] Two new 'llvmnotes'
On Apr 27, 2008, at 12:49 PM, Nick Lewycky wrote:
> Chris Lattner wrote:
>> On Apr 27, 2008, at 10:58 AM, Talin wrote:
>>
>>> I would certainly make use of this in my frontend.
>>>
>>> I suggest the names "getfield" and "setfield" for the two
>>> operations,
>>>
>>
>> I agree that
2014 Mar 17
2
[LLVMdev] Improving SLPVectorizer for Julia
I'm working on some small improvements to SLPVectorizer.cpp so that it can deal with some tuple operations arising from Julia code. Being fairly new to LLVM, I could use some advice, particularly from those familiar with the internals of SLPVectorizer.
The motivation can be found in the Julia discussion https://github.com/JuliaLang/julia/issues/5857 . Here is an example of the kind of LLVM
2013 Feb 04
0
[LLVMdev] Vectorizer using Instruction, not opcodes
Hi all,
My take on this is that, as you state below, at the IR level we are only roughly estimating cost, at best (or we would have to lower the code and then estimate cost - something we don't want to do).
I would propose estimating the "worst case costs" and seeing how far we get with this. My rationale here is that we don't want vectorization to decrease performance relative
2013 Feb 04
6
[LLVMdev] Vectorizer using Instruction, not opcodes
On 4 February 2013 18:25, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> For cases where this approach breaks really badly we could consider adding
> a specialized api or parameters (like the type of a user/use). But we
> should do so only as a last resort and backed by actual code that would
> benefit from doing so.
>
Very sensible, more or less what I had in
2013 Feb 04
2
[LLVMdev] Vectorizer using Instruction, not opcodes
Hi folks,
I've been thinking about how to implement some of the costs, and there are a lot
of instructions whose cost depends on other instructions around them. Casts are
one obvious case, since arithmetic and memory instructions can, sometimes,
cast values for free.
The cost model receives Opcodes, which lose the info on the history of the
values being vectorized, and I thought we could pass the whole
2008 Apr 27
0
[LLVMdev] Two new 'llvmnotes'
On 2008-04-27, at 15:56, Chris Lattner wrote:
> On Apr 27, 2008, at 12:49 PM, Nick Lewycky wrote:
>
>> Chris Lattner wrote:
>>
>>> On Apr 27, 2008, at 10:58 AM, Talin wrote:
>>>
>>>> I would certainly make use of this in my frontend.
>>>>
>>>> I suggest the names "getfield" and "setfield" for the two
2012 Aug 10
2
[LLVMdev] GVN miscompile debugging help
I found a case where GVN seems to miscompile an OpenCL program. What I am trying to figure out is given a bitcode file, how can I reduce it to a simpler case with bugpoint when I don't have a valid reference compiler available.
Thanks for any tips,
Micah
2014 Jul 22
2
[LLVMdev] InsertElementInst and ExtractElementInst
Hello,
I am creating a <3 x i32> vector in LLVM IR. Then I insert 3 instructions
and later on I try to load one instruction from the vector. The
insertion seems to work; however, when I try to load a specific
instruction from a vector, it seems that it does not work.
This is the part of my IR:
%"ins or1" = insertelement <3 x i32> undef, i32 %38, i32 0
%"ins and2"
2007 Sep 27
3
[LLVMdev] Vector swizzling and write masks code generation
Hey,
as some of you may know, we're in the process of experimenting with LLVM in
Gallium3D (Mesa's new driver model), where LLVM would be used both in the
software only (by just JIT executing shaders) and hardware (drivers will
implement LLVM code-generators) cases.
While the software only case is pretty straightforward, I just realized I
missed something in my initial evaluation.
That
2012 Feb 28
1
[LLVMdev] How to vectorize a vector type cast?
Since Clang does not seem to allow type casts, such as uchar4 to float4, between vector types, it seems it is necessary to write them as element by element conversions, such as
typedef float float4 __attribute__((ext_vector_type(4)));
typedef unsigned char uchar4 __attribute__((ext_vector_type(4)));
float4 to_float4(uchar4 in)
{
float4 out = {in.x, in.y, in.z, in.w};
return out;
}
Running