thr3ads.net - llvm dev - [llvm-dev] Prologue and epilogue for vectorized code [Apr 2016]

Alex Susu via llvm-dev

2016-Apr-27 22:46 UTC

[llvm-dev] Prologue and epilogue for vectorized code

Hello.
     I'd like to generate a sort of prologue+epilogue for a code block running on a SIMD
architecture, where the block is obtained from the LLVM loop vectorizer. My SIMD processor
receives data from the CPU via DMA transfer and sends it via DMA transfer or a FIFO.
    It is exactly for these transfers that I need to write the prologue+epilogue, which is
relatively simple, e.g. a call to a function like TransferViaDMA().
    Although it doesn't seem to be very difficult, I'm curious what the best way to do it is.

    I haven't found anybody who has written a prologue+epilogue for vector code (obtained
from the loop vectorizer), and although it shouldn't be very different from the
prologue+epilogue for a function call, I'd still like to know the best approach.

    Please let me know what you recommend.

  Thank you,
    Alex

Alex Susu via llvm-dev

2016-May-11 20:44 UTC

[llvm-dev] Inserting function call in LLVM code before vector instruction - do it in LoopVectorize.cpp? [was Re: Prologue and epilogue for vectorized code]

Hello.
     I am coming back to this question, rephrased a bit. Note that I think this question
could also be useful for the NVPTX LLVM back end, once it automatically generates code for
both the CPU and the NVIDIA device and automatically generates the memory transfers with
cudaMemcpy().

     Given LLVM scalar and vector code, I want to generate code for both the scalar CPU and
my research Connex SIMD unit. The CPU and the SIMD unit have different memory spaces, so we
need to transfer memory from the CPU to my Connex SIMD unit, via DMA, to "synchronize" the
two memories.

     Therefore, in the LLVM code with vector instructions I need to add (on the way to code
generation) a call to a function performing the memory transfer from the CPU to my Connex
SIMD unit. More exactly, for the LLVM code below (obtained from LLVM's opt tool):
       ...
       %8 = getelementptr inbounds [10000 x float], [10000 x float]* @A, i64 0, i64 %7
       %9 = bitcast float* %8 to <32 x float>*
       %wide.load = load <32 x float>, <32 x float>* %9, align 4
       [more...]
     I want to add, on the CPU, a call to an external function writeDataToArray(), like this:
         ...
         %8 = getelementptr inbounds [10000 x float], [10000 x float]* @A, i64 0, i64 %7
         %9 = bitcast float* %8 to <32 x float>*
         ; 2nd parameter is the transfer size in bytes,
         ; 3rd parameter is the offset to write at in the local memory of the SIMD unit
         call writeDataToArray(%9, 128, 0)
       and then run only the following code on the SIMD unit:
         %newVar = getelementptr inbounds i32, i32* inttoptr (i64 0 to i32*), i64 0
         %dst = load <32 x float>, <32 x float>* %newVar, align 4
         [more...]
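
     To make the intended split concrete, here is a minimal host-side sketch in C++ that
simulates it. Only the name writeDataToArray() and the meaning of its three arguments come
from the call above. Everything else is an assumption for illustration: the SIMD unit's
local memory is modeled as a plain byte buffer (simdLocalMemory), the DMA transfer as a
memcpy, and the parameter types and the helpers simdWideLoad() and vectorBytes() are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of the SIMD unit's local memory (size chosen arbitrarily).
static std::vector<std::uint8_t> simdLocalMemory(4096);

// Assumed signature: copy sizeInBytes bytes from CPU memory into the SIMD
// unit's local memory at localOffset. A real implementation would start a DMA
// transfer; here a memcpy stands in for it.
void writeDataToArray(const void *src, std::size_t sizeInBytes,
                      std::size_t localOffset) {
  assert(localOffset + sizeInBytes <= simdLocalMemory.size());
  std::memcpy(simdLocalMemory.data() + localOffset, src, sizeInBytes);
}

// Stand-in for the wide load executed on the SIMD unit: read `lanes` floats
// from the simulated local memory starting at localOffset.
std::vector<float> simdWideLoad(std::size_t localOffset, std::size_t lanes) {
  assert(localOffset + lanes * sizeof(float) <= simdLocalMemory.size());
  std::vector<float> dst(lanes);
  std::memcpy(dst.data(), simdLocalMemory.data() + localOffset,
              lanes * sizeof(float));
  return dst;
}

// Transfer size in bytes for one <lanes x float> vector.
constexpr std::size_t vectorBytes(std::size_t lanes) {
  return lanes * sizeof(float);
}
```

     With a 4-byte float, vectorBytes(32) is 32 * 4 = 128, which is where the second
argument of the call comes from. The CPU side would call
writeDataToArray(&A[i], vectorBytes(32), 0), and the SIMD side would then read the same
32 values with simdWideLoad(0, 32).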

     Should I insert this function call in LLVM's
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp, in the method:
        /// Vectorize Load and Store instructions,
        virtual void vectorizeMemoryInstruction(Instruction *Instr) ?
     Or should I do it as a separate LLVM pass, or maybe in the back end?
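
     For the "separate pass" option, the rewrite itself would be small. The sketch below
is not LLVM code: it uses a hypothetical, heavily simplified instruction list (ToyInst) in
plain C++, only to show the shape of the transformation. It walks a block, and before every
vector load it inserts a transfer call carrying the load's pointer, its size in bytes, and
an offset in the SIMD unit's local memory. Handing out offsets sequentially is also an
assumption. A real pass would walk llvm::Instruction objects instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for an IR instruction.
struct ToyInst {
  enum class Kind { Other, VectorLoad, TransferCall };
  Kind kind = Kind::Other;
  std::string pointer;          // pointer operand name, e.g. "%9"
  std::size_t sizeInBytes = 0;  // bytes loaded (VectorLoad) or transferred (TransferCall)
  std::size_t localOffset = 0;  // offset in the SIMD unit's local memory
};

// Return a copy of `block` in which every vector load is preceded by a
// transfer call for the same pointer and size. Local-memory offsets are
// assigned sequentially, starting at 0 (an assumption of this sketch).
std::vector<ToyInst> insertTransfers(const std::vector<ToyInst> &block) {
  std::vector<ToyInst> out;
  std::size_t nextOffset = 0;
  for (const ToyInst &inst : block) {
    if (inst.kind != ToyInst::Kind::VectorLoad) {
      out.push_back(inst);  // scalar code is left untouched
      continue;
    }
    ToyInst call;
    call.kind = ToyInst::Kind::TransferCall;
    call.pointer = inst.pointer;
    call.sizeInBytes = inst.sizeInBytes;
    call.localOffset = nextOffset;
    out.push_back(call);  // CPU side: copy the data to the SIMD unit

    ToyInst load = inst;
    load.localOffset = nextOffset;  // SIMD side: load from local memory
    out.push_back(load);

    nextOffset += inst.sizeInBytes;
  }
  return out;
}
```

     Keeping this logic outside vectorizeMemoryInstruction() would leave LoopVectorize.cpp
unmodified, which is the main attraction of the separate-pass variant in this sketch.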


   Thank you,
     Alex


On 4/28/2016 1:46 AM, Alex Susu wrote:
>    Hello.
>      I'd like to generate a sort of prologue+epilogue for a code block running on a SIMD
> architecture obtained from the LLVM loop vectorizer. My SIMD processor receives data from
> the CPU via DMA transfer and sends it via DMA transfer or a FIFO.
>     It is exactly for these transfers that I need to write the prologue+epilogue -
> relatively simple, e.g. a call to a function like TransferViaDMA().
>     Although it doesn't seem to be very difficult, I'm curious what is the best way to do it.
>
>     I haven't found anybody to write prologue+epilogue for vector code (obtained from the
> loop vectorizer), and although it shouldn't be very different from the prologue+epilogue
> for function call, I'm still curious what's the best way to do it.
>
>     Please let me know what do you recommend.
>
>   Thank you,
>     Alex
