thr3ads.net - search: "susu"

Displaying 20 results from an estimated 76 matches for "susu".

2016 Jun 07

[LLVMdev] LLVM loop vectorizer

Hi Alex, This has been very recently fixed by Hal. See http://reviews.llvm.org/rL270771 Adam > On Jun 4, 2016, at 3:13 AM, Alex Susu via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hello. > Mikhail, I come back to this older thread. > I need to do a few changes to LoopVectorize.cpp. > > One of them is related to figuring out the exact C source line and column number of the loops being vec...

2016 Feb 18

[LLVMdev] LLVM loop vectorizer

...like this. Also, one related thought: it might be worth making it a separate pass, not a part of the loop vectorizer. LLVM already has several 'utility' passes (e.g. loop rotation), which primarily aim at enabling other passes. Thanks, Michael > On Feb 15, 2016, at 6:44 AM, RCU <alex.e.susu at gmail.com> wrote: > > Hello, Michael. > I come back to this older email. Sorry if you receive it again. > > I am trying to implement coalescing/collapsing of nested loops. This would be clearly beneficial for the loop vectorizer, also. > I'm normally planning...

2016 Jun 04

[LLVMdev] LLVM loop vectorizer

...ually not vectorizer related. Vectorizer just uses data provided by other passes. What you probably might want is to look into routine Loop::getStartLoc() (see lib/Analysis/LoopInfo.cpp). If you find a way to improve it, patches are welcome:) Thanks, Michael > On Jun 3, 2016, at 6:13 PM, Alex Susu <alex.e.susu at gmail.com> wrote: > > Hello. > Mikhail, I come back to this older thread. > I need to do a few changes to LoopVectorize.cpp. > > One of them is related to figuring out the exact C source line and column number of the loops being vectorized. I'...

2015 Jul 08

[LLVMdev] LLVM loop vectorizer

Hello. I am trying to vectorize a CSR SpMV (sparse matrix vector multiplication) procedure but the LLVM loop vectorizer is not able to handle such code. I am using clang and LLVM version 3.4 (on Ubuntu 12.10). I use the -fvectorize option with clang and -loop-vectorize with opt-3.4. The CSR SpMV function is inspired from
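For readers unfamiliar with the kernel in question, a minimal sketch of a CSR SpMV routine in C is given here. It is not the poster's code (the snippet is cut off before its source is named); the function name spmv_csr and its parameter names are hypothetical. The two features that plausibly make such a loop hard to vectorize are visible in it: the inner loop bounds come from row_ptr at run time, and x is read through col_idx, which is an indirect (gather-style) access.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y = A * x, where A has n_rows rows and is stored in CSR form:
 *   row_ptr[i] .. row_ptr[i + 1] - 1 index the nonzeros of row i,
 *   col_idx[k] is the column of nonzero k, val[k] is its value.
 * All names here are illustrative assumptions, not taken from the thread. */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    assert(n_rows == 0 || row_ptr != NULL);

    for (size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        /* The inner trip count is data dependent: it is read from row_ptr. */
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            /* x is indexed through col_idx, i.e. an indirect load. */
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

Called on the 2x2 matrix [[1, 2], [0, 3]] (row_ptr = {0, 2, 3}, col_idx = {0, 1, 1}, val = {1, 2, 3}) with x = {1, 1}, the routine writes 3 into both entries of y.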

2016 Dec 06

Immediate operand for vector instructions

Hi Alex, On 5 December 2016 at 18:00, Alex Susu <alex.e.susu at gmail.com> wrote: > We can compile it. Note that this is the only compilable code w.r.t. > using i64 or i64imm (in the 2 lines above: "dag InOperandList", "list<dag> > Pattern"). Yeah, you actually want to use "imm": list<...

2017 Jul 28

Addressing TableGen's error "Ran out of lanemask bits" in order to use more than 32 subregisters per register

...using old LLVM sources---changing this many files for supporting a different width LaneBitmask is no longer necessary. Also, boost is not a current requirement for building LLVM and it's unlikely that requiring it for that purpose alone is justified. -Krzysztof On 7/28/2017 6:30 AM, Alex Susu via llvm-dev wrote: > Hello. > I come back to this older thread. > > As I've said before, I managed to patch the various files from the > back end related to lanemask in order to support at most 1024 vector > lanes. For this I am using a 1024-bit long lanemask...

2016 Aug 01

LLVM Loop vectorizer - 2 vector.body blocks appear

Hello. Mikhail, with the more recent version of the LoopVectorize.cpp code (retrieved at the beginning of July 2016) I ran the following piece of C code: void foo(long *A, long *B, long *C, long N) { for (long i = 0; i < N; ++i) { C[i] = A[i] + B[i]; } } The vectorized LLVM program I obtain contains 2 vector.body blocks - one named

2017 Feb 10

Specify special cases of delay slots in the back end

...ing well with the post-RA scheduler? Otherwise, if the post RA scheduler only inserts NOPs, since I have issues using it, I could as well insert NOPs in the [Target]AsmPrinter.cpp module. Thank you, Alex On 2/10/2017 1:42 AM, Hal Finkel wrote: > > On 02/09/2017 04:46 PM, Alex Susu via llvm-dev wrote: >> Hello. >> Hal, thank you for the information. >> I managed to get inspired from PPCHazardRecognizers.cpp. So I created my very simple >> [Target]HazardRecognizers.cpp pass that is also derived from ScoreboardHazardRecognizer. >> My clas...

2016 Oct 24

Accessing the associated LLVM IR Instruction for an SDNode used in instruction selection (back end)

...ur use case for that? > > Generally speaking I would recommend against doing that. When the SDBuilder is done, I > would expect the SDNodes to not query anything outside of the SD layer. We are not here > now, though. > > Cheers, -Quentin >> On Oct 21, 2016, at 4:57 AM, Alex Susu via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> Hello. I would like to access the LLVM IR Instruction from which an SDNode (from >> SelectionDAG) originates. For this I have modified: - >> llvm/lib/CodeGen/SelectionDAGISel.cpp, SelectionDAGISel::SelectBasicBlock...

2016 Sep 18

Addressing TableGen's error "Ran out of lanemask bits" in order to use more than 32 subregisters per register

...<unsigned char> &Sig). Is there any reason these enum IIT_Info ( IIT_V128, IIT_V256) are not added in file /IntrinsicEmitter.cpp? Thank you, Alex On Tue, Sep 13, 2016 at 1:47 AM, Matthias Braun <mbraun at apple.com> wrote: > > > On Sep 8, 2016, at 6:37 AM, Alex Susu via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > > > Hello. > > In my TableGen back end description I need to use more than 32 (e.g., > 128, 1024, etc) subregisters per register for my research SIMD processor. I > have used so far with success 32 subregis...

2017 Jul 28

Addressing TableGen's error "Ran out of lanemask bits" in order to use more than 32 subregisters per register

...eGen/MachineValueType.h [repository]/llvm/utils/TableGen/CodeGenTarget.cpp Please let me know if you want to commit these changes also - they are rather complex in the sense there are a lot of small dependencies for these types. Best regards, Alex On 9/20/2016 12:48 PM, Alex Susu wrote: > Hello. > I managed to use SIMD units with more than 32 lanes (32 subregisters per vector > register) in TableGen, llc and opt. For example, I use SIMD instructions with types > v128i16 and v512i16. > > An important question I have is if it is OK to add the type...

2016 Oct 21

Prioritizing an SDNode for scheduling

...Candidate method which compares two instructions that can be legally > scheduled and decides which of the two should be scheduled. Currently these > methods are target independent. > > The correctness question still remains open for me. > > > On Thu, Oct 20, 2016 at 8:08 PM, Alex Susu via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Hello. >> Is there a way to specify in the back end an (ISD::INLINEASM) SDNode >> to be scheduled first under all circumstances? I need to specify something >> like node priority to schedule the node...

2017 Feb 11

Specify special cases of delay slots in the back end

...Arch64 is using it. Thank you, Alex On 2/10/2017 11:33 PM, Hal Finkel wrote: > Hi Alex, > > All of this makes sense, but are you correctly handling the Stalls argument to > getHazardType? What are you doing with it? > > -Hal > > > On 02/10/2017 02:42 PM, Alex Susu via llvm-dev wrote: >> Hello. >> I am progressing a bit with difficulty with the post RA scheduler >> (PostRASchedulerList.cpp with ScoreboardHazardRecognizer) - the problem I have is that >> it doesn't advance at the next available instruction when the overridden &...

2016 Aug 21

LoopVectorize module - some possible enhancements

Hello, Michael, I'd like to ask if we can enhance the LoopVectorize LLVM module (I am currently using a version from Jul 2016). More exactly: - do you envision to support in the near future LLVM IR gather and scatter intrinsics (as described at http://llvm.org/docs/LangRef.html#llvm-masked-gather-intrinsics and scatter)? I see you have defined some methods that should

2018 Jul 07

LoopVectorize fails to vectorize more complex loops

Hello. Could you please tell me why the first loop of the following program (also maybe the commented loop) doesn't get vectorized with LoopVectorize (from a recent LLVM build from the SVN repository from Jun 2018)? typedef short TYPE; TYPE data[1400][1200]; void kernel_covariance(int m, int n, TYPE mean[1200]) { int i, j, k; for (j = 0; j < m; j++) { mean[j] =

2017 Feb 09

Specify special cases of delay slots in the back end

...rote: > Hi Alex, > > You can program a post-RA scheduler which will return NoopHazard in the appropriate > circumstances. You can look at the PowerPC target (e.g. > lib/Target/PowerPC/PPCHazardRecognizers.cpp) as an example. > > -Hal > > > On 02/02/2017 05:03 PM, Alex Susu via llvm-dev wrote: >> Hello. >> I see there is little information on specifying instructions with delay slots. >> So could you please tell me how can I insert NOPs (BEFORE or after an instruction) >> or how to make an aware instruction scheduler in order to avoid...

2016 Dec 09

TableGen - Help to implement a form of gather/scatter operations for Mips MSA

Hello. I read on page 4 of http://www.cs.fsu.edu/~whalley/cda5155/chap4.pdf that gather and scatter operations exist for Mips, named LVI and SVI, respectively. Did anyone think of implementing in the LLVM Mips back end (part of the MSA vector instructions) gather and scatter operations? If so, can you share with me the TableGen spec? (I tried to start from LD_DESC_BASE, but it

2017 Feb 12

Pre-RA scheduler does not generate NOPs when getHazardType() returns NoopHazard

Hello. I am new to the schedulers implemented in the back end of LLVM. I am trying to handle data hazards in my simple processor, with instructions that execute in 1 cycle. I have tried the standard post-RA scheduler, implemented in lib/CodeGen/PostRASchedulerList.cpp, (with a ScoreboardHazardRecognizer), but I have some issues with some consecutive instructions that are

2017 May 18

Computing loop trip counts with Scalar evolution

Hello. I tried to get the trip count of a loop with Scalar evolution. I got inspired from http://stackoverflow.com/questions/13834364/how-to-get-loop-bounds-in-llvm . However the analysis described there doesn't work well for the second inner loop of the function below (although if we declare Bcols a short it works well): void MatMul(int Arows, int Acols, int Brows, int

2017 May 21

Handling native i16 types in clang and opt

Hello. My target architecture supports natively 16 bit integers (i16). Whenever I write in C programs using only short types, clang compiles the program to LLVM and converts the i16 data to i32 to perform arithmetic operations and then truncates the results to i16. Then, the InstructionCombining (INSTCOMBINE or IC) pass removes these conversions back and forth from i16, except for
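The widening described above follows from C's integer promotion rule: operands of type short are converted to int before arithmetic, so clang emits the extend/add/truncate sequence that the language semantics require. The small C sketch below illustrates that rule only; the helper names short_sum_size and add_short are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Returns the size of the type in which "short + short" is evaluated.
 * After integer promotion the expression a + b has type int. */
size_t short_sum_size(void)
{
    short a = 1, b = 2;
    return sizeof(a + b);
}

/* Spells out what "c = a + b" means for shorts: widen both operands,
 * add in int, then narrow the result back to short. */
short add_short(short a, short b)
{
    int wide = (int)a + (int)b;
    /* The narrowing conversion only preserves the value when it fits. */
    assert(wide >= SHRT_MIN && wide <= SHRT_MAX);
    return (short)wide;
}
```

On a target where int is 32 bits wide, the two casts to int correspond to the i16-to-i32 extensions and the final cast to the truncation mentioned in the post.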
