thr3ads.net - search: "madhur13490"

Displaying 18 results from an estimated 18 matches for "madhur13490".

2020 Oct 13

Manipulating DAGs in TableGen

On Tue, Oct 13, 2020 at 10:47 AM Madhur Amilkanthwar <madhur13490 at gmail.com> wrote:
> What do you guys think about the below enhancements?
>
> 5. !getdagrestype(dag [, index]) - Returns type of result value. If the DAG computes multiple values then return type of 'index'th result.
>
> 6. !setdagrestype(dag target_dag, type T [, index]...

2018 Dec 11

Automatic GPU Code Generation

...tically or by directives without involvement of CUDA. In other words, I am talking about avoiding the source-to-source compiler approach, where C code is converted automatically into CUDA; instead, I mean converting C code directly to PTX assembly.
On Tue, Dec 11, 2018 at 12:19 PM Madhur Amilkanthwar <madhur13490 at gmail.com> wrote:
> You can skip CUDA code generation and target PTX assembly. PTX is a common assembly language for NVIDIA's GPU. You may want to look at PPCG, Pluto projects to get a hint of how automatic CUDA code can be generated by compilers. They are based on poly...

2017 Jul 31

LLVM's loop strength reduction module

...you would not get a solution from it, right? Thanks. Regards, Venugopal Raghavan
From: qcolombet at apple.com [mailto:qcolombet at apple.com]
Sent: Friday, July 07, 2017 2:16 AM
To: Raghavan, Venugopal <Venugopal.Raghavan at amd.com>
Cc: llvm-dev at lists.llvm.org; Madhur Amilkanthwar <madhur13490 at gmail.com>
Subject: Re: [llvm-dev] LLVM's loop strength reduction module
Hi Raghavan, I concur no specific docs. What do you want to know specifically? Cheers, -Quentin
On Jul 5, 2017, at 11:16 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at li...

2017 Jul 06

LLVM's loop strength reduction module

Hi Raghavan, I concur no specific docs. What do you want to know specifically? Cheers, -Quentin
> On Jul 5, 2017, at 11:16 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> AFAIK, no official doc.
> You can probably get better help if you ask specific questions (which part of the code you don't understand).
>
> On Thu, Jul 6, 2017 at 9:53

2017 Nov 08

Debug info for Cuda

...troubles, I'm just saying that it has some features and we have some problems to be solved. But the lack of labels and label arithmetic in DWARF sections is the real problem, because LLVM actively uses them in DWARF sections. Best regards, Alexey Bataev
On Nov 8, 2017, at 5:35, Madhur Amilkanthwar <madhur13490 at gmail.com<mailto:madhur13490 at gmail.com>> wrote:
I don't understand the use case and reasons to blame the PTXAS compiler here.
>> a) Supports DWARF-2 only.
What would you like to achieve with DWARF-3+ that you cannot do with DWARF-2?
>> b) Labels are allowed only in...

2018 Dec 11

Automatic GPU Code Generation

Hello, I need to ask: like automatic compiler vectorization, can GPU ISA be generated automatically, skipping CUDA programming? For instance, if I just write C code, there can be two possibilities: semi-automatic and fully automatic. In the semi-automatic case, we can write #pragma directives to say that this should run on the GPU. Can the compiler then generate GPU ISA directly, skipping CUDA code? In case of fully

2018 Apr 29

FYI, planning to enable nontrivial loop unswitch in the new PM at O3

Is there any written description of what "non-trivialness" means here?
On Sun, Apr 29, 2018, 2:49 PM Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> One of the last big missing pieces for the new PM is enabling non-trivial loop unswitch at O3.
>
> The pass is now working well and passing all the testing I have done as well as some others'

2018 Apr 29

FYI, planning to enable nontrivial loop unswitch in the new PM at O3

One of the last big missing pieces for the new PM is enabling non-trivial loop unswitch at O3. The pass is now working well and passing all the testing I have done as well as some others' testing (thanks Fedor!) so it should be ready to be enabled. I've done preliminary benchmarking on the test suite and SPEC and haven't seen any interesting regressions and quite a few improvements.

2016 Jul 08

Running verify between every opt pass?

Hi, is there any easy way to run the verifier between each pass in opt if I run e.g. opt -O3 foo.ll -o foo.opt.ll? If I add -verify after -O3, I get one invocation of the verifier first in the FunctionPass manager and then two (!) runs of the verifier after all other passes are run. Then I saw the flag -verify-each, which sounds promising; the help text says - Verify after each

2016 Nov 15

how to prevent LLVM back-end from reordering instructions at instruction scheduling?

Setting the MI as isTerminator should have the same impact, yes? I'm not sure of the other consequences of this though, if any, have to look into it. Thanks. -Ryan
On Tue, Nov 15, 2016 at 5:18 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> You can override TargetInstrInfo::isSchedulingBoundary for that.
>
> -Krzysztof
>
> On 11/15/2016 4:13 PM, Ryan

2017 Nov 06

Debug info for Cuda

On 06.11.2017 14:56, Robinson, Paul wrote:
>> Hi everybody,
>> As you know, Cuda/NVPTX target has very limited support of the debug info in Clang/LLVM. Currently, LLVM supports only emission of the line numbers debug info.
>> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to

2020 Oct 12

Manipulating DAGs in TableGen

I understood that the name is a matching tag for the operand and not its name (as in named macro or function arguments). However, I was assuming that the names in any one DAG node had to be unique and so could serve as selectors for operands. But a quick investigation shows that I was wrong: names can be duplicated in the same node. So DAG indexes are integers only. At 10/12/2020 01:46 PM,
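A minimal sketch of the point made in this message, assuming a toy Python model of a DAG node. The `Operand` and `DagNode` types and the `positions_of` helper are hypothetical illustrations, not TableGen API. Because one name can tag several operands in the same node, a lookup by name may return more than one position, while an integer index always selects exactly one operand.

```python
from typing import List, NamedTuple, Optional, Tuple


class Operand(NamedTuple):
    """One DAG operand: a value plus an optional matching tag (hypothetical model)."""
    value: str
    name: Optional[str]


class DagNode(NamedTuple):
    """A DAG node: an operator followed by an ordered tuple of operands."""
    operator: str
    operands: Tuple[Operand, ...]


def positions_of(node: DagNode, name: str) -> List[int]:
    """Return every operand position tagged with `name` (there may be several)."""
    return [i for i, op in enumerate(node.operands) if op.name == name]


# The tag "src" is deliberately duplicated within a single node.
node = DagNode(
    "add",
    (Operand("GPR", "src"), Operand("GPR", "src"), Operand("i32imm", "imm")),
)

matches = positions_of(node, "src")  # ambiguous: more than one position
second = node.operands[1]            # unambiguous: integer index
```

In this model a name works as a filter rather than a key, which is why the integer position is the only reliable selector.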

2017 Jun 14

[CUDA] Lost debug information when compiling CUDA code

Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated

2020 Aug 22

Looking for suggestions: Inferring GPU memory accesses

Hi all, As part of my research I want to investigate the relation between the grid's geometry and the memory accesses of a kernel in common GPU benchmarks (e.g., Rodinia, Polybench, etc.). As a first step I want to answer the following question: - Given a kernel function with M possible memory accesses, for how many of those M accesses can we statically infer the location given concrete values

2020 Jul 16

[RFC] Pass return status

> Out of curiosity, does change here include changes to names, and other semantically-irrelevant changes (e.g., changing the order of operands in a PHI)?
The hashing function used to detect changes is currently very simple: it only accounts for instruction opcode and order. So some semantically-irrelevant changes are ignored (as well as some relevant changes), and some are not. Permuting two
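A rough sketch of what a hash over "instruction opcode and order" implies, assuming a toy Python instruction model. The `Instr` type and `opcode_order_hash` function are hypothetical and are not the hashing code from the RFC. The hash sees only the sequence of opcodes, so renaming values, permuting operands, or changing a constant leaves it unchanged, while reordering instructions changes it.

```python
import hashlib
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Toy instruction: opcode, result name, operand names (hypothetical model)."""
    opcode: str
    result: str
    operands: Tuple[str, ...]


def opcode_order_hash(instrs: List[Instr]) -> str:
    """Hash only the opcode sequence; result names and operands are ignored."""
    h = hashlib.sha256()
    for ins in instrs:
        h.update(ins.opcode.encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent opcodes cannot run together
    return h.hexdigest()


base = [
    Instr("phi", "%x", ("%a", "%b")),
    Instr("add", "%y", ("%x", "1")),
    Instr("ret", "", ("%y",)),
]
# Renamed values: not detected.
renamed = [
    Instr("phi", "%p", ("%a", "%b")),
    Instr("add", "%q", ("%p", "1")),
    Instr("ret", "", ("%q",)),
]
# Operands of the PHI permuted: not detected.
phi_swapped = [Instr("phi", "%x", ("%b", "%a")), base[1], base[2]]
# A changed constant operand (a relevant change): also not detected.
new_constant = [base[0], Instr("add", "%y", ("%x", "2")), base[2]]
# Two instructions reordered: detected.
reordered = [base[1], base[0], base[2]]
```

Comparing `opcode_order_hash(base)` against the hash of each variant shows which kinds of edits such a scheme would report as a change.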

2020 May 30

Dynamically determine the CostPerUse value in the register allocator.

I don't know the history behind the term CostPerUse, so I may be missing the background associated with it. It seems to be a misnomer for what it is intended to do. At first sight, the term indicates that the cost is a function of the uses of the register: the more uses, the higher the cost. How do we want to define the value of CostPerUse? Should it be a function of uses, or just of the target? On Sat, May 30,

2017 Apr 12

LLVM is getting faster, April edition

Hi, It's been a while since I sent the last compile time report [1], where it was shown that LLVM was getting slower over time. But now I'm happy to bring some good news: finally, LLVM is getting faster, not slower :) *** Current status *** Many areas of LLVM have been examined and improved since then: InstCombine, SCEV, APInt implementation, and that resulted in almost 10% improvement

2017 Apr 18

LLVM is getting faster, April edition

> On Apr 11, 2017, at 10:25 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I am interested in knowing more.
> 1. What benchmarks does LLVM community use for compile-time study? I see CTMark, but is that the only one being analyzed?
CTMark is not cast in stone. Its purpose is for the community to have a trackable proxy for the overall llvm test
