Displaying 18 results from an estimated 18 matches for "madhur13490".

2020 Oct 13
5
Manipulating DAGs in TableGen
On Tue, Oct 13, 2020 at 10:47 AM Madhur Amilkanthwar <madhur13490 at gmail.com> wrote: > What do you guys think about the below enhancements? > > 5. !getdagrestype(dag [, index]) - Returns type of result value. If the DAG computes multiple values then return type of 'index'th result. > > 6. !setdagrestype(dag target_dag, type T [, index]...
2018 Dec 11
2
Automatic GPU Code Generation
...tically or by directives without involvement of CUDA. That is, I am talking about avoiding the source-to-source compiler approach, where C code is converted automatically into CUDA; instead, I mean converting C code directly to PTX assembly. On Tue, Dec 11, 2018 at 12:19 PM Madhur Amilkanthwar <madhur13490 at gmail.com> wrote: > You can skip CUDA code generation and target PTX assembly. PTX is a common > assembly language for NVIDIA's GPUs. You may want to look at PPCG, Pluto > projects to get a hint of how automatic CUDA code can be generated by > compilers. They are based on poly...
2017 Jul 31
1
LLVM's loop strength reduction module
...you would not get a solution from it, right? Thanks. Regards, Venugopal Raghavan From: qcolombet at apple.com [mailto:qcolombet at apple.com] Sent: Friday, July 07, 2017 2:16 AM To: Raghavan, Venugopal <Venugopal.Raghavan at amd.com> Cc: llvm-dev at lists.llvm.org; Madhur Amilkanthwar <madhur13490 at gmail.com> Subject: Re: [llvm-dev] LLVM's loop strength reduction module Hi Raghavan, I concur no specific docs. What do you want to know specifically? Cheers, -Quentin On Jul 5, 2017, at 11:16 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at li...
2017 Jul 06
3
LLVM's loop strength reduction module
Hi Raghavan, I concur no specific docs. What do you want to know specifically? Cheers, -Quentin > On Jul 5, 2017, at 11:16 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > AFAIK, no official doc. > You can probably get better help if you ask specific questions (which part of the code you don't understand). > > On Thu, Jul 6, 2017 at 9:53
2017 Nov 08
2
Debug info for Cuda
...troubles, I'm just saying that it has some features and we have some problems to be solved. But the lack of labels and label arithmetic in DWARF sections is the real problem, because LLVM actively uses them in DWARF sections. Best regards, Alexey Bataev On Nov 8, 2017, at 5:35, Madhur Amilkanthwar <madhur13490 at gmail.com> wrote: I don't understand the use case and reasons to blame PTXAS compiler here. >>a) Supports DWARF-2 only. What would you like to achieve with DWARF-3+ that you cannot do with DWARF2? >> b) Labels are allowed only in...
2018 Dec 11
2
Automatic GPU Code Generation
Hello, I need to ask: as with automatic compiler vectorization, can GPU ISA be generated automatically, skipping CUDA programming? For instance, if I just write C code, there are two possibilities: semi-automatic and fully automatic. In the semi-automatic case, we can write #pragma directives to say that this should run on the GPU. Can the compiler then generate GPU ISA directly, skipping CUDA code? In case of fully
2018 Apr 29
0
FYI, planning to enable nontrivial loop unswitch in the new PM at O3
Is there any written description of what "non-trivialness" means here? On Sun, Apr 29, 2018, 2:49 PM Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote: > One of the last big missing pieces for the new PM is enabling non-trivial > loop unswitch at O3. > > The pass is now working well and passing all the testing I have done as > well as some others'
2018 Apr 29
2
FYI, planning to enable nontrivial loop unswitch in the new PM at O3
One of the last big missing pieces for the new PM is enabling non-trivial loop unswitch at O3. The pass is now working well and passing all the testing I have done as well as some others' testing (thanks Fedor!) so it should be ready to be enabled. I've done preliminary benchmarking on the test suite and SPEC and haven't seen any interesting regressions and quite a few improvements.
2016 Jul 08
3
Running verify between every opt pass?
Hi, Is there any easy way to run the verifier between each pass in opt if I do e.g. opt -O3 foo.ll -o foo.opt.ll ? If I add -verify after -O3 I get one invocation of the verifier first in the FunctionPass manager and then get two (!) runs of the verifier after all other passes are run. Then I saw the flag -verify-each which sounds promising, the help text says - Verify after each
2016 Nov 15
2
how to prevent LLVM back-end from reordering instructions at instruction scheduling?
Setting the MI as isTerminator should have the same impact, yes? I'm not sure of the other consequences of this though, if any, have to look into it. Thanks. -Ryan On Tue, Nov 15, 2016 at 5:18 PM, Krzysztof Parzyszek < kparzysz at codeaurora.org> wrote: > You can override TargetInstrInfo::isSchedulingBoundary for that. > > -Krzysztof > > On 11/15/2016 4:13 PM, Ryan
2017 Nov 06
2
Debug info for Cuda
On 06.11.2017 14:56, Robinson, Paul wrote: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to
2020 Oct 12
3
Manipulating DAGs in TableGen
I understood that the name is a matching tag for the operand and not its name (as in named macro or function arguments). However, I was assuming that the names in any one DAG node had to be unique and so could serve as selectors for operands. But a quick investigation shows that I was wrong: names can be duplicated in the same node. So DAG indexes are integers only. At 10/12/2020 01:46 PM,
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
2020 Aug 22
5
Looking for suggestions: Inferring GPU memory accesses
Hi all, As part of my research I want to investigate the relation between the grid's geometry and the memory accesses of a kernel in common GPU benchmarks (e.g., Rodinia, Polybench, etc.). As a first step I want to answer the following question: - Given a kernel function with M possible memory accesses, for how many of those M accesses can we statically infer the location given concrete values
2020 Jul 16
2
[RFC] Pass return status
> Out of curiosity, does change here include changes to names, and other semantically-irrelevant changes (e.g., changing the order of operands in a PHI)? The hashing function used to detect changes is currently very simple: it only accounts for instruction opcode and order. So some semantically-irrelevant changes are ignored (as well as some relevant changes), and some are not. Permuting two
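The behaviour described in the snippet above can be illustrated with a toy model. This is a minimal sketch, not LLVM's actual implementation: the `Instr` record and the `opcode_order_hash` function are hypothetical names introduced here, and the hash is assumed to cover nothing but the sequence of opcodes.

```python
import hashlib
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical, simplified stand-in for an IR instruction."""
    name: str        # result name, e.g. "%p"
    opcode: str      # e.g. "phi", "add"
    operands: tuple  # operand names


def opcode_order_hash(instrs):
    """Hash only the opcodes, in order; names and operands are ignored."""
    h = hashlib.sha256()
    for ins in instrs:
        h.update(ins.opcode.encode())
        h.update(b"\0")  # separator so adjacent opcodes cannot run together
    return h.hexdigest()


base = [Instr("%p", "phi", ("%a", "%b")), Instr("%s", "add", ("%p", "%c"))]

# Semantically irrelevant changes that this hash ignores.
renamed = [Instr("%q", "phi", ("%a", "%b")), Instr("%s", "add", ("%q", "%c"))]
phi_swapped = [Instr("%p", "phi", ("%b", "%a")), Instr("%s", "add", ("%p", "%c"))]

# A semantically relevant change that this hash also ignores.
operand_changed = [Instr("%p", "phi", ("%a", "%b")), Instr("%s", "add", ("%p", "%d"))]

# A change in instruction order, which this hash does detect.
reordered = [base[1], base[0]]

same = opcode_order_hash(base)
print(opcode_order_hash(renamed) == same)          # renaming is not detected
print(opcode_order_hash(phi_swapped) == same)      # operand permutation is not detected
print(opcode_order_hash(operand_changed) == same)  # operand replacement is not detected
print(opcode_order_hash(reordered) == same)        # reordering is detected
```

In this toy model, a pass that only renames values or replaces an operand would be reported as having made no change, which is the trade-off the snippet points out for a hash this simple.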
2020 May 30
2
Dynamically determine the CostPerUse value in the register allocator.
I don't know the history behind the term CostPerUse, so I may be missing the background associated with it. It seems that it's a misnomer for what it is intended to mean. At first sight, the term indicates that the cost is a function of the uses of the register: the more the uses, the higher the cost. How do we want to define the value of CostPerUse? Should it be a function of uses, or just of the target? On Sat, May 30,
2017 Apr 12
6
LLVM is getting faster, April edition
Hi, It's been a while since I sent the last compile time report [1], where it was shown that LLVM was getting slower over time. But now I'm happy to bring some good news: finally, LLVM is getting faster, not slower :) *** Current status *** Many areas of LLVM have been examined and improved since then: InstCombine, SCEV, APInt implementation, and that resulted in almost 10% improvement
2017 Apr 18
3
LLVM is getting faster, April edition
> On Apr 11, 2017, at 10:25 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > I am interested in knowing more. > 1. What benchmarks does LLVM community use for compile-time study? I see CTMark, but is that the only one being analyzed? CTMark is not cast in stone. Its purpose is for the community to have a trackable proxy for the overall llvm test