similar to: [RFC] Target-specific parametrization of function inliner

Displaying 20 results from an estimated 12000 matches similar to: "[RFC] Target-specific parametrization of function inliner"

2016 Mar 10
3
[RFC] Target-specific parametrization of function inliner
IMO, the appropriate thing for TTI to inform the inliner about is how costly the actual act of a "call" is likely to be. I would hope that this would only be used on targets where there is some really dramatic overhead of actually doing a function call such that the code size cost incurred by inlining is completely dwarfed by the improvements. GPUs are one of the few platforms that
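A minimal C++ sketch of that idea, using hypothetical names rather than the real TargetTransformInfo interface: the target reports how expensive the act of making a call is, and the inliner weighs that saving against the estimated code-size growth of inlining.

    // Hypothetical interface, for illustration only -- not LLVM's actual TTI.
    #include <cstdint>

    struct HypotheticalTargetInfo {
      // Cost of performing a call on this target; GPU-like targets with very
      // expensive calls would return a much larger value than CPUs.
      virtual uint64_t getCallOverheadCost() const { return 1; }
      virtual ~HypotheticalTargetInfo() = default;
    };

    struct HypotheticalGPUTargetInfo : HypotheticalTargetInfo {
      uint64_t getCallOverheadCost() const override { return 100; }
    };

    // An inliner heuristic could then treat the call overhead as a benefit
    // that offsets the code-size growth caused by inlining.
    inline bool callOverheadJustifiesInlining(const HypotheticalTargetInfo &TTI,
                                              uint64_t EstimatedSizeGrowth) {
      return TTI.getCallOverheadCost() >= EstimatedSizeGrowth;
    }

On a CPU-like target the small overhead value means code-size growth dominates; on a GPU-like target the large value tips the balance toward inlining, matching the scenario described above.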
2016 Apr 01
2
[RFC] Target-specific parametrization of function inliner
> On Mar 10, 2016, at 10:34 AM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > On Thu, Mar 10, 2016 at 6:49 AM, Chandler Carruth <chandlerc at google.com <mailto:chandlerc at google.com>> wrote: > IMO, the appropriate thing for TTI to inform the inliner about is how costly the actual act of a "call" is likely to be. I
2016 Mar 10
2
[RFC] Target-specific parametrization of function inliner
IMO, a good inliner with a precise cost/benefit model will eventually need what Art is proposing here. Take the function call overhead as an example. It depends on a couple of factors: 1) call/return instruction latency; 2) function epilogue/prologue; 3) calling convention (argument passing, whether registers are used, which register classes, etc.). All these factors depend on target information. If
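A small sketch of how those factors might be combined, with illustrative names that are not part of any LLVM interface:

    // Illustrative only: per-call overhead assembled from the factors
    // listed in the message above.
    struct CallCostFactors {
      unsigned CallReturnLatency;   // 1) call/return instruction latency
      unsigned PrologueEpilogue;    // 2) function prologue/epilogue cost
      unsigned ArgumentPassing;     // 3) calling-convention / argument passing
    };

    inline unsigned totalCallOverhead(const CallCostFactors &F) {
      return F.CallReturnLatency + F.PrologueEpilogue + F.ArgumentPassing;
    }

Each field is target-dependent, which is why a precise model would need the target to supply the numbers.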
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes nvvm_reflect_anchor unnecessary. The issue with dots in names generated by llvm still needs to be fixed. On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote: > Artem, > > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2016 Jun 02
5
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello, When generating PTX output from a CUDA file (.cu file), the minimum target accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Is there any previous version of LLVM that supports this? Thank you, Ginu
2020 Jul 30
2
Status of CUDA 11 support
Hi, I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM. From the LLVM commit history, I can see that work on CUDA 11 has started. Is this currently being worked on? What is the remaining work left? And is any help needed to finish
2017 Aug 16
3
CUDA separate compilation
Clang currently doesn't support CUDA separate compilation and thus extern __device__ functions and variables cannot be used. Could someone give me any pointers where to look or what has to be done to support this? If at all possible, I'd like to see what's missing and possibly try to tackle it.
2013 Dec 09
1
10.0-BETA4 (upgraded from 9.2-RELEASE) zpool upgrade -> boot failure
Hi, Is there anything known about ZFS under 10.0-BETA4 when FreeBSD was upgraded from 9.2-RELEASE? I have two servers with very different hardware (one uses soft RAID and the other does not), and after a zpool upgrade there is no way to get either server booting. Did I miss something when upgrading? I cannot get the error message for the moment. I reinstalled the raid server under Linux and the other
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com> On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I was wondering if anyone has encountered this issue when cross compiling > cuda on Nvidia TX2 running android. > > The error is > In file included from <built-in>:1: > In file included from >
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross compiling cuda on Nvidia TX2 running android. The error is In file included from <built-in>:1: In file included from prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219: ../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19: error: no matching function
2018 Mar 14
0
TableGen: spring cleaning, new features for "functional programming"
Nicolai, I want to say huge thank you for your improvements to tablegen. While it's still far from perfect, I now have a hope that one day I'll be able to *just write* something in tablegen, as opposed to constantly struggling to trick tablegen into doing what I need it to do. Thank you. --Artem On Wed, Feb 21, 2018 at 2:48 AM Nicolai Hähnle <nhaehnle at gmail.com> wrote: >
2015 Sep 29
2
Fwd: buildbot failure in LLVM on clang-ppc64-elf-linux2
This buildbot appears to have been failing for several weeks now ( http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/19490 ). Does anyone know/own/care about it? ---------- Forwarded message ---------- From: <llvm.buildmaster at lab.llvm.org> Date: Mon, Sep 28, 2015 at 11:17 PM Subject: buildbot failure in LLVM on clang-ppc64-elf-linux2 To: Aaron Ballman <aaron at
2018 Feb 21
4
TableGen: spring cleaning, new features for "functional programming"
Hi Artem, Thank you for your encouraging reply :) I have now cleaned up the first batch of changes and put them on Phabricator here: https://reviews.llvm.org/D43552 I've tried to keep individual changes small, and I've verified with `git rebase -x` that the build is good after each change. This first batch does not cause any changes in backend's generated files. [snip]>
2015 Sep 29
3
Fwd: buildbot failure in LLVM on clang-ppc64-elf-linux2
On Tue, 2015-09-29 at 14:29 -0500, Hal Finkel wrote: > [+Bill and Bill] > > ----- Original Message ----- > > From: "David Blaikie via llvm-dev" <llvm-dev at lists.llvm.org> > > To: "llvm-dev" <llvm-dev at lists.llvm.org> > > Sent: Tuesday, September 29, 2015 12:39:02 PM > > Subject: [llvm-dev] Fwd: buildbot failure in LLVM on
2015 Jan 27
7
[LLVMdev] Embedding cpu and feature strings into IR and enabling switching subtarget on a per function basis
I've been investigating what is needed to ensure command line options are passed to the backend codegen passes during LTO and enable compiling different functions in a module with different command line options (see the links below for previous discussions). http://thread.gmane.org/gmane.comp.compilers.llvm.devel/78855 http://thread.gmane.org/gmane.comp.compilers.llvm.devel/80456 The command
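One way clang already records per-function subtarget information is through the "target-cpu" and "target-features" string attributes in IR. A minimal sketch using LLVM's C++ API (assuming the string-valued Function::addFnAttr overload); the CPU and feature names are just examples:

    #include "llvm/IR/DerivedTypes.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("per_function_subtarget", Ctx);

      FunctionType *FTy = FunctionType::get(Type::getVoidTy(Ctx), false);
      Function *F =
          Function::Create(FTy, Function::ExternalLinkage, "fast_path", M);

      // Record the subtarget this particular function should be compiled for.
      F->addFnAttr("target-cpu", "cortex-a57");
      F->addFnAttr("target-features", "+neon");

      M.print(outs(), nullptr);
      return 0;
    }

With attributes like these on each function, the backend can select a subtarget per function rather than relying on a single module-wide command line.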
2016 Jan 05
3
TargetTransformInfo getOperationCost uses
Hi, I'm trying to implement the TTI hooks for AMDGPU to avoid unrolling loops for operations with huge expansions (e.g., integer division). The values that are ultimately reported by opt -cost-model -analyze (the actual cost model tests) seem not to matter for this. The huge cost I've assigned to division doesn't prevent the loop from being unrolled, because it isn't actually
2020 Aug 17
4
Inlining with different target features
Hi llvm-dev, I recently updated the WebAssembly TargetTransformInfo to allow functions with different target feature sets to be inlined into each other, but I ran into an issue I want to get the community's opinion on. Since WebAssembly modules have to be validated before they are run, it only makes sense to talk about WebAssembly features at module granularity rather than function
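A simplified illustration (not the actual WebAssembly TTI code) of the kind of compatibility check involved: inlining is safe when the callee does not require any feature the caller lacks.

    #include <bitset>

    // Hypothetical per-function feature bits, for illustration only.
    using FeatureSet = std::bitset<8>;

    // Allow inlining when the callee's features are a subset of the caller's,
    // so the inlined body never relies on a feature the caller may not have.
    inline bool areFeaturesInlineCompatible(FeatureSet Caller, FeatureSet Callee) {
      return (Callee & ~Caller).none();
    }

The message's point is that WebAssembly features are really module-granular, so a per-function check like this is only an approximation.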
2020 Apr 08
6
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
TL;DR; We can improve compiler optimizations driven by heuristics by replacing those heuristics with machine-learned policies (ML models). Policies are trained offline and ship as part of the compiler. Determinism is maintained because models are fixed when the compiler is operating in production. Fine-tuning or regressions may be handled by incorporating the interesting cases in the ML training
2020 Apr 09
3
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
+Yundi Qian <yundi at google.com> +Eugene Brevdo <ebrevdo at google.com> , our team members from the ML side. To avoid formatting issues, here is a link to the RFC <https://docs.google.com/document/d/1BoSGQlmgAh-yUZMn4sCDoWuY6KWed2tV58P4_472mDE/edit?usp=sharing>, open to comments. Thanks! On Wed, Apr 8, 2020 at 2:34 PM Mircea Trofin <mtrofin at google.com> wrote: >
2016 Aug 02
2
RFC: We should stop merging allocas in the inliner
Sorry I missed these comments in my first read through, David. On Mon, Aug 1, 2016 at 1:06 AM Xinliang David Li via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Sun, Jul 31, 2016 at 9:47 PM, Chandler Carruth via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Thoughts? The code changes are easy and mechanical. My plan would be: >> > > There is one