Displaying 20 results from an estimated 12000 matches similar to: "[RFC] Target-specific parametrization of function inliner"
2016 Mar 10
3
[RFC] Target-specific parametrization of function inliner
IMO, the appropriate thing for TTI to inform the inliner about is how
costly the actual act of a "call" is likely to be. I would hope that this
would only be used on targets where there is some really dramatic overhead
of actually doing a function call such that the code size cost incurred by
inlining is completely dwarfed by the improvements. GPUs are one of the few
platforms that
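A minimal sketch of the kind of hook involved, assuming a hypothetical GPU target TTI class (the class name and the multiplier value are made up; getInliningThresholdMultiplier() itself is an existing TargetTransformInfo hook that scales the generic inline threshold):

    // Sketch only: "MyGPUTTIImpl" stands in for a GPU target's TTI
    // implementation (real ones derive from BasicTTIImplBase). Calls are so
    // costly here that the size cost of inlining is almost always worth it.
    struct MyGPUTTIImpl {
      unsigned getInliningThresholdMultiplier() const { return 11; }
    };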
2016 Apr 01
2
[RFC] Target-specific parametrization of function inliner
> On Mar 10, 2016, at 10:34 AM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
>
> On Thu, Mar 10, 2016 at 6:49 AM, Chandler Carruth <chandlerc at google.com> wrote:
> IMO, the appropriate thing for TTI to inform the inliner about is how costly the actual act of a "call" is likely to be. I
2016 Mar 10
2
[RFC] Target-specific parametrization of function inliner
IMO, a good inliner with a precise cost/benefit model will eventually need
what Art is proposing here.
Take function call overhead as an example. It depends on several
factors: 1) call/return instruction latency; 2) function prologue/epilogue;
3) calling convention (argument passing, whether registers are used, which
register classes, etc.). All these factors depend on target information. If
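Purely as an illustration of the factors listed above (this is not an existing LLVM interface; all names here are hypothetical), the target-dependent contributions could be modeled as:

    // Hypothetical sketch -- just names the three contributions mentioned in
    // the message; a real cost model would derive them from the target.
    struct CallOverhead {
      unsigned CallReturnLatency;    // 1) call/return instruction latency
      unsigned PrologueEpilogueCost; // 2) callee frame setup and teardown
      unsigned ArgPassingCost;       // 3) calling convention: registers vs. stack
    };

    unsigned totalCallOverhead(const CallOverhead &C) {
      return C.CallReturnLatency + C.PrologueEpilogueCost + C.ArgPassingCost;
    }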
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes nvvm_reflect_anchor unnecessary. The issue with dots
in names generated by llvm still needs to be fixed.
On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote:
> Artem,
>
> With David's http://reviews.llvm.org/rL265060, do you think
> __nvvm_reflect_anchor is still necessary?
>
> On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2016 Jun 02
5
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello,
When generating PTX output from a CUDA file (.cu), the minimum target
accepted by LLVM is sm_20. However, I have a specific requirement to
generate PTX output for compute capability 1.0 (sm_10). Is there any
previous version of LLVM that supports this?
Thank you,
Ginu
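For context, a minimal device-only PTX compile with clang on an architecture it does accept looks roughly like the sketch below (the file name and kernel are made up; per the message, sm_20 is the minimum and sm_10 is rejected):

    // axpy.cu -- hypothetical input. A device-only PTX compile would be roughly:
    //   clang++ --cuda-device-only --cuda-gpu-arch=sm_20 -S axpy.cu -o axpy.ptx
    __global__ void axpy(float a, const float *x, float *y, int n) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n)
        y[i] = a * x[i] + y[i];
    }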
2020 Jul 30
2
Status of CUDA 11 support
Hi,
I work on a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM.
From the LLVM commit history, I can see that work on CUDA 11 has started. Is this currently being worked on? What work remains? And is any help needed to finish
2017 Aug 16
3
CUDA separate compilation
Clang currently doesn't support CUDA separate compilation, and thus extern
__device__ functions and variables cannot be used.
Could someone give me pointers on where to look or what has to be done to
support this? If at all possible, I'd like to see what's missing and
possibly try to tackle it.
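A minimal sketch of the unsupported pattern (file and symbol names are made up): a __device__ symbol referenced in one translation unit but defined in another, which is what CUDA separate compilation (nvcc's -rdc=true) permits:

    // a.cu -- hypothetical; "counter" and "helper" would be defined in b.cu.
    extern __device__ int counter;
    extern __device__ int helper(int x);

    __global__ void kernel(int *out) {
      out[threadIdx.x] = helper(counter);
    }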
2013 Dec 09
1
10.0-BETA4 (upgraded from 9.2-RELEASE) zpool upgrade -> boot failure
Hi,
Is there anything known about ZFS under 10.0-BETA4 when FreeBSD was
upgraded from 9.2-RELEASE?
I have two servers with very different hardware (one with software RAID
and the other without), and after a zpool upgrade there is no way to get
either server to boot.
Did I miss something when upgrading?
I cannot get the error message for the moment. I reinstalled the RAID
server under Linux and the other
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com>
On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I was wondering if anyone has encountered this issue when cross-compiling
> CUDA on an Nvidia TX2 running Android.
>
> The error is
> In file included from <built-in>:1:
> In file included from
>
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross-compiling
CUDA on an Nvidia TX2 running Android.
The error is
In file included from <built-in>:1:
In file included from
prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219:
../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19:
error: no matching function
2018 Mar 14
0
TableGen: spring cleaning, new features for "functional programming"
Nicolai,
I want to say huge thank you for your improvements to tablegen.
While it's still far from perfect, I now have a hope that one day
I'll be able to *just write* something in tablegen, as opposed to
constantly struggling to trick tablegen into doing what I need it to do.
Thank you.
--Artem
On Wed, Feb 21, 2018 at 2:48 AM Nicolai Hähnle <nhaehnle at gmail.com> wrote:
>
2015 Sep 29
2
Fwd: buildbot failure in LLVM on clang-ppc64-elf-linux2
This buildbot appears to have been failing for several weeks now (
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/19490 ).
Does anyone know/own/care about it?
---------- Forwarded message ----------
From: <llvm.buildmaster at lab.llvm.org>
Date: Mon, Sep 28, 2015 at 11:17 PM
Subject: buildbot failure in LLVM on clang-ppc64-elf-linux2
To: Aaron Ballman <aaron at
2018 Feb 21
4
TableGen: spring cleaning, new features for "functional programming"
Hi Artem,
Thank you for your encouraging reply :)
I have now cleaned up the first batch of changes and put them on
Phabricator here: https://reviews.llvm.org/D43552
I've tried to keep individual changes small, and I've verified with `git
rebase -x` that the build is good after each change. This first batch
does not cause any changes in the backends' generated files.
[snip]>
2015 Sep 29
3
Fwd: buildbot failure in LLVM on clang-ppc64-elf-linux2
On Tue, 2015-09-29 at 14:29 -0500, Hal Finkel wrote:
> [+Bill and Bill]
>
> ----- Original Message -----
> > From: "David Blaikie via llvm-dev" <llvm-dev at lists.llvm.org>
> > To: "llvm-dev" <llvm-dev at lists.llvm.org>
> > Sent: Tuesday, September 29, 2015 12:39:02 PM
> > Subject: [llvm-dev] Fwd: buildbot failure in LLVM on
2015 Jan 27
7
[LLVMdev] Embedding cpu and feature strings into IR and enabling switching subtarget on a per function basis
I've been investigating what is needed to ensure command line options are
passed to the backend codegen passes during LTO and enable compiling
different functions in a module with different command line options (see
the links below for previous discussions).
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/78855
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/80456
The command
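As a sketch of the mechanism under discussion (the cpu and feature values below are illustrative), per-function subtarget information is carried as string function attributes that the backend reads back during code generation:

    // Minimal sketch using the LLVM C++ API; "target-cpu" and
    // "target-features" are the attributes the backend consults.
    #include "llvm/IR/Function.h"

    void tagFunctionSubtarget(llvm::Function &F) {
      F.addFnAttr("target-cpu", "cortex-a57");
      F.addFnAttr("target-features", "+neon,+crypto");
    }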
2016 Jan 05
3
TargetTransformInfo getOperationCost uses
Hi,
I'm trying to implement the TTI hooks for AMDGPU to avoid unrolling loops for operations with huge expansions (e.g. integer division).
The values that are ultimately reported by opt -cost-model -analyze (the actual cost-model tests) do not seem to matter for this. The huge cost I've assigned to division doesn't prevent the loop from being unrolled, because it isn't actually
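A sketch of the intent, with simplified signatures and a hypothetical class (this is not the exact TTI interface): price operations that the target expands into long sequences so that size-based unrolling heuristics see them as large:

    // Illustrative only; real targets override the TTI cost hooks with their
    // full signatures. Integer division is priced as a long expansion rather
    // than a single instruction.
    #include "llvm/IR/Instruction.h"

    struct MyGPUCostModel {
      unsigned getArithmeticInstrCost(unsigned Opcode) const {
        if (Opcode == llvm::Instruction::SDiv ||
            Opcode == llvm::Instruction::UDiv)
          return 100;
        return 1;
      }
    };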
2020 Aug 17
4
Inlining with different target features
Hi llvm-dev,
I recently updated the WebAssembly TargetTransformInfo to allow functions
with different target feature sets to be inlined into each other, but I ran
into an issue I want to get the community's opinion on.
Since WebAssembly modules have to be validated before they are run, it only
makes sense to talk about WebAssembly features at module granularity rather
than function
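For reference, several targets already gate this through TTI's areInlineCompatible hook with a subset rule; a minimal sketch of that rule (extracted into a free function here, not the hook's real signature) is:

    // Sketch of the subset check: the callee may be inlined if it requires
    // no target features the caller does not also have.
    #include "llvm/MC/SubtargetFeature.h"

    bool featuresAreInlineCompatible(const llvm::FeatureBitset &CallerBits,
                                     const llvm::FeatureBitset &CalleeBits) {
      return (CallerBits & CalleeBits) == CalleeBits;
    }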
2020 Apr 08
6
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
TL;DR: We can improve compiler optimizations driven by heuristics by
replacing those heuristics with machine-learned policies (ML models).
Policies are trained offline and ship as part of the compiler. Determinism
is maintained because models are fixed when the compiler is operating in
production. Fine-tuning or regressions may be handled by incorporating the
interesting cases in the ML training
2020 Apr 09
3
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
+Yundi Qian <yundi at google.com> +Eugene Brevdo <ebrevdo at google.com> , our
team members from the ML side.
To avoid formatting issues, here is a link to the RFC
<https://docs.google.com/document/d/1BoSGQlmgAh-yUZMn4sCDoWuY6KWed2tV58P4_472mDE/edit?usp=sharing>,
open to comments.
Thanks!
On Wed, Apr 8, 2020 at 2:34 PM Mircea Trofin <mtrofin at google.com> wrote:
>
2016 Aug 02
2
RFC: We should stop merging allocas in the inliner
Sorry I missed these comments in my first read through David.
On Mon, Aug 1, 2016 at 1:06 AM Xinliang David Li via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Sun, Jul 31, 2016 at 9:47 PM, Chandler Carruth via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Thoughts? The code changes are easy and mechanical. My plan would be:
>>
>
> There is one