thr3ads.net - similar to: "[LLVMdev] prevents instruction-scheduler from interfereing instruction pair"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] prevents instruction-scheduler from interfereing instruction pair"

[LLVMdev] prevents instruction-scheduler from interfereing instruction pair

2013 Nov 23

What I meant was to write your own expansion pass and run it after the scheduler passes, e.g. in the pre-emit stage.
> if (addPreEmitPass()) printAndVerify("After PreEmit passes")
Though if it's too hacky for you then fair enough.
Amara
On 23 November 2013 03:17, Liu Xin <navy.xliu at gmail.com> wrote:
> Amara,
>
> first, thank you for answering. but I found

[LLVMdev] prevents instruction-scheduler from interfereing instruction pair

2013 Nov 23

After thinking about this for a second, I got your point. I can define a pseudo instruction for the instruction pair and expand it after post-RA scheduling, in the pre-emit pass as you said. The original intrinsic can also be kept; I just convert the intrinsic to the pseudo instruction during target lowering. Thank you for your enlightening suggestion! thanks, --lx On Sat, Nov 23, 2013 at 8:37 PM, Amara Emerson <amara.emerson at
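
The idea in this message can be illustrated with a small toy model. The sketch below is not LLVM code and uses no LLVM API: `Inst`, `toySchedule`, `expandPairPseudo`, `pairIsAdjacent`, and the opcode names `PAIR_PSEUDO`, `PAIR_FIRST`, and `PAIR_SECOND` are all hypothetical names chosen for illustration. It only shows why a single pseudo instruction survives reordering as one unit, and why expanding it after the last scheduling step keeps the two real instructions adjacent.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: only an opcode name. PAIR_PSEUDO is a hypothetical
// pseudo that stands for two real instructions that must stay adjacent.
struct Inst {
    std::string opcode;
};

// Stand-in for a scheduler. A real scheduler reorders based on latencies
// and dependencies; this toy version simply reverses the block to show
// that arbitrary reordering cannot split a single pseudo instruction.
std::vector<Inst> toySchedule(std::vector<Inst> block) {
    std::reverse(block.begin(), block.end());
    return block;
}

// Stand-in for a late expansion pass that runs after all scheduling.
// Each PAIR_PSEUDO is replaced by its two real instructions, in order.
std::vector<Inst> expandPairPseudo(const std::vector<Inst>& block) {
    std::vector<Inst> out;
    for (const Inst& inst : block) {
        if (inst.opcode == "PAIR_PSEUDO") {
            out.push_back(Inst{"PAIR_FIRST"});
            out.push_back(Inst{"PAIR_SECOND"});
        } else {
            out.push_back(inst);
        }
    }
    return out;
}

// Returns true if every PAIR_FIRST is immediately followed by PAIR_SECOND.
bool pairIsAdjacent(const std::vector<Inst>& block) {
    for (std::size_t i = 0; i < block.size(); ++i) {
        if (block[i].opcode == "PAIR_FIRST") {
            if (i + 1 >= block.size() || block[i + 1].opcode != "PAIR_SECOND") {
                return false;
            }
        }
    }
    return true;
}
```

Calling `expandPairPseudo(toySchedule(block))` on a block containing one `PAIR_PSEUDO` yields a block in which `pairIsAdjacent` holds, because nothing reorders instructions after the expansion. Expanding before `toySchedule` would give no such guarantee.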

[LLVMdev] prevents instruction-scheduler from interfereing instruction pair

2013 Nov 23

Amara, first, thank you for answering. But I found that pseudo-instruction expansion actually happens before the post-RA scheduler, as the following code shows. Your approach is a little hacky, right? :)
// Expand pseudo instructions before second scheduling pass.
addPass(&ExpandPostRAPseudosID);
printAndVerify("After ExpandPostRAPseudos");
// Run pre-sched2 passes.
if (addPreSched2())

RFC: Generic IR reductions

2017 Feb 10

On 9 February 2017 at 17:31, Amara Emerson <amara.emerson at gmail.com> wrote: > Ping. Does anyone else have thoughts on this? Hi Amara, It seems the people who replied in this thread are mostly in sync with the proposal, why don't you push a review in phab, and let's take this to the next level? cheers, --renato

[LLVMdev] Phabricator loves Amara

2014 Feb 11

Folks, For some reason, all new Phabricator diffs are automatically including Amara, which is probably a bit annoying for him, and pointless. I believe it happens because his name is the first in alphabetical order. Can someone have a look at what's going on? cheers, --renato

RFC: Generic IR reductions

2017 Feb 03

Yes, SVE can vectorize early exit loops by using speculative (first-faulting) loads, which essentially give a predicate of the lanes loaded successfully. For uncounted loops with these special loads, the loop predicate tests can be done using a 'ptest' instruction, checking if the last element is active. Amara On 3 February 2017 at 10:15, Simon Pilgrim <llvm-dev at redking.me.uk>

[LLVMdev] [cfe-dev] AArch64 Clang CLI interface proposal

2014 Jan 08

I knew I'd regret leaving that option in for the MIPS port back in 99. Basically this is the only acceptable way for mcpu to exist, but should never have been added to the GCC aarch64 port at all since there's no compatibility with existing build systems to worry about. I would still like you to show this mythical piece of software that needs this compatibility. -eric On Jan 8, 2014 3:06

Codegen pass configs dependent on function attributes?

2020 May 12

I’ve put up a patch here: https://reviews.llvm.org/D79769 that adds a unified pipeline that targets can opt into. It has some similarities with forcing fallbacks, but uses a different mechanism to do so in order to preserve the abort behavior. It therefore requires that every GISel pass explicitly check whether the GISel selector is being requested rather

[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!

2017 Dec 18

FYI all: the patch to enable it is available for review here: https://reviews.llvm.org/D41362 Thanks, Amara > On Dec 18, 2017, at 5:44 PM, Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Ok. We’ll look at what we can do to further stress test it in the next two months; additional suggestions from the community are welcome. Patch should be incoming to enable it

RFC: Memcpy inlining in IR

2019 Jun 20

Hi all, For GlobalISel, we’re exploring options for implementing inlining optimizations for memcpy and friends. However, looking around the existing implementation, I don’t see anything that would be particularly problematic for us to do at the IR level. The existing TLI hooks that specify how certain memcpy calls should be lowered don’t have anything too SelectionDAG specific, and an IR

[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!

2017 Dec 15

I don’t know of any further issues preventing us flipping the switch. At this point, I’d aim to flip the switch shortly after the creation of the 6.0.0 release branch, so that GlobalISel can harden a bit more enabled-by-default on trunk before it goes into an LLVM release (presumably 7.0.0 then). Thanks, Kristof > On 11 Dec 2017, at 17:08, Amara Emerson <aemerson at apple.com> wrote:

[LLVMdev] Implementing the ldr pseudo instruction in ARM integrated assembler

2013 Nov 12

Hi David, Thanks for your efforts here. I have a few comments on your patch, although I realise it's still a work in progress.
+class ConstantPool {
+  MCSymbol *Label;
+  typedef std::vector<const MCExpr*> EntryVecTy;
Use a SmallVector here?
+  MCSymbol *getLabel() {return Label;}
+  size_t getNumEntries() {return Entries.size();}
+  const MCExpr *getEntry(size_t Num) {return

[LLVMdev] AArch64 Clang CLI interface proposal

2014 Jan 07

Parsing the arch string is a bit icky, but I don't really have too much of a problem with it - and it's better than -mcpu so... -eric On Tue Jan 07 2014 at 9:23:43 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 7 January 2014 17:05, Amara Emerson <amara.emerson at arm.com> wrote: > > We plan on implementing this interface for AArch64 Clang in future, and

RFC: Promoting experimental reduction intrinsics to first class intrinsics

2020 Apr 08

Hi, It’s been a few years now since I added some intrinsics for doing vector reductions. We’ve been using them exclusively on AArch64, and I’ve seen some traffic a while ago on the list for other targets too. Sander did some work last year to refine the semantics after some discussion. Are we at the point where we can drop the “experimental” from the name? IMO all targets should begin to transition

Codegen pass configs dependent on function attributes?

2020 May 05

Hi all. I’m trying to get GlobalISel to work better with LTO. At the moment if you enable it via -fglobal-isel, it only adds the -mllvm -global-isel and related options to the cc1 invocation. With LTO, that doesn’t work as we need to encode codegen options into the bitcode, usually via function attributes. Does anyone have any ideas on how to achieve this? The only way I can see it working is if

Suggestions on code generation for SIMD

2018 Jan 08

Thanks so much for the info, Amara! One more question: what do people usually do if they want to generate vectorized code for some existing C/C++ code? Do they usually do a C/C++ source-level transformation, or do it at the LLVM IR level? I know clang supports auto-vectorization, such as loop vectorization and SLP, but these are not flexible enough if we want to do more custom vectorizations or

[LLVMdev] VFP3

2014 Jun 23

I am not using llvm tools, but sources and directly calling into relevant LLVM classes and methods. Thanks, Daman On 23/06/14 4:11 pm, "Amara Emerson" <amara.emerson at gmail.com> wrote: >Hi Damanjit, > >I assume you're trying to use the tools like llvm-mc, in which case >you can use the -mattr=+vfpv3 flag to enable it. This applies to other >subtarget

RFC: [GlobalISel] propagating int/float type information

2020 May 05

I don’t think bfloat should be handled this way. What Amara is suggesting is an optimization, i.e., if we drop the information we are still correct. With bfloat, if we do an operation on float16 instead of bfloat16 this is a correctness problem. So that means that either we need to have new opcodes for bfloat or we need to carry around the floating point type in MIR. I think it would be more

[LLVMdev] AArch64 Clang CLI interface proposal

2014 Jan 07

Hi, Clang for AArch64 currently accepts an -mfpu option to specify the FPU/NEON unit. This behaviour, while consistent with the ARM target and ARM gcc, will no longer be supported in AArch64 gcc. The preferred CLI for specifying FPU/NEON units will be the use of the -march option with feature modifiers (+[no]feature). The feature modifiers according to the GCC manual are:
* crypto
* fp
* simd
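
To make the +[no]feature syntax concrete, here is a minimal C++ sketch of splitting an -march value into a base architecture and a set of feature modifiers. It is not Clang's or GCC's actual parser: `MarchSpec` and `parseMarch` are hypothetical names, the example value `armv8-a+crypto+nofp` is only an illustration, and the sketch does not validate feature names against any list. It also assumes that a leading "no" always means "disable", which would misread any feature whose real name begins with "no".

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical result type: the base architecture plus each feature
// modifier mapped to true (enabled) or false (disabled via a "no" prefix).
struct MarchSpec {
    std::string arch;
    std::map<std::string, bool> features;
};

// Splits a value such as "armv8-a+crypto+nofp" on '+'. The first field is
// taken as the architecture name; every later field is a feature modifier.
MarchSpec parseMarch(const std::string& value) {
    MarchSpec spec;
    std::istringstream in(value);
    std::string token;
    bool first = true;
    while (std::getline(in, token, '+')) {
        if (first) {
            spec.arch = token;
            first = false;
            continue;
        }
        if (token.empty()) {
            continue;  // tolerate stray "++" in this sketch
        }
        bool enabled = true;
        // Assumption: a "no" prefix followed by a name disables the feature.
        if (token.size() > 2 && token.compare(0, 2, "no") == 0) {
            enabled = false;
            token = token.substr(2);
        }
        spec.features[token] = enabled;  // a later modifier overrides an earlier one
    }
    return spec;
}
```

With this sketch, `parseMarch("armv8-a+crypto+nofp")` gives `arch == "armv8-a"`, with `crypto` mapped to true and `fp` mapped to false. A real driver would also need to reject unknown features and handle dependencies between features, which this sketch leaves out.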

status on NewGVN?

2018 Jan 08

> On 6 Jan 2018, at 04:53, Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Fri, Jan 5, 2018 at 8:39 PM, Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Greetings, > > I just found a bug in NewGVN: https://bugs.llvm.org/show_bug.cgi?id=35839
