Displaying 20 results from an estimated 200 matches similar to: "getCacheSize() / subtarget machine id"
2016 Dec 16
1
help/hints/suggestions/tips please: how to give _generic_ compilation for a particular ISA a non-zero LoopMicroOpBufferSize?
Dear all,
Some benchmarking experimentation I've done recently -- all on AArch64 -- has shown that it
might be beneficial for all AArch64 targets to have a positive LoopMicroOpBufferSize, whereas
the default that applies to all ISAs seems to be zero.
Although I've tried going as far down the rabbit hole as I can, I haven't found a way to set
DefaultLoopMicroOpBufferSize on a per-ISA basis or
2017 Aug 22
2
Subtarget Initialization in <ARCH>TargetMachine constructor
Hi,
I found some discrepancies in how Subtarget is created
between some arch-specific TargetMachine constructors.
For example, for BPF/Lanai:
BPFTargetMachine::BPFTargetMachine(const Target &T, const Triple &TT,
StringRef CPU, StringRef FS,
const TargetOptions &Options,
2017 Aug 23
2
Subtarget Initialization in <ARCH>TargetMachine constructor
Thanks, Alex. See my comments below.
On Wed, Aug 23, 2017 at 12:59 AM, Alex Bradbury <asb at asbradbury.org> wrote:
> On 22 August 2017 at 23:39, Y Song via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Hi,
>
> Hi Yonghong.
>
>> I found some discrepancies in how Subtarget is created
>> between some arch-specific TargetMachine constructors.
2018 Mar 06
2
[RFC] llvm-mca: a static performance analysis tool
On Tue, Mar 6, 2018 at 5:55 AM, Andrew Trick via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On Mar 5, 2018, at 6:28 PM, Matthias Braun <mbraun at apple.com> wrote:
>
>
>
> On Mar 5, 2018, at 6:14 PM, Andrew Trick via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> On Mar 5, 2018, at 3:38 PM, Quentin Colombet <qcolombet at
2018 Mar 06
3
[RFC] llvm-mca: a static performance analysis tool
> On Mar 5, 2018, at 6:14 PM, Andrew Trick via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
>
>> On Mar 5, 2018, at 3:38 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>>
>> When Ahmed and I worked on the decompiler, we first targeted MC. Going to MI was more difficult and really wouldn’t have gotten us a
2018 Mar 06
0
[RFC] llvm-mca: a static performance analysis tool
> On Mar 6, 2018, at 4:20 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
>
> To be clear then, resolveSchedClass should be moved from TargetSchedModel into MCSchedModel (which is where I originally wanted it). Any TargetInstrInfo APIs called from SchedPredicate should be moved to MCInstrInfo, which should be straightforward but annoying.
>
> Personally, I
2018 Mar 06
0
[RFC] llvm-mca: a static performance analysis tool
> On Mar 5, 2018, at 6:28 PM, Matthias Braun <mbraun at apple.com> wrote:
>
>
>
>> On Mar 5, 2018, at 6:14 PM, Andrew Trick via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>
>>
>>> On Mar 5, 2018, at 3:38 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at
2015 Nov 16
3
DFAPacketizer, Scheduling and LoadLatency
I'm unclear how DFAPacketizer and the scheduler know that a given
instruction is a load.
Here is what I'm talking about:
Let's assume my VLIW target is described as follows:
def MyTargetItineraries :
ProcessorItineraries<[Slot0, Slot1], [], [
..............................
InstrItinData<RI, [InstrStage<1, [Slot0, Slot1]>]>,
2013 Jan 17
1
[PATCH] x86, Allow x2apic without IR on VMware platform.
Please consider this patch to allow x2apic without IR support when
running on VMware platform. Tested on top of 3.8-rc3.
Thanks,
Alok
--
Allow x2apic without IR on VMware platform.
From: Alok N Kataria <akataria at vmware.com>
This patch updates x2apic initialization code to allow x2apic on VMware
platform even without interrupt remapping support.
The hypervisor_x2apic_available hook
2018 Mar 06
0
[RFC] llvm-mca: a static performance analysis tool
> On Mar 5, 2018, at 3:38 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
> When Ahmed and I worked on the decompiler, we first targeted MC. Going to MI was more difficult and really wouldn’t have gotten us a lot of benefits. Instead, Ahmed pushed for directly decompiling to IR (look for dagger).
Thanks for the pointer Quentin.
> I would actually be in favor for more
2013 Apr 30
1
[LLVMdev] Instruction Scheduling - migration from v3.1 to v3.2
On Apr 26, 2013, at 3:53 AM, Martin J. O'Riordan <Martin.ORiordan at movidius.com> wrote:
> I am migrating the llvm/clang derived compiler for our processor from the
> v3.1 to v3.2 codebase. This has mostly gone well except that instruction
> latency scheduling is no longer happening.
>
> The people who implemented this previously sub-classed 'ScheduleDAGInstrs'
2014 Jan 18
3
[LLVMdev] Artificial deps and stores
On Jan 17, 2014, at 4:03 PM, Andrew Trick <atrick at apple.com> wrote:
>
> On Jan 17, 2014, at 3:54 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>> Andy, et al.,
>>
>> In ScheduleDAGInstrs::buildSchedGraph, the code for handling stores has this:
>>
>> if (!ExitSU.isPred(SU))
>> // Push store's up a bit to avoid them
2018 Mar 05
2
[RFC] llvm-mca: a static performance analysis tool
Thanks Andrea for working on this!
I’ve been willing to do this for quite some time now. Looks like procrastination was the right approach here ;).
> On Mar 2, 2018, at 9:33 AM, Andrew Trick via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> +Ahmed
>
>> On Mar 2, 2018, at 6:42 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com <mailto:andrea.dibiagio at
2020 Sep 23
2
Incorrect Cortex-R4/R4F/R5 ProcessorModel in ARM.td
In ARM.td, I see that the ProcessorModel for cortex-r4, cortex-r4f, and cortex-r5 (as well as r7 and r8) is based on "CortexA8Model", which seems incorrect. When this was added in 2015, there were also comments associated with this configuration, such as "// FIXME: R5 has currently the same ProcessorModel as A8" (later removed). The processor model for Cortex-r52 appears to
2018 Nov 01
3
RFC: System (cache, etc.) model for LLVM
On Thu, Nov 1, 2018 at 15:21, David Greene <dag at cray.com> wrote:
> > thank you for sharing the system hierarchy model. IMHO it makes a lot
> > of sense, although I don't know which of today's passes would make use
> > of it. Here are my remarks.
>
> LoopDataPrefetch would use it via the existing TTI interfaces, but I
> think that's about it
2015 Oct 15
3
what can cause a "CPU table is not sorted" assertion
I'm trying to create a simplified 2 slot VLIW from an OR1K. The codebase
I'm working with is here <https://github.com/openrisc/llvm-or1k>. I've
created an initial MyTargetSchedule.td
def MyTargetModel : SchedMachineModel {
// HW can decode 2 instructions per cycle.
let IssueWidth = 2;
let LoadLatency = 4;
let MispredictPenalty = 16;
// This flag is set to allow the
2016 Feb 24
2
Performance degradation on ARMv7 (cortex-a9)
Hi Bradley,
I was doing some performance analysis for ARMv7 (cortex-a9) and I
noticed that one of my benchmarks degraded by 93%. I have tracked the
regression down to the following commit by you:
commit 7c1b77248baaeafec5d6433c3d1da9a2e2b69595
Author: Bradley Smith <bradley.smith at arm.com>
Date: Mon Nov 16 11:10:19 2015 +0000
[ARM] Introduce subtarget features per
2016 Feb 24
1
Performance degradation on ARMv7 (cortex-a9)
Thanks Bradley.
I see that the features set in ARM.td get written to the generated
file <build>/llvm/lib/Target/ARM/ARMGenSubtargetInfo.inc. Here the
ProcA9 features appear in the ARMFeatureKV table:
{ "a9", "Cortex-A9 ARM processors", { ARM::ProcA9 }, {
ARM::FeatureFP16 } },
With your change, the features for ProcA9 in the above entry are
empty. This
2018 Nov 01
2
RFC: System (cache, etc.) model for LLVM
Hi,
thank you for sharing the system hierarchy model. IMHO it makes a lot
of sense, although I don't know which of today's passes would make use
of it. Here are my remarks.
I am wondering how one could model the following features using this
model, or whether they should be part of a performance model at all:
* ARM's big.LITTLE
* NUMA hierarchies (are the NUMA domains