thr3ads.net - similar to: "A question about AArch64 Cortex-A57 subtarget definition"

Displaying 20 results from an estimated 90 matches similar to: "A question about AArch64 Cortex-A57 subtarget definition"

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > > Also is there plans to make the NEON optimisations on ARMv7 run time > > detectable like they have in cairo/pixman? For generic distributions > > it would nice to be able to be able to enable them as they offer > > decent performance improvements but have the code

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 25 November 2014 at 10:11, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com> wrote: > > > > On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > >> > >> > Also is there plans to make the NEON optimisations

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 24

>> >> a. Simplest use case to validate this optimization for correctness. >> >> b. Simplest use case to validate this optimization for performance. >> >> >> >> Would prefer something like opusdec that can be executed on command >> >> line. >> > >> > >> > The easiest thing to use is probably opus_demo (opusdec

[LLVMdev] New machine model questions

2014 Jan 28

From: Andrew Trick [mailto:atrick at apple.com] Sent: 24 January 2014 21:52 To: Daniel Sanders Cc: LLVM Developers Mailing List (llvmdev at cs.uiuc.edu) Subject: Re: New machine model questions On Jan 24, 2014, at 2:21 AM, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote: Hi Andrew, I seem to be making good progress on the P5600 scheduler

[VLIW Scheduler] Itineraries vs. per operand scheduling

2018 Feb 08

We have two different dimensions for each instruction: slot assignments and operand timings. These two are unrelated to each other, and each (or both) can change for any given instruction from one architecture version to the next. The main concern for us was which of these mechanisms contains all the information that we need. We cannot express all the scheduling details by hand, and

[VLIW Scheduler] Itineraries vs. per operand scheduling

2018 Feb 08

Hi Krzysztof, 2018-02-08 13:32 GMT+08:00 Andrew Trick via llvm-dev < llvm-dev at lists.llvm.org>: > > > On Feb 4, 2018, at 9:15 AM, Yatsina, Marina via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Hi, > > What is the best way to model a scheduler for a VLIW in-order architecture? > I’ve looked at the Hexagon and R600 architectures and they are using

[LLVMdev] New machine model questions

2014 Jan 24

Hi Andrew, I seem to be making good progress on the P5600 scheduler using the new machine model but I've got a few questions about it. How would you represent an instruction that splits into two micro-ops and is dispatched to two different reservation stations? For example, I have two reservation stations (AGQ and FPQ). An FPU load instruction is split into a load micro-op which is

what can cause a "CPU table is not sorted" assertion

2015 Oct 15

I'm trying to create a simplified 2 slot VLIW from an OR1K. The codebase I'm working with is here <https://github.com/openrisc/llvm-or1k>. I've created an initial MyTargetSchedule.td def MyTargetModel : SchedMachineModel { // HW can decode 2 instructions per cycle. let IssueWidth = 2; let LoadLatency = 4; let MispredictPenalty = 16; // This flag is set to allow the

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

>> > Also is there plans to make the NEON optimisations on ARMv7 run time >> > detectable like they have in cairo/pixman? For generic distributions >> > it would nice to be able to be able to enable them as they offer >> > decent performance improvements but have the code fall back on devices >> > that don't support NEON. >> Yep, adding

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta < viswanath.puttagunta at linaro.org> wrote: >> >> > Also is there plans to make the NEON optimisations on ARMv7 run time >> > detectable like they have in cairo/pixman? For generic distributions >> > it would nice to

No subject

2013 Apr 11

optimizations done for this CPU architecture. Whether it is in-line assembly or assembly optimization of function (Eg: celt_pitch_xcorr_arm.s), or using intrinsics, it is still some optimizations. So, I don't understand your perspective. I really thought about this for the most amount of time... could you please suggest an alternative here?.. Because I'm really out of ideas in this area

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On Nov 25, 2014, at 11:13 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 25 November 2014 at 10:11, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com<mailto:jonathan at

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 24 November 2014 at 17:48, Peter Robinson <pbrobinson at gmail.com> wrote: >>> >> a. Simplest use case to validate this optimization for correctness. >>> >> b. Simplest use case to validate this optimization for performance. >>> >> >>> >> Would prefer something like opusdec that can be executed on command >>> >>

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 25 November 2014 at 10:18, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Nov 25, 2014, at 11:13 AM, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 10:11, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at

[LLVMdev] Question about load clustering in the machine scheduler

2015 Mar 27

Hi, I have a program with over 100 loads (each with a 10 cycle latency) at the beginning of the program, and I can't figure out how to get the machine scheduler to intermix ALU instructions with the loads to effectively hide the latency. It seems the issue is with load clustering. I restrict load clustering to 4 at a time, but when I look at the debug output, the loads are always being

[LLVMdev] Question about load clustering in the machine scheduler

2015 Mar 27

On Thu, Mar 26, 2015 at 11:50:20PM -0700, Andrew Trick wrote: > > > On Mar 26, 2015, at 7:36 PM, Tom Stellard <tom at stellard.net> wrote: > > > > Hi, > > > > I have a program with over 100 loads (each with a 10 cycle latency) > > at the beginning of the program, and I can't figure out how to get > > the machine scheduler to intermix ALU

[LLVMdev] Questions about MachineScheduler

2013 Jul 22

Hi, I'm working on defining a SchedMachineModel for the Southern Islands family of GPUs, and I have two questions related to the MachineScheduler. 1. I have a resource that can process 15 instructions at the same time. In the TableGen definitions, should I do: def HWVMEM : ProcResource<15>; or let BufferSize = 15 in { def HWVMEM : ProcResource<1>; } 2. Southern Islands has

[LLVMdev] Questions about MachineScheduler

2013 Jul 23

On Jul 22, 2013, at 11:50 AM, Tom Stellard <tom at stellard.net> wrote: > Hi, > > I'm working on defining a SchedMachineModel for the Southern Islands > family of GPUs, and I have two questions related to the > MachineScheduler. > > 1. I have a resource that can process 15 instructions at the same time. > In the TableGen definitions, should I do: > > def

[LLVMdev] MI scheduler produce badly code with inline function

2013 Oct 21

Hi Andy, I'm working on defining new machine model for my target, But I don't understand how to define the in-order machine (reservation tables) in new model. For example, if target has IF ID EX WB stages should I do: let BufferSize=0 in { def IF: ProcResource<1>; def ID: ProcResource<1>; def EX: ProcResource<1>; def WB: ProcResource<1>; } def :

[LLVMdev] Question about per-operand machine model

2014 Feb 18

Hi Andy and all, I have a question about per-operand machine model. I am finding some relations between 'MCWriteLatencyEntry' and 'MCWriteProcResEntry'. For example, class InstTEST<..., InstrItinClass itin> : Instruction { let Itinerary = Itin; } // I assume this MI writes 2 registers. def TESTINST : InstTEST<..., II_TEST> // schedule info II_TEST:
