Displaying 4 results from an estimated 4 matches for "othercounter".

2016 Mar 12
2
RFC: Pass to prune redundant profiling instrumentation
...with their
> multiplicities
> create new counter
> emit side-table data that relates the new counter to an array of
> (other counter, multiplicity of update)
>
> The runtime just emits the side-table and then llvm-profdata does:
>
> for each counter C:
>   for (otherCounter, multiplicity) in side-table[C]:
>     counters[otherCounter] += multiplicity * counters[C]
>
> There are other issues that can complicate the matter. 1) The assumption in the algorithm is that the source counter has only one update site -- but instead it may have more than one sites...
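
The side-table loop quoted in this snippet can be sketched in Python roughly as follows. This is only an illustration of the pseudocode, not llvm-profdata's actual implementation: the function name `apply_side_table`, the dictionary-based data layout, and the counter names in the usage example are all hypothetical. It also assumes that a source counter is never itself the target of a side-table update, so a single pass over the table is enough.

```python
def apply_side_table(counters, side_table):
    """Propagate instrumented counter values to pruned counters.

    counters:   dict mapping counter name -> recorded count
    side_table: dict mapping source counter C -> list of
                (other_counter, multiplicity) pairs

    Assumption (hypothetical layout): source counters are never targets
    of an update, so reading the original values in one pass is safe.
    """
    result = dict(counters)  # leave the caller's data untouched
    for c, updates in side_table.items():
        source_value = counters.get(c, 0)
        for other_counter, multiplicity in updates:
            # counters[otherCounter] += multiplicity * counters[C]
            result[other_counter] = (
                result.get(other_counter, 0) + multiplicity * source_value
            )
    return result


if __name__ == "__main__":
    # Hypothetical data: "merged" was emitted in place of updates to "a" and "b".
    recorded = {"merged": 5, "a": 1}
    table = {"merged": [("a", 2), ("b", 1)]}
    print(apply_side_table(recorded, table))
```

The sketch does not address the complication raised in the reply, where a source counter has more than one update site.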
2019 Sep 10
2
MachineScheduler not scheduling for latency
Hi Andy, Thanks for the explanations. Yes AMDGPU is in-order and has MicroOpBufferSize = 1. Re "issue limited" and instruction groups: could it make sense to disable the generic scheduler's detection of issue limitation on in-order CPUs, or on CPUs that don't define instruction groups, or some similar condition? Something like: --- a/lib/CodeGen/MachineScheduler.cpp +++
2016 Mar 12
2
RFC: Pass to prune redundant profiling instrumentation
> On Mar 11, 2016, at 5:28 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Fri, Mar 11, 2016 at 12:47 PM, Vedant Kumar <vsk at apple.com> wrote: > There have been a lot of responses. I'll try to summarize the thread and respond > to some of the questions/feedback. > > > Summary > ======= > > 1. We should teach GlobalDCE to
2019 Sep 09
2
Fwd: MachineScheduler not scheduling for latency
Hi, I'm trying to understand why MachineScheduler does a poor job in straight line code in cases like the one in the attached debug dump. This is on AMDGPU, an in-order target, and the problem is that the IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in the resulting schedule they are often placed right next to their uses like this: 1784B %140:vgpr_32 =