Renato Golin via llvm-dev
2018-Nov-05 17:08 UTC
[llvm-dev] RFC: System (cache, etc.) model for LLVM
On Mon, 5 Nov 2018 at 15:56, David Greene <dag at cray.com> wrote:
> The cache interfaces are flexible enough to allow passes to answer
> questions like, "how much effective cache is available for this core
> (thread, etc.)?" That's a critical question to reason about the
> thrashing behavior you mentioned above.
>
> Knowing the cache line size is important for prefetching and various
> other memory operations such as streaming.
>
> Knowing the number of ways can allow one to guesstimate which memory
> accesses are likely to collide in the cache.
>
> It also happens that all of these parameters are useful for simulation
> purposes, which may help projects like llvm-mca.

I see. So, IIGIR, initially, this would consolidate the prefetching
infrastructure, which is a worthy goal in itself and would require a
minimalist implementation for now.

But later, vectorisers could use that info, for example, to understand
how much it would be beneficial to unroll vectorised loops (where the
total access size should be a multiple of the cache line), etc.

Ultimately, simulations would be an interesting use of it, but
shouldn't be a driving force for additional features bundled into the
initial design.

> I'm not quite grasping this. Are you saying that a particular subtarget
> may have multiple "clusters" of big.LITTLE cores and that each cluster
> may look different from the others?

Yeah, "big.LITTLE" [1] is a marketing name and can mean a bunch of
different scenarios. For example:

- A list of big+little core pairs, each pair seen by the kernel as a
  single core but actually being two separate cores, scheduled by the
  kernel via frequency scaling.
- Two entirely separate clusters, flipped between all big or all little.
- A heterogeneous mix, which could have different numbers of big and
  little cores with no need for cache coherence between them. Junos
  have two little and four big, Tegras have one little and four big.
  There are also other designs with dozens of huge cores plus a tiny
  core for management purposes.

But it's worse, because different releases of the same family can have
different core counts or change model (clustered/bundled/heterogeneous),
and there's currently no way to represent that in TableGen.

Given that the kernel has such a high influence on how those cores get
scheduled and preempted, I don't think there's any hope that the
compiler can do a good job at predicting usage or having any real
impact amidst higher-level latency, such as context switches and
system calls.

--
cheers,
--renato

[1] https://en.wikipedia.org/wiki/ARM_big.LITTLE
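[Editor's note: to make the unrolling point above concrete, here is a
minimal, hypothetical sketch. The function and parameter names are
illustrative assumptions, not the query interface the RFC proposes; it
only shows how a vectoriser might pick an interleave factor so that the
bytes touched per unrolled iteration are a whole multiple of the cache
line.]

    // Hypothetical sketch: pick an interleave factor from the cache line
    // size and the vector register width (both assumed values here).
    #include <cstdio>

    unsigned chooseInterleave(unsigned CacheLineBytes, unsigned VectorBytes) {
      // Smallest factor <= 16 that makes VectorBytes * Factor a whole
      // multiple of the cache line; fall back to no interleaving.
      for (unsigned Factor = 1; Factor <= 16; ++Factor)
        if ((VectorBytes * Factor) % CacheLineBytes == 0)
          return Factor;
      return 1;
    }

    int main() {
      // 64-byte lines with 128-bit (16-byte) vectors -> interleave by 4.
      std::printf("interleave = %u\n", chooseInterleave(64, 16));
    }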
David Greene via llvm-dev
2018-Nov-05 19:03 UTC
[llvm-dev] RFC: System (cache, etc.) model for LLVM
Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> writes:
> So, IIGIR, initially, this would consolidate the prefetching
> infrastructure, which is a worthy goal in itself and would require a
> minimalist implementation for now.

Yes.

> But later, vectorisers could use that info, for example, to understand
> how much it would be beneficial to unroll vectorised loops (where the
> total access size should be a multiple of the cache line), etc.

Exactly!

> Ultimately, simulations would be an interesting use of it, but
> shouldn't be a driving force for additional features bundled into the
> initial design.

I agree simulation isn't the primary motivation, but it's a nice
side-effect. We use all of these parameters today, so they are useful.

> But it's worse, because different releases of the same family can have
> different core counts or change model (clustered/bundled/heterogeneous),
> and there's currently no way to represent that in TableGen.

Yes, this is exactly the SKU problem I mentioned. I don't have a good
solution for that other than to say that we've found that a generic
model per major subtarget can work well enough across different SKUs.
As currently constructed, the model is intended to be a resource for
heuristics, so getting things wrong is "just" a performance hit.

I guess it would be up to the people interested in a particular target
to figure out a reasonable, maintainable way to manage models for
possibly many subtargets. This proposal is about providing
infrastructure to allow models to be created without too much effort.
It doesn't say anything about what models for a particular
target/subtarget should look like. :)

> Given that the kernel has such a high influence on how those cores get
> scheduled and preempted, I don't think there's any hope that the
> compiler can do a good job at predicting usage or having any real
> impact amidst higher-level latency, such as context switches and
> system calls.

Sure. In those cases a model isn't that useful. Not every subtarget
needs to have a model. Alternatively, a simple "dumb" model could be
used for such targets, setting prefetch parameters, etc. to something
not totally outrageous.

The prefetcher, for example, would have to check whether a model
exists. If not, it wouldn't prefetch.

                      -David
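[Editor's note: a minimal sketch of the "no model, no prefetch"
behaviour described above. The types, names, and std::optional plumbing
are assumptions for illustration only, not the proposed interface.]

    // Hypothetical sketch: a software-prefetching pass queries an optional
    // cache model and simply bails out when no model exists for the
    // subtarget, rather than guessing parameters.
    #include <cstdio>
    #include <optional>

    struct CacheModel {
      unsigned LineSizeBytes;
      unsigned PrefetchDistance; // iterations ahead to prefetch
    };

    // Returns the model for the current subtarget, or nothing if none exists.
    std::optional<CacheModel> getCacheModel(bool HasModel) {
      if (!HasModel)
        return std::nullopt;
      return CacheModel{64, 8};
    }

    void runPrefetchPass(bool HasModel) {
      auto Model = getCacheModel(HasModel);
      if (!Model) {
        // No model for this subtarget: skip software prefetching entirely.
        std::puts("no cache model; skipping software prefetching");
        return;
      }
      std::printf("prefetching %u iterations ahead, %u-byte lines\n",
                  Model->PrefetchDistance, Model->LineSizeBytes);
    }

    int main() {
      runPrefetchPass(false);
      runPrefetchPass(true);
    }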
Renato Golin via llvm-dev
2018-Nov-05 22:24 UTC
[llvm-dev] RFC: System (cache, etc.) model for LLVM
On Mon, 5 Nov 2018 at 19:04, David Greene <dag at cray.com> wrote:
> I guess it would be up to the people interested in a particular target
> to figure out a reasonable, maintainable way to manage models for
> possibly many subtargets. This proposal is about providing
> infrastructure to allow models to be created without too much effort.
> It doesn't say anything about what models for a particular
> target/subtarget should look like. :)

Exactly! :)

--
cheers,
--renato