Displaying 8 results from an estimated 8 matches for "cgo2017".
2017 Jan 03
3
LLVM Performance Workshop at CGO 2017 (early registration ends January 6th)
FYI,
The LLVM Performance Workshop will be held at CGO 2017. The workshop
is co-located with CC, HPCA, and PPoPP.
If you are interested in attending the workshop, please register at
the CGO website: http://cgo.org/cgo2017/workshops.html
When: Saturday February 4th, 2017
Where: Austin, Texas, USA
----
Hi, CGO workshop and tutorial organizers,
This is a friendly reminder that early registration rates for CGO (and its workshops/tutorials) end on January 6:
http://cgo.org/cgo2017/registration.html
And the early ho...
2016 Oct 18
2
LLVM Performance Workshop at CGO 2017
An LLVM Performance Workshop will be held at CGO 2017. The workshop
is co-located with CC, HPCA, and PPoPP.
If you are interested in attending the workshop, please register at
the CGO website: http://cgo.org/cgo2017/workshops.html
Call for Speakers
We invite speakers from academia and industry to present their work on the
following list of topics (including but not limited to):
- improving performance and size of code generated by LLVM,
- improving performance of LLVM's runtime libraries,
- tools deve...
2017 Jul 31
1
[RFC] Profile guided section layout
Michael Spencer via llvm-dev <llvm-dev at lists.llvm.org> writes:
> I've recently implemented profile guided section layout in llvm + lld using
> the Call-Chain Clustering (C³) heuristic from
> https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf
> . In the programs I've tested it on I've gotten from 0% to 5% performance
> improvement over standard PGO with zero cases of slowdowns and up to 15%
> reduction in ITLB misses.
>
>
> There are three parts to this implementation.
>
> The first is a...
2017 Jun 15
7
[RFC] Profile guided section layout
I've recently implemented profile guided section layout in llvm + lld using
the Call-Chain Clustering (C³) heuristic from
https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf
. In the programs I've tested it on I've gotten from 0% to 5% performance
improvement over standard PGO with zero cases of slowdowns and up to 15%
reduction in ITLB misses.
There are three parts to this implementation.
The first is a new llvm pass which uses branch freq...
2018 Aug 07
3
Regarding basic block layout/code placement optimizations of profile guided optimization (PGO)
Hi,
I would like to learn the details regarding what exactly PGO does for basic
block layout/code placement optimizations in llvm. Could you please point
me to some descriptions? Is it close to this paper (Karl Pettis and Robert
C. Hansen. 1990. Profile guided code positioning. PLDI'90)
http://perso.ensta-paristech.fr/~bmonsuez/Cours/B6-4/Articles/papers15.pdf?
Whether it is purely
2017 Jul 31
3
[RFC] Profile guided section layout
Hi Rafael,
On 07/31/2017 04:20 PM, Rafael Avila de Espindola via llvm-dev wrote:
> However, do we need to start with instrumentation? The original paper
> uses sampling with good results and current intel cpus can record every
> branch in a program.
>
> I would propose starting with just an lld patch that reads the call
> graph from a file. The format would be very similar to
2017 Jul 31
2
[RFC] Profile guided section layout
...okup(S); });
}
+// Sort sections by the profile data provided in the .note.llvm.callgraph
+// sections.
+//
+// This algorithm is based on Call-Chain Clustering from:
+// Optimizing Function Placement for Large-Scale Data-Center Applications
+// https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf
+//
+// This first builds a call graph based on the profile data then iteratively
+// merges the hottest call edges as long as it would not create a cluster larger
+// than the page size. All clusters are then sorted by a density metric to
+// further improve locality.
+template <...
2017 Aug 01
2
[RFC] Profile guided section layout
...okup(S); });
}
+// Sort sections by the profile data provided in the .note.llvm.callgraph
+// sections.
+//
+// This algorithm is based on Call-Chain Clustering from:
+// Optimizing Function Placement for Large-Scale Data-Center Applications
+// https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf
+//
+// This first builds a call graph based on the profile data then iteratively
+// merges the hottest call edges as long as it would not create a cluster larger
+// than the page size. All clusters are then sorted by a density metric to
+// further improve locality.
+template <...
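
The patch comment quoted in the excerpts above describes the clustering in three steps: build a call graph from profile data, merge along the hottest call edges while the merged cluster still fits in a page, then sort clusters by density. The following is only a minimal sketch of that description, not the code from the lld patch. The types Section, CallEdge and Cluster, the function layoutSections, and the choice of weight divided by size as the density metric are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical input types; names are illustrative, not taken from the patch.
struct Section {
  uint64_t size;    // size in bytes
  uint64_t weight;  // profile weight, e.g. an execution count
};

struct CallEdge {
  int caller;       // index into the sections vector
  int callee;       // index into the sections vector
  uint64_t weight;  // how often this call was observed
};

struct Cluster {
  std::vector<int> members;  // section indices in layout order
  uint64_t size = 0;
  uint64_t weight = 0;
  // Assumed density metric: profile weight per byte.
  double density() const { return size ? double(weight) / double(size) : 0.0; }
};

// Returns section indices in the proposed layout order.
std::vector<int> layoutSections(const std::vector<Section> &sections,
                                std::vector<CallEdge> edges,
                                uint64_t pageSize) {
  const int n = static_cast<int>(sections.size());

  // Start with one cluster per section; owner[i] is the cluster holding i.
  std::vector<Cluster> clusters(n);
  std::vector<int> owner(n);
  for (int i = 0; i < n; ++i) {
    clusters[i].members = {i};
    clusters[i].size = sections[i].size;
    clusters[i].weight = sections[i].weight;
    owner[i] = i;
  }

  // Visit the hottest edges first; stable_sort keeps input order on ties.
  std::stable_sort(edges.begin(), edges.end(),
                   [](const CallEdge &a, const CallEdge &b) {
                     return a.weight > b.weight;
                   });

  for (const CallEdge &e : edges) {
    assert(e.caller >= 0 && e.caller < n && e.callee >= 0 && e.callee < n);
    int from = owner[e.caller];
    int to = owner[e.callee];
    if (from == to)
      continue;  // already in the same cluster
    if (clusters[from].size + clusters[to].size > pageSize)
      continue;  // merged cluster would exceed one page
    // Append the callee's cluster to the caller's cluster.
    for (int s : clusters[to].members) {
      owner[s] = from;
      clusters[from].members.push_back(s);
    }
    clusters[from].size += clusters[to].size;
    clusters[from].weight += clusters[to].weight;
    clusters[to] = Cluster{};
  }

  // Order the surviving clusters by decreasing density.
  std::vector<int> order;
  for (int i = 0; i < n; ++i)
    if (!clusters[i].members.empty())
      order.push_back(i);
  std::stable_sort(order.begin(), order.end(), [&](int a, int b) {
    return clusters[a].density() > clusters[b].density();
  });

  std::vector<int> layout;
  for (int c : order)
    layout.insert(layout.end(), clusters[c].members.begin(),
                  clusters[c].members.end());
  return layout;
}
```

In this sketch a merge always appends the callee's cluster after the caller's, so a hot caller ends up immediately before the code it calls. The paper and the actual patch may differ in how they pick edges, break ties, and compute density.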