similar to: New and more general Function Merging optimization for code size

Displaying 20 results from an estimated 8000 matches similar to: "New and more general Function Merging optimization for code size"

2018 Aug 02
2
New and more general Function Merging optimization for code size
Hi Hal, Because my function merging strategy is able to merge any two function, allowing for different CFGs, different parameters, etc. I am unable to use just a simple hash value to compare whether or not two functions are similar. Therefore, the idea is to have an infrastructure which allows me to compare whether or not two functions are similar without having traverse the two function
2017 Jul 13
2
LLVM (Cool/Warm) DOT Printers for Profiling
Hi everyone, I have been working with profiling in LLVM and I was wondering if it would be interesting to upstream the following DOT Printers for Profiling Visualization: https://github.com/rcorcs/llvm-heat-printer All suggestions are welcomed. Thanks, Rodrigo Rocha -------------- next part -------------- An HTML attachment was scrubbed... URL:
2012 Sep 29
7
[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler
Hi, We are currently working on revising a journal article that describes our work on pre-allocation scheduling using LLVM and have some questions about LLVM's pre-allocation scheduler. The answers to these question will help us better document and analyze the results of our benchmark tests that compare our algorithm with LLVM's pre-allocation scheduling algorithm. First, here is a
2017 May 18
6
Enable vectorizer-maximize-bandwidth by default?
Hi, I'm proposing to make vectorizer-maximize-bandwidth on by default for loop vectorizer because it should generally help performance. I've tested the performance impact on Intel sandybridge machine with speccpu benchmarks: Benchmark Base:Reference (1) ------------------------------------------------------- spec/2006/fp/C++/444.namd 26.84
2016 Aug 30
2
Fwd: cfl-aa
dear LLVMers, I am trying to use some of the LLVM alias analyses, and I would like to check two things with you: is scev-aa being maintained in LLVM 3.7? Second question: I run cfl-aa, and I got a very small number of pointer disambiguation (no alias) with it. My results for SPEC CINT 2006 follow below. Is this low number of no alias responses something to be excepted? Below the results that I
2016 Mar 29
2
[CodeGen] CodeSize - TailMerging and BlockPlacement
Hi everyone, The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. I did an experiment of adding additional BranchFolding and BlockPlacement after the existing BlockPlacement (i.e., -block-placement -branch-folder -block-placement) targeting
2010 Feb 15
0
[LLVMdev] Measurements of the new inlinehint attribute
Friday I enabled the inlinehint function attribute in the inliner. It mostly affects the performance of -Os compiled code. I have made some measurements on the SPEC test suite to show what it means. I made three runs of then nightly tests. The baseline represents -Os with no inlinehint: make TEST=nightly OPTFLAGS=-Os EXTRA_LOPT_OPTIONS=-inlinehint-threshold=0
2012 Sep 29
0
[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler
On Sep 29, 2012, at 2:43 AM, Ghassan Shobaki <ghassan_shobaki at yahoo.com> wrote: > Hi, > > We are currently working on revising a journal article that describes our work on pre-allocation scheduling using LLVM and have some questions about LLVM's pre-allocation scheduler. The answers to these question will help us better document and analyze the results of our benchmark
2017 Jan 30
4
(RFC) Adjusting default loop fully unroll threshold
Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: Code
2012 Jun 05
2
[LLVMdev] [PATCH] add x32 psABI support
If you are interesting to play around X32, you may refer to http://sourceware.org/glibc/wiki/x32 to bootstrap a local environment on Linux. Yours - Michael -----Original Message----- From: cfe-commits-bounces at cs.uiuc.edu [mailto:cfe-commits-bounces at cs.uiuc.edu] On Behalf Of Liao, Michael Sent: Monday, June 04, 2012 5:09 PM To: llvm-commits at cs.uiuc.edu; cfe-commits at cs.uiuc.edu
2010 May 19
3
5.5 ISO size vs RHEL
Hi We use CentOS and RHEL, the 5.5 RHEL ISO for x86_64 is 3.7GB (**), the CentOS one is 4602MB (***) split over two DVDs. Is this reasonable and correct? Any ideas why would there be such a discrepancy if they are built from the same (or very similar) source? Regards Anthony Caetano ** the md5sum checks out, and RHN lists the size as 3,532 MB *** CentOS-5.5-x86_64-bin-DVD-1of2.iso +
2016 Oct 27
2
(RFC) Encoding code duplication factor in discriminator
The impact to debug_line is actually not small. I only implemented the part 1 (encoding duplication factor) for loop unrolling and loop vectorization. The debug_line size overhead for "-O2 -g1" binary of speccpu C/C++ benchmarks: 433.milc 23.59% 444.namd 6.25% 447.dealII 8.43% 450.soplex 2.41% 453.povray 5.40% 470.lbm 0.00% 482.sphinx3 7.10% 400.perlbench 2.77% 401.bzip2 9.62% 403.gcc
2017 Jan 30
2
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Currently, loop fully unroller shares the same default threshold as loop > dynamic unroller and partial unroller. This seems conservative because > unlike dynamic/partial
2017 Jan 30
0
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368
2017 Jan 31
0
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com> wrote: > On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> Currently, loop fully unroller shares the same default
2000 Nov 28
1
dubious behavior during login
Hi, I'm running openssh-2.3.0p1 under Tru64 4.0. I've got the sources and built it whithout additional options. The `problem' happens when a login from a non-existing user is attempted: $ ssh bogus at foo.com Connection closed by foo.com It doesn't even ask the password. So anyone can test whether this host has a user called bogus. I'm not sure whether this is a bug, but I
2016 Oct 27
0
(RFC) Encoding code duplication factor in discriminator
The large percentages are from those tiny benchmarks. If you look at omnetpp (0.52%), and xalanc (1.46%), the increase is small. To get a better average increase, you can sum up total debug_line size before and after and compute percentage accordingly. David On Thu, Oct 27, 2016 at 1:11 PM, Dehao Chen <dehao at google.com> wrote: > The impact to debug_line is actually not small. I only
2017 Jan 31
3
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 4:56 PM, Dehao Chen <dehao at google.com> wrote: > > > > On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com <mailto:chandlerc at google.com>> wrote: > On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> On Jan 30,
2010 Jul 22
0
[LLVMdev] fp Question
On Jul 22, 2010, at 4:18 PMPDT, Reza Yazdani wrote: > Hi, > > I ran Spec2006 with -O4. All integer benchmarks passed, but only 8 > out 17 of floating point benchmarks passed. Is this normal or I > made a mistake in my build? Hi Reza. Somebody on Linux should answer, but I don't think it's normal. You may have checked out the source at a moment when it had a bug
2020 Jun 02
2
Improve hot cold splitting to aggressively outline small blocks
Hello Tobias, Thank you for the suggestion! Aditya also mentioned this. I will look into it. Best regards, Ruijie Ruijie Fang Email: ruijief at princeton.edu On Tue, Jun 2, 2020 at 12:48 PM Tobias Hieta <tobias at plexapp.com> wrote: > Hello Ruijie, > > One other workload that would be interesting to test might be clang > itself. Building clang with PGO information is a