thr3ads.net - similar to: "New and more general Function Merging optimization for code size"

New and more general Function Merging optimization for code size

2018 Aug 02

Hi Hal, Because my function merging strategy is able to merge any two functions, allowing for different CFGs, different parameters, etc., I am unable to use just a simple hash value to compare whether or not two functions are similar. Therefore, the idea is to have an infrastructure which allows me to compare whether or not two functions are similar without having to traverse the two function

LLVM (Cool/Warm) DOT Printers for Profiling

2017 Jul 13

Hi everyone, I have been working with profiling in LLVM and I was wondering if it would be interesting to upstream the following DOT Printers for Profiling Visualization: https://github.com/rcorcs/llvm-heat-printer All suggestions are welcome. Thanks, Rodrigo Rocha

[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler

2012 Sep 29

Hi, We are currently working on revising a journal article that describes our work on pre-allocation scheduling using LLVM and have some questions about LLVM's pre-allocation scheduler. The answers to these questions will help us better document and analyze the results of our benchmark tests that compare our algorithm with LLVM's pre-allocation scheduling algorithm. First, here is a

Enable vectorizer-maximize-bandwidth by default?

2017 May 18

Hi, I'm proposing to turn vectorizer-maximize-bandwidth on by default for the loop vectorizer because it should generally help performance. I've tested the performance impact on an Intel Sandy Bridge machine with SPEC CPU benchmarks: Benchmark Base:Reference (1) ------------------------------------------------------- spec/2006/fp/C++/444.namd 26.84

Fwd: cfl-aa

2016 Aug 30

Dear LLVMers, I am trying to use some of the LLVM alias analyses, and I would like to check two things with you. First, is scev-aa being maintained in LLVM 3.7? Second, I ran cfl-aa and got a very small number of pointer disambiguations (no alias) with it. My results for SPEC CINT 2006 follow below. Is this low number of no-alias responses something to be expected? Below the results that I

[CodeGen] CodeSize - TailMerging and BlockPlacement

2016 Mar 29

Hi everyone, The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. I did an experiment of adding additional BranchFolding and BlockPlacement after the existing BlockPlacement (i.e., -block-placement -branch-folder -block-placement) targeting

[LLVMdev] Measurements of the new inlinehint attribute

2010 Feb 15

Friday I enabled the inlinehint function attribute in the inliner. It mostly affects the performance of -Os compiled code. I have made some measurements on the SPEC test suite to show what it means. I made three runs of the nightly tests. The baseline represents -Os with no inlinehint: make TEST=nightly OPTFLAGS=-Os EXTRA_LOPT_OPTIONS=-inlinehint-threshold=0

[LLVMdev] LLVM's Pre-allocation Scheduler Tested against a Branch-and-Bound Scheduler

2012 Sep 29

On Sep 29, 2012, at 2:43 AM, Ghassan Shobaki <ghassan_shobaki at yahoo.com> wrote: > Hi, > > We are currently working on revising a journal article that describes our work on pre-allocation scheduling using LLVM and have some questions about LLVM's pre-allocation scheduler. The answers to these questions will help us better document and analyze the results of our benchmark

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: Code

[LLVMdev] [PATCH] add x32 psABI support

2012 Jun 05

If you are interested in playing around with x32, you may refer to http://sourceware.org/glibc/wiki/x32 to bootstrap a local environment on Linux. Yours - Michael -----Original Message----- From: cfe-commits-bounces at cs.uiuc.edu [mailto:cfe-commits-bounces at cs.uiuc.edu] On Behalf Of Liao, Michael Sent: Monday, June 04, 2012 5:09 PM To: llvm-commits at cs.uiuc.edu; cfe-commits at cs.uiuc.edu

5.5 ISO size vs RHEL

2010 May 19

Hi, We use CentOS and RHEL. The 5.5 RHEL ISO for x86_64 is 3.7GB (**), while the CentOS one is 4602MB (***) split over two DVDs. Is this reasonable and correct? Any ideas why there would be such a discrepancy if they are built from the same (or very similar) source? Regards Anthony Caetano ** the md5sum checks out, and RHN lists the size as 3,532 MB *** CentOS-5.5-x86_64-bin-DVD-1of2.iso +

(RFC) Encoding code duplication factor in discriminator

2016 Oct 27

The impact to debug_line is actually not small. I only implemented the part 1 (encoding duplication factor) for loop unrolling and loop vectorization. The debug_line size overhead for "-O2 -g1" binary of speccpu C/C++ benchmarks: 433.milc 23.59% 444.namd 6.25% 447.dealII 8.43% 450.soplex 2.41% 453.povray 5.40% 470.lbm 0.00% 482.sphinx3 7.10% 400.perlbench 2.77% 401.bzip2 9.62% 403.gcc

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Currently, loop fully unroller shares the same default threshold as loop > dynamic unroller and partial unroller. This seems conservative because > unlike dynamic/partial

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 30

> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com> wrote: > On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> Currently, loop fully unroller shares the same default

dubious behavior during login

2000 Nov 28

Hi, I'm running openssh-2.3.0p1 under Tru64 4.0. I've got the sources and built it without additional options. The `problem' happens when a login from a non-existing user is attempted: $ ssh bogus at foo.com Connection closed by foo.com It doesn't even ask for the password. So anyone can test whether this host has a user called bogus. I'm not sure whether this is a bug, but I

(RFC) Encoding code duplication factor in discriminator

2016 Oct 27

The large percentages are from those tiny benchmarks. If you look at omnetpp (0.52%), and xalanc (1.46%), the increase is small. To get a better average increase, you can sum up total debug_line size before and after and compute percentage accordingly. David On Thu, Oct 27, 2016 at 1:11 PM, Dehao Chen <dehao at google.com> wrote: > The impact to debug_line is actually not small. I only
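The averaging point in the snippet above can be illustrated with a minimal sketch: averaging per-benchmark percentages gives tiny benchmarks the same weight as large ones, whereas summing the sizes first and computing a single percentage weights each benchmark by its size. The function names and the byte counts below are hypothetical and chosen only for illustration; they are not measurements from the thread.

```python
def mean_of_percentages(before, after):
    """Average the per-benchmark percentage increases.

    Every benchmark counts equally, however small its section is.
    """
    increases = [(a - b) / b * 100.0 for b, a in zip(before, after)]
    return sum(increases) / len(increases)


def aggregate_percentage(before, after):
    """Sum the sizes first, then compute one percentage over the totals."""
    total_before = sum(before)
    total_after = sum(after)
    return (total_after - total_before) / total_before * 100.0


# Hypothetical debug_line sizes in bytes (NOT data from the thread):
# one tiny benchmark with a large relative increase, one large benchmark
# with a small relative increase.
before = [100, 10_000]
after = [120, 10_100]

mean_pct = mean_of_percentages(before, after)
total_pct = aggregate_percentage(before, after)
print(f"mean of percentages: {mean_pct:.2f}%")
print(f"aggregate increase:  {total_pct:.2f}%")
```

With these made-up inputs the mean of percentages is dominated by the tiny benchmark, while the aggregate figure stays close to the increase seen on the large one.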

(RFC) Adjusting default loop fully unroll threshold

2017 Jan 31

> On Jan 30, 2017, at 4:56 PM, Dehao Chen <dehao at google.com> wrote: > > > > On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com <mailto:chandlerc at google.com>> wrote: > On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> On Jan 30,

[LLVMdev] fp Question

2010 Jul 22

On Jul 22, 2010, at 4:18 PM PDT, Reza Yazdani wrote: > Hi, > > I ran Spec2006 with -O4. All integer benchmarks passed, but only 8 > out 17 of floating point benchmarks passed. Is this normal or I > made a mistake in my build? Hi Reza. Somebody on Linux should answer, but I don't think it's normal. You may have checked out the source at a moment when it had a bug

Improve hot cold splitting to aggressively outline small blocks

2020 Jun 02

Hello Tobias, Thank you for the suggestion! Aditya also mentioned this. I will look into it. Best regards, Ruijie Ruijie Fang Email: ruijief at princeton.edu On Tue, Jun 2, 2020 at 12:48 PM Tobias Hieta <tobias at plexapp.com> wrote: > Hello Ruijie, > > One other workload that would be interesting to test might be clang > itself. Building clang with PGO information is a
