similar to: Debug info interacting with optimization and code generation

Displaying 20 results from an estimated 8000 matches similar to: "Debug info interacting with optimization and code generation"

2017 May 18
6
Enable vectorizer-maximize-bandwidth by default?
Hi, I'm proposing to make vectorizer-maximize-bandwidth on by default for the loop vectorizer because it should generally help performance. I've tested the performance impact on an Intel Sandy Bridge machine with the SPEC CPU benchmarks: Benchmark Base:Reference (1) ------------------------------------------------------- spec/2006/fp/C++/444.namd 26.84
2017 Jan 30
4
(RFC) Adjusting default loop fully unroll threshold
Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to double the threshold for loop fully unroller. This will change the codegen of several SPECCPU benchmarks: Code
2016 Oct 27
2
(RFC) Encoding code duplication factor in discriminator
The impact on debug_line is actually not small. I only implemented part 1 (encoding the duplication factor) for loop unrolling and loop vectorization. The debug_line size overhead for the "-O2 -g1" binaries of the SPEC CPU C/C++ benchmarks: 433.milc 23.59%, 444.namd 6.25%, 447.dealII 8.43%, 450.soplex 2.41%, 453.povray 5.40%, 470.lbm 0.00%, 482.sphinx3 7.10%, 400.perlbench 2.77%, 401.bzip2 9.62%, 403.gcc
2017 Jan 30
2
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Currently, loop fully unroller shares the same default threshold as loop > dynamic unroller and partial unroller. This seems conservative because > unlike dynamic/partial
2016 Oct 27
0
(RFC) Encoding code duplication factor in discriminator
The large percentages are from those tiny benchmarks. If you look at omnetpp (0.52%), and xalanc (1.46%), the increase is small. To get a better average increase, you can sum up total debug_line size before and after and compute percentage accordingly. David On Thu, Oct 27, 2016 at 1:11 PM, Dehao Chen <dehao at google.com> wrote: > The impact to debug_line is actually not small. I only
2017 Jan 31
3
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 4:56 PM, Dehao Chen <dehao at google.com> wrote: > > > > On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com <mailto:chandlerc at google.com>> wrote: > On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> On Jan 30,
2017 Jan 31
0
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com> wrote: > On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> Currently, loop fully unroller shares the same default
2017 Jan 30
0
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368
2016 Oct 27
0
(RFC) Encoding code duplication factor in discriminator
Do you have an estimate of the debug_line size increase? I guess it will be small. David On Thu, Oct 27, 2016 at 11:39 AM, Dehao Chen <dehao at google.com> wrote: > Motivation: > Many optimizations duplicate code. E.g. loop unroller duplicates the loop > body, GVN duplicates computation, etc. The duplicated code will share the > same debug info with the original code. For
2009 May 28
2
Bug in base function sample ( ) (PR#13727)
Full_Name: Michael Chajewski Version: 2.9.0 OS: Windows XP Submission from: (NULL) (150.108.71.185) I was programming a routine which kept reducing the array from which a random sample was taken, resulting in a single number. I discovered that when R attempts to sample from an object with only one number it does not reproduce/report the number but instead chooses a random number between 1 and
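The behavior reported here matches a convenience rule documented on R's `sample` help page: when `x` has length 1, is numeric, and is at least 1, sampling is done from `1:x` rather than from `x` itself. The following is only a Python sketch that mimics that rule for illustration; `r_like_sample` and `safe_sample` are hypothetical names, not R or Python library functions.

```python
import random


def r_like_sample(x, rng):
    """Mimic (for illustration only) the documented length-1 rule of R's sample(x)."""
    if len(x) == 1 and isinstance(x[0], (int, float)) and x[0] >= 1:
        # A single number n >= 1 is treated as the population 1..n.
        population = list(range(1, int(x[0]) + 1))
    else:
        population = list(x)
    rng.shuffle(population)
    return population


def safe_sample(x, rng):
    """Always permute the elements of x itself, whatever its length."""
    population = list(x)
    rng.shuffle(population)
    return population


rng = random.Random(0)
surprising = r_like_sample([5], rng)  # a permutation of 1..5, not [5]
expected = safe_sample([5], rng)      # always [5]
```

In R itself, the help page suggests the same kind of guard: index the vector with `sample.int(length(x))` instead of passing a possibly length-1 vector to `sample` directly.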
2007 Jul 24
2
Dial out through multiple Zap groups
Hi, I'm trying to set a rule to dial out through multiple Zap groups so that, say, g0 is the cheaper POTS lines group and must be used first. However, if g0 is busy or disconnected then try dialing out g1. My g0 group is made up of 4 analog lines connected to a 4-FXO card. I disconnected the RJ-11 wires from the FXO card to simulate a line disconnection. So theoretically all calls should
2012 Jan 12
1
[LLVMdev] A question of Sparc assembly generated by llc
Hi, There are some generated Sparc assembly code like this: main: ! @main ! BB#0: save %sp, -112, %sp sethi 0, %l0 or %g0, 5, %l1 st %l0, [%fp+-4] st %l1, [%fp+-8] st %l1, [%fp+-12] sethi %hi(.L.str), %l1 ld [%fp+-8], %o1 add %l1, %lo(.L.str), %l1 or %g0, %l1, %o0 call printf nop ld [%fp+-12], %o2 ld [%fp+-8], %l2 sethi %hi(.L.strQ521), %l3 add
2023 Dec 02
1
Try reproduce glmm by hand
Dear all, In order to be sure I understand glmm correctly, I am trying to reproduce a simple result by hand. Here is reproducible code. The questions are in _________________ Of course I have tried to find the solution on the internet, but I was not able to find one. I have also tried to follow glmer, but it is very complicated code! Thanks for any help. Marc # Generate set of df with nb
2015 Apr 11
2
[LLVMdev] __eh_frame info changes in Clang?
Nick, Do you happen to know why the version reported in 'dwarfdump --eh-frame' for object files now differs when compiled with and without -g? The test used in FSF gcc's configure produces a diff of.. % diff -u conftest.o.g.stripped.dwarfdump conftest.o.g0.stripped.dwarfdump --- conftest.o.g.stripped.dwarfdump 2015-04-10 21:43:15.000000000 -0400 +++
2016 Oct 27
8
(RFC) Encoding code duplication factor in discriminator
Motivation: Many optimizations duplicate code. E.g. loop unroller duplicates the loop body, GVN duplicates computation, etc. The duplicated code will share the same debug info with the original code. For SamplePGO, the debug info is used to present the profile. Code duplication will affect profile accuracy. Taking loop unrolling for example: #1 foo(); #2 for (i = 0; i < N; i++) { #3 bar();
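To make the profile-accuracy concern concrete, here is a minimal Python sketch of the arithmetic. It assumes that the profile attributes to a source line the maximum count seen among the instructions mapped to that line, and that each copy carries a recorded duplication factor. The `Instr` type and both helper functions are hypothetical; they do not reflect LLVM's actual discriminator encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    line: int                    # source line taken from debug info
    exec_count: int              # how often this particular copy executed
    duplication_factor: int = 1  # hypothetical annotation: number of copies made


def line_counts(instrs, use_duplication_factor=False):
    """Per-line count = max over instructions mapped to that line (assumed model)."""
    counts = {}
    for ins in instrs:
        count = ins.exec_count
        if use_duplication_factor:
            # Scale each copy back up by the number of copies it was split into.
            count *= ins.duplication_factor
        counts[ins.line] = max(counts.get(ins.line, 0), count)
    return counts


def unrolled_loop(trip_count, factor):
    """Model the body at line 3 after unrolling: `factor` copies, each run trip_count/factor times."""
    assert trip_count % factor == 0
    per_copy = trip_count // factor
    return [Instr(line=3, exec_count=per_copy, duplication_factor=factor)
            for _ in range(factor)]


body = unrolled_loop(trip_count=8, factor=4)
without_factor = line_counts(body)                           # {3: 2}
with_factor = line_counts(body, use_duplication_factor=True)  # {3: 8}
```

Under these assumptions, the unannotated profile reports 2 for line 3 even though the source-level body ran 8 times; scaling by the recorded factor recovers the original count.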
2013 Apr 29
3
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Hi, We've recently encountered a problem in our compiler where the line number in debug info jumps back and forth even at O0. This is caused by DAG node ordering not being properly kept during legalization and instruction selection. There are still uncaught cases after applying the patch mentioned here. So I have decided to implement the approach suggested by Andy as below. i.e. maintain the
2013 Apr 30
2
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Hi Eric, Sorry I wasn't clear. The problem happened in the "source" pre-RA scheduler, which relies on DAG node ordering to schedule the nodes. Xiaoyi From: Eric Christopher [mailto:echristo at gmail.com] Sent: Tuesday, April 30, 2013 12:54 AM To: Guo, Xiaoyi Cc: LLVM Developers Mailing List Subject: Re: [LLVMdev] [PATCH] Propagate DAG node ordering during legalization and
2009 Aug 19
2
[LLVMdev] Solaris (sparc) llc bugs
Hello. I have recently been trying to check how LLVM works on Solaris. First I tested lli, which seems to execute the bytecode generated on Linux without any problems. However, llc has failed to generate valid SPARC assembler code even for the helloworld example. Here is the generated code: sakharov at trillian:~$ cat ./test.s .text .align 16 .globl main
2008 Apr 23
2
[LLVMdev] how to dump DSA graph in gdb?
Hi, all: Recently I have been debugging the DSA and want to learn how it works, and now I am checking the local data structure analysis. I use the following command to print the graph: (gdb) p g.dump() digraph DataStructures { label="Function addG"; Node0xe1f3a0 [shape=record,shape=Mrecord,label="{ i32: MRE\n|{<g0>}}"]; Node0xe1f4d0
2012 Apr 19
1
Question about glusterfs quotas on debian wheezy?
Hello list, I'm experimenting with a little GlusterFS cluster on debian wheezy: === snip === muzzy:~# cat /etc/debian_version wheezy/sid muzzy:~# dpkg -l | grep gluster ii glusterfs-client 3.2.6-1 clustered file-system (client package) ii glusterfs-common 3.2.6-1 GlusterFS common libraries and translator modules ii glusterfs-server 3.2.6-1 clustered file-system (server package) === snip