Displaying 20 results from an estimated 8000 matches similar to: "Debug info interacting with optimization and code generation"
2017 May 18
6
Enable vectorizer-maximize-bandwidth by default?
Hi,
I'm proposing to turn vectorizer-maximize-bandwidth on by default for the loop
vectorizer because it should generally help performance.
I've tested the performance impact on an Intel Sandy Bridge machine with the
SPEC CPU benchmarks:
Benchmark Base:Reference (1)
-------------------------------------------------------
spec/2006/fp/C++/444.namd 26.84
2017 Jan 30
4
(RFC) Adjusting default loop fully unroll threshold
Currently, the full loop unroller shares the same default threshold as the
dynamic and partial loop unrollers. This seems conservative because,
unlike dynamic/partial unrolling, full unrolling will not affect
LSD/ICache performance. In https://reviews.llvm.org/D28368, I proposed to
double the threshold for the full loop unroller. This will change the codegen
of several SPECCPU benchmarks:
Code
2016 Oct 27
2
(RFC) Encoding code duplication factor in discriminator
The impact on debug_line is actually not small. I only implemented part 1
(encoding the duplication factor) for loop unrolling and loop vectorization.
The debug_line size overhead for the "-O2 -g1" binaries of the speccpu C/C++
benchmarks:
433.milc 23.59%
444.namd 6.25%
447.dealII 8.43%
450.soplex 2.41%
453.povray 5.40%
470.lbm 0.00%
482.sphinx3 7.10%
400.perlbench 2.77%
401.bzip2 9.62%
403.gcc
2017 Jan 30
2
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Currently, loop fully unroller shares the same default threshold as loop
> dynamic unroller and partial unroller. This seems conservative because
> unlike dynamic/partial
2016 Oct 27
0
(RFC) Encoding code duplication factor in discriminator
The large percentages are from those tiny benchmarks. If you look at
omnetpp (0.52%), and xalanc (1.46%), the increase is small. To get a better
average increase, you can sum up total debug_line size before and after and
compute percentage accordingly.
David
On Thu, Oct 27, 2016 at 1:11 PM, Dehao Chen <dehao at google.com> wrote:
> The impact to debug_line is actually not small. I only
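
The point about averaging can be illustrated with a minimal sketch. It compares the mean of per-benchmark percentages with the percentage computed from summed sizes, as suggested above. The benchmark names and byte counts are hypothetical placeholders chosen for illustration, not measurements from this thread, and the function names are likewise assumptions.

```python
from statistics import mean


def per_benchmark_increase(before, after):
    """Percentage size increase of each benchmark, taken individually."""
    return {
        name: 100.0 * (after[name] - before[name]) / before[name]
        for name in before
    }


def aggregate_increase(before, after):
    """Percentage increase computed from the summed sizes of all benchmarks."""
    total_before = sum(before.values())
    total_after = sum(after[name] for name in before)
    return 100.0 * (total_after - total_before) / total_before


# Hypothetical debug_line sizes in bytes (NOT data from the thread):
# two tiny benchmarks with large relative growth, one large benchmark
# with small relative growth.
before = {"tiny_a": 1_000, "tiny_b": 2_000, "large_c": 1_000_000}
after = {"tiny_a": 1_200, "tiny_b": 2_200, "large_c": 1_010_000}

per_bench = per_benchmark_increase(before, after)
mean_of_percentages = mean(list(per_bench.values()))
overall = aggregate_increase(before, after)

print(f"mean of per-benchmark percentages: {mean_of_percentages:.2f}%")
print(f"aggregate percentage:              {overall:.2f}%")
```

With inputs shaped like these, the tiny benchmarks dominate the mean of percentages, while the aggregate figure is weighted by size and so tracks the large benchmark.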
2017 Jan 31
3
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 4:56 PM, Dehao Chen <dehao at google.com> wrote:
>
>
>
> On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> On Jan 30,
2017 Jan 31
0
(RFC) Adjusting default loop fully unroll threshold
On Mon, Jan 30, 2017 at 3:56 PM, Chandler Carruth <chandlerc at google.com>
wrote:
> On Mon, Jan 30, 2017 at 3:51 PM Mehdi Amini via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Currently, loop fully unroller shares the same default
2017 Jan 30
0
(RFC) Adjusting default loop fully unroll threshold
> On Jan 30, 2017, at 10:49 AM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Currently, loop fully unroller shares the same default threshold as loop dynamic unroller and partial unroller. This seems conservative because unlike dynamic/partial unrolling, fully unrolling will not affect LSD/ICache performance. In https://reviews.llvm.org/D28368
2016 Oct 27
0
(RFC) Encoding code duplication factor in discriminator
Do you have an estimate of the debug_line size increase? I guess it will be
small.
David
On Thu, Oct 27, 2016 at 11:39 AM, Dehao Chen <dehao at google.com> wrote:
> Motivation:
> Many optimizations duplicate code. E.g. loop unroller duplicates the loop
> body, GVN duplicates computation, etc. The duplicated code will share the
> same debug info with the original code. For
2009 May 28
2
Bug in base function sample ( ) (PR#13727)
Full_Name: Michael Chajewski
Version: 2.9.0
OS: Windows XP
Submission from: (NULL) (150.108.71.185)
I was programming a routine which kept reducing the array from which a random
sample was taken, resulting in a single number. I discovered that when R
attempts to sample from an object with only one number it does not
reproduce/report the number but instead chooses a random number between 1 and
2007 Jul 24
2
Dial out through multiple Zap groups
Hi,
I'm trying to set a rule to dial out through multiple Zap groups so that,
say, g0 is the cheaper POTS lines group and must be used first. However, if
g0 is busy or disconnected, then try dialing out through g1.
My g0 group is made up of 4 analog lines connected to a 4-FXO card. I
disconnected the RJ-11 wires from the FXO card to simulate a line
disconnection. So theoretically all calls should
2012 Jan 12
1
[LLVMdev] A question of Sparc assembly generated by llc
Hi,
Here is some generated Sparc assembly code:
main: ! @main
! BB#0:
save %sp, -112, %sp
sethi 0, %l0
or %g0, 5, %l1
st %l0, [%fp+-4]
st %l1, [%fp+-8]
st %l1, [%fp+-12]
sethi %hi(.L.str), %l1
ld [%fp+-8], %o1
add %l1, %lo(.L.str), %l1
or %g0, %l1, %o0
call printf
nop
ld [%fp+-12], %o2
ld [%fp+-8], %l2
sethi %hi(.L.strQ521), %l3
add
2023 Dec 02
1
Try reproduce glmm by hand
Dear all,
In order to be sure I understand glmm correctly, I am trying to reproduce a
simple result by hand. Here is reproducible code. The questions are in
_________________
Of course I have tried to find the solution on the internet, but I was not
able to find one. I have also tried to follow glmer, but it is
very complicated code!
Thanks for any help.
Marc
# Generate set of df with nb
2015 Apr 11
2
[LLVMdev] __eh_frame info changes in Clang?
Nick,
Do you happen to know why the version reported in 'dwarfdump
--eh-frame' for object files now differs when compiled with and
without -g? The test used in FSF gcc's configure produces a diff of..
% diff -u conftest.o.g.stripped.dwarfdump conftest.o.g0.stripped.dwarfdump
--- conftest.o.g.stripped.dwarfdump 2015-04-10 21:43:15.000000000 -0400
+++
2016 Oct 27
8
(RFC) Encoding code duplication factor in discriminator
Motivation:
Many optimizations duplicate code. E.g. loop unroller duplicates the loop
body, GVN duplicates computation, etc. The duplicated code will share the
same debug info with the original code. For SamplePGO, the debug info is
used to present the profile. Code duplication will affect profile accuracy.
Taking loop unrolling for example:
#1 foo();
#2 for (i = 0; i < N; i++) {
#3 bar();
2013 Apr 29
3
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Hi,
We've recently encountered a problem in our compiler where the line number in debug info jumps back and forth even at O0. This is caused by DAG node ordering not being properly kept during legalization and instruction selection. There are still uncaught cases after applying the patch mentioned here.
So I have decided to implement the approach suggested by Andy as below. i.e. maintain the
2013 Apr 30
2
[LLVMdev] [PATCH] Propagate DAG node ordering during legalization and instruction selection
Hi Eric,
Sorry I wasn't clear. The problem happened in the "source" pre-RA scheduler, which relies on DAG node ordering to schedule the nodes.
Xiaoyi
From: Eric Christopher [mailto:echristo at gmail.com]
Sent: Tuesday, April 30, 2013 12:54 AM
To: Guo, Xiaoyi
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] [PATCH] Propagate DAG node ordering during legalization and
2009 Aug 19
2
[LLVMdev] Solaris (sparc) llc bugs
Hello.
I have recently been trying to check how llvm works on Solaris.
First I tested lli, which seems to execute the bytecode generated
on Linux without any problems. However, llc has failed to generate valid
SPARC assembler code even on the helloworld example. Here is the generated
code:
sakharov at trillian:~$ cat ./test.s
.text
.align 16
.globl main
2008 Apr 23
2
[LLVMdev] how to dump DSA graph in gdb?
Hi, all:
Recently I have been debugging DSA to learn how it works, and I am now
looking at the local data structure analysis.
I use the following command to print the graph:
(gdb) p g.dump()
digraph DataStructures {
label="Function addG";
Node0xe1f3a0 [shape=record,shape=Mrecord,label="{ i32: MRE\n|{<g0>}}"];
Node0xe1f4d0
2012 Apr 19
1
Question about glusterfs quotas on debian wheezy?
Hello list,
I'm experimenting with a little GlusterFS cluster on debian wheezy:
=== snip ===
muzzy:~# cat /etc/debian_version
wheezy/sid
muzzy:~# dpkg -l | grep gluster
ii glusterfs-client 3.2.6-1 clustered file-system (client package)
ii glusterfs-common 3.2.6-1 GlusterFS common libraries and translator
modules
ii glusterfs-server 3.2.6-1 clustered file-system (server package)
=== snip