Displaying 20 results from an estimated 5000 matches similar to: "7-8% compile time slowdowns in LLVM 10"
2020 Jul 02
6
RFC: Replacing the default CRT allocator on Windows
Hello,
I was wondering how folks feel about replacing the default Windows CRT allocator in Clang, LLD and possibly other LLVM tools.
The CRT heap allocator on Windows doesn't scale well on large core count machines. Any multi-threaded workload in LLVM that allocates often is impacted by this. As a result, link times with ThinLTO are extremely slow on Windows. We're observing
2015 Mar 13
6
[LLVMdev] On LLD performance
> I will do a run with --merge-strings. This should probably be the
> default to match other ELF linkers.
Trying --merge-strings with today's trunk I got
* comment got 77 797 bytes smaller.
* rodata got 9 394 257 bytes smaller.
Comparing with gold, comment now has the same size and rodata is
55 021 bytes bigger.
Amusingly, merging strings seems to make lld a bit faster. With
2015 Mar 11
9
[LLVMdev] On LLD performance
I spent a week optimizing LLD performance and just wanted to share what
I found. Also, if anyone has a good idea on how to make it
faster, I'd like to hear it.
My focus is mainly on Windows, but my optimizations are generally platform
neutral. I aim for both single-thread and multi-thread performance.
r231434 <http://reviews.llvm.org/rL231454> is a change that has the
2015 Mar 13
3
[LLVMdev] On LLD performance
Rafael,
This is very good information and extremely useful.
On 3/12/2015 11:49 AM, Rafael Espíndola wrote:
> I tried benchmarking it on linux by linking clang Release+asserts (but
> lld itself with no asserts). The first things I noticed were:
>
> missing options:
>
> warning: ignoring unknown argument: --no-add-needed
> warning: ignoring unknown argument: -O3
> warning:
2016 Nov 17
2
LLD: time to enable --threads by default
Did you see this
http://llvm.org/viewvc/llvm-project?view=revision&revision=287140 ?
Interpreting these numbers may be tricky because of hyper threading, though.
On Wed, Nov 16, 2016 at 5:15 PM, Joerg Sonnenberger via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Wed, Nov 16, 2016 at 12:44:46PM -0800, Rui Ueyama via llvm-dev wrote:
> > I'm thinking to enable --threads
2016 Nov 17
4
LLD: time to enable --threads by default
Here is the result of running 20 threads on 20 physical cores (40 virtual
cores).
19002.081139 task-clock (msec) # 2.147 CPUs utilized ( +- 2.88% )
23,006 context-switches # 0.001 M/sec ( +- 2.24% )
1,491 cpu-migrations # 0.078 K/sec ( +- 22.50% )
2,607,076 page-faults # 0.137 M/sec
2020 Jul 03
4
[cfe-dev] RFC: Replacing the default CRT allocator on Windows
Thanks for the suggestion, James; it reduces the commit by about 900 MB (14.9 GB -> 14 GB).
Unfortunately it does not solve the performance problem. The heap is global to the application and thread-safe, so every malloc/free locks it, which evidently doesn’t scale. We could manually create thread-local heaps, but I didn’t want to go there. Ultimately allocated blocks need to share ownership
2020 Jul 07
2
[cfe-dev] RFC: Replacing the default CRT allocator on Windows
For release builds, I think this is fine. However for debug builds, the Windows allocator provides a lot of built-in functionality for debugging memory issues that I would be very sad to lose. Therefore, I would request that:
1. This be added as a configuration option to select either the new allocator or the Windows allocator
2. The Windows allocator be used by default in debug builds
2020 Jul 07
3
[cfe-dev] RFC: Replacing the default CRT allocator on Windows
Asan and the Debug CRT take different approaches, but the problems they
cover largely overlap.
Both help with detection of errors like buffer overrun, double free, use
after free, etc. Asan generally gives you more immediate feedback on
those, but you pay a higher price in performance. Debug CRT lets you do
some trade off between the performance hit and how soon it detects problems.
Asan
2020 Jul 07
2
[cfe-dev] RFC: Replacing the default CRT allocator on Windows
I hadn't heard this before. If I use clang with -fsanitize=address to
build my program, and then run my program, what difference does it make for
the execution of my program whether the compiler itself was instrumented or
not? Do you mean that ASAN runtime itself should be instrumented, since
your program loads that at runtime?
On Tue, Jul 7, 2020 at 2:04 PM Mitch Phillips <mitchp at
2020 Jun 22
2
Restrict qualifier on class members
Hi Jeroen,
That's great! I was trying to use the patch, what's the latest version of
the project we could apply it on?
Hi Neil,
That seems like what I can do as well! Do you happen to have some examples
lying around? Maybe a pointer to the planned presentation, if that's okay?
Thank you,
Bandhav
On Mon, Jun 22, 2020 at 1:55 AM Neil Henning <neil.henning at unity3d.com>
2020 Jun 22
2
Restrict qualifier on class members
Unfortunately https://llvm.org/docs/LangRef.html#llvm-loop-parallel-accesses-metadata
is not a solution here. A loop-parallel access does not imply
non-aliasing. The obvious case is when only reading from a location,
but even when a location is written to I'd be careful to deduce that
they do not alias since it might be a "benign data race" or the value
never used. Additionally,
2010 Aug 26
2
[LLVMdev] What does this error mean: psuedo instructions should be removed before code emission?
On Aug 26, 2010, at 12:59 PM PDT, Eric Christopher wrote:
> On Aug 26, 2010, at 12:25 PM, Yuri wrote:
>> On 08/26/2010 11:53, Eric Christopher wrote:
>>> Could you get it to print out the instruction when it happens?
>>> (just change the line above the error message to print it out to
>>> errs()).
>>>
>>> It basically means that a pseudo
2015 Feb 20
10
[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
.../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++-
1 file changed, 63 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index dfb093c..e38a3b8 100644
---
2010 Aug 27
2
[LLVMdev] What does this error mean: psuedo instructions should be removed before code emission?
On Aug 26, 2010, at 11:34 PM PDT, Yuri wrote:
> On 08/26/2010 13:17, Dale Johannesen wrote:
>>>> Insn before the error: TCRETURNri64 %RAX<kill>, 0, %RDI<kill>,
>>>> %RAX<imp-def,dead>, %RDI<imp-def,dead>, %RSP<imp-use>, ...
>>>
>>> Odd. I thought TCReturn was being lowered. At any rate can you
>>> file a bug
2013 Sep 12
1
[LLVMdev] [patch] remove redundant code in X86DisassemblerDecoder.c
There is an if-else in X86DisassemblerDecoder.c that does exactly the
same thing on both paths, so this patch removes the redundant path.
thanks,
Jun
diff --git a/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
b/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
index 20e61da..3932ea1 100644
--- a/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
+++
2015 Aug 19
5
[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF
Some shaders appear to extract bits using shift/and combos. Detect
(some) of those and convert to EXTBF instead.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
.../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++-------
1 file changed, 46 insertions(+), 20 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
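The shift/and pattern the patch description mentions can be illustrated with a minimal sketch. It assumes an unsigned bitfield extract and is not taken from the nouveau sources: `extract_bits` and `shift_then_and` are hypothetical names, and the exact semantics and operand encoding of EXTBF in the nv50 IR are not modeled here.

```cpp
#include <cstdint>

// Hypothetical model of an unsigned bitfield extract:
// returns `width` bits of `value`, starting at bit `offset`.
// Assumes 1 <= width <= 32 and offset + width <= 32.
static inline uint32_t extract_bits(uint32_t value, unsigned offset, unsigned width) {
    // Avoid the undefined shift by 32 when the full word is requested.
    const uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (value >> offset) & mask;
}

// The same extraction written as a shader might express it:
// a right shift followed by an AND with a contiguous low-bit mask.
static inline uint32_t shift_then_and(uint32_t value) {
    return (value >> 8) & 0xFFu;
}
```

Under these assumptions, `shift_then_and(v)` and `extract_bits(v, 8, 8)` compute the same value for every `v`, which is the kind of equivalence such a peephole would rely on before collapsing the two operations into one.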
2013 Jun 25
2
[LLVMdev] Auxiliary operand types for disassembler.
I'm working on a disassembler for the Hexagon (VLIW) architecture and I
would like to add an additional operand type, "kAux" to the MCOperand class.
The reason for this is that each insn has parse bits which are not
explicit operands and have differing meanings based on the insn's
location within the packet and the number of insns inside the packet.
In order for the disassembler
2010 Dec 16
1
[LLVMdev] x86 disassembler: if-statement with redundant branch
Hi there!
In the x86 disassembler I noticed an if-statement with a
duplicated branch. Are these intended to be identical?
Best regards,
Nicolas Kaiser
--
diff -ur llvm-2.8.orig/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c llvm-2.8/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c
--- llvm-2.8.orig/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c 2010-05-06 22:59:00.000000000 +0200
2012 Mar 02
3
[LLVMdev] how to annotate assembler
Hi,
In GCC there is one useful option -dp (or -dP for more verbose output)
to annotate assembler with instruction patterns, that was used when
assembler was generated. For example:
double
test(long long s)
{
  return s;
}
gcc -S -dp -O0 test.c
test:
.LFB0:
        .cfi_startproc
        pushq %rbp # 18 *pushdi2_rex64/1 [length = 1]
        .cfi_def_cfa_offset 16
        movq %rsp, %rbp # 19 *movdi_1_rex64/2