thr3ads.net - similar to: "7-8% compile time slowdowns in LLVM 10"

Displaying 20 results from an estimated 5000 matches similar to: "7-8% compile time slowdowns in LLVM 10"

RFC: Replacing the default CRT allocator on Windows

2020 Jul 02

Hello, I was wondering how folks were feeling about replacing the default Windows CRT allocator in Clang, LLD and other LLVM tools possibly. The CRT heap allocator on Windows doesn't scale well on large core count machines. Any multi-threaded workload in LLVM that allocates often is impacted by this. As a result, link times with ThinLTO are extremely slow on Windows. We're observing

[LLVMdev] On LLD performance

2015 Mar 13

> I will do a run with --merge-strings. This should probably be the > default to match other ELF linkers. Trying --merge-strings with today's trunk I got * comment got 77 797 bytes smaller. * rodata got 9 394 257 bytes smaller. Comparing with gold, comment now has the same size and rodata is 55 021 bytes bigger. Amusingly, merging strings seems to make lld a bit faster. With

[LLVMdev] On LLD performance

2015 Mar 11

I spent a week optimizing LLD performance and just wanted to share what I found. Also, if anyone has a good idea on how to make it faster, I'd like to hear it. My focus is mainly on Windows, but my optimizations are generally platform neutral. I aim at both single-thread and multi-thread performance. r231434 <http://reviews.llvm.org/rL231454> is a change that has the

[LLVMdev] On LLD performance

2015 Mar 13

Rafael, This is very good information and extremely useful. On 3/12/2015 11:49 AM, Rafael Espíndola wrote: > I tried benchmarking it on linux by linking clang Release+asserts (but > lld itself with no asserts). The first things I noticed were: > > missing options: > > warning: ignoring unknown argument: --no-add-needed > warning: ignoring unknown argument: -O3 > warning:

LLD: time to enable --threads by default

2016 Nov 17

Did you see this http://llvm.org/viewvc/llvm-project?view=revision&revision=287140 ? Interpreting these numbers may be tricky because of hyper threading, though. On Wed, Nov 16, 2016 at 5:15 PM, Joerg Sonnenberger via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Wed, Nov 16, 2016 at 12:44:46PM -0800, Rui Ueyama via llvm-dev wrote: > > I'm thinking to enable --threads

LLD: time to enable --threads by default

2016 Nov 17

Here is the result of running 20 threads on 20 physical cores (40 virtual cores).

    19002.081139  task-clock (msec)   #  2.147 CPUs utilized  ( +-  2.88% )
          23,006  context-switches    #  0.001 M/sec          ( +-  2.24% )
           1,491  cpu-migrations      #  0.078 K/sec          ( +- 22.50% )
       2,607,076  page-faults         #  0.137 M/sec

[cfe-dev] RFC: Replacing the default CRT allocator on Windows

2020 Jul 03

Thanks for the suggestion, James; it reduces the commit by about 900 MB (14.9 GB -> 14 GB). Unfortunately, it does not solve the performance problem. The heap is global to the application and thread-safe, so every malloc/free locks it, which evidently doesn’t scale. We could manually create thread-local heaps, but I didn’t want to go there. Ultimately allocated blocks need to share ownership

[cfe-dev] RFC: Replacing the default CRT allocator on Windows

2020 Jul 07

For release builds, I think this is fine. However, for debug builds, the Windows allocator provides a lot of built-in functionality for debugging memory issues that I would be very sad to lose. Therefore, I would request that: 1. This be added as a configuration option to select either the new allocator or the Windows allocator 2. The Windows allocator be used by default in debug builds

[cfe-dev] RFC: Replacing the default CRT allocator on Windows

2020 Jul 07

Asan and the Debug CRT take different approaches, but the problems they cover largely overlap. Both help with detection of errors like buffer overrun, double free, use after free, etc. Asan generally gives you more immediate feedback on those, but you pay a higher price in performance. Debug CRT lets you trade off the performance hit against how soon it detects problems. Asan

[cfe-dev] RFC: Replacing the default CRT allocator on Windows

2020 Jul 07

I hadn't heard this before. If I use clang with -fsanitize=address to build my program, and then run my program, what difference does it make for the execution of my program whether the compiler itself was instrumented or not? Do you mean that ASAN runtime itself should be instrumented, since your program loads that at runtime? On Tue, Jul 7, 2020 at 2:04 PM Mitch Phillips <mitchp at

Restrict qualifier on class members

2020 Jun 22

Hi Jeroen, That's great! I was trying to use the patch, what's the latest version of the project we could apply it on? Hi Neil, That seems like what I can do as well! Do you happen to have some examples lying around? Maybe a pointer to the planned presentation, if that's okay? Thank you, Bandhav On Mon, Jun 22, 2020 at 1:55 AM Neil Henning <neil.henning at unity3d.com>

Restrict qualifier on class members

2020 Jun 22

Unfortunately https://llvm.org/docs/LangRef.html#llvm-loop-parallel-accesses-metadata is not a solution here. A loop-parallel access does not imply non-aliasing. The obvious case is when only reading from a location, but even when a location is written to I'd be careful to deduce that they do not alias since it might be a "benign data race" or the value never used. Additionally,

[LLVMdev] What does this error mean: psuedo instructions should be removed before code emission?

2010 Aug 26

On Aug 26, 2010, at 12:59 PM PDT, Eric Christopher wrote: > On Aug 26, 2010, at 12:25 PM, Yuri wrote: >> On 08/26/2010 11:53, Eric Christopher wrote: >>> Could you get it to print out the instruction when it happens? >>> (just change the line above the error message to print it out to >>> errs()). >>> >>> It basically means that a pseudo

[PATCH 01/11] nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

2015 Feb 20

Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 66 +++++++++++++++++++++- 1 file changed, 63 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp index dfb093c..e38a3b8 100644 ---

[LLVMdev] What does this error mean: psuedo instructions should be removed before code emission?

2010 Aug 27

On Aug 26, 2010, at 11:34 PM PDT, Yuri wrote: > On 08/26/2010 13:17, Dale Johannesen wrote: >>>> Insn before the error: TCRETURNri64 %RAX<kill>, 0, %RDI<kill>, >>>> %RAX<imp-def,dead>, %RDI<imp-def,dead>, %RSP<imp-use>, ... >>> >>> Odd. I thought TCReturn was being lowered. At any rate can you >>> file a bug

[LLVMdev] [patch] remove redundant code in X86DisassemblerDecoder.c

2013 Sep 12

There is an if-else in X86DisassemblerDecoder.c that does exactly the same thing on both paths, so this patch removes the redundant path. Thanks, Jun diff --git a/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c b/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c index 20e61da..3932ea1 100644 --- a/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c +++

[PATCH 1/2] nvc0/ir: detect AND/SHR pairs and convert into EXTBF

2015 Aug 19

Some shaders appear to extract bits using shift/and combos. Detect (some) of those and convert to EXTBF instead. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 66 +++++++++++++++------- 1 file changed, 46 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[LLVMdev] Auxiliary operand types for disassembler.

2013 Jun 25

I'm working on a disassembler for the Hexagon (VLIW) architecture and I would like to add an additional operand type, "kAux", to the MCOperand class. The reason for this is that each insn has parse bits which are not explicit operands and have differing meanings based on the insn's location within the packet and the number of insns inside the packet. In order for the disassembler

[LLVMdev] x86 disassembler: if-statement with redundant branch

2010 Dec 16

Hi there! In the x86 disassembler I noticed an if-statement with a duplicated branch. Are these intended to be identical? Best regards, Nicolas Kaiser -- diff -ur llvm-2.8.orig/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c llvm-2.8/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c --- llvm-2.8.orig/lib/Target/X86/Disassembler/X86DisassemblerDecoder.c 2010-05-06 22:59:00.000000000 +0200

[LLVMdev] how to annotate assembler

2012 Mar 02

Hi, In GCC there is a useful option, -dp (or -dP for more verbose output), that annotates the assembler with the instruction patterns that were used when the assembler was generated. For example: double test(long long s) { return s; } gcc -S -dp -O0 test.c test: .LFB0: .cfi_startproc pushq %rbp # 18 *pushdi2_rex64/1 [length = 1] .cfi_def_cfa_offset 16 movq %rsp, %rbp # 19 *movdi_1_rex64/2
