similar to: [LLVMdev] Possible missed optimization on function calling?

Displaying 20 results from an estimated 8000 matches similar to: "[LLVMdev] Possible missed optimization on function calling?"

2010 Dec 05
1
[LLVMdev] Register Pairing
Hello Lang, thanks for the suggestion :) it's very interesting. I'll take a read to the email you've pointed out there to understand how it works. Btw, does this mean that only your allocator is able to handle or support this type of constraint? As a follow up to my previous email, the following code is a real example of what i was explaining, Lang this example is exactly why i need
2010 Dec 02
0
[LLVMdev] Register Pairing
Hi Borja, > Without doing what i mentioned and letting LLVM expand all operations wider > than 8 bits as you asked, the code produced is excellent supposing that many > of the moves there should be 16 bit moves reducing code size and right > register allocation, also something important for me is that the code is > better than gcc's. When i say right reg allocation it doesnt
2010 Dec 01
2
[LLVMdev] Register Pairing
Jeff thanks for those suggestions, that's exactly what i would like to do, however i dont know how to do it with my current knowledge :\ As far as i understand patterns only take one instruction as an input (while the pattern you wrote before takes two) and also, i dont know how to handle register copying (COPY) in the .td file because they're handled in a different way to the rest of
2014 Feb 08
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
On Fri, 7 Feb 2014, Timothy B. Terriberry wrote: > Martin Storsjo wrote: >> This is required in order to build using the built-in assembler >> in clang. > > These patches break the gcc build (with "Error: bad instruction"). Ah, right, sorry about that. > Documentation I've seen is contradictory on which order ({cond}{size} or > {size}{cond}) is correct.
2012 Oct 02
18
[PATCH 0/3] x86: adjust entry frame generation
This set of patches converts the way frames gets created from using PUSHes/POPs to using MOVes, thus allowing (in certain cases) to avoid saving/restoring part of the register set. While the place where the (small) win from this comes from varies between CPUs, the net effect is a 1 to 2% reduction on a combined interruption entry and exit when the full state save can be avoided. 1: use MOV
2014 Sep 29
18
Implement reclocking for DDR2, DDR3, GDDR3
Following a series of patches that implement memory reclocking for NVA3/5/8 with DDR2, DDR3 and GDDR3 on board. I tested these patches on 6 different graphics cards, but I expect reclocking now to work on many more. Testers can pick up these patches and test it by enabling pstate (nouveau.pstate=1). They should then be able to change clocks by writing to /sys/class/drm/card0/device/pstate. Correct
2016 Nov 09
10
Is the correct behavior of getelementptr i192* for opt + llc -march=aarch64?
Hi all, opt and opt + llc generate the difference aarch64 asm code for the following LLVM code. Is it intended behavior? I expected (A) because I cast %p from i192* to i64*. The information is dropped by opt and 8-byte padding is inserted or I write a bad code? % cat a.ll define void @store0_to_p4(i192* %p) { %p1 = bitcast i192* %p to i64* %p2 = getelementptr i64, i64* %p1, i64 3 %p3 =
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly the latest optimization broke completely everything. The asm code isn't gas compliant. the libFLAC linker script has a typo, disabling the asm optimization and/or altivec won't let a correct build anyway. Instant fixes for the asm stuff: sed -i -e"s:;:\#:" on the lpc_asm.s to load address instead of addis+ori you could use lis and la and PLEASE use the @l(register)
2010 Jun 15
2
[LLVMdev] Question on X86 backend
Hi Micah, > In X86InstrInfo.td for Call Instructions, it mentions that Uses for > argument registers are added manually. Can someone point me to the > location where they are added as the comment doesn't reference a > where or how? the register uses are added by the function X86TargetLowering::LowerCall() during the DAG Lowering phase. This is the relevant code segment: // Add
2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in
2004 Mar 02
1
Immediate crash on Mac OS X 10.2.8
Hello, I don't know very much about Macs, so may have missed some obvious steps, but I downloaded version 1.8.1 of RAqua.dmg and installed from the R package file. The Start R icon appeared in the applications directory, but crashed immediately on opening. Here is the log file: Date/Time: 2004-03-03 08:32:27 +1100 OS Version: 10.2.8 (Build 6R73) Host: Kylie-Kings-Computer.local.
2020 Apr 07
2
[ARM] Register pressure with -mthumb forces register reload before each call
If I'm understanding what's going on in this test correctly, what's happening is: * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6) * The function has three arguments, so those three plus the register we need to hold the
2016 May 04
2
OrcLazyJIT for windows
Hi There, I am currently exploring C++ JIT-compilation for a project where this would be very useful. I started with the code from the lli tool which uses OrcLazyJIT and changed it, such that the module is being compiled from c++ source in memory and OrcLazyJIT is used exclusively. Now since I am on windows, I found that my application is crashing when trying to run the main function from the
2017 Oct 20
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
On 20 October 2017 at 09:24, Ingo Molnar <mingo at kernel.org> wrote: > > * Thomas Garnier <thgarnie at google.com> wrote: > >> Change the assembly code to use only relative references of symbols for the >> kernel to be PIE compatible. >> >> Position Independent Executable (PIE) support will allow to extended the >> KASLR randomization range below
2017 Oct 20
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
On 20 October 2017 at 09:24, Ingo Molnar <mingo at kernel.org> wrote: > > * Thomas Garnier <thgarnie at google.com> wrote: > >> Change the assembly code to use only relative references of symbols for the >> kernel to be PIE compatible. >> >> Position Independent Executable (PIE) support will allow to extended the >> KASLR randomization range below
2016 May 04
2
OrcLazyJIT for windows
Hi David, This is really cool. I'd love to get this in-tree. There are two ways we could go about this: (1) Make the OrcArchitecture interface ABI-aware so that it can choose the right resolver code, or (2) Replace the OrcArchitecture classes with OrcABI classes. I.e. We'd just a rename OrcX86_64 -> Orc_X86_64_SysV (and rename I386 & AArch64 similarly) , then we add your code as
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
I submitted the problem report to clang's bugzilla but no one seems to care so I have to send it to the mailing list. clang 3.7 svn (trunk 229055 as the time I was to report this problem) generates slower code than 3.5 (Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)) for the following code. It is a "8 queens puzzle" solver written as an educational example. As
2017 Oct 16
4
[Xen-devel] [PATCH 11/13] x86/paravirt: Add paravirt alternatives infrastructure
On 10/12/2017 03:53 PM, Boris Ostrovsky wrote: > On 10/12/2017 03:27 PM, Andrew Cooper wrote: >> On 12/10/17 20:11, Boris Ostrovsky wrote: >>> There is also another problem: >>> >>> [ 1.312425] general protection fault: 0000 [#1] SMP >>> [ 1.312901] Modules linked in: >>> [ 1.313389] CPU: 0 PID: 1 Comm: init Not tainted 4.14.0-rc4+ #6
2017 Oct 16
4
[Xen-devel] [PATCH 11/13] x86/paravirt: Add paravirt alternatives infrastructure
On 10/12/2017 03:53 PM, Boris Ostrovsky wrote: > On 10/12/2017 03:27 PM, Andrew Cooper wrote: >> On 12/10/17 20:11, Boris Ostrovsky wrote: >>> There is also another problem: >>> >>> [ 1.312425] general protection fault: 0000 [#1] SMP >>> [ 1.312901] Modules linked in: >>> [ 1.313389] CPU: 0 PID: 1 Comm: init Not tainted 4.14.0-rc4+ #6
2018 Dec 14
2
efi config hang
> ah ha... now that I have all the pices in place, this is what I was missing: > > # EFI/BOOT/SYSLX64.CFG > # D-I config version 2.0 > # search path for the c32 support libraries (libcom32, libutil etc.) > PATH EFI/BOOT/SYSLINUX/EFI64/ > DEFAULT common > PROMPT 0 > LABEL common > CONFIG ../../syslinux.cfg ../../ > > The syslinux.cfg and all the other .cfg