search for: quadwords

Displaying 20 results from an estimated 48 matches for "quadwords".

2008 Dec 23
3
[LLVMdev] Register Dependencies and Register Allocation
I'm writing a back-end for an architecture that supports multi-word loads. As a concrete example, "ldqw r0, [addr]" would load a quadword (4 words) into 4 registers starting with r0 (implicit writes to r1, r2, and r3). First, is there any currently supported architecture that has anything like this? I suspect not. If not, I hope someone might help me figure out how to make this
2007 Aug 29
3
[LLVMdev] Custom GEP lowering
On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote: > On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote: >> It looks like I need to be able to intercept GEP lowering (in >> SelectionDAGLowering::visitGetElementPtr) and insert something else >> other than the shifts and adds. The basic problem is that CellSPU >> loads and stores on 16-byte boundaries. Consequently,
2012 Jun 22
1
[LLVMdev] Inconsistent naming of SSE intrinsics?
Hey guys, Is there a reason for the following naming quirk in the x86 SSE intrinsics: int_x86_sse2_pcmpeq_b int_x86_sse2_pcmpeq_w int_x86_sse2_pcmpeq_d int_x86_sse41_pcmpeqq I anticipated a "_q" suffix for the quadword variant, but was surprised to see the intrinsic named above. Just FYI..., Cameron -------------- next part -------------- An HTML attachment was scrubbed... URL:
2009 Jul 30
2
SYSLINUX 3.83-pre3
I *think* I have found and fixed the Thinkpad MEMDISK problem. The problem with MS-DOS I understand... not so when it comes to an apparently unrelated FreeDOS problem, and as such I really don't know *why* the hack I did works, nor if it will *stay* fixed, but at least it seems to boot on my T61 (at least until it crashes due to another error...) -hpa -- H. Peter Anvin, Intel Open Source
2008 Dec 23
0
[LLVMdev] Register Dependencies and Register Allocation
On Dec 23, 2008, at 11:03 AMPST, Marc de Kruijf wrote: > > I'm writing a back-end for an architecture that supports multi-word > loads. As a concrete example, "ldqw r0, [addr]" would load a > quadword (4 words) into 4 registers starting with r0 (implicit > writes to r1, r2, and r3). ARM has this. It currently works by creating such instructions in a
2009 Jul 31
1
[PATCH] [memdisk] Additional EDD Device Parameter Table fields
Some additional fields from the EDD-4 spec. draft for the Device Parameter Table have been added into the structure in setup.c and memdisk.inc. These were added in the hopes of resolving a FreeDOS MEMDISK bug on IBM ThinkPads. --- memdisk/memdisk.inc | 11 +++++++++++ memdisk/setup.c | 10 ++++++++++ 2 files changed, 21 insertions(+), 0 deletions(-) diff --git a/memdisk/memdisk.inc
2007 Aug 29
0
[LLVMdev] Custom GEP lowering
On Aug 28, 2007, at 6:15 PM, Scott Michel wrote: > On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote: > >> On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote: >>> It looks like I need to be able to intercept GEP lowering (in >>> SelectionDAGLowering::visitGetElementPtr) and insert something else >>> other than the shifts and adds. The basic problem is
2007 Apr 18
1
[PATCH 0/7] Using %gs for per-cpu areas on x86
OK, here it is. Benchmarks still coming. This is against Andi's 2.6.18-rc7-git3 tree, and replaces the patches between (and not including) i386-pda-asm-offsets and i386-early-fault. One patch is identical, one is mildly modified, the rest are re-implemented but inspired by Jeremy's PDA work. Thanks, Rusty. -- Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
2007 Apr 18
1
[PATCH 0/7] Using %gs for per-cpu areas on x86
OK, here it is. Benchmarks still coming. This is against Andi's 2.6.18-rc7-git3 tree, and replaces the patches between (and not including) i386-pda-asm-offsets and i386-early-fault. One patch is identical, one is mildly modified, the rest are re-implemented but inspired by Jeremy's PDA work. Thanks, Rusty. -- Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly the latest optimization broke completely everything. The asm code isn't gas compliant. the libFLAC linker script has a typo, disabling the asm optimization and/or altivec won't let a correct build anyway. Instant fixes for the asm stuff: sed -i -e"s:;:\#:" on the lpc_asm.s to load address instead of addis+ori you could use lis and la and PLEASE use the @l(register)
2016 Jan 31
2
question about src/test_seeking.c - seek_barrage()
seek_barrage() has variable n of type long int (which is 32bit usually). Then we see something like n = (long int)total_samples; So, why n has type long int, and not FLAC__int64 or some other 64-bit type?
2007 Aug 28
0
[LLVMdev] Custom GEP lowering
On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote: > It looks like I need to be able to intercept GEP lowering (in > SelectionDAGLowering::visitGetElementPtr) and insert something else > other than the shifts and adds. The basic problem is that CellSPU > loads and stores on 16-byte boundaries. Consequently, the SPU backend > has to do the load or store differently
2004 Sep 10
1
altivec lpc_restore_signal
I've had this a long time but haven't submitted it yet. I've tried to mirror the ia32 setup, so there should be a new subdirectory src/libFLAC/ppc . The first two attachments go there. The third is a context diff for src/libFLAC/Makefile.am . I have some more modified files, which I figured I'd submit after the above are checked in and working for somebody other than me. If you
2018 Jan 26
4
[RFC] Improving compact x86-64 compact unwind descriptors
Here is our proposal to extend/enhance the x86-64 compact unwind descriptors to fully describe the prologue/epilogue for asynchronous unwinding.  I believe there are missing/lacking CFI directives as well, but I'll save that for another thread. Asynchronous Compact Unwind Descriptors Ron Brender, VMS Software, Inc. Revised January 25, 2018 1  Introduction This document proposes means to
2007 Aug 28
2
[LLVMdev] Custom GEP lowering
It looks like I need to be able to intercept GEP lowering (in SelectionDAGLowering::visitGetElementPtr) and insert something else other than the shifts and adds. The basic problem is that CellSPU loads and stores on 16-byte boundaries. Consequently, the SPU backend has to do the load or store differently than most normal architectures that have byte-addressable operations.
2018 Jan 27
0
[RFC] Improving compact x86-64 compact unwind descriptors
John and Ron, I developed the original compact unwind implementation for macOS 10.6 back in 2009. I tried to leave space in the design to support finer grain exception handling such as for asynchronous or for the shrink wrap optimization. The idea I had at the time was instead of having just one 32-bit compact unwind info per function, there could be an array of them each covering a different
2018 Jan 29
2
[RFC] Improving compact x86-64 compact unwind descriptors
Hi Nick, It is a pleasure to be in contact with the creator of the compact unwind approach! I can see how an array of 32-bit unwind blocks could be used to describe each distinct point within a function (within a prolog in particular). But then you end up with six or seven or more such blocks for a large percentage of functions, don't you? Seems like a lot of additional space for something
2014 Sep 19
4
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Hi Chandler, I have tested the new shuffle lowering on a AMD Jaguar cpu (which is AVX but not AVX2). On this particular target, there is a delay when output data from an execution unit is used as input to another execution unit of a different cluster. For example, There are 6 executions units which are divided into 3 execution clusters of Float(FPM,FPA), Vector Integer (MMXA,MMXB,IMM), and Store
2010 Jan 08
0
[LLVMdev] First-class aggregate semantics
On Thursday 07 January 2010 21:56:11 Dustin Laurence wrote: > On 01/07/2010 01:38 PM, David Greene wrote: > > The way this works on many targets is that the caller allocates stack > > space in its frame for the returned struct and passes a pointer to it > > as a first "hidden" argument to the callee. The callee then copies > > that data into the space pointed
2018 Jan 27
0
[RFC] Improving compact x86-64 compact unwind descriptors
Hi John & Ron, I read through the proposal and had a couple of quick observations. 1. The proposed encoding assumes that the epilogue instructions always come at the end of the function -- or rather, just before the next function. If there is a stack protector __stack_chk_fail sequence, or there is NOP padding between functions, then the epilogue cannot be expressed. The proposed encoding