thr3ads.net - search: "quadwords"

Displaying 20 results from an estimated 48 matches for "quadwords".

[LLVMdev] Register Dependencies and Register Allocation

2008 Dec 23

I'm writing a back-end for an architecture that supports multi-word loads. As a concrete example, "ldqw r0, [addr]" would load a quadword (4 words) into 4 registers starting with r0 (implicit writes to r1, r2, and r3). First, is there any currently supported architecture that has anything like this? I suspect not. If not, I hope someone might help me figure out how to make this
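The instruction described in this snippet can be modelled roughly as follows. This is only an illustrative sketch: the word-addressed `memory` list, the `regs` list, and the `ldqw` helper are assumptions made for the example, not part of any real back-end. It shows why a register allocator has to treat r1, r2, and r3 as written even though only r0 appears in the instruction text.

```python
def ldqw(regs, memory, base_reg, addr):
    """Model a quadword load: four consecutive words go into four consecutive registers.

    `regs` and `memory` are plain lists (hypothetical, word-addressed).
    Returns the list of register numbers written, i.e. the explicit
    destination plus the three implicit ones.
    """
    if base_reg + 4 > len(regs):
        raise ValueError("not enough registers after the base register")
    for i in range(4):
        regs[base_reg + i] = memory[addr + i]
    return [base_reg + i for i in range(4)]


regs = [0] * 8
memory = [10, 11, 12, 13, 14, 15]
written = ldqw(regs, memory, 0, 1)  # models "ldqw r0, [1]"
```

After the call, `written` names all four registers that the allocator must consider clobbered, not just the one named in the operand list.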

[LLVMdev] Custom GEP lowering

2007 Aug 29

On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote:
> On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
>> It looks like I need to be able to intercept GEP lowering (in
>> SelectionDAGLowering::visitGetElementPtr) and insert something else
>> other than the shifts and adds. The basic problem is that CellSPU
>> loads and stores on 16-byte boundaries. Consequently,

[LLVMdev] Inconsistent naming of SSE intrinsics?

2012 Jun 22

Hey guys,

Is there a reason for the following naming quirk in the x86 SSE intrinsics:

int_x86_sse2_pcmpeq_b
int_x86_sse2_pcmpeq_w
int_x86_sse2_pcmpeq_d
int_x86_sse41_pcmpeqq

I anticipated a "_q" suffix for the quadword variant, but was surprised to see the intrinsic named above.

Just FYI...,
Cameron

SYSLINUX 3.83-pre3

2009 Jul 30

I *think* I have found and fixed the Thinkpad MEMDISK problem. The problem with MS-DOS I understand... not so when it comes to an apparently unrelated FreeDOS problem, and as such I really don't know *why* the hack I did works, nor if it will *stay* fixed, but at least it seems to boot on my T61 (at least until it crashes due to another error...) -hpa -- H. Peter Anvin, Intel Open Source

[LLVMdev] Register Dependencies and Register Allocation

2008 Dec 23

On Dec 23, 2008, at 11:03 AM PST, Marc de Kruijf wrote:
>
> I'm writing a back-end for an architecture that supports multi-word
> loads. As a concrete example, "ldqw r0, [addr]" would load a
> quadword (4 words) into 4 registers starting with r0 (implicit
> writes to r1, r2, and r3).

ARM has this. It currently works by creating such instructions in a

[PATCH] [memdisk] Additional EDD Device Parameter Table fields

2009 Jul 31

Some additional fields from the EDD-4 spec. draft for the Device Parameter Table have been added into the structure in setup.c and memdisk.inc. These were added in the hopes of resolving a FreeDOS MEMDISK bug on IBM ThinkPads.
---
 memdisk/memdisk.inc |   11 +++++++++++
 memdisk/setup.c     |   10 ++++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/memdisk/memdisk.inc

[LLVMdev] Custom GEP lowering

2007 Aug 29

On Aug 28, 2007, at 6:15 PM, Scott Michel wrote:
> On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote:
>
>> On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
>>> It looks like I need to be able to intercept GEP lowering (in
>>> SelectionDAGLowering::visitGetElementPtr) and insert something else
>>> other than the shifts and adds. The basic problem is

[PATCH 0/7] Using %gs for per-cpu areas on x86

2007 Apr 18

OK, here it is. Benchmarks still coming. This is against Andi's 2.6.18-rc7-git3 tree, and replaces the patches between (and not including) i386-pda-asm-offsets and i386-early-fault. One patch is identical, one is mildly modified, the rest are re-implemented but inspired by Jeremy's PDA work. Thanks, Rusty. -- Help! Save Australia from the worst of the DMCA: http://linux.org.au/law

[PATCH 0/7] Using %gs for per-cpu areas on x86

2007 Apr 18

flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)

2004 Oct 06

Sadly, the latest optimization completely broke everything. The asm code isn't gas compliant, the libFLAC linker script has a typo, and disabling the asm optimization and/or AltiVec doesn't give a correct build anyway. Instant fixes for the asm stuff: run sed -i -e"s:;:\#:" on lpc_asm.s; to load an address, instead of addis+ori you could use lis and la; and PLEASE use the @l(register)

question about src/test_seeking.c - seek_barrage()

2016 Jan 31

seek_barrage() has a variable n of type long int (which is usually 32-bit). Then we see something like n = (long int)total_samples; So why does n have type long int, and not FLAC__int64 or some other 64-bit type?
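The concern behind this question can be illustrated with a small sketch. It assumes a platform where long int is 32 bits wide and where an out-of-range conversion wraps around (the common behaviour, though C leaves it implementation-defined). The helper `to_int32` and the sample count are hypothetical and are not taken from the FLAC sources.

```python
import ctypes


def to_int32(value):
    """Keep only the low 32 bits of value and reinterpret them as a signed integer."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


# Hypothetical stream length that does not fit in 32 bits.
total_samples = 5_000_000_000

# Models `n = (long int)total_samples;` when long int is 32 bits wide.
n = to_int32(total_samples)
print(n)  # 705032704, not 5000000000
```

Values that fit in 31 bits pass through unchanged, so the problem would only show up for very long streams.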

[LLVMdev] Custom GEP lowering

2007 Aug 28

On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
> It looks like I need to be able to intercept GEP lowering (in
> SelectionDAGLowering::visitGetElementPtr) and insert something else
> other than the shifts and adds. The basic problem is that CellSPU
> loads and stores on 16-byte boundaries. Consequently, the SPU backend
> has to do the load or store differently

altivec lpc_restore_signal

2004 Sep 10

I've had this a long time but haven't submitted it yet. I've tried to mirror the ia32 setup, so there should be a new subdirectory src/libFLAC/ppc . The first two attachments go there. The third is a context diff for src/libFLAC/Makefile.am . I have some more modified files, which I figured I'd submit after the above are checked in and working for somebody other than me. If you

[RFC] Improving compact x86-64 compact unwind descriptors

2018 Jan 26

Here is our proposal to extend/enhance the x86-64 compact unwind descriptors to fully describe the prologue/epilogue for asynchronous unwinding. I believe there are missing/lacking CFI directives as well, but I'll save that for another thread.

Asynchronous Compact Unwind Descriptors
Ron Brender, VMS Software, Inc.
Revised January 25, 2018

1 Introduction

This document proposes means to

[LLVMdev] Custom GEP lowering

2007 Aug 28

It looks like I need to be able to intercept GEP lowering (in SelectionDAGLowering::visitGetElementPtr) and insert something else other than the shifts and adds. The basic problem is that CellSPU loads and stores on 16-byte boundaries. Consequently, the SPU backend has to do the load or store differently than most normal architectures that have byte-addressable operations.
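To make the constraint concrete, here is a rough sketch of what "loads only on 16-byte boundaries" implies for an arbitrary byte address. It is not the CellSPU backend's actual lowering; the names `split_address` and `load_bytes` and the flat `memory` buffer are assumptions for illustration. The idea is that the address is split into an aligned quadword base plus a byte offset, the whole quadword is fetched, and the wanted bytes are then extracted from it.

```python
QUADWORD_BYTES = 16


def split_address(addr):
    """Split a byte address into (aligned quadword base, offset within the quadword)."""
    base = addr & ~(QUADWORD_BYTES - 1)
    offset = addr & (QUADWORD_BYTES - 1)
    return base, offset


def load_bytes(memory, addr, size):
    """Emulate a small load on a machine that can only fetch aligned 16-byte quadwords."""
    base, offset = split_address(addr)
    if offset + size > QUADWORD_BYTES:
        raise ValueError("value straddles a quadword boundary")
    quadword = memory[base:base + QUADWORD_BYTES]  # the only real memory access
    return quadword[offset:offset + size]          # extraction step


memory = bytes(range(64))
base, offset = split_address(0x26)
value = load_bytes(memory, 0x26, 4)
```

The extraction step is a plain slice here; on real hardware it would have to be expressed with whatever instructions the target provides for moving bytes within a quadword, which is why the usual shift-and-add address arithmetic alone is not enough.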

[RFC] Improving compact x86-64 compact unwind descriptors

2018 Jan 27

John and Ron, I developed the original compact unwind implementation for macOS 10.6 back in 2009. I tried to leave space in the design to support finer grain exception handling such as for asynchronous or for the shrink wrap optimization. The idea I had at the time was instead of having just one 32-bit compact unwind info per function, there could be an array of them each covering a different

[RFC] Improving compact x86-64 compact unwind descriptors

2018 Jan 29

Hi Nick, It is a pleasure to be in contact with the creator of the compact unwind approach! I can see how an array of 32-bit unwind blocks could be used to describe each distinct point within a function (within a prolog in particular). But then you end up with six or seven or more such blocks for a large percentage of functions, don't you? Seems like a lot of additional space for something

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

2014 Sep 19

Hi Chandler, I have tested the new shuffle lowering on an AMD Jaguar CPU (which has AVX but not AVX2). On this particular target, there is a delay when output data from an execution unit is used as input to another execution unit of a different cluster. For example, there are 6 execution units, which are divided into 3 execution clusters: Float (FPM, FPA), Vector Integer (MMXA, MMXB, IMM), and Store

[LLVMdev] First-class aggregate semantics

2010 Jan 08

On Thursday 07 January 2010 21:56:11 Dustin Laurence wrote:
> On 01/07/2010 01:38 PM, David Greene wrote:
> > The way this works on many targets is that the caller allocates stack
> > space in its frame for the returned struct and passes a pointer to it
> > as a first "hidden" argument to the callee. The callee then copies
> > that data into the space pointed

[RFC] Improving compact x86-64 compact unwind descriptors

2018 Jan 27

Hi John & Ron, I read through the proposal and had a couple of quick observations. 1. The proposed encoding assumes that the epilogue instructions always come at the end of the function -- or rather, just before the next function. If there is a stack protector __stack_chk_fail sequence, or there is NOP padding between functions, then the epilogue cannot be expressed. The proposed encoding
