Displaying 20 results from an estimated 48 matches for "quadwords".
2008 Dec 23
3
[LLVMdev] Register Dependencies and Register Allocation
I'm writing a back-end for an architecture that supports multi-word loads.
As a concrete example, "ldqw r0, [addr]" would load a quadword (4 words)
into 4 registers starting with r0 (implicit writes to r1, r2, and r3).
First, is there any currently supported architecture that has anything like
this? I suspect not. If not, I hope someone might help me figure out how
to make this
2007 Aug 29
3
[LLVMdev] Custom GEP lowering
On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote:
> On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
>> It looks like I need to be able to intercept GEP lowering (in
>> SelectionDAGLowering::visitGetElementPtr) and insert something else
>> other than the shifts and adds. The basic problem is that CellSPU
>> loads and stores on 16-byte boundaries. Consequently,
2012 Jun 22
1
[LLVMdev] Inconsistent naming of SSE intrinsics?
Hey guys,
Is there a reason for the following naming quirk in the x86 SSE intrinsics:
int_x86_sse2_pcmpeq_b
int_x86_sse2_pcmpeq_w
int_x86_sse2_pcmpeq_d
int_x86_sse41_pcmpeqq
I anticipated a "_q" suffix for the quadword variant, but was surprised to
see the intrinsic named above.
Just FYI...,
Cameron
2009 Jul 30
2
SYSLINUX 3.83-pre3
I *think* I have found and fixed the Thinkpad MEMDISK problem. The
problem with MS-DOS I understand... not so when it comes to an
apparently unrelated FreeDOS problem, and as such I really don't know
*why* the hack I did works, nor if it will *stay* fixed, but at least it
seems to boot on my T61 (at least until it crashes due to another error...)
-hpa
--
H. Peter Anvin, Intel Open Source
2008 Dec 23
0
[LLVMdev] Register Dependencies and Register Allocation
On Dec 23, 2008, at 11:03 AM PST, Marc de Kruijf wrote:
>
> I'm writing a back-end for an architecture that supports multi-word
> loads. As a concrete example, "ldqw r0, [addr]" would load a
> quadword (4 words) into 4 registers starting with r0 (implicit
> writes to r1, r2, and r3).
ARM has this. It currently works by creating such instructions in a
2009 Jul 31
1
[PATCH] [memdisk] Additional EDD Device Parameter Table fields
Some additional fields from the EDD-4 spec. draft for the Device Parameter
Table have been added into the structure in setup.c and memdisk.inc. These
were added in the hopes of resolving a FreeDOS MEMDISK bug on IBM ThinkPads.
---
memdisk/memdisk.inc | 11 +++++++++++
memdisk/setup.c | 10 ++++++++++
2 files changed, 21 insertions(+), 0 deletions(-)
diff --git a/memdisk/memdisk.inc
2007 Aug 29
0
[LLVMdev] Custom GEP lowering
On Aug 28, 2007, at 6:15 PM, Scott Michel wrote:
> On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote:
>
>> On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
>>> It looks like I need to be able to intercept GEP lowering (in
>>> SelectionDAGLowering::visitGetElementPtr) and insert something else
>>> other than the shifts and adds. The basic problem is
2007 Apr 18
1
[PATCH 0/7] Using %gs for per-cpu areas on x86
OK, here it is. Benchmarks still coming. This is against Andi's
2.6.18-rc7-git3 tree, and replaces the patches between (and not
including) i386-pda-asm-offsets and i386-early-fault.
One patch is identical, one is mildly modified, the rest are
re-implemented but inspired by Jeremy's PDA work.
Thanks,
Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly, the latest optimization completely broke everything.
The asm code isn't gas compliant, the libFLAC linker script has a typo, and
disabling the asm optimization and/or altivec still won't produce a correct
build.
Instant fixes for the asm stuff:
sed -i -e"s:;:\#:" on the lpc_asm.s
to load address instead of addis+ori you could use
lis and la and PLEASE use the @l(register)
2016 Jan 31
2
question about src/test_seeking.c - seek_barrage()
seek_barrage() has a variable n of type long int (which is 32-bit on many platforms).
Then we see something like
n = (long int)total_samples;
So why does n have type long int, and not FLAC__int64 or some other 64-bit type?
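To make the concern concrete, here is a minimal sketch in C of what happens when a 64-bit sample count is squeezed into 32 bits. The helper names (narrow_to_32, fits_in_32, checked_narrow) are hypothetical and are not part of FLAC; int32_t stands in for a long int on a platform where long is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not FLAC code: keeps only the low 32 bits of a
   64-bit sample count, which is what a cast to a 32-bit long would keep. */
int32_t narrow_to_32(uint64_t total_samples)
{
    uint32_t low = (uint32_t)(total_samples & 0xFFFFFFFFu);
    return (int32_t)low;
}

/* Returns 1 if the count can be stored in a 32-bit signed integer
   without losing information, 0 otherwise. */
int fits_in_32(uint64_t total_samples)
{
    return total_samples <= (uint64_t)INT32_MAX;
}

/* Narrowing variant that refuses (via assert) to truncate silently. */
int32_t checked_narrow(uint64_t total_samples)
{
    assert(fits_in_32(total_samples));
    return (int32_t)total_samples;
}
```

With a stream of 2^32 + 5 samples, narrow_to_32 yields 5, so any seek target derived from it would be far too small. Using a 64-bit type throughout avoids the problem without a range check.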
2007 Aug 28
0
[LLVMdev] Custom GEP lowering
On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
> It looks like I need to be able to intercept GEP lowering (in
> SelectionDAGLowering::visitGetElementPtr) and insert something else
> other than the shifts and adds. The basic problem is that CellSPU
> loads and stores on 16-byte boundaries. Consequently, the SPU backend
> has to do the load or store differently
2004 Sep 10
1
altivec lpc_restore_signal
I've had this a long time but haven't submitted it yet.
I've tried to mirror the ia32 setup, so there should be a new subdirectory
src/libFLAC/ppc . The first two attachments go there. The third is a context
diff for src/libFLAC/Makefile.am .
I have some more modified files, which I figured I'd submit after the above
are checked in and working for somebody other than me. If you
2018 Jan 26
4
[RFC] Improving compact x86-64 compact unwind descriptors
Here is our proposal to extend/enhance the x86-64 compact unwind
descriptors to fully describe the prologue/epilogue for asynchronous
unwinding. I believe there are missing/lacking CFI directives as well,
but I'll save that for another thread.
Asynchronous Compact Unwind Descriptors
Ron Brender, VMS Software, Inc.
Revised January 25, 2018
1 Introduction
This document proposes means to
2007 Aug 28
2
[LLVMdev] Custom GEP lowering
It looks like I need to be able to intercept GEP lowering (in
SelectionDAGLowering::visitGetElementPtr) and insert something else
other than the shifts and adds. The basic problem is that CellSPU
loads and stores on 16-byte boundaries. Consequently, the SPU backend
has to do the load or store differently than most normal
architectures that have byte-addressable operations.
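As a rough illustration of the constraint described above, the following C sketch models a machine that can only read memory in aligned 16-byte quadwords and shows how a single byte could be fetched by loading the enclosing quadword and then selecting within it. It is an assumption-laden model and not the actual SPU backend or LLVM code; the function names and the byte-array view of memory are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Address of the aligned 16-byte quadword that contains addr. */
size_t quadword_base(size_t addr)
{
    return addr & ~(size_t)15;
}

/* Byte position of addr inside its quadword (0..15). */
size_t quadword_offset(size_t addr)
{
    return addr & (size_t)15;
}

/* Hypothetical model: the only memory access is an aligned 16-byte read;
   the requested byte is then extracted from the loaded quadword. */
uint8_t load_byte_via_quadword(const uint8_t *mem, size_t mem_size, size_t addr)
{
    uint8_t quad[16];
    size_t base = quadword_base(addr);

    assert(base + 16 <= mem_size);      /* whole quadword must be in range */
    memcpy(quad, mem + base, sizeof quad);
    return quad[quadword_offset(addr)];
}
```

The base-plus-offset split is the part that does not fit the usual "shifts and adds" address computation: the address arithmetic has to produce both an aligned address for the load and a separate selector for the element within the loaded quadword.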
2018 Jan 27
0
[RFC] Improving compact x86-64 compact unwind descriptors
John and Ron,
I developed the original compact unwind implementation for macOS 10.6 back in 2009. I tried to leave space in the design to support finer grain exception handling such as for asynchronous or for the shrink wrap optimization. The idea I had at the time was instead of having just one 32-bit compact unwind info per function, there could be an array of them each covering a different
2018 Jan 29
2
[RFC] Improving compact x86-64 compact unwind descriptors
Hi Nick,
It is a pleasure to be in contact with the creator of the compact unwind
approach!
I can see how an array of 32-bit unwind blocks could be used to describe
each distinct point within a function (within a prolog in particular). But
then you end up with six or seven or more such blocks for a large
percentage of functions, don't you? Seems like a lot of additional space
for something
2014 Sep 19
4
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Hi Chandler,
I have tested the new shuffle lowering on an AMD Jaguar CPU (which is
AVX but not AVX2).
On this particular target, there is a delay when output data from an
execution unit is used as input to another execution unit of a
different cluster. For example, there are 6 execution units which are
divided into 3 execution clusters of Float(FPM,FPA), Vector Integer
(MMXA,MMXB,IMM), and Store
2010 Jan 08
0
[LLVMdev] First-class aggregate semantics
On Thursday 07 January 2010 21:56:11 Dustin Laurence wrote:
> On 01/07/2010 01:38 PM, David Greene wrote:
> > The way this works on many targets is that the caller allocates stack
> > space in its frame for the returned struct and passes a pointer to it
> > as a first "hidden" argument to the callee. The callee then copies
> > that data into the space pointed
2018 Jan 27
0
[RFC] Improving compact x86-64 compact unwind descriptors
Hi John & Ron, I read through the proposal and had a couple of quick observations.
1. The proposed encoding assumes that the epilogue instructions always come at the end of the function -- or rather, just before the next function. If there is a stack protector __stack_chk_fail sequence, or there is NOP padding between functions, then the epilogue cannot be expressed. The proposed encoding