search for: 16byte

Displaying 20 results from an estimated 23 matches for "16byte".

2009 Mar 11
4
[LLVMdev] Bug in X86CompilationCallback_SSE
I don't know how to file a PR, but I have a patch (see below), that should work regardless of ABI differences, since it relies on the compiler to do the tough job. void X86CompilationCallback_SSE(void) { char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned asm volatile ( "movl %%eax,(%0)\n" "movl %%edx,4(%0)\n" // Save EAX/EDX/ECX "movl %%ecx,8(%0)\n" :: "r"(SAVEBUF+64): "memory" ); asm volatile ( // Save all XMM arg registers "movaps %%xmm0, (%...
2018 Jan 14
1
PCIe ordering and new VIRTIO packed ring format.
...ing and Granularity Observed by a Read Transaction" " if a host CPU writes a QWORD to host memory, a Requester reading that QWORD from host memory may observe a portion of the QWORD updated and another portion of it containing the old value." This means that after the device reads a 16byte descriptor, it cannot know that all the values in the descriptor are up to date even if the VIRTQ_DESC_F_AVAIL bit is set. This is true even if the driver uses the appropriate memory barriers. We encountered this behavior in practice on x86 servers. Our solution was to add an index to the latest v...
2009 Mar 12
0
[LLVMdev] Bug in X86CompilationCallback_SSE
...ote: > I don't know how to file a PR, but I have a patch (see below), that > should work regardless of ABI differences, since it relies on the > compiler to do the tough job. > > void X86CompilationCallback_SSE(void) { > char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned How do you ensure it's 16-byte aligned? Can you declare a local array and specify alignment using attribute __aligned? Evan > > asm volatile ( > "movl %%eax,(%0)\n" > "movl %%edx,4(%0)\n" // Save EAX/EDX/ECX > "movl %%ecx...
2006 Feb 10
4
Dtracing scsi
A small script to see what SCSI commands are being issued by a system: http://blogs.sun.com/roller/page/chrisg?entry=scsi_d_script Still work in progress as I needs to handle larger CDBs but it is a start, since I don't have a disk that big it is not a problem for me yet. Also getting the return scsi packet is a hack but so far I can see no alternative short of knowing about all the
2013 Feb 15
0
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
..., there is a bug in the code that computes the required space (which needs to be 16 byte aligned): see X86ISelLowering::GetAlignedArgumentStackSize In general, the tail caller knows that there is space because it was called and its parameters were put there (plus some empty space to keep the stack 16byte aligned). To keep the stack aligned the parameter area changes in increments of 16. There was a bug (apparently, I did not upstream the fix for it :( ) in the way we calculate this "adjustment" that would cause a 0 stack space to pump up the alignment by 12 (16 - return addr). It is safe...
2013 Feb 15
1
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
...s a bug in the code that computes the required space (which needs to be 16 byte aligned): see X86ISelLowering::GetAlignedArgumentStackSize > > In general, the tail caller knows that there is space because it was called and its parameters were put there (plus some empty space to keep the stack 16byte aligned). To keep the stack aligned the parameter area changes in increments of 16. There was a bug (apparently, I did not upstream the fix for it :( ) in the way we calculate this "adjustment" that would cause a 0 stack space to pump up the alignment by 12 (16 - return addr). It is safe...
2009 Mar 11
0
[LLVMdev] Bug in X86CompilationCallback_SSE
Hello, Corrado > Before you can correctly invoke a function via the Procedure Linkage > Table (plt), the ABI mandates that ebx is pointing to the GOT (Global > Offset Table) (see http://www.greyhat.ch/lab/downloads/pic.html) This is a known issue; just nobody realized that we have a bunch of non-PIC-aware assembler code. :) Fixing it would not be so trivial though, mostly due to ABI
2013 Feb 15
2
[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates
>> While investigating one of the existing tests >> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some >> interesting code. The IR is very straightforward: >> >> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 >> %a4) { >> entry: >> ret i32 %a3 >> } >> >> define fastcc i32 @tailcaller(i32
2009 Mar 12
0
[LLVMdev] Bug in X86CompilationCallback_SSE
...ote: > I don't know how to file a PR, but I have a patch (see below), that > should work regardless of ABI differences, since it relies on the > compiler to do the tough job. > > void X86CompilationCallback_SSE(void) { > char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned > > asm volatile ( > "movl %%eax,(%0)\n" > "movl %%edx,4(%0)\n" // Save EAX/EDX/ECX > "movl %%ecx,8(%0)\n" > :: "r"(SAVEBUF+64): "memory" ); > > asm volatile ( > // Save all XMM arg re...
2007 Oct 04
0
[LLVMdev] RFC: Tail call optimization X86
...argument size (and when tail-call-opt-align-stack is enabled) i make the resulting argument size a multiple of the target alignment (minus the return address slot). So in the example above the argument stack slot size would be 12bytes for both functions.(on darwin-x86 which requires the stack to be 16byte aligned - which i found out by mistake - because dynamically linked function calls would not work ;) this results in a stack adjustment that is a multiple of the target stack alignment. Maybe the default should be to perform this stack alignment and turn it off with a switch (tail-call-opt-disable...
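The sizing rule this snippet describes (argument area padded so that it, plus the return address slot, is a multiple of the target stack alignment) can be sketched in C as below. This is only an illustration, not LLVM's code: `padded_arg_size` and its parameters are hypothetical names, and the 4-byte return address slot assumes 32-bit x86.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): returns the smallest n >= size
   such that n + slot is a multiple of align.
   size  - raw argument area size in bytes
   align - target stack alignment, must be a power of two (16 on darwin-x86)
   slot  - size of the return address slot (4 on 32-bit x86) */
uint32_t padded_arg_size(uint32_t size, uint32_t align, uint32_t slot) {
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(slot <= align);
    uint32_t mask = align - 1;
    /* Round size + slot up to a multiple of align, then take the slot out again. */
    return ((size + slot + mask) & ~mask) - slot;
}
```

With a 16-byte alignment and a 4-byte slot, an 8-byte or 12-byte argument area both come out as 12 bytes, which is consistent with the 12-byte figure quoted in the snippet. Note that under this rule an empty argument area is also padded to 12 bytes.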
2009 Mar 10
2
[LLVMdev] Bug in X86CompilationCallback_SSE
Hello. I found that the X86CompilationCallback_SSE wrapper for X86CompilationCallback2 is not setting up properly for the PIC invocation. Before you can correctly invoke a function via the Procedure Linkage Table (plt), the ABI mandates that ebx is pointing to the GOT (Global Offset Table) (see http://www.greyhat.ch/lab/downloads/pic.html) Dump of assembler code for function
2010 May 28
3
[LLVMdev] Vectorized LLVM IR
Hi, We are experimenting with directly generating vectorized LLVM IR (using <8 x float> kind of types), then compiling the code to SSE on a 64-bit machine. Right now the equivalent code in scalar mode still outperforms the SSE one. What is the quality of the SSE support in the X86 LLVM backend? Are there any specific things to be aware of to improve the speed? Thanks Stéphane Letz
2006 Jun 26
0
[klibc 25/43] ia64 support for klibc
...It returns zero. Subsequent calls to "LongJump" will +// restore the registers and return non-zero to the same location. +// +// On entry, r32 contains the pointer to the jmp_buffer +// + .align 32 + .global setjmp + .proc setjmp +setjmp: + // + // Make sure buffer is aligned at 16byte boundary + // + add r10 = -0x10,r0 ;; // mask the lower 4 bits + and r32 = r32, r10;; + add r32 = 0x10, r32;; // move to next 16 byte boundary + + add r10 = J_PREDS, r32 // skip Unats & pfs save area + add r11 = J_BSP, r32 + // + // save immedia...
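The three-instruction alignment sequence in this snippet (build the mask `-0x10`, AND it with the pointer, add `0x10`) corresponds roughly to the C sketch below. The function name `next_boundary` and the `align` parameter are hypothetical; the assembly hardcodes 16.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of the snippet's sequence:
     add r10 = -0x10, r0    -> mask = ~(align - 1)
     and r32 = r32, r10     -> clear the low bits
     add r32 = 0x10, r32    -> step to the next boundary */
uintptr_t next_boundary(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    uintptr_t mask = ~(align - 1);
    return (addr & mask) + align;
}
```

As written, the sequence always advances, so an address that is already on a boundary still moves up by a full 16 bytes; the buffer presumably has to reserve that much slack.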
2015 Oct 08
5
RFC: Reducing Instr PGO size overhead
...d to replace the raw name string > Pros: > 1. Very simple to implement > 2. have good reduction of all sizes for typical large C++ > applications > 3. No profile data format change is required. > > Cons: > 1. Still requires 16byte overhead per-function -- can actually > hurt C programs > 2. -fcoverage-mapping use case is still not handled > 3. The problem with llvm-profdata still exists (no symbolic info, > partial filtering support) > > > Solution-3: > > This is the new solutio...
2015 Oct 09
2
RFC: Reducing Instr PGO size overhead
...1. Very simple to implement >> > 2. have good reduction of all sizes for typical large C++ >> > applications >> > 3. No profile data format change is required. >> > >> > Cons: >> > 1. Still requires 16byte overhead per-function -- can actually >> > hurt C programs >> > 2. -fcoverage-mapping use case is still not handled >> > 3. The problem with llvm-profdata still exists (no symbolic >> > info, >> > partial filtering support) >> >...
2015 Dec 09
2
RFC: Reducing Instr PGO size overhead
...>> > 2. have good reduction of all sizes for typical large C++ >>>> > applications >>>> > 3. No profile data format change is required. >>>> > >>>> > Cons: >>>> > 1. Still requires 16byte overhead per-function -- can actually >>>> > hurt C programs >>>> > 2. -fcoverage-mapping use case is still not handled >>>> > 3. The problem with llvm-profdata still exists (no symbolic >>>> > info, >>>> >...
2007 Oct 03
4
[LLVMdev] RFC: Tail call optimization X86
On Oct 2, 2007, at 2:27 AM, Arnold Schwaighofer wrote: > Hi all, > > I changed the code that checks whether a tail call is really > eligible for optimization so that it performs the check/fix in > SelectionDAGISel.cpp:BuildSelectionDAG() as suggest by Evan. Also > eliminated an error that caused the remaining failing test cases in > the test-suite. > > The
2015 Sep 08
2
RFC: Reducing Instr PGO size overhead
>> >> >> >> yes -- it is fixed length (8byte) blob which may include null byte in >> >> the middle. >> > >> > >> > For reference, MD5 sum is 16 bytes (128-bit): >> > https://en.wikipedia.org/wiki/MD5 >> >> yes, LLVM's MD5 hash only takes the lower 64bit. >> >> >> > >> >>
2007 Dec 06
51
[PATCH 0/19] desc_struct integration
Hi, this is a series of patches that unify the struct desc_struct and friends across x86_64 and i386. As usual, it provides paravirt capabilities as a side-effect for x86_64. I consider the main goal, namely, of unifying the desc_struct, an ongoing effort, being this the beginning. A lot of old code has to be touched to accomplish that. I don't consider this patch ready for inclusion.
2007 Dec 06
51
[PATCH 0/19] desc_struct integration
Hi, this is a series of patches that unify the struct desc_struct and friends across x86_64 and i386. As usual, it provides paravirt capabilities as a side-effect for x86_64. I consider the main goal, namely, of unifying the desc_struct, an ongoing effort, being this the beginning. A lot of old code has to be touched to accomplish that. I don't consider this patch ready for inclusion.