Displaying 20 results from an estimated 2506 matches for "movs".
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...shl rax, 20h
mov rsi, offset __mh_execute_header
add rsi, rax
sar rsi, 20h ; size_t
mov edi, 4 ; size_t
call _calloc
lea edx, [r15-1]
movsxd r8, edx
mov ecx, r15d
add ecx, 0FFFFFFFEh
js loc_100000DFA
test r15d, r15d
mov r11d, [rax+r8*4]
jle loc_100000EAE
mov ecx, r15d
add ecx,...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...i, offset __mh_execute_header
>> add rsi, rax
>> sar rsi, 20h ; size_t
>> mov edi, 4 ; size_t
>> call _calloc
>> lea edx, [r15-1]
>> movsxd r8, edx
>> mov ecx, r15d
>> add ecx, 0FFFFFFFEh
>> js loc_100000DFA
>> test r15d, r15d
>> mov r11d, [rax+r8*4]
>> jle loc_100000EAE
>>...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...add rsi, rax
>>>> sar rsi, 20h ; size_t
>>>> mov edi, 4 ; size_t
>>>> call _calloc
>>>> lea edx, [r15-1]
>>>> movsxd r8, edx
>>>> mov ecx, r15d
>>>> add ecx, 0FFFFFFFEh
>>>> js loc_100000DFA
>>>> test r15d, r15d
>>>> mov r11d, [rax+r8*4]
>>>>...
2008 Nov 22
5
[RFC][PATCH] Gfxboot COMBOOT module
...ngth]
+; read file
+; si - file handle
+; es:bx - buffer
+; cx - number of blocks to read
+
+read:
+ push eax
+ mov ax,7
+ mov bx,trackbuf
+ mov cx,[BufSafe]
+ int 22h
+
+ push edi
+ push ecx
+ push si
+ push es
+
+ mov si,trackbuf
+ push edi
+ call gfx_l2so
+ pop di
+ pop es
+
+ rep movsb ; move ds:si -> es:di, length ecx
+ pop es
+ pop si
+ pop ecx
+ pop edi
+
+ pop eax
+ add edi, ecx
+ sub eax, ecx
+ jnz read
+
+bootlogo_read_done:
+ call find_file
+ or eax,eax
+ jnz found_bootlogo
+ stc
+ ret
+
+found_bootlogo:
+ push edi
+ push eax
+ add eax,edi
+ push dword...
2009 Apr 05
3
[PATCH] Gfxboot COMBOOT module
...em]
+
+; read file
+; si - file handle
+; es:bx - buffer
+; cx - number of blocks to read
+
+read:
+ push eax
+ mov ax,7
+ mov bx,trackbuf
+ mov cx,[BufSafe]
+ int 22h
+
+ push edi
+ push ecx
+ push si
+ push es
+
+ mov si,trackbuf
+ push edi
+ call gfx_l2so
+ pop di
+ pop es
+
+ rep movsb ; move ds:si -> es:di, length ecx
+ pop es
+ pop si
+ pop ecx
+ pop edi
+
+ pop eax
+
+ ; si == 0: EOF
+ or si,si
+ jz gfx_read_done
+ add edi,ecx
+ sub eax,ecx
+ ja read
+ jmp gfx_file_too_big
+gfx_read_done:
+ sub eax,ecx
+ mov edx,[file_length]
+ sub edx,eax
+ ; edx = real fi...
2005 Mar 11
0
[LLVMdev] FP Intrinsics
Update: I have been working on this all day, and I finally got it
working, more or less, with the pattern instruction selector. However,
the generated code is not very good, and I haven't implemented the
expansion to calls for targets that do not support these FP instructions.
As an example, in the following function the sub, abs, and compare
compile to 13 instructions! Also it has changed the
2007 Jan 29
8
x86_64 build break in rombios
I am getting the following build break on changeset 13662. I am
compiling on x86_64 SLES10 with gcc 4.1.0. Is there a fix for this?
Thanks,
Aravindh Puthiyaparambil
Xen Development Team
Unisys, Tredyffrin PA
make[1]: Entering directory `/root/xen/xen-unstable.hg/tools/firmware'
make[2]: Entering directory
`/root/xen/xen-unstable.hg/tools/firmware/rombios'
gcc -o biossums
2020 May 22
2
[PATCH] Optimized assembler version of md5_process() for x86-64
This patch introduces an optimized assembler version of md5_process(),
the inner loop of MD5 checksumming. It affects the performance of all
MD5 operations in rsync - including block matching and whole-file
checksums.
Performance gain is 5-10% depending on the specific CPU.
Originally created by Marc Bevand and placed in the public domain,
later integrated into OpenSSL. This is the original
2010 Sep 21
1
[LLVMdev] Possible missed optimization on function calling?
Hello, I noticed that the following code could be improved a little bit
further. If the optimization is too tricky for the compiler, or if it is
done this way by design, forgive me, but in any case I just wanted to
point it out.
Consider the following C code:
extern int mcos(int a);
extern int msin(int a);
extern int mdiv(int a, int b);
int foo(int a, int b)
{
int a4 =
2005 Feb 22
0
[LLVMdev] Area for improvement
On Mon, 21 Feb 2005, Jeff Cohen wrote:
> I noticed that fourinarow is one of the programs in which LLVM is much slower
> than GCC, so I decided to take a look and see why that is so. The program
> has many loops that look like this:
>
> #define ROWS 6
> #define COLS 7
>
> void init_board(char b[COLS][ROWS+1])
> {
> int i,j;
>
> for
2013 Aug 19
3
[LLVMdev] Issue with X86FrameLowering __chkstk on Windows 8 64-bit / Visual Studio 2012
Hi,
I'm using LLVM to convert expressions to native assembly. The problem
arises when LLVM compiles this code:
define void @fn_0000000000000000(i8*, i8*, i8*) {
bb:
%res = alloca i32
%3 = load i32* %res
%4 = bitcast i8* %0 to i32*
%5 = load i32* %4
%6 = bitcast i8* %0 to i32*
%7 = load i32* %6
%8 = xor i32 %5, %7
store volatile i32 %8, i32* %res
%9 = load i32* %res
%10 = icmp
2009 Feb 08
1
[PATCH 1/1] COMBOOT API: Add calls for directory functions; Implement for FAT; Try 2
From: Gene Cumm <gene.cumm at gmail.com>
COMBOOT API: Add calls for directory functions; Implement most only
for FAT (SYSLINUX).
Uses INT 22h AX=001Fh, 0020h, 0021h, and 0022h to prepare for the
COM32 C functions getcwd(), opendir(), readdir(), and closedir(),
respectively. INT 22h AX=001Fh will return a valid value for all
variants. INT 22h AX=0020h, 0021h, and 0022h are only
2018 Nov 30
2
(Question regarding the) incomplete "builtins library" of "Compiler-RT"
"Friedman, Eli" <efriedma at codeaurora.org> wrote:
> On 11/30/2018 8:31 AM, Stefan Kanthak via llvm-dev wrote:
>> Hi @ll,
>>
>> compiler-rt implements (for example) the MSVC (really Windows)
>> specific routines compiler-rt/lib/builtins/i386/chkstk.S and
>> compiler-rt/lib/builtins/x86_64/chkstk.S as __chkstk_ms()
>> See
2004 Sep 13
2
[LLVMdev] How could I get memory address for each assemble instruction?
Hi all,
I am trying to disassemble *.bc into assembly code using the llvm-dis command, but what I got looks like the following. How can I get assembly output like objdump's, that is, with the memory address of each instruction?
Thanks
Qiuyu
llvm-dis:
.text
.align 16
.globl adpcm_coder
.type adpcm_coder, @function
adpcm_coder:
.LBBadpcm_coder_0: # entry
sub %ESP, 116
mov DWORD PTR [%ESP + 12],
2017 Mar 15
2
[LLD] Linking static library does not resolve symbols as gold/ld
Compilers don't know about functions that are not defined in the same
compilation unit, so they leave call instruction operands as zero (because
they can compute neither absolute nor relative addresses of the destinations),
and let linkers fix the address by binary patching.
So, what you are seeing is likely a bug in LLD, where it fails to fix up the
address for some reason.
Can you dump that
2005 Feb 22
2
[LLVMdev] Area for improvement
Sorry, I thought I was running selection dag isel but I screwed up when
trying out the really big array. You're right, it does clean it up
except for the multiplication.
So LoopStrengthReduce is not ready for prime time and doesn't actually
get used?
I might consider whipping it into shape. Does it still have to handle
getelementptr in its full generality?
Chris Lattner wrote:
2011 Nov 02
5
[LLVMdev] About JIT by LLVM 2.9 or later
...[ebp+8]
}
002C13F0 pop edi
002C13F1 pop esi
002C13F2 pop ebx
002C13F3 mov esp,ebp
002C13F5 pop ebp
002C13F6 ret
*Callee( 'fetch' LLVM ):*
010B0010 mov eax,dword ptr [esp+4]
010B0014 mov ecx,dword ptr [esp+8]
010B0018 movss xmm0,dword ptr [ecx+1Ch]
010B001D movss dword ptr [eax+0Ch],xmm0
010B0022 movss xmm0,dword ptr [ecx+18h]
010B0027 movss dword ptr [eax+8],xmm0
010B002C movss xmm0,dword ptr [ecx+10h]
010B0031 movss xmm1,dword ptr [ecx+14h]
010B0036 movss dword ptr [e...
2008 Feb 25
6
[PATCH 0/4] ia64/xen: paravirtualization of hand written assembly code
Hi. The patch I sent before was too large, so it was dropped from
the mailing list. I'm sending it again at a smaller size.
This patch set is the Xen paravirtualization of hand-written assembly
code, and I expect that much cleanup is necessary before merging.
We really need feedback before starting the actual cleanup, as Eddie
already said before.
Eddie discussed how to clean up and suggested
2008 Feb 26
8
[PATCH 0/8] RFC: ia64/xen TAKE 2: paravirtualization of hand written assembly code
Hi. I rewrote the patch according to the comments. I adopted in-place
code generation because it looks like the quickest way.
The point Eddie wanted to discuss is how to generate code and its ABI,
i.e. in-place generation vs. direct jump vs. indirect function call.
An indirect function call doesn't make sense because ivt.S is compiled
multiple times, and it is up to PV instances to choose in-place