Displaying 20 results from an estimated 38 matches for "shll".
Did you mean:
shall
2003 Aug 22
2
kernel: locore.s doesn't assemble (fillkpt, $PAGE_SHIFT, $PTESHIFT)
...5: Error: suffix or operands invalid for `shr'
shrl $PAGE_SHIFT,%ecx
/tmp/ccOO8Chb.s:2496: Error: suffix or operands invalid for `shr'
/tmp/ccOO8Chb.s:2496: Error: suffix or operands invalid for `shl'
movl %eax, %ebx ; shrl $PAGE_SHIFT, %ebx ;
shll $PTESHIFT,%ebx ; addl (( KPTphys
)-KERNBASE) ,%ebx ; orl $0x001 ,%eax ; orl
%edx ,%eax ; 1: movl %eax,(%ebx)
; addl $PAGE_SIZE,%eax ; addl $PTESIZE,%ebx
; loop 1b
/tmp/ccOO8Chb.s:2512: Error: suffix or operands in...
2016 May 24
5
Liveness of AL, AH and AX in x86 backend
...But the output at -O2 is
foo: # @foo
.cfi_startproc
# BB#0: # %entry
movb (%rdi), %al
movzbl 1(%rdi), %ecx
movb %al, z(%rip)
movb %cl, z+1(%rip)
incb %al
shll $8, %ecx
movzbl %al, %eax
orl %ecx, %eax
retq
I was hoping it would do something along the lines of
movb (%rdi), %al
movb 1(%rdi), %ah
movw %ax, z(%rip)
incb %al
retq
Why is the x86 backend not getting this code? Does it know that AH:AL = AX?...
2019 Aug 15
2
Slow XCHG in arch/i386/libgcc/__ashrdi3.S and arch/i386/libgcc/__lshrdi3.S
...%eax
^
xchgl %edx,%eax xchgl %edx,%eax
ret ret
At least and especially on Intel processors XCHG was and
still is a rather slow instruction and should be avoided.
Use the following better code sequences instead:
1: 1:
shll %cl,%eax shrl %cl,%edx
movl %eax,%edx movl %edx,%eax
xorl %eax,%eax xorl %edx,%edx
ret ret
regards
Stefan Kanthak
PS: I doubt that a current GCC emits calls of the routines
in the /usr/klibc/arch/i386 subdirectory a...
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
...# @foo
> .cfi_startproc
># BB#0: # %entry
> movb (%rdi), %al
> movzbl 1(%rdi), %ecx
> movb %al, z(%rip)
> movb %cl, z+1(%rip)
> incb %al
> shll $8, %ecx
> movzbl %al, %eax
> orl %ecx, %eax
> retq
>
>
>I was hoping it would do something along the lines of
>
> movb (%rdi), %al
> movb 1(%rdi), %ah
> movw %ax, z(%rip)
> incb %al
> retq
>
>
>Why is the x86 ba...
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
...; foo: # @foo
> .cfi_startproc
> # BB#0: # %entry
> movb (%rdi), %al
> movzbl 1(%rdi), %ecx
> movb %al, z(%rip)
> movb %cl, z+1(%rip)
> incb %al
> shll $8, %ecx
> movzbl %al, %eax
> orl %ecx, %eax
> retq
>
>
> I was hoping it would do something along the lines of
>
> movb (%rdi), %al
> movb 1(%rdi), %ah
> movw %ax, z(%rip)
> incb %al
> retq
>
>
> Why is the x86 bac...
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
...ated code from the current community head for bzip2.c from spec 256.bzip2, with these options:
clang -m32 -S -O2 bzip2.c
.LBB14_4: # %bsW.exit24
subl %eax, %ebx
addl $8, %eax
movl %ebx, %ecx
movl %eax, bsLive
shll %cl, %edi
movl %ebp, %ecx
orl %esi, %edi
movzbl %ch, %esi
cmpl $8, %eax
movl %edi, bsBuff
jl .LBB14_6
As you can see, it is using both cl and ch for different values in this basic block. This occurs in the generated code for th...
2019 Aug 20
1
Slow XCHG in arch/i386/libgcc/__ashrdi3.S and arch/i386/libgcc/__lshrdi3.S
...ret ret
>>
>> At least and especially on Intel processors XCHG was and
>> still is a rather slow instruction and should be avoided.
>> Use the following better code sequences instead:
>>
>> 1: 1:
>> shll %cl,%eax shrl %cl,%edx
>> movl %eax,%edx movl %edx,%eax
>> xorl %eax,%eax xorl %edx,%edx
>> ret ret
>>
>> regards
>> Stefan Kanthak
>>
>
> XCHG is slow for register-memor...
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
...gt;# BB#0: # %entry
>> > movb (%rdi), %al
>> > movzbl 1(%rdi), %ecx
>> > movb %al, z(%rip)
>> > movb %cl, z+1(%rip)
>> > incb %al
>> > shll $8, %ecx
>> > movzbl %al, %eax
>> > orl %ecx, %eax
>> > retq
>> >
>> >
>> >I was hoping it would do something along the lines of
>> >
>> > movb (%rdi), %al
>> &...
2016 May 24
1
Liveness of AL, AH and AX in x86 backend
...or bzip2.c from spec 256.bzip2, with these options:
>
> clang -m32 -S -O2 bzip2.c
>
> .LBB14_4: # %bsW.exit24
> subl %eax, %ebx
> addl $8, %eax
> movl %ebx, %ecx
> movl %eax, bsLive
> shll %cl, %edi
> movl %ebp, %ecx
> orl %esi, %edi
> movzbl %ch, %esi
> cmpl $8, %eax
> movl %edi, bsBuff
> jl .LBB14_6
>
> As you can see, it is using both cl and ch for different values in this basic block. T...
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
...# @foo
>> .cfi_startproc
>> # BB#0: # %entry
>> movb (%rdi), %al
>> movzbl 1(%rdi), %ecx
>> movb %al, z(%rip)
>> movb %cl, z+1(%rip)
>> incb %al
>> shll $8, %ecx
>> movzbl %al, %eax
>> orl %ecx, %eax
>> retq
>>
>>
>> I was hoping it would do something along the lines of
>>
>> movb (%rdi), %al
>> movb 1(%rdi), %ah
>> movw %ax, z(%rip)
>> incb %al
>&...
2012 Jul 31
0
[LLVMdev] [llvm-commits] rotate
On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally
<cameron.mcinally at nyu.edu> wrote:
> Andy,
>
> Here is the left circular shift operator patch. I apologize to the reviewer
> in advance. The patch has a good bit of fine detail. Any
> comments/criticisms?
>
> Some caveats...
>
> 1) This is just the bare minimum needed to make the left circular shift
> operator
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
...oo
> > .cfi_startproc
> ># BB#0: # %entry
> > movb (%rdi), %al
> > movzbl 1(%rdi), %ecx
> > movb %al, z(%rip)
> > movb %cl, z+1(%rip)
> > incb %al
> > shll $8, %ecx
> > movzbl %al, %eax
> > orl %ecx, %eax
> > retq
> >
> >
> >I was hoping it would do something along the lines of
> >
> > movb (%rdi), %al
> > movb 1(%rdi), %ah
> > movw %ax, z(%rip)
> >...
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
...artproc
> ># BB#0: # %entry
> > movb (%rdi), %al
> > movzbl 1(%rdi), %ecx
> > movb %al, z(%rip)
> > movb %cl, z+1(%rip)
> > incb %al
> > shll $8, %ecx
> > movzbl %al, %eax
> > orl %ecx, %eax
> > retq
> >
> >
> >I was hoping it would do something along the lines of
> >
> > movb (%rdi), %al
> > movb 1(%rdi), %ah
>...
2019 Aug 19
0
Slow XCHG in arch/i386/libgcc/__ashrdi3.S and arch/i386/libgcc/__lshrdi3.S
...xchgl %edx,%eax
> ret ret
>
> At least and especially on Intel processors XCHG was and
> still is a rather slow instruction and should be avoided.
> Use the following better code sequences instead:
>
> 1: 1:
> shll %cl,%eax shrl %cl,%edx
> movl %eax,%edx movl %edx,%eax
> xorl %eax,%eax xorl %edx,%edx
> ret ret
>
> regards
> Stefan Kanthak
>
XCHG is slow for register-memory operations due to implicit locking,...
2007 Apr 18
0
[PATCH 9/21] i386 Deprecate obsolete ldt accessors
...e set_base(ldt,base) _set_base( ((char *)&(ldt)) , (base) )
#define set_limit(ldt,limit) _set_limit( ((char *)&(ldt)) , (limit) )
-static inline unsigned long _get_base(char * addr)
-{
- unsigned long __base;
- __asm__("movb %3,%%dh\n\t"
- "movb %2,%%dl\n\t"
- "shll $16,%%edx\n\t"
- "movw %1,%%dx"
- :"=&d" (__base)
- :"m" (*((addr)+2)),
- "m" (*((addr)+4)),
- "m" (*((addr)+7)));
- return __base;
-}
-
-#define get_base(ldt) _get_base( ((char *)&(ldt)) )
-
/*
* Load a segment. Fall back on...
2007 Apr 18
0
[PATCH 9/21] i386 Deprecate obsolete ldt accessors
...e set_base(ldt,base) _set_base( ((char *)&(ldt)) , (base) )
#define set_limit(ldt,limit) _set_limit( ((char *)&(ldt)) , (limit) )
-static inline unsigned long _get_base(char * addr)
-{
- unsigned long __base;
- __asm__("movb %3,%%dh\n\t"
- "movb %2,%%dl\n\t"
- "shll $16,%%edx\n\t"
- "movw %1,%%dx"
- :"=&d" (__base)
- :"m" (*((addr)+2)),
- "m" (*((addr)+4)),
- "m" (*((addr)+7)));
- return __base;
-}
-
-#define get_base(ldt) _get_base( ((char *)&(ldt)) )
-
/*
* Load a segment. Fall back on...
2010 Jul 08
1
[LLVMdev] X86 gcc and clang have incompatible calling conventions for returning some small structs?
Hello,
I think I have come across an inconsistency between gcc and clang/llvm with respect to returning small structs. Given the
following code:
> struct s {
> int a;
> int b;
> };
>
> struct s3 {
> int a;
> int b;
> int c;
> };
>
> struct s new_s(int v){
> struct s res;
> res.a = v;
> res.b = -v;
> return res;
2012 Jul 29
0
[LLVMdev] rotate
...%esi, %eax
movl %eax, %ecx
## kill: CL<def> ECX<kill>
shrl %cl, %edi
movl -4(%rbp), %eax
movabsq $32, %rsi
subq -16(%rbp), %rsi
movl %esi, %edx
movl %edx, %ecx
## kill: CL<def> ECX<kill>
shll %cl, %eax
orl %eax, %edi
movl %edi, %eax
popq %rbp
ret
.cfi_endproc
.subsections_via_symbols
=====
Michael
On Jul 28, 2012, at 9:04 PM, reed kotler <rkotler at mips.com> wrote:
> Nice!
>
> Clever compiler..
>
>
> On 07/28/2012 08:55 PM, Michael Gottesman wrote:...
2012 Apr 16
0
[LLVMdev] InstCombine adds bit masks, confuses self, others
On Tue, Apr 17, 2012 at 12:23 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk>wrote:
> I am not sure how best to fix this. If possible, InstCombine's
> canonicalization shouldn't hide arithmetic progressions behind bit masks.
The entire concept of cleverly converting arithmetic to bit masks seems
like the perfect domain for DAGCombine instead of InstCombine:
1) We know the
2016 May 25
0
Liveness of AL, AH and AX in x86 backend
...> .cfi_startproc
>>> # BB#0: # %entry
>>> movb (%rdi), %al
>>> movzbl 1(%rdi), %ecx
>>> movb %al, z(%rip)
>>> movb %cl, z+1(%rip)
>>> incb %al
>>> shll $8, %ecx
>>> movzbl %al, %eax
>>> orl %ecx, %eax
>>> retq
>>>
>>>
>>> I was hoping it would do something along the lines of
>>>
>>> movb (%rdi), %al
>>> movb 1(%rdi), %ah
>>> movw...