thr3ads.net - llvm dev - [LLVMdev] clang .code16 with -Os producing larger code that it needs to [Feb 2015]


Vladimir 'φ-coder/phcoder' Serbinenko

2015-Feb-20 14:58 UTC

[LLVMdev] clang .code16 with -Os producing larger code that it needs to

When experimenting with compiling GRUB2 with clang using the integrated
assembler, I found that it generates 16-bit code bigger than its gas
counterpart, and the result gets too big for the size constraints of the
bootsector. This was traced mainly to 2 problems.
32-bit access to 16-bit addresses.
source:
	movl	LOCAL(kernel_sector), %ebx
	movl	%ebx, 8(%si)
clang:
    7cbc:	67 66 8b 1d 5c 7c 00 	addr32 mov 0x7c5c,%ebx
    7cc3:	00
    7cc4:	66 89 5c 08          	mov    %ebx,0x8(%si)

gas:
    7cbc:	66 8b 1e 5c 7c       	mov    0x7c5c,%ebx
    7cc1:	66 89 5c 08          	mov    %ebx,0x8(%si)
32-bit jump.
source:
	jnb	LOCAL(floppy_probe)
clang:
+    7cb5:	66 0f 83 07 01 00 00 	jae    7dc3 <L_floppy_probe>
gas:
-    7cb5:	0f 83 0a 01          	jae    7dc3 <L_floppy_probe>
The last one is particularly problematic, as it never makes sense to
issue a 32-bit jump if %ip is only 16 bits, and it eats 3 extra bytes per
jump. Is it possible to force clang to generate 16-bit jumps?
On the bright side, if I remove the error strings the code is functional.
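
As a sanity check on the two jump listings above, the following Python sketch recomputes the branch target from each encoding (target = address of the next instruction + signed displacement). The helper name `jcc_target` is made up for illustration; the bytes and the start address are copied from the disassembly in this message.

```python
import struct

def jcc_target(addr, code):
    """Return (length, target) for a near Jcc (0F 8x) encoding.

    Assumes 16-bit mode, where a 66 prefix selects a 32-bit displacement.
    """
    raw = bytes.fromhex(code)
    wide = raw[0] == 0x66              # operand-size prefix present
    body = raw[1:] if wide else raw
    assert body[0] == 0x0F and 0x80 <= body[1] <= 0x8F, "not a near Jcc"
    fmt, size = ("<i", 4) if wide else ("<h", 2)
    (disp,) = struct.unpack(fmt, body[2:2 + size])
    length = len(raw)
    target = addr + length + disp
    if not wide:
        target &= 0xFFFF               # IP wraps at 16 bits without the prefix
    return length, target

clang_len, clang_target = jcc_target(0x7CB5, "66 0f 83 07 01 00 00")
gas_len, gas_target = jcc_target(0x7CB5, "0f 83 0a 01")
print(hex(clang_target), hex(gas_target), clang_len - gas_len)
```

Both encodings resolve to 0x7dc3; the clang form is 3 bytes longer (one prefix byte plus two extra displacement bytes).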


Vladimir 'φ-coder/phcoder' Serbinenko

2015-Feb-20 15:26 UTC

On 20.02.2015 15:58, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> When experimenting with compiling GRUB2 with clang using integrated as,
> I found out that it generates a 16-bit code bigger than gas counterpart
> and result gets too big for size constraints of bootsector. This was
> traced mainly to 2 problems.
> 32-bit access to 16-bit addresses.
> source:
> 	movl	LOCAL(kernel_sector), %ebx
> 	movl	%ebx, 8(%si)
> clang:
>     7cbc:	67 66 8b 1d 5c 7c 00 	addr32 mov 0x7c5c,%ebx
>     7cc3:	00
>     7cc4:	66 89 5c 08          	mov    %ebx,0x8(%si)
> 
> gas:
>     7cbc:	66 8b 1e 5c 7c       	mov    0x7c5c,%ebx
>     7cc1:	66 89 5c 08          	mov    %ebx,0x8(%si)
> 32-bit jump.
> source:
> 	jnb	LOCAL(floppy_probe)
> clang:
> +    7cb5:	66 0f 83 07 01 00 00 	jae    7dc3 <L_floppy_probe>
> gas:
> -    7cb5:	0f 83 0a 01          	jae    7dc3 <L_floppy_probe>

Minimal example would be:
	.code16
	jmp 1f
	.space 256
1:	nop
clang:
   0:	66 e9 00 01 00 00    	jmpl   0x106
	...
 106:	90                   	nop
gcc:
   0:	e9 00 01             	jmp    0x103
	...
 103:	90                   	nop
> The last one is particularly problematic as it never makes sense to
> issue 32-bit jump if %ip is only 16 bits and it eats 3 extra bytes per
> jump. Is it possible to force clang to generate 16-bit jumps?
> On bright side if I remove error strings the code is functional.
> 
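
The byte counts in the minimal example can be reproduced by hand. The Python sketch below builds both encodings of `jmp 1f` across the 256-byte `.space`; `encode_jmp` is a hypothetical helper written for this illustration, not part of any assembler.

```python
import struct

GAP = 256  # bytes reserved by ".space 256" between the jmp and the label

def encode_jmp(disp, wide):
    """Encode a near jmp (opcode E9) as seen from 16-bit mode.

    wide=False -> E9 rel16 (3 bytes); wide=True -> 66 E9 rel32 (6 bytes).
    The displacement is relative to the end of the instruction.
    """
    if wide:
        return b"\x66\xe9" + struct.pack("<i", disp)
    return b"\xe9" + struct.pack("<h", disp)

def hexbytes(data):
    """Format bytes the way the disassembly listings do."""
    return " ".join(f"{b:02x}" for b in data)

for name, wide in (("clang", True), ("gcc", False)):
    insn = encode_jmp(GAP, wide)
    label = len(insn) + GAP  # the jmp sits at offset 0, the label follows the gap
    print(f"{name}: {hexbytes(insn)} -> label at {label:#x}")
```

The displacement is 0x100 in both cases; only the instruction length (6 versus 3 bytes) moves the label from 0x103 to 0x106.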


David Woodhouse

2015-Feb-20 15:38 UTC

On Fri, 2015-02-20 at 15:58 +0100, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> When experimenting with compiling GRUB2 with clang using integrated as,
> I found out that it generates a 16-bit code bigger than gas counterpart
> and result gets too big for size constraints of bootsector. This was
> traced mainly to 2 problems.
...
> 32-bit access to 16-bit addresses.
> clang:
>     7cbc:	67 66 8b 1d 5c 7c 00 00	addr32 mov 0x7c5c,%ebx
> gas:
>     7cbc:	66 8b 1e 5c 7c       	mov    0x7c5c,%ebx
> 32-bit jump.
> clang:
> +    7cb5:	66 0f 83 07 01 00 00 	jae    7dc3 <L_floppy_probe>
> gas:
> -    7cb5:	0f 83 0a 01          	jae    7dc3 <L_floppy_probe>

To a large extent, those are the *same* problem. We don't know that it's
eventually going to fit into a 16-bit offset, so we emit it with a fixup
record which can cope with 32 bits.

Arguably, the jump is *particularly* gratuitous in many cases... but in
'big real' mode is the IP *really* limited to 16 bits?

We could make it default to 16-bit, as gas does. But then we'd be
screwed in the cases where we really *do* need 32-bit.

What we actually need to do is implement handling for the explicit
addr32 prefix. Then we can do what gas does and default to 16-bit but
*also* have a way to do 32-bit when it's needed.
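
To see where the three extra bytes in the `mov` case come from, it helps to decode the ModRM byte of each encoding. The Python sketch below is a simplified illustration that only covers the mod=00 "bare absolute address" forms; `split_modrm` and `absolute_disp_size` are made-up names.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

def absolute_disp_size(modrm, addr32):
    """Displacement size in bytes for a bare absolute address (mod=00).

    16-bit addressing uses rm=110 with a disp16; 32-bit addressing
    (selected by the 67 prefix in 16-bit mode) uses rm=101 with a disp32.
    Returns None when the byte is not one of those two forms.
    """
    mod, _reg, rm = split_modrm(modrm)
    if mod != 0:
        return None
    if addr32:
        return 4 if rm == 5 else None
    return 2 if rm == 6 else None

clang = bytes.fromhex("67 66 8b 1d 5c 7c 00 00")
gas = bytes.fromhex("66 8b 1e 5c 7c")

# clang: 67 (addr32) 66 (opsize) 8B /r, so ModRM is at index 3
# gas:   66 (opsize) 8B /r, so ModRM is at index 2
print(split_modrm(clang[3]), absolute_disp_size(clang[3], addr32=True))
print(split_modrm(gas[2]), absolute_disp_size(gas[2], addr32=False))
print(len(clang) - len(gas))
```

The difference is one byte for the 67 prefix plus two bytes for the wider displacement, matching the 8-byte versus 5-byte listings quoted above.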

-- 
dwmw2

Vladimir 'φ-coder/phcoder' Serbinenko

2015-Feb-20 15:46 UTC

On 20.02.2015 16:38, David Woodhouse wrote:
> On Fri, 2015-02-20 at 15:58 +0100, Vladimir 'φ-coder/phcoder' Serbinenko
> wrote:
>> When experimenting with compiling GRUB2 with clang using integrated as,
>> I found out that it generates a 16-bit code bigger than gas counterpart
>> and result gets too big for size constraints of bootsector. This was
>> traced mainly to 2 problems.
> 
> ...
> 
>> 32-bit access to 16-bit addresses.
>> clang:
>>     7cbc:	67 66 8b 1d 5c 7c 00 00	addr32 mov 0x7c5c,%ebx
>> gas:
>>     7cbc:	66 8b 1e 5c 7c       	mov    0x7c5c,%ebx
> 
>> 32-bit jump.
>> clang:
>> +    7cb5:	66 0f 83 07 01 00 00 	jae    7dc3 <L_floppy_probe>
>> gas:
>> -    7cb5:	0f 83 0a 01          	jae    7dc3 <L_floppy_probe>
> 
> To a large extent, those are the *same* problem. We don't know that it's
> eventually going to fit into a 16-bit offset, so we emit it with a fixup
> record which can cope with 32 bits.
>

All labels are local to the source file. If I use %eax instead of %ebx
in the first example, I get the short code. For the second example, how does
clang detect that the offset fits into one byte when issuing the EB XX
sequence, which appears in the resulting file in several places? Can we use
the same mechanism to detect when a 16-bit reference suffices, and keep the
32-bit one for external references?

> Arguably, the jump is *particularly* gratuitous in many cases... but in
> 'big real' mode is the IP *really* limited to 16 bits?
> 
> We could make it default to 16-bit, as gas does. But then we'd be
> screwed in the cases where we really *do* need 32-bit.
> 
> What we actually need to do is implement handling for the explicit
> addr32 prefix. Then we can do what gas does and default to 16-bit but
> *also* have a way to do 32-bit when it's needed.
> 
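
The idea raised in this reply (pick the narrowest jump for labels local to the file, keep the wide form for external references) can be sketched as a small fixed-point loop. This is a toy model in Python, not how LLVM's assembler is actually implemented; the item format, `jump_size`, and `relax` are all assumptions made for the illustration, and only the unconditional `jmp` forms are modelled.

```python
def jump_size(disp, external):
    """Size in bytes of the smallest near-jmp form that reaches disp.

    Forms as seen from 16-bit mode: EB rel8 (2), E9 rel16 (3),
    66 E9 rel32 (6). External targets keep the widest form because
    their final distance is unknown at assembly time.
    """
    if external:
        return 6
    if -128 <= disp <= 127:
        return 2
    if -32768 <= disp <= 32767:
        return 3
    return 6

def relax(items):
    """items: list of ("pad", n), ("label", name) or ("jmp", target).

    Start every jump at the short size and grow until nothing changes.
    Returns the final size of each jump, in source order.
    """
    sizes = {i: 2 for i, it in enumerate(items) if it[0] == "jmp"}
    while True:
        # Lay out the items with the current size guesses.
        offset, labels, starts = 0, {}, {}
        for i, (kind, arg) in enumerate(items):
            if kind == "label":
                labels[arg] = offset
            elif kind == "pad":
                offset += arg
            else:
                starts[i] = offset
                offset += sizes[i]
        # Re-pick each jump's size from the displacement it now needs.
        changed = False
        for i, start in starts.items():
            target = items[i][1]
            external = target not in labels
            disp = 0 if external else labels[target] - (start + sizes[i])
            new = max(sizes[i], jump_size(disp, external))  # sizes only grow
            if new != sizes[i]:
                sizes[i], changed = new, True
        if not changed:
            return [sizes[i] for i in sorted(sizes)]

program = [
    ("jmp", "near"), ("pad", 10), ("label", "near"),
    ("jmp", "far"), ("pad", 256), ("label", "far"),
    ("jmp", "elsewhere"),  # hypothetical symbol not defined in this file
]
print(relax(program))
```

Because sizes never shrink between passes, the loop terminates; in this example the three jumps end up 2, 3 and 6 bytes long.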

