Vladimir 'φ-coder/phcoder' Serbinenko
2015-Feb-20 15:46 UTC
[LLVMdev] clang .code16 with -Os producing larger code that it needs to
On 20.02.2015 16:38, David Woodhouse wrote:
> On Fri, 2015-02-20 at 15:58 +0100, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> When experimenting with compiling GRUB2 with clang using integrated as,
>> I found out that it generates a 16-bit code bigger than gas counterpart
>> and result gets too big for size constraints of bootsector. This was
>> traced mainly to 2 problems.
>
> ...
>
>> 32-bit access to 16-bit addresses.
>> clang:
>> 7cbc: 67 66 8b 1d 5c 7c 00 00 addr32 mov 0x7c5c,%ebx
>> gas:
>> 7cbc: 66 8b 1e 5c 7c mov 0x7c5c,%ebx
>
>> 32-bit jump.
>> clang:
>> + 7cb5: 66 0f 83 07 01 00 00 jae 7dc3 <L_floppy_probe>
>> gas:
>> - 7cb5: 0f 83 0a 01 jae 7dc3 <L_floppy_probe>
>
> To a large extent, those are the *same* problem. We don't know that it's
> eventually going to fit into a 16-bit offset, so we emit it with a fixup
> record which can cope with 32 bits.

All labels are local to the source file. If I use %eax instead of %ebx in the first example, I get the short code.

For the second example, how does clang detect that the offset fits into one byte, so that it can issue the EB XX sequence that appears in several places in the resulting file? Can we use the same mechanism to detect when a 16-bit reference is sufficient, and keep the 32-bit one for external references?

> Arguably, the jump is *particularly* gratuitous in many cases... but in
> 'big real' mode is the IP *really* limited to 16 bits?
>
> We could make it default to 16-bit, as gas does. But then we'd be
> screwed in the cases where we really *do* need 32-bit.
>
> What we actually need to do is implement handling for the explicit
> addr32 prefix. Then we can do what gas does and default to 16-bit but
> *also* have a way to do 32-bit when it's needed.
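The byte counts in the two listings quoted above can be reproduced by hand. The following is a minimal Python sketch, not taken from clang or gas, that builds the four encodings for a .code16 section. The function names encode_mov_abs_to_ebx and encode_jae are hypothetical and exist only for this illustration; the prefix, opcode and ModRM values are the ones visible in the quoted disassembly.

```python
OPSIZE = 0x66    # operand-size override: 32-bit operand in 16-bit code
ADDRSIZE = 0x67  # address-size override: 32-bit addressing in 16-bit code


def encode_mov_abs_to_ebx(addr: int, addr32: bool) -> bytes:
    """Encode `mov addr, %ebx` (opcode 8b /r) for a .code16 section."""
    reg_ebx = 0b011
    if addr32:
        # mod=00, rm=101 selects a bare disp32 under 32-bit addressing.
        modrm = (0b00 << 6) | (reg_ebx << 3) | 0b101
        return bytes([ADDRSIZE, OPSIZE, 0x8B, modrm]) + addr.to_bytes(4, "little")
    # mod=00, rm=110 selects a bare disp16 under 16-bit addressing.
    modrm = (0b00 << 6) | (reg_ebx << 3) | 0b110
    return bytes([OPSIZE, 0x8B, modrm]) + addr.to_bytes(2, "little")


def encode_jae(at: int, target: int, rel32: bool) -> bytes:
    """Encode `jae target` (opcode 0f 83) placed at address `at`."""
    if rel32:
        length = 7  # 66 0f 83 + 4-byte displacement
        rel = target - (at + length)  # relative to the next instruction
        return bytes([OPSIZE, 0x0F, 0x83]) + rel.to_bytes(4, "little", signed=True)
    length = 4  # 0f 83 + 2-byte displacement
    rel = target - (at + length)
    return bytes([0x0F, 0x83]) + rel.to_bytes(2, "little", signed=True)


if __name__ == "__main__":
    for code in (
        encode_mov_abs_to_ebx(0x7C5C, addr32=True),
        encode_mov_abs_to_ebx(0x7C5C, addr32=False),
        encode_jae(0x7CB5, 0x7DC3, rel32=True),
        encode_jae(0x7CB5, 0x7DC3, rel32=False),
    ):
        print(len(code), code.hex(" "))
```

In this sketch the 32-bit forms cost three extra bytes each: one prefix byte plus two extra displacement bytes.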
David Woodhouse
2015-Feb-20 16:05 UTC
[LLVMdev] clang .code16 with -Os producing larger code that it needs to
On Fri, 2015-02-20 at 16:46 +0100, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>
> All labels are local to the source file. If I use %eax instead of %ebx
> in first example I get the short code. For the second example how does
> clang detect that offset fits into one byte for issuing EB XX sequence
> which is issued in resulting file in several places. Can we use the
> same mechanism to detect when issuing 16-bit reference and keep 32-bit
> one for external references?

It's been a while since I looked at this... but I think for the short jumps we just emit the 8-bit version and there's a fixup which can go back and re-emit the instruction in 32-bit mode if it finds it doesn't fit?

Do we just need to support a similar fixup for promoting 16-bit to 32-bit relocations?

-- 
dwmw2
David Woodhouse
2015-Feb-20 16:18 UTC
[LLVMdev] clang .code16 with -Os producing larger code that it needs to
On Fri, 2015-02-20 at 16:05 +0000, David Woodhouse wrote:
>
> It's been a while since I looked at this... but I think for the short
> jumps we just emit the 8-bit version and there's a fixup which can go
> back and re-emit the instruction in 32-bit mode if it finds it doesn't
> fit?
>
> Do we just need to support a similar fixup for promoting 16-bit to
> 32-bit relocations?

OK, the term I was looking for was 'relaxation'. Look in lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp for X86AsmBackend::relaxInstruction() and related methods.

Observe that it will cope with 'relaxing' 8-bit PC-relative relocations to 32-bit PC-relative, but it doesn't cope with anything else. Your task, should you choose to accept it, is to make it cope with other forms of relaxation where necessary.

Note that the existing cases end up emitting a new instruction with a *new* opcode. In your case it won't be doing that. It's the *same* opcode, but you'll have to set a flag to tell the emitter to use the 32-bit addressing mode (for data and/or addr as appropriate) this time.

And while you're doing that, you should note that that's the *same* flag that'll be needed to support explicit addr32/data32 prefixes in the asm source. So you might as well support those too. I might suggest doing them *first*, in fact.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse at intel.com                              Intel Corporation
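As a rough illustration of the relaxation idea described above, here is a toy Python model. It is not LLVM's implementation and does not use the X86AsmBackend API; Item, layout, fits and relax are hypothetical names invented for this sketch. It assumes every jump starts in its smallest form and is widened (8 to 16 to 32 bits) only when its displacement does not fit, repeating until the layout stops changing. The fit test is a plain signed-range check, which ignores the 16-bit IP wraparound question raised earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional

# Size in bytes of a conditional jump in 16-bit code, by displacement width:
# 8-bit (7x cb), 16-bit (0f 8x cw), 32-bit (66 0f 8x cd).
JCC_SIZE = {8: 2, 16: 4, 32: 7}
# 8 -> 16 changes the opcode; 16 -> 32 keeps it and only adds the 66 prefix.
WIDER = {8: 16, 16: 32}


@dataclass
class Item:
    label: Optional[str] = None   # label defined at the start of this item
    target: Optional[str] = None  # jump target; None means plain filler bytes
    size: int = 0                 # filler size in bytes (ignored for jumps)
    width: int = 8                # current displacement width for jumps


def item_size(it: Item) -> int:
    return JCC_SIZE[it.width] if it.target is not None else it.size


def layout(items):
    """Assign addresses for the current widths; return (addresses, labels)."""
    addr, addrs, labels = 0, [], {}
    for it in items:
        if it.label is not None:
            labels[it.label] = addr
        addrs.append(addr)
        addr += item_size(it)
    return addrs, labels


def fits(value: int, bits: int) -> bool:
    """True if value is representable as a signed integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def relax(items):
    """Widen jumps until every displacement fits. Widths only grow."""
    changed = True
    while changed:
        changed = False
        addrs, labels = layout(items)
        for it, addr in zip(items, addrs):
            if it.target is None:
                continue
            # Displacement is relative to the end of the jump instruction.
            disp = labels[it.target] - (addr + item_size(it))
            if not fits(disp, it.width):
                it.width = WIDER[it.width]
                changed = True
    return items


if __name__ == "__main__":
    prog = relax([Item(target="far"), Item(size=200), Item(label="far", size=1)])
    print(prog[0].width)  # the jump over 200 bytes ends up 16 bits wide
```

The middle rung of the ladder is the step the thread asks for: a local target that is out of 8-bit range but within 16-bit range stays at 4 bytes instead of jumping straight to the 7-byte form.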