Stefan Kanthak via llvm-dev
2018-Nov-26 17:59 UTC
[llvm-dev] BUGS in code generated for target i386-win32
"Tim Northover" <t.p.northover at gmail.com> wrote:> Hi Stefan, > > On Mon, 26 Nov 2018 at 12:37, Stefan Kanthak via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> LLVM/clang generates wrong code for the following program >> (see <https://godbolt.org/z/UZrrkG>): > > It looks like all of these issues come down to mismatched expectations > on what goes into and out of an __asm block. If you look at the MSVC > documentation (which I think Clang is trying to be compatible with), > it warns against assuming known values are in particular registers at > any point, and even using __fastcall at all for a function with an > __asm block for that reason: > https://docs.microsoft.com/en-gb/cpp/assembler/inline/using-and-preserving-registers-in-inline-assembly?view=vs-2017Trust me: I KNOW THIS DOCUMENTATION!> I'll try to explain a little below how that one mismatch causes the > issues you're seeing. > >> BUG #1: the compiler fails to allocate (EAX for) the variable "lfsr"! >> BUG #2: the variable "lfsr" is NOT initialized! > > Since the __asm isn't linked (as far as Clang is concerned) to > either input for lfsr32, they're both unused.REALLY? Or better: OUCH! Is Clang NOT aware of the __fastcall calling convention and its register usage? Clang does NOT create prolog/epilog here, so it CLEARLY knows that "argument" is held in ECX and "polynomial" in EDX ... as documented for __fastcall JFTR: Clang (like MSVC too) knows VERY well which registers are used (clobbered) in an __asm block. See <https://godbolt.org/z/blDIzK>, which proves your assumption WRONG!> The compiler has actually inlined that function as you said, then > noticed this lack of use and decided to completely eliminate the > lfsr variable.1. see above! 2. add "__declspec(noinline)" to the function definition ... and notice that the compiler STILL does NOT setup ANY of EAX, ECX and EDX before the function call! See <https://godbolt.org/z/X9R9w7>, which again proves your assumption WRONG.> You can see this in LLVM IR if you add "-emit-llvm -g0" to the godbolt > command-line. The loop *only* contains a counter and the lonely asm > block > >> BUG #3: the compiler allocates EDX for the variable "period"! >> ~~~~~~~ >> EDX is a volatile register, it is not preserved through function calls; > > There is no function call after inlining, of course, so it doesn't > need to be preserved in this case. It looks like there might be an > issue with an __asm block that actually *makes* a call though: > > __asm { > call Wibble > } > > doesn't seem to mark registers used by the call as clobbered from my > testing. It's slightly unclear what MSVC's behaviour is in this > situation from the documentation, but some vague poking suggests it > does know that a call instruction might clobber edx.The x86 calling conventions used on Windows (and some more OS) are VERY WELL KNOWN: EAX, ECX and EDX are NOT preserved through function calls. [EDX:]EAX is used for the [long] long return value, and ECX can be clobbered. Again the addition of "__declspec(noinline)" proves your assumptions wrong: the function call is NOT inlined any more, but its arguments are STILL not set up at all ... and the variable lfsr STILL not initialized. If the function is declared only, but not defined, then the compiler sets the arguments up before the call: see <https://godbolt.org/z/4pmOlC>>> additionally it holds the second argument with the __fastcall calling >> convention, so the constant 0x04C11DB7 MUST be loaded into EDX before >> label LBB1_1: > > This is the same unused-as-far-as-the-compiler-knows thing as the first two.OUCH! <https://godbolt.org/z/X9R9w7> proves this WRONG. LLVM is apparently REALLY confused by __asm, at least when used inside a __fastcall function. My expectation is that AT LEAST a warning message should be printed by the compiler. regards Stefan Kanthak PS: I can help myself when a compiler emits wrong code. Other people might just rely on the compiler...
Tim Northover via llvm-dev
2018-Nov-26 18:24 UTC
[llvm-dev] BUGS in code generated for target i386-win32
Hi Stefan, On Mon, 26 Nov 2018 at 18:01, Stefan Kanthak <stefan.kanthak at nexgo.de> wrote:> > I'll try to explain a little below how that one mismatch causes the > > issues you're seeing. > > > >> BUG #1: the compiler fails to allocate (EAX for) the variable "lfsr"! > >> BUG #2: the variable "lfsr" is NOT initialized! > > > > Since the __asm isn't linked (as far as Clang is concerned) to > > either input for lfsr32, they're both unused. > > REALLY? Or better: OUCH! > Is Clang NOT aware of the __fastcall calling convention and its > register usage?It is, but it intentionally does not tie that knowledge together with any __asm blocks, even if they are the only statement in a function body.> Clang does NOT create prolog/epilog here, so it CLEARLY knows that > "argument" is held in ECX and "polynomial" in EDX ... as documented > for __fastcallThat doesn't follow from the lack of a prologue, though it happens to be true. There is still no link with the register usage in the __asm however.> JFTR: Clang (like MSVC too) knows VERY well which registers are used > (clobbered) in an __asm block. See <https://godbolt.org/z/blDIzK>, > which proves your assumption WRONG!I don't believe so. Clobbering a register is very different from guessing some input value to put into that register prior to the block. For MSVC you're apparently supposed to write inline asm that materializes values into registers from elsewhere (variables of one kind or another) before using them. I could see the behaviour you're expecting being a reasonable default for a __declspec(naked) function restricted to a single __asm block, but it interacts very poorly with C language semantics for functions that can contain both C and assembly.> 2. add "__declspec(noinline)" to the function definition ... and notice > that the compiler STILL does NOT setup ANY of EAX, ECX and EDX before > the function call! > See <https://godbolt.org/z/X9R9w7>, which again proves your assumption > WRONG.The noinline prevents actual inlining, but the same inferences can be (and are) made through inter-procedural analysis. Add "-emit-llvm -g0" and you'll see your function being called with "undef" arguments and the complete elimination of lfsr.> If the function is declared only, but not defined, then the compiler > sets the arguments up before the call: see <https://godbolt.org/z/4pmOlC>Indeed.> >> additionally it holds the second argument with the __fastcall calling > >> convention, so the constant 0x04C11DB7 MUST be loaded into EDX before > >> label LBB1_1: > > > > This is the same unused-as-far-as-the-compiler-knows thing as the first two. > > OUCH! > <https://godbolt.org/z/X9R9w7> proves this WRONG.I'm afraid I don't see how. It's an unused variable (as far as Clang is concerned) so it doesn't need to be materialized anywhere.> My expectation is that AT LEAST a warning message should be printed by > the compiler.For using a register before defining it, perhaps? That's within the implementable realm I think (at least to cover some cases), though whether anyone is interested enough to actually do it is another question. Cheers. Tim.
Reid Kleckner via llvm-dev
2018-Nov-26 20:23 UTC
[llvm-dev] BUGS in code generated for target i386-win32
It's worth pointing out that we *do* add EAX as an implicit output of every asm block, since some MSVC inline asm does stuff like this in SDK headers and we have to support it: int foo(int x) { __asm mov eax, x } There's a bug for this somewhere in bugzilla with more details, but I'm late to work and wasn't able to find it in my first search. We could similarly detect __asm blocks at the start of a function, and add implicit inputs for all parameters, and that would get the desired behavior, with inlining to boot, which MSVC doesn't get. It's worth filing a bug about it in any case. As Tim says, if you just want clang to leave this code alone, add an explicit return and put __declspec(naked) on the function. That'll turn off all the parameter IPO as well. Adding warnings for this is out of scope. If we took the time to analyze the user's usage of registers and correlate it with the target calling convention, we might as well take the time to implement this wacky behavior. On Mon, Nov 26, 2018 at 10:24 AM Tim Northover via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi Stefan, > > On Mon, 26 Nov 2018 at 18:01, Stefan Kanthak <stefan.kanthak at nexgo.de> > wrote: > > > I'll try to explain a little below how that one mismatch causes the > > > issues you're seeing. > > > > > >> BUG #1: the compiler fails to allocate (EAX for) the variable "lfsr"! > > >> BUG #2: the variable "lfsr" is NOT initialized! > > > > > > Since the __asm isn't linked (as far as Clang is concerned) to > > > either input for lfsr32, they're both unused. > > > > REALLY? Or better: OUCH! > > Is Clang NOT aware of the __fastcall calling convention and its > > register usage? > > It is, but it intentionally does not tie that knowledge together with > any __asm blocks, even if they are the only statement in a function > body. > > > Clang does NOT create prolog/epilog here, so it CLEARLY knows that > > "argument" is held in ECX and "polynomial" in EDX ... as documented > > for __fastcall > > That doesn't follow from the lack of a prologue, though it happens to > be true. There is still no link with the register usage in the __asm > however. > > > JFTR: Clang (like MSVC too) knows VERY well which registers are used > > (clobbered) in an __asm block. See <https://godbolt.org/z/blDIzK>, > > which proves your assumption WRONG! > > I don't believe so. Clobbering a register is very different from > guessing some input value to put into that register prior to the > block. For MSVC you're apparently supposed to write inline asm that > materializes values into registers from elsewhere (variables of one > kind or another) before using them. > > I could see the behaviour you're expecting being a reasonable default > for a __declspec(naked) function restricted to a single __asm block, > but it interacts very poorly with C language semantics for functions > that can contain both C and assembly. > > > 2. add "__declspec(noinline)" to the function definition ... and notice > > that the compiler STILL does NOT setup ANY of EAX, ECX and EDX before > > the function call! > > See <https://godbolt.org/z/X9R9w7>, which again proves your > assumption > > WRONG. > > The noinline prevents actual inlining, but the same inferences can be > (and are) made through inter-procedural analysis. Add "-emit-llvm -g0" > and you'll see your function being called with "undef" arguments and > the complete elimination of lfsr. > > > If the function is declared only, but not defined, then the compiler > > sets the arguments up before the call: see <https://godbolt.org/z/4pmOlC > > > > Indeed. > > > >> additionally it holds the second argument with the __fastcall calling > > >> convention, so the constant 0x04C11DB7 MUST be loaded into EDX before > > >> label LBB1_1: > > > > > > This is the same unused-as-far-as-the-compiler-knows thing as the > first two. > > > > OUCH! > > <https://godbolt.org/z/X9R9w7> proves this WRONG. > > I'm afraid I don't see how. It's an unused variable (as far as Clang > is concerned) so it doesn't need to be materialized anywhere. > > > My expectation is that AT LEAST a warning message should be printed by > > the compiler. > > For using a register before defining it, perhaps? That's within the > implementable realm I think (at least to cover some cases), though > whether anyone is interested enough to actually do it is another > question. > > Cheers. > > Tim. > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20181126/4065aca5/attachment.html>