Török Edwin
2009-Jan-24 16:23 UTC
[LLVMdev] inline asm semantics: output constraint width smaller than input
On 2009-01-23 20:27, Török Edwin wrote:>>> >>> >> i'd not mind it at all if the kernel could be built with other open-source >> compilers too. >> >> Now in this case the patch you suggest might end up hurting the end result >> so it's not an unconditional 'yes'. But ... how much it actually matters >> depends on the circumstances. >> >> So could you please send a sample patch for some of most common inline >> assembly statements that are affected by this, so that we can see: >> >> 1) how ugly the LLVM workarounds are >> >> > > Ok, I will prepare a patch for both cases. > > >> 2) how they affect the generated kernel image in practice >> >> My gut feeling is that it's going to be acceptable with a bit of thinking >> (we might even do some wrappers to do this cleanly) - but i'd really like >> to see it before giving you that judgement. >>The below patch is to build the kernel for x86_64, with the attached .config, using llvm-gcc (trunk, with patch from http://llvm.org/bugs/show_bug.cgi?id=2989#c2). The .config has KVM turned off, because I didn't know how to change x86_emulate.c so that LLVM builds it (http://llvm.org/bugs/show_bug.cgi?id=3373#c10) For 32-bit some more changes are required. The resulting kernel image are of the same size $ ls -l vmlinux.patched -rwxr-xr-x 1 edwin edwin 11277206 2009-01-24 17:58 vmlinux.patched $ ls -l vmlinux -rwxr-xr-x 1 edwin edwin 11277206 2009-01-24 18:01 vmlinux They aren't identical though, a disassembly shows that the address of most of functions changed, also some register assignments changed (r14 instead of r15, and so on). Are these changes correct, and are they acceptable? Best regards, --Edwin --- arch/x86/include/asm/uaccess.h | 10 ++++++---- arch/x86/lib/delay.c | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 69d2757..28280de 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -154,7 +154,7 @@ extern int __get_user_bad(void); #define get_user(x, ptr) \ ({ \ - int __ret_gu; \ + unsigned long __ret_gu; \ unsigned long __val_gu; \ __chk_user_ptr(ptr); \ might_fault(); \ @@ -176,7 +176,7 @@ extern int __get_user_bad(void); break; \ } \ (x) = (__typeof__(*(ptr)))__val_gu; \ - __ret_gu; \ + (int)__ret_gu; \ }) #define __put_user_x(size, x, ptr, __ret_pu) \ @@ -239,11 +239,13 @@ extern void __put_user_8(void); */ #define put_user(x, ptr) \ ({ \ - int __ret_pu; \ + __typeof__(*(ptr)) __ret_pu; \ __typeof__(*(ptr)) __pu_val; \ __chk_user_ptr(ptr); \ might_fault(); \ __pu_val = x; \ + /* return value is 0 or -EFAULT, both fit in 1 byte, and \ + * are sign-extendable to int */ \ switch (sizeof(*(ptr))) { \ case 1: \ __put_user_x(1, __pu_val, ptr, __ret_pu); \ @@ -261,7 +263,7 @@ extern void __put_user_8(void); __put_user_x(X, __pu_val, ptr, __ret_pu); \ break; \ } \ - __ret_pu; \ + (int)__ret_pu; \ }) #define __put_user_size(x, ptr, size, retval, errret) \ diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c index f456860..12d27f8 100644 --- a/arch/x86/lib/delay.c +++ b/arch/x86/lib/delay.c @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__delay); inline void __const_udelay(unsigned long xloops) { - int d0; + unsigned long d0; xloops *= 4; asm("mull %%edx" -- 1.5.6.5 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: .config-llvm-64 URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20090124/e5b987c5/attachment.ksh>
Ingo Molnar
2009-Jan-24 17:27 UTC
[LLVMdev] inline asm semantics: output constraint width smaller than input
* Török Edwin <edwintorok at gmail.com> wrote:> On 2009-01-23 20:27, Török Edwin wrote: > >>> > >>> > >> i'd not mind it at all if the kernel could be built with other open-source > >> compilers too. > >> > >> Now in this case the patch you suggest might end up hurting the end result > >> so it's not an unconditional 'yes'. But ... how much it actually matters > >> depends on the circumstances. > >> > >> So could you please send a sample patch for some of most common inline > >> assembly statements that are affected by this, so that we can see: > >> > >> 1) how ugly the LLVM workarounds are > >> > >> > > > > Ok, I will prepare a patch for both cases. > > > > > >> 2) how they affect the generated kernel image in practice > >> > >> My gut feeling is that it's going to be acceptable with a bit of thinking > >> (we might even do some wrappers to do this cleanly) - but i'd really like > >> to see it before giving you that judgement. > >> > > The below patch is to build the kernel for x86_64, with the attached > .config, > using llvm-gcc (trunk, with patch from > http://llvm.org/bugs/show_bug.cgi?id=2989#c2). > > The .config has KVM turned off, because I didn't know how to change > x86_emulate.c so that LLVM builds it > (http://llvm.org/bugs/show_bug.cgi?id=3373#c10) > For 32-bit some more changes are required. > > The resulting kernel image are of the same size > $ ls -l vmlinux.patched > -rwxr-xr-x 1 edwin edwin 11277206 2009-01-24 17:58 vmlinux.patched > $ ls -l vmlinux > -rwxr-xr-x 1 edwin edwin 11277206 2009-01-24 18:01 vmlinux > > They aren't identical though, a disassembly shows that the address of > most of functions changed, > also some register assignments changed (r14 instead of r15, and so on). > > Are these changes correct, and are they acceptable? > > Best regards, > --Edwin > > --- > arch/x86/include/asm/uaccess.h | 10 ++++++---- > arch/x86/lib/delay.c | 2 +- > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h > index 69d2757..28280de 100644 > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -154,7 +154,7 @@ extern int __get_user_bad(void); > > #define get_user(x, ptr) \ > ({ \ > - int __ret_gu; \ > + unsigned long __ret_gu; \ > unsigned long __val_gu; \ > __chk_user_ptr(ptr); \ > might_fault(); \ > @@ -176,7 +176,7 @@ extern int __get_user_bad(void); > break; \ > } \ > (x) = (__typeof__(*(ptr)))__val_gu; \ > - __ret_gu; \ > + (int)__ret_gu; \ > }) > > #define __put_user_x(size, x, ptr, __ret_pu) \ > @@ -239,11 +239,13 @@ extern void __put_user_8(void); > */ > #define put_user(x, ptr) \ > ({ \ > - int __ret_pu; \ > + __typeof__(*(ptr)) __ret_pu; \This does not look right. We can sometimes have put_user() of non-integer types (say structures). How does the (int)__ret_pu cast work in that case? We'll fall into this branch in that case: default: \ __put_user_x(X, __pu_val, ptr, __ret_pu); \ break; \ and __ret_pu has a nonsensical type in that case.> __typeof__(*(ptr)) __pu_val; \ > __chk_user_ptr(ptr); \ > might_fault(); \ > __pu_val = x; \ > + /* return value is 0 or -EFAULT, both fit in 1 byte, and \ > + * are sign-extendable to int */ \ > switch (sizeof(*(ptr))) { \ > case 1: \ > __put_user_x(1, __pu_val, ptr, __ret_pu); \ > @@ -261,7 +263,7 @@ extern void __put_user_8(void); > __put_user_x(X, __pu_val, ptr, __ret_pu); \ > break; \ > } \ > - __ret_pu; \ > + (int)__ret_pu; \ > }) > > #define __put_user_size(x, ptr, size, retval, errret) \ > diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c > index f456860..12d27f8 100644 > --- a/arch/x86/lib/delay.c > +++ b/arch/x86/lib/delay.c > @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__delay); > > inline void __const_udelay(unsigned long xloops) > { > - int d0; > + unsigned long d0; > > xloops *= 4; > asm("mull %%edx"Is this all that you need (plus the 16-bit setup code tweaks) to get LLVM to successfully build a 64-bit kernel image? If yes then this doesnt look all that bad or invasive at first sight (if the put_user() workaround can be expressed in a cleaner way), but in any case it would be nice to hear an LLVM person's opinion about roughly when this is going to be solved in LLVM itself. Ingo
Török Edwin
2009-Jan-24 18:57 UTC
[LLVMdev] inline asm semantics: output constraint width smaller than input
On 2009-01-24 19:27, Ingo Molnar wrote:> * Török Edwin <edwintorok at gmail.com> wrote: > >> #define put_user(x, ptr) \ >> ({ \ >> - int __ret_pu; \ >> + __typeof__(*(ptr)) __ret_pu; \ >> > > This does not look right. We can sometimes have put_user() of non-integer > types (say structures).I didn't encounter it with my .config, but it is certainly possible. I think using __builtin_choose_expr would be better than the switch See a new patch at the end of this mail, using __builtin_choose_expr. [vmlinux size still same] It also includes some 32-bit stuff, but that is not complete yet.> How does the (int)__ret_pu cast work in that case? >It would fail at compile time, with an error message that you can't cast aggregates to ints, so my patch is not good.> We'll fall into this branch in that case: > > default: \ > __put_user_x(X, __pu_val, ptr, __ret_pu); \ > break; \ > > and __ret_pu has a nonsensical type in that case. >That branch is a call to a non-existent function __put_user_X, and should give error at link time, right? In the new patch below I used (void)-EFAULT so that you would get an error at compile time (as suggested in __builtin_choose_expr in gcc's manual), if that branch would ever get expanded. Does that sound right?> >> __typeof__(*(ptr)) __pu_val; \ >> __chk_user_ptr(ptr); \ >> might_fault(); \ >> __pu_val = x; \ >> + /* return value is 0 or -EFAULT, both fit in 1 byte, and \ >> + * are sign-extendable to int */ \ >> switch (sizeof(*(ptr))) { \ >> case 1: \ >> __put_user_x(1, __pu_val, ptr, __ret_pu); \ >> @@ -261,7 +263,7 @@ extern void __put_user_8(void); >> __put_user_x(X, __pu_val, ptr, __ret_pu); \ >> break; \ >> } \ >> - __ret_pu; \ >> + (int)__ret_pu; \ >> }) >> >> #define __put_user_size(x, ptr, size, retval, errret) \ >> diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c >> index f456860..12d27f8 100644 >> --- a/arch/x86/lib/delay.c >> +++ b/arch/x86/lib/delay.c >> @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__delay); >> >> inline void __const_udelay(unsigned long xloops) >> { >> - int d0; >> + unsigned long d0; >> >> xloops *= 4; >> asm("mull %%edx" >> > > Is this all that you need (plus the 16-bit setup code tweaks)The 16-bit setup code is compiled, but obviously doesn't work. I think the best approach would be for LLVM to give a warning/error, when -fno-unit-at-a-time is used, since it doesn't support that.> to get LLVM > to successfully build a 64-bit kernel image? >With the .config I sent previously, yes. With some other .config most likely more changes are needed, for example the SMP code, KVM code, but as I said in my previous email I don't know how to "fix" the inline asm in that case. There's something wrong with building some modules also, I keep getting an "idr_init" undefined error, but the symbol is present in my vmlinux. If I turn make those modules built into the kernel it works then (sctp, w1, thermalsysfs). It looks like I'll also have to submit some patches for ARCH=um, because I get undefined references to __bad_size, __guard, and __stack_smash_handler. __bad_size is probably because LLVM didn't inline expand + DCE something GCC did. With an unpatched LLVM I would also need weak attributes to be on the function type, instead of the return type, but I think thats an LLVM bug, and a one-line patch corrects it.> If yes then this doesnt look all that bad or invasive at first sight (if > the put_user() workaround can be expressed in a cleaner way), but in any > case it would be nice to hear an LLVM person's opinion about roughly when > this is going to be solved in LLVM itself. >Yes, that is why LLVMDev is on Cc:, somebody will eventually reply ;) Not-Signed-off-by: Török Edwin <edwintorok at gmail.com> --- arch/x86/include/asm/uaccess.h | 41 +++++++++++++++++++-------------------- arch/x86/lib/delay.c | 2 +- arch/x86/pci/pcbios.c | 4 ++- 3 files changed, 24 insertions(+), 23 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 69d2757..46d00a8 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -154,7 +154,7 @@ extern int __get_user_bad(void); #define get_user(x, ptr) \ ({ \ - int __ret_gu; \ + unsigned long __ret_gu; \ unsigned long __val_gu; \ __chk_user_ptr(ptr); \ might_fault(); \ @@ -176,7 +176,7 @@ extern int __get_user_bad(void); break; \ } \ (x) = (__typeof__(*(ptr)))__val_gu; \ - __ret_gu; \ + (int)__ret_gu; \ }) #define __put_user_x(size, x, ptr, __ret_pu) \ @@ -200,12 +200,15 @@ extern int __get_user_bad(void); : "A" (x), "r" (addr), "i" (-EFAULT), "0" (err)) #define __put_user_x8(x, ptr, __ret_pu) \ - asm volatile("call __put_user_8" : "=a" (__ret_pu) \ - : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx") + ({ u32 __ret_pu;\ + asm volatile("call __put_user_8" : "=a" (__ret_pu) \ + : "A" ((typeof(*(ptr)))(x)), "c" (ptr) : "ebx");\ + (int)__ret_pu;}) #else #define __put_user_asm_u64(x, ptr, retval) \ __put_user_asm(x, ptr, retval, "q", "", "Zr", -EFAULT) -#define __put_user_x8(x, ptr, __ret_pu) __put_user_x(8, x, ptr, __ret_pu) +#define __put_user_x8(x, ptr, __ret_pu) \ + ({ u64 __ret_pu; __put_user_x(8, x, ptr, __ret_pu); (int)__ret_pu; }) #endif extern void __put_user_bad(void); @@ -239,29 +242,25 @@ extern void __put_user_8(void); */ #define put_user(x, ptr) \ ({ \ - int __ret_pu; \ __typeof__(*(ptr)) __pu_val; \ __chk_user_ptr(ptr); \ might_fault(); \ __pu_val = x; \ - switch (sizeof(*(ptr))) { \ - case 1: \ + __builtin_choose_expr(sizeof(*(ptr)) == 1, \ + ({ u8 __ret_pu; \ __put_user_x(1, __pu_val, ptr, __ret_pu); \ - break; \ - case 2: \ + (int)__ret_pu;}), \ + __builtin_choose_expr(sizeof(*(ptr)) == 2, \ + ({ u16 __ret_pu; \ __put_user_x(2, __pu_val, ptr, __ret_pu); \ - break; \ - case 4: \ + (int)__ret_pu;}), \ + __builtin_choose_expr(sizeof(*(ptr)) == 4, \ + ({ u32 __ret_pu; \ __put_user_x(4, __pu_val, ptr, __ret_pu); \ - break; \ - case 8: \ - __put_user_x8(__pu_val, ptr, __ret_pu); \ - break; \ - default: \ - __put_user_x(X, __pu_val, ptr, __ret_pu); \ - break; \ - } \ - __ret_pu; \ + (int)__ret_pu;}), \ + __builtin_choose_expr(sizeof(*(ptr)) == 8, \ + __put_user_x8(__pu_val, ptr, __ret_pu), \ + (void)-EFAULT)))); \ }) #define __put_user_size(x, ptr, size, retval, errret) \ diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c index f456860..12d27f8 100644 --- a/arch/x86/lib/delay.c +++ b/arch/x86/lib/delay.c @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__delay); inline void __const_udelay(unsigned long xloops) { - int d0; + unsigned long d0; xloops *= 4; asm("mull %%edx" diff --git a/arch/x86/pci/pcbios.c b/arch/x86/pci/pcbios.c index b82cae9..dfff175 100644 --- a/arch/x86/pci/pcbios.c +++ b/arch/x86/pci/pcbios.c @@ -65,6 +65,7 @@ static struct { static unsigned long bios32_service(unsigned long service) { unsigned char return_code; /* %al */ + unsigned long return_code_compat; /* %eax */ unsigned long address; /* %ebx */ unsigned long length; /* %ecx */ unsigned long entry; /* %edx */ @@ -72,13 +73,14 @@ static unsigned long bios32_service(unsigned long service) local_irq_save(flags); __asm__("lcall *(%%edi); cld" - : "=a" (return_code), + : "=a" (return_code_compat), "=b" (address), "=c" (length), "=d" (entry) : "0" (service), "1" (0), "D" (&bios32_indirect)); + return_code = return_code_compat; local_irq_restore(flags); switch (return_code) { -- 1.5.6.5
Chris Lattner
2009-Jan-24 19:23 UTC
[LLVMdev] inline asm semantics: output constraint width smaller than input
On Jan 24, 2009, at 9:27 AM, Ingo Molnar wrote:>> #define __put_user_size(x, ptr, size, retval, errret) \ >> diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c >> index f456860..12d27f8 100644 >> --- a/arch/x86/lib/delay.c >> +++ b/arch/x86/lib/delay.c >> @@ -112,7 +112,7 @@ EXPORT_SYMBOL(__delay); >> >> inline void __const_udelay(unsigned long xloops) >> { >> - int d0; >> + unsigned long d0; >> >> xloops *= 4; >> asm("mull %%edx" > > Is this all that you need (plus the 16-bit setup code tweaks) to get > LLVM > to successfully build a 64-bit kernel image? > > If yes then this doesnt look all that bad or invasive at first sight > (if > the put_user() workaround can be expressed in a cleaner way), but in > any > case it would be nice to hear an LLVM person's opinion about roughly > when > this is going to be solved in LLVM itself.Hi Ingo, We would like to support some more specific cases (e.g. when tying a pointer/int to a different size pointer/int) but we currently don't intend to support all cases (e.g. tying a FP value to int). We are in this position because the semantics are very vague and hard to reason about (and change based on target endianness) and we had many subtle bugs in the corner cases. Instead of having silent miscompiles, we decided to just reject all the "hard" cases and add them back one by one as there is demand. That way users could choose to modify their asms instead of having them be potentially silently miscompiled. LLVM 2.5 is in its release process right now, so it will not have improvements in this area, but LLVM 2.6 certainly could. If there is interest in building the kernel with 2.5, I think taking the patches would be worthwhile. If that is hopeless anyway, waiting for the LLVM- side fixes should be fine. -Chris
Andreas Schwab
2009-Jan-24 20:07 UTC
[LLVMdev] inline asm semantics: output constraint width smaller than input
Török Edwin <edwintorok at gmail.com> writes:> @@ -239,11 +239,13 @@ extern void __put_user_8(void); > */ > #define put_user(x, ptr) \ > ({ \ > - int __ret_pu; \ > + __typeof__(*(ptr)) __ret_pu; \ > __typeof__(*(ptr)) __pu_val; \ > __chk_user_ptr(ptr); \ > might_fault(); \ > __pu_val = x; \ > + /* return value is 0 or -EFAULT, both fit in 1 byte, and \ > + * are sign-extendable to int */ \That does not work when *ptr is unsigned (char or short). Andreas. -- Andreas Schwab, SuSE Labs, schwab at suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Duncan Sands
2009-Jan-27 19:42 UTC
[LLVMdev] inline asm semantics: output constraint width smaller than input
Hi,> If yes then this doesnt look all that bad or invasive at first sight (if > the put_user() workaround can be expressed in a cleaner way), but in any > case it would be nice to hear an LLVM person's opinion about roughly when > this is going to be solved in LLVM itself.one thing that seems to be clear to everyone except me is... what are the semantics supposed to be? [My understanding is that what is being discussed is when you have an asm with a register as input and output, but with integer types of different width for the input and output, but I saw some mention of struct types in this thread...]. Presumably this is something obvious, but it would be good to have someone spell it out in small words that even someone like me can understand :) Thanks, Duncan.
Apparently Analagous Threads
- [LLVMdev] inline asm semantics: output constraint width smaller than input
- [LLVMdev] inline asm semantics: output constraint width smaller than input
- [LLVMdev] inline asm semantics: output constraint width smaller than input
- [LLVMdev] inline asm semantics: output constraint width smaller than input
- [LLVMdev] inline asm semantics: output constraint width smaller than input