similar to: BUGS in code generated for target i386-win32

Displaying 20 results from an estimated 2000 matches similar to: "BUGS in code generated for target i386-win32"

2018 Nov 26
2
BUGS in code generated for target i386-win32
"Tim Northover" <t.p.northover at gmail.com> wrote: > Hi Stefan, > > On Mon, 26 Nov 2018 at 12:37, Stefan Kanthak via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> LLVM/clang generates wrong code for the following program >> (see <https://godbolt.org/z/UZrrkG>): > > It looks like all of these issues come down to mismatched
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll, while clang/LLVM recognizes common bit-twiddling idioms/expressions like unsigned int rotate(unsigned int x, unsigned int n) { return (x << n) | (x >> (32 - n)); } and typically generates "rotate" machine instructions for this expression, it fails to recognize other also common bit-twiddling idioms/expressions. The standard IEEE CRC-32 for "big
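A note on the rotate idiom quoted in this snippet: as written, it shifts by `32 - n`, which is undefined behaviour in C when `n` is 0, because a 32-bit value may not be shifted by 32. The sketch below contrasts it with a masked form that is defined for every `n`. The names `rotl_naive` and `rotl_masked` are illustrative and do not come from the thread, and `uint32_t` is used on the assumption that the poster's `unsigned int` is 32 bits wide.

```c
#include <stdint.h>

/* The idiom as quoted in the post: valid only for 0 < n < 32, because
   shifting a 32-bit value by 32 is undefined behaviour in C. */
static uint32_t rotl_naive(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Masked variant: both shift counts stay in the range 0..31, so the
   result is defined for every n, including 0 and n >= 32. */
static uint32_t rotl_masked(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}
```

Whether a given compiler turns either form into a single rotate instruction depends on the compiler version and options; the sketch only illustrates the source-level idiom.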
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
"Sanjay Patel" <spatel at rotateright.com> wrote: > IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like > this: > unsigned int foo(unsigned int crc) { > if (crc & 0x80000000) > crc <<= 1, crc ^= 0xEDB88320; > else > crc <<= 1; > return crc; > } To document this for x86 too: rewrite the function
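The "sbb masking" mentioned in this snippet can be approximated at the source level by turning the top bit into an all-ones or all-zeros mask and applying the polynomial through that mask instead of a branch. The following is a minimal sketch under that assumption; `crc_step_branch` and `crc_step_mask` are hypothetical names, and the truncated rewrite that follows in the original post is not reproduced here.

```c
#include <stdint.h>

/* The step as quoted in the snippet: shift left, and XOR in the
   constant when the bit shifted out was set. */
static uint32_t crc_step_branch(uint32_t crc)
{
    if (crc & 0x80000000u) {
        crc <<= 1;
        crc ^= 0xEDB88320u;
    } else {
        crc <<= 1;
    }
    return crc;
}

/* Branch-free variant: mask is 0xFFFFFFFF when the top bit is set and
   0 otherwise, so the constant is applied only in the first case. */
static uint32_t crc_step_mask(uint32_t crc)
{
    uint32_t mask = 0u - (crc >> 31);
    return (crc << 1) ^ (0xEDB88320u & mask);
}
```

Both functions compute the same value for every input; which machine instructions a compiler selects for either one is not guaranteed by the C source.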
2018 Dec 01
2
Where's the optimiser gone? (part 5.c): missed tail calls, and more...
Compile the following functions with "-O3 -target i386-win32" (see <https://godbolt.org/z/exmjWY>): __int64 __fastcall div(__int64 foo, __int64 bar) { return foo / bar; } On the left the generated code; on the right the expected, properly optimised code: push dword ptr [esp + 16] | push dword ptr [esp + 16] | push dword ptr [esp + 16] |
2015 Apr 09
6
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: > +++ b/kernel/locking/qspinlock_paravirt.h > @@ -0,0 +1,321 @@ > +#ifndef _GEN_PV_LOCK_SLOWPATH > +#error "do not include this file" > +#endif > + > +/* > + * Implement paravirt qspinlocks; the general idea is to halt the vcpus instead > + * of spinning them. > + * > + * This relies on the
2015 Mar 19
4
[PATCH 8/9] qspinlock: Generic paravirt support
On Thu, Mar 19, 2015 at 11:12:42AM +0100, Peter Zijlstra wrote: > So I was now thinking of hashing the lock pointer; let me go and quickly > put something together. A little something like so; ideally we'd allocate the hashtable since NR_CPUS is kinda bloated, but it shows the idea I think. And while this has loops in (the rehashing thing) their fwd progress does not depend on other
2015 Apr 13
1
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
On Thu, Apr 09, 2015 at 05:41:44PM -0400, Waiman Long wrote: > >>+void __init __pv_init_lock_hash(void) > >>+{ > >>+ int pv_hash_size = 4 * num_possible_cpus(); > >>+ > >>+ if (pv_hash_size< (1U<< LFSR_MIN_BITS)) > >>+ pv_hash_size = (1U<< LFSR_MIN_BITS); > >>+ /* > >>+ * Allocate space from bootmem which
2018 Nov 28
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
On Wed, Nov 28, 2018 at 7:11 AM Sanjay Patel via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Thanks for reporting this and other perf opportunities. As I mentioned > before, if you could file bug reports for these, that's probably the only > way they're ever going to get fixed (unless you're planning to fix them > yourself). It's not an ideal situation, but
2018 Nov 25
3
BUGS n code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
Hi @ll, targeting i386, LLVM/clang generates wrong code for the following functions: unsigned long __bswapsi2 (unsigned long ul) { return (((ul) & 0xff000000ul) >> 3 * 8) | (((ul) & 0x00ff0000ul) >> 8) | (((ul) & 0x0000ff00ul) << 8) | (((ul) & 0x000000fful) << 3 * 8); } unsigned long long __bswapdi2(unsigned long
2018 Nov 30
2
(Question regarding the) incomplete "builtins library" of "Compiler-RT"
"Friedman, Eli" <efriedma at codeaurora.org> wrote: > On 11/30/2018 8:31 AM, Stefan Kanthak via llvm-dev wrote: >> Hi @ll, >> >> compiler-rt implements (for example) the MSVC (really Windows) >> specific routines compiler-rt/lib/builtins/i386/chkstk.S and >> compiler-rt/lib/builtins/x86_64/chkstk.S as __chkstk_ms() >> See
2018 Nov 25
2
BUGS n code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
I just compiled the two attached files in 32-bit mode and ran it. It printed efcdab8967452301. I verified via objdump that the my_bswap function contains the following assembly, which I believe matches the assembly you linked to on godbolt. _my_bswap: 1f70: 55 pushl %ebp 1f71: 89 e5 movl %esp, %ebp 1f73: 8b 55 08 movl 8(%ebp), %edx 1f76: 8b 45 0c movl 12(%ebp), %eax 1f79: 0f c8
2018 Nov 25
3
BUGS n code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()
bswapdi2 for i386 is correct. Bits 31:0 of the source are loaded into edx. Bits 63:32 are loaded into eax. Those are each bswapped. The ABI for the return is that edx contains bits [63:32] and eax contains [31:0]. This is the opposite of how the registers were loaded. ~Craig On Sun, Nov 25, 2018 at 10:36 AM Craig Topper <craig.topper at gmail.com> wrote: > bswapsi2 on the x86-64 isn't using
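The reasoning in this reply can be checked in portable C: reversing the bytes of a 64-bit value is the same as reversing the bytes of each 32-bit half and exchanging the halves, which is why the low half ends up in the register that carries the high half of the result. A minimal sketch follows; `bswap32` and `bswap64_by_halves` are illustrative names, not the compiler-rt routines discussed in the thread.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000FF00u)
         | ((v << 8) & 0x00FF0000u)
         | (v << 24);
}

/* Reverse the eight bytes of a 64-bit value by swapping each half and
   then exchanging the halves: the swapped low half becomes bits 63:32
   of the result, the swapped high half becomes bits 31:0. */
static uint64_t bswap64_by_halves(uint64_t v)
{
    uint32_t lo = (uint32_t)v;         /* bits 31:0 of the source  */
    uint32_t hi = (uint32_t)(v >> 32); /* bits 63:32 of the source */
    return ((uint64_t)bswap32(lo) << 32) | bswap32(hi);
}
```

In terms of the register description above, `lo` plays the role of the value loaded into edx and `hi` the value loaded into eax.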
2015 Apr 09
0
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
On 04/09/2015 02:13 PM, Peter Zijlstra wrote: > On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: >> +++ b/kernel/locking/qspinlock_paravirt.h >> @@ -0,0 +1,321 @@ >> +#ifndef _GEN_PV_LOCK_SLOWPATH >> +#error "do not include this file" >> +#endif >> + >> +/* >> + * Implement paravirt qspinlocks; the general idea is to halt the
2015 Apr 09
0
[PATCH v15 09/15] pvqspinlock: Implement simple paravirt support for the qspinlock
On Thu, Apr 09, 2015 at 08:13:27PM +0200, Peter Zijlstra wrote: > On Mon, Apr 06, 2015 at 10:55:44PM -0400, Waiman Long wrote: > > +#define PV_HB_PER_LINE (SMP_CACHE_BYTES / sizeof(struct pv_hash_bucket)) > > +static struct qspinlock **pv_hash(struct qspinlock *lock, struct pv_node *node) > > +{ > > + unsigned long init_hash, hash = hash_ptr(lock, pv_lock_hash_bits);
2015 Apr 07
18
[PATCH v15 00/15] qspinlock: a 4-byte queue spinlock with PV support
v14->v15: - Incorporate PeterZ's v15 qspinlock patch and improve upon the PV qspinlock code by dynamically allocating the hash table as well as some other performance optimization. - Simplified the Xen PV qspinlock code as suggested by David Vrabel <david.vrabel at citrix.com>. - Add benchmarking data for 3.19 kernel to compare the performance of a spinlock heavy test
2018 Nov 30
3
(Question regarding the) incomplete "builtins library" of "Compiler-RT"
Hi @ll, compiler-rt implements (for example) the MSVC (really Windows) specific routines compiler-rt/lib/builtins/i386/chkstk.S and compiler-rt/lib/builtins/x86_64/chkstk.S as __chkstk_ms() See <http://msdn.microsoft.com/en-us/library/ms648426.aspx> Is there any special reason why compiler-rt doesn't implement other MSVC specific functions (alias builtins or "compiler