thr3ads.net - similar to: "[LLVMdev] Bug in X86CompilationCallback_SSE"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Bug in X86CompilationCallback_SSE"

[LLVMdev] Bug in X86CompilationCallback_SSE

2009 Mar 11

I don't know how to file a PR, but I have a patch (see below) that should work regardless of ABI differences, since it relies on the compiler to do the tough job.

void X86CompilationCallback_SSE(void) {
  char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned
  asm volatile (
    "movl %%eax,(%0)\n"
    "movl %%edx,4(%0)\n" // Save EAX/EDX/ECX

[LLVMdev] Bug in X86CompilationCallback_SSE

2009 Mar 11

Hello, Corrado

> Before you can correctly invoke a function via the Procedure Linkage
> Table (plt), the ABI mandates that ebx is pointing to the GOT (Global
> Offset Table) (see http://www.greyhat.ch/lab/downloads/pic.html)

This is a known issue; it's just that nobody realized we have a bunch of non-PIC-aware assembler code. :) Fixing it would not be trivial, though, mostly due to ABI

[LLVMdev] Bug in X86CompilationCallback_SSE

2009 Mar 12

On Mar 11, 2009, at 2:39 PM, Corrado Zoccolo wrote:
> I don't know how to file a PR, but I have a patch (see below), that
> should work regardless of abi differences, since it relies on the
> compiler to do the tough job.
>
> void X86CompilationCallback_SSE(void) {
> char * SAVEBUF= (char*) alloca(64+12); // alloca is 16byte aligned

How do you ensure it's 16-byte
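The alignment question raised in this reply can be illustrated with a small sketch. It is not taken from the thread: the helper names `align_up` and `save_area_fits_aligned` are hypothetical, and a plain local array stands in for the `alloca` buffer. The idea is to over-allocate by 15 bytes and round the pointer up, so the 64+12 byte save area is 16-byte aligned no matter what the allocator returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the thread): round `p` up to the next
// multiple of `alignment`, which must be a power of two.
inline char *align_up(char *p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  const std::uintptr_t aligned = (addr + mask) & ~mask;
  return p + (aligned - addr);
}

// Over-allocate by (alignment - 1) bytes, then align inside the buffer.
// The local array is an assumption standing in for the alloca() call.
inline bool save_area_fits_aligned() {
  constexpr std::size_t kSaveBytes = 64 + 12;  // size used in the patch
  constexpr std::size_t kAlign = 16;
  char raw[kSaveBytes + kAlign - 1];
  char *save_buf = align_up(raw, kAlign);
  const bool aligned =
      reinterpret_cast<std::uintptr_t>(save_buf) % kAlign == 0;
  const bool fits = save_buf + kSaveBytes <= raw + sizeof(raw);
  return aligned && fits;
}
```

With this approach the caller does not depend on any alignment guarantee from the allocation itself; the cost is up to 15 wasted bytes of stack.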

[LLVMdev] Bug in X86CompilationCallback_SSE

2009 Mar 12

This looks like an interesting idea. As written, though, the inline asms aren't safe: they reference %eax, %edx, etc. without declaring them in constraints, so the compiler wouldn't know that it can't clobber those registers.

Dan

On Mar 11, 2009, at 2:39 PM, Corrado Zoccolo wrote:
> I don't know how to file a PR, but I have a patch (see below), that
> should work
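The point about undeclared registers can be sketched with GCC-style extended asm. This is an illustration only, not the patch from the thread: the function `add_one_via_eax` is hypothetical, and the asm path is compiled only on x86 targets with a GNU-compatible compiler. The asm body names %eax directly, so "eax" appears in the clobber list; without that entry the compiler would be free to keep a live value, or one of the operands, in that register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: compute x + 1 while using EAX explicitly as scratch.
inline std::uint32_t add_one_via_eax(std::uint32_t x) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  std::uint32_t result;
  asm volatile(
      "movl %1, %%eax\n\t"   // copy the input operand into EAX
      "addl $1, %%eax\n\t"   // modify EAX (also changes the flags)
      "movl %%eax, %0\n\t"   // copy EAX to the output operand
      : "=r"(result)         // output: any general-purpose register
      : "r"(x)               // input: any general-purpose register
      : "eax", "cc");        // clobbers: EAX and the condition codes
  return result;
#else
  return x + 1;  // portable fallback for other compilers and targets
#endif
}
```

Because "eax" is listed as clobbered, the compiler will not allocate either operand to EAX and will not assume EAX survives the statement.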

[LLVMdev] Build issues on Solaris

2009 Aug 18

Hello, Nathan

> or if it should be a configure test, which might be safer. Are there
> any x86 platforms (other than apple) that don't need PLT-indirect calls?

Yes, mingw. However, just tweaking the define is not enough: we're not loading the address of the GOT into ebx before the call (on 32-bit ABIs), so the call will go nowhere.

--
With best regards, Anton Korobeynikov
Faculty of

[LLVMdev] Build issues on Solaris

2009 Aug 11

Hi all, I've encountered a couple of minor build issues on Solaris that have crept in since 2.5, fixes below:

1. In lib/Target/X86/X86JITInfo.cpp, there is:

// Check if building with -fPIC
#if defined(__PIC__) && __PIC__ && defined(__linux__)
#define ASMCALLSUFFIX "@PLT"
#else
#define ASMCALLSUFFIX
#endif

Which causes a link failure due to the non-PLT
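How a macro like the quoted ASMCALLSUFFIX ends up in an assembly string can be sketched as follows. The `#if` block is copied from the snippet above; the `ASM_CALL` macro, the function `callback_call_line`, and the symbol name `some_callback` are hypothetical and only show the string-literal concatenation. On a non-Linux PIC build the suffix expands to nothing, so the emitted call bypasses the PLT.

```cpp
#include <cassert>
#include <string>

// Check if building with -fPIC (as quoted in the message above)
#if defined(__PIC__) && __PIC__ && defined(__linux__)
#define ASMCALLSUFFIX "@PLT"
#else
#define ASMCALLSUFFIX
#endif

// Hypothetical macro: adjacent string literals are concatenated by the
// compiler, so an empty ASMCALLSUFFIX simply drops out of the result.
#define ASM_CALL(sym) "call " sym ASMCALLSUFFIX "\n"

// Returns the assembly line that would be pasted into an asm block.
// "some_callback" is a placeholder symbol, not a real LLVM function.
inline std::string callback_call_line() {
  return ASM_CALL("some_callback");
}
```

The result is either a plain call or a call with the @PLT suffix, depending on the predefined macros of the build.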

[LLVMdev] Build issues on Solaris

2009 Aug 25

On 19/08/2009, at 4:00 AM, Anton Korobeynikov wrote:
> Hello, Nathan
>
>> or if it should be a configure test, which might be safer. Are there
>> any x86 platforms (other than apple) that don't need PLT-indirect
>> calls?
> Yes, mingw. However just tweaking the define is not enough - we're not

Ok, so configure might be the way to go then, maybe something

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

On Sat, Jul 7, 2012 at 12:25 AM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
> On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
>>> [...]
>>> movaps 32(%rdi), %xmm3
>>> movaps 48(%rdi), %xmm2
>>>

[LLVMdev] Fix for non-standard variable length array + Visual C X86 specific code

2004 Oct 18

Paolo Invernizzi wrote:
> There was a similar problem some time ago, and was resolved with alloca.
> I think it's a better solution to use the stack instead of the heap...

I tend to agree, but the constructors won't get called if it's an object array. Anyway, in this particular case there were no objects, just pointers and bools, so alloca should be fine. I'll leave it to
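The caveat about constructors can be shown with a short sketch. It is an illustration under stated assumptions, not code from the thread: the `Counter` type and the function `construct_in_raw_storage` are hypothetical, and a fixed-size local buffer stands in for memory obtained from `alloca`. Raw stack storage runs no constructors on its own, so objects have to be created with placement new and destroyed explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type that counts how many instances are currently alive.
struct Counter {
  static int live;
  Counter() { ++live; }
  ~Counter() { --live; }
};
int Counter::live = 0;

// Returns how many objects were constructed in the raw buffer.
inline int construct_in_raw_storage() {
  constexpr std::size_t kCount = 4;
  // Raw, suitably aligned bytes: declaring this runs no constructors.
  alignas(Counter) unsigned char raw[kCount * sizeof(Counter)];
  const int before = Counter::live;

  Counter *objs[kCount];
  for (std::size_t i = 0; i < kCount; ++i)
    objs[i] = new (raw + i * sizeof(Counter)) Counter();  // placement new
  const int constructed = Counter::live - before;

  // Destructors must also be invoked by hand, in reverse order.
  for (std::size_t i = kCount; i-- > 0;)
    objs[i]->~Counter();
  return constructed;
}
```

For arrays of pointers and bools, as in the case discussed, none of this is needed, which is why plain `alloca` was considered sufficient there.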

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
>
>> I've noticed that LLVM tends to generate suboptimal code and spill an
>> excessive amount of registers in large functions, such as in those
>> that are automatically generated by FFTW.
>

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
> I've noticed that LLVM tends to generate suboptimal code and spill an
> excessive amount of registers in large functions, such as in those
> that are automatically generated by FFTW.

One problem might be that we're forcing the 16 stores to the out array to happen in source order, which

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

Hi, I've noticed that LLVM tends to generate suboptimal code and spill an excessive amount of registers in large functions, such as in those that are automatically generated by FFTW. LLVM generates good code for a function that computes an 8-point complex FFT, but from 16-point upwards, icc or gcc generates much better code. Here is an example of a sequence of instructions from a 32-point

[LLVMdev] crash in JIT when running the inliner

2008 Aug 06

Hi, Today I've been trying to debug a weird bug that makes the JIT crash with certain code when using the inliner. This may sound weird, but if I disable the inliner, it doesn't crash. I include an example gdb dump below. Does something look wrong? Do you think it's a bug in the JIT, or is some other piece of code writing to the JIT memory? I don't really know

[LLVMdev] How does SSEDomainFix work?

2010 May 11

Hello. This is my 1st post. I have tried the SSE execution domain fixup pass, but I am not able to see any improvements. I expect the example below to use MOVDQA, PAND &c. (On Nehalem, ANDPS is much slower than PAND.) Please tell me if I am doing something wrong. Thank you.

Takumi

Host: i386-mingw32
Build: trunk at 103373

foo.ll:
define <4 x i32> @foo(<4 x i32> %x,

[LLVMdev] How does SSEDomainFix work?

2010 May 11

On May 10, 2010, at 9:07 PM, NAKAMURA Takumi wrote:
> Hello. This is my 1st post.

ようこそ！

> I have tried SSE execution domain fixup pass.
> But I am not able to see any improvements.

Did you actually measure runtime, or did you look at assembly?

> I expect for the example below to use MOVDQA, PAND &c.
> (On nehalem, ANDPS is extremely slower than PAND)

Are you sure? The

[LLVMdev] Exception handling question

2010 Jan 22

Interesting. Was this the reason you were getting the recursive compilation error in JIT::runJITOnFunctionUnlocked(...) (isAlreadyCodeGenerating)? Do you have the time to try your test with 2.7?

Garrison

On Jan 22, 2010, at 17:37, James Williams wrote:
> I've worked around this issue in my test case by simply calling my personality function on program to ensure it's JIT'ed

[LLVMdev] X86 FMA4

2012 Jul 27

Hey Michael, Thanks for the legwork! It appears that the stats you listed are for movaps [SSE], not vmovaps [AVX]. I would *assume* that vmovaps(m128) is closer to vmovaps(m256), since they are both AVX instructions. Although, yes, I agree that this is not clear from Agner's report. Please correct me if I am misunderstanding. As I am sure you are aware, we cannot use SSE (movaps)

[LLVMdev] llvm.x86.sse2.sqrt.pd not using sqrtpd, calling a function that modifies ECX

2013 Jul 19

(Changing subject line as diagnosis has changed)

I'm attaching the compiled code that I've been getting, both with CodeGenOpt::Default and CodeGenOpt::None. The crash isn't occurring with CodeGenOpt::None, but that seems to be because ECX isn't being used; it still gets set to 0x7fffffff by one of the calls to 76719BA1.

I notice that X86::SQRTPD[m|r] appear in

[LLVMdev] SIMD instructions and memory alignment on X86

2013 Jul 19

Hmm, I'm not able to get those .ll files to compile if I disable SSE, and I end up with SSE instructions (including sqrtpd) if I don't disable it.

On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote:
> Is there something specifically required to enable SSE? If it's not
> detected as available (based from the target triple?) then I don't think

New routine: FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_16

2013 Aug 22

libFLAC has three SSE-accelerated functions FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_N (N = 4, 8, 12). They require lpc_order less than N. The best compression preset (flac -8) uses lpc_order up to 12, which means that during encoding FLAC also uses the unaccelerated C function. I'm not very familiar with asm, so I took FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_12, changed it and
