thr3ads.net - similar to: "[LLVMdev] [Patch] Fix SSE2 packing intrinsics return type"

[LLVMdev] [Patch] Fix SSE2 packing intrinsics return type

2009 Jun 10

On Tue, Jun 9, 2009 at 2:58 PM, Nicolas Capens <nicolas at capens.net> wrote:
> Please consider committing the attached patch. I believe the SSE2 packsswb,
> packssdw and packuswb intrinsics have an incorrect return type.
If we really wanted to do this, an AutoUpgrade patch would be necessary for backwards-compatibility. I'm not sure it's worth bothering.
-Eli

[LLVMdev] [Patch] Fix SSE2 packing intrinsics return type

2009 Jun 10

On Jun 9, 2009, at 5:56 PM, Eli Friedman wrote:
> On Tue, Jun 9, 2009 at 2:58 PM, Nicolas Capens <nicolas at capens.net> wrote:
>> Please consider committing the attached patch. I believe the SSE2 packsswb,
>> packssdw and packuswb intrinsics have an incorrect return type.
>
> If we really wanted to do this, an AutoUpgrade patch would be necessary

[LLVMdev] [Patch] Fix SSE2 packing intrinsics return type

2009 Jun 10

Hi Eli,
What exactly do you mean by an AutoUpgrade patch? I don't see how this could cause any issues with backward compatibility. People currently using these intrinsics need a bitcast of the result to avoid an assert, and with the patch applied the bitcast is no longer necessary.
Cheers, Nicolas
-----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at
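To see why the result type matters in this thread, here is a minimal Python sketch of what PACKSSWB computes on 128-bit operands: it narrows two vectors of eight signed 16-bit lanes into one vector of sixteen signed 8-bit lanes, saturating each value. The function names are illustrative and not part of LLVM, and the type the intrinsic declared before the patch is not shown in this excerpt.

```python
def saturate_i8(value):
    """Clamp a signed 16-bit value into the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, value))


def packsswb(a, b):
    """Model PACKSSWB on 128-bit operands.

    Takes two vectors of eight 16-bit lanes and returns one vector of
    sixteen 8-bit lanes: the lanes of `a` fill the low half of the result,
    the lanes of `b` fill the high half.
    """
    if len(a) != 8 or len(b) != 8:
        raise ValueError("each operand must have eight 16-bit lanes")
    return [saturate_i8(v) for v in list(a) + list(b)]


result = packsswb([0, 1, -1, 127, 128, -128, -129, 32767], [-32768] * 8)
print(len(result), result)
```

The lane count doubles while the lane width halves, so a result typed with 8-bit lanes matches what the instruction produces without any reinterpretation by the caller.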

[LLVMdev] 128-bit PXOR requires SSE2

2012 Jan 20

Hi all,
I think I found a bug in LLVM 3.0: When compiling for a target without SSE2 support, there were some 128-bit PXOR instructions in the generated code. I traced it down to the following definition in X86InstrSSE.td:
  def FsFLD0SS : I<0xEF, MRMInitReg, (outs FR32:$dst), (ins), "", [(set FR32:$dst, fp32imm0)]>,

[LLVMdev] 128-bit PXOR requires SSE2

2012 Jan 20

On Fri, Jan 20, 2012 at 2:47 PM, Nicolas Capens <nicolas.capens at gmail.com> wrote:
> Hi all,
>
> I think I found a bug in LLVM 3.0: When compiling for a target without
> SSE2 support, there were some 128-bit PXOR instructions in the generated
> code.
>
> I traced it down to the following definition in X86InstrSSE.td:
>
> def FsFLD0SS : I<0xEF, MRMInitReg,

[LLVMdev] [rfc] "alias weak" X "weak alias"

2014 Jun 07

>> The days of the old .ll parser are long gone, but is it too late to
>> change? In case it is not, the attached patches implement just that
>> :-)
> I'm afraid you need to provide syntax autoupgrade until 4.0
Why, we moved to doing autoupgrade via bitcode quite some time ago. There were quite a few format changes to the .ll in the process.
Cheers, Rafael

MMX loop filter for theora-exp

2005 Aug 17

Hello,
I would like to announce the semi-optimized oc_state_loop_filter_frag_rows. It gains about a 7% speedup. Unfortunately it has some issues:
1) won't compile on 64bit (I will fix it later hopefully)
2) is not yet fully optimized (instruction stalls)
Here are the results. CPU: Athlon, speed 1466.91 MHz (estimated) Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask

[LLVMdev] getIntrinsicID() optimization

2009 Oct 17

Hi Chris, Function is currently 108 bytes large. Could 4 more bytes really be an issue? Actually 2 should suffice. While I understand that some applications value storage more than anything, many applications value compilation time very highly. getIntrinsicID is called all over the place (isIntrinsic uses it as well), and every single time it checks the function name. To me that sounds a lot
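The trade-off raised in this message, a few extra bytes per function in exchange for not re-checking the name on every call, can be sketched in Python as follows. This is a hypothetical model only: the `Function` class, the ID table and the counter are made up for illustration and do not reflect LLVM's actual implementation.

```python
class Function:
    """Hypothetical stand-in for an IR function object (not LLVM's class)."""

    NOT_INTRINSIC = 0
    # Made-up name-to-ID table, for illustration only.
    INTRINSIC_IDS = {"llvm.memcpy": 1, "llvm.sqrt": 2}

    def __init__(self, name):
        self._name = name
        self._cached_id = None  # the small extra field the message argues for
        self.name_checks = 0    # counts how often the name is actually inspected

    def set_name(self, name):
        # Renaming must invalidate the cache, or the stored ID goes stale.
        self._name = name
        self._cached_id = None

    def get_intrinsic_id(self):
        # Inspect the name only on the first call after creation or rename.
        if self._cached_id is None:
            self.name_checks += 1
            self._cached_id = self.INTRINSIC_IDS.get(self._name, self.NOT_INTRINSIC)
        return self._cached_id

    def is_intrinsic(self):
        return self.get_intrinsic_id() != self.NOT_INTRINSIC


f = Function("llvm.sqrt")
for _ in range(3):
    f.is_intrinsic()
print(f.get_intrinsic_id(), f.name_checks)
```

The cost of the cached variant is the invalidation on rename, which is the kind of bookkeeping a real implementation would have to get right.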

[LLVMdev] LLVM 2.8 and MMX

2010 Sep 08

On Wed, Sep 8, 2010 at 12:35 AM, Nicolas Capens <nicolas.capens at gmail.com> wrote:
> Hi Chris,
>
> It's not broken, but the performance is crippled.
>
> I noticed that the code still contains some MMX instructions, but several
> operations get expanded (apparently swizzling and such get expanded to a
> large number of byte moves). I think some changes related to

RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions

2017 Sep 21

I agree with Paul that we need to formalise the compatibility policy before we start removing support for old intrinsics. Do we want a deprecation warning of some kind for the use of any intrinsic used in auto-upgrade, would that actually be useful or just a nuisance? In the meantime I’m happy to help fix any missing test coverage. > On 20 Sep 2017, at 22:16, Robinson, Paul via llvm-dev

RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions

2017 Sep 20

Many of the older autoupgrades have no test cases because I think when we upgraded them we just replaced all the code in the tests with native IR. So for some of the code we don't even know if it works. I don't really want to watch the amount of code here continue to grow indefinitely. It's pretty poorly structured and has been up against the MSVC cascaded if/else limit a couple of times.

RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions

2017 Sep 20

We have quite a lot of code in AutoUpgrade.cpp to upgrade X86 intrinsics that have been replaced with native IR over the years. Has enough time and/or versions passed that we can begin phasing out some of this code? As I'm writing these we don't seem to have tests for a lot of the older upgrades. We've done better at this in the last few years. 3.1 added upgrade for:

RFC: [X86] Can we begin removing AutoUpgrade support for x86 instrinsics added in early 3.X versions

2017 Sep 22

Hi, I believe we have a formal policy: we support every bitcode produced since LLVM 3.0: https://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility (until we decide to uprev the version we support). Unfortunately, the testing was only added around 3.6 or 3.7? And support is only as good as the testing we have... -- Mehdi 2017-09-21 0:23 GMT-07:00 Simon Pilgrim via llvm-dev <

[LLVMdev] bit code file incompatibility due to debug info changes

2013 Nov 18

>>
>> At a minimum, it seems like we need a version number in the debug info
>> metadata so we can detect this situation and avoid crashing.
>
> Or to put it in the terms of the IR: we need to autoupgrade the debug info
> metadata just like we do intrinsics. With debug info this might (at the
> worst) involve dropping old metadata.
> The verifier is probably

[LLVMdev] bit code file incompatibility due to debug info changes

2013 Nov 22

On Thu, Nov 21, 2013 at 4:17 PM, Manman Ren <manman.ren at gmail.com> wrote:
> On Thu, Nov 21, 2013 at 12:01 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> On Thu, Nov 21, 2013 at 11:45 AM, Manman Ren <manman.ren at gmail.com> wrote:
>>> On Thu, Nov 21, 2013 at

[LLVMdev] Output to a DLL

2009 Jun 11

Hi all, I'd like to be able to write JIT-compiled code to a Windows DLL. I have no idea where to start though. Does LLVM already offer some support for this? Or would it be straightforward to write my own DLL writer (no advanced features needed)? Or maybe I could use an external linker? All help highly appreciated! Cheers, Nicolas

[LLVMdev] Spilled variables using unaligned moves

2008 Jul 14

Hi all,
It looks like vector spills don't use aligned moves even though the stack is aligned. This seems like an optimization opportunity. The attached replacement of fibonacci.cpp generates x86 code like this:
  03A70010 push ebp
  03A70011 mov ebp,esp
  03A70013 and esp,0FFFFFFF0h
  03A70019 sub esp,1A0h
  ...
  03A7006C movups xmmword ptr
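The alignment argument in this message can be checked with a little arithmetic. The sketch below models the `and esp,0FFFFFFF0h` and `sub esp,1A0h` steps from the listing in Python; the starting stack pointer is a made-up value, and whether each spill slot sits at a 16-byte multiple from the resulting pointer is not visible in the truncated excerpt.

```python
def align_down(address, alignment=16):
    """Round an address down to a multiple of `alignment` (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & ~(alignment - 1)


esp = 0x0012FF7C                # hypothetical incoming stack pointer
aligned_esp = align_down(esp)   # models: and esp,0FFFFFFF0h
frame_esp = aligned_esp - 0x1A0 # models: sub esp,1A0h

print(hex(aligned_esp), hex(frame_esp), frame_esp % 16)
```

Because 1A0h is itself a multiple of 16, the pointer stays 16-byte aligned after the frame is allocated, so any slot at a 16-byte multiple offset from it would satisfy the alignment requirement of an aligned move such as movaps.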

[LLVMdev] LLVM 2.8 and MMX

2010 Sep 22

Assign the bug to me and I'll fix it in TOT next week! Thanks for narrowing it down!
On Wednesday, September 22, 2010, Nicolas Capens <nicolas.capens at gmail.com> wrote:
> Hi all,
>
> I think I figured it out:
> 112804 causes 64-bit UNPCKLBW to no longer be selected for certain cases.
> 112805 is benign.
> 112806 causes 64-bit UNPCKHBW to no longer be selected for

[LLVMdev] LLVM 2.8 and MMX

2010 Sep 21

This thread confuses me. I thought Chris said that LLVM 2.8 will not lower generic vectors to MMX because it breaks x87 code, and I didn't see an answer to your question about a switch to tell the code generator otherwise. However, you're complaining that MMX performance is subpar, even though LLVM 2.8 isn't supposed to generate MMX instructions. Can someone clarify the situation

[PATCH] promised MMX patches rc1

2005 Mar 23

Hello,
Here is my first speedup patch. It gains about 10-11%. No IDCT yet. Please feel free to comment on my code or, even better, think about improvements. :) I believe my routines are not so bad, and maybe one day they will be even faster. What needs to be optimized is the loop filter function. I have no ideas now how to do it. It does not leave much space for parallel stuff, copying memory from lot of
