thr3ads.net - similar to: "[Bug 90887] New: PhiMovesPass in register allocator broken"

Displaying 20 results from an estimated 30000 matches similar to: "[Bug 90887] New: PhiMovesPass in register allocator broken"

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

On 02.10.2016 20:03, Ilia Mirkin wrote:
> On Sun, Oct 2, 2016 at 1:58 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>> Previously we'd end up with an unnecessary mov for the third immediate value.
>>
>> total instructions in shared programs : 851881 -> 851864 (-0.00%)
>> total gprs used in shared programs : 110295 -> 110295

[Bug 78161] New: [NV96] Artifacts in output of fragment program containing not unrolled loops with conditional break

2014 May 01

https://bugs.freedesktop.org/show_bug.cgi?id=78161
Priority: medium
Bug ID: 78161
Assignee: nouveau at lists.freedesktop.org
Summary: [NV96] Artifacts in output of fragment program containing not unrolled loops with conditional break
Severity: normal
Classification: Unclassified
OS: Linux (All)

[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD

2016 Oct 02

Previously we'd end up with an unnecessary mov for the third immediate value.
total instructions in shared programs : 851881 -> 851864 (-0.00%)
total gprs used in shared programs    : 110295 -> 110295 (0.00%)
total local used in shared programs   : 1020 -> 1020 (0.00%)
          local   gpr   inst   bytes
helped        0     0     17      17

[Bug 96565] New: Clive Barker's Jericho displays strange, vivid colors when motion blur enabled

2016 Jun 17

https://bugs.freedesktop.org/show_bug.cgi?id=96565
Bug ID: 96565
Summary: Clive Barker's Jericho displays strange,vivid colors when motion blur enabled
Product: Mesa
Version: 11.2
Hardware: Other
OS: All
Status: NEW
Keywords: bisected, regression
Severity: normal

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

I'm building this with llvm-c, and accessing these intrinsics via calling the intrinsic as if it were a function.
class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> :
      NVPTXInst<(outs regclassOut:$dst), (ins),
                OpStr,
                [(set regclassOut:$dst, (IntOp))]>;
def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",

[PATCH 0/5] nvc0: fp64 preparation

2014 Jul 18

Most of codegen is already FP64-ready. There are a few edge-cases that I ran into, many of which can apply even to non-fp64-enabled programs (although the double-wide registers are not very common without fp64). I've yet to give this a full piglit run, but wanted to send these out in case someone wanted to comment. They do not depend on the preliminary core fp64 work. Ilia Mirkin (5):

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

Ok, as I said, the most precise way to figure out what's wrong is to emit LLVM IR first (use clang -emit-llvm ...) and check out how it differs from working examples, for instance, nvptx regression tests.
----- Original message -----
> I'm building this with llvm-c, and accessing these intrinsics via calling
> the intrinsic as if it were a function.
>
> class F_SREG<string

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

Hi Timothy,
I'm not sure what you mean by this working for other intrinsics, but in this case, I think you want the intrinsic name llvm.nvvm.read.ptx.sreg.tid.x. For me, this looks like:
  %x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
Pete
On Fri, Mar 1, 2013 at 11:51 AM, Timothy Baldridge <tbaldridge at gmail.com> wrote:
> I'm building this with llvm-c, and accessing these

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

The identifier INT_PTX_SREG_TID_X is the name of an instruction as the back-end sees it, and has very little to do with the name you should use in your IR. Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td file and see the definitions for each intrinsic. Then, the name mapping is just:
  int_foo_bar -> llvm.foo.bar()
int_ prefix becomes llvm., and all underscores turn into
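The naming rule quoted in this snippet can be sketched as a small helper. This is only an illustration of the default convention (the `int_` prefix becomes `llvm.` and underscores become dots); the function name `intrinsic_ir_name` is hypothetical, and a record in a `.td` file may set its IR name explicitly, in which case this mapping would not apply.

```python
def intrinsic_ir_name(td_name: str) -> str:
    """Map a TableGen intrinsic record name to its default LLVM IR name.

    Hypothetical helper for illustration: it assumes the record uses the
    default naming rule and does not override its IR name explicitly.
    """
    prefix = "int_"
    if not td_name.startswith(prefix):
        raise ValueError(f"not an intrinsic record name: {td_name!r}")
    # Drop "int_", prepend "llvm.", and turn every underscore into a dot.
    return "llvm." + td_name[len(prefix):].replace("_", ".")


if __name__ == "__main__":
    # Record name as it would appear in IntrinsicsNVVM.td.
    print(intrinsic_ir_name("int_nvvm_read_ptx_sreg_tid_x"))
```

Under that assumption, the record for the thread-ID register maps to the name used in the `call` instruction quoted elsewhere in this thread, `llvm.nvvm.read.ptx.sreg.tid.x`.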

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

I've written a compiler that outputs PTX code, the result seems fairly reasonable, but I'm not sure the intrinsics are getting compiled correctly. In addition, when I try to load the module using CUDA, I get an error: CUDA_ERROR_NO_BINARY_FOR_GPU. I'm running this on a 2012 MBP with a 640M GPU.
PTX Code (for a mandelbrot calculation):
//
// Generated by LLVM NVPTX Back-End
//

Fermi+ shader header docs

2015 May 21

On Thu, May 21, 2015 at 10:05 AM, Robert Morell <rmorell at nvidia.com> wrote:
> Hi Ilia,
>
> On Sat, May 02, 2015 at 12:34:21PM -0400, Ilia Mirkin wrote:
>> Hi,
>>
>> As I'm looking to add some support to nouveau for features like atomic
>> counters and images, I'm running into some confusion about what the
>> first word of the shader header

Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

2015 Dec 16

I believe that your problem is this:
  /*01a0*/ LD R8, [R8]; /* 0x8000000000821c85 */
That needs to be LD.E (and your ST's need to be ST.E). You're using a 32-bit gmem address, but you need to be using a 64-bit one. I believe the 32-bit ones work on Fermi, but afaik not on Kepler.
Cheers,
  -ilia
On Wed, Dec 16, 2015 at 12:06 PM, Hans de Goede

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

Timothy,
Those calls to compute grid intrinsics are definitely wrong. In PTX code they should end up as reads of special registers, rather than function calls. Try to take some working example and figure out the LLVM IR differences between it and the result of your compiler.
- D.
----- Original message -----
> I've written a compiler that outputs PTX code, the result seems fairly
>

[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs

2014 Apr 19

Hi, We wrote an optimization that eliminates common sub-expressions in a group of similar GEPs for the NVPTX backend. It speeds up some of our benchmarks by up to 20%, which convinces us to try to upstream it. Here's a brief description of why we wrote this optimization, what we did, and how we did it. Loops in CUDA programs are often extensively unrolled by programmers and compilers,

Proper gl_SampleMask output

2014 Apr 23

Hello, I've been trying to add ARB_sample_shading support to nouveau, and am being defeated by the gl_SampleMask tests. Everything else works fine. (And naturally the tests pass with the proprietary driver.) I'm trying to do this for both GT21x, as well as GF100+. In the GT21x case, it seems like the low bit of method 0x1928 needs to be set (as well as the second-to-lowest bit), for

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Jul 11

Hello,
FYI, this is a bug: http://llvm.org/bugs/show_bug.cgi?id=13324
When compiling the following code for sm_20, func params are for some reason given with .align 0, which is invalid. The problem does not occur if compiled for sm_10.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple =

Proper gl_SampleMask output

2014 Apr 30

Hi Ilia.
I'll take a look and see what I can find out.
Thanks,
- Andy
On Wed, Apr 23, 2014 at 05:03:17PM -0700, Ilia Mirkin wrote:
> On Wed, Apr 23, 2014 at 6:22 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> > Hello,
> >
> > I've been trying to add ARB_sample_shading support to nouveau, and am
> > being defeated by the gl_SampleMask tests.

[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs

2014 Apr 21

Hi Hal,
Thanks for your comments! I'm inlining my responses below.
Jingyue
On Sat, Apr 19, 2014 at 6:38 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Jingyue,
>
> I can't speak for the NVPTX backend, but I think this looks useful as an
> (optional) target-independent pass. A few thoughts:
>
> - Running GVN tends to be pretty expensive; have you tried EarlyCSE

[LLVMdev] LLVM CodeGen Engineer job opening with Apple's compiler team

2011 May 26

Hi all,
The LLVM CodeGen and Tools team at Apple is looking for exceptional compiler engineers. This is a great opportunity to work with many of the leaders in the LLVM community. If you are interested in this position, please send your resume / CV and relevant information to evan.cheng at apple.com
Thanks,
Evan
Job description
The Apple compiler team is seeking an engineer who is strongly

[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers

2016 Apr 08

Hi,
On 23-03-16 23:10, Samuel Pitoiset wrote:
> Are you sure this won't break compute shaders on fermi?
> Could you please double-check that?
I just checked:
lspci:
01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
Before this patch-set:
[hans at plank piglit]$ ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader