Displaying 20 results from an estimated 30000 matches similar to: "[Bug 90887] New: PhiMovesPass in register allocator broken"
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
On 02.10.2016 20:03, Ilia Mirkin wrote:
> On Sun, Oct 2, 2016 at 1:58 PM, Tobias Klausmann
> <tobias.johannes.klausmann at mni.thm.de> wrote:
>> Previously we'd end up with an unnecessary mov for the third immediate value.
>>
>> total instructions in shared programs : 851881 -> 851864 (-0.00%)
>> total gprs used in shared programs : 110295 -> 110295
2014 May 01
13
[Bug 78161] New: [NV96] Artifacts in output of fragment program containing not unrolled loops with conditional break
https://bugs.freedesktop.org/show_bug.cgi?id=78161
Priority: medium
Bug ID: 78161
Assignee: nouveau at lists.freedesktop.org
Summary: [NV96] Artifacts in output of fragment program
containing not unrolled loops with conditional break
Severity: normal
Classification: Unclassified
OS: Linux (All)
2016 Oct 02
2
[PATCH] nv50/ir: Propagate third immediate src when folding OP_MAD
Previously we'd end up with an unnecessary mov for the third immediate value.
total instructions in shared programs : 851881 -> 851864 (-0.00%)
total gprs used in shared programs : 110295 -> 110295 (0.00%)
total local used in shared programs : 1020 -> 1020 (0.00%)
          local   gpr   inst   bytes
helped        0     0     17      17
2016 Jun 17
6
[Bug 96565] New: Clive Barker's Jericho displays strange, vivid colors when motion blur enabled
https://bugs.freedesktop.org/show_bug.cgi?id=96565
Bug ID: 96565
Summary: Clive Barker's Jericho displays strange, vivid colors
when motion blur enabled
Product: Mesa
Version: 11.2
Hardware: Other
OS: All
Status: NEW
Keywords: bisected, regression
Severity: normal
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I'm building this with llvm-c, and accessing these intrinsics via calling
the intrinsic as if it were a function.
class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> :
NVPTXInst<(outs regclassOut:$dst), (ins),
OpStr,
[(set regclassOut:$dst, (IntOp))]>;
def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",
2014 Jul 18
5
[PATCH 0/5] nvc0: fp64 preparation
Most of codegen is already FP64-ready. There are a few edge-cases that I ran
into, many of which can apply even to non-fp64-enabled programs (although the
double-wide registers are not very common without fp64).
I've yet to give this a full piglit run, but wanted to send these out in case
someone wanted to comment. They do not depend on the preliminary core fp64
work.
Ilia Mirkin (5):
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Ok, as I said, the most precise way to figure out what's wrong is to emit LLVM IR first (use clang -emit-llvm ...) and check out how it differs from working examples, for instance, nvptx regression tests.
----- Original message -----
> I'm building this with llvm-c, and accessing these intrinsics via calling
> the intrinsic as if it were a function.
>
> class F_SREG<string
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Hi Timothy,
I'm not sure what you mean by this working for other intrinsics, but
in this case, I think you want the intrinsic name
llvm.nvvm.read.ptx.sreg.tid.x.
For me, this looks like:
%x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
Pete
On Fri, Mar 1, 2013 at 11:51 AM, Timothy Baldridge <tbaldridge at gmail.com> wrote:
> I'm building this with llvm-c, and accessing these
2013 Mar 01
1
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
The identifier INT_PTX_SREG_TID_X is the name of an instruction as the
back-end sees it, and has very little to do with the name you should use in
your IR. Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td
file and see the definitions for each intrinsic. Then, the name mapping is
just:
int_foo_bar -> llvm.foo.bar()
int_ prefix becomes llvm., and all underscores turn into
2013 Mar 01
2
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I've written a compiler that outputs PTX code, the result seems fairly
reasonable, but I'm not sure the intrinsics are getting compiled correctly.
In addition, when I try to load the module using CUDA, I get an
error: CUDA_ERROR_NO_BINARY_FOR_GPU. I'm running this on a 2012 MBP with
a 640M GPU.
PTX Code (for a mandelbrot calculation):
//
// Generated by LLVM NVPTX Back-End
//
2015 May 21
2
Fermi+ shader header docs
On Thu, May 21, 2015 at 10:05 AM, Robert Morell <rmorell at nvidia.com> wrote:
> Hi Ilia,
>
> On Sat, May 02, 2015 at 12:34:21PM -0400, Ilia Mirkin wrote:
>> Hi,
>>
>> As I'm looking to add some support to nouveau for features like atomic
>> counters and images, I'm running into some confusion about what the
>> first word of the shader header
2015 Dec 16
4
Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?
I believe that your problem is this:
/*01a0*/ LD R8, [R8];
/* 0x8000000000821c85 */
That needs to be LD.E (and your ST's need to be ST.E). You're using a
32-bit gmem address, but you need to be using a 64-bit one. I believe
the 32-bit ones work on fermi, but afaik not on Kepler.
Cheers,
-ilia
On Wed, Dec 16, 2015 at 12:06 PM, Hans de Goede
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Timothy,
Those calls to compute grid intrinsics are definitely wrong. In PTX code they should end up as reads of special registers, rather than function calls. Try taking a working example and figuring out the LLVM IR differences between it and the output of your compiler.
- D.
----- Original message -----
> I've written a compiler that outputs PTX code, the result seems fairly
>
2014 Apr 19
4
[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs
Hi,
We wrote an optimization that eliminates common sub-expressions in a group
of similar GEPs for the NVPTX backend. It speeds up some of our benchmarks
by up to 20%, which convinces us to try to upstream it. Here's a brief
description of why we wrote this optimization, what we did, and how we did
it.
Loops in CUDA programs are often extensively unrolled by programmers and
compilers,
2014 Apr 23
2
Proper gl_SampleMask output
Hello,
I've been trying to add ARB_sample_shading support to nouveau, and am
being defeated by the gl_SampleMask tests. Everything else works fine.
(And naturally the tests pass with the proprietary driver.) I'm trying
to do this for both GT21x, as well as GF100+.
In the GT21x case, it seems like the low bit of method 0x1928 needs to
be set (as well as the second-to-lowest bit), for
2012 Jul 11
2
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
Hello,
FYI, this is a bug http://llvm.org/bugs/show_bug.cgi?id=13324
When compiling the following code for sm_20, func params are for some reason
emitted with .align 0, which is invalid. The problem does not occur if compiled
for sm_10.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple =
2014 Apr 30
2
Proper gl_SampleMask output
Hi Ilia. I'll take a look and see what I can find out.
Thanks,
- Andy
On Wed, Apr 23, 2014 at 05:03:17PM -0700, Ilia Mirkin wrote:
> On Wed, Apr 23, 2014 at 6:22 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> > Hello,
> >
> > I've been trying to add ARB_sample_shading support to nouveau, and am
> > being defeated by the gl_SampleMask tests.
2014 Apr 21
2
[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs
Hi Hal,
Thanks for your comments! I'm inlining my responses below.
Jingyue
On Sat, Apr 19, 2014 at 6:38 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Jingyue,
>
> I can't speak for the NVPTX backend, but I think this looks useful as an
> (optional) target-independent pass. A few thoughts:
>
> - Running GVN tends to be pretty expensive; have you tried EarlyCSE
2011 May 26
2
[LLVMdev] LLVM CodeGen Engineer job opening with Apple's compiler team
Hi all,
The LLVM CodeGen and Tools team at Apple is looking for exceptional compiler engineers. This is a great opportunity to work with many of the leaders in the LLVM community.
If you are interested in this position, please send your resume / CV and relevant information to evan.cheng at apple.com.
Thanks,
Evan
Job description
The Apple compiler team is seeking an engineer who is strongly
2016 Apr 08
2
[PATCH mesa v2 1/2] nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers
Hi,
On 23-03-16 23:10, Samuel Pitoiset wrote:
> Are you sure this won't break compute shaders on fermi?
> Could you please double-check that?
I just checked:
lspci:
01:00.0 VGA compatible controller: NVIDIA Corporation GF119 [GeForce GT 610] (rev a1)
Before this patch-set:
[hans at plank piglit]$ ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader