Displaying 20 results from an estimated 400 matches similar to: "Disk/Boot problems when upgrading BIOS"
2010 Jan 07
1
faster GLS code
Dear helpers,
I wrote some code that estimates a multi-equation model by generalized
least squares (GLS). I can use GLS because I know the covariance matrix of
the residuals a priori. However, it is a bit slow, and I wonder if anybody
could point out a way to make it faster (it is part of a larger
program and needs to run several times).
Any suggestion would be greatly appreciated.
Carlo
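For reference, a sketch of the standard GLS estimator with a known residual covariance. The symbols are assumptions, not names from the post: X is the stacked regressor matrix, y the stacked response vector, and \Sigma the known covariance matrix. One common way to speed this up is to factor \Sigma once and reuse the factor on every run; whether that helps here depends on the poster's code, which is not shown.

```latex
\hat{\beta}_{\mathrm{GLS}} = (X^{\top}\Sigma^{-1}X)^{-1}X^{\top}\Sigma^{-1}y
% With the Cholesky factor \Sigma = LL^{\top}, define \tilde{X} = L^{-1}X and \tilde{y} = L^{-1}y.
% GLS then reduces to ordinary least squares on the transformed data:
\hat{\beta}_{\mathrm{GLS}} = (\tilde{X}^{\top}\tilde{X})^{-1}\tilde{X}^{\top}\tilde{y}
```

The two forms agree because X^{\top}\Sigma^{-1}X = (L^{-1}X)^{\top}(L^{-1}X); the second form avoids forming \Sigma^{-1} explicitly.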
2015 Nov 16
3
[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.
---
celt/arch.h | 5 +++++
silk/macros.h | 4 +---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/celt/arch.h b/celt/arch.h
index 9f74ddd..670527b 100644
--- a/celt/arch.h
+++ b/celt/arch.h
@@ -78,6 +78,11 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line)
#define UADD32(a,b) ((a)+(b))
#define USUB32(a,b) ((a)-(b))
+/* Set this if opus_int64
2015 Nov 21
8
[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.
---
configure.ac | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/configure.ac b/configure.ac
index f52d2c2..e1a6e9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd],
[enable_rtcd=yes])
AC_ARG_ENABLE([intrinsics],
- [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],,
+
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I'm building this with llvm-c, and accessing these intrinsics via calling
the intrinsic as if it were a function.
class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> :
NVPTXInst<(outs regclassOut:$dst), (ins),
OpStr,
[(set regclassOut:$dst, (IntOp))]>;
def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",
2014 May 01
13
[Bug 78161] New: [NV96] Artifacts in output of fragment program containing not unrolled loops with conditional break
https://bugs.freedesktop.org/show_bug.cgi?id=78161
Priority: medium
Bug ID: 78161
Assignee: nouveau at lists.freedesktop.org
Summary: [NV96] Artifacts in output of fragment program
containing not unrolled loops with conditional break
Severity: normal
Classification: Unclassified
OS: Linux (All)
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Ok, as I said, the most precise way to figure out what's wrong is to emit LLVM IR first (use clang -emit-llvm ...) and check out how it differs from working examples, for instance, nvptx regression tests.
----- Original message -----
> I'm building this with llvm-c, and accessing these intrinsics via calling
> the intrinsic as if it were a function.
>
> class F_SREG<string
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Hi Timothy,
I'm not sure what you mean by this working for other intrinsics, but
in this case, I think you want the intrinsic name
llvm.nvvm.read.ptx.sreg.tid.x.
For me, this looks like:
%x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
Pete
On Fri, Mar 1, 2013 at 11:51 AM, Timothy Baldridge <tbaldridge at gmail.com> wrote:
> I'm building this with llvm-c, and accessing these
2013 Mar 01
2
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I've written a compiler that outputs PTX code, the result seems fairly
reasonable, but I'm not sure the intrinsics are getting compiled correctly.
In addition, when I try to load the module using CUDA, I get an
error: CUDA_ERROR_NO_BINARY_FOR_GPU. I'm running this on a 2012 MBP with
a 640M GPU.
PTX Code (for a mandelbrot calculation):
//
// Generated by LLVM NVPTX Back-End
//
2013 Mar 01
1
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
The identifier INT_PTX_SREG_TID_X is the name of an instruction as the
back-end sees it, and has very little to do with the name you should use in
your IR. Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td
file and see the definitions for each intrinsic. Then, the name mapping is
just:
int_foo_bar -> llvm.foo.bar()
int_ prefix becomes llvm., and all underscores turn into
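The renaming rule quoted above (int_foo_bar -> llvm.foo.bar) can be sketched in a few lines of Python. The helper name td_to_ir_intrinsic_name is hypothetical and exists only for illustration; it applies the simple rule as stated and does not account for intrinsics whose IR name is given explicitly in the .td file.

```python
def td_to_ir_intrinsic_name(td_name: str) -> str:
    """Map a TableGen intrinsic record name to an LLVM IR name (hypothetical helper)."""
    prefix = "int_"
    if not td_name.startswith(prefix):
        raise ValueError(f"expected a name starting with {prefix!r}: {td_name!r}")
    # "int_" becomes "llvm.", and each remaining underscore becomes a dot,
    # following the int_foo_bar -> llvm.foo.bar rule from the message.
    return "llvm." + td_name[len(prefix):].replace("_", ".")


# The placeholder name from the message.
print(td_to_ir_intrinsic_name("int_foo_bar"))

# Assumed record name for the thread-index intrinsic discussed in this thread.
print(td_to_ir_intrinsic_name("int_nvvm_read_ptx_sreg_tid_x"))
```

Under that assumption the second call yields the same name suggested elsewhere in the thread, llvm.nvvm.read.ptx.sreg.tid.x.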
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Timothy,
Those calls to compute grid intrinsics are definitely wrong. In PTX code they should end up as reads of special registers rather than function calls. Try taking a working example and figuring out how its LLVM IR differs from the output of your compiler.
- D.
----- Original message -----
> I've written a compiler that outputs PTX code, the result seems fairly
>
2015 May 21
2
Fermi+ shader header docs
On Thu, May 21, 2015 at 10:05 AM, Robert Morell <rmorell at nvidia.com> wrote:
> Hi Ilia,
>
> On Sat, May 02, 2015 at 12:34:21PM -0400, Ilia Mirkin wrote:
>> Hi,
>>
>> As I'm looking to add some support to nouveau for features like atomic
>> counters and images, I'm running into some confusion about what the
>> first word of the shader header
2015 Feb 03
2
[LLVMdev] Example for usage of LLVM/Clang/libclc
Hi,
My goal is to use Clang/LLVM/libclc to compile an OpenCL kernel and
eventually generate PTX code. I have already done this, but I am not sure whether the
PTX code I am generating is correct (that is, the code that is supposed to be
generated).
For example, currently,
In OpenCL : get_global_id(0) translates to
In LLVM : %call = tail call i32 @get_global_id(i32 0) which translates
to
In PTX:
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
Does this give correct results for special floats (0, infs)?
We tried to improve (for single floats) x86 rcp in llvmpipe with
newton-raphson, but unfortunately not being able to give correct results
for these two cases (without even more additional code) meant it got all
disabled in the end (you can still see that code in the driver) since
the problems are at least as bad as those due to bad
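The concern about special floats can be illustrated with a single Newton-Raphson refinement step for a reciprocal. This is a minimal Python sketch using double-precision floats, not the llvmpipe or nvc0 code; the function name refine_rcp and the assumption that the initial estimate is exact for 0 and infinity are illustrative only.

```python
import math


def refine_rcp(x: float, r0: float) -> float:
    """One Newton-Raphson step for 1/x, starting from the estimate r0."""
    return r0 * (2.0 - x * r0)


# Ordinary input: the estimate 0.24 for 1/4 moves closer to 0.25.
print(refine_rcp(4.0, 0.24))

# Special inputs, assuming the initial estimate is exact (1/0 = inf, 1/inf = 0):
# the product x * r0 is 0 * inf, which is NaN, and the NaN propagates,
# so the refined result is worse than the unrefined estimate.
print(refine_rcp(0.0, math.inf))
print(refine_rcp(math.inf, 0.0))
```

Getting the correct inf or 0 back for these inputs requires an extra check or select around the refinement step, which is the "additional code" the message refers to.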
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi,
It looks like "{" and "}" are lost when using the combination of Clang
and NVPTX, which may result in a clash between function-scope
and asm-scope definitions. Here is an example:
> cat test.cu
__attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int
__any(int a) {
int result;
asm __volatile__ ("{ \n\t"
".reg .pred
2014 Sep 04
10
MEMX improvements + DDR 2/3 MR generation
Patches 1 and 2 implement wait-for-vblank, which is required to remove flicker when reclocking memory.
Patches 3 and 4 allow me to do things between waiting for VBLANK and disabling FB, like pausing PFIFO and waiting for the engines to idle. This minimises the time PFIFO is paused and thus maximises performance.
The rest of the patches speak for themselves. As the actual memory reclocking script is still somewhat prone
2012 May 16
2
[LLVMdev] NVPTX: __iAtomicCAS support ?
Dear colleagues,
I'm looking into whether we can replace nvopencc with LLVM NVPTX in our project.
It turns out NVPTX won't work with code that nvopencc can handle (please
see the log below). So are atomic intrinsics not supported, or am I
making the call the wrong way?
Thanks,
- Dima.
SOURCE
========
dmikushin at hp2:~> cat kernelgen_monitor.ll
; ModuleID =
2010 Feb 04
1
Bug in as.character? (PR#14206)
A long formula which is converted using as.character loses its last
part: ``diagonal = 1e-12)''
A shorter formula is OK, though.
Best,
Håvard
************
Browse[2]> formula.str
y ~ -1 + b1 + b2 + b3 + b4 + b5 + b6 + b7 + b8 + b9 + b10 + b11 +
b12 + b13 + b14 + b15 + b16 + b17 + b18 + b19 + b20 + b21 +
b22 + b23 + b24 + b25 + b26 + b27 + b28 + b29 + b30 + b31 +
b32 +
2015 Oct 26
9
[PATCH 0/4] Add pdaemon load counters
This series makes use of the load counters to get information about
the current load of the GPU.
It includes the needed PMU bits and a debugfs interface to read them
out. Currently the values are between 0 and 255, because it is much easier to
implement it this way on the PMU.
Karol Herbst (4):
subdev/pmu/fuc: add gk104
pmu/fuc: add macros for pdaemon pwr counters
2017 Nov 06
5
RFC: Debug info for Cuda
Hi everybody,
As you know, the CUDA/NVPTX target has very limited support for debug info in Clang/LLVM. Currently, LLVM supports only the emission of line-number debug info.
This is caused by limitations of the CUDA/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to a PTX (parallel thread execution) intermediate file. This PTX file represents a special kind of
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry,
You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below).
Chad
On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote:
> Hi,
>
> Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of