Displaying 20 results from an estimated 1000 matches similar to: "use clang++ to build lulesh 2.0 failed"
2018 Feb 20
0
use clang++ to build lulesh 2.0 failed
> It looks like clang++ is complaining about the Thrust library that comes
with CUDA,
The Thrust library that comes with CUDA is indeed not compatible with
clang. We made a number of changes to Thrust to make it work with clang
(it was relying on what we considered to be bugs in nvcc), but they're only
available in the upstream Thrust: https://github.com/thrust/thrust.
No promises that one
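By way of illustration, a minimal sketch of a Thrust program that can be pointed at the upstream checkout; the file name, include path, and GPU arch below are assumptions for illustration, not from this thread:

// min_thrust.cu -- build against upstream Thrust, e.g.
//   clang++ min_thrust.cu -o min_thrust --cuda-gpu-arch=sm_35 \
//     -I/path/to/upstream/thrust -L/usr/local/cuda/lib64 -lcudart
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cstdio>

int main() {
  thrust::device_vector<int> v(16, 1);           // 16 ones on the device
  int sum = thrust::reduce(v.begin(), v.end());  // parallel sum on the GPU
  std::printf("sum = %d\n", sum);                // expect 16
  return 0;
}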
2016 Oct 27
3
problem on compiling cuda program with clang++
Hi all,
I compiled the *llvm3.9* source code on the *Nvidia TX1* board. And now I
am following the document in the docs/CompileCudaWithLLVM.rst to compile
cuda program with clang++.
However, when I compile `axpy.cu` using `nvcc`, *nvcc* can generate the
correct binary;
while compiling `axpy.cu` using clang++, the detailed command is `clang++
axpy.cu -o axpy --cuda-gpu-arch=sm_53
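For reference, a minimal sketch of an axpy.cu in the spirit of the example in CompileCudaWithLLVM.rst; the kernel body, host code, and link flags below are illustrative, only the clang++ invocation above comes from the post:

// axpy.cu -- build with something like
//   clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_53 \
//     -L/usr/local/cuda/lib64 -lcudart -ldl -lrt -pthread
#include <cuda_runtime.h>
#include <cstdio>

__global__ void axpy(float a, float* x, float* y) {
  y[threadIdx.x] = a * x[threadIdx.x];
}

int main() {
  const int n = 4;
  float host_x[n] = {1, 2, 3, 4}, host_y[n];
  float *x, *y;
  cudaMalloc(&x, n * sizeof(float));
  cudaMalloc(&y, n * sizeof(float));
  cudaMemcpy(x, host_x, n * sizeof(float), cudaMemcpyHostToDevice);
  axpy<<<1, n>>>(2.0f, x, y);                            // y = 2 * x
  cudaMemcpy(host_y, y, n * sizeof(float), cudaMemcpyDeviceToHost);
  for (int i = 0; i < n; ++i) std::printf("%g\n", host_y[i]);
  cudaFree(x);
  cudaFree(y);
  return 0;
}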
2016 Oct 27
0
problem on compiling cuda program with clang++
> NVidia TX1 is the AArch64 Jetson board with proper GPU (we use those).
Sure, I believe that others use this configuration. By "we" I meant
myself and those I work closely with; we do not. Sorry if that wasn't
precise.
It is still not clear to me if the original poster is compiling for
ARM or not. But it sounds like you're going to help them get this
2016 Oct 27
3
problem on compiling cuda program with clang++
On 27 October 2016 at 19:02, Justin Lebar via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Hi, it looks like you're compiling CUDA for an ARM host? This is not
> a configuration we have tested, nor is it something we have the
> capability of testing at the moment.
Hi Justin,
NVidia TX1 is the AArch64 Jetson board with proper GPU (we use those).
> You may be able to
2006 Nov 09
4
Running a 32-bit application on CentOS3-x64
Hi,
I'm trying to run Norman anti-virus on a CentOS 3 box, x64. Is it
possible?
Running the binary gives me this error:
[root at server bin]# ./nvcc
-bash: ./nvcc: /lib/ld-linux.so.2: bad ELF interpreter: No such file or
directory
I guess I would have to install the i386 libraries that it requires as well.
Is it possible?
Regards,
Ugo
2015 Aug 21
3
[CUDA/NVPTX] is inlining __syncthreads allowed?
Hi Justin,
Is a compiler allowed to inline a function that calls __syncthreads? I saw
that nvcc does this, but I'm not sure it's valid. For example,
void foo() {
  __syncthreads();
}

if (threadIdx.x % 2 == 0) {
  ...
  foo();
} else {
  ...
  foo();
}

Before inlining, all threads meet at one __syncthreads(). After inlining:

if (threadIdx.x % 2 == 0) {
  ...
  __syncthreads();
} else {
  ...
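To make the concern concrete, a minimal self-contained sketch of the post-inlining shape (the kernel name and the arithmetic are illustrative): each branch now reaches a different __syncthreads(), and the CUDA programming guide warns that a barrier inside a conditional is only safe when the conditional evaluates identically across the whole thread block.

// divergent_barrier.cu -- sketch of the inlined form discussed above
__global__ void after_inlining(int* data) {
  if (threadIdx.x % 2 == 0) {
    data[threadIdx.x] += 1;
    __syncthreads();   // barrier reached only by even threads
  } else {
    data[threadIdx.x] -= 1;
    __syncthreads();   // barrier reached only by odd threads
  }
}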
2016 Oct 27
0
problem on compiling cuda program with clang++
(+llvm-dev)
My question was whether your host machine, the one which is running
the compiler, is ARM (as opposed to x86 or POWER). The header you
pointed to was in "aarch64-linux-gnu", which made me think you might
be on an ARM system.
If you are not running Linux x86, it is not likely to work.
If you are running Linux x86, we will need many more details about
your system in order to
2017 Jun 14
2
Separate compilation of CUDA code?
Hi,
I wonder whether the current version of LLVM supports separate compilation and linking of device code, i.e., is there a flag analogous to nvcc's --relocatable-device-code flag? If not, is there any plan to support this?
Thanks!
Yuanfeng Peng
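A minimal sketch of what relocatable device code enables: a __device__ function defined in one translation unit and called from a kernel in another. The file names and build lines are assumptions for illustration; the -rdc flag is nvcc's --relocatable-device-code mentioned above.

// a.cu -- kernel calling a __device__ function defined elsewhere
extern __device__ int twice(int x);   // defined in b.cu
__global__ void kernel(int* out) { out[threadIdx.x] = twice(threadIdx.x); }

// b.cu -- the definition lives in a separate translation unit
__device__ int twice(int x) { return 2 * x; }

// With nvcc this requires relocatable device code and a device-link step, e.g.
//   nvcc -rdc=true -c a.cu b.cu
//   nvcc -rdc=true a.o b.o -o app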
2012 Sep 03
2
[LLVMdev] [NVPTX] Backend cannot handle array-of-arrays constant
Dear all,
Looks like the NVPTX backend cannot handle an array-of-arrays constant
(please see the repro case below). Is it supposed to work? Any ideas on
how to get it working? This is important for our target applications.
Thanks,
- Dima.
$ cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout =
2008 Nov 04
1
Help needed using 3rd party C library/functions from within R (Nvidia CUDA)
Hello,
I'm trying to use the parallel computing power available through NVIDIA
CUDA (www.nvidia.com/cuda) from within R. CUDA is an extension to the C
language, so I thought it would be possible to do this.
If I have a C file with an empty function that includes a needed CUDA
header (cutil.h), and I compile this to an .so file called 'myFunc.so'
using the NVIDIA compiler (nvcc), I
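A minimal sketch of the shape such a file can take so that R's .C() interface can call into the compiled .so; the function name, build line, and R calls are illustrative assumptions, not from the original message.

// myFunc.cu -- expose C linkage so R can call it after, e.g.
//   nvcc -Xcompiler -fPIC -shared myFunc.cu -o myFunc.so
// and then from R:
//   dyn.load("myFunc.so"); .C("myFunc", x = as.double(1:4), n = as.integer(4L))
extern "C" void myFunc(double* x, int* n) {
  // placeholder host-side body; real code would launch a CUDA kernel on x
  for (int i = 0; i < *n; ++i) x[i] *= 2.0;
}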
2015 Aug 21
2
[CUDA/NVPTX] is inlining __syncthreads allowed?
I'm using 7.0. I am attaching the reduced example.
nvcc sync.cu -arch=sm_35 -ptx
gives
// .globl _Z3foov
.visible .entry _Z3foov(
)
{
  .reg .pred %p<2>;
  .reg .s32 %r<3>;

  mov.u32 %r1, %tid.x;
  and.b32 %r2, %r1, 1;
  setp.eq.b32 %p1, %r2, 1;
  @!%p1 bra BB7_2;
  bra.uni
2012 Sep 04
2
[LLVMdev] [NVPTX] Backend cannot handle array-of-arrays constant
I think our test case demonstrates that requiring the array item being
initialized to be a constant is incorrect. The NVPTX backend no longer
crashes and produces the correct result with the following change:
--- NVPTXAsmPrinter.cpp 2012-09-03 15:14:00.000000000 +0200
+++ NVPTXAsmPrinter.cpp 2012-09-04 15:47:17.859398193 +0200
@@ -1890,17 +1890,15 @@
case Type::ArrayTyID:
case Type::VectorTyID:
case
2015 Jun 04
2
[LLVMdev] `Ty && "Trying to add a type that doesn't exist?
Upgrade clang? I can't reproduce it with trunk.
On 4 June 2015 at 14:48, Hui Zhang <wayne.huizhang at gmail.com> wrote:
> Yes, I found this link, but what's the solution??
>
> On Thu, Jun 4, 2015 at 1:09 PM, Rafael Espíndola
> <rafael.espindola at gmail.com> wrote:
>>
>> I think this is https://llvm.org/bugs/show_bug.cgi?id=16846
>>
>> On
2015 Nov 04
2
how to add the location debug info for each instruction
> On Nov 3, 2015, at 5:00 PM, Hui Zhang <wayne.huizhang at gmail.com> wrote:
>
> Hello,
>
> I found a weird thing in llvm 3.3:
>
> For exactly the same MDNode *space, if I cast it to DILocation loc(space) and call loc.getFileName(), or I cast it to DIScope sco(space) and call sco.getFilename(), the return value would be different! Two totally different files
2015 Apr 08
5
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
Hi,
I wanted to ask whether there is an ongoing effort (or an already established
tool) that enables converting CUDA kernels (which use CUDA-specific
intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I am aware
that I can do this for OpenCL with the help of libclc, but I cannot find
anything similar for CUDA.
Thanks
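For what it's worth, a sufficiently recent clang can emit device-side LLVM IR directly from a CUDA source; a minimal sketch, where the kernel, file name, and GPU arch are illustrative assumptions:

// scale.cu -- uses the intrinsics mentioned above; emit device IR with e.g.
//   clang++ -x cuda scale.cu --cuda-device-only --cuda-gpu-arch=sm_35 \
//     -S -emit-llvm -o scale.ll
__global__ void scale(float* x, float a) {
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  x[i] = a * x[i];
  __syncthreads();   // shows how the barrier intrinsic is lowered
}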
2005 May 10
4
density function
Hi,
I wonder if the function "density" outputs the Gaussian mixture formula
that is estimated from the input data, assuming a Gaussian model is used
at each data point? I want to take the derivative of the final
estimated Gaussian mixture formula for further analysis.
Thanks in advance for any help that you can offer me!
Hui
2012 May 22
3
pad leading zeros in front of strings
Dear All,
This question sounds very simple, but I don't know where I am wrong. I just want to pad leading zeros onto a string, for example, "123" becomes "00123". What is wrong if I do the following?
> sprintf("%05s", "123")
[1] " 123"
It didn't return "00123"; instead it padded with blanks.
Thank you for your help
2018 Jun 21
2
NVPTX - Reordering load instructions
Hi all,
I'm looking into the performance difference of a benchmark compiled with
NVCC vs NVPTX (coming from Julia, not CUDA C) and I'm seeing a
significant difference due to PTX instruction ordering. The relevant
source code consists of two nested loops that get fully unrolled, doing
some basic arithmetic with values loaded from shared memory:
> #define BLOCK_SIZE 16
>
>
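A minimal sketch of the kind of kernel being described (fully unrolled nested loops doing arithmetic on values loaded from shared memory); only BLOCK_SIZE 16 comes from the post, the kernel itself is an illustrative assumption:

#define BLOCK_SIZE 16

__global__ void sum_tile(const float* in, float* out) {
  __shared__ float tile[BLOCK_SIZE][BLOCK_SIZE];
  int x = threadIdx.x, y = threadIdx.y;
  tile[y][x] = in[y * BLOCK_SIZE + x];
  __syncthreads();
  float acc = 0.0f;
  #pragma unroll
  for (int i = 0; i < BLOCK_SIZE; ++i) {
    #pragma unroll
    for (int j = 0; j < BLOCK_SIZE; ++j) {
      acc += tile[i][j];   // a long run of shared-memory loads that
    }                      // the backend is free to schedule/reorder
  }
  out[y * BLOCK_SIZE + x] = acc;
}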
2004 Mar 31
3
help with the usage of "randomForest"
Dear all,
Can anybody give me some hints on the following error message I got when
using randomForest?
I have a two-class classification problem. The data file "sample" is:
----------------------------------------------------------
  udomain.edu udomain.hcs hpclass
1      1.0000           1     not
2          NA           2     not
3          NA         0.8     not
4          NA         0.2      hp
5          NA         0.9      hp
------------------------------------------------------------
The
2005 Apr 18
4
longer object length, is not a multiple of shorter object length in: kappa * gcounts
Hi,
I was using a density estimation function as follows:
> est <- KernSmooth::bkde(x3, bandwidth=10)
When setting the bandwidth to less than 5, I got the error "longer object
length, is not a multiple of shorter object length in: kappa * gcounts".
Can anybody explain this error for me?
Thanks!
Hui