thr3ads.net - similar to: "[3.8 Release] Please write release notes!"

2020 Jul 30

Status of CUDA 11 support

Hi, I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM. From the LLVM commit history, I can see that work on CUDA 11 has started. Is this currently being worked on? What work remains? And is any help needed to finish

PTX generation from CUDA file for compute capability 1.0 (sm_10)

2016 Jun 02

Hello, When generating PTX output from a CUDA file (.cu file), the minimum target accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Is there any previous version of LLVM supporting this? Thank you, Ginu

CUDA separate compilation

2017 Aug 16

Clang currently doesn't support CUDA separate compilation, and thus extern __device__ functions and variables cannot be used. Could someone give me any pointers on where to look or what has to be done to support this? If at all possible, I'd like to see what's missing and possibly try to tackle it.

cuda cross compiling issue for target aarch64-linux-androideabi

2018 Mar 23

I was wondering if anyone has encountered this issue when cross compiling cuda on Nvidia TX2 running android. The error is

In file included from <built-in>:1:
In file included from prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219:
../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19: error: no matching function

cuda cross compiling issue for target aarch64-linux-androideabi

2018 Mar 23

+Artem Belevich <tra at google.com>

On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I was wondering if anyone has encountered this issue when cross compiling cuda on Nvidia TX2 running android.
>
> The error is
> In file included from <built-in>:1:
> In file included from

[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?

2016 Apr 09

David's change makes nvvm_reflect_anchor unnecessary. The issue with dots in names generated by llvm still needs to be fixed.

On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote:
> Artem,
>
> With David's http://reviews.llvm.org/rL265060, do you think __nvvm_reflect_anchor is still necessary?
>
> On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng

Buildbot numbers for the last week of 6/05/2016 - 6/11/2016

2016 Jun 14

Hello everyone,

Below are some buildbot numbers for the last week of 6/05/2016 - 6/11/2016.

Thanks
Galina

buildername                               | was_red
------------------------------------------+-----------
sanitizer-x86_64-linux-bootstrap          | 134:12:25
perf-x86_64-penryn-O3-polly-parallel-fast | 46:29:26

[GPUCC] link against libdevice

2016 Aug 01

OK, I see the problem. You were right that we weren't picking up libdevice. CUDA 7.0 only ships with the following libdevice binaries (found in /path/to/cuda/nvvm/libdevice):

libdevice.compute_20.10.bc
libdevice.compute_30.10.bc
libdevice.compute_35.10.bc

If you ask for sm_50 with CUDA 7.0, clang can't find a matching libdevice binary, and it will apparently silently give up and try to

[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?

2016 Apr 08

Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, about what the purpose of nvvm_reflect_anchorv() is here, and why does the front-end always generate this function? Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any

NV50 compute support questions

2015 Nov 30

Hi,

On 26-11-15 13:52, Samuel Pitoiset wrote:
<snip>
>> I do not have a GK106, I've a GK208, and IIRC that one is known to not work,
>> I guess I can give it a try.
>
> Compute support is not supported on GK110+, yeah...
>
> If you provide me a MMT trace of, for example, vectorAdd from the CUDA samples I could have a look.

Ok, here is a MMT trace of

Buildbot numbers for the week of 7/10/2016 - 7/16/2016

2016 Jul 27

Hello everyone, Below are some buildbot numbers for the week of 7/10/2016 - 7/16/2016. Please see the same data in attached csv files: The longest time each builder was red during the week; "Status change ratio" by active builder (percent of builds that changed the builder status from green to red or from red to green); Count of commits by project; Number of completed builds, failed

Buildbot numbers for the week of 9/25/2016 - 10/1/2016

2016 Oct 05

Hello everyone, Below are some buildbot numbers for the last week of 9/25/2016 - 10/1/2016. Please see the same data in attached csv files: The longest time each builder was red during the last week; "Status change ratio" by active builder (percent of builds that changed the builder status from green to red or from red to green); Count of commits by project; Number of completed

Moving docs?

2019 Apr 07

Hi llvm-admin, (cc llvm-dev for visibility) We’re working on some improvements to the documentation, and want to move things (e.g. the Kaleidoscope tutorial into a subdirectory) around without breaking any links to it. Is there a way to do forwards on the web page that you prefer? -Chris

[LLVMdev] MachineRegisterInfo use_iterator/reg_iterator?

2014 Sep 25

Thanks Quentin. I'm trying to start from the operands of the return instruction and then find the last assignment to each of them. I thought use_iterator/reg_iterator might suit this better than just looping through the machine basic block in reverse order.

Cheng-Chih

On Thu, Sep 25, 2014 at 1:51 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Cheng-Chih,
>
> On Sep 25,

Debug info for CUDA code

2020 Jan 15

Hi Alexey, Almost a year has passed and Nvidia has finally fixed the ptxas issue in CUDA 10.2, according to: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-compiler-resolved-issues However, I cannot yet use it with the llvm 9.0.0 release because CUDA 10.2 is not supported yet. Are there other branches of the llvm repo that support CUDA 10.2 now? Or do I need to wait for llvm 10

[LLVMdev] Custom pass that runs before EmitStartOfAsmFile()?

2014 Sep 30

Hi all, I'm trying to write a custom module-level pass that runs before AsmPrinter::EmitStartOfAsmFile(), since I'd like to have some processed information available on entering this function. Looking through the "Writing an LLVM pass" documentation, it's not clear to me if this is possible. I've also tried putting the pass in different orders (addPreISel, addIRPasses,

[LLVMdev] MachineRegisterInfo use_iterator/reg_iterator?

2014 Sep 25

Hi folks, I would like to find out the machine instructions that use some given registers in the reverse order, and I came across these iterators (use_iterator/reg_iterator). However, there are two things I noticed: 1) These iterators seem to traverse the machine function a bit differently from what I get from the machine function dump. In other words, the use_iterator list is not constructed in

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

Dear all, The attached notunrolled.ll is a module containing a reduction kernel. What I'm trying to do is to unroll it in such a way that the partial reduction over the unrolled iterations is performed in a register and then stored to memory only once. Currently llvm's unroller together with all standard optimizations produces code which stores the value to memory after every unrolled iteration, which is
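
As a rough illustration of the transformation being asked for (not the contents of notunrolled.ll, which is not shown here), the C++ sketch below contrasts a loop that updates the output location on every iteration with a manually 4x-unrolled loop that keeps the partial sum in a local variable and stores it once. The function names and the unroll factor are hypothetical, chosen only for this example.

```cpp
#include <cstddef>

// Variant A: accumulates directly through the output pointer, so each
// iteration is written as a load, an add, and a store to *out.
void reduce_through_memory(const float* in, std::size_t n, float* out) {
    *out = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        *out += in[i];
    }
}

// Variant B: manually unrolled by 4 with a local accumulator. The partial
// sum lives in a local variable (a register candidate) and is stored to
// memory exactly once, after the loop.
void reduce_in_register(const float* in, std::size_t n, float* out) {
    float acc = 0.0f;
    std::size_t i = 0;
    // Main unrolled body: four additions per trip, same order as variant A.
    for (; i + 4 <= n; i += 4) {
        acc += in[i];
        acc += in[i + 1];
        acc += in[i + 2];
        acc += in[i + 3];
    }
    // Remainder loop for the last n % 4 elements.
    for (; i < n; ++i) {
        acc += in[i];
    }
    *out = acc;  // single store
}
```

Both variants add the elements in the same order, so they produce the same result. One common reason an optimizer keeps the per-iteration store in code shaped like variant A is that it cannot prove `out` does not alias `in`; whether that applies to the kernel in this thread is an assumption, since the module is not shown.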

[LLVMdev] Default/initial values for function arguments?

2014 Aug 15

Hi guys, I’m trying to figure out a way to assign initial values to function arguments. For a function in IR:

define i32 @main (i32 %0, i32 %1) {
  %tmp = add i32 %0, %1
  ...
}

I would like to make sure %0 has some initial value (e.g. i32 0) under some circumstances. Is there any easy way to do this? I understand that %0 comes from a live-in value which is defined from outside of the function. I

using emulated-tls on Darwin 8, 9, 10

2018 Dec 08

> On 2018-12-07 22:30, Ken Cunningham via llvm-dev wrote:
>> Please excuse hobbyist-level question.
>> Darwin 11+ enables thread_local variables using system-level supports.
>> I have an interest in enabling TLS on darwin < 11 using emulated-tls.
>
> Is anyone still running macOS 10.6 or older?
>
> --
> /Jacob Carlborg
>

[off topic, apologies] Yes,
