similar to: [3.8 Release] Please write release notes!

Displaying 20 results from an estimated 10000 matches similar to: "[3.8 Release] Please write release notes!"

2020 Jul 30
2
Status of CUDA 11 support
Hi, I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM. >From the LLVM commits history, I can see that work on CUDA 11 has started. Is this currently being worked on? What is the remaining work left? And is any help needed to finish
2016 Jun 02
5
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello, When generating the PTX output from CUDA file(.cu file), the minimum target that is accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Is there any previous version of LLVM supporting this? Thank you, Ginu -------------- next part -------------- An HTML attachment was scrubbed... URL:
2017 Aug 16
3
CUDA separate compilation
Clang currently doesn't support CUDA separate compilation and thus extern __device__ functions and variables cannot be used. Could someone give me any pointers where to look or what has to be done to support this? If at all possible, I'd like to see what's missing and possibly try to tackle it. -------------- next part -------------- An HTML attachment was scrubbed... URL:
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross compiling cuda on Nvidia TX2 running android. The error is In file included from <built-in>:1: In file included from prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219: ../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19: error: no matching function
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com> On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I was wondering if anyone has encountered this issue when cross compiling > cuda on Nvidia TX2 running android. > > The error is > In file included from <built-in>:1: > In file included from >
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes nvvm_reflect_anchor unnecessary. The issue with dots in names generated by llvm still needs to be fixed. On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote: > Artem, > > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2016 Jun 14
2
Buildbot numbers for the last week of 6/05/2016 - 6/11/2016
Hello everyone, Below are some buildbot numbers for the last week of 6/05/2016 - 6/11/2016. Thanks Galina buildername | was_red -----------------------------------------------------------+----------- sanitizer-x86_64-linux-bootstrap | 134:12:25 perf-x86_64-penryn-O3-polly-parallel-fast | 46:29:26
2016 Aug 01
3
[GPUCC] link against libdevice
OK, I see the problem. You were right that we weren't picking up libdevice. CUDA 7.0 only ships with the following libdevice binaries (found /path/to/cuda/nvvm/libdevice): libdevice.compute_20.10.bc libdevice.compute_30.10.bc libdevice.compute_35.10.bc If you ask for sm_50 with cuda 7.0, clang can't find a matching libdevice binary, and it will apparently silently give up and try to
2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, about what the purpose of nvvm_reflect_anchorv() is here, and why does the front-end always generate this function? Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any
2015 Nov 30
4
NV50 compute support questions
Hi, On 26-11-15 13:52, Samuel Pitoiset wrote: <snip> >> I do not have a GK106, I've a GK208, and IIRC that one is known to not >> work, >> I guess I can give it a try. > > Compute support is not supported on GK110+, yeah... > > If you provide me a MMT trace of, for example, vectorAdd from the CUDA samples I could have a look. Ok, here is a MMT trace of
2016 Jul 27
1
Buildbot numbers for the week of 7/10/2016 - 7/16/2016
Hello everyone, Below are some buildbot numbers for the week of 7/10/2016 - 7/16/2016. Please see the same data in attached csv files: The longest time each builder was red during the week; "Status change ratio" by active builder (percent of builds that changed the builder status from greed to red or from red to green); Count of commits by project; Number of completed builds, failed
2016 Oct 05
1
Buildbot numbers for the week of 9/25/2016 - 10/1/2016
Hello everyone, Below are some buildbot numbers for the last week of 9/25/2016 - 10/1/2016. Please see the same data in attached csv files: The longest time each builder was red during the last week; "Status change ratio" by active builder (percent of builds that changed the builder status from greed to red or from red to green); Count of commits by project; Number of completed
2019 Apr 07
2
Moving docs?
Hi llvm-admin, (cc llvm-dev for visibility) We’re working on some improvements to the documentation, and want to move things (e.g. the Kaleidoscope tutorial into a subdirectory) around without breaking any links to it. Is there a way to do forwards on the web page that you prefer? -Chris
2014 Sep 25
2
[LLVMdev] MachineRegisterInfo use_iterator/reg_iterator?
Thanks Quentin. I'm trying to examine from the operands of the return instruction, and then to get the last assignment of those. I thought use_iterator/reg_iterator may suit better than just loop through the machine basicblock in the reverse order. Cheng-Chih On Thu, Sep 25, 2014 at 1:51 PM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi Cheng-Chih, > > On Sep 25,
2020 Jan 15
2
Debug info for CUDA code
Hi Alexey, Almost a year has passed and Nvidia finally fixes the ptxas issue in CUDA 10.2 according to: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-compiler-resolved-issues However, I can not yet use it with llvm 9.0.0 release because CUDA 10.2 is not supported yet. Is there other branches of the llvm repo that supports CUDA 10.2 now? Or do I need to wait for llvm 10
2014 Sep 30
2
[LLVMdev] Custom pass that runs before EmitStartOfAsmFile()?
Hi all, I'm trying to write a custom module-level pass that runs before AsmPrinter::EmitStartOfAsmFile(), since I'd like to have some processed information available once entering this function. Looking through "Writing an LLVM pass" documentation, it's not clear to me if this is possible. I've also tried putting the pass in different orders (addPreISel, addIRPasses,
2014 Sep 25
2
[LLVMdev] MachineRegisterInfo use_iterator/reg_iterator?
Hi folks, I would like to find out the machine instructions that use some given registers in the reverse order, and I came across these iterators (use_iterator/reg_iterator). However, there are two things I noticed: 1) These iterators seem to traverse the machine function a bit differently from what I get from the machine function dump. In other words, the use_iterator list is not constructed in
2013 Mar 11
2
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
Dear all, Attached notunrolled.ll is a module containing reduction kernel. What I'm trying to do is to unroll it in such way, that partial reduction on unrolled iterations would be performed on register, and then stored to memory only once. Currently llvm's unroller together with all standard optimizations produce code, which stores value to memory after every unrolled iteration, which is
2014 Aug 15
2
[LLVMdev] Default/initial values for function arguments?
Hi guys, I’m trying to figure out a way to assign initial values to function arguments. For a function in IR: define i32 @main (i32 %0, i32 %1) { %tmp = add i32 %0, %1 ... } I would like to make sure %0 has some initial value (e.g. i32 0) under some circumstances. Is there any easy way to do this? I understand that %0 comes from a live-in value which is defined from outside of the function. I
2018 Dec 08
2
using emulated-tls on Darwin 8, 9, 10
> On 2018-12-07 22:30, Ken Cunningham via llvm-dev wrote: >> Please excuse hobbiest-level question. >> Darwin 11+ enables thread_local variables using system-level supports. >> I have an interest in enabling TLS on darwin < 11 using emulated-tls. > > Is anyone still running macOS 10.6 or older? > > -- > /Jacob Carlborg > [off topic, apologies] Yes,