similar to: Looking for suggestions: Inferring GPU memory accesses

Displaying 20 results from an estimated 400 matches similar to: "Looking for suggestions: Inferring GPU memory accesses"

2020 Aug 23
2
Looking for suggestions: Inferring GPU memory accesses
@Ees, Oh, I see what you mean now. Doing such analysis would be useful for a thread block and not just a single thread, but as you say you are onto something bigger than a single thread. We published a short paper at ICS around this which uses polyhedral techniques to do such analysis and reason about uncoalesced access patterns in CUDA programs. You can find the paper at
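A minimal CUDA sketch of the coalescing issue discussed above (illustrative, not taken from the paper): per thread, both kernels perform a single load, but across a warp the second one touches addresses `stride` elements apart, which is exactly the kind of pattern a block-level (polyhedral) analysis can flag while a purely per-thread view cannot.

    __global__ void coalesced(const float *in, float *out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i];            // consecutive threads read consecutive addresses
    }

    __global__ void uncoalesced(const float *in, float *out, int n, int stride) {
        // assumes `in` holds at least n * stride elements
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i * stride];   // consecutive threads read addresses `stride` apart
    }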
2012 Mar 05
2
[LLVMdev] OpenCL backend for LLVM
Hi, this is a follow-up to my email from August (http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-August/042737.html). I have finally released my OpenCL backend and control-flow restructuring framework for LLVM (AST-Extractor, or axtor for short). The framework restructures function CFGs such that they can be expressed entirely without GOTOs or switch/loop trickery, hence making it possible to
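A rough sketch of what "expressing a CFG without GOTOs" can look like (an illustrative rewrite, not Axtor's actual output): an irregular entry into a loop body is replaced by a guard variable so every region has a single entry.

    __device__ int f(int x) { return x + 1; }   // stand-ins for arbitrary work
    __device__ int g(int x) { return x * 2; }

    // Unstructured original (two ways into the body):
    //     if (x > 0) goto body;
    //   loop:
    //     x = f(x);
    //   body:
    //     x = g(x);
    //     if (x < limit) goto loop;
    //
    // GOTO-free equivalent with an explicit guard:
    __device__ int restructured(int x, int limit) {
        bool skip_f = (x > 0);        // encodes the original early jump into the body
        do {
            if (!skip_f) x = f(x);
            x = g(x);
            skip_f = false;
        } while (x < limit);
        return x;
    }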
2012 Aug 17
1
[LLVMdev] Portable OpenCL (pocl) v0.6 released
Portable OpenCL aims to be an efficient open source (MIT-licensed) implementation of the OpenCL 1.2 standard. In addition to producing an easily portable open source OpenCL implementation, another major goal of the project is improving performance portability of OpenCL programs with compiler optimizations, reducing the
2012 Mar 06
2
[LLVMdev] OpenCL backend for LLVM
Hi Micah, I just had a quick look at your structurizer. Here is what I found (correct me if I am mistaken): * Our approaches for handling loops with multiple exits are identical ("Loop-Exit Enumeration"). * Axtor implements Controlled-Node Splitting and can cope with irreducible control flow (http://cardit.et.tudelft.nl/MOVE/papers/cc96.ps). * Axtor translates switches to cascading
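The last item presumably refers to lowering a switch into a chain of conditionals, roughly like this sketch (illustrative, not Axtor's code):

    __device__ int lowered(int sel, int v) {
        // switch (sel) { case 0: v += 1; break; case 1: v -= 1; break; default: v = 0; }
        // rewritten as cascading ifs:
        if (sel == 0) {
            v += 1;
        } else if (sel == 1) {
            v -= 1;
        } else {
            v = 0;   // default case
        }
        return v;
    }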
2012 Mar 05
0
[LLVMdev] OpenCL backend for LLVM
Simon, Have you looked at the control-flow structurizer that we have in the open source AMDIL backend? > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Simon Moll > Sent: Monday, March 05, 2012 1:01 PM > To: llvmdev at cs.uiuc.edu > Subject: [LLVMdev] OpenCL backend for LLVM > > Hi, > > this
2019 Mar 13
2
[RFC] Late (OpenMP) GPU code "SPMD-zation"
There are tooooooo(!) many changes, I don't know who's going to review sooooo big a patch. You definitely need to split it into several smaller patches. Also, I don't like the idea of adding one more class for NVPTX codegen. All your changes should be on top of the existing solution. ------------- Best regards, Alexey Bataev 13.03.2019 15:08, Doerfert, Johannes wrote: > Please consider
2012 Mar 06
0
[LLVMdev] OpenCL backend for LLVM
The person who wrote our structurizer agrees with your analysis. Too bad the licenses are incompatible; it would be nice to merge similar efforts. > -----Original Message----- > From: Simon Moll [mailto:simon.m.moll at googlemail.com] > Sent: Tuesday, March 06, 2012 2:49 AM > To: Villmow, Micah > Cc: llvmdev at cs.uiuc.edu > Subject: RE: [LLVMdev] OpenCL backend for LLVM >
2018 Mar 09
1
[Polly] Reduced code analyzability moving from LLVM 3.9.0 to 5.0.1
Hi Johannes, Perfect, thanks! The CFG now looks very similar to what I got on LLVM 3.9.0 ([1] vs [2]). Any idea why setting -simplifycfg-sink-common=false is necessary? Similar to LLVM 5.0.1, the default for 3.9.0 is true [3], and setting it to false wasn't necessary in the latter version. [1] https://nautilus.bjornweb.nl/files/polly501-cfg-simplifycfg-sink-common.pdf [2]
2019 Mar 13
2
[RFC] Late (OpenMP) GPU code "SPMD-zation"
------------- Best regards, Alexey Bataev 13.03.2019 15:35, Doerfert, Johannes wrote: > Hi Alexey, > thank you for your quick feedback. > >> There are tooooooo(!) many changes, I don't know who's going to review sooooo big a patch. > > I can for sure split it into the three components/repositories that are touched: clang, llvm, and openmp.
2019 Mar 13
4
[RFC] Late (OpenMP) GPU code "SPMD-zation"
1. You don't need to implement everything in a single patch. The development process is a step-by-step process in which you commit things in small pieces. The code does not need to be fully functional; you may start from some basic features. Currently it is very hard to review. 2. I rather doubt that it can be reused without changes for AMD etc., especially without being fully tested. The only tested
2019 Mar 13
3
[RFC] Late (OpenMP) GPU code "SPMD-zation"
Johannes, did you try it on AMD GPUs? If not, I think it might be too early to claim it as a general interface for NVIDIA/AMD GPUs. I'm OK if you want to introduce a basic class for the GPU-specific codegen, but it must be done step by step and thoroughly tested and reviewed. There might be some parts common with the NVPTX codegen. You can put the common functions into a base class and remove them from
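A minimal sketch of the base-class layout being suggested (class and method names here are illustrative, not Clang's actual ones):

    // Illustrative only: shared OpenMP GPU codegen hoisted into a base class,
    // with per-target details supplied by NVPTX/AMDGCN subclasses.
    class CGOpenMPRuntimeGPUBase {
    public:
      virtual ~CGOpenMPRuntimeGPUBase() = default;
      void emitSPMDKernel() {
        // target-independent SPMD-mode lowering would live here,
        // calling hooks such as getWarpSize() where targets differ
      }
    protected:
      virtual unsigned getWarpSize() const = 0;
    };

    class CGOpenMPRuntimeNVPTX : public CGOpenMPRuntimeGPUBase {
      unsigned getWarpSize() const override { return 32; }
    };

    class CGOpenMPRuntimeAMDGCN : public CGOpenMPRuntimeGPUBase {
      unsigned getWarpSize() const override { return 64; }
    };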
2016 Oct 12
4
[test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 10:53 AM, Hal Finkel <hfinkel at anl.gov> wrote: > I don't think that Clang/LLVM uses it by default on x86_64. If you're using -Ofast, however, that would explain it. I recommend looking at -O3 vs -O0 and making sure those are the same. -Ofast enables -ffast-math, which can legitimately cause differences. > The following tests pass at "-O3" and
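One concrete source of such "legitimate" differences is floating-point contraction: under -Ofast (or -ffp-contract) the compiler may turn a*b + c into a single fused multiply-add with one rounding instead of two. A small standalone sketch of the effect (values chosen to expose the extra rounding; it calls fma() explicitly rather than relying on the compiler to contract):

    #include <cfloat>
    #include <cmath>
    #include <cstdio>

    int main() {
        double x = 1.0 + DBL_EPSILON;           // 1 + 2^-52
        double y = -(1.0 + 2.0 * DBL_EPSILON);  // -(1 + 2^-51), exactly representable
        double prod = x * x;                    // rounded product: the 2^-104 term is lost
        double separate = prod + y;             // -> 0.0
        double fused = std::fma(x, x, y);       // exact product, one rounding -> 2^-104
        printf("separate = %.17g\nfused    = %.17g\n", separate, fused);
        return 0;
    }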
2007 Dec 09
1
package "growth" ... where is it ?
I would like to install the package "growth" as it contains the function "corgram" and some other presumably useful stuff for time series analysis. I can see it is in the R standard library list: http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/doc/html/packages.html
2004 Dec 03
1
Getting R to emit an image file as a pipe or Base64 strea m: Mac OSX 10.3 - R 2.0.1
> From: Yuandan Zhang > If you want to call R from Perl, why don't you do a simple system call like: > $callR = "/usr/local/bin/R CMD BATCH plotscript.R"; system($callR); > It is not necessary to start an X display if everything can be done in the background. But the problem is that jpeg()/png() are not available unless an X display is available to the
2013 Jun 06
2
[LLVMdev] [Polly] Set up performance tester for GSOC2013 FastPolly project
Hi Tobias, I have recently been trying to set up the performance tester for the FastPolly project. Following your suggestion, I plan to use the LNT infrastructure to set up the performance tester. For this purpose, I think I should do this job in three steps: First, I will add PolyBench to the LLVM test-suite, since PolyBench is the critical benchmark for FastPolly. I have adjusted PolyBench-c-3.2 so we
2013 Jun 09
1
[LLVMdev] [Patch] Apply for adding PolyBench to LLVM testsuite
Hi all, PolyBench (http://www.cse.ohio-state.edu/~pouchet/software/polybench/) is a well-known benchmark suite for polyhedral compilers. Since LLVM Polly (http://polly.llvm.org/) provides a very good polyhedral optimizer for LLVM, could we add this benchmark to the LLVM test-suite? I have attached the patch file to add PolyBench to the LLVM test-suite. Best wishes, Star Tan
2016 Oct 20
2
[test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Fri, Oct 14, 2016 at 6:10 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> polybench/linear-algebra/kernels/symm, FP_ABSTOLERANCE=1e1 >> polybench/linear-algebra/solvers/gramschmidt, FP_ABSTOLERANCE=1e0 >> What should be a good relative tolerance to set for these two tests? > > What's the minimum relative tolerance that you need for them to pass? Setting
2013 Jun 09
0
[LLVMdev] [Polly] Set up performance tester for GSOC2013 FastPolly project
On 06/06/2013 11:17 AM, Star Tan wrote: > Hi Tobias, > > I have recently been trying to set up the performance tester for the FastPolly project. Following your suggestion, I plan to use the LNT infrastructure to set up the performance tester. For this purpose, I think I should do this job in three steps: > > First, I will add PolyBench to the LLVM test-suite, since PolyBench is the
2020 Oct 03
2
Information about the number of indices in memory accesses
Michael makes a great point about aliasing here and different indexing that accesses the same element! Another note: x = A[0][2] is fundamentally different depending on the type of `A`. If, e.g., A was declared as int A[10][20], there's only _one_ load. A is (and is treated as) a linear buffer, and the GEP only pinpoints the specific position of A[0][2] in this buffer (i.e. 0*20 + 2). But if A was
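To make the distinction concrete, a small sketch (types and names are illustrative): for a true 2-D array the compiler folds both indices into one offset and emits a single load, whereas for a pointer-to-pointer it must first load the row pointer and then load the element.

    __global__ void flat_vs_indirect(int (*A)[20], int **B, int *out) {
        // A has type int(*)[20] (e.g. from `int A[10][20]`): A[0][2] is one GEP
        // computing the linear offset 0*20 + 2, followed by a single load.
        int x = A[0][2];

        // B is a pointer to pointers: B[0][2] needs two loads, one to fetch the
        // row pointer B[0] and one to fetch the element at offset 2 from it.
        int y = B[0][2];

        out[0] = x + y;
    }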
2020 Sep 23
2
Information about the number of indices in memory accesses
Hi all, For loads and stores I want to extract information about the number of indices accessed. For instance: struct S { int X; int *Y; }; __global__ void kernel(int *A, int **B, struct S s) {   int x = A[..][..]; // -> L: A[..][..]   int y = *B[2];   // -> L: B[0][2]   int z = s.Y[..];  // -> L: S.1[..]   // etc. } I am performing some preprocessing on IR to: 1. Move constant
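A minimal sketch of how one might pull a per-access GEP index count out of the IR (illustrative, plain LLVM API, not the poster's actual pass; as the reply above notes, a chained case like B[0][2] produces a load feeding another GEP, so a single GEP's index count is only part of the answer and such chains would need to be walked recursively):

    #include "llvm/IR/Function.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/IR/Operator.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    // Report, for every load/store in F, how many indices the GEP feeding
    // its pointer operand carries.
    static void reportIndexDepths(Function &F) {
      for (BasicBlock &BB : F)
        for (Instruction &I : BB) {
          Value *Ptr = nullptr;
          if (auto *LD = dyn_cast<LoadInst>(&I))
            Ptr = LD->getPointerOperand();
          else if (auto *ST = dyn_cast<StoreInst>(&I))
            Ptr = ST->getPointerOperand();
          if (!Ptr)
            continue;
          if (auto *GEP = dyn_cast<GEPOperator>(Ptr->stripPointerCasts()))
            errs() << I << "  <- GEP with " << GEP->getNumIndices()
                   << " indices\n";
        }
    }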