thr3ads.net - search: "openacc"

Displaying 20 results from an estimated 57 matches for "openacc".

2019 May 28

Course Announcement

Dear everyone, Is your code slow, and are you interested in performance optimization for scientific software? Do you want to find the bottlenecks with the help of performance evaluation tools? If you are interested in porting your code to an HPC hardware platform and architecture, then OpenACC, a user-driven, directive-based, performance-portable parallel programming model, might be a solution. What are the differences between the OpenMP and OpenACC directive APIs? OpenACC uses directives to tell the compiler where and how to parallelize loops. You would like to look behind the Spack Package Ma...

2006 Mar 02

Problems Running an executable from samba share.

...- sec_ctx_stack_ndx = 0 > [2006/03/01 14:39:40, 3] smbd/trans2.c:call_trans2qfilepathinfo(2859) > call_trans2qfilepathinfo: TRANSACT2_QPATHINFO: level = 1004 > [2006/03/01 14:39:40, 3] smbd/trans2.c:call_trans2qfilepathinfo(2884) > call_trans2qfilepathinfo: SMB_VFS_STAT of > com/openacc/oa_start/CG42_Install_Gd.pdf failed (No such file or > directory) > [2006/03/01 14:39:40, 3] smbd/error.c:error_packet(146) > error packet at smbd/trans2.c(2627) cmd=50 (SMBtrans2) > NT_STATUS_OBJECT_NAME_NOT_FOUND > [2006/03/01 14:39:40, 3] smbd/process.c:process_smb(1194) >...

2017 Jan 03

Automatic Insertion of OpenACC/OpenMP directives

...I1[4] = (AI1[3] > 0); >> AI1[5] = (AI1[4] ? AI1[3] : 0); >> #pragma acc data pcopy(x[0:AI1[5]],y[0:AI1[5]]) >> #pragma acc kernels >> for (int i = 0; i < n; ++i) { >> y[j] = a * x[i] + y[j]; >> ++j; >> } > > I'm not familiar with OpenACC, but doesn't this still have a loop-carried dependence on j, and therefore isn't correctly parallelizable as written? That was my original concern as well, but I had forgotten that OpenACC pragmas are not necessarily saying to the compiler that the loop is parallel: #pragma acc kernels only...

2016 Dec 31

Automatic Insertion of OpenACC/OpenMP directives

...:58 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > Hi, > >> On Dec 31, 2016, at 8:33 AM, Fernando Magno Quintao Pereira via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> Dear LLVMers, >> >> we have released a tool that uses LLVM to insert OpenACC or OpenMP >> 4.0 directives in programs. You can use the tool online here: >> http://cuda.dcc.ufmg.br/dawn/. Our tool, dawn-cc, analyzes the LLVM IR >> to infer the sizes of memory chunks, and to find dependences within >> loops. After that, we use debug information to trans...

2016 Dec 31

Automatic Insertion of OpenACC/OpenMP directives

Hi, > On Dec 31, 2016, at 8:33 AM, Fernando Magno Quintao Pereira via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Dear LLVMers, > > we have released a tool that uses LLVM to insert OpenACC or OpenMP > 4.0 directives in programs. You can use the tool online here: > http://cuda.dcc.ufmg.br/dawn/. Our tool, dawn-cc, analyzes the LLVM IR > to infer the sizes of memory chunks, and to find dependences within > loops. After that, we use debug information to translate the low-lev...

2016 Dec 31

Automatic Insertion of OpenACC/OpenMP directives

Dear LLVMers, we have released a tool that uses LLVM to insert OpenACC or OpenMP 4.0 directives in programs. You can use the tool online here: http://cuda.dcc.ufmg.br/dawn/. Our tool, dawn-cc, analyzes the LLVM IR to infer the sizes of memory chunks, and to find dependences within loops. After that, we use debug information to translate the low-level information back...

2012 Aug 15

[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM

...any additional information that the compiler writer needs like number of threads, scheduling parameters, chunk size, etc., which are specific perhaps to OpenMP. The point is that the same parallel loop could be targeted to accelerators today (like GPUs) using another standard, OpenACC. We may get a new standard to capture and target a different kind of parallel device, which could look quite different and has to be specifically targeted. Since we are at the intermediate layer, we could be independent of both user level standards like OpenMP, OpenACC, OpenCL, Cilk+, C++AMP etc a...

2015 Jul 07

CUDA fixed VA allocations and sparse mappings

...the RM needs to make some of these allocations itself (for graphics context mapping, etc), how should potential conflicts with user mappings be handled? -------- As an initial implementation you can probably assume that the GPU offloading is in "exclusive" mode. Basically that the CUDA or OpenACC code has full ownership of the card. The Tesla cards don't even have a video out on them. To complicate this even more - some offloading code has very long running kernels and even worse - may critically depend on using the full available GPU RAM. (Large matrix sizes and soon big Fortran arrays...

2015 Jul 08

CUDA fixed VA allocations and sparse mappings

...allocations itself (for graphics context mapping, etc), how >> should potential conflicts with user mappings be handled? >> -------- >> As an initial implementation you can probably assume that the GPU >> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC >> code has full ownership of the card. The Tesla cards don't even have a >> video out on them. To complicate this even more - some offloading code >> has very long running kernels and even worse - may critically depend >> on using the full available GPU RAM. (Large matr...

2015 Jul 08

CUDA fixed VA allocations and sparse mappings

...context mapping, etc), how >>>> should potential conflicts with user mappings be handled? >>>> -------- >>>> As an initial implementation you can probably assume that the GPU >>>> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC >>>> code has full ownership of the card. The Tesla cards don't even have a >>>> video out on them. To complicate this even more - some offloading code >>>> has very long running kernels and even worse - may critically depend >>>> on using the ful...

2012 Jul 13

[LLVMdev] Fwd: Documentation about converting GIMPLE IR to LLVM IR in LLVM-GCC/DragonEgg

...d instructions into LLVM IR? 3. Assume that tomorrow, a front-end other than GCC/Clang is plugged into the LLVM back-end. Or assume that sometime later a new Fortran front-end is introduced along with Clang. Or someone wants to plug the EDG front-end into the LLVM back-end. Or OpenACC needs to be supported in LLVM. Or OpenACC may get merged with OpenMP and become one of the standard platforms to program heterogeneous architectures. By considering all these and other future possibilities, what is the best way to support OpenMP in LLVM? By best, I mean here that LLV...

2015 Jul 08

CUDA fixed VA allocations and sparse mappings

...>>>>> should potential conflicts with user mappings be handled? > >>>>> -------- > >>>>> As an initial implementation you can probably assume that the GPU > >>>>> offloading is in "exclusive" mode. Basically that the CUDA or OpenACC > >>>>> code has full ownership of the card. The Tesla cards don't even have a > >>>>> video out on them. To complicate this even more - some offloading code > >>>>> has very long running kernels and even worse - may critically depend > ...

2016 Jun 01

[Openmp-dev] [cfe-dev] RFC: Proposing an LLVM subproject for parallelism runtime and support libraries

...ny are working on implementations, and some have shipped. There's a list here: http://openmp.org/wp/openmp-compilers/ - no one here really lists any OpenMP 4 offloading implementations before 2015. PGI does not currently list OpenMP 4 at all (although they've certainly done a lot of work on OpenACC). > >> >> > Furthermore, our implementation is certainly >> > quite new, and OpenMP 4 offloading is really quite akin to SE in >> > that >> > regard. >> >> Strongly disagree - Intel has been working with the LLVM community on >> the par...

2012 Jul 12

[LLVMdev] Documentation about converting GIMPLE IR to LLVM IR in LLVM-GCC/DragonEgg

Dear All, I am trying to understand the process followed for converting GIMPLE IR to LLVM IR in LLVM-GCC/DragonEgg, and more importantly the conversion of OpenMP-extended GIMPLE IR to LLVM IR. It would be great if anybody could point me to some documentation before I delve into the related source code myself. -- Cheers -mahesha

2012 Nov 09

[LLVMdev] translating from OpenMP to CUDA

The PTX back-end is robust (it's based on the sources used by nvcc), but I'm not sure about the OpenMP representation in LLVM IR. I believe the OpenMP constructs are already lowered into libgomp calls before leaving DragonEgg. It's been a while since I've looked at it though. If you use the PTX back-end and have any issues, please don't hesitate to post to the list and cc:

2013 Jan 09

[LLVMdev] Pointer "data direction"

...nction "f" has nested calls of other functions > with "side effects", meaning they could potentially change the contents of > "in" or "out" indirectly. For this reason, even current state-of-the-art > commercial APIs that imply strong data analysis (like OpenACC or HMPP) > require functions to be free of side effects, because nobody could solve > this problem well at compile-time. > The functions I'm going to analyze do not have side effects (sorry for not mentioning). Basically, they are enclosed kernels. > Depending on the purpose o...

2013 Oct 31

[LLVMdev] Add a 'notrap' function attribute?

..."discouraged" in the specs). If they do, it's vendor-specific how the hardware exceptions are handled. It might also be the case with some other (future) languages targeting "streamlined" parallel accelerators in a heterogeneous setting. At least CUDA comes to mind. What about OpenACC and the new OpenMP, does someone know offhand? It would help several optimizations if they could assume certain instructions do not trap. E.g., I was looking at the if-conversion of the loop vectorizer, and it seems to not support speculating stores, divs, etc. which could be done if we knew it...

2012 Nov 08

[LLVMdev] translating from OpenMP to CUDA

Hi, Is it possible to translate an OpenMP program to CUDA using LLVM? I read that dragonegg has an OpenMP front-end and LLVM has a PTX back-end. I don't know how mature these tools are. Please let me know. Thanks. -Apala Postdoctoral Scholar Department of Computer Science, University of Chicago Computation Institute, Argonne National Laboratory http://sites.google.com/site/apalaguha/home/

2016 Jun 02

[cfe-dev] [Openmp-dev] RFC: Proposing an LLVM subproject for parallelism runtime and support libraries

On Thu, Jun 2, 2016 at 11:47 AM, Chandler Carruth <chandlerc at google.com> wrote: > (Mostly trying to re-focus the thread somewhat) > > Given support from Mehdi, Renato, and especially Hal who has contributed > specifically in this area to LLVM as a whole, and no strong objections from > significant contributors (I feel like the primary concerns Intel raised have > been

2013 Jan 09

[LLVMdev] Pointer "data direction"

...or instance, function "f" has nested calls of other functions with "side effects", meaning they could potentially change the contents of "in" or "out" indirectly. For this reason, even current state-of-the-art commercial APIs that imply strong data analysis (like OpenACC or HMPP) require functions to be free of side effects, because nobody could solve this problem well at compile-time. Depending on the purpose of your question, this may or may not help: in comparison to general analysis, the LLVM community makes way better progress in analysing data access patterns fo...
