search for: gpuocelot

Displaying 5 results from an estimated 5 matches for "gpuocelot".

2012 Apr 08
1
[LLVMdev] LLVM show error preprocessor "Must #define __STDC_LIMIT_MACROS before #including Support/DataTypes.h"
Hello All, I am building the source code of Ocelot [http://code.google.com/p/gpuocelot/], which depends on LLVM. To build with Ocelot, llvm-config reports the cppflags shown below. ./llvm-config --cppflags -I/home/chatsiri/workspacecpp/llvm/include -I/home/chatsiri/workspacecpp/llvm/include -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS...
2013 May 08
0
[LLVMdev] Predicated Vector Operations
...resumably %newvalue will be consumed, possibly by another arithmetic operation. Presumably %oldvalue can similarly come from a previous arithmetic operation feeding into the add. If that's true, then %oldvalue is either %x or %y. Otherwise it is some other thing highly context-dependent. The gpuocelot project ran into the problem and they talk about it here: http://code.google.com/p/gpuocelot/source/browse/wiki/LLVM.wiki?r=272 The bottom line is that it is probably easier to set this up before LLVM IR goes into SSA form. There is a lot of interest in predication and a lot of recent discussion...
2009 Oct 12
0
[LLVMdev] Representing SIMT programs in LLVM
...uld like to start by thanking every developer who has contributed to LLVM for releasing such a high quality project. It has been incredibly valuable to several projects that I have worked on. My name is Gregory Diamos, I am a PhD student at Georgia Tech working on Ocelot (http://code.google.com/p/gpuocelot/). Ocelot is a dynamic binary translator from PTX (a virtual instruction set used by NVIDIA GPUs) to multi-core x86. We currently use LLVM's JIT as our x86 code generator. We have a prototype implementation finished that can execute most CUDA applications on our google code page using LLVM a...
2013 May 07
6
[LLVMdev] Predicated Vector Operations
I'm trying to understand how predicated/masked instructions can be generated in llvm, specifically an instruction where a set bit in the mask will write the new result into the corresponding vector lane in the destination and a clear bit will cause the lane in the destination to remain what it was before the instruction executed. I've seen a few places that suggest 'select' is the
2013 May 02
8
[LLVMdev] Handling Masked Vector Operations
...tion is to create an intrinsic: llvm_int_load_masked mask, [addr] But this unnecessarily shuts down optimization. Similar problems exist with any trapping instruction (div, mod, etc.). It gets even worse when you consider that any floating point operation can trap on a signalling NaN input. The gpuocelot project is essentially trying to do the same thing but I haven't dived deep enough into their notes and implementation to see how they handle this issue. Perhaps because current GPUs don't trap it's a non-issue. But that will likely change in the future. So are there any ideas out th...